GSoC Ideas List
Project list for Google Summer of Code 2026
Develop a Python SDK for TranscribeIt
Project Size: Medium (175 hours) Difficulty: Medium
Mentors: Keerthana, Shalini, Bowrna
Project Link: https://codeberg.org/fossiaorg/transcribeit
Description
TranscribeIt currently provides multimedia accessibility services, such as audio and video transcription (with or without timestamps and speaker diarization) and video descriptions, through a web interface backed by a FastAPI server. However, developers who want to integrate multimedia accessibility into their own software must write custom wrappers around TranscribeIt's RESTful API, which is tedious and error-prone.
A well-documented, configurable Python SDK that factors out components from the backend server will help ensure modularity. This project focuses on creating a robust, well-documented Python SDK that allows developers to integrate TranscribeIt into scripts, applications, and data pipelines.
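To make the goal concrete, here is a minimal sketch of what a thin SDK layer over the REST API might look like. Everything specific in it is an assumption made for illustration: the class name `TranscribeItClient`, the `/transcriptions` path, and the payload fields `media_url`, `timestamps`, and `diarization` are hypothetical and do not describe TranscribeIt's actual API. The transport is injectable so the client can be exercised without a running server.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional

# A transport takes (method, url, JSON body or None) and returns the decoded JSON reply.
Transport = Callable[[str, str, Optional[dict]], dict]


def http_transport(method: str, url: str, body: Optional[dict]) -> dict:
    """Default transport: send a JSON request using only the standard library."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


@dataclass
class TranscriptionOptions:
    """Hypothetical option names; the real backend may use different ones."""
    timestamps: bool = False
    diarization: bool = False


class TranscribeItClient:
    """Illustrative client; endpoint paths and field names are placeholders."""

    def __init__(self, base_url: str, transport: Transport = http_transport):
        self.base_url = base_url.rstrip("/")
        self._transport = transport

    def transcribe(
        self, media_url: str, options: Optional[TranscriptionOptions] = None
    ) -> dict:
        options = options or TranscriptionOptions()
        payload = {
            "media_url": media_url,
            "timestamps": options.timestamps,
            "diarization": options.diarization,
        }
        # "/transcriptions" is an assumed path, not a documented TranscribeIt route.
        return self._transport("POST", f"{self.base_url}/transcriptions", payload)
```

Keeping the HTTP call behind a small transport function is one possible design choice: it lets unit tests pass in a fake transport, and it leaves room to swap in a different HTTP library later without touching the public methods.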
Required Skills
- Python programming
Bonus Skills
- SDK or library design experience
- Familiarity with multimedia processing systems: FFmpeg, OpenAI Whisper
- Python packaging and versioning
Recommended Reading / Resources
Package TranscribeIt Web Application as a Cross-Platform Desktop App (Wails)
Project Size: Small (90 hours) Difficulty: Easy–Medium
Mentors: Aqsa, Keerthana
Project Link: https://codeberg.org/fossiaorg/transcribeit
Description
Package the existing TranscribeIt web application into a cross-platform desktop application using the Wails framework for Windows, macOS, and Linux.
Package the server as an independent service, using the sidecar pattern, so that multimedia accessibility processing runs locally. This lets end users navigate the application intuitively.
Required Skills
- TypeScript
- Next.js
Bonus Skills
- Go (basics)
- Wails (or similar frameworks like Tauri, Electron)
Recommended Reading / Resources
Improve User Experience and Accessibility of TranscribeIt Web Interface
Project Size: Medium (175 hours) Difficulty: Medium
Mentors: Deepraj, Keerthana, Shalini
Project Link: https://codeberg.org/fossiaorg/transcribeit
Description
Improve the usability and accessibility of the TranscribeIt web interface and make it WCAG 2.2 AA compliant by performing an accessibility audit, remediation, testing, and validation during development.
Integrate multimedia accessibility features that improve the end-user experience, such as keyboard bindings and customization of accessibility services (transcription, diarization, video description, etc.).
Required Skills
- Next.js/React
- WCAG knowledge (2.2)
- Accessibility testing with axe-core, WAVE and manual testing
- State Management
Recommended Reading / Resources
Integrate ASR Support for Indic Languages in TranscribeIt
Project Size: Small (90 hours) Difficulty: Medium
Mentors: Bowrna, Shalini
Project Link: https://codeberg.org/fossiaorg/transcribeit
Description
Currently, the project integrates faster-whisper, a reimplementation of OpenAI Whisper, which provides timestamped transcription for multi-lingual audio (an audio stream containing multiple languages). However, the base model's accuracy for Indic languages is not optimal. Extend the ASR capabilities with on-demand support for multiple Indic ASR models, such as Whisperx-Hindi, to reduce WER from 170% to 5%. Similarly, extend coverage to other Indian languages to improve accessibility, and run multi-lingual detection with these language models in parallel. This would significantly improve accessibility for the Indian audience.
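Since the goal is stated in terms of WER (word error rate), a short sketch of the metric may help when evaluating candidate models. It uses the standard definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. Because insertions count as errors, WER can exceed 100%, which is how a figure such as 170% can arise. The function name `word_error_rate` is illustrative, and the sketch splits on whitespace only; real evaluations of Indic text would likely need script-aware normalization first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref)
```

For example, `word_error_rate("a b", "x y z")` evaluates to 1.5 (150%): two substitutions plus one insertion against a two-word reference.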
Required Skills
- Python
- Basics of machine learning or NLP
- Audio processing fundamentals
Bonus Skills
- Experience with Whisper, Kaldi, or Vosk
- Knowledge of Indic language datasets
Recommended Reading / Resources
Develop Optimized Multi-Lingual Video Description Generation for framestoryx
Project Size: Medium (175 hours) Difficulty: Medium
Mentors: Keerthana, Shalini, Deepraj, Bowrna
Project Link: https://codeberg.org/fossiaorg/framestoryx
Description
Build an optimized, portable pipeline for generating multi-lingual video descriptions to improve accessibility for visually impaired users.
Required Skills
- Python
- Computer vision basics
- NLP fundamentals
Recommended Reading / Resources
Develop and Deploy Backend for TravelFolks
Project Size: Large (350 hours) Difficulty: Medium–Hard
Mentors: Varshha, Deepraj, Keerthana
Description
Design and deploy a scalable backend to support TravelFolks user accounts, travel listings, recommendations, and accessibility features.
Required Skills
- Backend development (Node.js / Python / Java)
- RESTful API design
- SQL or NoSQL databases
Recommended Reading / Resources
Develop an Accessible Web Interface for TravelFolks
Project Size: Large (350 hours) Difficulty: Medium
Mentors: Varshha, Deepraj
Description
Build a modern, accessible, and user-friendly web interface for TravelFolks, ensuring inclusivity for users with diverse accessibility needs.
Required Skills
- HTML, CSS, JavaScript
- Frontend frameworks (React, Vue, etc.)
- Accessibility best practices
Bonus Skills
- UX research and usability testing
- Design systems and component libraries