A powerful, intelligent audio transcription application built with love for accessibility
Features • Installation • Usage • Deployment • License
This project holds a special place in my heart. It was created for my father, who is hearing impaired, to help him convert audio messages, voice notes, and recordings into readable text. Watching him struggle to understand audio content inspired me to build something that could make his daily life easier.
This is not just another project in my portfolio; it's one of my most cherished creations throughout my entire development journey. Every line of code was written with purpose, every feature designed with empathy, and every bug fixed with determination.
This app represents more than technology; it represents the power of using our skills to make a meaningful difference in the lives of those we love.
- Real-time Audio Recording - Record directly from your device
- File Upload Support - Import existing audio files from WhatsApp (.opus)
- AI-Powered Transcription - Powered by Google Gemini 2.5 with a triple-fallback system
- History Management - Save, view, and manage all your transcriptions
- Smart Garbage Detection - Automatically filters out accidental or empty recordings
- Secure API - Protected endpoints with secret-based authentication
- Triple Safety Net Architecture:
  - Primary: Gemini 2.5 Pro (High Intelligence)
  - Secondary: Gemini 2.5 Flash (High Speed)
  - Tertiary: Gemini 2.5 Flash Lite (Lightweight Backup)
- Intelligent Error Handling - Network timeouts, connectivity checks, and graceful failures
- Cross-Platform - Built with Flutter for Android (iOS support possible)
- Offline Storage - Local database for transcription history
- Material Design 3 - Modern, beautiful UI with accessibility in mind
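The three-tier fallback listed above can be sketched as a loop that tries each model in order and returns the first success. This is a minimal sketch, not the project's actual code: `transcribe_with_fallback`, `AllModelsFailed`, the `call_model` callable, and the model identifier strings are assumptions made for illustration, and `fake_call` is a stub standing in for the real Gemini request.

```python
from typing import Callable, Sequence

# Order taken from the feature list; the exact identifier strings are assumed.
MODEL_CHAIN = ("gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.5-flash-lite")


class AllModelsFailed(RuntimeError):
    """Raised when every model in the chain has failed."""


def transcribe_with_fallback(
    audio: bytes,
    call_model: Callable[[str, bytes], str],
    models: Sequence[str] = MODEL_CHAIN,
) -> tuple[str, str]:
    """Try each model in order; return (model_name, transcript) for the first success."""
    errors: list[str] = []
    for name in models:
        try:
            return name, call_model(name, audio)
        except Exception as exc:  # a real backend would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def fake_call(model: str, audio: bytes) -> str:
    """Hypothetical stand-in for the Gemini call: the primary model always fails."""
    if model == "gemini-2.5-pro":
        raise TimeoutError("simulated timeout")
    return f"transcript of {len(audio)} bytes"


if __name__ == "__main__":
    print(transcribe_with_fallback(b"\x00" * 16, fake_call))
```

Passing the model call in as a parameter keeps the fallback logic testable without network access or an API key.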
┌─────────────────────────────────────────┐
│           Flutter Mobile App            │
│    (Audio Recording + UI + History)     │
└─────────────┬───────────────────────────┘
              │ HTTPS + API Secret
              ▼
┌─────────────────────────────────────────┐
│        FastAPI Backend (Python)         │
│     (File Processing + API Gateway)     │
└─────────────┬───────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────┐
│          Google Gemini 2.5 API          │
│      (Audio → Text Transcription)       │
└─────────────────────────────────────────┘
- Flutter SDK: 3.0 or higher
- Python: 3.11 or higher
- Google Gemini API Key: Get it from Google AI Studio
- Git: For cloning the repository
git clone https://github.com/MabelMoncy/TranscriberAppWithServer.git
cd TranscriberAppWithServer
cd backend
# Windows
python -m venv myvenv
myvenv\Scripts\activate
# macOS/Linux
python3 -m venv myvenv
source myvenv/bin/activate
pip install -r requirements.txt
Create a .env file in the backend directory:
GEMINI_API_KEY=your_gemini_api_key_here
APP_SECRET=your_secure_secret_here
Generate a secure secret:
python -c "import secrets; print(secrets.token_urlsafe(32))"
uvicorn main:app --reload --host 0.0.0.0 --port 8000
Backend will be available at: http://localhost:8000
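Because the backend depends on both GEMINI_API_KEY and APP_SECRET, it can help to fail fast at startup when either is missing or blank. The following is a minimal sketch, assuming the variables are already present in the process environment (for example, loaded from .env by python-dotenv); `load_settings` and `MissingSettingError` are hypothetical names, not part of the project.

```python
import os
from typing import Mapping

REQUIRED_VARS = ("GEMINI_API_KEY", "APP_SECRET")


class MissingSettingError(RuntimeError):
    """Raised when a required environment variable is absent or blank."""


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required settings, or raise with the names of all missing ones."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise MissingSettingError(
            "Missing environment variables: " + ", ".join(missing)
        )
    # Strip stray whitespace that often sneaks into hand-edited .env files.
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

Reporting every missing name at once avoids fixing the .env file one variable at a time.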
cd transcriberapp
flutter pub get
Create a .env file in the transcriberapp directory:
SERVER_URL=http://YOUR_LOCAL_IP:8000
API_SECRET=your_secure_secret_here
Important: Replace YOUR_LOCAL_IP with your computer's local IP address (e.g., 192.168.1.100)
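One way to find the value for YOUR_LOCAL_IP is to ask the operating system which interface it would use for outbound traffic. This is a sketch under assumptions: `local_ip` is a hypothetical helper, and the probe address 8.8.8.8 is an arbitrary public IP chosen only to select a route. On machines with several network interfaces, the result may not be the one your phone can reach.

```python
import socket


def local_ip(probe_host: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Return the IPv4 address of the interface used for outbound traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only selects a route.
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable network route was found
    finally:
        sock.close()


if __name__ == "__main__":
    print(f"SERVER_URL=http://{local_ip()}:8000")
```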
# Check connected devices
flutter devices
# Run on connected device
flutter run
# Or build APK
flutter build apk --release
Note: Free hosting tiers have usage limits. If you run into them, try another backend hosting platform such as Koyeb, or sign up for Render (or another hosting platform you are familiar with) using a new email address.
- Create a Render Account: render.com
- Create a New Web Service
- Connect Your GitHub Repository
- Configure Build Settings:
  - Build Command: pip install -r requirements.txt
  - Start Command: uvicorn main:app --host 0.0.0.0 --port $PORT
- Set Environment Variables:
  - GEMINI_API_KEY: Your API key
  - APP_SECRET: Your secure secret
- Deploy!
Detailed Guide: See backend/RENDER_DEPLOYMENT.md
- Generate Release Keystore
- Configure Signing
- Update Environment Variables with production backend URL
- Build Release APK/AAB
- Upload to Google Play Console
Detailed Checklist: See transcriberapp/DEPLOYMENT_CHECKLIST.md
- Open the app
- Tap the microphone button to start recording
- Speak clearly
- Tap the stop button when finished
- Wait for the transcription (this can take a while because the server runs on a free hosting tier)
- View your transcribed text!
- Open WhatsApp and choose the voice message you want to transcribe
- Long-press the message and share it to the app
- Tap the Start Transcription button
- View the transcribed result, which you can copy or share
- Tap the history button (clock icon)
- View all past transcriptions
- Tap any entry to view its details. You can read the text and play back the audio by tapping the play button
- To delete an entry, tap the delete button
- Flutter - Cross-platform framework
- Dart - Programming language
- record - Audio recording package
- http - API communication
- sqflite - Local database
- flutter_dotenv - Environment configuration
- FastAPI - Modern Python web framework
- Uvicorn - ASGI server
- Google Generative AI - Gemini API integration
- Python-dotenv - Environment management
- Pydantic - Data validation
- Google Gemini 2.5 - Audio transcription
- Render - Backend hosting (recommended)
- Firebase - (Optional) for analytics
- Environment-based configuration (no hardcoded secrets)
- API authentication via secret headers
- Request timeout protection (3 minutes)
- Input validation and sanitization
- Garbage audio detection to prevent wasted API calls
- HTTPS support for production
- Secure keystore for release builds
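The secret-header authentication listed above can be sketched independently of any web framework. This is an illustration only: the header name `X-API-Secret` and the function `is_authorized` are assumptions, since this README does not name the header the backend expects. `hmac.compare_digest` is used so that the comparison time does not reveal how much of a guessed secret was correct.

```python
import hmac
from typing import Mapping

SECRET_HEADER = "x-api-secret"  # assumed header name, compared case-insensitively


def is_authorized(headers: Mapping[str, str], expected_secret: str) -> bool:
    """Return True only if the request carries the expected shared secret."""
    if not expected_secret:
        return False  # refuse all requests when the server has no secret configured
    normalized = {key.lower(): value for key, value in headers.items()}
    supplied = normalized.get(SECRET_HEADER, "")
    # Constant-time comparison of the two byte strings.
    return hmac.compare_digest(
        supplied.encode("utf-8"), expected_secret.encode("utf-8")
    )
```

A `False` result here corresponds to the 401 Unauthorized response a client sees when its secret does not match.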
Problem: GEMINI_API_KEY not found
- Solution: Ensure a .env file exists in the backend/ directory and contains a valid API key
Problem: 503 Service Unavailable
- Solution: Check Gemini API quota and backend logs
Problem: Connection failed
- Solution: Verify that SERVER_URL in .env points to the correct backend address
Problem: 401 Unauthorized
- Solution: Ensure API_SECRET in the Flutter app matches APP_SECRET on the backend
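When diagnosing a "Connection failed" error, it can be useful to confirm from another machine that something answers at the SERVER_URL at all. The sketch below uses only the Python standard library. `check_backend` is a hypothetical helper, and it requests the URL as given without assuming the backend exposes any particular health endpoint, so any HTTP response (even 401 or 404) counts as reachable.

```python
import urllib.error
import urllib.request


def check_backend(server_url: str, timeout: float = 5.0) -> tuple[bool, str]:
    """Report whether anything answers HTTP requests at server_url."""
    try:
        with urllib.request.urlopen(server_url, timeout=timeout) as resp:
            return True, f"reachable (HTTP {resp.status})"
    except urllib.error.HTTPError as exc:
        # The server answered, even if with an error status such as 401 or 404.
        return True, f"reachable (HTTP {exc.code})"
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # Covers refused connections, DNS failures, timeouts, and malformed URLs.
        return False, f"unreachable: {exc}"


if __name__ == "__main__":
    print(check_backend("http://localhost:8000"))
```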
While this is a personal project, I welcome contributions! If you'd like to help improve it:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- iOS support
- Multi-language transcription
- Speaker identification
- Export transcriptions (PDF, TXT)
- Real-time streaming transcription
- Voice-to-voice translation
- Cloud sync for history
- Dark mode improvements
- My Father - The inspiration behind this project
- Google Gemini Team - For the powerful AI API
- Flutter Community - For amazing packages and support
- FastAPI Team - For the excellent framework
- Everyone who believes in using technology for accessibility
This project is licensed under the MIT License - see the LICENSE file for details.
Mabel Moncy
This project represents countless hours of learning, debugging, and determination. It taught me that the best code we write isn't for grades or portfolios; it's for the people we love.
If this project helps you or inspires you, please star it on GitHub!
- GitHub: @MabelMoncy
- Issues: Report a Bug
- Discussions: Ask Questions
