Professional Real-Time Speech-to-Text Solution for Events, Seminars, and Presentations
🚀 Try the Live Demo: https://snapandcap.netlify.app/ 🚀
👉 Open the link above → Allow microphone access → Start speaking instantly!
LiveCaptions Pro is a browser-based real-time speech-to-text application designed for professional settings. It provides live transcription with multi-language support, AI-powered insights, and customizable display options suited to conferences, seminars, lectures, and accessibility needs.
- Zero Installation Required - Runs entirely in modern web browsers
- Multi-Language Support - Supports English, Hindi, Kannada, Marathi, Tamil, Telugu, and Malayalam
- AI-Powered Analysis - Generate summaries and action items using Google Gemini
- Professional Display - Multiple caption styles and presentation modes
- Accessibility First - Making content accessible to all audiences
Real-Time Transcription
- Live speech recognition using Web Speech API
- Automatic text normalization and punctuation
- Visual microphone status and voice intensity indicators
- Support for code-switching between languages
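As a rough illustration of how live recognition with the Web Speech API can be wired up, the sketch below configures a recognizer for continuous, interim results. The names `createRecognizer` and `onCaption` are hypothetical and are not taken from the project's source; the optional third parameter exists only so the function can be exercised outside a browser.

```javascript
// Sketch only: `createRecognizer` and `onCaption` are illustrative names.
function createRecognizer(lang, onCaption, RecognitionCtor) {
  // Chrome exposes the constructor with a webkit prefix.
  const Ctor =
    RecognitionCtor ||
    globalThis.SpeechRecognition ||
    globalThis.webkitSpeechRecognition;
  if (!Ctor) {
    throw new Error("SpeechRecognition is not available in this environment");
  }

  const recognition = new Ctor();
  recognition.lang = lang;           // BCP 47 tag, e.g. "en-IN" or "hi-IN"
  recognition.continuous = true;     // keep listening across pauses
  recognition.interimResults = true; // deliver partial results while speaking

  recognition.onresult = (event) => {
    // Only results from resultIndex onward are new or changed.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      onCaption(result[0].transcript, result.isFinal);
    }
  };
  return recognition;
}
```

In a browser, calling `createRecognizer("en-IN", (text, isFinal) => console.log(text, isFinal)).start()` would begin listening once microphone access is granted.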
Multi-Language Support
- English (US, UK, India)
- Hindi (हिंदी)
- Kannada (ಕನ್ನಡ)
- Marathi (मराठी)
- Tamil (தமிழ்)
- Telugu (తెలుగు)
- Malayalam (മലയാളം)
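The Web Speech API selects a language through a BCP 47 tag. The mapping below is an assumption about which tags correspond to the languages listed above; the labels, the `LANGUAGE_TAGS` table, and the `tagFor` helper are illustrative and may not match the values used in index.html.

```javascript
// Assumed BCP 47 tags for the listed languages (not copied from the project).
const LANGUAGE_TAGS = {
  "English (US)": "en-US",
  "English (UK)": "en-GB",
  "English (India)": "en-IN",
  "Hindi": "hi-IN",
  "Kannada": "kn-IN",
  "Marathi": "mr-IN",
  "Tamil": "ta-IN",
  "Telugu": "te-IN",
  "Malayalam": "ml-IN",
};

// Returns the tag for a label, or a fallback when the label is unknown.
function tagFor(label, fallback = "en-US") {
  return Object.prototype.hasOwnProperty.call(LANGUAGE_TAGS, label)
    ? LANGUAGE_TAGS[label]
    : fallback;
}
```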
Three Caption Styles
- Classic: Traditional scrolling captions
- Modern: Large single-line display with fade-in effects
- Teleprompter: Professional presentation style
Appearance Controls
- Dark and Light themes
- Adjustable font sizes (1-5rem)
- Multiple font families (Inter, Lora, Roboto Mono, Oswald)
- Text alignment options (Left, Center, Right)
Gemini Integration
- Generate quick summaries of presentations
- Extract action items and key takeaways
- Automatic transcript analysis
Presentation Mode
- Fullscreen captions display
- Optimized for projection on screens and halls
- Clean, distraction-free interface
Processing Options
- Number normalization (Indian numerals → Arabic)
- Automatic punctuation
- Text cleanup and formatting
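One plausible reading of the number normalization option is a per-character mapping from Indic-script digits to the digits 0-9. The sketch below assumes exactly that and nothing more (it does not convert number words); `normalizeDigits` is a hypothetical name, and the project's own logic may differ.

```javascript
// Code points of the zero digit in each script; the digits 0-9 of a script
// occupy ten consecutive code points starting there.
const INDIC_ZEROS = [
  0x0966, // Devanagari (Hindi, Marathi)
  0x0ce6, // Kannada
  0x0be6, // Tamil
  0x0c66, // Telugu
  0x0d66, // Malayalam
];

// Replaces Indic-script digits with ASCII digits; other characters pass through.
function normalizeDigits(text) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    for (const zero of INDIC_ZEROS) {
      if (cp >= zero && cp <= zero + 9) return String(cp - zero);
    }
    return ch;
  }).join("");
}
```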
Lectures & Seminars
- Provide real-time captions for students
- Assist hearing-impaired learners
- Enable multilingual classrooms
University Events
- Conferences and symposiums
- Guest lectures
- Academic presentations
Business Meetings
- Board room presentations
- Training sessions
- Client presentations
Conferences & Events
- Keynote speeches
- Panel discussions
- Workshop sessions
Auditoriums & Halls
- Theater performances with narration
- Public speeches
- Community events
Religious & Cultural Events
- Multi-language services
- Cultural programs
- Community gatherings
Inclusive Events
- Making content accessible for deaf/hard-of-hearing attendees
- Multilingual audience support
- Real-time translation assistance
The fastest way to get started!
- Visit the Live Demo: https://snapandcap.netlify.app/
- Allow Microphone Access: Click "Allow" when your browser asks for microphone permissions
- Click "Start Mic": The button in the sidebar
- Start Speaking: That's it! Your words will appear as live captions
No installation, no setup—just open and use!
Want to customize or develop further?
- Modern web browser (Google Chrome recommended for optimal performance)
- Microphone access
- Stable internet connection (required for AI features; speech recognition in Chrome is also processed server-side)
Download the Project
    git clone https://github.com/yourusername/livecaptions-pro.git
    cd livecaptions-pro
Open in Browser
- Simply open index.html in Google Chrome
- No server or installation required
Grant Microphone Access
- Click "Allow" when prompted for microphone permissions
- This is required for speech recognition
Start Captioning
- Click the "Start Mic" button
- Select your preferred language
- Begin speaking!
To enable AI-powered summaries and action items:
- Get a free Google Gemini API key from Google AI Studio
- Open index.html in a text editor
- Locate the apiKey variable (around line 872)
- Replace the empty string with your API key:
    let apiKey = "YOUR_API_KEY_HERE";
- Save and reload the page
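For orientation, here is a minimal sketch of how a transcript could be sent to the Gemini REST API once a key is configured. It only builds the request and parses a response. The helper names, the prompt wording, and the default model name are assumptions rather than the project's actual code, so substitute whichever model your key can access.

```javascript
// Hypothetical helper: builds (but does not send) a generateContent request.
// The default model name is an assumption.
function buildSummaryRequest(transcript, apiKey, model = "gemini-1.5-flash") {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    encodeURIComponent(model) +
    ":generateContent?key=" +
    encodeURIComponent(apiKey);
  const prompt =
    "Summarize this transcript in a few sentences, then list any action items:\n\n" +
    transcript;
  return {
    url,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    },
  };
}

// Joins the text parts of the first candidate in a generateContent response.
function extractText(responseJson) {
  const parts = responseJson?.candidates?.[0]?.content?.parts ?? [];
  return parts.map((p) => p.text ?? "").join("");
}
```

The returned `url` and `options` can be passed directly to `fetch`, and the parsed JSON response to `extractText`. Note that a key embedded in index.html is visible to anyone who can view the page source.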
Language Selection
- Use the language dropdown to select your preferred language
- Supports switching between languages on the fly
Caption Styles
- Classic: Traditional scrolling captions at the bottom
- Modern: Large centered text with smooth animations
- Teleprompter: Professional full-screen text display
Customization
- Click the settings icon to access customization options
- Adjust font size, family, alignment, and theme
Presentation Mode
- Click the fullscreen icon for distraction-free display
- Perfect for projecting on large screens
- Voice Intensity Indicator: Visual feedback showing speech detection
- Microphone Status: Real-time indicator of recording status
- Auto-Scrolling: Captions automatically scroll to show latest text
- Text History: Maintains transcript of the entire session
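A voice intensity indicator is commonly driven by the root-mean-square (RMS) level of the incoming audio. The sketch below shows that calculation on a buffer of time-domain samples, such as one filled by a Web Audio `AnalyserNode` via `getFloatTimeDomainData`. Whether the app computes its indicator this way is an assumption, and `rmsLevel`, `meterPercent`, and the `gain` value are hypothetical.

```javascript
// Root-mean-square level of one buffer of time-domain samples in [-1, 1].
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length);
}

// Maps the level onto a 0-100 meter value; `gain` is an arbitrary tuning knob.
function meterPercent(samples, gain = 4) {
  return Math.min(100, Math.round(rmsLevel(samples) * gain * 100));
}
```

Speech rarely reaches full scale, which is why the sketch applies a gain before clamping to 100.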
- Frontend: HTML5, CSS3, Vanilla JavaScript
- Speech Recognition: Web Speech API (SpeechRecognition)
- AI Integration: Google Gemini API
- Icons: Phosphor Icons
- Fonts: Google Fonts (Inter, Lora, Roboto Mono, Oswald)
- WebSocket-based multi-device synchronization
- Cloud transcript storage and retrieval
- Real-time translation between languages
- Custom vocabulary and technical term recognition
- Speaker identification and diarization
- Export transcripts to PDF, DOCX, and TXT
- Integration with Zoom, Google Meet, Microsoft Teams
- Mobile app versions (iOS & Android)
- Offline mode with local models
- Custom AI prompts and templates
- Multi-track audio support
- Advanced analytics dashboard
- User accounts and saved preferences
- Learning Management Systems (Moodle, Canvas, Blackboard)
- Video Conferencing Platforms (Zoom, Meet, Teams)
- Content Management Systems (WordPress, Drupal)
- Event Management Software (Eventbrite, Hopin)
You can easily customize the app's appearance by modifying the CSS variables in index.html:
    :root {
      --accent-color: #8b5cf6; /* Primary brand color */
      --bg-color: #0f0f13;     /* Background color */
      --text-primary: #f4f4f5; /* Primary text color */
    }

To add support for additional languages:
- Locate the language selector in the HTML
- Add a new option with the appropriate language code:
<option value="fr-FR">Français (French)</option>
- Refer to Web Speech API language codes
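Before adding an option, it can help to check that the language code is a well-formed BCP 47 tag. The sketch below uses the standard `Intl.getCanonicalLocales` for that and returns the option markup; `languageOption` is a hypothetical helper, not part of the project. A well-formed tag does not guarantee that the browser's recognizer supports the language.

```javascript
// Hypothetical helper: validates a BCP 47 tag and returns <option> markup.
function languageOption(tag, label) {
  // Throws a RangeError for malformed tags and normalizes casing,
  // e.g. "fr-fr" becomes "fr-FR".
  const [canonical] = Intl.getCanonicalLocales(tag);
  const escapeHtml = (s) =>
    s
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;");
  return `<option value="${escapeHtml(canonical)}">${escapeHtml(label)}</option>`;
}
```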
We welcome contributions! Here's how you can help:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Bug fixes and performance improvements
- New caption styles and themes
- Additional language support
- Documentation improvements
- Accessibility enhancements
This project is licensed under the MIT License - see the LICENSE file for details.
- Web Speech API, as implemented in Google Chrome
- Phosphor Icons for beautiful iconography
- Google Fonts for typography
- Google Gemini for AI capabilities
For questions, suggestions, or support:
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: support@livecaptions.example.com
If you find LiveCaptions Pro helpful, please consider:
- ⭐ Starring the repository
- 🐛 Reporting bugs
- 💡 Suggesting new features
- 📢 Sharing with others
Made with ❤️ for accessible and inclusive communication