LiveCaptions Pro

Professional Real-Time Speech-to-Text Solution for Events, Seminars, and Presentations

🚀 Try Live Demo: https://snapandcap.netlify.app/ 🚀

👉 Click the link above → Allow microphone access → Start speaking instantly!

Features • Use Cases • Getting Started • Documentation • Future Roadmap

📖 Overview

LiveCaptions Pro is a browser-based, real-time speech-to-text application designed for professional settings. It provides live transcription with multi-language support, AI-powered insights, and customizable display options, which makes it well suited to conferences, seminars, lectures, and accessibility needs.

Why LiveCaptions Pro?

Zero Installation Required - Runs entirely in modern web browsers
Multi-Language Support - Supports English, Hindi, Kannada, Marathi, Tamil, Telugu, and Malayalam
AI-Powered Analysis - Generate summaries and action items using Google Gemini
Professional Display - Multiple caption styles and presentation modes
Accessibility First - Making content accessible to all audiences

✨ Features

Core Functionality

Real-Time Transcription
- Live speech recognition using Web Speech API
- Automatic text normalization and punctuation
- Visual microphone status and voice intensity indicators
- Support for code-switching between languages
Multi-Language Support
- English (US, UK, India)
- Hindi (हिंदी)
- Kannada (ಕನ್ನಡ)
- Marathi (मराठी)
- Tamil (தமிழ்)
- Telugu (తెలుగు)
- Malayalam (മലയാളം)
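
As a rough illustration of how a browser app can drive the Web Speech API for the languages listed above, the sketch below maps each language to a BCP 47 tag and configures a recognizer. The names `LANGUAGE_TAGS` and `createRecognizer` are hypothetical and not taken from index.html, and the tags are the conventional ones for these locales; whether a given browser recognizes each of them is up to that browser.

```javascript
// Hypothetical language table; the names are illustrative, not from index.html.
const LANGUAGE_TAGS = {
  "English (US)": "en-US",
  "English (UK)": "en-GB",
  "English (India)": "en-IN",
  Hindi: "hi-IN",
  Kannada: "kn-IN",
  Marathi: "mr-IN",
  Tamil: "ta-IN",
  Telugu: "te-IN",
  Malayalam: "ml-IN",
};

// Build a recognizer from an injected constructor so the function can also be
// exercised outside a browser. In Chrome the constructor would usually be
// window.SpeechRecognition || window.webkitSpeechRecognition.
function createRecognizer(RecognitionCtor, languageName, onCaption) {
  const tag = LANGUAGE_TAGS[languageName];
  if (!tag) throw new Error(`Unknown language: ${languageName}`);

  const recognition = new RecognitionCtor();
  recognition.lang = tag;
  recognition.continuous = true; // keep listening across pauses
  recognition.interimResults = true; // deliver partial text while speaking

  recognition.onresult = (event) => {
    // Only results from resultIndex onward changed in this event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      onCaption(result[0].transcript, result.isFinal);
    }
  };
  return recognition;
}
```

In a browser, calling `start()` on the returned object begins listening; the callback receives each partial or final phrase together with a flag saying whether it is final.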

Display & Customization

Three Caption Styles
- Classic: Traditional scrolling captions
- Modern: Large single-line display with fade-in effects
- Teleprompter: Professional presentation style
Appearance Controls
- Dark and Light themes
- Adjustable font sizes (1-5rem)
- Multiple font families (Inter, Lora, Roboto Mono, Oswald)
- Text alignment options (Left, Center, Right)

AI-Powered Features

Gemini Integration
- Generate quick summaries of presentations
- Extract action items and key takeaways
- Automatic transcript analysis

Professional Features

Presentation Mode
- Fullscreen captions display
- Optimized for projection on screens and halls
- Clean, distraction-free interface
Processing Options
- Number normalization (Indian numerals → Arabic)
- Automatic punctuation
- Text cleanup and formatting
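
One way the number normalization step could work is sketched below. It assumes "Indian numerals" means the Unicode decimal digits of the Devanagari, Tamil, Telugu, Kannada, and Malayalam scripts, and the function name `normalizeDigits` is hypothetical; the actual logic in index.html may differ.

```javascript
// Code points of the digit zero in each script (Unicode decimal digit blocks).
const INDIC_DIGIT_ZEROS = [
  0x0966, // Devanagari (Hindi, Marathi)
  0x0be6, // Tamil
  0x0c66, // Telugu
  0x0ce6, // Kannada
  0x0d66, // Malayalam
];

// Replace each Indic-script digit with its ASCII equivalent and leave all
// other characters untouched.
function normalizeDigits(text) {
  return Array.from(text, (ch) => {
    const codePoint = ch.codePointAt(0);
    for (const zero of INDIC_DIGIT_ZEROS) {
      if (codePoint >= zero && codePoint <= zero + 9) {
        return String(codePoint - zero);
      }
    }
    return ch;
  }).join("");
}
```

Because each script stores its ten digits in one contiguous run, a single subtraction per character is enough and no lookup table of individual digits is needed.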

🎯 Use Cases

Educational Institutions

Lectures & Seminars
- Provide real-time captions for students
- Assist hearing-impaired learners
- Enable multilingual classrooms
University Events
- Conferences and symposiums
- Guest lectures
- Academic presentations

Corporate & Professional

Business Meetings
- Board room presentations
- Training sessions
- Client presentations
Conferences & Events
- Keynote speeches
- Panel discussions
- Workshop sessions

Public Venues

Auditoriums & Halls
- Theater performances with narration
- Public speeches
- Community events
Religious & Cultural Events
- Multi-language services
- Cultural programs
- Community gatherings

Accessibility

Inclusive Events
- Making content accessible for deaf/hard-of-hearing attendees
- Multilingual audience support
- Real-time translation assistance (planned; see Future Roadmap)

🚀 Getting Started

Option 1: Try Online (Recommended)

The fastest way to get started!

Visit the Live Demo: https://snapandcap.netlify.app/
Allow Microphone Access: Click "Allow" when your browser asks for microphone permissions
Click "Start Mic": The button in the sidebar
Start Speaking: That's it! Your words will appear as live captions

No installation, no setup—just open and use!

Option 2: Run Locally

Want to customize or develop further?

Prerequisites

Modern web browser (Google Chrome recommended for optimal performance)
Microphone access
Stable internet connection (needed for AI features; Chrome's speech recognition also sends audio to an online service)
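
The prerequisites above can be checked from a script before the microphone is started. This is a minimal sketch; the function names are hypothetical, and the `root` parameter exists only so the check can be run against any object, not just a browser window.

```javascript
// Return the SpeechRecognition constructor if the environment exposes one
// (standard or webkit-prefixed), or null otherwise.
function getSpeechRecognition(root = globalThis) {
  return root.SpeechRecognition || root.webkitSpeechRecognition || null;
}

// Report which prerequisites are available in the given environment.
function checkPrerequisites(root = globalThis) {
  const nav = root.navigator;
  return {
    speechRecognition: getSpeechRecognition(root) !== null,
    microphone: Boolean(nav && nav.mediaDevices && nav.mediaDevices.getUserMedia),
  };
}
```

In a browser, `checkPrerequisites()` with no argument inspects the current page; a `false` value for `speechRecognition` suggests switching to a browser that implements the Web Speech API, such as Chrome.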

Quick Start (Local)

Download the Project

git clone https://github.com/yourusername/livecaptions-pro.git
cd livecaptions-pro

Open in Browser
- Simply open index.html in Google Chrome
- No server or installation required
Grant Microphone Access
- Click "Allow" when prompted for microphone permissions
- This is required for speech recognition
Start Captioning
- Click the "Start Mic" button
- Select your preferred language
- Begin speaking!

AI Features Setup (Optional)

To enable AI-powered summaries and action items:

Get a free Google Gemini API key from Google AI Studio
Open index.html in a text editor
Locate the apiKey variable (around line 872)
Replace the empty string with your API key:
```
let apiKey = "YOUR_API_KEY_HERE";
```
Save and reload the page
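
For orientation, a request to the Gemini REST API might look roughly like the sketch below. The function names, the prompt wording, and the model name are assumptions rather than what index.html actually does; substitute whichever model your key supports in Google AI Studio.

```javascript
// Assumed model name; not taken from index.html.
const GEMINI_MODEL = "gemini-1.5-flash";

// Build the URL and fetch options for a generateContent call.
function buildSummaryRequest(apiKey, transcript, model = GEMINI_MODEL) {
  if (!apiKey) throw new Error("Gemini API key is not set");
  const url = `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`;
  const prompt = `Summarize this transcript and list action items:\n\n${transcript}`;
  return {
    url,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    },
  };
}

// Send the request and return the generated text, or "" if none came back.
async function summarizeTranscript(apiKey, transcript, fetchImpl = fetch) {
  const { url, options } = buildSummaryRequest(apiKey, transcript);
  const response = await fetchImpl(url, options);
  if (!response.ok) throw new Error(`Gemini request failed: ${response.status}`);
  const data = await response.json();
  // The generated text is normally in the first candidate's first part.
  return data.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```

Keeping request construction separate from the network call makes it easy to inspect what would be sent before any key is used.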

📚 Documentation

Basic Usage

Language Selection
- Use the language dropdown to select your preferred language
- Supports switching between languages on the fly
Caption Styles
- Classic: Traditional scrolling captions at the bottom
- Modern: Large centered text with smooth animations
- Teleprompter: Professional full-screen text display
Customization
- Click the settings icon to access customization options
- Adjust font size, family, alignment, and theme
Presentation Mode
- Click the fullscreen icon for distraction-free display
- Perfect for projecting on large screens

Advanced Features

Voice Intensity Indicator: Visual feedback showing speech detection
Microphone Status: Real-time indicator of recording status
Auto-Scrolling: Captions automatically scroll to show latest text
Text History: Maintains transcript of the entire session
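
A voice intensity indicator is commonly driven by the root-mean-square (RMS) level of the incoming audio. The sketch below shows that calculation only; the function names and the `gain` scaling factor are assumptions, and index.html may compute its indicator differently.

```javascript
// Root-mean-square level of one audio frame whose samples lie in [-1, 1].
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const sample of samples) sumSquares += sample * sample;
  return Math.sqrt(sumSquares / samples.length);
}

// Map the RMS level to a 0-100 meter value. `gain` is an assumed scaling
// factor that makes normal speech fill more of the meter.
function meterPercent(samples, gain = 4) {
  return Math.min(100, Math.round(rmsLevel(samples) * gain * 100));
}
```

In a browser, the samples could come from a Web Audio `AnalyserNode` by filling a `Float32Array` with `getFloatTimeDomainData` on each animation frame and passing it to `meterPercent`.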

🛠️ Technical Stack

Frontend: HTML5, CSS3, Vanilla JavaScript
Speech Recognition: Web Speech API (SpeechRecognition)
AI Integration: Google Gemini API
Icons: Phosphor Icons
Fonts: Google Fonts (Inter, Lora, Roboto Mono, Oswald)

🔮 Future Roadmap

Planned Features

Version 2.0

WebSocket-based multi-device synchronization
Cloud transcript storage and retrieval
Real-time translation between languages
Custom vocabulary and technical term recognition

Version 3.0

Speaker identification and diarization
Export transcripts to PDF, DOCX, and TXT
Integration with Zoom, Google Meet, Microsoft Teams
Mobile app versions (iOS & Android)

Advanced Capabilities

Offline mode with local models
Custom AI prompts and templates
Multi-track audio support
Advanced analytics dashboard
User accounts and saved preferences

Integration Possibilities

Learning Management Systems (Moodle, Canvas, Blackboard)
Video Conferencing Platforms (Zoom, Meet, Teams)
Content Management Systems (WordPress, Drupal)
Event Management Software (Eventbrite, Hopin)

🎨 Customization Guide

Branding

You can easily customize the app's appearance by modifying the CSS variables in index.html:

:root {
    --accent-color: #8b5cf6;  /* Primary brand color */
    --bg-color: #0f0f13;       /* Background color */
    --text-primary: #f4f4f5;   /* Primary text color */
}
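
The same variables can also be overridden at runtime from JavaScript, which may be handy for switching brand themes without editing the stylesheet. This is a sketch; `BRAND_THEME`, its color values, and `applyTheme` are hypothetical names and values chosen for illustration.

```javascript
// Hypothetical theme; the keys match the CSS variables listed above.
const BRAND_THEME = {
  "--accent-color": "#0ea5e9",
  "--bg-color": "#0b1120",
  "--text-primary": "#f8fafc",
};

// Set each custom property on the element's inline style.
// In a browser, pass document.documentElement to override the :root values.
function applyTheme(element, theme) {
  for (const [name, value] of Object.entries(theme)) {
    element.style.setProperty(name, value);
  }
  return element;
}
```

For example, `applyTheme(document.documentElement, BRAND_THEME)` would recolor the page immediately, since inline custom properties on the root element take precedence over the `:root` rule.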

Adding New Languages

To add support for additional languages:

Locate the language selector in the HTML
Add a new option with the appropriate language code:
```
<option value="fr-FR">Français (French)</option>
```
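Use a BCP 47 language tag (such as fr-FR) as the option value, since the Web Speech API reads its recognition language from a tag of this form. Which tags are actually recognized depends on the browser.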
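
Before adding an option, it can help to confirm that the language code is a well-formed tag. The sketch below uses the standard `Intl.getCanonicalLocales` function for that; the helper names are hypothetical, and a well-formed tag is not a guarantee that the browser's recognizer supports the language.

```javascript
// Canonicalize a BCP 47 tag (for example "fr-fr" becomes "fr-FR").
// Throws a RangeError if the tag is not structurally valid.
function canonicalLanguageTag(tag) {
  return Intl.getCanonicalLocales(tag)[0];
}

// Build the data for a new <option>; the helper name is hypothetical.
function makeLanguageOption(tag, label) {
  return { value: canonicalLanguageTag(tag), label };
}
```

In a browser, the result could be appended to the language selector with `select.add(new Option(option.label, option.value))`, where `select` is the dropdown element.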

🤝 Contributing

We welcome contributions! Here's how you can help:

Fork the repository
Create a feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

Areas for Contribution

Bug fixes and performance improvements
New caption styles and themes
Additional language support
Documentation improvements
Accessibility enhancements

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Web Speech API, as implemented in Google Chrome
Phosphor Icons for beautiful iconography
Google Fonts for typography
Google Gemini for AI capabilities

📞 Support & Contact

For questions, suggestions, or support:

Issues: GitHub Issues
Discussions: GitHub Discussions
Email: support@livecaptions.example.com

🌟 Show Your Support

If you find LiveCaptions Pro helpful, please consider:

⭐ Starring the repository
🐛 Reporting bugs
💡 Suggesting new features
📢 Sharing with others

Made with ❤️ for accessible and inclusive communication

