Turn speech into beautiful live captions instantly — in English, Hindi, Kannada, Tamil & more. AI summaries + pro display modes. No install, just open and speak.

Shreyas-S-809/SnapandCap

LiveCaptions Pro

Professional Real-Time Speech-to-Text Solution for Events, Seminars, and Presentations

🚀 Try Live Demo 🚀

👉 Click the link above → Allow microphone access → Start speaking instantly!

Features | Use Cases | Getting Started | Documentation | Future Roadmap


📖 Overview

LiveCaptions Pro is a powerful, browser-based real-time speech-to-text application designed for professional settings. It provides live transcription with multi-language support, AI-powered insights, and customizable display options perfect for conferences, seminars, lectures, and accessibility needs.

Why LiveCaptions Pro?

  • Zero Installation Required - Runs entirely in modern web browsers
  • Multi-Language Support - Supports English, Hindi, Kannada, Marathi, Tamil, Telugu, and Malayalam
  • AI-Powered Analysis - Generate summaries and action items using Google Gemini
  • Professional Display - Multiple caption styles and presentation modes
  • Accessibility First - Making content accessible to all audiences

✨ Features

Core Functionality

  • Real-Time Transcription

    • Live speech recognition using Web Speech API
    • Automatic text normalization and punctuation
    • Visual microphone status and voice intensity indicators
    • Support for code-switching between languages
  • Multi-Language Support

    • English (US, UK, India)
    • Hindi (हिंदी)
    • Kannada (ಕನ್ನಡ)
    • Marathi (मराठी)
    • Tamil (தமிழ்)
    • Telugu (తెలుగు)
    • Malayalam (മലയാളം)
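A minimal sketch of how the Web Speech API could be configured for the languages listed above. The `LANGUAGES` table, the `createRecognizer` helper, and the specific BCP 47 codes are assumptions for illustration, not the project's actual code; which codes actually work depends on the browser's recognition service.

```javascript
// Hypothetical language table: BCP 47 tags mapped to the names listed above.
const LANGUAGES = {
  "en-US": "English (US)",
  "en-GB": "English (UK)",
  "en-IN": "English (India)",
  "hi-IN": "Hindi",
  "kn-IN": "Kannada",
  "mr-IN": "Marathi",
  "ta-IN": "Tamil",
  "te-IN": "Telugu",
  "ml-IN": "Malayalam",
};

// Builds a recognizer from a SpeechRecognition constructor.
// Passing the constructor in keeps the function usable outside a browser.
function createRecognizer(RecognitionCtor, lang, onCaption) {
  if (!(lang in LANGUAGES)) throw new Error(`Unsupported language: ${lang}`);
  const recognition = new RecognitionCtor();
  recognition.lang = lang;
  recognition.continuous = true; // keep listening across pauses
  recognition.interimResults = true; // emit partial text while the speaker talks
  recognition.onresult = (event) => {
    // Only results from resultIndex onward have changed since the last event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      onCaption(result[0].transcript, result.isFinal);
    }
  };
  return recognition;
}

// Browser-only wiring: Chrome exposes the constructor with a webkit prefix.
if (typeof window !== "undefined") {
  const Ctor = window.SpeechRecognition || window.webkitSpeechRecognition;
  if (Ctor) {
    createRecognizer(Ctor, "hi-IN", (text, isFinal) => {
      console.log(isFinal ? "final:" : "interim:", text);
    }).start();
  }
}
```

Switching language mid-session would, under this sketch, mean stopping the recognizer and creating a new one with a different `lang`, since a recognizer listens in one language at a time.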

Display & Customization

  • Three Caption Styles

    • Classic: Traditional scrolling captions
    • Modern: Large single-line display with fade-in effects
    • Teleprompter: Professional presentation style
  • Appearance Controls

    • Dark and Light themes
    • Adjustable font sizes (1-5rem)
    • Multiple font families (Inter, Lora, Roboto Mono, Oswald)
    • Text alignment options (Left, Center, Right)

AI-Powered Features

  • Gemini Integration
    • Generate quick summaries of presentations
    • Extract action items and key takeaways
    • Automatic transcript analysis

Professional Features

  • Presentation Mode

    • Fullscreen captions display
    • Optimized for projection on screens and halls
    • Clean, distraction-free interface
  • Processing Options

    • Number normalization (Indian numerals → Arabic)
    • Automatic punctuation
    • Text cleanup and formatting
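As an illustration of the number normalization option, the sketch below converts digits written in the supported Indic scripts to ASCII digits. The `normalizeDigits` helper and the list of scripts it covers are assumptions; the project's own normalization may work differently.

```javascript
// Code point of the digit zero in each script (hypothetical coverage list).
const INDIC_ZERO_CODE_POINTS = [
  0x0966, // Devanagari (Hindi, Marathi)
  0x0ce6, // Kannada
  0x0be6, // Tamil
  0x0c66, // Telugu
  0x0d66, // Malayalam
];

// Replaces each Indic digit with its ASCII equivalent; other characters pass through.
function normalizeDigits(text) {
  return Array.from(text, (ch) => {
    const codePoint = ch.codePointAt(0);
    for (const zero of INDIC_ZERO_CODE_POINTS) {
      if (codePoint >= zero && codePoint <= zero + 9) {
        return String(codePoint - zero);
      }
    }
    return ch;
  }).join("");
}

console.log(normalizeDigits("\u0967\u0968\u0969 rupees")); // Devanagari 1, 2, 3
```

Each script stores its ten digits in consecutive code points, so subtracting the code point of zero yields the digit's value directly.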

🎯 Use Cases

Educational Institutions

  • Lectures & Seminars

    • Provide real-time captions for students
    • Assist hearing-impaired learners
    • Enable multilingual classrooms
  • University Events

    • Conferences and symposiums
    • Guest lectures
    • Academic presentations

Corporate & Professional

  • Business Meetings

    • Board room presentations
    • Training sessions
    • Client presentations
  • Conferences & Events

    • Keynote speeches
    • Panel discussions
    • Workshop sessions

Public Venues

  • Auditoriums & Halls

    • Theater performances with narration
    • Public speeches
    • Community events
  • Religious & Cultural Events

    • Multi-language services
    • Cultural programs
    • Community gatherings

Accessibility

  • Inclusive Events
    • Making content accessible for deaf/hard-of-hearing attendees
    • Multilingual audience support
    • Real-time translation assistance (planned; see Future Roadmap)

🚀 Getting Started

Option 1: Try Online (Recommended)

The fastest way to get started!

  1. Visit the Live Demo: https://snapandcap.netlify.app/
  2. Allow Microphone Access: Click "Allow" when your browser asks for microphone permissions
  3. Click "Start Mic": The button in the sidebar
  4. Start Speaking: That's it! Your words will appear as live captions

No installation, no setup—just open and use!

Option 2: Run Locally

Want to customize or develop further?

Prerequisites

  • Modern web browser (Google Chrome recommended for optimal performance)
  • Microphone access
  • Stable internet connection (required for AI features; Chrome's speech recognition is also server-based, so captions will not work offline)

Quick Start (Local)

  1. Download the Project

    git clone https://github.com/Shreyas-S-809/SnapandCap.git
    cd SnapandCap
  2. Open in Browser

    • Simply open index.html in Google Chrome
    • No server or installation required
  3. Grant Microphone Access

    • Click "Allow" when prompted for microphone permissions
    • This is required for speech recognition
  4. Start Captioning

    • Click the "Start Mic" button
    • Select your preferred language
    • Begin speaking!

AI Features Setup (Optional)

To enable AI-powered summaries and action items:

  1. Get a free Google Gemini API key from Google AI Studio
  2. Open index.html in a text editor
  3. Locate the apiKey variable (around line 872)
  4. Replace the empty string with your API key:
    let apiKey = "YOUR_API_KEY_HERE";
  5. Save and reload the page
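For orientation, here is a rough sketch of what a summary request to the Gemini REST API can look like once a key is set. The model name, the prompt wording, and the `buildSummaryRequest` and `summarize` helpers are assumptions for illustration; the project's own request code may differ, and you should substitute whichever model your key can access.

```javascript
// Assumption: placeholder model name, not taken from the project.
const GEMINI_MODEL = "gemini-1.5-flash";

// Builds the URL and fetch options for a generateContent call.
function buildSummaryRequest(transcript, apiKey, model = GEMINI_MODEL) {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    `${model}:generateContent?key=${encodeURIComponent(apiKey)}`;
  const body = {
    contents: [
      {
        parts: [
          { text: `Summarize this transcript and list action items:\n\n${transcript}` },
        ],
      },
    ],
  };
  return {
    url,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Sends the request and returns the generated text, or "" if none came back.
async function summarize(transcript, apiKey) {
  const { url, options } = buildSummaryRequest(transcript, apiKey);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`Gemini request failed: ${response.status}`);
  const data = await response.json();
  // The response may contain no candidates, hence the optional chaining.
  return data.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```

Keeping request construction separate from the network call makes it easy to inspect or log the payload before anything is sent.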

📚 Documentation

Basic Usage

  1. Language Selection

    • Use the language dropdown to select your preferred language
    • Supports switching between languages on the fly
  2. Caption Styles

    • Classic: Traditional scrolling captions at the bottom
    • Modern: Large centered text with smooth animations
    • Teleprompter: Professional full-screen text display
  3. Customization

    • Click the settings icon to access customization options
    • Adjust font size, family, alignment, and theme
  4. Presentation Mode

    • Click the fullscreen icon for distraction-free display
    • Perfect for projecting on large screens

Advanced Features

  • Voice Intensity Indicator: Visual feedback showing speech detection
  • Microphone Status: Real-time indicator of recording status
  • Auto-Scrolling: Captions automatically scroll to show latest text
  • Text History: Maintains transcript of the entire session
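One common way to drive a voice intensity indicator is to measure the loudness of the microphone signal with the Web Audio API. The sketch below is an assumption about how such an indicator could work, not the project's implementation; `rmsLevel` and `watchMicLevel` are hypothetical names.

```javascript
// Computes a 0..1 loudness level (RMS) from 8-bit time-domain samples,
// the format AnalyserNode.getByteTimeDomainData() fills in (128 = silence).
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const sample of samples) {
    const centered = (sample - 128) / 128;
    sumSquares += centered * centered;
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Browser-only wiring: reads the microphone once per animation frame
// and reports the current level to the callback.
async function watchMicLevel(onLevel) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const audioContext = new AudioContext();
  const analyser = audioContext.createAnalyser();
  audioContext.createMediaStreamSource(stream).connect(analyser);
  const buffer = new Uint8Array(analyser.fftSize);
  const tick = () => {
    analyser.getByteTimeDomainData(buffer);
    onLevel(rmsLevel(buffer));
    requestAnimationFrame(tick);
  };
  tick();
}
```

In a page, `watchMicLevel((level) => { meter.style.width = `${level * 100}%`; })` would resize a hypothetical `meter` element as the speaker talks.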

🛠️ Technical Stack

  • Frontend: HTML5, CSS3, Vanilla JavaScript
  • Speech Recognition: Web Speech API (SpeechRecognition)
  • AI Integration: Google Gemini API
  • Icons: Phosphor Icons
  • Fonts: Google Fonts (Inter, Lora, Roboto Mono, Oswald)

🔮 Future Roadmap

Planned Features

Version 2.0

  • WebSocket-based multi-device synchronization
  • Cloud transcript storage and retrieval
  • Real-time translation between languages
  • Custom vocabulary and technical term recognition

Version 3.0

  • Speaker identification and diarization
  • Export transcripts to PDF, DOCX, and TXT
  • Integration with Zoom, Google Meet, Microsoft Teams
  • Mobile app versions (iOS & Android)

Advanced Capabilities

  • Offline mode with local models
  • Custom AI prompts and templates
  • Multi-track audio support
  • Advanced analytics dashboard
  • User accounts and saved preferences

Integration Possibilities

  • Learning Management Systems (Moodle, Canvas, Blackboard)
  • Video Conferencing Platforms (Zoom, Meet, Teams)
  • Content Management Systems (WordPress, Drupal)
  • Event Management Software (Eventbrite, Hopin)

🎨 Customization Guide

Branding

You can easily customize the app's appearance by modifying the CSS variables in index.html:

:root {
    --accent-color: #8b5cf6;  /* Primary brand color */
    --bg-color: #0f0f13;       /* Background color */
    --text-primary: #f4f4f5;   /* Primary text color */
}
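The same variables can also be changed at runtime from JavaScript, which is one way a theme switch could be built. This is a sketch under assumptions: the `THEMES` table and `applyTheme` helper are hypothetical, the dark values mirror the CSS above, and the light values are invented for illustration.

```javascript
// Hypothetical theme presets keyed by CSS custom property name.
const THEMES = {
  dark: {
    "--accent-color": "#8b5cf6",
    "--bg-color": "#0f0f13",
    "--text-primary": "#f4f4f5",
  },
  light: {
    "--accent-color": "#7c3aed",
    "--bg-color": "#ffffff",
    "--text-primary": "#18181b",
  },
};

// Writes every variable of the named theme onto a CSSStyleDeclaration-like object.
function applyTheme(style, name) {
  const theme = THEMES[name];
  if (!theme) throw new Error(`Unknown theme: ${name}`);
  for (const [property, value] of Object.entries(theme)) {
    style.setProperty(property, value);
  }
}

// Browser-only wiring: override the :root variables on the <html> element.
if (typeof document !== "undefined") {
  applyTheme(document.documentElement.style, "light");
}
```

Inline values set on the root element take precedence over the `:root` rule in the stylesheet, so no CSS needs to be rewritten to switch themes.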

Adding New Languages

To add support for additional languages:

  1. Locate the language selector in the HTML
  2. Add a new option with the appropriate language code:
    <option value="fr-FR">Français (French)</option>
  3. Use a BCP 47 language tag (such as fr-FR) that your browser's speech recognition supports; refer to the Web Speech API language codes

🤝 Contributing

We welcome contributions! Here's how you can help:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Areas for Contribution

  • Bug fixes and performance improvements
  • New caption styles and themes
  • Additional language support
  • Documentation improvements
  • Accessibility enhancements

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • Web Speech API, as implemented in Google Chrome
  • Phosphor Icons for beautiful iconography
  • Google Fonts for typography
  • Google Gemini for AI capabilities

📞 Support & Contact

For questions, suggestions, or support:


🌟 Show Your Support

If you find LiveCaptions Pro helpful, please consider:

  • ⭐ Starring the repository
  • 🐛 Reporting bugs
  • 💡 Suggesting new features
  • 📢 Sharing with others

Made with ❤️ for accessible and inclusive communication
