codingaslu/FastRTC-Building-Real-Time-Audio-Application

Gemini Voice Chat

A real-time voice interface for Google's Gemini model that lets you hold spoken conversations with Gemini. Audio is streamed over WebRTC to keep latency low.

Features

  • Real-time voice interaction with Gemini
  • Audio visualization with responsive waveform display
  • Multiple voice options (Puck, Charon, Kore, Fenrir, Aoede)
  • Low-latency response with WebRTC streaming
  • Simple and intuitive user interface

Requirements

  • Python 3.8+
  • Gemini API key (get one at Google AI Studio)
  • Modern web browser with WebRTC support

Installation

  1. Clone this repository:

    git clone <repository-url>
    cd gemini-voice-chat
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set up your environment variables:

    cp .env.example .env
    

    Then edit the .env file and add your Gemini API key.
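
As a rough illustration of what the .env file holds, the sketch below parses KEY=VALUE lines using only the standard library. It is not taken from app.py, which may well rely on a package such as python-dotenv instead; `parse_dotenv` and the placeholder key value are hypothetical names used here for illustration.

```python
from typing import Dict


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


# Hypothetical .env contents; the key below is a placeholder, not a real key.
example = """
# Gemini Voice Chat settings
GEMINI_API_KEY=your-api-key-here
MODE=UI
"""

settings = parse_dotenv(example)
print(settings)
```

The two variable names, GEMINI_API_KEY and MODE, are the ones this README documents; everything else in the sketch is an assumption.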

Usage

Running the Application

Start the application with:

python app.py

By default, the application will run in the mode specified in your .env file:

  • MODE=UI: Launches the Gradio UI
  • MODE=PHONE: Uses FastRTC's fastphone mode, so you can talk to Gemini over a phone call
  • MODE left blank: Runs the application with a uvicorn server
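
The mode selection described above can be sketched as a small dispatch function. This is a minimal sketch, not the actual contents of app.py: `resolve_mode`, `launch`, and the stand-in launchers are hypothetical, and the comments only suggest which FastRTC or uvicorn call each branch would likely wrap.

```python
from typing import Callable, Dict, Mapping


def resolve_mode(env: Mapping[str, str]) -> str:
    """Map the MODE variable to "UI", "PHONE", or "SERVER" (blank or unset)."""
    mode = env.get("MODE", "").strip().upper()
    if mode == "":
        return "SERVER"
    if mode in ("UI", "PHONE"):
        return mode
    # Assumption: this sketch rejects unknown values; the real app may
    # instead fall back to the uvicorn server.
    raise ValueError(f"Unsupported MODE: {mode!r}")


def launch(env: Mapping[str, str], launchers: Dict[str, Callable[[], str]]) -> str:
    """Call the launcher registered for the resolved mode."""
    return launchers[resolve_mode(env)]()


# Stand-in launchers so the sketch runs without FastRTC installed.
launchers = {
    "UI": lambda: "gradio",        # likely stream.ui.launch() in a FastRTC app
    "PHONE": lambda: "fastphone",  # likely stream.fastphone()
    "SERVER": lambda: "uvicorn",   # likely uvicorn.run(app)
}

print(launch({"MODE": "UI"}, launchers))
```

Treating MODE case-insensitively and raising on unknown values are choices made for this sketch, not documented behavior of the application.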

Accessing the Interface

  1. Open your browser and navigate to the local URL printed in the terminal when the application starts

  2. Enter your Gemini API key (if not pre-configured in .env)

  3. Select your preferred voice

  4. Click "Start Recording" and speak with Gemini

  5. Click "Stop Recording" when you're done

Voice Options

The application offers several voice options for Gemini's responses:

  • Puck: Default voice
  • Charon: Alternative voice option
  • Kore: Alternative voice option
  • Fenrir: Alternative voice option
  • Aoede: Alternative voice option
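
To show how a selected voice might be passed to Gemini, here is a minimal sketch that validates the name and builds a plain-dict session config. The nested field names (`speech_config`, `voice_config`, `prebuilt_voice_config`, `voice_name`) follow the google-genai SDK's speech settings as an assumption and should be checked against the SDK documentation; `build_live_config` is a hypothetical helper, not a function from this repository.

```python
from typing import Any, Dict

# Voice names as listed in this README; Puck is the documented default.
VOICES = ("Puck", "Charon", "Kore", "Fenrir", "Aoede")
DEFAULT_VOICE = "Puck"


def build_live_config(voice: str = DEFAULT_VOICE) -> Dict[str, Any]:
    """Return an audio session config for the given voice (assumed layout)."""
    if voice not in VOICES:
        raise ValueError(f"Unknown voice {voice!r}; choose one of {', '.join(VOICES)}")
    return {
        "response_modalities": ["AUDIO"],
        "speech_config": {
            "voice_config": {
                "prebuilt_voice_config": {"voice_name": voice},
            },
        },
    }


config = build_live_config("Kore")
print(config["speech_config"]["voice_config"]["prebuilt_voice_config"]["voice_name"])
```

Validating the name before opening a session keeps a typo in the voice selector from surfacing later as an API error.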

Configuration

The application can be configured using environment variables in the .env file:

  • GEMINI_API_KEY: Your Google Gemini API key
  • MODE: Application mode (UI, PHONE, or blank for uvicorn)

Troubleshooting

Common issues:

  • Connection errors: If you're using a VPN, try disabling it as it might interfere with WebRTC connections.
  • Audio not working: Ensure your browser has permission to access your microphone.
  • API key errors: Verify that your Gemini API key is valid and correctly entered.
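
For the API key case, a quick local check can catch formatting mistakes before the application starts. The sketch below is an assumption about what is worth checking, not part of this project; `check_api_key` is a hypothetical helper, and it only inspects the string. Whether the key is actually valid can only be confirmed by a request to the Gemini API.

```python
from typing import List, Optional


def check_api_key(value: Optional[str]) -> List[str]:
    """Return a list of formatting problems; an empty list means none found."""
    if value is None or not value.strip():
        return ["GEMINI_API_KEY is not set"]
    problems: List[str] = []
    stripped = value.strip()
    if value != stripped:
        problems.append("key has leading or trailing whitespace")
    if stripped[0] in "'\"" or stripped[-1] in "'\"":
        problems.append("key is wrapped in quotes")
    if any(ch.isspace() for ch in stripped):
        problems.append("key contains whitespace")
    return problems


# Example with a copy-paste mistake: a trailing space after the key.
for problem in check_api_key("abc123 "):
    print(problem)
```

In practice the value would come from `os.environ.get("GEMINI_API_KEY")` after the .env file has been loaded.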

License

[Specify your license information here]

Acknowledgements
