Gemini Phone

Voice interface for Google Gemini via SIP/Twilio. Call your AI, and your AI can call you.

Overview

Gemini Phone gives your Gemini AI a phone number. It bridges phone calls to Google's Gemini models so you can hold natural, low-latency voice conversations with the assistant.

  • Inbound Calls: Call your AI assistant to run commands, check status, or just have a conversation.
  • Outbound Calls: Programmatically trigger the AI to call you with alerts or updates, then interact with it via voice.
  • Multimodal Capabilities: Uses Gemini's reasoning to handle complex voice-based tasks.

Tech Stack

Component           Technology
AI Backend          Google Gemini API
Voice Gateway       Twilio
Text-to-Speech      ElevenLabs
Speech-to-Text      OpenAI Whisper
SIP Infrastructure  Drachtio & FreeSWITCH (Dockerized)
Control Logic       Node.js

Prerequisites

  • Gemini API Key: Get it from Google AI Studio.
  • Twilio Account: For the phone number and SIP trunking.
  • ElevenLabs API Key: For high-quality voice synthesis.
  • OpenAI API Key: For accurate speech-to-text (Whisper).
  • Docker & Node.js: Required for running the voice server and CLI.

Quick Start

1. Install

Clone the repository and install dependencies:

git clone https://github.com/rajumanoj333/Call-System.git
cd Call-System
npm install

2. Setup

Run the interactive setup wizard to configure your API keys and SIP settings:

npm run setup

3. Start

Launch the Gemini Phone services:

npm start

CLI Commands

The gemini-phone CLI (available via npm run ... or when linked globally) provides the following commands:

Command      Description
setup        Interactive configuration wizard
start        Start all services (Docker + API server)
stop         Stop all services
status       Show service and container status
doctor       Health check for dependencies
logs         View real-time logs
device add   Add a new SIP extension/identity
device list  List configured extensions

Architecture

┌─────────────────┐       ┌──────────────┐       ┌──────────────────┐
│   Your Phone    │ ◄───► │    Twilio    │ ◄───► │  Gemini Phone    │
│ (Voice/SIP)     │       │ (Cloud PBX)  │       │  (Voice App)     │
└─────────────────┘       └──────────────┘       └────────┬─────────┘
                                                          │
                                         ┌────────────────┴─────────┐
                                         │       Gemini AI          │
                                         │ (Model: gemini-1.5-pro)  │
                                         └──────────────────────────┘
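
Read together with the Tech Stack, the diagram suggests that each conversational turn passes through three stages: speech-to-text, the Gemini model, and text-to-speech. The sketch below illustrates that flow only; it is not the project's actual implementation. The names handleTurn, transcribe, generateReply, and synthesize are hypothetical, and the stages are injected as plain functions so the example does not assume any real SDK signature.

```javascript
// One conversational turn: caller audio -> text -> model reply -> reply audio.
// All stage functions are hypothetical adapters supplied by the caller.
async function handleTurn(callerAudio, stages) {
  const { transcribe, generateReply, synthesize } = stages;

  // Speech-to-text (OpenAI Whisper in the stack listed above).
  const userText = await transcribe(callerAudio);
  if (!userText.trim()) {
    return null; // nothing intelligible was said, so skip the model call
  }

  // Model response (Google Gemini API in the stack listed above).
  const replyText = await generateReply(userText);

  // Text-to-speech (ElevenLabs in the stack listed above).
  const replyAudio = await synthesize(replyText);

  return { userText, replyText, replyAudio };
}

// Stub stages so the flow can be exercised without any API keys.
const stubStages = {
  transcribe: async (audio) => audio.toString("utf8"),
  generateReply: async (text) => `You said: ${text}`,
  synthesize: async (text) => Buffer.from(text, "utf8"),
};
```

Calling handleTurn(Buffer.from("status report"), stubStages) resolves to an object carrying the transcript, the reply text, and the reply audio. Swapping the stubs for real adapters would leave the control flow unchanged.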

API Reference

The system exposes a REST API for programmatic interaction:

  • POST /api/outbound-call: Trigger an outbound call to a specified number.
  • POST /api/query: Send a text query to the AI as if it came from a specific device.
  • GET /api/devices: List all configured extensions.
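
As a rough sketch of how a client might call these endpoints from Node.js, the helper below builds fetch-ready requests. Only the three paths and their HTTP methods come from the list above. The base URL http://localhost:3000, the JSON body field names (to, message), and the assumption that responses are JSON are all illustrative guesses, so check the server source for the real values.

```javascript
// Hypothetical base URL: the README does not state the API server's host or port.
const BASE_URL = "http://localhost:3000";

// Build a fetch-ready request description for one of the documented endpoints.
function buildRequest(method, path, body) {
  const options = { method, headers: { Accept: "application/json" } };
  if (body !== undefined) {
    options.headers["Content-Type"] = "application/json";
    options.body = JSON.stringify(body);
  }
  return { url: new URL(path, BASE_URL).toString(), options };
}

// Send the request with the global fetch (available in Node.js 18+).
// Assumes the server answers with JSON, which the README does not confirm.
async function callApi(method, path, body) {
  const { url, options } = buildRequest(method, path, body);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`${method} ${path} failed with status ${response.status}`);
  }
  return response.json();
}

// Example requests; the body field names are assumptions, not a documented schema.
const outbound = buildRequest("POST", "/api/outbound-call", {
  to: "+15555550100",
  message: "Build finished",
});
const listDevices = buildRequest("GET", "/api/devices");
```

With the services running, callApi("GET", "/api/devices") would then return the parsed device list, provided the assumptions above match the server.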

Documentation

License

MIT
