AI Receptionist -- Open Source, Self-Hosted, No Compromises

This project is in active development. Core functionality works (voice conversations, FAQ answering, call transfers, message taking), but expect breaking changes and rough edges. Contributions welcome.

AI Receptionist -- Open Source, Self-Hosted, No Compromises

A production-grade, open-source AI receptionist that answers your business phone calls using OpenAI's Realtime API -- the same speech-to-speech model that powers ChatGPT Advanced Voice. Self-hosted. No vendor lock-in. No monthly SaaS fees bleeding you dry.

This is not another cascaded STT-to-LLM-to-TTS hack. This is a direct speech-to-speech AI voice agent built on the highest-fidelity model available today, connected to your phone system via LiveKit and SIP. It sounds like a real person because it uses the same model that makes ChatGPT's voice mode sound like a real person.

If you have been paying $200-500/month for a SaaS AI receptionist that sounds robotic, interrupts callers, and takes 2 seconds to respond -- stop. Deploy this instead.

Why This Exists

The current crop of AI receptionist SaaS products -- Bland AI, Vapi, Retell AI, Smith.ai, Ruby Receptionist, and the rest -- share the same fundamental problems:

High latency. Most use a cascaded pipeline: transcribe speech to text, send text to an LLM, convert the LLM response back to speech. Each hop adds latency. Callers notice. It feels like talking to a machine on a bad connection.
Robotic voices. Cheap TTS engines produce output that sounds like a GPS navigator reading a script. Callers hang up.
Poor turn-taking. They interrupt you. They talk over you. They go silent for awkward stretches. Real conversations have natural rhythm -- these products do not.
Expensive subscriptions. $200-500/month for what amounts to a wrapper around the same APIs you can call directly. You are paying a markup for a dashboard.
Vendor lock-in. Your call flows, prompts, business logic, and caller data live on someone else's servers. Want to switch providers? Start over.
No data privacy. Your callers' conversations, phone numbers, and messages sit in a third-party database you do not control.
Limited customization. Want to change how call transfers work? Want a custom integration? Submit a feature request and wait.

This project solves all of it:

OpenAI Realtime API (speech-to-speech). No transcription chain. The model hears the caller and speaks back directly. Sub-second response times. Natural turn-taking. The same model behind ChatGPT Advanced Voice.
Self-hosted. Runs on your infrastructure. Your data stays on your servers. Full control.
No monthly SaaS fee. You pay OpenAI for API usage (roughly $0.20-0.30/min) and that is it. No platform markup. No per-seat pricing. No "enterprise tier" upsell.
Fully configurable. Business hours, FAQs, call routing, voice selection, personality -- all defined in a simple YAML file. Change anything, redeploy in seconds.
Multi-business from a single deployment. One agent process handles calls for multiple businesses. Each phone number routes to its own config.
Open source under AGPL-3.0. The code is yours. Fork it, modify it, extend it. Nobody can take this and lock it behind a paywall without releasing their changes.

Comparison: This vs. SaaS AI Receptionists

	AIReceptionist (this project)	Typical SaaS AI Receptionist
Voice fidelity	OpenAI Realtime speech-to-speech -- near-human quality	Cascaded STT + LLM + TTS -- robotic, high latency
Response latency	Sub-second (direct speech-to-speech)	1-3 seconds (multi-hop pipeline)
Turn-taking	Natural, model-native	Awkward pauses, interruptions
Monthly cost	~$0.20-0.30/min API usage only	$200-500/month subscription + per-minute overages
Data privacy	Your servers, your data	Third-party stores your call data
Customization	Full source code, modify anything	Limited to what their dashboard exposes
Vendor lock-in	None -- open source, standard SIP	Proprietary platform, migration is painful
Multi-business	Built in, single deployment	Usually requires separate accounts/plans
Self-hosted	Yes	No
Source code access	Full	None

Features

Natural speech-to-speech conversations via OpenAI Realtime API
Inbound phone call handling via SIP/Twilio/Telnyx
FAQ answering from configurable knowledge base
Call transfers to departments and specific people
Message taking with file-based or webhook delivery
Multi-business support from a single running agent
Built-in noise cancellation optimized for phone audio (LiveKit BVC Telephony)
YAML-based configuration -- no code changes needed to customize
After-hours detection with configurable messages

Prerequisites

Python 3.11+
OpenAI API key (with Realtime API access)
LiveKit server (self-hosted or LiveKit Cloud)
SIP trunk provider (Twilio or Telnyx) with a phone number

Quick Start

Clone and install:

git clone https://github.com/kirklandsig/AIReceptionist.git
cd AIReceptionist
pip install -e .

Configure environment:

cp .env.example .env
# Edit .env with your keys:
#   LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET, OPENAI_API_KEY

Configure your business:

cp config/businesses/example-dental.yaml config/businesses/my-business.yaml
# Edit with your business name, FAQs, routing numbers, hours

Set up telephony:

Follow the LiveKit SIP Trunk Setup Guide to:

Create an inbound SIP trunk pointing to your Twilio/Telnyx number
Create a dispatch rule to route calls to the receptionist agent

Run:

python -m receptionist.agent dev

Call your phone number -- you should hear your AI receptionist answer with your custom greeting.

Configuration

Each business is defined by a YAML file in config/businesses/. See example-dental.yaml for a complete example.

Key sections:

business -- name, type, timezone
voice -- OpenAI voice selection (coral, alloy, ash, ballad, echo, sage, shimmer, verse)
greeting -- what the receptionist says when answering
personality -- system prompt personality instructions
hours -- business hours per day of week
after_hours_message -- what to say when the office is closed
routing -- departments/people the receptionist can transfer to
faqs -- question/answer pairs the receptionist draws from
messages -- how to store messages (file or webhook)

Multi-Business Setup

One running agent can serve multiple businesses. Each inbound phone number maps to a business config via SIP dispatch rule metadata:

{
  "metadata": "{\"config\": \"my-business\"}"
}

This loads config/businesses/my-business.yaml. Add as many business configs as you need -- one agent process handles them all.

Cost

You pay OpenAI directly for Realtime API usage. There is no platform fee, no markup, no subscription.

Estimated cost: ~$0.20-0.30 per minute of conversation.

Business type	Calls/day	Avg duration	Daily cost	Monthly cost
Small office	10	2 min	~$5	~$150
Dental practice	30	2 min	~$15	~$450
Busy front desk	60	1.5 min	~$22	~$660

Compare that to a SaaS AI receptionist at $300-500/month that sounds worse and gives you zero control. At higher call volumes the per-minute model costs more, but you get dramatically better quality and full ownership of the system. For most small-to-medium businesses, the cost is comparable or lower -- and the experience for your callers is not even close.

Alternatives This Replaces

This project is a direct, self-hosted, open-source alternative to:

Bland AI -- AI phone calls API. Cascaded pipeline, closed source, per-minute pricing with platform markup.
Vapi -- Voice AI platform. Another middleman between you and the model. Vendor lock-in.
Retell AI -- Conversational voice AI. Same cascaded architecture, same latency problems.
Smith.ai -- Virtual receptionist service. Expensive, limited customization, your data on their servers.
Ruby Receptionist -- Live + AI receptionist. Premium pricing for a service you can run yourself.

If you are evaluating any of these, try this first. It is free to deploy, and the voice quality speaks for itself.

License

AGPL-3.0

This project is licensed under the GNU Affero General Public License v3.0.

This means: you can use it, modify it, self-host it, and deploy it for your business with no restrictions. But if you run a modified version of this code as a hosted service (i.e., you let other people interact with it over a network), you must release your modifications under the same license.

Why AGPL and not MIT? Because this license specifically prevents companies from taking this code, wrapping it in a SaaS product, and charging people a monthly fee without giving anything back. The whole point of this project is that you should not have to pay rent on software you can run yourself. AGPL ensures it stays that way.

Support the Project

If this saved you from a $300/month SaaS subscription, consider buying me a coffee.

BTC: bc1q573f3x6zlsh06lcfetpmrquw5jr5e26ahu4syn

ETH: 0x5d48560C58b65dc7FeECa2F452c2Df817d1d61CC

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
config/businesses		config/businesses
docs/plans		docs/plans
documentation		documentation
receptionist		receptionist
scripts		scripts
tests		tests
.env.example		.env.example
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
HANDOFF.md		HANDOFF.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AI Receptionist -- Open Source, Self-Hosted, No Compromises

Why This Exists

Comparison: This vs. SaaS AI Receptionists

Features

Prerequisites

Quick Start

Configuration

Multi-Business Setup

Cost

Alternatives This Replaces

License

Support the Project

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AI Receptionist -- Open Source, Self-Hosted, No Compromises

Why This Exists

Comparison: This vs. SaaS AI Receptionists

Features

Prerequisites

Quick Start

Configuration

Multi-Business Setup

Cost

Alternatives This Replaces

License

Support the Project

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages