Callforge is a small, focused helper service for speech processing in telephony.
It gives you:
- a long-running daemon that keeps a Whisper model in memory.
- simple CLI tools for development and testing.
- a clean way to plug STT (speech-to-text) into systems like Asterisk.
- a place to add TTS (text-to-speech) later, behind the same interface.
The goal is to move all the heavy ML work out of your dialplan and into a single, reusable process.
- Speech-to-text backed by faster-whisper
  The daemon loads a configurable Whisper model once and exposes a simple interface over a Unix socket.
  Polish is a first-class citizen (the default language is pl).
- Daemon designed for telephony workflows
  You point it at recorded audio files (e.g. from Asterisk's Record()), and it gives you back text.
  No Python inside the dialplan, no reloading models on every call.
- Simple CLI for local development
  Callforge ships with small command-line tools so you can test STT/TTS standalone, without touching PBX configuration.
- Configurable via a single INI file
  Model size, device, compute type, logging level and other knobs live in one place.
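To make the "simple interface over a Unix socket" idea concrete, here is a minimal client sketch. The wire format is an assumption made purely for illustration: one newline-terminated JSON request with hypothetical `action` and `path` fields, answered by one JSON line with a hypothetical `text` field. The function name `request_transcription` and the socket path argument are also invented here, so check the Callforge source for the real protocol before relying on any of it.

```python
import json
import socket


def request_transcription(socket_path: str, audio_path: str, timeout: float = 30.0) -> str:
    """Send one STT request over a Unix socket and return the recognized text.

    ASSUMPTION: newline-delimited JSON in both directions. This is a
    hypothetical protocol, not Callforge's documented one.
    """
    request = json.dumps({"action": "stt", "path": audio_path}) + "\n"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        sock.sendall(request.encode("utf-8"))
        # Read exactly one reply line from the daemon.
        with sock.makefile("r", encoding="utf-8") as reader:
            line = reader.readline()
    if not line:
        raise ConnectionError("daemon closed the connection without replying")
    return json.loads(line)["text"]
```

Because the model stays loaded in the daemon, a client like this only pays for a socket round trip plus the transcription itself, not for model startup.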
Callforge is a regular Python package.
From the project root:
# development / editable install
pip install -e .
# or a normal install
pip install .
You’ll need a Python environment that can install the dependencies required by faster-whisper (standard modern CPython on Linux is usually enough).
Callforge looks for a configuration file in the Python “data” directory, under:
${DATA_CONFIG}/callforge/settings.conf
The exact prefix depends on your Python installation, but it will typically be something like:
/usr/local/callforge/settings.conf
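If you are unsure where that prefix points on your machine, the standard library can tell you. The sketch below assumes that the "data" directory mentioned above corresponds to the `data` path reported by `sysconfig`; Callforge may resolve its location differently, and the helper name `default_config_path` is hypothetical.

```python
import sysconfig
from pathlib import Path


def default_config_path() -> Path:
    """Return <python data prefix>/callforge/settings.conf.

    ASSUMPTION: the data prefix is sysconfig's "data" path.
    """
    data_prefix = Path(sysconfig.get_path("data"))
    return data_prefix / "callforge" / "settings.conf"


print(default_config_path())
```

Inside a virtual environment the printed path usually sits under the environment directory rather than under a system prefix.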
The config is a simple INI file with three sections:
- [GENERAL] – logging and global options
- [STT] – speech-to-text settings
- [TTS] – text-to-speech settings (for future use)
Example settings.conf:
[GENERAL]
log_level = INFO # log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
[STT]
model_size = small # Whisper model size: tiny|base|small|medium|large-v2|large-v3
device = cpu # inference device: cpu|cuda
compute_type = int8 # inference precision: int8|int8_float16|float16|float32
default_language = pl # language hint (ISO 639-1 code), e.g. pl, en, de
beam_size = 5 # beam search width (quality vs speed)
best_of = 5 # number of sampled candidates (mainly for non-zero temperature)
[TTS]
enabled = false # whether TTS backend is enabled at all
voice = default # default voice / preset name (backend-specific)
sample_rate = 22050 # target sample rate for generated audio
All of these settings have sensible defaults in code. Unknown keys are treated as errors on purpose, so typos in config are caught early.
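The "defaults in code, unknown keys are errors" behaviour can be sketched with the standard `configparser` module. This is not Callforge's actual loader: the `DEFAULTS` table simply copies the values from the example file above (the real defaults may differ), and `load_settings` is a hypothetical name. Note that `configparser` only strips trailing `# ...` comments when `inline_comment_prefixes` is set.

```python
import configparser

# ASSUMPTION: default values copied from the example settings.conf,
# not taken from Callforge's code.
DEFAULTS = {
    "GENERAL": {"log_level": "INFO"},
    "STT": {
        "model_size": "small",
        "device": "cpu",
        "compute_type": "int8",
        "default_language": "pl",
        "beam_size": "5",
        "best_of": "5",
    },
    "TTS": {"enabled": "false", "voice": "default", "sample_rate": "22050"},
}


def load_settings(text: str) -> dict:
    """Merge INI text over DEFAULTS, rejecting unknown sections and keys."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    settings = {section: dict(values) for section, values in DEFAULTS.items()}
    for section in parser.sections():
        if section not in DEFAULTS:
            raise ValueError(f"unknown section [{section}]")
        for key, value in parser.items(section):
            if key not in DEFAULTS[section]:
                raise ValueError(f"unknown key {key!r} in [{section}]")
            settings[section][key] = value
    return settings
```

With a loader like this, a misspelled key such as `modle_size` fails at startup instead of silently falling back to the default model.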
After installing the package, Callforge exposes a few scripts.
- callforge-daemon: Starts the main Callforge daemon.
- callforge-stt: Sends an audio file to the daemon for speech-to-text processing.
- callforge-tts: Sends text to the daemon for text-to-speech processing.
- callforge-agi: AGI script used for Asterisk interaction with the Callforge daemon.
Callforge is designed to integrate cleanly with Asterisk via AGI.
A typical flow:
- Asterisk records the caller’s audio into a WAV file.
- An AGI script calls Callforge over the Unix socket and asks for STT.
- The AGI script writes the recognized text into a channel variable.
- The dialplan uses that variable to decide what to do next.
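The AGI side of this flow can be sketched as follows. This is not the bundled callforge-agi script: the function names are hypothetical, and the `transcribe` callable stands in for whatever actually talks to the daemon. The AGI details it relies on are standard Asterisk behaviour: the script receives `agi_*: value` lines terminated by a blank line on stdin (script arguments arrive as `agi_arg_1`, `agi_arg_2`, ...), and sets channel variables by writing `SET VARIABLE` commands to stdout.

```python
def read_agi_environment(stream) -> dict:
    """Read the 'key: value' header block Asterisk sends when AGI starts."""
    env = {}
    for raw in iter(stream.readline, ""):
        line = raw.strip()
        if not line:  # a blank line ends the header block
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def set_variable_command(name: str, value: str) -> str:
    """Build an AGI SET VARIABLE command with a safely quoted value."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", " ")
    return f'SET VARIABLE {name} "{escaped}"\n'


def run(stdin, stdout, transcribe) -> None:
    """Transcribe the file given as the first AGI argument into STT_TEXT.

    `transcribe` is a hypothetical callable: audio path in, text out.
    """
    env = read_agi_environment(stdin)
    text = transcribe(env.get("agi_arg_1", ""))
    stdout.write(set_variable_command("STT_TEXT", text))
    stdout.flush()
    stdin.readline()  # consume Asterisk's "200 result=1" reply
```

A real script would call `run(sys.stdin, sys.stdout, ...)` with a function that sends the path to the daemon. Keeping the streams as parameters makes the logic easy to test without a running PBX.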
A very simple dialplan might look like:
[from-client]
exten => 1000,1,Answer()
same => n,Wait(3)
same => n(start),Record(/tmp/voip-ai-input.wav,5,60)
same => n,AGI(agi.py,/tmp/voip-ai-input.wav)
same => n,NoOp(STT_TEXT = ${STT_TEXT})
same => n,Goto(start)
Callforge is released under the MIT License. You are free to use it in open-source or proprietary projects, modify it, and ship it as part of your own systems. See the LICENSE file in this repository for the full text.