Add local-first TTS/STT providers, event-driven architecture, and uv/justfile tooling by lancekrogers · Pull Request #1 · ethanplusai/samantha-cli

lancekrogers · 2026-03-16T11:38:38Z

Summary

Major overhaul that makes Samantha fully usable without any API keys or cloud services. Adds a pluggable provider system for both TTS and STT, replaces the tightly-coupled conversation loop with an event-driven architecture, and modernizes the build/dev tooling.

Local-First Audio Providers

Kokoro TTS (default) — high-quality local ONNX text-to-speech, no API key needed
Whisper STT (default) — local speech-to-text via faster-whisper, no internet required
Edge TTS — free cloud alternative (Microsoft), no API key
Existing Fish Audio TTS preserved as an optional paid provider for custom voice clones
Google STT preserved as an optional cloud fallback
All providers inherit from a common base class (TTSProvider or STTProvider), so new providers are easy to add
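As a rough illustration of the pluggable provider idea, here is a minimal sketch of a TTS base class with a name-based registry. Only the class name TTSProvider comes from this PR; the `synthesize` method, the `register_tts` decorator, `get_tts_provider`, and the `SilentTTS` stand-in are hypothetical and may not match the code in samantha/tts/.

```python
from abc import ABC, abstractmethod


class TTSProvider(ABC):
    """Hypothetical base class; the real one in samantha/tts/ may differ."""

    name: str = "base"

    @abstractmethod
    def synthesize(self, text: str) -> bytes:
        """Return raw audio bytes for the given text."""


# Registry mapping provider names to classes (illustrative only).
_TTS_PROVIDERS: dict[str, type[TTSProvider]] = {}


def register_tts(cls: type[TTSProvider]) -> type[TTSProvider]:
    """Class decorator that makes a provider selectable by name."""
    _TTS_PROVIDERS[cls.name] = cls
    return cls


@register_tts
class SilentTTS(TTSProvider):
    """Stand-in provider that returns silence, so the sketch runs offline."""

    name = "silent"

    def synthesize(self, text: str) -> bytes:
        # One 16-bit zero sample per input character.
        return b"\x00\x00" * len(text)


def get_tts_provider(name: str) -> TTSProvider:
    """Instantiate a registered provider, or fail with the known names."""
    try:
        return _TTS_PROVIDERS[name]()
    except KeyError:
        known = ", ".join(sorted(_TTS_PROVIDERS))
        raise ValueError(f"unknown TTS provider {name!r}; known: {known}") from None
```

With a layout like this, a setting such as tts_provider would only need to carry a registry name, and an STT side could mirror the same pattern.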

Event-Driven Architecture

New EventBus (samantha/events.py) decouples the conversation engine from the UI
New ConversationEngine (samantha/engine.py) manages the listen → think → speak loop independently of display concerns
UI subscribes to engine events (listening, thinking, speaking, error, etc.) instead of being called directly
Adds real-time status callbacks for STT listening/transcribing phases
Live microphone level visualization during listening state
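The decoupling described above can be sketched as a small publish/subscribe bus. This is an assumption-laden sketch, not the contents of samantha/events.py or samantha/engine.py: the `subscribe` and `emit` method names, the payload shape, and the `run_turn` method are all hypothetical. Only the class names and the listening, thinking, and speaking event names come from the PR text.

```python
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], None]


class EventBus:
    """Minimal synchronous pub/sub bus (hypothetical API)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, **payload: Any) -> None:
        # Copy the list so handlers may subscribe during dispatch.
        for handler in list(self._handlers.get(event, [])):
            handler(payload)


class ConversationEngine:
    """Runs one listen -> think -> speak turn and only talks to the bus."""

    def __init__(self, bus: EventBus) -> None:
        self.bus = bus

    def run_turn(self, user_text: str) -> str:
        self.bus.emit("listening")
        self.bus.emit("thinking", text=user_text)
        reply = f"echo: {user_text}"  # placeholder for the real model call
        self.bus.emit("speaking", text=reply)
        return reply


# A UI would subscribe to events instead of being called by the engine.
bus = EventBus()
seen: list[str] = []
for event_name in ("listening", "thinking", "speaking"):
    bus.subscribe(event_name, lambda payload, n=event_name: seen.append(n))

ConversationEngine(bus).run_turn("hello")
```

Because the engine holds no reference to the UI, a display layer could in principle be swapped or reloaded while the engine keeps its state.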

New CLI Commands

samantha test — end-to-end mic + speaker test with per-provider diagnostics
samantha voices — list available TTS voices with locale/gender filtering
samantha providers — show installed/active TTS and STT providers with install hints
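For orientation, the command surface listed above could be wired roughly as follows. The PR does not say which CLI framework samantha/cli.py uses, so this sketch assumes the standard library's argparse. The `--locale` and `--gender` option names and the boolean form of `--text` are guesses based on the descriptions, not confirmed flags.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the subcommands named in this PR (sketch only)."""
    parser = argparse.ArgumentParser(prog="samantha")
    # Assumed to be a boolean switch; the real flag may take a value.
    parser.add_argument("--text", action="store_true", help="text-only mode")

    sub = parser.add_subparsers(dest="command")
    sub.add_parser("test", help="end-to-end mic + speaker test")

    voices = sub.add_parser("voices", help="list available TTS voices")
    # Hypothetical filter options for locale/gender filtering.
    voices.add_argument("--locale", help="filter by locale, e.g. en-US")
    voices.add_argument("--gender", choices=["male", "female"])

    sub.add_parser("providers", help="show installed/active providers")
    return parser


args = build_parser().parse_args(["voices", "--locale", "en-US"])
```

Running with no subcommand leaves `args.command` as `None`, which a dispatcher could treat as "start the normal conversation loop".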

Build & Tooling

Migrated from setuptools to hatchling build backend
Added uv as the package manager with optional dependency groups ([whisper], [fish], [edge], [local], [cloud], [all])
Added python-dotenv for .env file support
Full justfile system with modular recipes: voice.just, dev.just, install.just
Recipes for switching providers, going fully local, testing audio, and more

Other Improvements

Animated microphone waveform indicator for listening state
Conversation history and config management unchanged (backward compatible)
Updated CLAUDE.md to reflect new defaults and zero-API-key setup

Changed Files

samantha/tts/ — New TTS provider package (kokoro, edge, fish + base class)
samantha/stt/ — New STT provider package (whisper, google + base class)
samantha/engine.py — New conversation engine (decoupled from UI)
samantha/events.py — New event bus for engine↔UI communication
samantha/voice.py — Refactored to provider-agnostic orchestration
samantha/cli.py — New subcommands (test, voices, providers)
samantha/config.py — New provider settings and defaults
samantha/ui.py — Event-driven status display with waveform animation
pyproject.toml — hatchling backend, optional deps, dotenv
justfile + .justfiles/ — Full modular just recipe system
uv.lock — Lockfile for reproducible installs

Test Plan

samantha --text works with default Kokoro TTS (no API keys)
samantha works with default Whisper STT + Kokoro TTS (fully local)
samantha test correctly reports mic and speaker status
samantha providers shows installed vs missing providers
samantha voices lists available voices for the active TTS provider
Switching providers via samantha config tts_provider edge works
just recipes work (just talk, just text, just voice go-local)
Fish Audio TTS still works when API key is provided
Google STT still works as a fallback

…ation

Default STT provider is now Whisper instead of Google, completing the move to fully local providers (Kokoro TTS was already default). No API keys needed out of the box.

Also adds real-time mic level tracking with smooth waveform animation that responds to actual voice input, replacing the static looping frames. The mic animation now shows idle breathing dots when waiting and a live level-driven waveform when speech is detected.
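One common way to drive a level-based waveform is to compute the RMS of each audio chunk, smooth it, and map it to a bar glyph. The sketch below assumes 16-bit little-endian mono PCM input and exponential smoothing; the PR does not state how the level is actually measured, and the function names here are hypothetical.

```python
import math
import struct

# Glyphs from quietest to loudest.
BARS = " ▁▂▃▄▅▆▇█"


def rms_level(pcm: bytes) -> float:
    """Normalized RMS (0.0 to 1.0) of 16-bit little-endian mono PCM."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    return min(1.0, math.sqrt(mean_square) / 32768.0)


def smooth(previous: float, current: float, alpha: float = 0.3) -> float:
    """Exponential smoothing so the bars do not jitter between chunks."""
    return previous + alpha * (current - previous)


def level_to_bar(level: float) -> str:
    """Map a 0.0 to 1.0 level onto one of the bar glyphs."""
    clamped = max(0.0, min(1.0, level))
    return BARS[round(clamped * (len(BARS) - 1))]
```

A UI loop could keep a running smoothed level and fall back to an idle animation whenever that level stays below some threshold.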


lancekrogers added 13 commits March 15, 2026 23:57

Packaged with uv

97dc376

Add dotenv support

5cf19be

[OBEY-CAMPAIGN-8a57dff6] Add configurable TTS/STT provider system wit…

fadc40b

…h edge-tts default

[OBEY-CAMPAIGN-8a57dff6] Add modular justfile system for samantha-cli

b1f41a3

[OBEY-CAMPAIGN-8a57dff6] Fix mic selection to use system default inpu…

0d171e8

…t device

[OBEY-CAMPAIGN-8a57dff6] Add Kokoro-82M as default local TTS provider

6433ca2

[OBEY-CAMPAIGN-8a57dff6] Add granular status indicators with timing f…

06e3767

…or each pipeline step

[OBEY-CAMPAIGN-8a57dff6] Add real-time status callbacks for STT liste…

7fca2ed

…ning/transcribing phases

Add animated microphone waveform indicator for listening state

219d72a

Replace static status dot with pulsing waveform bars animation that shows real-time visual feedback during voice capture phases.

Separate conversation engine from UI with event-driven architecture

d56a792

Extract conversation logic into engine.py with event bus (events.py), allowing UI hot-reload without losing conversation state.

Update

6c33497

Merge pull request #1 from lancekrogers/lr/develop

f2d26ea

Add configurable TTS/STT provider system with Kokoro-82M local TTS

lancekrogers changed the title ~~Update with local model support, uv package manager and dotenv~~ Add local-first TTS/STT providers, event-driven architecture, and uv/justfile tooling Mar 16, 2026

lancekrogers wants to merge 13 commits into ethanplusai:main from Obedience-Corp:main
