Yammer

Local speech-to-text with LLM correction for Linux. Speak naturally, get polished text typed into any application.

Features

Voice Activity Detection - Automatically detects when you start/stop speaking
Speech-to-Text - Local Whisper model for accurate transcription
LLM Text Correction - Fixes homophones, adds punctuation, removes filler words
Floating Overlay - Transparent window with real-time waveform visualization
Universal Input - Types into any application via xdotool
100% Local - No cloud services, all processing on-device
GPU Accelerated - Optional CUDA support for faster transcription

Requirements

Linux with X11 (Wayland not yet supported)
4GB RAM minimum, 8GB recommended
~2GB disk for models

# Check your session type
echo "$XDG_SESSION_TYPE"  # Should output: x11

Quick Start

1. Install Dependencies

# Ubuntu/Debian
sudo apt install clang libclang-dev libasound2-dev xdotool libayatana-appindicator3-dev nodejs npm

# Arch Linux
sudo pacman -S clang alsa-lib xdotool libayatana-appindicator nodejs npm

# Fedora
sudo dnf install clang clang-devel alsa-lib-devel xdotool libayatana-appindicator-gtk3-devel nodejs npm

2. Build

git clone https://github.com/andynu/yammer.git
cd yammer
cargo build --release

The first build takes 10-15 minutes because it compiles Whisper and llama.cpp from source.

3. Download Models

cargo run --release --bin yammer download-models

Downloads Whisper (~141MB) and TinyLlama (~1.6GB) to ~/.cache/yammer/models/.

4. Run

# GUI mode
cd yammer-app && npm install && npm run tauri dev

# CLI mode (for testing)
cargo run --release --bin yammer dictate

Usage

GUI Mode

The floating overlay shows:

Microphone status and state indicator
Real-time waveform visualization
Transcribed and corrected text

Press the global hotkey (configurable) to start/stop dictation. Text is automatically typed into the focused application.
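
As a rough illustration of the typing step, the Rust sketch below builds an xdotool type invocation with std::process::Command. The function name xdotool_type_command and the use of --clearmodifiers are assumptions made for illustration; the source does not show how yammer-output actually calls xdotool.

```rust
use std::process::Command;

/// Build (but do not run) an `xdotool type` command for `text`.
/// Hypothetical helper: not taken from the yammer-output crate.
pub fn xdotool_type_command(text: &str) -> Command {
    let mut cmd = Command::new("xdotool");
    cmd.arg("type")
        // Assumption: release held modifier keys so the hotkey
        // does not interfere with the typed characters.
        .arg("--clearmodifiers")
        // End of options, so text starting with '-' is typed literally.
        .arg("--")
        .arg(text);
    cmd
}
```

Calling .status() on the returned command would run xdotool and type the text into whichever window currently has focus.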

CLI Commands

# Live dictation
yammer dictate

# List audio devices
yammer list-devices

# Record audio
yammer record --duration 5 --output test.wav

# Transcribe a file
yammer transcribe test.wav

# Test voice activity detection
yammer vad-test --duration 30

# Correct text with LLM
yammer correct "your transcribed text here"

Project Structure

yammer/
├── yammer-core/     # Shared types, config, model management
├── yammer-audio/    # Audio capture, VAD, resampling
├── yammer-stt/      # Whisper speech-to-text
├── yammer-llm/      # LLM text correction
├── yammer-output/   # Text output via xdotool
├── yammer-cli/      # CLI interface
└── yammer-app/      # Tauri desktop app

Architecture

Microphone → Audio Capture → VAD → Resampler (16kHz)
                                       ↓
                                   Whisper → LLM Correction → xdotool → Active Window
                                       ↓
                               Tauri UI (waveform, status, text)
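
To make the resampling stage concrete, here is a minimal Rust sketch that converts mono audio to 16kHz by linear interpolation. It is a stand-in, not the project's resampler: the source does not say which algorithm yammer-audio uses, the name resample_linear is hypothetical, and a production resampler would low-pass filter before downsampling to avoid aliasing.

```rust
/// Resample mono f32 samples from `from_hz` to `to_hz` using
/// linear interpolation between neighbouring input samples.
/// Hypothetical helper for illustration only.
pub fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    if from_hz == to_hz {
        return input.to_vec();
    }
    // Number of output samples covering the same duration.
    let out_len = ((input.len() as u64 * to_hz as u64) / from_hz as u64) as usize;
    // Distance between output samples, measured in input samples.
    let step = from_hz as f64 / to_hz as f64;
    let last = input.len() - 1;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = (pos.floor() as usize).min(last);
            let next = (idx + 1).min(last);
            let frac = (pos - idx as f64) as f32;
            // Blend the two nearest input samples.
            input[idx] + (input[next] - input[idx]) * frac
        })
        .collect()
}
```

For a 48kHz microphone stream, resample_linear(&samples, 48_000, 16_000) keeps one interpolated sample for every three input samples.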

Configuration

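VAD sensitivity can be adjusted with the --threshold option of yammer dictate. Lower values pick up quieter speech; higher values require louder, clearer speech: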

# More sensitive (quiet speech)
yammer dictate --threshold 0.005

# Less sensitive (loud/clear speech only)
yammer dictate --threshold 0.02
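
The following Rust sketch shows one way a threshold like this could behave. It assumes the threshold is compared against the RMS level of a frame of samples normalized to the range -1.0 to 1.0; the source does not state what the value measures, and the names rms and is_speech are hypothetical.

```rust
/// Root-mean-square level of one audio frame.
/// Hypothetical helper, not taken from yammer-audio.
pub fn rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = frame.iter().map(|s| s * s).sum();
    (sum_sq / frame.len() as f32).sqrt()
}

/// Assumption: a frame counts as speech when its RMS level
/// reaches the threshold. Real detectors usually also smooth
/// the decision over several frames.
pub fn is_speech(frame: &[f32], threshold: f32) -> bool {
    rms(frame) >= threshold
}
```

Under this assumption, a quiet frame with an RMS level of 0.01 would count as speech at --threshold 0.005 but not at --threshold 0.02.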

Performance

Component            CPU                        GPU (CUDA)
Whisper (5s audio)   500ms-1.5s                 250-750ms
LLM correction       100-500ms                  -
VAD latency          ~200ms start, ~400ms end   -

Troubleshooting

Build fails with "stdbool.h not found"

sudo apt install clang libclang-dev

"No audio devices found"

sudo apt install libasound2-dev
cargo clean && cargo build --release

Window not transparent

Requires X11 with a compositor (standard on GNOME/KDE).

Transcription slow

Use --release builds. Consider enabling CUDA if you have an NVIDIA GPU.
