# ghisper

Real-time speech-to-text, combining a Go client with a SimulStreaming Python backend.

ghisper splits speech recognition into two parts:
- Go client: Audio capture, typing automation, system integration
- Python backend: SimulStreaming ASR engine with AlignAtt policy
## Features

- Real-time transcription streaming
- Multi-language support (100+ languages via Whisper)
- Progressive typing via uinputd-go
- Low latency (~200-500ms)
- Unix socket IPC for minimal overhead
- Interactive installation with GPU detection
## Architecture

```
Go Client (ghisper)
- Audio capture (malgo)
- Unix socket client
- Progressive typing (uinputd-go)
        |
        v  Unix socket
Python Backend (user systemd service)
- SimulStreaming ASR
- Whisper (tiny → large-v3-turbo) + AlignAtt
- HuggingFace model browser
```
## Installation

Requires Go 1.23+ and Python 3.10+.

```sh
# System packages (example for Arch Linux)
sudo pacman -S python git

# Build Go client
make build

# Install to ~/.local/bin (user-local, no sudo)
make install

# Install Python backend (interactive)
ghisper install backend
#   - Detects GPU (NVIDIA/AMD/none)
#   - Choose PyTorch variant (CPU/CUDA/ROCm)
#   - Select Whisper model (tiny → large-v3-turbo)
#   - Creates venv in ~/.local/share/ghisper/venv
#   - Generates config at ~/.config/ghisper/config.toml

# Install and start systemd service
ghisper install systemd-service
```

## Usage

```sh
# Check system status
ghisper status

# Start recording (press 'r' or Space to toggle)
ghisper record

# Stop all sessions
ghisper stop

# Run health checks
ghisper doctor
```

## Configuration

ghisper reads its configuration from `~/.config/ghisper/config.toml`:
```toml
[server]
type = "unix"
socket_path = "/tmp/ghisper.sock"

[model]
name = "base"
device = "auto"  # auto, cpu, cuda, rocm

[processing]
language = ""  # empty = auto-detect
task = "transcribe"

[client.audio]
device = "default"
chunk_size_ms = 100

[client.typing]
enabled = true
layout = "us"
progressive = true

[logging]
level = "info"
```

## Development

```sh
make build      # Build to bin/ghisper
make install    # Install to ~/.local/bin
make uninstall  # Remove binary and backend
make purge      # Full cleanup (config + models)
make check      # Format, vet, test
```

## Project layout

```
ghisper/
├── cmd/ghisper/          # CLI commands
├── internal/
│   ├── audio/            # Audio capture (malgo)
│   ├── client/           # Backend client
│   ├── typer/            # Typing (uinputd-go)
│   ├── config/           # Config management
│   ├── models/           # Model registry
│   ├── protocol/         # IPC protocol
│   ├── installer/        # Installation logic
│   └── doctor/           # Health checks
└── backend/              # Python backend
    ├── server.py         # SimulStreaming server
    ├── config.py         # Config parser
    └── convert_model.py  # HF → Whisper converter
```
## Dependencies

Go:

- github.com/gen2brain/malgo - Audio capture
- github.com/bnema/uinputd-go - Keyboard typing
- github.com/spf13/cobra - CLI framework
- github.com/charmbracelet/* - Terminal UI
- go-huggingface - Model downloads

Python:

- torch - Deep learning backend
- openai-whisper - ASR model
- SimulStreaming - Streaming ASR engine
## License

MIT
## Related projects

- SimulStreaming: https://github.com/ufal/SimulStreaming
- uinputd-go: https://github.com/bnema/uinputd-go
- voxd: https://github.com/jakov-nordic/voxd