appautomaton/tnt-asr


TNT 🧨

Terminal voice-to-text powered by Qwen3-ASR-0.6B via antirez/qwen-asr (pure C inference, no PyTorch).

Press Space to record, then Space again to stop and transcribe. Transcription runs locally, with no network calls.

Setup

Note

Requires Python 3.12+, uv, and a C compiler (for the ASR binary).

On macOS (Apple Silicon), the make blas build path uses Apple Accelerate for BLAS automatically. If the compiler tools are missing, run xcode-select --install.

Platform                                  Instructions
Linux / macOS                             Follow the steps below
Android proot (Termux + Debian/Ubuntu)    See README_android.md

Quick start

git clone https://github.com/appautomaton/tnt-asr.git
cd tnt-asr
uv sync
./bootstrap-qwen-asr.sh
uv run tnt

Important

model.safetensors is ~1.7 GB. The bootstrap script will download it on first run.

What does bootstrap do?
  • Uses vendored source at bin/qwen-asr/ to compile bin/qwen_asr
  • Downloads Qwen3-ASR-0.6B model files to bin/qwen3-asr-0.6b/
  • Everything stays inside the repo (bin/), no /tmp build step
  • bin/qwen-asr/ is intended to be retained and committed in this repository
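
To confirm that bootstrap has finished, it is enough to look for the two artifacts it produces. The sketch below is not part of the repository: missing_artifacts is a hypothetical helper, and it assumes that model.safetensors sits directly inside bin/qwen3-asr-0.6b/.

```python
from pathlib import Path

# Paths relative to the repo root, taken from the bootstrap description.
# Assumption: model.safetensors sits directly inside the model directory.
EXPECTED = (
    Path("bin/qwen_asr"),
    Path("bin/qwen3-asr-0.6b/model.safetensors"),
)


def missing_artifacts(repo_root: Path) -> list[str]:
    """Return the expected bootstrap outputs that are not present yet."""
    return [p.as_posix() for p in EXPECTED if not (repo_root / p).is_file()]
```

Calling missing_artifacts(Path.cwd()) from the repo root returns an empty list once both the binary and the weights are in place; any path it does return is a hint to re-run ./bootstrap-qwen-asr.sh.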

Run

uv run tnt

Keybindings

Key      Action
Space    Start / stop recording
c        Copy last transcript entry to clipboard
C        Copy all transcript entries to clipboard
x        Clear transcript
q        Quit
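
The Space toggle can be pictured as a small state machine. The following is a minimal sketch, not the code in app.py: the State names and the choice to ignore Space while a transcription is running are assumptions made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical state names; app.py may model this differently.
    IDLE = auto()
    RECORDING = auto()
    TRANSCRIBING = auto()


def on_space(state: State) -> State:
    """Space starts a recording, or stops it and hands off to transcription."""
    if state is State.IDLE:
        return State.RECORDING
    if state is State.RECORDING:
        return State.TRANSCRIBING
    # Assumption: Space is ignored while a transcription is in flight.
    return state


def on_transcription_done(state: State) -> State:
    """Return to IDLE once the transcriber has produced its text."""
    return State.IDLE if state is State.TRANSCRIBING else state
```

Keeping the transitions in pure functions like these makes the toggle easy to test without starting the TUI.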

Project structure

src/tnt/
├── app.py             # Textual TUI, state machine, keybindings
├── audio.py           # Capture backends (live + termux_api)
├── transcriber.py     # Subprocess wrapper for qwen_asr binary
└── widgets/
    ├── transcript.py  # Scrollable transcript log
    └── status.py      # Recording indicator + audio level visualizer
bin/
├── qwen-asr/          # Upstream C source snapshot (committed)
├── qwen_asr           # Compiled binary (gitignored)
└── qwen3-asr-0.6b/    # Model weights (gitignored)
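
As a rough illustration of what a subprocess wrapper around the binary might look like, here is a minimal sketch. It is not the contents of transcriber.py. build_command and run_transcriber are hypothetical names, the -d and -i flags are an assumption based on the upstream qwen-asr usage and should be checked against the compiled binary, and the sketch also assumes the transcript is printed to stdout.

```python
import subprocess
from pathlib import Path


def build_command(binary: Path, model_dir: Path, wav_path: Path) -> list[str]:
    # Assumption: "-d" (model directory) and "-i" (input WAV) match the
    # upstream qwen-asr CLI; confirm against the binary's own usage text.
    return [str(binary), "-d", str(model_dir), "-i", str(wav_path)]


def run_transcriber(command: list[str], timeout: float = 120.0) -> str:
    """Run the command and return stripped stdout; raise on a non-zero exit."""
    result = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr instead of inheriting
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on failure
        timeout=timeout,      # raise TimeoutExpired if the process hangs
    )
    return result.stdout.strip()
```

Passing the argument list directly, without a shell, avoids quoting problems when the recording path contains spaces.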

Notes

Tip

  • Audio format: 16 kHz, mono, 16-bit PCM, which is what the qwen_asr binary expects.
  • Inference is CPU-only via the C binary. No GPU, no PyTorch, no transformers.
  • The binary and model weights are gitignored. Each developer builds the binary and downloads the weights locally via bootstrap.
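
To make the audio format concrete, the sketch below writes samples in that layout using only the Python standard library. It is an illustration, not the code in audio.py: write_pcm16_wav and one_second_tone are hypothetical helpers, and the use of a WAV container is an assumption.

```python
import math
import struct
import wave

SAMPLE_RATE = 16_000  # 16 kHz
CHANNELS = 1          # mono
SAMPLE_WIDTH = 2      # bytes per sample, i.e. 16-bit PCM


def write_pcm16_wav(path: str, samples: list[float]) -> None:
    """Write float samples in [-1.0, 1.0] as a 16 kHz mono 16-bit WAV file."""
    # Clip first so out-of-range input cannot overflow the int16 range.
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    frames = struct.pack(f"<{len(ints)}h", *ints)  # little-endian int16
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)


def one_second_tone(freq_hz: float = 440.0) -> list[float]:
    """Generate one second of a sine tone, handy as a test signal."""
    step = 2 * math.pi * freq_hz / SAMPLE_RATE
    return [0.5 * math.sin(step * n) for n in range(SAMPLE_RATE)]
```

For example, write_pcm16_wav("tone.wav", one_second_tone()) produces a one-second file that can be used to check that a capture or conversion step yields the expected sample rate, channel count, and sample width.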

Third-party attribution

License

MIT. See LICENSE.
