# Whisper-Wait

Whisper-Wait is a real-time audio transcription CLI powered by OpenAI's Whisper, with a wide-screen-friendly TUI, local models, and OpenAI API support.

## Features
- Live recording loop: Press ENTER to start/stop, with a clear recording indicator
- Dual modes: Run local Whisper models or the OpenAI Whisper API
- TUI dashboard: Status, commands, and recent transcripts laid out for wide terminals
- Input device picker: Switch microphones without restarting
- Clipboard auto-copy: Copy transcripts automatically, with an in-app toggle
- Transcript history: Recent transcripts shown in a readable table (newest at bottom)
- Cost tracking (API): Per-clip estimate plus session total
- Safe defaults: .env support, chunking for Whisper limits, audio archiving
## Requirements

- Python 3.8+
- A working microphone
- Optional: CUDA-capable GPU for local models (CPU works too)
## Installation

1. Clone or download this repository.

2. Create a virtual environment:

   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Set up your API key (for API mode):

   - Create a `.env` file in the project root:

     ```bash
     OPENAI_API_KEY=your_api_key_here
     OPENAI_ORG_ID=your_org_id  # Optional
     ```

   - Or export it in your shell:

     ```bash
     export OPENAI_API_KEY="your_api_key_here"
     ```

5. Make the shell script executable:

   ```bash
   chmod +x whisper-wait.sh
   ```
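Either way of providing the key works because the app can check the shell environment first and fall back to the `.env` file. A minimal sketch of that lookup, using only the standard library (the helper name `load_api_key` is illustrative, not the project's actual code):

```python
import os
from pathlib import Path
from typing import Optional

def load_api_key(env_file: Path = Path(".env")) -> Optional[str]:
    """Return OPENAI_API_KEY from the shell environment, falling back to .env."""
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key
    if env_file.exists():
        for line in env_file.read_text().splitlines():
            line = line.strip()
            if line.startswith("OPENAI_API_KEY="):
                # Strip optional surrounding quotes, as in KEY="value"
                return line.split("=", 1)[1].strip().strip('"')
    return None
```

Real projects often use `python-dotenv` for this instead; the sketch just shows the precedence order.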
## Usage

Run without arguments to enter the menu and dashboard:

```bash
./whisper-wait.sh
```

You will:

- Choose OpenAI API or Local Model
- Select a Whisper model (local mode only)
- Enter the recording loop

From the dashboard:

| Key | Action |
| --- | --- |
| `ENTER` | Start recording |
| `h` | Show history (press ENTER to return) |
| `c` | Toggle auto-copy to clipboard |
| `d` | Choose input device |
| `m` | Change model (local mode only) |
| `q` | Quit |
Skip the menu by specifying a mode:

```bash
./whisper-wait.sh api
./whisper-wait.sh local
```

Specify a model directly:

```bash
./whisper-wait.sh local -m medium.en
```

Local models: `tiny`, `tiny.en`, `base`, `base.en`, `small`, `small.en`, `medium`, `medium.en`, `large`, `large-v2`, `large-v3`

API mode: uses `whisper-1`.
## How It Works

1. Recording: Press ENTER to start recording. Press ENTER again to stop.
2. Processing: Audio is saved to `~/audio_archive` with a UUID filename. Large files are split into smaller chunks.
3. Transcription:
   - Local mode: Runs the selected Whisper model on CPU/GPU.
   - API mode: Uploads audio to the OpenAI API; cost is estimated at $0.006/min.
4. Output: The transcript is displayed in a styled panel and optionally copied to your clipboard.
5. Loop: The dashboard returns for another recording (or `q` to quit).
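The cost estimate and chunk split above are simple arithmetic. A sketch, assuming a duration-based chunk limit (the function names and the 10-minute limit are illustrative; $0.006/min is the rate quoted above):

```python
API_RATE_PER_MIN = 0.006     # USD per minute of audio (whisper-1)
MAX_DURATION_SEC = 10 * 60   # illustrative per-chunk duration limit

def estimate_cost(duration_sec: float) -> float:
    """Estimated API cost in USD for a clip of the given length."""
    return round(duration_sec / 60 * API_RATE_PER_MIN, 4)

def chunk_count(duration_sec: float) -> int:
    """How many chunks a recording is split into before upload (ceiling division)."""
    return max(1, -(-int(duration_sec) // MAX_DURATION_SEC))

# A 90-second clip costs about $0.009 and fits in one chunk:
print(estimate_cost(90), chunk_count(90))
```

The real splitter also honors `MAX_SIZE_BYTES`, since the API limits upload size as well as duration.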
## Configuration

Edit `config.py` to customize:

- `SAMPLE_RATE`: Recording sample rate (default: 16000 Hz)
- `AUDIO_ARCHIVE_DIR`: Archive location (default: `~/audio_archive`)
- `DEVICE`: GPU/CPU selection (auto-detected)
- `DEFAULT_MODEL`: Default Whisper model (default: `medium.en`)
- `MAX_DURATION_SEC` / `MAX_SIZE_BYTES`: Chunk limits for Whisper
- `CHUNK_SIZE`: Audio frames per read
- `DEFAULT_AUTO_COPY`: Clipboard auto-copy default
- `HISTORY_PREVIEW_COUNT`: Rows shown in the dashboard history

History and cost logs are stored in `~/.whisper_wait`.
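The settings above might look roughly like this in `config.py`. The values marked "documented" match the defaults listed here; everything else is a plausible guess, not the project's actual file:

```python
from pathlib import Path

SAMPLE_RATE = 16000                         # documented default, in Hz
AUDIO_ARCHIVE_DIR = Path.home() / "audio_archive"  # documented default
DEFAULT_MODEL = "medium.en"                 # documented default
MAX_DURATION_SEC = 600                      # guess: per-chunk duration limit
MAX_SIZE_BYTES = 25 * 1024 * 1024           # guess: Whisper API caps uploads at 25 MB
CHUNK_SIZE = 1024                           # guess: audio frames per read
DEFAULT_AUTO_COPY = True                    # guess
HISTORY_PREVIEW_COUNT = 5                   # guess
# DEVICE is auto-detected at runtime, e.g. "cuda" if a GPU is available, else "cpu"
```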
## Shell Completion

For ZSH users, a completion script is included:

```zsh
# Add to your .zshrc:
fpath=(/path/to/my-rt-whisper $fpath)
autoload -Uz compinit && compinit
```

Or source it directly:

```zsh
source /path/to/my-rt-whisper/_whisper-wait
```

## Troubleshooting

**Missing API key (API mode)**
- Create a `.env` file with your API key or export it in your shell.

**Invalid model name**
- Ensure you're using one of the available models listed above.
- The interactive menu prevents this by restricting selection to valid choices.

**Out of memory (local mode)**
- Try a smaller model like `small.en` or `base.en`.
- Ensure CUDA is installed and your GPU has enough memory.

**No audio recorded**
- Check microphone permissions and the default input device.
- Verify `sounddevice` can access your microphone: `python -m sounddevice`

**Clipboard copy fails**
- Some environments do not provide clipboard access. Toggle auto-copy off with `c`.
## Project Structure

```
my-rt-whisper/
├── whisper_wait.py    # Main application
├── whisper-wait.sh    # Shell wrapper script
├── config.py          # Configuration constants
├── audio.py           # Audio recording/processing utilities
├── transcribe.py      # Transcription interfaces
├── requirements.txt   # Python dependencies
├── _whisper-wait      # ZSH completion script
└── .env               # Environment variables (create this)
```

Runtime state (not in repo):

- `~/audio_archive/` (audio recordings)
- `~/.whisper_wait/last_transcriptions.txt` (history)
- `~/.whisper_wait/session_costs.json` (API costs)
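The session cost log could be maintained with a small read-modify-write helper. A sketch, assuming a simple JSON shape of `{"total": ..., "clips": [...]}` (the actual file format used by `session_costs.json` may differ):

```python
import json
from pathlib import Path

COSTS_FILE = Path.home() / ".whisper_wait" / "session_costs.json"

def add_session_cost(cost_usd: float, costs_file: Path = COSTS_FILE) -> float:
    """Record one clip's estimated cost and return the running session total."""
    data = {"total": 0.0, "clips": []}
    if costs_file.exists():
        data = json.loads(costs_file.read_text())
    data["clips"].append(cost_usd)
    data["total"] = round(sum(data["clips"]), 4)
    costs_file.parent.mkdir(parents=True, exist_ok=True)
    costs_file.write_text(json.dumps(data))
    return data["total"]
```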
## License

This project uses OpenAI's Whisper model. Please refer to OpenAI's usage policies and the Whisper repository for licensing information.
## Acknowledgments

Built with:
- OpenAI Whisper
- Rich for terminal UI
- Questionary for interactive prompts
- Click for CLI framework