A macOS push-to-talk assistant powered by OpenAI Whisper. Press a key to record, release to get transcript where your cursor is.
- Low-latency dictation – capture when the hotkey is held, see text 1–2 seconds after release.
- Flexible hotkeys – Fn/F19, Command, Option, raw key codes (e.g.
code63) and toggle mode for keys with release-only events. - Privacy aware – microphone stream is opened only while recording.
- Auto paste – transcripts go to the clipboard by default, optionally auto
⌘Vinto the focused app. - Debug friendly –
--log-keysand--verboseexpose keyboard and API activity instantly.
- OS: macOS (keyboard hooks and AppleScript integration are macOS-specific)
- Python: 3.12 (enforced by the script)
- OpenAI API Key: Whisper-capable key (recommended model
gpt-4o-mini-transcribe) - Dependencies: see
requirements.txt - System permissions (grant on first run):
- Privacy & Security ▸ Input Monitoring → Terminal / python3.12
- Privacy & Security ▸ Accessibility → Terminal / python3.12 (required for auto paste)
- Privacy & Security ▸ Microphone → Terminal / python3.12
-
Create a virtualenv and install deps
python3.12 -m venv .venv source .venv/bin/activate pip install -r requirements.txt -
Configure
.env(auto-loaded if present)OPENAI_API_KEY=sk-xxxxx WHISPER_MODEL=gpt-4o-mini-transcribe HOTKEY_LISTENER=auto
Use plain ASCII characters—remove smart quotes. Alternatively
export OPENAI_API_KEY=.... -
Optional: remap Fn to F19
- Install Karabiner-Elements
- Add a Simple Modification
fn → f19 - Or skip remap and use
--hotkey code63(Fn virtual key on macOS)
-
Run the script
source .venv/bin/activate python3.12 fn_whisper_transcribe.py \ --hotkey option \ --listener pynput \ --model gpt-4o-mini-transcribe \ --auto-paste \ --verbose- Place the cursor in any text field
- Hold the hotkey, speak, release →
[HH:MM:SS] ...appears and text pastes automatically - Stop with
Ctrl+C
python3.12 fn_whisper_transcribe.py \
--hotkey f19 # Default hotkey; swap to cmd / option / code63 etc.
--listener pynput # Hotkey backend (auto/pynput/keyboard)
--toggle # Tap to start/stop for release-only keys
--min-duration 0.25 # Ignore clips shorter than X seconds
--block-duration 0.05 # Audio block size (smaller = lower latency, higher CPU)
--device "MacBook Pro Microphone" # Explicit input device
--channels 1 # Channel count (set 2 for stereo mics)
--prompt "Meeting notes" # Prompt to bias transcription
--env-file configs/demo.env # Extra env file
--auto-paste # Copy and immediately send ⌘V (needs Accessibility)
--no-clipboard # Skip copying to clipboard
--log-keys # Trace all key events
--verbose # Verbose logging| Symptom | Likely cause | Fix |
|---|---|---|
| “This process is not trusted” | Input Monitoring / Accessibility not granted | Grant Terminal (or .venv/bin/python3.12) under Privacy & Security; run tccutil reset InputMonitoring com.apple.Terminal if stuck |
| Fn key ignored | Not remapped / release-only event | Map to F19 via Karabiner or use --hotkey code63 --toggle |
| Mic icon appears immediately | Other service (e.g. Willow) occupies mic | Stop related agents or reboot |
| 404 Invalid URL | Model doesn't support audio | Use --model gpt-4o-mini-transcribe or update .env |
| Auto paste fails | Accessibility or AppleScript blocked | Allow Terminal and /usr/bin/osascript under Accessibility (“Control your computer”) |
- Compile check:
python3.12 -m compileall fn_whisper_transcribe.py - Keyboard event probe:
python3.12 - <<'PY' from pynput import keyboard print("Listening (Esc to exit)...") def on_press(key): print("down:", key) def on_release(key): print("up:", key); return key == keyboard.Key.esc with keyboard.Listener(on_press=on_press, on_release=on_release) as listener: listener.join() PY
- Fallback to
keyboardbackend (requires sudo):python3.12 fn_whisper_transcribe.py --listener keyboard
No license is included yet. Add a LICENSE file (MIT, Apache-2.0, etc.) before publishing publicly.
Open an issue for hotkey compatibility, permission hurdles, model latency, or quota concerns. Feel free to fork and extend—e.g. live captions or local Whisper backends.