Fn Key Whisper Transcriber

A macOS push-to-talk dictation assistant powered by OpenAI Whisper. Hold a key to record, then release it to get the transcript where your cursor is.

✨ Highlights

Low-latency dictation – audio is captured only while the hotkey is held, and text appears 1–2 seconds after release.
Flexible hotkeys – Fn/F19, Command, Option, or raw key codes (e.g. code63), plus a toggle mode for keys that only emit release events.
Privacy aware – the microphone stream is opened only while recording.
Auto paste – transcripts are copied to the clipboard by default and can optionally be pasted (⌘V) into the focused app.
Debug friendly – --log-keys and --verbose show keyboard and API activity as it happens.
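
The hold-to-record and toggle behaviours listed above can be pictured as a small state machine. The sketch below is illustrative only: `PushToTalk` and its `start`/`stop` callbacks are hypothetical names, not taken from fn_whisper_transcribe.py, and the real script may be organised differently.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PushToTalk:
    """Hypothetical hotkey state machine (not the script's actual code)."""

    start: Callable[[], None]   # called when recording should begin
    stop: Callable[[], None]    # called when recording should end
    toggle: bool = False        # True mimics --toggle for release-only keys
    recording: bool = False

    def on_press(self) -> None:
        # Hold mode: the first key-down starts recording; key repeats are ignored.
        if not self.toggle and not self.recording:
            self.recording = True
            self.start()

    def on_release(self) -> None:
        if self.toggle:
            # Toggle mode: each release flips the state, so a key that only
            # reports release events can still start and stop a recording.
            self.recording = not self.recording
            (self.start if self.recording else self.stop)()
        elif self.recording:
            # Hold mode: releasing the key ends the recording.
            self.recording = False
            self.stop()


# Simulate holding the key (with one auto-repeat) and letting go.
events: List[str] = []
hold = PushToTalk(start=lambda: events.append("start"),
                  stop=lambda: events.append("stop"))
hold.on_press()
hold.on_press()
hold.on_release()
```

In a real listener, `start` would open the microphone stream and `stop` would close it and hand the clip to the transcriber, which is what keeps the microphone closed between recordings.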

✅ Requirements

OS: macOS (keyboard hooks and AppleScript integration are macOS-specific)
Python: 3.12 (enforced by the script)
OpenAI API Key: Whisper-capable key (recommended model gpt-4o-mini-transcribe)
Dependencies: see requirements.txt
System permissions (grant on first run):
- Privacy & Security ▸ Input Monitoring → Terminal / python3.12
- Privacy & Security ▸ Accessibility → Terminal / python3.12 (required for auto paste)
- Privacy & Security ▸ Microphone → Terminal / python3.12
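
To give a rough idea of what the API key and model are used for, here is a minimal sketch of sending one recorded clip for transcription. It assumes the official `openai` Python package is installed (whether requirements.txt lists it is not shown here), that audio is captured as 16-bit PCM, and that 16 kHz mono is an acceptable sample format; `pcm16_to_wav` and `transcribe` are hypothetical helper names, not functions from the script.

```python
import io
import wave


def pcm16_to_wav(pcm: bytes, samplerate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw 16-bit PCM in a WAV container, entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)           # 2 bytes per sample = 16-bit audio
        wav.setframerate(samplerate)  # assumed rate, not confirmed by the README
        wav.writeframes(pcm)
    return buf.getvalue()


def transcribe(wav_bytes: bytes, model: str = "gpt-4o-mini-transcribe",
               prompt: str = "") -> str:
    """Send one clip to the OpenAI transcription endpoint and return the text."""
    # Imported lazily so the WAV helper works without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    extra = {"prompt": prompt} if prompt else {}
    result = client.audio.transcriptions.create(
        model=model,
        file=("clip.wav", wav_bytes),  # (filename, content) tuple
        **extra,
    )
    return result.text
```

A call such as `transcribe(pcm16_to_wav(raw_audio), prompt="Meeting notes")` would then return the transcript as a plain string, where `raw_audio` stands for whatever bytes the recorder produced.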

🚀 Quick Start

Create a virtualenv and install deps

python3.12 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Configure .env (auto-loaded if present)
```
OPENAI_API_KEY=sk-xxxxx
WHISPER_MODEL=gpt-4o-mini-transcribe
HOTKEY_LISTENER=auto
```
Use plain ASCII characters and remove any smart quotes. Alternatively, run export OPENAI_API_KEY=... in your shell.
Optional: remap Fn to F19
- Install Karabiner-Elements
- Add a Simple Modification fn → f19
- Or skip remap and use --hotkey code63 (Fn virtual key on macOS)

Run the script

source .venv/bin/activate
python3.12 fn_whisper_transcribe.py \
  --hotkey option \
  --listener pynput \
  --model gpt-4o-mini-transcribe \
  --auto-paste \
  --verbose

Place the cursor in any text field
Hold the hotkey, speak, release → [HH:MM:SS] ... appears and text pastes automatically
Stop with Ctrl+C
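
The .env step above is sensitive to smart quotes, so it can help to see how such a file might be read. The following is a minimal sketch of an optional .env loader, not the script's actual one (which may well use a library such as python-dotenv). `parse_env` and `load_env` are hypothetical names, and converting curly quotes to ASCII instead of rejecting them is a choice made for this example.

```python
import os

# Curly quotes that word processors substitute for plain ASCII quotes.
SMART_QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"',
                              "\u2018": "'", "\u2019": "'"})


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.translate(SMART_QUOTES).strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so KEY="value" and KEY=value are equivalent.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_env(path: str = ".env") -> None:
    """Copy parsed values into os.environ without overriding existing ones."""
    try:
        with open(path, encoding="utf-8") as fh:
            for key, value in parse_env(fh.read()).items():
                os.environ.setdefault(key, value)
    except FileNotFoundError:
        pass  # the file is optional, matching "auto-loaded if present"
```

Because `load_env` uses `setdefault`, a variable exported in the shell takes precedence over the same key in the file.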

🔧 Common Flags

python3.12 fn_whisper_transcribe.py [flags]
  --hotkey f19                 # Default hotkey; swap to cmd / option / code63 etc.
  --listener pynput            # Hotkey backend (auto/pynput/keyboard)
  --toggle                     # Tap to start/stop for release-only keys
  --min-duration 0.25          # Ignore clips shorter than X seconds
  --block-duration 0.05        # Audio block size (smaller = lower latency, higher CPU)
  --device "MacBook Pro Microphone"   # Explicit input device
  --channels 1                 # Channel count (set 2 for stereo mics)
  --prompt "Meeting notes"     # Prompt to bias transcription
  --env-file configs/demo.env  # Extra env file
  --auto-paste                 # Copy and immediately send ⌘V (needs Accessibility)
  --no-clipboard               # Skip copying to clipboard
  --log-keys                   # Trace all key events
  --verbose                    # Verbose logging
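
For readers who want to see how flags like these fit together, here is a hedged sketch of an argparse setup covering a subset of them. It is a reconstruction for illustration, not the script's real parser: only the f19 default for --hotkey is stated above, so the other defaults, plus the helper names `parse_hotkey`, `build_parser`, and `long_enough`, are assumptions.

```python
import argparse


def parse_hotkey(spec: str) -> tuple[str, object]:
    """Turn 'code63' into ('code', 63) and names like 'f19' into ('name', 'f19')."""
    spec = spec.strip().lower()
    if spec.startswith("code") and spec[4:].isdigit():
        return ("code", int(spec[4:]))
    return ("name", spec)


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="fn_whisper_transcribe.py")
    p.add_argument("--hotkey", type=parse_hotkey, default=parse_hotkey("f19"))
    p.add_argument("--listener", choices=["auto", "pynput", "keyboard"],
                   default="auto")                                  # assumed default
    p.add_argument("--toggle", action="store_true")
    p.add_argument("--min-duration", type=float, default=0.25)     # assumed default
    p.add_argument("--block-duration", type=float, default=0.05)   # assumed default
    p.add_argument("--auto-paste", action="store_true")
    p.add_argument("--no-clipboard", action="store_true")
    return p


def long_enough(frames: int, samplerate: int, min_duration: float) -> bool:
    """Apply the --min-duration rule: drop clips shorter than the threshold."""
    return frames / samplerate >= min_duration


args = build_parser().parse_args(["--hotkey", "code63", "--toggle"])
```

With these assumptions, `args.hotkey` is `("code", 63)` and `args.toggle` is `True`, which corresponds to the `--hotkey code63 --toggle` combination suggested for an unmapped Fn key.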

📝 Permissions & Hotkey Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| “This process is not trusted” | Input Monitoring / Accessibility not granted | Grant Terminal (or `.venv/bin/python3.12`) under Privacy & Security; run `tccutil reset InputMonitoring com.apple.Terminal` if stuck |
| Fn key ignored | Not remapped / release-only event | Map to F19 via Karabiner or use `--hotkey code63 --toggle` |
| Mic icon appears immediately | Other service (e.g. Willow) occupies mic | Stop related agents or reboot |
| 404 Invalid URL | Model doesn't support audio | Use `--model gpt-4o-mini-transcribe` or update `.env` |
| Auto paste fails | Accessibility or AppleScript blocked | Allow Terminal and `/usr/bin/osascript` under Accessibility (“Control your computer”) |
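
The auto-paste row is easier to debug with a picture of what the paste step probably involves. The sketch below is an assumption about the mechanism, not the script's confirmed code: it copies text with the macOS `pbcopy` command and then asks System Events, via `/usr/bin/osascript`, to press ⌘V. `paste_command` and `copy_and_paste` are hypothetical names.

```python
import subprocess

# AppleScript that asks System Events to press Command-V in the frontmost app.
PASTE_SCRIPT = 'tell application "System Events" to keystroke "v" using command down'


def paste_command() -> list[str]:
    """Build the osascript invocation; kept separate so it can be inspected."""
    return ["/usr/bin/osascript", "-e", PASTE_SCRIPT]


def copy_and_paste(text: str, auto_paste: bool = True) -> None:
    """Copy text with pbcopy, then optionally trigger a paste (macOS only)."""
    subprocess.run(["pbcopy"], input=text.encode("utf-8"), check=True)
    if auto_paste:
        # Fails with a non-zero exit status if Accessibility access is missing.
        subprocess.run(paste_command(), check=True)
```

Running the command returned by `paste_command()` by hand in Terminal is a quick way to check whether the Accessibility grant is in place, independent of the transcriber.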

🧪 Development Tips

Compile check: python3.12 -m compileall fn_whisper_transcribe.py

Keyboard event probe:

python3.12 - <<'PY'
from pynput import keyboard
print("Listening (Esc to exit)...")
def on_press(key): print("down:", key)
def on_release(key): print("up:", key); return key == keyboard.Key.esc
with keyboard.Listener(on_press=on_press, on_release=on_release) as listener:
    listener.join()
PY

Fall back to the keyboard backend (requires sudo): python3.12 fn_whisper_transcribe.py --listener keyboard

📄 License

No license is included yet. Add a LICENSE file (MIT, Apache-2.0, etc.) before publishing publicly.

🙋 Support

Open an issue for hotkey compatibility, permission hurdles, model latency, or quota concerns. Feel free to fork and extend the project, for example with live captions or a local Whisper backend.
