WhisperEcho

A macOS menu bar app for real-time speech-to-text using MLX Whisper

Press a hotkey, speak, and the transcribed text is automatically pasted into the active input field.

Download Latest Release


✨ Features

  • 🎙️ Menu Bar App: lives in the macOS menu bar, always accessible
  • ⌨️ Hotkey Triggered: configurable keyboard shortcut (default: Option+Tab)
  • 🔄 Push-to-Talk or Toggle Mode: hold the hotkey, or press once to start and again to stop
  • 🔒 On-Device Transcription: runs locally via MLX Whisper, no cloud API needed
  • 📋 Auto-Paste: transcribed text is automatically inserted into the focused input field
  • 💬 Floating Bubble UI: visual feedback with waveform animation during recording
  • 🌏 Multi-Language Support: optimized for Traditional Chinese with English code-switching
  • ⚡ Auto-Setup: automatically creates a Python venv and installs dependencies on first launch

🖥️ How It Works

┌─────────────────────────────────────────────────────────────┐
│  ⌥ Tab                                                      │
│  ──────                                                     │
│                                                             │
│  1. Press Option+Tab    ──→  🎙️ Recording starts            │
│                              ┌───────────────────┐          │
│                              │  ● Recording...   │          │
│                              │  ≋≋≋≋≋≋≋≋≋≋≋≋≋≋  │          │
│                              └───────────────────┘          │
│                                                             │
│  2. Speak               ──→  🔊 Audio captured locally      │
│                                                             │
│  3. Release / Press     ──→  ⚙️ MLX Whisper transcribes     │
│                              (on-device, no cloud)          │
│                                                             │
│  4. Done                ──→  📋 Text auto-pasted to cursor  │
│                              (Cmd+V via CGEvent)            │
└─────────────────────────────────────────────────────────────┘

🚀 Quick Start

Option 1: Download Release

  1. Download WhisperEcho-v1.0.0.zip from Releases
  2. Extract and move WhisperEcho.app to /Applications
  3. Launch WhisperEcho

Option 2: Build from Source

# Install XcodeGen if not already installed
brew install xcodegen

# Clone the repository
git clone https://github.com/tsungtwu/whisper-echo.git
cd whisper-echo

# Build and install
make install

📦 Makefile Commands

| Command | Description |
|---|---|
| make build | Debug build |
| make run | Debug build + launch |
| make release | Release build + package zip to release/ |
| make install | Release build + reset permissions + copy to /Applications |
| make clean | Clean build artifacts |
| make reset-permissions | Reset app permissions only |

🔐 First Launch

  1. Launch WhisperEcho (it appears in the menu bar)
  2. Grant Microphone permission when prompted
  3. Grant Accessibility permission when prompted (required for auto-paste)
  4. Add WhisperEcho to Input Monitoring in System Settings > Privacy & Security (required for hotkey)
  5. The app will automatically create ~/.mlx-whisper-env and install mlx-whisper
  6. Wait for the model to download (first time only)

Note: When updating to a new build, make install automatically resets permissions to avoid stale entries. You will need to grant the permissions again on the first launch after an update.
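
Step 5 above can be pictured roughly as follows. This is a minimal Python sketch, not the app's actual Swift code: the helper names venv_python and setup_commands are hypothetical, and the functions only build the commands instead of running them. The path ~/.mlx-whisper-env and the package name mlx-whisper are taken from the steps above; the "python3" base interpreter is an assumption.

```python
from pathlib import Path

ENV_DIR = Path.home() / ".mlx-whisper-env"  # location named in step 5


def venv_python(env_dir: Path = ENV_DIR) -> Path:
    """Path of the interpreter inside the venv (macOS/POSIX layout)."""
    return env_dir / "bin" / "python"


def setup_commands(base_python: str = "python3", env_dir: Path = ENV_DIR) -> list:
    """Commands a first-launch setup might run (hypothetical helper).

    Returns an empty list when the venv interpreter already exists,
    so later launches skip the setup entirely.
    """
    if venv_python(env_dir).exists():
        return []
    return [
        # Create the virtual environment with the base interpreter.
        [base_python, "-m", "venv", str(env_dir)],
        # Install the transcription package into that environment.
        [str(venv_python(env_dir)), "-m", "pip", "install", "mlx-whisper"],
    ]
```

Each returned command could be passed to subprocess.run(cmd, check=True). Checking for the venv interpreter first keeps the setup idempotent.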


⌨️ Usage

  1. Click the menu bar icon to see status and settings
  2. Press the hotkey (default: Option+Tab) to start recording
  3. Speak: the floating bubble shows a waveform animation
  4. Release (push-to-talk) or press again (toggle) to stop
  5. Text is transcribed and automatically pasted into the active input field
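
The difference between the two modes in step 4 comes down to which hotkey events start and stop a recording. The sketch below models that in Python under stated assumptions: the class and method names (Mode, HotkeySession, on_press, on_release) are hypothetical and are not taken from the app's source.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    PUSH_TO_TALK = "push-to-talk"
    TOGGLE = "toggle"


class HotkeySession:
    """Tracks whether recording is active, given hotkey press/release events."""

    def __init__(self, mode: Mode = Mode.TOGGLE) -> None:
        self.mode = mode
        self.recording = False

    def on_press(self) -> Optional[str]:
        """Return "start", "stop", or None if the press changes nothing."""
        if self.mode is Mode.PUSH_TO_TALK:
            if self.recording:
                return None  # key repeat while held: ignore
            self.recording = True
            return "start"
        # Toggle mode: every press flips the state.
        self.recording = not self.recording
        return "start" if self.recording else "stop"

    def on_release(self) -> Optional[str]:
        """Releasing the key only matters in push-to-talk mode."""
        if self.mode is Mode.PUSH_TO_TALK and self.recording:
            self.recording = False
            return "stop"
        return None
```

In this model a "stop" result is the point where transcription and the paste in step 5 would be triggered.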

⚙️ Settings

| Setting | Options | Default |
|---|---|---|
| Language | English, Traditional Chinese, Japanese, Korean, ... | Traditional Chinese |
| Model | whisper-large-v3-turbo, large-v3, medium, small, base, tiny | large-v3-turbo |
| Mode | Push-to-talk, Toggle | Toggle |
| Hotkey | Any modifier + key combo | Option+Tab |
| Launch at Login | On/Off | Off |
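
To make the defaults concrete, here is a small Python sketch of a settings record that mirrors the table. It is illustrative only: the app persists its settings with UserDefaults in Swift, so the JSON storage, the field names, and the load_settings helper are all assumptions made for this example.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Settings:
    # Defaults copied from the table above.
    language: str = "Traditional Chinese"
    model: str = "large-v3-turbo"
    mode: str = "Toggle"
    hotkey: str = "Option+Tab"
    launch_at_login: bool = False


def load_settings(raw: str) -> Settings:
    """Overlay stored JSON on the defaults, ignoring unknown keys."""
    stored = json.loads(raw) if raw else {}
    known = {f.name for f in fields(Settings)}
    return Settings(**{k: v for k, v in stored.items() if k in known})
```

For example, load_settings('{"mode": "Push-to-talk"}') changes only the mode and leaves every other field at its default.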

🏗️ Architecture

WhisperEcho/
├── WhisperEchoApp.swift          # App entry point (MenuBarExtra)
├── AppCoordinator.swift          # Central orchestrator
├── AppState.swift                # Observable state machine
├── Audio/
│   ├── AudioRecorder.swift       # AVAudioEngine recording
│   └── AudioLevelMonitor.swift   # Audio level tracking
├── Transcription/
│   ├── PythonBridge.swift        # Python venv setup & PythonKit init
│   ├── PythonThread.swift        # Dedicated thread for Python GIL
│   └── WhisperEngine.swift       # MLX Whisper transcription
├── Hotkey/
│   └── HotkeyManager.swift       # CGEvent tap for global hotkey
├── UI/
│   ├── MenuBarView.swift         # Menu bar dropdown UI
│   ├── BubbleWindow.swift        # Floating bubble window
│   ├── BubbleView.swift          # Bubble content view
│   └── WaveformView.swift        # Waveform animation
├── Output/
│   └── TextOutputManager.swift   # Clipboard + Cmd+V paste
├── Settings/
│   └── Settings.swift            # UserDefaults persistence
├── Permissions/
│   └── PermissionChecker.swift   # Permission checks & openers
└── Resources/
    ├── Info.plist
    ├── WhisperEcho.entitlements
    └── Assets.xcassets/
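
The "dedicated thread" noted for PythonThread.swift is a general pattern: route every call into the interpreter through one long-lived worker, so callers on other threads never touch it directly. The Python sketch below illustrates that pattern only. The DedicatedThread class and its methods are hypothetical names, and the real Swift implementation may differ.

```python
import queue
import threading
from concurrent.futures import Future


class DedicatedThread:
    """Runs every submitted callable on one long-lived worker thread."""

    def __init__(self) -> None:
        self._jobs: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel: stop the worker
                break
            fn, args, future = job
            try:
                future.set_result(fn(*args))
            except Exception as exc:  # hand the error back to the caller
                future.set_exception(exc)

    def submit(self, fn, *args) -> Future:
        """Queue fn(*args) for the worker; the Future carries the result."""
        future: Future = Future()
        self._jobs.put((fn, args, future))
        return future

    def stop(self) -> None:
        """Ask the worker to exit after finishing the queued jobs."""
        self._jobs.put(None)
        self._thread.join()
```

A caller would write worker.submit(transcribe, path).result(), where transcribe stands in for whatever function wraps the model call. Jobs run strictly one at a time, in submission order.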

Tech Stack

  • Language: Swift 5.9
  • UI: SwiftUI + MenuBarExtra
  • Audio: AVAudioEngine (16kHz mono float32)
  • Transcription: MLX Whisper via PythonKit
  • Hotkey: CGEvent tap
  • Build: XcodeGen + xcodebuild
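
For a feel of what "16kHz mono float32" audio looks like as data, the Python sketch below builds a one-second test tone in that format and computes its duration and RMS level. RMS is one common way to drive a level meter or waveform display; whether AudioLevelMonitor.swift uses it is an assumption, and the function names here are hypothetical.

```python
import math
from array import array

SAMPLE_RATE = 16_000  # Hz, mono, float32 (from the Tech Stack list)


def duration_seconds(samples: array) -> float:
    """Length of a mono buffer in seconds."""
    return len(samples) / SAMPLE_RATE


def rms_level(samples: array) -> float:
    """Root-mean-square amplitude: 0.0 for silence or an empty buffer."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# One second of a full-scale 440 Hz sine, stored as float32 ("f").
tone = array(
    "f",
    (math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)),
)
```

Samples in this format are expected to lie in the range -1.0 to 1.0, so a full-scale sine has an RMS level close to 0.707.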

📋 Requirements

  • macOS 14 (Sonoma) or later
  • Apple Silicon Mac (M1/M2/M3/M4)
  • Python 3.9+ installed (Homebrew or system)
  • Xcode 15+ and XcodeGen (for building from source)

📄 License

MIT License - see LICENSE for details.


👨‍💻 Author

tsungtwu - @tsungtwu

Built with assistance from Claude Code 🤖


⭐ Star this repo if you find it useful!
