Your voice, your Mac, your privacy. Open-source dictation powered by AI.
Speak. It types. 100% offline, open-source voice-to-text for macOS - powered by WhisperKit. No cloud, no subscriptions, no data leaves your device. Just hold a hotkey, speak, and your words appear wherever your cursor is.
- 🔒 100% Local - All audio processing happens on your machine. No internet required — the Tiny model ships bundled and works out of the box offline.
- ⌨️ System-Wide Text Injection - Transcribed text is typed wherever your cursor is: browsers, Slack, VS Code, spreadsheets, terminals - everywhere.
- 🎯 Push-to-Talk - Hold a hotkey (default: Right Option) to record. Release to transcribe.
- 👆 Double-Tap Toggle - Double-tap the hotkey to start/stop recording.
- 🧠 Smart Model Selection - Auto-detects your hardware (Apple Silicon/Intel, RAM) and recommends the best whisper model via WhisperKit.
- ⚡ Native Apple Acceleration - CoreML + Metal + Neural Engine acceleration on Apple Silicon. No manual setup.
- 📊 Visual Feedback - Menu bar icon changes color during recording and processing. Audio level indicator shows input.
- 🔄 Auto-Updates - Built-in update checker queries GitHub Releases on launch and lets you download and install the latest version in one click from within the app.
- ⚙️ Configurable - Choose hotkeys, models, languages, silence detection thresholds, and more.
Menu bar popover with status and controls
Menu bar icon: idle (left) and recording (right)
Settings: General tab (left) and Models tab with resource monitoring (right)
Settings: Audio tab (left) and About tab (right)
Floating mic indicator near text cursor during recording
VocaMac uses WhisperKit instead of raw whisper.cpp because:
| | WhisperKit | whisper.cpp |
|---|---|---|
| Language | Pure Swift (native) | C++ (requires bridging) |
| Apple Silicon | CoreML + Neural Engine | Metal only |
| SPM Integration | One-line dependency | Complex vendoring |
| Model Format | CoreML (optimized per device) | GGML (generic) |
| Streaming | First-class async/await | Manual threading |
| Quality | Same OpenAI Whisper models | Same OpenAI Whisper models |
| Maintenance | Argmax Inc. (commercial) | Community |
Same accuracy, dramatically better Apple platform integration.
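That first-class async/await integration looks roughly like the sketch below. This is illustrative only: the exact initializer and the shape of the returned results vary across WhisperKit versions, and `recording.wav` is a placeholder path.

```swift
import WhisperKit

Task {
    // Load a model by name; WhisperKit downloads and caches it on first use.
    let pipe = try await WhisperKit(model: "tiny")

    // Transcribe an audio file asynchronously; no manual threading required.
    let results = try await pipe.transcribe(audioPath: "recording.wav")
    print(results.map(\.text).joined(separator: " "))
}
```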
- macOS 13 (Ventura) or later
- Apple Silicon (M1/M2/M3/M4)
- Xcode 15+ or Swift 5.9+ (only for building from source)
VocaMac requires three macOS permissions:
| Permission | Why |
|---|---|
| Microphone | Capture your voice for transcription |
| Accessibility | Global hotkeys and text injection into apps |
| Input Monitoring | Detect hotkey presses system-wide |
Note: After granting Input Monitoring, a restart of VocaMac is required for it to take effect.
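For the curious, macOS exposes a system API for each of these checks. A minimal sketch of how an app can query all three permissions (these are standard Apple APIs, but this is not necessarily how VocaMac structures its checks):

```swift
import AVFoundation
import ApplicationServices
import CoreGraphics

// Microphone: current authorization status for audio capture.
let micAuthorized = AVCaptureDevice.authorizationStatus(for: .audio) == .authorized

// Accessibility: true once the app is trusted (needed for text injection).
let axTrusted = AXIsProcessTrusted()

// Input Monitoring: preflight check for listening to global key events.
let canListen = CGPreflightListenEventAccess()

print("mic: \(micAuthorized), accessibility: \(axTrusted), input monitoring: \(canListen)")
```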
- Download the latest `VocaMac-x.x.x-arm64.dmg` from the Releases page
- Open the DMG and drag VocaMac to Applications
- Open VocaMac from Applications
- Grant permissions: Microphone, Accessibility, and Input Monitoring when prompted
VocaMac is Developer ID signed and notarized by Apple — macOS will open it without any security warnings.
```sh
git clone https://github.com/jatinkrmalik/vocamac.git
cd vocamac
make install
```

This builds VocaMac, installs it to /Applications, and launches it. Permissions are granted directly to VocaMac, just like the DMG method.
```sh
git clone https://github.com/jatinkrmalik/vocamac.git
cd vocamac
make install-cli
```

This installs two commands to `~/.local/bin`:

- `vocamac &` - Launch VocaMac in the background
- `vocamac-build` - Rebuild from source after pulling updates
Permissions note: In CLI mode, macOS assigns permissions to your terminal app (Terminal, iTerm2, etc.) rather than VocaMac itself. Grant Microphone, Accessibility, and Input Monitoring to your terminal app instead.
- VocaMac appears in your menu bar (microphone icon, no Dock icon)
- Grant permissions: Microphone, Accessibility, and Input Monitoring (see Permissions above)
- First model download: WhisperKit automatically downloads the recommended model for your device (~40–500 MB depending on hardware)
- Start dictating: Hold the Right Option key, speak, and release. Your words appear at the cursor!
Nightly builds are automated builds from the latest main branch, published every day at midnight UTC when there are new commits. They let you try the latest features, fixes, and improvements before they land in a stable release.
Why use a nightly build?
- Early access — Test new features days or weeks before the next stable release
- Help improve VocaMac — Your feedback on nightly builds catches bugs before they reach everyone
- Fully signed & notarized — Nightly builds are Developer ID signed and notarized by Apple, just like stable releases. No Gatekeeper warnings, no right-click workarounds
How to install:
- Download the latest `VocaMac-nightly-*.dmg` from the Nightly Release
- Open the DMG and drag VocaMac to Applications
- Grant permissions when prompted (same as a stable release)
How to identify your build:
Nightly builds embed the date and commit SHA in the version string. Open Settings → About to see something like:
Version 0.5.0-nightly.20260414+abc1234 (Nightly)
This helps us pinpoint the exact code you're running if you report an issue.
Cadence & stability:
| | Stable Release | Nightly Build |
|---|---|---|
| Frequency | When ready (manual tag) | Daily at midnight UTC |
| Source | Tagged commit | Latest main branch |
| Signed & notarized | ✅ Yes | ✅ Yes |
| Stability | Production-ready | May contain incomplete features or bugs |
| Best for | Daily use | Testing & early feedback |
⚠️ Nightly builds may be unstable. If you encounter issues, please open a bug report — your feedback helps us ship better stable releases!
| Action | What Happens |
|---|---|
| Hold Right Option | Recording starts (menu bar icon turns red) |
| Speak | Audio is captured locally |
| Release Right Option | Recording stops → transcription → text injected at cursor |
| Action | What Happens |
|---|---|
| Double-tap Right Option | Recording starts |
| Speak | Audio is captured |
| Double-tap Right Option again | Recording stops → transcription → text injection |
Switch between modes in Settings → General → Activation.
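The double-tap mode hinges on simple press timing. Here is a self-contained sketch of the idea: two presses within a short window count as a double-tap. The 0.4 s threshold is illustrative, not VocaMac's actual setting, and the real `HotKeyManager` works on CGEventTap events rather than raw timestamps.

```swift
import Foundation

/// Minimal double-tap detector: two presses of the same key within
/// `threshold` seconds count as a double-tap (illustrative values).
struct DoubleTapDetector {
    let threshold: TimeInterval = 0.4
    private var lastPress: TimeInterval = -.infinity

    mutating func registerPress(at time: TimeInterval) -> Bool {
        let isDoubleTap = (time - lastPress) <= threshold
        // Reset after a successful double-tap so a third press starts fresh.
        lastPress = isDoubleTap ? -.infinity : time
        return isDoubleTap
    }
}

var detector = DoubleTapDetector()
print(detector.registerPress(at: 0.0))  // false: first tap
print(detector.registerPress(at: 0.2))  // true: within 0.4 s of the first
print(detector.registerPress(at: 0.5))  // false: state was reset
```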
VocaMac uses OpenAI Whisper models via WhisperKit's CoreML format. The app auto-detects your hardware and recommends the best model:
| Model | Parameters | Size | Speed | Quality | Best For |
|---|---|---|---|---|---|
| Tiny | 39M | ~0.4 GB | ⚡⚡⚡⚡⚡ | Good | Quick notes, older Macs |
| Base | 74M | ~0.8 GB | ⚡⚡⚡⚡ | Better | Daily use on 8GB Macs |
| Small | 244M | ~1.5 GB | ⚡⚡⚡ | Great | 16GB+ Apple Silicon |
| Medium | 769M | ~2.5 GB | ⚡⚡ | Excellent | 24GB+ for high accuracy |
| Large v3 | 1550M | ~4.8 GB | ⚡ | Best | Maximum accuracy |
Models are downloaded automatically from HuggingFace on first use and cached locally. Download additional models from Settings → Models.
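The "Best For" column above maps naturally onto a RAM-keyed recommendation. The sketch below mirrors that table; the shipped `SystemInfo` also weighs chip generation and Neural Engine support, so treat the cutoffs as illustrative.

```swift
import Foundation

/// Illustrative model picker keyed on physical RAM, following the
/// table above (Tiny for constrained Macs up through Medium for 24 GB+).
enum RecommendedModel: String { case tiny, base, small, medium }

func recommendModel(ramGB: Int) -> RecommendedModel {
    switch ramGB {
    case ..<8:    return .tiny
    case 8...15:  return .base
    case 16...23: return .small
    default:      return .medium
    }
}

let ramGB = Int(ProcessInfo.processInfo.physicalMemory / (1 << 30))
print("Recommended model: \(recommendModel(ramGB: ramGB).rawValue)")
```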
Open Settings from the menu bar popover or with ⌘,
- Activation mode - Push-to-Talk or Double-Tap Toggle
- Hotkey - Choose from Right Option, Right Command, Fn, function keys, etc.
- Language - Auto-detect or specify (English, Spanish, French, German, Chinese, Japanese, and more)
- Launch at login
- Max recording duration - 30s, 60s, 120s, or 300s
- Silence detection - Auto-stop recording after configurable silence
- Sound effects - Toggle audio feedback for recording start/stop
- Input device - Select which microphone to use
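Silence detection of this kind typically works on the RMS level of incoming audio buffers. Here is a self-contained sketch over 16 kHz mono Float32 samples; the threshold is made up for illustration and is not VocaMac's actual default.

```swift
import Foundation

/// Root-mean-square level of a buffer of Float32 samples.
func rms(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumSquares = samples.reduce(0) { $0 + $1 * $1 }
    return (sumSquares / Float(samples.count)).squareRoot()
}

// Illustrative threshold: buffers below this RMS count as silence.
let silenceThreshold: Float = 0.01

let tone: [Float] = (0..<1600).map { sin(Float($0) * 0.1) * 0.5 }  // speech-like signal
let quiet = [Float](repeating: 0.001, count: 1600)                  // near-silence

print(rms(tone) > silenceThreshold)   // true: well above the floor
print(rms(quiet) < silenceThreshold)  // true: counts as silence
```

Auto-stop then amounts to counting consecutive buffers whose RMS stays under the threshold.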
- View system info and WhisperKit's hardware recommendation
- Download, load, and switch between models
- See which models are supported on your device
VocaMac is built with a clean, modular architecture using native Swift and SwiftUI:
VocaMacApp (SwiftUI MenuBarExtra)
├── AppState - Central observable state
├── HotKeyManager - CGEventTap global hotkey listener
├── AudioEngine - AVAudioEngine mic capture (16kHz, mono, Float32)
├── WhisperService - WhisperKit async transcription wrapper
│ └── ModelManager - Model download, storage, device recommendations
│ └── SystemInfo - Hardware detection & model recommendation
├── SoundManager - Audio feedback (start/stop recording cues)
├── TextInjector - Clipboard + Cmd+V text injection
├── MenuBarView - Status popover UI
└── SettingsView - Configuration tabs (General, Models, Audio, Debug, About)
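The `TextInjector` line above describes clipboard-plus-Cmd+V injection. A minimal sketch of that technique using standard AppKit and CoreGraphics APIs (details such as restoring the previous pasteboard contents are simplified, and this is not VocaMac's exact implementation):

```swift
import AppKit
import CoreGraphics

/// Put the text on the general pasteboard, then synthesize Cmd+V so the
/// frontmost app pastes it at the cursor. Requires Accessibility permission.
func inject(_ text: String) {
    let pasteboard = NSPasteboard.general
    pasteboard.clearContents()
    pasteboard.setString(text, forType: .string)

    let source = CGEventSource(stateID: .combinedSessionState)
    let vKey: CGKeyCode = 9  // 'v' on ANSI keyboard layouts

    let keyDown = CGEvent(keyboardEventSource: source, virtualKey: vKey, keyDown: true)
    let keyUp = CGEvent(keyboardEventSource: source, virtualKey: vKey, keyDown: false)
    keyDown?.flags = .maskCommand
    keyUp?.flags = .maskCommand
    keyDown?.post(tap: .cghidEventTap)
    keyUp?.post(tap: .cghidEventTap)
}
```

Clipboard-based injection is a common choice over per-character key synthesis because it handles Unicode and long transcripts reliably.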
For detailed documentation, see:
- `docs/ARCHITECTURE.md` - Technical Architecture
- `docs/DATA_MODEL.md` - Data Model & Entity Relationships
- Xcode 15+ or Swift 5.9+ toolchain
- macOS 13+
VocaMac/
├── Package.swift # SPM config (WhisperKit dependency)
├── Sources/
│ └── VocaMac/
│ ├── App/
│ │ └── VocaMacApp.swift # Entry point, MenuBarExtra
│ ├── Views/
│ │ ├── MenuBarView.swift # Menu bar popover
│ │ └── SettingsView.swift # Settings window (5 tabs)
│ ├── Services/
│ │ ├── AudioEngine.swift # AVAudioEngine mic capture
│ │ ├── HotKeyManager.swift # CGEventTap global hotkeys
│ │ ├── WhisperService.swift# WhisperKit transcription wrapper
│ │ ├── ModelManager.swift # Model download & management
│ │ ├── SoundManager.swift # Audio feedback for recording
│ │ ├── TextInjector.swift # Clipboard-based text injection
│ │ └── SystemInfo.swift # Hardware detection
│ ├── Models/
│ │ ├── AppState.swift # Central observable state
│ │ ├── TranscriptionResult.swift # VocaTranscription type
│ │ └── WhisperModel.swift # ModelSize enum, WhisperModelInfo
│ └── Resources/
├── Tests/
│ └── VocaMacTests/
├── Makefile # make build, install, test, clean
├── scripts/
│ ├── build.sh # Build .app bundle (dev)
│ ├── install.sh # Install to /Applications or CLI
│ └── uninstall.sh # Full uninstall & cleanup
├── web/ # Marketing website (vocamac.com)
├── docs/
│ ├── ARCHITECTURE.md # Technical Architecture
│ └── DATA_MODEL.md # Data Model & Entity Relationships
├── LICENSE # AGPL-3.0 License
└── .gitignore
```sh
make install      # Build + install to /Applications (recommended)
make install-cli  # Install CLI commands to ~/.local/bin
make build        # Build .app bundle in repo root (dev iteration)
make test         # Run tests
make run          # Launch the locally built .app
make clean        # Remove build artifacts
make help         # Show all commands
```

To completely remove VocaMac and all its data (downloaded models, preferences, caches):
```sh
./scripts/uninstall.sh
```

Use `--keep-build` to preserve build artifacts:
```sh
./scripts/uninstall.sh --keep-build
```

Reset onboarding: To re-trigger the first-launch onboarding wizard (e.g., after an upgrade or for testing), reset the onboarding flag:
```sh
defaults delete com.vocamac.app vocamac.hasCompletedOnboarding
```

Then relaunch VocaMac. This only clears the onboarding state; all other preferences (hotkey, language, model) are preserved.
Reset all preferences: To start completely fresh:
```sh
defaults delete com.vocamac.app
```

Reset permissions (troubleshooting): If permissions appear stuck or aren't being recognized after an update, you can reset them from Settings → Debug → Reset All Permissions, or manually via Terminal:
```sh
tccutil reset All com.vocamac.app
```

This clears all permission entries (Microphone, Accessibility, Input Monitoring) for VocaMac. On next launch, macOS will prompt you to re-grant them. With Developer ID signing, permissions normally persist across updates, so this reset is only needed for troubleshooting.
VocaMac is the macOS member of the Voca family:
| Platform | Project | Status |
|---|---|---|
| Linux | VocaLinux | ✅ Available |
| macOS | VocaMac | 🚀 Beta |
| Windows | VocaWin | 📋 Planned |
Each platform uses native technologies for the best possible integration, while sharing the same UX patterns and Whisper model family.
- WhisperKit - Swift native on-device speech recognition
- VocaLinux - Voice-to-text for Linux
- OpenAI Whisper - Original Whisper model
- Larger models require a one-time download: VocaMac ships with the Whisper Tiny model bundled, so you can dictate immediately with no internet connection. Switching to a larger model (Small, Medium, Large) requires a one-time download; all subsequent launches work fully offline.
- macOS only: Requires macOS 13 (Ventura) or later.
- Permissions reset on rebuild (build-from-source only): When building from source without a Developer ID certificate, macOS resets Accessibility and Input Monitoring permissions on every rebuild due to ad-hoc signing. Release builds are Developer ID signed so permissions persist across updates.
Release builds of VocaMac are Developer ID signed and notarized by Apple. Accessibility and Input Monitoring permissions persist across updates — no manual re-granting required.
For developers building from source: If you don't have a Developer ID certificate, build.sh falls back to ad-hoc signing. With ad-hoc signing, macOS resets Accessibility and Input Monitoring permissions on every rebuild because the CDHash changes. This is standard macOS security behavior — all open-source apps with Accessibility (Rectangle, Maccy, AltTab, etc.) have the same limitation when ad-hoc signed.
Workarounds for ad-hoc builds:
| Approach | How | Permissions Persist |
|---|---|---|
| Run from Terminal | Grant permissions to Terminal.app once, then run `make run` | ✅ Always |
| Re-grant manually | System Settings → Privacy & Security after each rebuild | Per rebuild |
💡 Developer tip: Add your Terminal app (Terminal.app or iTerm2) to both Accessibility and Input Monitoring in System Settings. Then run VocaMac directly from Terminal. Permissions are inherited and never reset.
AGPL-3.0 License - see LICENSE for details.
Made with ❤️ for the macOS community!
