VocaMac

Your voice, your Mac, your privacy. Open-source dictation powered by AI.

Speak. It types. 100% offline, open-source voice-to-text for macOS - powered by WhisperKit. No cloud, no subscriptions, no data leaves your device. Just hold a hotkey, speak, and your words appear wherever your cursor is.

✨ Features

🔒 100% Local - All audio processing happens on your machine. No internet required — the Tiny model ships bundled and works out of the box offline.
⌨️ System-Wide Text Injection - Transcribed text is typed wherever your cursor is: browsers, Slack, VS Code, spreadsheets, terminals - everywhere.
🎯 Push-to-Talk - Hold a hotkey (default: Right Option) to record. Release to transcribe.
👆 Double-Tap Toggle - Double-tap the hotkey to start/stop recording.
🧠 Smart Model Selection - Auto-detects your hardware (Apple Silicon/Intel, RAM) and recommends the best whisper model via WhisperKit.
⚡ Native Apple Acceleration - CoreML + Metal + Neural Engine acceleration on Apple Silicon. No manual setup.
📊 Visual Feedback - Menu bar icon changes color during recording and processing. Audio level indicator shows input.
🔄 Auto-Updates - Built-in update checker queries GitHub Releases on launch and lets you download and install the latest version in one click from within the app.
⚙️ Configurable - Choose hotkeys, models, languages, silence detection thresholds, and more.

📸 Screenshots

Menu bar popover with status and controls

Menu bar icon: idle (left) and recording (right)

Settings: General tab (left) and Models tab with resource monitoring (right)

Settings: Audio tab (left) and About tab (right)

Floating mic indicator near text cursor during recording

🏛️ Why WhisperKit?

VocaMac uses WhisperKit instead of raw whisper.cpp because:

	WhisperKit	whisper.cpp
Language	Pure Swift (native)	C++ (requires bridging)
Apple Silicon	CoreML + Neural Engine	Metal only
SPM Integration	One-line dependency	Complex vendoring
Model Format	CoreML (optimized per device)	GGML (generic)
Streaming	First-class async/await	Manual threading
Quality	Same OpenAI Whisper models	Same OpenAI Whisper models
Maintenance	Argmax Inc. (commercial)	Community

Same accuracy, dramatically better Apple platform integration.

📋 Requirements

macOS 13 (Ventura) or later
Apple Silicon (M1/M2/M3/M4)
Xcode 15+ or Swift 5.9+ (only for building from source)

Permissions

VocaMac requires three macOS permissions:

Permission	Why
Microphone	Capture your voice for transcription
Accessibility	Global hotkeys and text injection into apps
Input Monitoring	Detect hotkey presses system-wide

Note: After granting Input Monitoring, a restart of VocaMac is required for it to take effect.

🚀 Quick Start

Option 1: Download DMG (Recommended)

Download the latest VocaMac-x.x.x-arm64.dmg from the Releases page
Open the DMG and drag VocaMac to Applications
Open VocaMac from Applications
Grant permissions: Microphone, Accessibility, and Input Monitoring when prompted

VocaMac is Developer ID signed and notarized by Apple — macOS will open it without any security warnings.

Option 2: Build from Source (Recommended)

git clone https://github.com/jatinkrmalik/vocamac.git
cd vocamac
make install

This builds VocaMac, installs it to /Applications, and launches it. Permissions are granted directly to VocaMac, just like the DMG method.

Option 3: CLI Commands (For Developers)

git clone https://github.com/jatinkrmalik/vocamac.git
cd vocamac
make install-cli

This installs two commands to ~/.local/bin:

vocamac &: Launch VocaMac in background
vocamac-build: Rebuild from source after pulling updates

Permissions note: In CLI mode, macOS assigns permissions to your terminal app (Terminal, iTerm2, etc.) rather than VocaMac itself. Grant Microphone, Accessibility, and Input Monitoring to your terminal app instead.

First Launch

VocaMac appears in your menu bar (microphone icon, no Dock icon)
Grant permissions: Microphone, Accessibility, and Input Monitoring (see Permissions above)
First model download: WhisperKit automatically downloads the recommended model for your device (~40–500 MB depending on hardware)
Start dictating: Hold the Right Option key, speak, and release. Your words appear at the cursor!

🌙 Nightly Builds

Nightly builds are automated builds from the latest main branch, published every day at midnight UTC when there are new commits. They let you try the latest features, fixes, and improvements before they land in a stable release.

Why use a nightly build?

Early access — Test new features days or weeks before the next stable release
Help improve VocaMac — Your feedback on nightly builds catches bugs before they reach everyone
Fully signed & notarized — Nightly builds are Developer ID signed and notarized by Apple, just like stable releases. No Gatekeeper warnings, no right-click workarounds

How to install:

Download the latest VocaMac-nightly-*.dmg from the Nightly Release
Open the DMG and drag VocaMac to Applications
Grant permissions when prompted (same as a stable release)

How to identify your build:

Nightly builds embed the date and commit SHA in the version string. Open Settings → About to see something like:

Version 0.5.0-nightly.20260414+abc1234 (Nightly)

This helps us pinpoint the exact code you're running if you report an issue.

Cadence & stability:

	Stable Release	Nightly Build
Frequency	When ready (manual tag)	Daily at midnight UTC
Source	Tagged commit	Latest `main` branch
Signed & notarized	✅ Yes	✅ Yes
Stability	Production-ready	May contain incomplete features or bugs
Best for	Daily use	Testing & early feedback

⚠️ Nightly builds may be unstable. If you encounter issues, please open a bug report — your feedback helps us ship better stable releases!

🎮 Usage

Push-to-Talk (Default)

Action	What Happens
Hold Right Option	Recording starts (menu bar icon turns red)
Speak	Audio is captured locally
Release Right Option	Recording stops → transcription → text injected at cursor

Double-Tap Toggle

Action	What Happens
Double-tap Right Option	Recording starts
Speak	Audio is captured
Double-tap Right Option again	Recording stops → transcription → text injection

Switch between modes in Settings → General → Activation.

🧠 Whisper Models

VocaMac uses OpenAI Whisper models via WhisperKit's CoreML format. The app auto-detects your hardware and recommends the best model:

Model	Parameters	Size	Speed	Quality	Best For
Tiny	39M	~0.4 GB	⚡⚡⚡⚡⚡	Good	Quick notes, older Macs
Base	74M	~0.8 GB	⚡⚡⚡⚡	Better	Daily use on 8GB Macs
Small	244M	~1.5 GB	⚡⚡⚡	Great	16GB+ Apple Silicon
Medium	769M	~2.5 GB	⚡⚡	Excellent	24GB+ for high accuracy
Large v3	1550M	~4.8 GB	⚡	Best	Maximum accuracy

Models are downloaded automatically from HuggingFace on first use and cached locally. Download additional models from Settings → Models.

⚙️ Configuration

Open Settings from the menu bar popover or with ⌘,

General

Activation mode - Push-to-Talk or Double-Tap Toggle
Hotkey - Choose from Right Option, Right Command, Fn, function keys, etc.
Language - Auto-detect or specify (English, Spanish, French, German, Chinese, Japanese, and more)
Launch at login

Audio

Max recording duration - 30s, 60s, 120s, or 300s
Silence detection - Auto-stop recording after configurable silence
Sound effects - Toggle audio feedback for recording start/stop
Input device - Select which microphone to use

Models

View system info and WhisperKit's hardware recommendation
Download, load, and switch between models
See which models are supported on your device

🏗️ Architecture

VocaMac is built with a clean, modular architecture using native Swift and SwiftUI:

VocaMacApp (SwiftUI MenuBarExtra)
├── AppState          - Central observable state
├── HotKeyManager     - CGEventTap global hotkey listener
├── AudioEngine       - AVAudioEngine mic capture (16kHz, mono, Float32)
├── WhisperService    - WhisperKit async transcription wrapper
│   └── ModelManager  - Model download, storage, device recommendations
│       └── SystemInfo - Hardware detection & model recommendation
├── SoundManager      - Audio feedback (start/stop recording cues)
├── TextInjector      - Clipboard + Cmd+V text injection
├── MenuBarView       - Status popover UI
└── SettingsView      - Configuration tabs (General, Models, Audio, Debug, About)

For detailed documentation, see:

docs/ARCHITECTURE.md - Technical Architecture
docs/DATA_MODEL.md - Data Model & Entity Relationships

🔧 Development

Prerequisites

Xcode 15+ or Swift 5.9+ toolchain
macOS 13+

Project Structure

VocaMac/
├── Package.swift                   # SPM config (WhisperKit dependency)
├── Sources/
│   └── VocaMac/
│       ├── App/
│       │   └── VocaMacApp.swift    # Entry point, MenuBarExtra
│       ├── Views/
│       │   ├── MenuBarView.swift   # Menu bar popover
│       │   └── SettingsView.swift  # Settings window (5 tabs)
│       ├── Services/
│       │   ├── AudioEngine.swift   # AVAudioEngine mic capture
│       │   ├── HotKeyManager.swift # CGEventTap global hotkeys
│       │   ├── WhisperService.swift# WhisperKit transcription wrapper
│       │   ├── ModelManager.swift  # Model download & management
│       │   ├── SoundManager.swift  # Audio feedback for recording
│       │   ├── TextInjector.swift  # Clipboard-based text injection
│       │   └── SystemInfo.swift    # Hardware detection
│       ├── Models/
│       │   ├── AppState.swift      # Central observable state
│       │   ├── TranscriptionResult.swift  # VocaTranscription type
│       │   └── WhisperModel.swift  # ModelSize enum, WhisperModelInfo
│       └── Resources/
├── Tests/
│   └── VocaMacTests/
├── Makefile                        # make build, install, test, clean
├── scripts/
│   ├── build.sh                    # Build .app bundle (dev)
│   ├── install.sh                  # Install to /Applications or CLI
│   └── uninstall.sh                # Full uninstall & cleanup
├── web/                            # Marketing website (vocamac.com)
├── docs/
│   ├── ARCHITECTURE.md             # Technical Architecture
│   └── DATA_MODEL.md               # Data Model & Entity Relationships
├── LICENSE                         # AGPL-3.0 License
└── .gitignore

Build Commands

make install        # Build + install to /Applications (recommended)
make install-cli    # Install CLI commands to ~/.local/bin
make build          # Build .app bundle in repo root (dev iteration)
make test           # Run tests
make run            # Launch the locally built .app
make clean          # Remove build artifacts
make help           # Show all commands

Uninstall

To completely remove VocaMac and all its data (downloaded models, preferences, caches):

./scripts/uninstall.sh

Use --keep-build to preserve build artifacts:

./scripts/uninstall.sh --keep-build

Troubleshooting

Reset onboarding: To re-trigger the first-launch onboarding wizard (e.g., after an upgrade or for testing), reset the onboarding flag:

defaults delete com.vocamac.app vocamac.hasCompletedOnboarding

Then relaunch VocaMac. This only clears the onboarding state; all other preferences (hotkey, language, model) are preserved.

Reset all preferences: To start completely fresh:

defaults delete com.vocamac.app

Reset permissions (troubleshooting): If permissions appear stuck or aren't being recognized after an update, you can reset them from Settings → Debug → Reset All Permissions, or manually via Terminal:

tccutil reset All com.vocamac.app

This clears all permission entries (Microphone, Accessibility, Input Monitoring) for VocaMac. On next launch, macOS will prompt you to re-grant them. With Developer ID signing, permissions normally persist across updates — this reset is only needed for troubleshooting.

🌐 Cross-Platform

VocaMac is the macOS member of the Voca family:

Platform	Project	Status
Linux	VocaLinux	✅ Available
macOS	VocaMac	🚀 Beta
🪟 Windows	VocaWin	📋 Planned

Each platform uses native technologies for the best possible integration, while sharing the same UX patterns and Whisper model family.

🤝 Related Projects

WhisperKit - Swift native on-device speech recognition
VocaLinux - Voice-to-text for Linux
OpenAI Whisper - Original Whisper model

⚠️ Known Limitations

Larger models require a one-time download: VocaMac ships with the Whisper Tiny model bundled — you can dictate immediately with no internet connection. Switching to a larger model (Small, Medium, Large) requires a one-time download; all subsequent launches work fully offline.
macOS only: Requires macOS 13 (Ventura) or later.
Permissions reset on rebuild (build-from-source only): When building from source without a Developer ID certificate, macOS resets Accessibility and Input Monitoring permissions on every rebuild due to ad-hoc signing. Release builds are Developer ID signed so permissions persist across updates.

Permissions and Code Signing

Release builds of VocaMac are Developer ID signed and notarized by Apple. Accessibility and Input Monitoring permissions persist across updates — no manual re-granting required.

For developers building from source: If you don't have a Developer ID certificate, build.sh falls back to ad-hoc signing. With ad-hoc signing, macOS resets Accessibility and Input Monitoring permissions on every rebuild because the CDHash changes. This is standard macOS security behavior — all open-source apps with Accessibility (Rectangle, Maccy, AltTab, etc.) have the same limitation when ad-hoc signed.

Workarounds for ad-hoc builds:

Approach	How	Permissions Persist
Run from Terminal	Grant permissions to Terminal.app once, then run `make run`	✅ Always
Re-grant manually	System Settings → Privacy & Security after each rebuild	Per rebuild

💡 Developer tip: Add your Terminal app (Terminal.app or iTerm2) to both Accessibility and Input Monitoring in System Settings. Then run VocaMac directly from Terminal. Permissions are inherited and never reset.

📄 License

AGPL-3.0 License - see LICENSE for details.

Made with ❤️ for the macOS community!

Name		Name	Last commit message	Last commit date
Latest commit History 149 Commits
.github/workflows		.github/workflows
Sources/VocaMac		Sources/VocaMac
Tests/VocaMacTests		Tests/VocaMacTests
docs		docs
scripts		scripts
web		web
.gitignore		.gitignore
AGENTS.md		AGENTS.md
LICENSE		LICENSE
Makefile		Makefile
Package.swift		Package.swift
README.md		README.md
VocaMac.entitlements		VocaMac.entitlements

Folders and files

Latest commit

History

Repository files navigation