MacBot is a comprehensive offline AI voice assistant for macOS that runs entirely on your local machine. It features a complete 5-model pipeline with voice activity detection, speech-to-text, large language model processing, text-to-speech, and native macOS tool integration.
- Advanced Offline Voice Pipeline: VAD + Whisper Large v3 STT (Metal accelerated) + Neural TTS
- High-Performance LLM: Local inference with llama.cpp, optimized for Apple Silicon
- Superior Text-to-Speech: Piper neural TTS with 70% smaller models, 2-3x faster synthesis, intelligent caching, and hardware acceleration
- Enterprise Security: JWT authentication, input validation, XSS protection, and secure API access
- Optimized Performance: Metal GPU acceleration, ~0.2s STT latency, memory leak prevention, and intelligent caching
- Enhanced macOS Integration: Web search, screenshots, app launching, system monitoring
- Modern Web Dashboard: Real-time monitoring with WebSocket live updates and circuit breaker status
- Advanced RAG System: Document ingestion and semantic search with ChromaDB and API key authentication
- Production-Ready: Docker deployment with orchestrator, comprehensive health monitoring, and automatic recovery
- Comprehensive Configuration: YAML-based configuration with extensive customization and environment variable support
- Smart Interruptibility: Natural conversation flow with voice activity detection and barge-in capability
- Real-Time Communication: WebSocket bidirectional communication for live interaction
- Performance Optimized: Circuit breaker pattern, resource management, and backpressure handling
- Production Ready: Zero type checker errors, structured logging, and enterprise-grade reliability
- TTS Performance: 70% smaller models, 2-3x faster synthesis, MPS acceleration, and real-time monitoring
# Install system dependencies
xcode-select --install
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install cmake ffmpeg portaudio python@3.11 git git-lfs
# Clone the repository
git clone https://github.com/lukifer23/MacBot.git
cd MacBot
# Initialize Git LFS for model files
git lfs install
git lfs track "*.gguf"
git lfs track "*.bin"
# Create virtual environment
python3.11 -m venv macbot_env
source macbot_env/bin/activate
# Install dependencies
pip install -r requirements.txt
# Install TTS engines (optional but recommended)
pip install piper-tts
# Download Piper voice model
mkdir -p piper_voices/en_US-lessac-medium
curl -L "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx" \
-o piper_voices/en_US-lessac-medium/model.onnx
curl -L "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json" \
-o piper_voices/en_US-lessac-medium/model.onnx.json
# Build all dependencies (whisper.cpp, llama.cpp)
make build-all
# Or build individually
make build-whisper
make build-llama
# Download the Whisper Large v3 Turbo model, q5_0 quantization (recommended for best accuracy)
cd models/whisper.cpp
sh ./models/download-ggml-model.sh large-v3-turbo-q5_0
# Return to the repository root
cd ../..
# Download LLM model (GGUF format) to models/llama.cpp/models/
# Recommended: Qwen3-4B-Instruct-2507-Q4_K_M.gguf or similar

Edit `config/config.yaml` to customize settings:
models:
  llm:
    path: "models/llama.cpp/models/Qwen_Qwen3-4B-Instruct-2507-Q4_K_M.gguf"
    context_length: 8192
    temperature: 0.4
  stt:
    model: "models/whisper.cpp/models/ggml-large-v3-turbo-q5_0.bin"
    language: "en"
  tts:
    voice: "en_US-lessac-medium" # Piper voice
    speed: 1.0
tools:
  enabled:
    - web_search
    - screenshot
    - app_launcher
    - system_monitor
    - weather
    - rag_search

# Start all services with orchestrator
python src/macbot/orchestrator.py
# Or use individual commands
make run-llama # Start LLM server
make run-assistant # Start voice assistant
# Or use CLI
python src/macbot/cli.py orchestrator
Note:
- The Voice Assistant now exposes a lightweight control server (default: http://localhost:8123) used by the Web Dashboard to send interruption requests and perform health checks.
- Ensure `ffmpeg` is installed for voice input from the browser (used to convert WebM/Opus to WAV for Whisper).
- Assistant UI states: The assistant now notifies the dashboard about speaking start/end/interrupt events so the banner shows Listening / Speaking / Interrupted / Ready in real time.
- Orchestrator API binds to http://127.0.0.1:8090 by default for safety; the Web Dashboard proxies common calls.
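
The `ffmpeg` step mentioned in the notes above can be sketched as follows. This is a minimal illustration, not MacBot's actual conversion code: the function names are hypothetical, and the target format (16 kHz, mono, 16-bit PCM WAV) is chosen because that is the input whisper.cpp expects.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: Path, dst: Path) -> list[str]:
    """Build an ffmpeg call that converts src to 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", str(src),       # input, e.g. a WebM/Opus browser recording
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to mono
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        str(dst),
    ]


def convert_to_wav(src: Path, dst: Path) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)


if __name__ == "__main__":
    # Hypothetical file names, printed only to show the resulting command line.
    print(" ".join(build_ffmpeg_command(Path("input.webm"), Path("output.wav"))))
```

Keeping command construction separate from execution makes the arguments easy to inspect or log before anything is run.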
### Quick Verify
After starting, run the verification to check core endpoints:
make verify
python scripts/verify_setup.py
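
For a manual check, a small probe along these lines may be enough. It is a sketch under assumptions: the `/health` path is hypothetical (see docs/API_REFERENCE.md for the real endpoints), and only the two default addresses mentioned above are used.

```python
import urllib.error
import urllib.request


def is_up(base_url: str, path: str = "/health", timeout: float = 2.0) -> bool:
    """Return True if GET base_url + path answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False


if __name__ == "__main__":
    # Default addresses from the notes above; the "/health" path is an assumption.
    services = {
        "orchestrator": "http://127.0.0.1:8090",
        "voice assistant control server": "http://localhost:8123",
    }
    for name, url in services.items():
        print(f"{name}: {'up' if is_up(url) else 'unreachable'}")
```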
# Build and run with docker-compose
docker-compose up --build
# Or run individual services
docker-compose up macbot-orchestrator

- docs/ENHANCED_FEATURES.md - Comprehensive feature guide
- docs/API_REFERENCE.md - API endpoint documentation
- docs/CONFIGURATION.md - Detailed configuration guide
- docs/TROUBLESHOOTING.md - Common issues and solutions
- docs/DEVELOPMENT.md - Development setup and contribution guide
MacBot/
├── src/macbot/ # Main package
│ ├── cli.py # Command-line interface
│ ├── __init__.py # Package initialization
│ ├── voice_assistant.py # Voice assistant with interruption
│ ├── audio_interrupt.py # TTS interruption handler
│ ├── conversation_manager.py # Conversation state management
│ ├── message_bus.py # Real-time communication
│ ├── orchestrator.py # Service orchestration
│ ├── web_dashboard.py # Web interface
│ ├── rag_server.py # RAG knowledge base
│ ├── auth.py # JWT authentication system
│ ├── validation.py # Input validation and sanitization
│ ├── resource_manager.py # Resource lifecycle management
│ ├── error_handler.py # Centralized error handling
│ └── logging_utils.py # Structured logging utilities
├── scripts/ # Shell scripts
│ ├── bootstrap_mac.sh # Bootstrap script
│ └── start_macbot.sh # Startup script
├── tests/ # Test files
│ ├── test_interruptible_conversation.py
│ └── test_message_bus.py
├── config/ # Configuration files
│ └── config.yaml # Main configuration
├── docs/ # Documentation
├── data/ # Data directories
│ ├── rag_data/ # Knowledge base data
│ └── rag_database/ # Vector database
├── models/ # Model directories
│ ├── llama.cpp/ # LLM inference engine
│ └── whisper.cpp/ # Speech recognition
├── logs/ # Log files
│ └── macbot.log # Application logs
├── requirements.txt # Python dependencies
├── requirements-dev.txt # Development dependencies
├── pyproject.toml # Modern Python packaging
├── setup.py # Legacy packaging
├── Makefile # Build and run commands
├── docker-compose.yml # Docker orchestration
├── Dockerfile # Container definition
└── README.md
MacBot uses a comprehensive YAML configuration system. Key sections:
models:
  llm:
    path: "llama.cpp/models/model.gguf"
    context_length: 4096
    threads: -1
  stt:
    model: "base.en"
    language: "en"
  tts:
    voice: "af_heart"
    speed: 1.0
tools:
  enabled:
    - web_search
    - screenshot
    - app_launcher
    - system_monitor
  web_search:
    default_engine: "google"
    timeout: 10
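
A parsed configuration can be sanity-checked before the services start. The sketch below assumes the YAML has already been loaded into a plain dict (for example with PyYAML's `yaml.safe_load`). `validate_config` and `KNOWN_TOOLS` are hypothetical names, and the tool list is simply the set of tool names that appear in this README, not an authoritative registry.

```python
# Tool names that appear in this README's examples (assumed, not exhaustive).
KNOWN_TOOLS = {
    "web_search", "screenshot", "app_launcher",
    "system_monitor", "weather", "rag_search",
}


def validate_config(config: dict) -> list[str]:
    """Return human-readable problems found in a parsed config dict."""
    problems = []
    models = config.get("models", {})
    # Each model section shown in the examples should be present.
    for section in ("llm", "stt", "tts"):
        if section not in models:
            problems.append(f"models.{section} is missing")
    context_length = models.get("llm", {}).get("context_length")
    if context_length is not None and context_length <= 0:
        problems.append("models.llm.context_length must be positive")
    # Flag enabled tools that are not in the assumed list.
    for tool in config.get("tools", {}).get("enabled", []):
        if tool not in KNOWN_TOOLS:
            problems.append(f"unknown tool: {tool}")
    return problems


if __name__ == "__main__":
    example = {
        "models": {"llm": {"context_length": 4096}, "stt": {}, "tts": {}},
        "tools": {"enabled": ["web_search", "screenshot"]},
    }
    print(validate_config(example) or "config looks consistent")
```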
### Security Configuration
```yaml
# Authentication (set via environment variables for security)
auth:
  enabled: true
  jwt_secret: null  # Set MACBOT_JWT_SECRET environment variable
  token_expiry_hours: 24
# Input validation
validation:
  enabled: true
  max_text_length: 10000
  xss_protection: true
# Resource management
resource_management:
  enabled: true
  cleanup_interval: 300

# Service endpoints
services:
  web_dashboard:
    host: "0.0.0.0"
    port: 3000
  rag_server:
    host: "localhost"
    port: 8001
    api_tokens: null  # Set MACBOT_RAG_API_TOKENS environment variable
  voice_assistant:
    host: "localhost"  # Control server host
    port: 8123         # Control server port
```
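
Since the secrets are meant to come from the environment rather than the YAML file, reading them might look roughly like this. It is an illustrative sketch only: `load_secrets` is a hypothetical helper, and the comma-separated format for `MACBOT_RAG_API_TOKENS` is an assumption, not something the configuration above specifies.

```python
import os
from typing import Mapping, Optional


def load_secrets(env: Optional[Mapping[str, str]] = None, auth_enabled: bool = True) -> dict:
    """Read the JWT secret and RAG API tokens from environment variables."""
    env = os.environ if env is None else env
    jwt_secret = env.get("MACBOT_JWT_SECRET") or None
    if auth_enabled and jwt_secret is None:
        raise RuntimeError("auth.enabled is true but MACBOT_JWT_SECRET is not set")
    # Assumption: multiple RAG tokens are separated by commas.
    raw_tokens = env.get("MACBOT_RAG_API_TOKENS", "")
    tokens = [t.strip() for t in raw_tokens.split(",") if t.strip()]
    return {"jwt_secret": jwt_secret, "rag_api_tokens": tokens}


if __name__ == "__main__":
    # Uses the real environment; fails fast if the secret is missing.
    secrets = load_secrets()
    print(f"loaded {len(secrets['rag_api_tokens'])} RAG API token(s)")
```

Failing at startup when authentication is enabled but no secret is set avoids silently running with an empty signing key.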
## Voice Commands
MacBot supports various voice commands:
- **"system info"** - Get system status
- **"take screenshot"** - Capture screen
- **"open app calculator"** - Launch applications
- **"search for weather"** - Web search
- **"browse github.com"** - Open websites
- **"what's the weather"** - Weather app
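
One simple way to map such phrases to tools is a prefix table. The sketch below is purely illustrative and does not describe how MacBot actually parses commands: the `route` function, the matching strategy, and the pairing of phrases with tool names from the configuration are all assumptions. The "browse" command is left out because the configuration does not show which tool handles it.

```python
from typing import Optional

# Trigger phrases from the list above mapped to tool names from config.yaml.
# The matching strategy (lowercased prefix match) is an illustrative assumption.
COMMANDS = [
    ("system info", "system_monitor"),
    ("take screenshot", "screenshot"),
    ("open app", "app_launcher"),
    ("search for", "web_search"),
    ("what's the weather", "weather"),
]


def route(utterance: str) -> Optional[tuple[str, str]]:
    """Return (tool, argument) for the first matching trigger, else None."""
    text = utterance.strip().lower()
    for trigger, tool in COMMANDS:
        if text.startswith(trigger):
            # Whatever follows the trigger is passed on as the argument.
            return tool, text[len(trigger):].strip()
    return None


if __name__ == "__main__":
    for phrase in ("open app calculator", "search for weather", "tell me a joke"):
        print(phrase, "->", route(phrase))
```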
## Interruptible Conversations
MacBot features natural conversation flow with barge-in capability:
### How It Works
- **Real-time Interruption**: Speak while MacBot is responding to interrupt
- **Context Preservation**: Conversation history is maintained across interruptions
- **Seamless Flow**: Natural back-and-forth conversation without waiting for responses to complete
### Configuration
Configure interruption settings in `config.yaml`:
```yaml
interruption:
  enabled: true
  voice_threshold: 0.3
  cooldown_period: 1.0
  interruption_timeout: 5.0
  buffer_size: 100
```
- Start speaking naturally during MacBot's responses
- The system will detect your voice and stop current speech
- Your new request will be processed immediately
- Previous conversation context is preserved
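
The barge-in behavior described above can be approximated with a threshold plus a cooldown. This is a sketch under assumptions, not the logic in `audio_interrupt.py`: `BargeInDetector` is a hypothetical class, `voice_threshold` is treated as a minimum voice level between 0 and 1, `cooldown_period` as seconds during which further interrupts are ignored, and `interruption_timeout` and `buffer_size` are not modeled.

```python
from dataclasses import dataclass


@dataclass
class BargeInDetector:
    """Decide when detected speech should interrupt TTS playback."""

    voice_threshold: float = 0.3   # minimum voice level that counts as speech
    cooldown_period: float = 1.0   # seconds to ignore speech after an interrupt
    _last_interrupt: float = float("-inf")

    def should_interrupt(self, level: float, now: float, speaking: bool) -> bool:
        """Return True if the assistant is speaking and the user barges in."""
        if not speaking or level < self.voice_threshold:
            return False
        if now - self._last_interrupt < self.cooldown_period:
            return False  # still cooling down from the previous interrupt
        self._last_interrupt = now
        return True


if __name__ == "__main__":
    detector = BargeInDetector()
    # (voice level, timestamp in seconds) pairs while the assistant is speaking.
    for level, t in [(0.1, 0.0), (0.6, 0.2), (0.7, 0.5), (0.6, 1.5)]:
        print(t, detector.should_interrupt(level, now=t, speaking=True))
```

The cooldown keeps one continuous burst of speech from triggering repeated interrupts.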
# Install development dependencies
pip install -r requirements-dev.txt
# Install pre-commit hooks
pre-commit install
# Run tests
pytest
# Format code
black src/
isort src/
# Install in development mode
pip install -e .
# Or build distribution
python -m build
# Build development image
docker build -t macbot:dev .
# Run with development mounts
docker run -v $(pwd):/app -p 3000:3000 macbot:dev

- CPU: Apple Silicon (M1/M2/M3) or Intel x64
- RAM: 8GB minimum, 16GB recommended
- Storage: 5GB for models and dependencies
- GPU: Metal support (Apple Silicon)
- macOS: 12.0+ (Monterey or later)
- Python: 3.11+ (recommended; Apple Silicon optimized)
- Git LFS: For model file management
- TTS Engines: Piper (neural quality) or pyttsx3 (fallback)
- STT Engine: Whisper.cpp v1.7.6 with Metal acceleration
- STT Latency: ~0.2 seconds (Whisper Large v3)
- TTS Speed: 178 WPM (Piper neural voices)
- LLM Context: 8192+ tokens (configurable)
- GPU Acceleration: Metal framework on Apple Silicon
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
See docs/DEVELOPMENT.md for detailed contribution guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
- llama.cpp - High-performance LLM inference engine
- Whisper.cpp - Optimized speech recognition with Metal acceleration
- Piper TTS - Modern neural text-to-speech with natural voice quality
- Kokoro - Advanced neural TTS framework (framework ready)
- ChromaDB - Vector database for RAG knowledge base
- LiveKit - Voice activity detection and real-time communication
- ONNX Runtime - Cross-platform ML inference acceleration
- SYSTRAN - FasterWhisper optimization research
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: See docs/ folder
MacBot - Your local AI assistant with the power of native macOS tools.