# Nod.ie

*Vibe Coded with Claude Code*

Nod.ie (pronounced "Nodey" or "Node-ee") is an always-available AI voice assistant that integrates with Kyutai Unmute for natural voice conversations. Originally created to assist with running Bitcoin and Lightning nodes, Nod.ie can manage LND channels, execute Lightning payments, and serve as a general PC assistant or interactive tutor. Designed to work with the black-panther home server.
## Features

- Push-to-talk with global hotkey (Ctrl+Shift+Space)
- Natural voice synthesis using Kyutai Unmute
- Small, draggable overlay interface
- N8N webhook integration for automation
- Real-time voice conversations
## Requirements

- OS: Linux (built and tested on Kubuntu 24.04) or macOS
- RAM: 8GB minimum (16GB recommended)
- GPU: NVIDIA GPU with 12GB+ VRAM recommended for optimal performance
- Storage: 2GB for Unmute models
- Microphone: Any standard USB or built-in microphone
- Audio Output: Speakers or headphones
- Node.js: Version 16+ with npm
- Docker: For running Unmute services
- Kyutai Unmute: Real-time voice AI system
- Provides <200ms latency voice conversations
- Includes STT (Whisper), LLM (via Ollama), and TTS models
- Requires GPU for optimal performance (CPU mode available)
The following services must be running (typically via Docker):
- unmute-backend: WebSocket server on port 8765
- unmute-stt: Speech-to-text service (Moshi-based, ~2.6GB VRAM)
- unmute-tts: Text-to-speech service (Moshi-based, ~6.4GB VRAM)
- ollama: LLM inference (requires ~4-8GB VRAM for good performance)
To start these services:
```bash
# Navigate to your Unmute/AI stack directory
docker compose up -d
```

**GPU Memory Requirements:** The full stack requires ~12-16GB VRAM total. Ensure other GPU-intensive services are stopped to avoid CPU fallback.
## Installation

```bash
# Clone the repository
git clone https://github.com/KnowAll-AI/Nod.ie.git
cd Nod.ie

# Install dependencies
npm install

# Copy required decoder files (if not already present)
cp node_modules/opus-recorder/dist/decoderWorker.min.wasm .

# Create environment configuration
cp .env.example .env
# Edit .env with your configuration
```

Nod.ie uses the Opus decoder for audio playback:

- `decoderWorker.min.js` - Opus decoder worker (included)
- `decoderWorker.min.wasm` - WebAssembly module (copy from node_modules as shown above)

To run:

```bash
npm start
```

Nod.ie will appear as a small circular overlay on your screen.
## Usage

Nod.ie is an always-listening voice assistant that responds to your voice commands in real time.
- Nod.ie starts listening automatically when launched (purple glow)
- Speak naturally - Nod.ie is always listening unless muted
- Nod.ie will respond with natural voice
- Click to mute/unmute (red = muted, purple = listening)
- Click: Toggle mute/unmute
- Long press (hold 300ms): Drag the window to a new position
- Right-click: Open menu
- Ctrl+Shift+Space: Toggle mute from anywhere
- Ctrl+Shift+A: Show Nod.ie window
- Gray spinning: Loading/connecting to services
- Purple: Listening (unmuted)
- Purple pulsing: Processing/thinking
- Red: Muted
- White ring: Audio activity visualization
## Configuration

Settings are stored in `~/.config/nodie/config.json`:

- `globalHotkey`: Customize the push-to-talk key
- `n8nWebhookUrl`: Set your N8N webhook for notifications
- `voice`: Choose voice model (default: explanation voice)
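For reference, a minimal `~/.config/nodie/config.json` might look like the following (the webhook URL is illustrative; the voice path is one of the voices listed later in this document):

```json
{
  "globalHotkey": "Ctrl+Shift+Space",
  "n8nWebhookUrl": "http://localhost:5678/webhook/nodie",
  "voice": "unmute-prod-website/ex04_narration_longform_00001.wav"
}
```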
Customize Nod.ie's personality and capabilities by editing SYSTEM-PROMPT.md. This file contains:
- Core identity and personality traits
- Communication style guidelines
- Capability descriptions
- Response examples
Changes to SYSTEM-PROMPT.md take effect on next restart.
Nod.ie can work alongside Claude Code. While Claude Code handles text-based interactions, Nod.ie provides voice interface to the same AI services.
## Architecture

Nod.ie is built as an Electron desktop application that provides an always-on voice interface to AI models through Kyutai Unmute's real-time voice conversation system.
```mermaid
graph TB
    subgraph User_Interface ["User Interface"]
        User["User"]
        Mic["Microphone"]
        Speaker["Speaker"]
    end

    subgraph Nod_ie ["Nod.ie Electron App"]
        direction TB
        MainProcess["Main Process<br/>(main.js)"]
        Renderer["Renderer Process<br/>(renderer.js)"]
        UI["Circular UI<br/>(index.html)"]

        subgraph Modules ["Modules"]
            WSHandler["WebSocket Handler"]
            AudioCapture["Audio Capture<br/>(opus-recorder)"]
            AudioPlayback["Audio Playback<br/>(AudioWorklet)"]
            UIManager["UI Manager"]
        end

        MainProcess --> Renderer
        Renderer --> Modules
        UI --> UIManager
    end

    subgraph Local_Services ["Local Services (Docker)"]
        direction TB
        UnmuteBackend["Unmute Backend<br/>:8765"]

        subgraph Unmute_Stack ["Unmute Stack"]
            STT["Speech-to-Text<br/>(Moshi)"]
            TTS["Text-to-Speech<br/>(Moshi)"]
            LLM["LLM<br/>(Ollama)"]
        end

        UnmuteBackend --> STT
        UnmuteBackend --> TTS
        UnmuteBackend --> LLM
    end

    %% User interactions
    User --> Mic
    Speaker --> User
    Mic --> AudioCapture
    AudioPlayback --> Speaker

    %% WebSocket connections
    WSHandler -.->|"WebSocket<br/>ws://localhost:8765"| UnmuteBackend

    %% Audio flow
    AudioCapture -->|"Base64 Opus<br/>250ms chunks"| WSHandler
    WSHandler -->|"response.audio.delta"| AudioPlayback

    %% Visual feedback
    UIManager -->|"Visual States"| UI
    AudioCapture -->|"Audio Activity"| UIManager

    style Nod_ie fill:#9333ea,stroke:#7c3aed,color:#fff
    style Local_Services fill:#1e293b,stroke:#334155,color:#fff
    style User_Interface fill:#059669,stroke:#047857,color:#fff
```
- Electron: Cross-platform desktop application framework
- WebSocket: Real-time bidirectional communication with Unmute backend
- Web Audio API: Audio capture and visualization
- MediaRecorder API: Audio streaming with Opus codec
- Node.js: Backend runtime
### Main Process (`main.js`)

- Creates frameless, transparent, always-on-top window
- Manages global keyboard shortcuts
- Handles system tray integration
- Stores configuration using electron-store
### Renderer Process (`renderer.js`)

- Establishes WebSocket connection to Unmute (ws://localhost:8765)
- Captures microphone audio using MediaRecorder API
- Streams audio chunks as base64-encoded Opus data
- Receives and plays TTS audio responses
- Manages UI state and animations
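The audio-activity feedback driving the UI can be sketched as a pure function: given time-domain samples in the form an `AnalyserNode.getByteTimeDomainData()` buffer provides them, compute a 0-1 level for the ring animation (a sketch of the technique, not Nod.ie's actual code):

```javascript
// Sketch: derive a 0..1 audio activity level from time-domain samples
// as getByteTimeDomainData() supplies them: a Uint8Array where silence
// sits at 128. RMS of the centered signal gives a smooth level.
function audioActivity(samples) {
  let sumSquares = 0;
  for (const s of samples) {
    const centered = (s - 128) / 128; // map 0..255 -> roughly -1..1
    sumSquares += centered * centered;
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Silence (all 128) yields 0; a signal pinned to the negative rail
// (all 0) yields 1.
```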
### UI (`index.html`)

- Circular overlay design with CSS animations
- Visual feedback states:
  - Purple (idle/listening)
  - Red (muted)
  - Spinning ring (audio activity)
  - Yellow spin (thinking)
- Draggable window with CSS `-webkit-app-region`
```mermaid
sequenceDiagram
    participant User
    participant Nod.ie
    participant Unmute
    participant STT
    participant LLM
    participant TTS

    User->>Nod.ie: Speaks into microphone
    activate Nod.ie
    Note over Nod.ie: opus-recorder captures<br/>audio in OGG Opus format
    Nod.ie->>Unmute: input_audio_buffer.append<br/>(Base64 Opus, 250ms chunks)
    activate Unmute
    Unmute->>STT: Stream audio
    activate STT
    STT-->>Unmute: Transcription<br/>(real-time)
    deactivate STT
    Note over Unmute: Detects end of speech<br/>(semantic VAD)
    Unmute->>LLM: Generate response<br/>(with system prompt)
    activate LLM
    LLM-->>Unmute: Text response<br/>(streaming)
    deactivate LLM
    Unmute->>TTS: Convert text to speech
    activate TTS
    TTS-->>Unmute: Audio chunks<br/>(Opus format)
    deactivate TTS
    Unmute-->>Nod.ie: response.audio.delta<br/>(Base64 Opus)
    deactivate Unmute
    Note over Nod.ie: AudioWorklet decodes<br/>and plays audio
    Nod.ie-->>User: Plays response
    deactivate Nod.ie
```
- Capture: opus-recorder with OGG Opus container format
- Streaming: 250ms chunks sent via WebSocket
- Format: Base64-encoded Opus audio in `input_audio_buffer.append` messages
- Processing: Unmute handles STT using Moshi models (~2.6GB VRAM)
- Reception: `response.audio.delta` messages with base64 Opus audio
- Decoding: AudioWorklet with Opus decoder (WASM)
- Playback: Real-time audio streaming through Web Audio API
- Voice: Configurable (8 available voices)
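Wrapping a captured chunk into the append message is straightforward; a minimal sketch (the `ws.send` usage at the end is hypothetical):

```javascript
// Sketch: wrap a 250ms Opus chunk (a Uint8Array of OGG Opus data, as
// opus-recorder's ondataavailable callback delivers it) into the
// input_audio_buffer.append message Unmute expects.
function buildAppendMessage(chunk) {
  return JSON.stringify({
    type: 'input_audio_buffer.append',
    audio: Buffer.from(chunk).toString('base64'),
  });
}

// Hypothetical usage inside the recorder callback:
//   recorder.ondataavailable = (chunk) => ws.send(buildAppendMessage(chunk));
```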
Nod.ie uses a subset of Unmute's WebSocket API:
```javascript
// Session initialization
{
  "type": "session.update",
  "session": {
    "id": "nodie-timestamp",
    "voice": "unmute-prod-website/ex04_narration_longform_00001.wav",
    "model": "unmute-mini",
    "modalities": ["text", "audio"],
    "allow_recording": false
  }
}

// Audio streaming
{
  "type": "input_audio_buffer.append",
  "audio": "base64-encoded-opus-data"
}

// Audio commit (triggers processing)
{
  "type": "input_audio_buffer.commit"
}
```

- Always Listening: Continuous audio streaming when unmuted
- Low Latency: <200ms response time with local Unmute
- Visual Feedback: Real-time audio visualization using Web Audio API
- Error Handling: Automatic reconnection on WebSocket failure
- Privacy: Click to mute, no audio stored when `allow_recording: false`
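The reconnect-on-failure behaviour could be implemented with capped exponential backoff; a sketch (the delay values are assumptions, not Nod.ie's actual tuning):

```javascript
// Sketch: capped exponential backoff for WebSocket reconnection.
// Base delay and cap are illustrative values.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

function scheduleReconnect(attempt, connect) {
  // connect() would open a fresh WebSocket to ws://localhost:8765;
  // on another failure the caller retries with attempt + 1.
  setTimeout(connect, backoffDelay(attempt));
}
```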
## Integrations

- Unmute Backend: WebSocket connection for voice processing
- Ollama: LLM inference (via Unmute)
- n8n: Webhook notifications for automation
- Claude Code: Can trigger TTS through system hooks
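Posting a notification to the configured n8n webhook might look like the following; the payload shape is hypothetical, so check which fields your n8n workflow expects:

```javascript
// Sketch: notify an n8n webhook about an assistant event.
// The payload fields are hypothetical examples.
function buildWebhookPayload(event, detail) {
  return {
    source: 'nodie',
    event,                              // e.g. 'transcript', 'muted'
    detail,
    timestamp: new Date().toISOString(),
  };
}

async function notifyN8n(webhookUrl, event, detail) {
  // fetch is global in Node 18+ and in Electron renderers
  await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildWebhookPayload(event, detail)),
  });
}
```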
Nod.ie supports multiple voices. The current voice is set to "Explanation" - a clear voice for explanatory content. To change voices, edit `/modules/websocket-handler.js` and use one of these paths:

- `unmute-prod-website/ex04_narration_longform_00001.wav` - Explanation (current)
- `unmute-prod-website/p329_022.wav` - Watercooler
- `unmute-prod-website/freesound/519189_request-42---hmm-i-dont-knowwav.mp3` - Quiz Show (British male)
- `unmute-prod-website/freesound/440565_why-is-there-educationwav.mp3` - Gertrude (warm female)
- `unmute-prod-website/developer-1.mp3` - Dev voice
## Testing

Nod.ie includes a comprehensive test suite to verify functionality:

```bash
# Run all tests (including GUI tests)
node tests/run-all-tests.js

# Run non-GUI tests only (faster)
node tests/run-non-electron-tests.js

# Run interactive browser test
node tests/serve-browser-test.js
# Then open http://localhost:8090
```

Key tests include WebSocket connectivity, audio format validation, and end-to-end voice interaction testing.
## Troubleshooting

| Problem | Solution |
|---|---|
| No audio output | • Check Unmute services are running • Verify audio device in system settings • Check Developer Tools console for errors |
| Can't hear me | • Click circle to unmute (should be purple, not red) • Grant microphone permissions • Check audio ring animation when speaking |
| Connection failed | • Verify Unmute at `ws://localhost:8765` • Check Docker containers are healthy • Restart Unmute backend |
| Audio format errors | • Browser must support Opus codec • Try different browser/Electron version • Check Unmute logs for OGG errors |
| Circle moves when clicking | • Fixed in latest version • Restart Nod.ie after update |
| Can't drag window | • Hold click for 300ms before dragging • Check window manager compatibility |
For detailed troubleshooting, see TROUBLESHOOTING.md
