Transform your SunFounder PiCar-X into a smart, voice-activated robot powered by local Large Language Models (LLMs). No cloud. No lag. Just fast, private AI at your fingertips. It can also turn your Raspberry Pi into a Google Home-like assistant powered by the LLM of your choice.
- Voice request capture — record your question directly from the robot.
- On-premise processing — audio is sent to your local computer for analysis.
- LLM-powered intelligence — a local LLM (via Ollama) generates a reply (Mistral, Gemma, Llama, etc.).
- Text-to-speech (TTS) — reply is converted to audio and played back on the robot.
- Robot Actions — LLM can command the robot to perform pre-defined actions (e.g., wave hands, nod, express emotions) synchronized with its speech.
All processing is done locally — ensuring fast responses and data privacy.
User → PiCar-X Mic → (client.py) → Local PC (ask_server.py + Ollama) → LLM (generates Text + Actions) → TTS (for Text) → Server sends (Audio + Action List) → PiCar-X (Speaker plays Audio, client.py executes Actions)
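The exact JSON contract between the LLM and the rest of the pipeline is defined by the prompt, but as an illustration only (the field names text and actions, and the action names shown here, are assumptions rather than the project's fixed schema) the reply the server parses could look like this:

```python
import json

# Hypothetical LLM reply; the real schema depends on the prompt used by ask_server.py
raw_reply = '{"text": "Hello! Nice to meet you.", "actions": ["wave_hands", "nod"]}'

parsed = json.loads(raw_reply)
spoken_text = parsed["text"]      # converted to audio by the TTS step
action_list = parsed["actions"]   # executed on the robot by client.py via preset_actions.py
```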
- Receives audio from PiCar-X
- Transcribes speech
- Sends prompt to LLM via Ollama
- Extracts text and action commands from LLM's JSON response
- Converts textual response into speech (TTS)
- Returns both the audio file and the list of actions to PiCar-X
✅ Requires Ollama with a compatible LLM installed (e.g. llama3, mistral, etc.)
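For orientation, here is a minimal, hedged sketch of what the server side could look like. It assumes a Flask app with an /ask endpoint, Ollama's standard /api/generate HTTP API on the default port, and placeholder transcribe()/synthesize() helpers standing in for whatever STT and TTS engines ask_server.py actually uses; it is not the project's implementation.

```python
import json
import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
OLLAMA_URL = "http://localhost:11434/api/generate"   # default Ollama endpoint
MODEL = "mistral"                                     # any model you have pulled

def transcribe(wav_bytes: bytes) -> str:
    """Placeholder: plug in your speech-to-text engine here."""
    raise NotImplementedError

def synthesize(text: str) -> bytes:
    """Placeholder: plug in your text-to-speech engine here, returning audio bytes."""
    raise NotImplementedError

@app.route("/ask", methods=["POST"])
def ask():
    question = transcribe(request.data)               # 1. speech -> text
    prompt = ("Reply as JSON with the keys 'text' and 'actions'.\n"
              f"User: {question}")
    resp = requests.post(OLLAMA_URL,
                         json={"model": MODEL, "prompt": prompt, "stream": False})
    reply = json.loads(resp.json()["response"])       # 2. LLM -> {"text": ..., "actions": [...]}
    audio = synthesize(reply["text"])                 # 3. text -> speech
    return jsonify({"actions": reply.get("actions", []),
                    "audio": audio.hex()})            # 4. hex-encoded audio + action list

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```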
- Detects the sound environment and calibrates the mic input settings
- Records voice input
- Sends audio to ask_server.py
- Receives the audio reply and a list of action commands
- Plays the audio reply through the speaker
- Executes the received robot actions using preset_actions.py
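Stripped of wake-word detection and recording, the round trip on the robot could be as small as the sketch below. The server URL, the /ask endpoint, the hex-encoded audio field, and the idea that preset_actions.py exposes one callable per action name all mirror the server sketch above and are assumptions, not the real client.py.

```python
import subprocess
import requests

import preset_actions  # the project's predefined robot actions

SERVER_URL = "http://<your-pc-ip>:5000/ask"  # hypothetical endpoint, see the server sketch

def ask(wav_path: str) -> None:
    # 1. Send the recorded question to the local server
    with open(wav_path, "rb") as f:
        reply = requests.post(SERVER_URL, data=f.read()).json()

    # 2. Save the spoken reply and play it through the configured ALSA speaker
    with open("reply.wav", "wb") as out:
        out.write(bytes.fromhex(reply["audio"]))
    subprocess.run(["aplay", "reply.wav"])

    # 3. Execute each requested action, assuming preset_actions exposes a
    #    function with the same name (an assumption about its interface)
    for name in reply.get("actions", []):
        action = getattr(preset_actions, name, None)
        if callable(action):
            action()
```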
- Install Ollama and pull your preferred LLM (a quick connectivity check is sketched after these steps).

- Clone this repo and install dependencies:

  ```
  python3 -m venv .venv
  source .venv/bin/activate
  pip install -r requirements.txt
  ```

- Run the server:

  ```
  python ask_server.py
  ```
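Before starting the server you can check that Ollama is reachable and that your model has been pulled; /api/tags is Ollama's standard endpoint for listing local models (the default port 11434 is assumed unchanged):

```python
import requests

# List the models known to the local Ollama instance
tags = requests.get("http://localhost:11434/api/tags").json()
print([m["name"] for m in tags.get("models", [])])  # e.g. ['mistral:latest', 'llama3:latest']
```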
- Enable microphone and speaker support.

  Some useful commands for managing the microphone and speaker:

  Microphone — find which hw ID is used by the mic:

  ```
  arecord -l
  ```

  Speaker — list the hw speakers, test the sound, and test a specific hw ID:

  ```
  aplay -l
  aplay /usr/share/sounds/alsa/Front_Center.wav
  aplay -D plughw:2,0 /usr/share/sounds/alsa/Front_Center.wav
  ```
  Create /root/.asoundrc or /etc/asound.conf with (see Examples/.asoundrc):

  ```
  ############################################################
  pcm.dmixer {
      type dmix
      ipc_key 1024
      slave {
          pcm "hw:0,0"        # HAT = card 0, device 0
          rate 48000          # native sample rate of the DAC
          period_size 1024
          buffer_size 4096
      }
  }
  ############################################################
  pcm.capplug {
      type plug
      slave {
          pcm "hw:1,0"        # USB mic = card 1, device 0
          rate 48000          # native sample rate of the mic
      }
  }
  ############################################################
  pcm.asym {
      type asym
      playback.pcm "dmixer"
      capture.pcm "capplug"
  }
  ############################################################
  pcm.!default {
      type plug
      slave.pcm "asym"
  }

  ctl.!default {
      type hw
      card 0                  # global control on card 0 (the HAT)
  }
  ```
- Install Python dependencies:

  ```
  sudo apt install portaudio19-dev
  python3 -m venv .venv
  source .venv/bin/activate
  pip install pvporcupine webrtcvad pyaudio requests
  ```
- Run the client in the root session:

  ```
  python client.py

  # Optional: specify an alternate microphone index
  python client.py --input-device 1
  ```

  The client uses the capture device defined in your .asoundrc by default. See Examples/.asoundrc for a ready-to-use configuration. Use --input-device to override the PyAudio index when required.
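To find the right value for --input-device, you can list the audio devices PyAudio sees and pick the index of your USB microphone:

```python
import pyaudio

pa = pyaudio.PyAudio()
for i in range(pa.get_device_count()):
    info = pa.get_device_info_by_index(i)
    if info.get("maxInputChannels", 0) > 0:      # capture-capable devices only
        print(i, info["name"])
pa.terminate()
```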
Read Project.md for more information about the project's technical details and progress status.
MIT License
- Based on the PiCar-X by SunFounder
- Local LLM support powered by Ollama
Questions or ideas? Contributions welcome!