Project for an automated dog trainer that performs dog pose recognition, speaks instructions, and dispenses treats via a servo. The system includes:
- A device application (Raspberry Pi / Linux) that runs a vision pipeline, servo control, and a WebSocket server for host/frontend control.
- A React frontend that connects via WebSocket to the device for manual control and sending recorded audio.
- An optional InfluxDB persistence pipeline (collector + writer) to store events and metrics for visualization.
Dataset:
The training dataset used for pose classification is available at:
https://universe.roboflow.com/vitto-rossetto-nbvy6/dog-pose-ntugs
Quick Links
- Device main: `src/main.py`
- Device host comms / WS server: `src/host_comms.py`
- Audio helpers (TTS + playback): `src/audio_comms.py`
- Influx writer: `src/influx_writer.py`
- Influx collector & API: `src/influx_collector.py`, `src/influx_api.py`
- Frontend app: `frontend/` (React + Vite)
- Scripts: `scripts/start_influx_services.py`
Features
- Real-time pose classification and pose transition events
- Device-hosted WebSocket server for low-latency bidirectional commands
- Commands from host/frontend: `set_mode`, `servo` actions, `treat_now`, `audio` (TTS or prerecorded/base64)
- Audio: TTS via `espeak` or `pico2wave`, plus playback of prerecorded files (`src/recordings/`) and base64 audio blobs
- Optional persistence to InfluxDB (supports the v2 client, the v1 client, or an HTTP line-protocol fallback)
- Simple React UI for manual control and an Influx viewer
Requirements
- Python 3.8+ on the device and host machines
- System packages (device): `ffmpeg` / PulseAudio utilities for audio playback (e.g. `paplay`, `ffplay`, `aplay`), plus a TTS engine such as `espeak` or `pico2wave`
- Python packages (recommended): `flask`, `websockets`, `requests`, `opencv-python`, `torch` (if using the included PyTorch models), plus `influxdb-client` or `influxdb` if you want Python client support
- Node.js and npm/yarn for building and running the frontend
Note: The project includes an HTTP fallback for Influx v1 (line-protocol POST) so the Python Influx libraries are optional for simple setups.
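For reference, a v1 line-protocol write needs nothing beyond `requests`. A minimal sketch, assuming an Influx 1.x server on localhost:8086 with the `dog_training` database already created (the `treats` measurement here is only illustrative):

```python
# Hypothetical example of the HTTP line-protocol fallback (InfluxDB 1.x write API).
import time

import requests

# Line protocol: measurement,tag=value field=value timestamp
line = f"treats,source=manual value=1 {int(time.time())}"
resp = requests.post(
    "http://localhost:8086/write?db=dog_training&precision=s",
    data=line,
    timeout=5,
)
resp.raise_for_status()  # InfluxDB answers 204 No Content on success
```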
Models aren't provided by default because of their size:
- `model/yolov8n.pt` is downloaded automatically on first run
- `model/best.pt` has to be trained manually by running `notebook/dogPoseClassifierTrain.ipynb`, either locally or on Colab
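For orientation, loading the two models typically looks like the sketch below. It assumes the `ultralytics` package for the detector and a classifier checkpoint saved as a whole model; the actual architecture and loading code are defined in the training notebook and `src/`, so adapt as needed.

```python
# Sketch only; not the repo's actual loading code.
import torch
from ultralytics import YOLO  # pip install ultralytics

detector = YOLO("model/yolov8n.pt")  # fetched automatically on first run (see note above)

# model/best.pt comes from notebook/dogPoseClassifierTrain.ipynb. If the notebook
# saves a state_dict instead of a full model, instantiate the network first and
# call load_state_dict() on it.
classifier = torch.load("model/best.pt", map_location="cpu")
classifier.eval()
```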
Device setup
- Create and activate a virtual environment and install the Python dependencies:
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
- Ensure audio tools and TTS engines are installed:
```bash
sudo apt update
sudo apt install pulseaudio-utils espeak ffmpeg
```
- Start the device app:
```bash
python3 src/main.py
```
- Default device WebSocket server URL (used by the frontend): `ws://raspberrypi.local:8765/ws`
Raspberry Pis are usually reachable at raspberrypi.local, but you may need to replace it with your Pi's IP address (e.g. 192.168.80.173).
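As a quick connectivity check from any host, the sketch below (assuming the `websockets` Python package and the default URL above) connects to the device, sends a `set_mode` command, and prints the first event envelope it receives:

```python
# Hypothetical smoke test; not part of the repo.
import asyncio
import json

import websockets  # pip install websockets


async def main():
    uri = "ws://raspberrypi.local:8765/ws"  # replace host with your Pi's IP if needed
    async with websockets.connect(uri) as ws:
        # Switch the device to automatic training mode.
        await ws.send(json.dumps({"cmd": "set_mode", "mode": "auto"}))
        # Print the first broadcast envelope the device sends back.
        print(json.loads(await ws.recv()))


asyncio.run(main())
```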
Influx persistence (optional)
These components run on the Influx host (or on the same machine as the device) and connect to the device WebSocket to receive events and write them into the `dog_training` database.
- Configure `src/influx_collector.py` to point at your device WS and Influx server.
- Start the collector and API:
```bash
python3 scripts/start_influx_services.py
```
- The collector writes events to Influx, and `src/influx_api.py` exposes a small HTTP proxy used by the frontend viewer.
Note: If your Influx version is v1.x (e.g. 1.6.7), the writer will use an HTTP line-protocol fallback when the Python client isn't available.
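Conceptually, the collector is a small event-forwarding loop. The sketch below is not the repo's implementation, but it shows the idea under the v1 fallback, assuming the default device WS URL and an Influx 1.x server on localhost:8086:

```python
# Illustrative only; the real logic lives in src/influx_collector.py and
# src/influx_writer.py.
import asyncio
import json

import requests
import websockets

DEVICE_WS = "ws://raspberrypi.local:8765/ws"
WRITE_URL = "http://localhost:8086/write?db=dog_training&precision=s"


async def forward_events():
    async with websockets.connect(DEVICE_WS) as ws:
        async for message in ws:  # each message is one event envelope
            event = json.loads(message)
            if event.get("type") != "event":
                continue
            # Encode the envelope as line protocol: measurement,tag field timestamp.
            line = f'events,event={event["event"]} value=1 {int(event["timestamp"])}'
            requests.post(WRITE_URL, data=line, timeout=5).raise_for_status()


asyncio.run(forward_events())
```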
Frontend
- From the `frontend/` directory, install dependencies and run the dev server:
```bash
cd frontend
npm install
npm run dev
```
- Open the UI in your browser and connect to the device WebSocket (use the Connect box). You can:
  - Send TTS text and prerecorded audio
  - Upload a small audio file (the UI will send base64 over WebSocket)
  - Dispense treats and control servo actions
Command examples
- Set mode: `{ "cmd": "set_mode", "mode": "auto" }`
- Treat now: `{ "cmd": "treat_now" }`
- Servo sweep: `{ "cmd": "servo", "action": "sweep" }`
- TTS: `{ "cmd": "audio", "text": "Good dog!" }`
- Send recorded audio (base64): `{ "cmd": "audio", "b64": "<BASE64>", "filename": "cheer.wav" }`
The device responds by broadcasting event envelopes to connected UIs. Events look like:
```
{ "type": "event", "event": "pose_transition", "timestamp": 1234567890.0, "payload": { ... } }
```
For more in-depth explanations of the project, please refer to ProjectPaper.pdf.
Author
Vittorio Rossetto
Contributing
Contributions are welcome. Open an issue or PR describing your change. Suggested next improvements:
- Mobile App