Project for an automated dog trainer that performs dog pose recognition, speaks instructions, and dispenses treats via a servo. The system includes:
- A device application (Raspberry Pi / Linux) that runs a vision pipeline, servo control, and a WebSocket server for host/frontend control.
- A React frontend that connects via WebSocket to the device for manual control and sending recorded audio.
- An optional InfluxDB persistence pipeline (collector + writer) to store events and metrics for visualization.
Dataset:
The training dataset used for pose classification is available at:
https://universe.roboflow.com/vitto-rossetto-nbvy6/dog-pose-ntugs
Quick Links
- Device main: `src/main.py`
- Device host comms / WS server: `src/host_comms.py`
- Audio helpers (TTS + playback): `src/audio_comms.py`
- Influx writer: `src/influx_writer.py`
- Influx collector & API: `src/influx_collector.py`, `src/influx_api.py`
- Frontend app: `frontend/` (React + Vite)
- Scripts: `scripts/start_influx_services.py`
Features
- Real-time pose classification and pose transition events
- Device-hosted WebSocket server for low-latency bidirectional commands
- Commands from host/frontend: `set_mode`, `servo` actions, `treat_now`, `audio` (TTS or prerecorded/base64)
- Audio: TTS via `espeak` or `pico2wave`, plus playback of prerecorded files (`src/recordings/`) and base64 audio blobs
- Optional persistence to InfluxDB (supports the v2 client, the v1 client, or an HTTP line-protocol fallback)
- Simple React UI for manual control and an Influx viewer
Requirements
- Python 3.8+ on the device and host machines
- System packages (device): `ffmpeg` / PulseAudio utilities for audio playback (e.g. `paplay`, `ffplay`, `aplay`), plus a TTS engine such as `espeak` or `pico2wave`
- Python packages (recommended): `flask`, `websockets`, `requests`, `opencv-python`, `torch` (if using the included PyTorch models), plus `influxdb-client` or `influxdb` if you want Python client support
- Node.js and npm/yarn for building and running the frontend
Note: The project includes an HTTP fallback for Influx v1 (line-protocol POST) so the Python Influx libraries are optional for simple setups.
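For reference, a v1 line-protocol write needs nothing beyond `requests`. A minimal sketch, assuming an Influx 1.x server on localhost:8086 with the `dog_training` database already created (the `treats` measurement here is only illustrative):

```python
# Hypothetical example of the HTTP line-protocol fallback (InfluxDB 1.x write API).
import time

import requests

# Line protocol: measurement,tag=value field=value timestamp
line = f"treats,source=manual value=1 {int(time.time())}"
resp = requests.post(
    "http://localhost:8086/write?db=dog_training&precision=s",
    data=line,
    timeout=5,
)
resp.raise_for_status()  # InfluxDB answers 204 No Content on success
```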
Models aren't provided by default because of their size:
- `model/yolov8n.pt` is downloaded automatically on first run
- `model/best.pt` has to be trained manually by running `notebook/dogPoseClassifierTrain.ipynb`, either locally or on Colab
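For orientation, loading the two models typically looks like the sketch below. It assumes the `ultralytics` package for the detector and a classifier checkpoint saved as a whole model; the actual architecture and loading code are defined in the training notebook and `src/`, so adapt as needed.

```python
# Sketch only; not the repo's actual loading code.
import torch
from ultralytics import YOLO  # pip install ultralytics

detector = YOLO("model/yolov8n.pt")  # fetched automatically on first run (see note above)

# model/best.pt comes from notebook/dogPoseClassifierTrain.ipynb. If the notebook
# saves a state_dict instead of a full model, instantiate the network first and
# call load_state_dict() on it.
classifier = torch.load("model/best.pt", map_location="cpu")
classifier.eval()
```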
Device setup
- Create and activate a virtual environment and install the Python dependencies:
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
- Ensure audio tools and TTS engines are installed:
```bash
sudo apt update
sudo apt install pulseaudio-utils espeak ffmpeg
```
- Start the device app:
```bash
python3 src/main.py
```
- Default device WebSocket server URL (used by the frontend): `ws://raspberrypi.local:8765/ws`
Raspberry Pis are usually reachable at raspberrypi.local, but you may need to replace it with your Pi's IP address (e.g. 192.168.80.173).
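As a quick connectivity check from any host, the sketch below (assuming the `websockets` Python package and the default URL above) connects to the device, sends a `set_mode` command, and prints the first event envelope it receives:

```python
# Hypothetical smoke test; not part of the repo.
import asyncio
import json

import websockets  # pip install websockets


async def main():
    uri = "ws://raspberrypi.local:8765/ws"  # replace host with your Pi's IP if needed
    async with websockets.connect(uri) as ws:
        # Switch the device to automatic training mode.
        await ws.send(json.dumps({"cmd": "set_mode", "mode": "auto"}))
        # Print the first broadcast envelope the device sends back.
        print(json.loads(await ws.recv()))


asyncio.run(main())
```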
Influx persistence (optional)
These components run on the Influx host (or on the same machine as the device) and connect to the device WebSocket to receive events and write them into the `dog_training` database.
- Configure `src/influx_collector.py` to point at your device WS and Influx server.
- Start the collector and API:
```bash
python3 scripts/start_influx_services.py
```
- The collector writes events to Influx, and `src/influx_api.py` exposes a small HTTP proxy used by the frontend viewer.
Note: If your Influx version is v1.x (e.g. 1.6.7), the writer will use an HTTP line-protocol fallback when the Python client isn't available.
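Conceptually, the collector is a small event-forwarding loop. The sketch below is not the repo's implementation, but it shows the idea under the v1 fallback, assuming the default device WS URL and an Influx 1.x server on localhost:8086:

```python
# Illustrative only; the real logic lives in src/influx_collector.py and
# src/influx_writer.py.
import asyncio
import json

import requests
import websockets

DEVICE_WS = "ws://raspberrypi.local:8765/ws"
WRITE_URL = "http://localhost:8086/write?db=dog_training&precision=s"


async def forward_events():
    async with websockets.connect(DEVICE_WS) as ws:
        async for message in ws:  # each message is one event envelope
            event = json.loads(message)
            if event.get("type") != "event":
                continue
            # Encode the envelope as line protocol: measurement,tag field timestamp.
            line = f'events,event={event["event"]} value=1 {int(event["timestamp"])}'
            requests.post(WRITE_URL, data=line, timeout=5).raise_for_status()


asyncio.run(forward_events())
```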
Frontend
- From the `frontend/` directory, install dependencies and run the dev server:
```bash
cd frontend
npm install
npm run dev
```
- Open the UI in your browser and connect to the device WebSocket (use the Connect box). You can:
  - Send TTS text and prerecorded audio
  - Upload a small audio file (the UI will send base64 over WebSocket)
  - Dispense treats and control servo actions
Command examples
- Set mode: `{ "cmd": "set_mode", "mode": "auto" }`
- Treat now: `{ "cmd": "treat_now" }`
- Servo sweep: `{ "cmd": "servo", "action": "sweep" }`
- TTS: `{ "cmd": "audio", "text": "Good dog!" }`
- Send recorded audio (base64): `{ "cmd": "audio", "b64": "<BASE64>", "filename": "cheer.wav" }`
The device responds by broadcasting event envelopes to connected UIs. Events look like:
```
{ "type": "event", "event": "pose_transition", "timestamp": 1234567890.0, "payload": { ... } }
```
For more in-depth explanations of the project, please refer to ProjectPaper.pdf.
Author
Vittorio Rossetto
Contributing
Contributions are welcome. Open an issue or PR describing your change. Suggested next improvements:
- Mobile App