SenseCAP Watcher Local Server

A complete local replacement for SenseCAP cloud services with offline AI capabilities for voice interaction, vision analysis, and task automation.

Features

Core Services

✅ Voice Interaction Pipeline - Complete audio chat with Whisper STT, Ollama LLM, and Piper TTS
✅ Vision Analysis - LLaVA-powered image understanding and event detection
✅ Task Automation - Create and manage monitoring tasks with natural language
✅ Offline AI - All AI processing runs locally (no cloud required)
✅ Docker Support - Easy deployment with docker-compose
✅ Dual Mode Operation - Chat mode for conversations, Task mode for automation

API Implementation

✅ Full API Compatibility - Implements all SenseCAP Watcher local server endpoints
✅ Authentication - Optional token-based authentication
✅ Database Storage - SQLite persistence for tasks and events
✅ Standards Compliant - Follows the exact API spec derived from the firmware source

Architecture

┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│  SenseCAP       │      │  Local Server    │      │  AI Services    │
│  Watcher Device │◄────►│  (Go)            │◄────►│                 │
│                 │      │  Port 8834       │      │  - Ollama LLM   │
│  - Camera       │      │                  │      │  - LLaVA Vision │
│  - Microphone   │      │  ┌────────────┐  │      │  - Whisper STT  │
│  - Speaker      │      │  │ SQLite DB  │  │      │  - Piper TTS    │
└─────────────────┘      │  └────────────┘  │      └─────────────────┘
                         │                  │
                         │  Audio Service   │
                         │  (Python)        │
                         │  Port 8835       │
                         └──────────────────┘

Quick Start

Option 1: Docker (Recommended)

1. Copy environment file:

cp .env.example .env
# Edit .env to set your AUTH_TOKEN

2. Start all services:

docker-compose up -d

3. Pull AI models:

docker exec sensecap-ollama ollama pull llama3.1:8b-instruct-q4_1
docker exec sensecap-ollama ollama pull llava:7b

4. Check status:

docker-compose ps
curl http://localhost:8834/health
curl http://localhost:8835/health

Option 2: Local Development

1. Install Dependencies:

# Install Go dependencies and download AI models
make install

# Set up Python environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r python/requirements.txt

2. Install AI Services:

Ollama (LLM + Vision):

# macOS/Linux
curl -fsSL https://ollama.com/install.sh | sh

# Or via brew on macOS
brew install ollama
brew services start ollama

# Then pull the models (after either install method)
ollama pull llama3.1:8b-instruct-q4_1
ollama pull llava:7b

3. Start Services:

Using the startup script:

chmod +x scripts/start-all.sh
./scripts/start-all.sh

Or manually:

# Terminal 1: Audio service
source venv/bin/activate
python3 python/audio_service.py

# Terminal 2: Go server
make run
# Or with authentication:
make run TOKEN=your-secret-token

Project Structure

sensecap-server/
├── cmd/
│   └── server/
│       └── main.go              # Application entry point
├── internal/
│   ├── config/                  # Configuration management
│   ├── handlers/                # HTTP handlers
│   │   ├── audio_stream.go     # Voice interaction endpoint
│   │   ├── vision.go           # Image analysis endpoint
│   │   ├── notification.go     # Event notification endpoint
│   │   ├── task_detail.go      # Task flow endpoint
│   │   ├── constants.go        # API constants
│   │   └── prompts.go          # AI prompts
│   ├── middleware/              # HTTP middleware
│   ├── database/                # SQLite layer
│   └── models/                  # Data models
├── python/
│   ├── audio_service.py         # Whisper STT + Piper TTS service
│   ├── requirements.txt         # Python dependencies
│   └── Dockerfile              # Python service container
├── scripts/
│   └── start-all.sh            # Startup script
├── Dockerfile                   # Go service container
├── docker-compose.yaml          # Multi-service orchestration
├── Makefile                     # Development commands
├── go.mod                       # Go dependencies
└── *.md                         # Documentation

API Endpoints

V2 API (Voice & Tasks)

POST /v2/watcher/talk/audio_stream

Voice interaction endpoint with dual-mode support.

Modes:

Chat Mode: Natural conversation with the AI
Task Mode: Create monitoring tasks via voice ("notify me when...")

Request Headers:

Authorization: <token>
API-OBITER-DEVICE-EUI: <16-char hex EUI>
Content-Type: application/octet-stream

Request Body: Raw PCM audio (16kHz, 16-bit, mono)

Response: Multipart (JSON + audio)
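
The header requirements above can be checked on the client before a request is sent. The following is a minimal sketch, not code from this repository: `build_watcher_headers` is a hypothetical helper, and the validation assumes the EUI is exactly 16 hexadecimal characters, as the header description states.

```python
import re

# Hypothetical helper for illustration; not part of the server code.
EUI_PATTERN = re.compile(r"[0-9A-Fa-f]{16}")


def build_watcher_headers(token, device_eui, content_type="application/octet-stream"):
    """Return the request headers listed above, validating the EUI first."""
    if not EUI_PATTERN.fullmatch(device_eui):
        raise ValueError("device EUI must be exactly 16 hex characters")
    headers = {
        "API-OBITER-DEVICE-EUI": device_eui,
        "Content-Type": content_type,
    }
    # Authentication is optional on the server, so a token is sent only if one is set.
    if token:
        headers["Authorization"] = token
    return headers
```

Passing `content_type="application/json"` would give the headers used for the JSON endpoints.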

POST /v2/watcher/talk/view_task_detail

Get task flow details for a created monitoring task.

Request: {"task_id": "abc123"}

Response: Task flow JSON with nodes and edges

V1 API (Vision & Events)

POST /v1/watcher/vision

Image analysis with LLaVA vision model.

Request:

{
  "img": "base64-image",
  "prompt": "What do you see?",
  "type": 1
}
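
As a rough illustration of how a client might assemble this request body, the sketch below encodes raw image bytes and serializes the three fields shown above. `build_vision_payload` is a hypothetical name, and the sketch assumes `img` carries plain base64 with no data-URI prefix; the meaning of `type` is not described here, so the value is passed through unchanged.

```python
import base64
import json


def build_vision_payload(image_bytes, prompt, request_type=1):
    """Build the JSON body for POST /v1/watcher/vision from raw image bytes."""
    return json.dumps({
        # Assumption: plain base64 text, without a "data:image/..." prefix.
        "img": base64.b64encode(image_bytes).decode("ascii"),
        "prompt": prompt,
        "type": request_type,
    })
```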

POST /v1/notification/event

Receive device notifications and sensor data.

Request:

{
  "requestId": "uuid",
  "deviceEui": "...",
  "events": {
    "timestamp": 1234567890,
    "text": "Alert message",
    "data": {
      "inference": {...},
      "sensor": {...}
    }
  }
}
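
For anyone consuming these events in their own tooling, a payload of this shape could be unpacked along the following lines. This is only a sketch: `NotificationEvent` and `parse_notification` are hypothetical names, and treating `text`, `inference`, and `sensor` as optional is an assumption rather than something the API spec above confirms.

```python
import json
from dataclasses import dataclass, field


@dataclass
class NotificationEvent:
    request_id: str
    device_eui: str
    timestamp: int
    text: str = ""
    inference: dict = field(default_factory=dict)
    sensor: dict = field(default_factory=dict)


def parse_notification(raw):
    """Parse a notification JSON string into a NotificationEvent."""
    body = json.loads(raw)
    events = body["events"]
    # Assumption: "data" and its children may be absent, so fall back to empty dicts.
    data = events.get("data") or {}
    return NotificationEvent(
        request_id=body["requestId"],
        device_eui=body["deviceEui"],
        timestamp=events["timestamp"],
        text=events.get("text", ""),
        inference=data.get("inference") or {},
        sensor=data.get("sensor") or {},
    )
```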

Health Checks

GET /health - Go server health
GET http://localhost:8835/health - Python audio service health
GET http://localhost:11434/api/tags - Ollama service
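
The three checks above can be polled from one script. The sketch below uses only the Python standard library and assumes that an HTTP 200 response means the service is healthy; the response bodies are not inspected because their format is not described here. `HEALTH_URLS`, `is_healthy`, and `check_all` are illustrative names.

```python
import http.client
import urllib.request

# URLs taken from the list above; adjust hosts and ports to your deployment.
HEALTH_URLS = {
    "go-server": "http://localhost:8834/health",
    "audio-service": "http://localhost:8835/health",
    "ollama": "http://localhost:11434/api/tags",
}


def is_healthy(url, timeout=3.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, refused connections, and timeouts are all OSError subclasses.
        return False


def check_all(urls=None, timeout=3.0):
    """Return a {service name: bool} map for every URL checked."""
    urls = HEALTH_URLS if urls is None else urls
    return {name: is_healthy(url, timeout) for name, url in urls.items()}
```

Calling `check_all()` returns a dictionary keyed by service name, which is convenient for a startup script that waits until every value is `True`.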

Configuration

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| `SERVER_PORT` | 8834 | Go server port |
| `DB_PATH` | data/sensecap.db | SQLite database path |
| `AUTH_TOKEN` | (none) | Authentication token |
| `WHISPER_URL` | http://localhost:8835 | Whisper STT service |
| `PIPER_URL` | http://localhost:8835 | Piper TTS service |
| `OLLAMA_URL` | http://localhost:11434 | Ollama LLM service |
| `OLLAMA_MODEL` | llama3.1:8b-instruct-q4_1 | LLM model |
| `LLAVA_MODEL` | llava:7b | Vision model |
| `PIPER_VOICE` | en_US-lessac-medium | Piper TTS voice model |
| `API_HOST` | localhost | API callback host |
| `API_SCHEMA` | http | API callback schema |
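
A helper script that talks to the server could resolve the same settings with the same defaults. The sketch below mirrors the table above; it is not the Go server's own configuration code, and `DEFAULTS` and `load_config` are illustrative names. An empty string stands in for the unset `AUTH_TOKEN`.

```python
import os

# Defaults copied from the environment variable table above.
DEFAULTS = {
    "SERVER_PORT": "8834",
    "DB_PATH": "data/sensecap.db",
    "AUTH_TOKEN": "",  # no token by default
    "WHISPER_URL": "http://localhost:8835",
    "PIPER_URL": "http://localhost:8835",
    "OLLAMA_URL": "http://localhost:11434",
    "OLLAMA_MODEL": "llama3.1:8b-instruct-q4_1",
    "LLAVA_MODEL": "llava:7b",
    "PIPER_VOICE": "en_US-lessac-medium",
    "API_HOST": "localhost",
    "API_SCHEMA": "http",
}


def load_config(env=None):
    """Return settings from the environment, falling back to the defaults."""
    env = os.environ if env is None else env
    config = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    # The port is the only numeric setting; fail early if it is not an integer.
    config["SERVER_PORT"] = int(config["SERVER_PORT"])
    return config
```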

Changing TTS Voice

The Piper TTS voice can be customized by setting the PIPER_VOICE environment variable. Available voices can be browsed at Piper Voices.

Popular English voices:

| Voice | Description | Quality | Download Command |
| --- | --- | --- | --- |
| `en_US-lessac-medium` | Neutral, clear (default) | Medium | Default |
| `en_US-amy-medium` | Female, professional | Medium | `PIPER_VOICE=en_US-amy-medium make download-models` |
| `en_US-ryan-medium` | Male, deep | Medium | `PIPER_VOICE=en_US-ryan-medium make download-models` |
| `en_US-kristin-medium` | Female, warm | Medium | `PIPER_VOICE=en_US-kristin-medium make download-models` |
| `en_US-danny-low` | Male, fast | Low | `PIPER_VOICE=en_US-danny-low make download-models` |

To change the voice:

1. Download the voice model:

PIPER_VOICE=en_US-amy-medium make download-models

2. Set the environment variable in .env:
```
PIPER_VOICE=en_US-amy-medium
```

3. Restart the audio service:

# If using Docker
docker-compose restart audio-service

# If running locally, stop the running audio service, then start it again
source venv/bin/activate
python3 python/audio_service.py

The audio service loads the selected voice automatically on startup.

Command-Line Flags

go run ./cmd/server \
  -port 8834 \
  -token my-secret-token \
  -whisper-url http://localhost:8835 \
  -ollama-url http://localhost:11434 \
  -ollama-model llama3.1:8b-instruct-q4_1 \
  -llava-model llava:7b \
  -piper-url http://localhost:8835 \
  -db-path data/sensecap.db

Configure Your Device

Use AT commands via Bluetooth to configure your SenseCAP Watcher:

Notification Proxy:

AT+localservice={"data":{"notification_proxy":{
  "switch":1,"url":"http://<your-ip>:8834","token":"your-token"}}}

Image Analyzer:

AT+localservice={"data":{"image_analyzer":{
  "switch":1,"url":"http://<your-ip>:8834","token":"your-token"}}}

Replace <your-ip> with your server's IP address.
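
Typing nested JSON by hand is error-prone, so the command string can be generated instead. The sketch below is illustrative only: `build_localservice_command` is a hypothetical helper, it assumes `switch` set to 0 disables a service, and it emits the command on a single line with compact JSON. How the command is terminated and sent over Bluetooth is not covered here.

```python
import json

SERVICES = ("notification_proxy", "image_analyzer")


def build_localservice_command(service, url, token="", enabled=True):
    """Return an AT+localservice command string for one service."""
    if service not in SERVICES:
        raise ValueError("service must be one of: " + ", ".join(SERVICES))
    payload = {
        "data": {
            service: {
                "switch": 1 if enabled else 0,  # assumption: 0 turns the service off
                "url": url,
                "token": token,
            }
        }
    }
    # Compact separators keep the command on one line with no extra spaces.
    return "AT+localservice=" + json.dumps(payload, separators=(",", ":"))
```

For example, `build_localservice_command("image_analyzer", "http://192.168.1.10:8834", "your-token")` produces the Image Analyzer command with the server address filled in; the IP address here is a placeholder.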

Development

Make Commands

make help          # Show all available commands
make build         # Build the Go binary
make run           # Run without auth
make run TOKEN=xx  # Run with auth
make test          # Run tests
make clean         # Clean build artifacts
make fmt           # Format code
make lint          # Run linters

Docker Commands

docker-compose up -d        # Start all services
docker-compose down         # Stop all services
docker-compose logs -f      # View logs
docker-compose ps           # Check status
docker-compose restart      # Restart services

Testing Endpoints

Test voice endpoint:

curl -X POST http://localhost:8834/v2/watcher/talk/audio_stream \
  -H "Authorization: your-token" \
  -H "API-OBITER-DEVICE-EUI: 2CF7F1C04430000C" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @test-audio.pcm
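
The command above expects a `test-audio.pcm` file in the documented format (16 kHz, 16-bit, mono). If no recording is at hand, a synthetic tone is enough to confirm that the endpoint accepts the upload, although it will not transcribe to meaningful speech. The sketch below assumes signed little-endian samples, which the format description does not state explicitly; the function names are illustrative.

```python
import math
import struct

SAMPLE_RATE = 16000  # Hz, as required by the audio_stream endpoint


def make_test_tone(duration_s=1.0, freq_hz=440.0, amplitude=0.3):
    """Return raw signed 16-bit little-endian mono PCM for a sine tone."""
    n = int(SAMPLE_RATE * duration_s)
    samples = [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n)
    ]
    return struct.pack("<%dh" % n, *samples)


def write_test_audio(path="test-audio.pcm", **tone_kwargs):
    """Write the tone to disk and return the number of bytes written."""
    data = make_test_tone(**tone_kwargs)
    with open(path, "wb") as f:
        f.write(data)
    return len(data)
```

One second of audio at this rate is 16000 samples of 2 bytes each, so the default file is 32000 bytes.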

Test vision endpoint:

curl -X POST http://localhost:8834/v1/watcher/vision \
  -H "Authorization: your-token" \
  -H "API-OBITER-DEVICE-EUI: 2CF7F1C04430000C" \
  -H "Content-Type: application/json" \
  -d '{
    "img": "base64-encoded-image",
    "prompt": "Is there a person in this image?",
    "type": 1
  }'

Production Deployment

Security Best Practices

Always use authentication - Set a strong AUTH_TOKEN
Use HTTPS - Deploy behind a reverse proxy (Caddy, nginx)
Firewall rules - Restrict access to trusted devices
Regular updates - Keep AI models and dependencies updated
Monitor logs - Set up log aggregation and alerting

Performance Tuning

GPU Acceleration: Configure Ollama to use GPU for faster inference
Model Selection: Use smaller models (7B-8B) for lower latency
Resource Limits: Set Docker memory/CPU limits based on your hardware
Database: Regular VACUUM for SQLite optimization
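
The SQLite maintenance step can be scripted with the standard library. This is a minimal sketch, assuming the default `DB_PATH` of `data/sensecap.db` and assuming the server is stopped or idle while it runs; `vacuum_database` is an illustrative name.

```python
import os
import sqlite3


def vacuum_database(db_path="data/sensecap.db"):
    """Run VACUUM and return the file size in bytes before and after."""
    # getsize raises FileNotFoundError for a wrong path, which avoids
    # silently creating an empty database file.
    before = os.path.getsize(db_path)
    # isolation_level=None selects autocommit mode; VACUUM cannot run
    # inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
    return before, os.path.getsize(db_path)
```

The size difference gives a rough sense of how much space deleted tasks and events were still occupying.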

Scaling

For multiple devices:

Run multiple Go server instances behind a load balancer
Share a single Ollama/audio service instance
Use PostgreSQL instead of SQLite for better concurrency

Troubleshooting

Ollama connection errors:

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama
brew services start ollama  # macOS
# or, with Docker, confirm the container is up and list its installed models
docker exec sensecap-ollama ollama list

Audio service not responding:

# Check Python service
curl http://localhost:8835/health

# View Python logs
docker logs sensecap-audio
# or
tail -f python-audio.log

Build errors after restructuring:

go mod tidy
go clean -cache
make build

Documentation

LOCAL_SERVER_API.md - Complete API reference
PIPELINE.md - Voice interaction pipeline details
QUICK_START.md - Quick setup guide
openapi-local-server.yaml - OpenAPI specification

License

MIT License - see LICENSE file for details.

This implementation is based on an analysis of the SenseCAP Watcher factory firmware source code.
