Commit d91a572
release: v0.0.9 mode-gated AI and snapshot intelligence
1 parent 793a239 commit d91a572

File tree

13 files changed: +1423 additions, -333 deletions

AI/ARCHITECTURE.md

Lines changed: 200 additions & 133 deletions
Large diffs are not rendered by default.

AI/DECISIONS.md

Lines changed: 95 additions & 0 deletions
@@ -285,6 +285,101 @@ This file records key decisions made during planning. Before changing anything l
 ---

+## DEC-017 — Three deployment modes as a first-class runtime concept
+
+**Decision:** RTSPanda has exactly three deployment modes (`pi`, `standard`, `viewer`), resolved at startup via `RTSPANDA_MODE` or auto-detected from `runtime.GOARCH`. Each mode gates specific subsystems.
+
+**Rationale:**
+- The prior architecture changed behavior implicitly via individual env vars (`AI_MODE`, `AI_WORKER_URL`), which left room for misconfiguration and unclear capability expectations.
+- A single mode enum makes the deployment surface explicit and testable.
+- ARM auto-detection removes the most common Pi misconfiguration (running standard mode on a Pi).
+
+**Mode capabilities:**
+
+| Mode | YOLO detection | Snapshot AI | Notes |
+|------|----------------|-------------|-------|
+| `pi` | ✗ disabled | ✔ optional | Default on ARM |
+| `standard` | ✔ enabled | — | Default on x86 |
+| `viewer` | ✗ disabled | ✗ disabled | No AI of any kind |
+
+**Implications:**
+- `internal/mode/mode.go` is the authority on mode resolution.
+- `main.go` consults `deployMode.AIInferenceAllowed()` and `deployMode.SnapshotAIAllowed()` before starting each subsystem.
+- Forcing `RTSPANDA_MODE=standard` on ARM emits a warning but does not block startup (users may have capable ARM servers).
+
+**Status:** Decided. Do not add new per-feature env vars that circumvent mode gating.
+
+---
+## DEC-018 — Raspberry Pi is a viewer and snapshot AI node, not a YOLO inference host
+
+**Decision:** Raspberry Pi is explicitly unsupported as a real-time YOLO inference host. This constraint is enforced in code, documentation, scripts, and error messages. There is no "experimental AI on Pi" path.
+
+**Rationale:**
+- YOLOv8n ONNX inference requires ~400–600 MB RAM at runtime. A Pi 4 (4 GB) running the full stack (Go backend, mediamtx, 4 RTSP streams, SQLite) is already at its practical memory ceiling.
+- ONNX Runtime on arm64 runs at 3–8 FPS on a Pi 4 CPU — not viable for real-time alerting.
+- Thermal throttling degrades performance further under sustained load.
+- Marketing "experimental AI on Pi" would set false expectations and produce user frustration over limitations that are fundamental, not fixable.
+
+**Enforcement points:**
+- `internal/mode/mode.go` — `AIInferenceAllowed()` returns false for `ModePi`
+- `scripts/pi-up.sh` — `full` mode is blocked with an explicit error message
+- `docker-compose.yml` — `rtspanda-pi` service sets `RTSPANDA_MODE=pi`
+- `README.md` — contains a clear statement in the deployment section header
+
+**Status:** Non-negotiable. Do not add YOLO inference paths for Pi hardware.
+
+---
+## DEC-019 — Snapshot Intelligence Engine as the Pi AI replacement
+
+**Decision:** Pi mode's AI capability is the Snapshot Intelligence Engine: interval-based JPEG capture → external vision API (Claude or OpenAI) → structured events → Discord alerts.
+
+**Rationale:**
+- Cloud vision APIs (GPT-4o-mini, Claude Haiku) handle complex scene understanding that YOLO cannot (e.g., "Amazon driver detected" vs. "person detected").
+- API latency (1–5 seconds) is acceptable for interval-based alerting — this is not a continuous-tracking use case.
+- Cost per alert is negligible for homelab usage (< $0.001 per call for Haiku/mini).
+- No GPU, no model downloads, no Python process — the Pi handles only frame capture (FFmpeg) and HTTP dispatch.
+
+**Output contract:**
+- Emits `detection_events` rows with a schema identical to YOLO events.
+- Discord alerts use the same `NotifyExternalDetectionEvents` path with a `sourceLabel` of "Snapshot AI (Claude)" or "Snapshot AI (OpenAI)".
+- The UI does not need to distinguish snapshot AI events from YOLO events, and cannot.
+
+**Constraints:**
+- Not real-time. One API call per camera per interval tick.
+- Not suitable for sub-second response or continuous tracking.
+- Requires `SNAPSHOT_AI_ENABLED=true` and a valid `SNAPSHOT_AI_API_KEY`.
+- Positioning: "smart alerting via AI interpretation, not real-time detection."
+
+**Status:** Decided. Configuration via env vars; Settings UI integration is a follow-up task.
+
+---
+## DEC-020 — Snapshot AI uses external vision APIs, not a local model
+
+**Decision:** The Snapshot Intelligence Engine sends frames to hosted APIs (Anthropic Claude or OpenAI). There is no local vision model for Pi.
+
+**Rationale:**
+- Local vision models capable of scene description (LLaVA, InternVL, etc.) require 4–8 GB RAM — not viable on a Pi.
+- External APIs are available on demand, require no local resources, and produce higher-quality structured descriptions than YOLO labels.
+- The cloud dependency is explicit and opt-in (requires an API key). Core viewing and recording work without it.
+
+**Provider support:**
+- `claude` — uses `claude-haiku-4-5-20251001` (fast, cheap, capable vision)
+- `openai` — uses `gpt-4o-mini` (comparable cost and capability)
+
+**Implications:**
+- Snapshot AI activates only if `SNAPSHOT_AI_ENABLED=true` and `SNAPSHOT_AI_API_KEY` is set.
+- No API key = no AI on Pi (the core viewer still works fully).
+- The prompt is user-configurable (`SNAPSHOT_AI_PROMPT`) for property-specific detection.
+
+**Status:** Decided. Do not add local model paths for Pi — the hardware cannot support them.
+
+---
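Putting DEC-019 and DEC-020 together, a Pi node might be configured like this. `SNAPSHOT_AI_PROVIDER` is an assumed variable name for provider selection (the decision text names the providers but not the variable); the other variables appear in the decisions above:

```shell
# Minimal Snapshot AI configuration for a Pi node (sketch).
export RTSPANDA_MODE=pi                  # explicit, though ARM hosts auto-detect pi
export SNAPSHOT_AI_ENABLED=true          # off by default; no key means no AI
export SNAPSHOT_AI_API_KEY=sk-...        # Anthropic or OpenAI key (placeholder)
export SNAPSHOT_AI_PROVIDER=claude       # assumed name: claude | openai
export SNAPSHOT_AI_PROMPT="Report delivery drivers, unknown people, and vehicles in the driveway."
```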
 ## DEC-012 — Discord rich alerts are emitted from detection manager via notifier boundary

 **Decision:** Send Discord alerts from backend detection pipeline through a notifier interface.

AI/PROJECT_CONTEXT.md

Lines changed: 46 additions & 13 deletions

@@ -1,6 +1,6 @@
 # RTSPanda — Project Context

-Last updated: 2026-03-14
+Last updated: 2026-03-20

 ## What This Project Is
@@ -28,23 +28,48 @@ Target users: homelab operators, self-hosters, developers, and small business op
 ---

+## Deployment Modes
+
+RTSPanda has three explicitly separated deployment modes:
+
+| Mode | Hardware | AI | Command |
+|------|----------|----|---------|
+| `pi` | Raspberry Pi / ARM | Snapshot AI (Claude/OpenAI) | `./scripts/pi-up.sh` |
+| `standard` | Server (x86/GPU) | YOLO (ONNX) | `docker compose up --build -d` |
+| `viewer` | Desktop | None | `RTSPANDA_MODE=viewer` |
+
+**Raspberry Pi does NOT support real-time YOLO inference.** Pi uses the Snapshot
+Intelligence Engine: interval frame capture → Claude/OpenAI vision API → events.
+
+Set `RTSPANDA_MODE` to override auto-detection (default: `pi` on ARM, `standard` on x86).
+
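For example, a desktop user can force viewer mode regardless of architecture. The invocation below assumes the compose setup from the table above; adjust to your deployment:

```shell
# Force a deployment mode instead of relying on GOARCH auto-detection.
export RTSPANDA_MODE=viewer   # or: pi | standard
docker compose up -d
```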
 ## High-Level Architecture

+**Standard mode:**
+```
+RTSP Cameras → mediamtx → Go backend → Browser UI
+        ↓
+Detection scheduler (YOLO)
+        ↓
+FastAPI ai-worker (/detect)
+```
+
+**Pi mode:**
 ```
-RTSP Cameras
-        ↓
-mediamtx (RTSP relay/transcode to HLS paths)
-        ↓
-Go backend (API, camera config, detection scheduler, notifications)
-        ↓        ↘
-Browser UI   FastAPI ai-worker (YOLOv8 /detect, /health)
+RTSP Cameras → mediamtx → Go backend → Browser UI
+        ↓
+Snapshot Intelligence Engine
+(interval capture + FFmpeg)
+        ↓
+Claude / OpenAI vision API (HTTPS)
 ```

 Key behavior:

 - Browser does not connect directly to camera RTSP endpoints.
-- Detection is async and queue-based (workers in backend call AI worker).
-- Detection events persist to SQLite with snapshot paths and frame dimensions.
+- YOLO detection is async and queue-based (Standard mode only).
+- Snapshot AI is interval-based with an external API round-trip (Pi mode only).
+- Detection events use an identical schema regardless of source.

 ---

@@ -56,31 +81,39 @@ Key behavior:
 | Frontend | React + Vite + TypeScript |
 | Database | SQLite (`modernc.org/sqlite`) |
 | Streaming | mediamtx |
-| AI Worker | Python FastAPI + Ultralytics YOLOv8 |
+| AI Worker (Standard) | Python FastAPI + ONNX Runtime (YOLOv8) |
+| Snapshot AI (Pi) | Claude / OpenAI vision API (Go HTTP clients) |
 | Media tooling | FFmpeg |
 | Deployment | Docker + docker-compose |

 ---

-## Current Status (v0.0.3)
+## Current Status (v0.0.9)

 Shipped and working:

 - Camera CRUD + stream status + recordings
-- Detection sampler + async worker queue
+- Detection sampler + async worker queue (YOLO — Standard mode)
+- Snapshot Intelligence Engine (Pi mode — Claude/OpenAI vision API)
+- Three deployment modes: `pi`, `standard`, `viewer` (`RTSPANDA_MODE`)
 - YOLO API integration with test detection endpoint
 - Live overlay + detection history UI
 - Discord detection alerts + interval screenshot alerts
 - Manual Discord screenshot/record actions
 - Clip format fallback (`webm`, `webp`, `gif`)
 - Legacy alert-rule APIs preserved for compatibility
+- Performance + observability (v0.0.6): stream status cache, Prometheus metrics, 76% JS bundle reduction
+- Multi-view UI, Operator Dark theme (v0.0.6)
+- Deployment modes + snapshot AI rollout (v0.0.9)

 Not done yet:

 - Retention cleanup for snapshots/events
 - Detection history pagination/filtering
 - Discord retry/backoff and failure queue
 - Auth layer
+- WebRTC streaming (Phase 2)
+- Snapshot AI settings UI (currently env-var only)

 ---

CHANGELOG.md

Lines changed: 14 additions & 0 deletions

@@ -1,5 +1,19 @@
 # Changelog

+## v0.0.9 - 2026-03-20
+
+### Added
+- First-class runtime deployment modes (`pi`, `standard`, `viewer`) with startup auto-detection and mode-gated subsystem startup.
+- Snapshot Intelligence Engine for Pi mode (`backend/internal/snapshotai`) with Claude/OpenAI vision providers and structured event persistence.
+- Shared frame-capture helper (`CaptureFrameToPath`) for external snapshot pipelines.
+- New architecture and decision documentation covering deployment-mode guarantees and Pi AI constraints.
+
+### Changed
+- Backend boot flow now initializes YOLO detection only when the mode allows it, and runs degraded detection handles in non-YOLO modes for API compatibility.
+- `docker-compose.yml` now sets explicit `RTSPANDA_MODE` defaults and includes Snapshot AI environment controls for Pi profile runs.
+- `scripts/pi-up.sh` now hard-blocks unsupported local AI-worker paths on ARM, clarifies supported Pi paths, and improves post-deploy guidance.
+- README was fully rewritten around the three deployment modes and explicit Pi constraints.
+
 ## v0.0.8 - 2026-03-19

 ### Added