Commit d91a572
release: v0.0.9 mode-gated AI and snapshot intelligence
1 parent 793a239 commit d91a572

File tree

13 files changed: +1423 additions, -333 deletions

AI/ARCHITECTURE.md

Lines changed: 200 additions & 133 deletions
Large diffs are not rendered by default.

AI/DECISIONS.md

Lines changed: 95 additions & 0 deletions
@@ -285,6 +285,101 @@ This file records key decisions made during planning. Before changing anything l
 ---

+## DEC-017 — Three deployment modes as a first-class runtime concept
+
+**Decision:** RTSPanda has exactly three deployment modes (`pi`, `standard`, `viewer`), resolved at startup via `RTSPANDA_MODE` or auto-detected from `runtime.GOARCH`. Each mode gates specific subsystems.
+
+**Rationale:**
+- The prior architecture changed behavior implicitly via individual env vars (`AI_MODE`, `AI_WORKER_URL`), which left room for misconfiguration and unclear capability expectations.
+- A single mode enum makes the deployment surface explicit and testable.
+- ARM auto-detection removes the most common Pi misconfiguration (running standard mode on a Pi).
+
+**Mode capabilities:**
+
+| Mode | YOLO detection | Snapshot AI | Notes |
+|------|----------------|-------------|-------|
+| `pi` | ✗ disabled | ✔ optional | Default on ARM |
+| `standard` | ✔ enabled | — | Default on x86 |
+| `viewer` | ✗ disabled | ✗ disabled | No AI of any kind |
+
+**Implications:**
+- `internal/mode/mode.go` is the authority on mode resolution.
+- `main.go` consults `deployMode.AIInferenceAllowed()` and `deployMode.SnapshotAIAllowed()` before starting each subsystem.
+- Forcing `RTSPANDA_MODE=standard` on ARM emits a warning but does not block startup (users may have capable ARM servers).
+
+**Status:** Decided. Do not add new per-feature env vars that circumvent mode gating.
+
+---
+## DEC-018 — Raspberry Pi is a viewer and snapshot AI node, not a YOLO inference host
+
+**Decision:** Raspberry Pi is explicitly unsupported as a real-time YOLO inference host. This constraint is enforced in code, documentation, scripts, and error messages. There is no "experimental AI on Pi" path.
+
+**Rationale:**
+- YOLOv8n ONNX inference requires ~400–600 MB RAM at runtime. A Pi 4 (4 GB) running the full stack (Go backend, mediamtx, 4 RTSP streams, SQLite) is already at its practical memory ceiling.
+- ONNX Runtime on arm64 runs at 3–8 FPS on a Pi 4 CPU — not viable for real-time alerting.
+- Thermal throttling degrades performance further under sustained load.
+- Marketing "experimental AI on Pi" would set false expectations and produce user frustration over limitations that are fundamental, not fixable.
+
+**Enforcement points:**
+- `internal/mode/mode.go` — `AIInferenceAllowed()` returns false for `ModePi`
+- `scripts/pi-up.sh` — `full` mode is blocked with an explicit error message
+- `docker-compose.yml` — `rtspanda-pi` service sets `RTSPANDA_MODE=pi`
+- `README.md` — contains a clear statement in the deployment section header
+
+**Status:** Non-negotiable. Do not add YOLO inference paths for Pi hardware.
+
+---
+## DEC-019 — Snapshot Intelligence Engine as the Pi AI replacement
+
+**Decision:** Pi mode's AI capability is the Snapshot Intelligence Engine: interval-based JPEG capture → external vision API (Claude or OpenAI) → structured events → Discord alerts.
+
+**Rationale:**
+- Cloud vision APIs (GPT-4o-mini, Claude Haiku) handle complex scene understanding that YOLO cannot (e.g., "Amazon driver detected" vs. "person detected").
+- API latency (1–5 seconds) is acceptable for interval-based alerting — this is not a continuous-tracking use case.
+- Cost per alert is negligible for homelab usage (< $0.001 per call for Haiku/mini).
+- No GPU, no model downloads, no Python process — the Pi handles only frame capture (FFmpeg) and HTTP dispatch.
+
+**Output contract:**
+- Emits `detection_events` rows with a schema identical to YOLO events.
+- Discord alerts use the same `NotifyExternalDetectionEvents` path with a `sourceLabel` of "Snapshot AI (Claude)" or "Snapshot AI (OpenAI)".
+- The UI does not need to distinguish snapshot AI events from YOLO events, and cannot.
+
+**Constraints:**
+- Not real-time. One API call per camera per interval tick.
+- Not suitable for sub-second response or continuous tracking.
+- Requires `SNAPSHOT_AI_ENABLED=true` and a valid `SNAPSHOT_AI_API_KEY`.
+- Positioning: "smart alerting via AI interpretation, not real-time detection."
+
+**Status:** Decided. Configuration via env vars; Settings UI integration is a follow-up task.
+
+---
+## DEC-020 — Snapshot AI uses external vision APIs, not a local model
+
+**Decision:** The Snapshot Intelligence Engine sends frames to hosted APIs (Anthropic Claude or OpenAI). There is no local vision model for Pi.
+
+**Rationale:**
+- Local vision models capable of scene description (LLaVA, InternVL, etc.) require 4–8 GB RAM — not viable on a Pi.
+- External APIs are available on demand, require no local resources, and produce higher-quality structured descriptions than YOLO labels.
+- The cloud dependency is explicit and opt-in (requires an API key). Core viewing and recording work without it.
+
+**Provider support:**
+- `claude` — uses `claude-haiku-4-5-20251001` (fast, cheap, capable vision)
+- `openai` — uses `gpt-4o-mini` (comparable cost and capability)
+
+**Implications:**
+- Snapshot AI activates only if `SNAPSHOT_AI_ENABLED=true` and `SNAPSHOT_AI_API_KEY` is set.
+- No API key = no AI on Pi (the core viewer still works fully).
+- The prompt is user-configurable (`SNAPSHOT_AI_PROMPT`) for property-specific detection.
+
+**Status:** Decided. Do not add local model paths for Pi — the hardware cannot support them.
+
+---
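Putting DEC-019 and DEC-020 together, a Pi node might be configured like this. `SNAPSHOT_AI_PROVIDER` is an assumed variable name for provider selection (the decision text names the providers but not the variable); the other variables appear in the decisions above:

```shell
# Minimal Snapshot AI configuration for a Pi node (sketch).
export RTSPANDA_MODE=pi                  # explicit, though ARM hosts auto-detect pi
export SNAPSHOT_AI_ENABLED=true          # off by default; no key means no AI
export SNAPSHOT_AI_API_KEY=sk-...        # Anthropic or OpenAI key (placeholder)
export SNAPSHOT_AI_PROVIDER=claude       # assumed name: claude | openai
export SNAPSHOT_AI_PROMPT="Report delivery drivers, unknown people, and vehicles in the driveway."
```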
 ## DEC-012 — Discord rich alerts are emitted from detection manager via notifier boundary

 **Decision:** Send Discord alerts from backend detection pipeline through a notifier interface.

AI/PROJECT_CONTEXT.md

Lines changed: 46 additions & 13 deletions

@@ -1,6 +1,6 @@
 # RTSPanda — Project Context

-Last updated: 2026-03-14
+Last updated: 2026-03-20

 ## What This Project Is
@@ -28,23 +28,48 @@ Target users: homelab operators, self-hosters, developers, and small business op
 ---

+## Deployment Modes
+
+RTSPanda has three explicitly separated deployment modes:
+
+| Mode | Hardware | AI | Command |
+|------|----------|----|---------|
+| `pi` | Raspberry Pi / ARM | Snapshot AI (Claude/OpenAI) | `./scripts/pi-up.sh` |
+| `standard` | Server (x86/GPU) | YOLO (ONNX) | `docker compose up --build -d` |
+| `viewer` | Desktop | None | `RTSPANDA_MODE=viewer` |
+
+**Raspberry Pi does NOT support real-time YOLO inference.** Pi uses the Snapshot
+Intelligence Engine: interval frame capture → Claude/OpenAI vision API → events.
+
+Set `RTSPANDA_MODE` to override auto-detection (default: `pi` on ARM, `standard` on x86).
+
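For example, a desktop user can force viewer mode regardless of architecture. The invocation below assumes the compose setup from the table above; adjust to your deployment:

```shell
# Force a deployment mode instead of relying on GOARCH auto-detection.
export RTSPANDA_MODE=viewer   # or: pi | standard
docker compose up -d
```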
 ## High-Level Architecture

+**Standard mode:**
+```
+RTSP Cameras → mediamtx → Go backend → Browser UI
+        ↓
+Detection scheduler (YOLO)
+        ↓
+FastAPI ai-worker (/detect)
+```
+
+**Pi mode:**
 ```
-RTSP Cameras
-        ↓
-mediamtx (RTSP relay/transcode to HLS paths)
-        ↓
-Go backend (API, camera config, detection scheduler, notifications)
-        ↓        ↘
-Browser UI   FastAPI ai-worker (YOLOv8 /detect, /health)
+RTSP Cameras → mediamtx → Go backend → Browser UI
+        ↓
+Snapshot Intelligence Engine
+(interval capture + FFmpeg)
+        ↓
+Claude / OpenAI vision API (HTTPS)
 ```

 Key behavior:

 - Browser does not connect directly to camera RTSP endpoints.
-- Detection is async and queue-based (workers in backend call AI worker).
-- Detection events persist to SQLite with snapshot paths and frame dimensions.
+- YOLO detection is async and queue-based (Standard mode only).
+- Snapshot AI is interval-based with an external API round-trip (Pi mode only).
+- Detection events use an identical schema regardless of source.

 ---

@@ -56,31 +81,39 @@ Key behavior:
 | Frontend | React + Vite + TypeScript |
 | Database | SQLite (`modernc.org/sqlite`) |
 | Streaming | mediamtx |
-| AI Worker | Python FastAPI + Ultralytics YOLOv8 |
+| AI Worker (Standard) | Python FastAPI + ONNX Runtime (YOLOv8) |
+| Snapshot AI (Pi) | Claude / OpenAI vision API (Go HTTP clients) |
 | Media tooling | FFmpeg |
 | Deployment | Docker + docker-compose |

 ---

-## Current Status (v0.0.3)
+## Current Status (v0.0.9)

 Shipped and working:

 - Camera CRUD + stream status + recordings
-- Detection sampler + async worker queue
+- Detection sampler + async worker queue (YOLO — Standard mode)
+- Snapshot Intelligence Engine (Pi mode — Claude/OpenAI vision API)
+- Three deployment modes: `pi`, `standard`, `viewer` (`RTSPANDA_MODE`)
 - YOLO API integration with test detection endpoint
 - Live overlay + detection history UI
 - Discord detection alerts + interval screenshot alerts
 - Manual Discord screenshot/record actions
 - Clip format fallback (`webm`, `webp`, `gif`)
 - Legacy alert-rule APIs preserved for compatibility
+- Performance + observability (v0.0.6): stream status cache, Prometheus metrics, 76% JS bundle reduction
+- Multi-view UI, Operator Dark theme (v0.0.6)
+- Deployment modes + snapshot AI rollout (v0.0.9)

 Not done yet:

 - Retention cleanup for snapshots/events
 - Detection history pagination/filtering
 - Discord retry/backoff and failure queue
 - Auth layer
+- WebRTC streaming (Phase 2)
+- Snapshot AI settings UI (currently env-var only)

 ---

CHANGELOG.md

Lines changed: 14 additions & 0 deletions

@@ -1,5 +1,19 @@
 # Changelog

+## v0.0.9 - 2026-03-20
+
+### Added
+- First-class runtime deployment modes (`pi`, `standard`, `viewer`) with startup auto-detection and mode-gated subsystem startup.
+- Snapshot Intelligence Engine for Pi mode (`backend/internal/snapshotai`) with Claude/OpenAI vision providers and structured event persistence.
+- Shared frame-capture helper (`CaptureFrameToPath`) for external snapshot pipelines.
+- New architecture and decision documentation covering deployment-mode guarantees and Pi AI constraints.
+
+### Changed
+- Backend boot flow now initializes YOLO detection only when the mode allows it, and runs degraded detection handles in non-YOLO modes for API compatibility.
+- `docker-compose.yml` now sets explicit `RTSPANDA_MODE` defaults and includes Snapshot AI environment controls for Pi profile runs.
+- `scripts/pi-up.sh` now hard-blocks unsupported local AI-worker paths on ARM, clarifies supported Pi paths, and improves post-deploy guidance.
+- README was fully rewritten around the three deployment modes and explicit Pi constraints.
+
 ## v0.0.8 - 2026-03-19

 ### Added