feat: per-service local LLM token accounting via Token Spy#384

Open
nt1412 wants to merge 6 commits into Light-Heart-Labs:main from nt1412:feat/token-spy-local-monitoring

Conversation

@nt1412
Contributor

@nt1412 nt1412 commented Mar 18, 2026

Summary

Enables optional per-service token accounting for local LLM traffic. Users can route individual services through Token Spy to track token usage, cost, and per-agent metrics at :3005/dashboard. Off by default — no behavior change for existing installs.

How it works

Token Spy runs multiple uvicorn processes inside a single container, each with its own AGENT_NAME and port. All processes share one SQLite database, so the dashboard shows all agents together.

Port  Agent       Service
----  ----------  ---------------------------------
8080  token-spy   Main instance (dashboard)
8081  open-webui  Open WebUI chat
8082  perplexica  Perplexica deep research
8083  openclaw    OpenClaw agents
8084  litellm     LiteLLM gateway
8085  n8n         n8n workflows (via UI credential)
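The multi-process layout above can be sketched in Python. Note the actual launcher in this PR is a shell script (start-monitoring.sh); the port-to-agent mapping below mirrors the table, while the module path `app:app` and the `DB_PATH` variable name are illustrative assumptions, not the PR's real identifiers.

```python
# Port -> agent name mapping from the table above (8085/n8n is configured
# through the n8n UI, included here only for completeness).
AGENT_PORTS = {
    8080: "token-spy",    # main instance (dashboard)
    8081: "open-webui",
    8082: "perplexica",
    8083: "openclaw",
    8084: "litellm",
    8085: "n8n",
}

def build_uvicorn_commands(db_path="/data/tokenspy.db"):
    """Build one uvicorn invocation per agent.

    Every instance gets its own AGENT_NAME and port but points at the
    same SQLite file, so the dashboard sees all agents together.
    """
    commands = []
    for port, agent in AGENT_PORTS.items():
        env = {"AGENT_NAME": agent, "DB_PATH": db_path}
        cmd = ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", str(port)]
        commands.append((env, cmd))
    return commands

for env, cmd in build_uvicorn_commands():
    print(env["AGENT_NAME"], cmd[-1])
```

In the real container these would be launched as background processes from the shell launcher; the point is simply that one container hosts N identically-coded instances distinguished only by environment.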

To enable

Add to .env and restart:

TOKEN_SPY_AUTH_MODE=local
WEBUI_LLM_URL=http://token-spy:8081

Monitoring instances only start when TOKEN_SPY_AUTH_MODE=local.

Verified on fresh official macOS install

Service     Agent       Input Tok   Output Tok  Status
----------  ----------  ----------  ----------  --------
Open WebUI  open-webui  276–385     370–1K      Verified
Perplexica  perplexica  1.1K–1.3K   12–85       Verified
OpenClaw    openclaw    14.9K       1.0K        Verified

Commits

  1. feat: enable optional per-service LLM monitoring via Token Spy — multi-process launcher (start-monitoring.sh), compose env var overrides for Open WebUI/Perplexica/OpenClaw/LiteLLM, AUTH_MODE=local bypass for internal Docker services, macOS overlay, .env.schema.json + .env.example documentation
  2. fix(token-spy): inject stream_options.include_usage — llama-server only includes token counts in streaming responses when explicitly requested. Without this, all streaming requests log 0/0 tokens.
  3. docs(token-spy): add local LLM monitoring quick start guide — README section with env vars, port mapping, per-service instructions, OpenClaw #token= URL workaround
  4. fix(openclaw): log browser-accessible URL with auth token on startup — OpenClaw's Control UI requires #token= in the URL hash for Docker deployments. Now logged to container output on startup: docker logs dream-openclaw | grep "Control UI"
  5. fix(dashboard): use URL hash for OpenClaw sidebar link — sidebar was generating ?token= (query param) but OpenClaw reads #token= (hash fragment). One character fix.
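The injection logic from commit 2 can be sketched as a small transform on the request body before it is forwarded upstream. This is a minimal illustration, not the PR's actual code: `stream_options.include_usage` is the standard OpenAI field the commit names, but the function name and surrounding shape are assumptions.

```python
def inject_usage_tracking(body: dict) -> dict:
    """Ensure streaming chat completions report token usage.

    llama-server (and other OpenAI-compatible backends) only attach a
    `usage` object to streaming responses when the client sends
    stream_options.include_usage = true. Inject it before forwarding,
    but never overwrite a client-supplied stream_options.
    """
    if body.get("stream") and "stream_options" not in body:
        body = {**body, "stream_options": {"include_usage": True}}
    return body

# Streaming request without stream_options gets the flag injected;
# non-streaming requests and client-set stream_options pass through.
print(inject_usage_tracking({"stream": True, "model": "llama"}))
print(inject_usage_tracking({"model": "llama"}))
```

The guard on an existing `stream_options` matches the commit's stated behavior: the proxy only fills the gap, it does not override clients that already chose their own settings.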

Known limitations

  • AUTH_MODE=local disables auth on Token Spy proxy routes (safe: only reachable within Docker network)
  • Routing through Token Spy adds one network hop to LLM requests
  • Restarting a service drops active connections
  • n8n requires manual credential setup in the n8n UI (not automatable via env var)
  • OpenClaw requires #token= URL hash for Docker — logged on startup, fixed in dashboard sidebar
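The `#token=` vs `?token=` distinction above comes down to where each part of a URL is delivered. OpenClaw's Control UI reads the token from `location.hash` on the client side, so a query-string token is simply never looked at. A quick sketch with the standard library (URLs and token value here are illustrative, not real credentials):

```python
from urllib.parse import urlsplit

query_url = "http://localhost:18789/?token=abc123"  # what the sidebar generated
hash_url = "http://localhost:18789/#token=abc123"   # what OpenClaw expects

# The fragment (#...) is never sent to the server; it stays in the
# browser, where client-side JS reads it via location.hash and passes
# the token to the WebSocket handshake. OpenClaw only checks the hash,
# so a ?token= query param is ignored and the handshake fails with
# "device identity required".
print(urlsplit(query_url).query)    # token lands in the query string
print(urlsplit(hash_url).fragment)  # token lands in the fragment
```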

🤖 Generated with Claude Code

nt1412 and others added 6 commits March 18, 2026 09:19
Users can opt-in to routing individual services through Token Spy
for per-service token accounting. All monitoring runs inside ONE
container using multiple uvicorn processes — each with its own
AGENT_NAME and port, sharing one SQLite database. The dashboard
at :3005 shows all agents together.

Port mapping (inside the token-spy container):
  8080 — main instance (dashboard, cloud/agent monitoring)
  8081 — open-webui monitoring (AGENT_NAME=open-webui)
  8082 — perplexica monitoring (AGENT_NAME=perplexica)
  8083 — openclaw monitoring (AGENT_NAME=openclaw)
  8084 — litellm monitoring (AGENT_NAME=litellm)

To enable, add to .env and restart:
  TOKEN_SPY_AUTH_MODE=local
  WEBUI_LLM_URL=http://token-spy:8081

Off by default. Monitoring instances only start when
TOKEN_SPY_AUTH_MODE=local. No behavior change for existing installs.

Known weaknesses:
- AUTH_MODE=local disables auth on proxy routes (safe: Docker-internal)
- Restarting a service drops active connections
- Routing through Token Spy adds one network hop to LLM requests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…uests

llama-server (and other OpenAI-compatible endpoints) only include
token usage in streaming responses when stream_options.include_usage
is explicitly set to true. Without this, Token Spy logs 0 input/output
tokens for all streaming requests — making per-service token accounting
useless for OpenClaw and other streaming clients.

Injects {"include_usage": true} into the request body before forwarding,
only when stream_options is not already set by the client. This is a
standard OpenAI field, not a custom extension.

Verified: OpenClaw streaming request now shows 14.9K input, 1.0K output
(was 0/0 before this fix).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents how to enable per-service token accounting: env vars,
port mapping, restart commands, and the OpenClaw #token= URL fix
for Docker deployments.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OpenClaw's Control UI requires the gateway token in the URL hash
(#token=...) for Docker deployments — without it, the browser gets
"device identity required" on every WebSocket connection. The token
is auto-generated but never shown to the user.

Now inject-token.js logs the full URL to container logs on startup:
  docker logs dream-openclaw | grep "Control UI"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OpenClaw's Control UI reads the gateway token from the URL hash
fragment, not query params. The sidebar was generating ?token=
which doesn't work — the WebSocket handshake fails with "device
identity required" because the token isn't passed to the WS connect.

One character fix: ? → #

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l-monitoring

# Conflicts:
#	dream-server/installers/macos/docker-compose.macos.yml
Collaborator

@Lightheartdevs Lightheartdevs left a comment


Approve — Well-designed token monitoring feature.

Architecture

Multi-process uvicorn inside a single Token Spy container, each with its own AGENT_NAME and port (8080-8085). All processes share one SQLite database, so the dashboard shows all agents together. Off by default — only activates when TOKEN_SPY_AUTH_MODE=local.

Changes reviewed

  • start-monitoring.sh: Launches per-service monitoring instances
  • Compose env overrides for Open WebUI, Perplexica, OpenClaw, LiteLLM using ${SERVICE_LLM_URL:-${LLM_API_URL:-default}} pattern — clean fallback chain
  • inject-token.js: OpenClaw provider baseUrl override for monitoring + browser URL logging with #token=
  • Sidebar fix: ?token= → #token= (OpenClaw reads the hash fragment, not the query param)
  • stream_options.include_usage injection for llama-server streaming responses
  • .env.example and .env.schema.json documentation

Verified claims

  • Off by default (no behavior change for existing installs)
  • Per-service agent names in dashboard
  • Tested on fresh macOS install per PR description

Minor notes

  • AUTH_MODE=local disables auth on proxy routes — acceptable since only reachable within Docker network
  • Apple Silicon overlay correctly included

Well-documented, well-tested, clean implementation. LGTM.

