GitHub - iriseye931-ai/iriseye: Local AI mesh — multiple agents, shared persistent memory, real-time dashboard, browser automation. Runs on your hardware.

A fully local, multi-agent AI mesh running 24/7 on Apple Silicon.
Specialized agents. Shared persistent memory. Real-time monitoring.
Local-first routing. Premium reasoning by exception. Real observability.
Premium pool: Claude Code plus Codex when needed. Local stack handles the volume.

Most local AI setups are either one smart premium model or one isolated local model. iriseye is both: a local-first mesh where Hermes absorbs the volume, Mission Control tells the truth about the system, and premium reasoning is reserved for the hard edge cases.

Active development. We're shipping updates regularly. PRs and issues welcome.

What it is

Most local AI setups are single-agent and stateless. iriseye is a mesh:

Atlas is a role, not a model — the lead path can be served by Claude Code or Codex. Premium reasoning is reserved for planning, ambiguous debugging, tricky refactors, and final review.
Hermes (@NousResearch) is the workhorse — cron, summaries, routing, memory consolidation, repo scans, and routine execution stay local by default.
Hermes runs as a profile stack, not one monolith:
- workhorse — Qwen3.5-35B-A3B-4bit
- sidecar — Qwen2.5-7B-Instruct-4bit
- code-specialist — Qwen2.5-Coder-32B-Instruct-4bit
- reasoning-specialist — DeepSeek-R1-Distill-Qwen-32B-4bit
Persistent shared memory via OpenViking — every agent reads and writes to the same vector store.
Mission Control is the operational source of truth — health, routing, heartbeats, premium availability, cron freshness, local profile state, permission decisions, and normalized agent presence are visible live.
AI Maestro is the registry/orchestration layer — useful for addresses and AMP routing, but not treated as the primary liveness authority.
Smart token management — routine mesh messages never touch the premium pool. Most work stays local.
Self-building knowledge graph — sessions, logs, and memories get indexed nightly into GraphRAG.
250+ prompt patterns via fabric wired to your local LLM — summarize, extract, analyze, with zero API calls.
Real-time dashboard showing every agent's status, tasks, AMP inbox, and memory activity live (see the separate mission-control-dashboard repo).
Full observability — Netdata for system metrics, Glance for service health, Screenpipe for visual history
Auto-start on boot — all services come up on login, restart on crash
30-minute memory snapshots — git-based backup so nothing is lost
Config safety net — every settings/hook change reviewed by local MLX via config-review.sh. Issues routed to Hermes via AMP. No Claude API tokens burned on routine checks.

The cost model

Layer	Cost	What it handles
Premium pool (Claude Code / Codex)	scarce	Lead-agent judgment, planning, review, high-stakes work
MLX local inference	$0	Hermes workhorse, sidecar, coding specialist, reasoning specialist
Hermes (NousResearch)	$0	Local execution, cron, tooling, summaries, routing, file/web tasks
OpenViking memory	$0	Persistent memory across all agents

The important constraint is not "use the smartest model first." It is "use the cheapest model that can do the job correctly, and reserve premium reasoning for the tasks that actually justify it."

In practice:

Hermes should handle most mesh traffic.
Codex and Claude Code should be used as scarce premium paths.
If one premium path is capped or unavailable, the other takes over the Atlas role.
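
As a rough illustration of the failover rule above, here is a minimal sketch of picking whichever premium path is currently usable for the Atlas role. The `PremiumPath` type, the `available` flag, and the preference order are assumptions made for the example; they are not taken from the repo's code.

```python
from dataclasses import dataclass


@dataclass
class PremiumPath:
    name: str
    available: bool  # False when the path is capped or unreachable (hypothetical flag)


def pick_atlas(paths):
    """Return the name of the first available premium path, or None if all are down."""
    for path in paths:
        if path.available:
            return path.name
    return None


# Example: Codex is capped, so Claude Code takes over the Atlas role.
paths = [
    PremiumPath("codex", available=False),
    PremiumPath("claude-code", available=True),
]
print(pick_atlas(paths))
```

When `pick_atlas` returns `None`, a caller would presumably keep the task on the local stack or queue it until a premium path comes back.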

Everything runs locally. Your data stays on your machine.

Architecture

The mesh runs three protocols and one control-plane policy.

MCP — tools the model calls while thinking. Memory lookups, file ops, agent delegation — synchronous and inline. The model gets the result mid-thought.

CLI — direct subprocess call to an agent. Blocking, immediate, zero infrastructure. hermes chat -q — your script calls an agent like any other command.

AMP — async message passing between agents via file-based inbox, routed by AI Maestro. Fire and forget. Agents talk to each other without you in the middle.
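
To make the file-based inbox idea concrete, the sketch below shows one way fire-and-forget delivery can work with nothing but the filesystem. The directory layout (`<root>/<agent>/inbox/*.json`) and the message fields are hypothetical; the real AMP format used by AI Maestro may differ.

```python
import json
import os
import tempfile
import time
import uuid
from pathlib import Path


def send(root: Path, to: str, sender: str, body: str) -> Path:
    """Drop one message file into the recipient's inbox and return its path."""
    inbox = root / to / "inbox"
    inbox.mkdir(parents=True, exist_ok=True)
    msg = {
        "id": uuid.uuid4().hex,
        "from": sender,
        "to": to,
        "body": body,
        "sent_at": time.time(),
    }
    tmp = inbox / f".{msg['id']}.tmp"
    final = inbox / f"{msg['id']}.json"
    tmp.write_text(json.dumps(msg))
    os.replace(tmp, final)  # atomic rename, so readers never see a half-written file
    return final


def drain(root: Path, agent: str) -> list:
    """Read and delete every pending message for an agent, oldest first."""
    inbox = root / agent / "inbox"
    messages = []
    for path in sorted(inbox.glob("*.json"), key=lambda p: p.stat().st_mtime):
        messages.append(json.loads(path.read_text()))
        path.unlink()
    return messages


# Example: Atlas queues work for Hermes and moves on without waiting.
root = Path(tempfile.mkdtemp())
send(root, "hermes", "atlas", "summarize the last hour of logs")
print(drain(root, "hermes")[0]["body"])
```

Writing to a temporary name and renaming is the design choice that matters here: the sender never blocks on the receiver, and the receiver never parses a partial file.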

Routing policy — local first, premium by exception:

routine -> hermes
specialized -> iriseye
premium -> atlas, fallback claude
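
The routing table above can be read as an ordered list of targets per task class. The following is a minimal sketch under that reading; the `route` function, the `online` set, and the choice to treat unknown classes as routine are assumptions for illustration, not Mission Control's actual implementation.

```python
# Ordered targets per task class, mirroring the policy lines above.
ROUTES = {
    "routine": ["hermes"],
    "specialized": ["iriseye"],
    "premium": ["atlas", "claude"],  # second entry is the fallback
}


def route(task_class, online):
    """Return the first online target for a task class.

    Unknown classes fall back to the routine (local) route, which is an
    assumption in keeping with the local-first policy.
    """
    candidates = ROUTES.get(task_class, ROUTES["routine"])
    for agent in candidates:
        if agent in online:
            return agent
    raise LookupError(f"no online target for task class {task_class!r}")


# Example: atlas is unavailable, so premium work falls back to claude.
print(route("premium", online={"hermes", "claude"}))
```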

Recent control-plane improvements:

Permission audit trail — Hermes and Mission Control can emit allow/deny/ask/bypass events into one shared audit stream.
Presence-aware mesh state — the dashboard distinguishes agents that are online, merely registered, and truly offline.
Operator-visible decisions — routing, profile starts/stops, and control-plane overrides now leave an inspectable paper trail.

They layer like this:

┌────────────────────────────────────────────────────────────┐
│                 Atlas (premium lead role)                 │
│            served by Codex or Claude Code                 │
│                                                            │
│   MCP (inline tools)          AMP / CLI (delegation)      │
│   ├── memory_recall :2033     ├── amp-send → hermes       │
│   ├── memory_store  :2033     ├── amp-send → iriseye      │
│   └── docs / file tools       └── direct CLI if blocking  │
└────────────────────────────────────────────────────────────┘
         │                                 │
         ▼                                 ▼
  OpenViking :1933                 AI Maestro :23000
  shared memory                    registry + AMP routing
         │                                 │
         └──────────────┬──────────────────┘
                        ▼
               Mission Control :8000 / :3000
               operational truth + routing
                        │
          ┌─────────────┴────────────────────────┐
          ▼                                      ▼
    Hermes local stack                         iriseye
    workhorse / sidecar /                      specialized file/web
    code-specialist / reasoning-specialist

Hermes profile stack

workhorse handles default local execution
sidecar handles summaries, routing, compression, and cheap helper work
code-specialist is loaded on demand for implementation-heavy tasks
reasoning-specialist is loaded on demand for harder local analysis before premium escalation
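
One way to picture the profile stack is as a lookup from task kind to profile to local endpoint. In the sketch below the ports come from the Services & Ports table in this README; the task-kind names and the `/v1` base path are assumptions for the example.

```python
# Ports as listed under Services & Ports.
PROFILE_PORTS = {
    "workhorse": 8081,
    "sidecar": 8083,
    "code-specialist": 8084,
    "reasoning-specialist": 8085,
}

# Hypothetical task kinds, chosen to match the profile descriptions above.
TASK_PROFILE = {
    "summary": "sidecar",
    "routing": "sidecar",
    "compression": "sidecar",
    "implementation": "code-specialist",
    "analysis": "reasoning-specialist",
}


def endpoint_for(task_kind):
    """Return (profile, base_url); anything unmapped goes to the workhorse."""
    profile = TASK_PROFILE.get(task_kind, "workhorse")
    return profile, f"http://127.0.0.1:{PROFILE_PORTS[profile]}/v1"


print(endpoint_for("implementation"))
print(endpoint_for("cron"))
```

Since the specialists are loaded on demand, a real dispatcher would also need to start the profile (or confirm it is up) before sending the request; that step is left out here.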

Full architecture writeup: docs/mesh-architecture.md

┌──────────────────────────────────────────────────────────────────┐
│                          Your Machine                            │
│                                                                  │
│  ┌───────────────┐   ┌──────────────────────────────────────┐   │
│  │  Claude Code  │   │    Hermes (NousResearch)             │   │
│  │  (Atlas)      │   │  long-running tasks · web research   │   │
│  │  lead agent   │   │  file ops · tool chaining · cron     │   │
│  └──────┬────────┘   └──────────────────────────────────────┘   │
│         │                             │                          │
│         └─────────────────────────────┘                          │
│                          │                                       │
│                ┌─────────▼──────────┐                            │
│                │    OpenViking       │                            │
│                │  shared memory      │                            │
│                │  localhost:1933     │                            │
│                └─────────┬──────────┘                            │
│                          │                                       │
│  ┌───────────────────────▼────────────────────────────────────┐  │
│  │               Mission Control Dashboard                    │  │
│  │         real-time status · memory · AMP inbox · cron jobs  │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────┐  ┌─────────────┐  ┌──────────┐  ┌───────────┐  │
│  │  FastAPI   │  │  Page Agent │  │ Netdata  │  │  Glance   │  │
│  │  :8000     │  │   :38401    │  │  :19999  │  │  :8080    │  │
│  └────────────┘  └─────────────┘  └──────────┘  └───────────┘  │
│                                                                  │
│  ┌───────────────────────────────┐  ┌──────────────────────┐   │
│  │  MLX Server (mlx_lm) :8081   │  │  Ollama :11434       │   │
│  │  Qwen3.5-35B-A3B — chat/LLM  │  │  nomic-embed-text    │   │
│  │  Apple Silicon MoE 4-bit     │  │  embeddings only     │   │
│  └───────────────────────────────┘  └──────────────────────┘   │
└──────────────────────────────────────────────────────────────────┘

What's in this repo

iriseye/
├── agents/
│   ├── pydantic_agent.py             # pydantic-ai agent wired to local LLM
│   └── swarm_mesh.py                 # 3-agent Swarm mesh (Atlas / Researcher / Coder)
├── backend/
│   ├── main.py                       # FastAPI server — polls mesh, broadcasts via WebSocket
│   ├── requirements.txt
│   └── .env.example
├── config/
│   ├── claude-settings.json          # Claude Code settings (hooks, autoDreamEnabled: false)
│   └── ov.conf.example               # OpenViking config template
├── dashboard/
│   └── mission-control.html          # legacy single-file dashboard snapshot
├── docs/
│   ├── setup.md                      # Full step-by-step setup guide
│   └── mesh-architecture.md          # MCP vs CLI vs AMP — how the protocols layer
├── hooks/
│   ├── auto-store-worker.sh          # Claude Code Stop hook — auto-stores session summaries to memory
│   ├── config-review.sh              # PostToolUse hook — reviews config changes via local MLX, alerts Hermes on issues
│   └── subconscious-worker.sh        # Session summarization via MLX → OpenViking
├── launchagents/                     # macOS auto-start templates (edit paths, then load)
│   ├── local.mlx-server.plist        # MLX LLM server (Apple Silicon)
│   ├── local.openviking-server.plist
│   ├── local.openviking-mcp.plist
│   ├── local.mission-control-backend.plist
│   ├── local.graphrag-producer.plist
│   ├── local.memory-backup.plist
│   └── local.amp-hermes-bridge.plist # AMP bridge daemon for Hermes
├── mcp/
│   ├── openviking-mcp-server.py      # memory_recall / memory_store / memory_forget tools
│   └── requirements.txt
└── scripts/
    ├── mlx-server                    # MLX LLM server startup script (4GB KV cache, concurrency 2)
    ├── amp-hermes-bridge.sh          # AMP → Hermes bridge (parallel workers, session resumption)
    ├── start-mesh.sh                 # Start + health-check all services
    ├── backup-memories.sh            # Git-commit memory snapshots every 30 min
    ├── rebuild-index.py              # Rebuild vector index after crash or config change
    ├── graphrag-producer.py          # Collect sessions/logs/memories for nightly indexing
    └── llm-proxy.py                  # Rate-limiting proxy — prevents LLM queue flooding

Quick Start

See docs/setup.md for the full walkthrough.

Short version:

# 1. Install OpenViking (the memory store)
python3 -m venv ~/.openviking/venv
source ~/.openviking/venv/bin/activate
pip install openviking

cp config/ov.conf.example ~/.openviking/ov.conf
# Edit: set API key, LLM endpoint, embedding dimension

# 2. Start OpenViking
OPENVIKING_CONFIG_FILE=~/.openviking/ov.conf \
  ~/.openviking/venv/bin/python -c "
import uvicorn; from openviking.server import create_app
uvicorn.run(create_app(), host='0.0.0.0', port=1933)
" &

# 3. Start the memory MCP server (gives Atlas memory tools)
OV_API_KEY=your-key python mcp/openviking-mcp-server.py &
claude mcp add --transport http --scope user openviking-memory http://127.0.0.1:2033/mcp

# 4. Install Hermes (NousResearch)
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
# Configure to use your local MLX endpoint

# 5. Start the AMP bridge (routes messages to Hermes)
cp scripts/amp-hermes-bridge.sh ~/.local/bin/amp-hermes-bridge.sh
chmod +x ~/.local/bin/amp-hermes-bridge.sh
# Load launchagents/local.amp-hermes-bridge.plist

# 6. Start the dashboard backend
cd backend && pip install -r requirements.txt
cp .env.example .env && ./run_mission_control.sh

# 7. Run the live Mission Control frontend
# Use the dedicated mission-control-dashboard repo for the current UI on :3000
open http://127.0.0.1:3000

# 8. Check everything (run from the repo root, not from backend/)
./scripts/start-mesh.sh

The dashboard/mission-control.html file in this repo is retained as a lightweight legacy snapshot. The live operational UI is the separate mission-control-dashboard repo.

Agent ecosystem

Atlas — the lead role in the mesh. In practice this can be served by Claude Code or Codex depending on availability. Atlas handles planning, architecture, high-stakes debugging, and final review. It gets full memory tools via MCP.

Hermes (@NousResearch) — handles long-running tasks, cron-scheduled automations, web research, file ops, and most routine execution. Hermes is backed by a local profile stack:

workhorse — Qwen3.5-35B-A3B-4bit
sidecar — Qwen2.5-7B-Instruct-4bit
code-specialist — Qwen2.5-Coder-32B-Instruct-4bit
reasoning-specialist — DeepSeek-R1-Distill-Qwen-32B-4bit

That means Hermes is not just "the local model." It is the local execution layer.

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

Key commands:

hermes chat -q "your task"              # one-shot task
hermes chat -c "session-name" -q "..."  # resume named session (warm startup)
hermes doctor --fix                     # health check + auto-fix
hermes claw migrate                     # migrate from OpenClaw
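
Because these are ordinary commands, a script can call Hermes as a blocking subprocess. The sketch below wraps the `hermes chat -q` and `-c` forms shown above; the helper names and the timeout value are assumptions, and it presumes `hermes` is on your `PATH`.

```python
import shlex
import subprocess


def hermes_argv(task, session=None):
    """Build the argument list for a one-shot or resumed Hermes task."""
    argv = ["hermes", "chat"]
    if session:
        argv += ["-c", session]  # resume a named session
    argv += ["-q", task]
    return argv


def run_hermes(task, session=None, timeout=600):
    """Run Hermes and return its stdout; raises if the command fails or times out."""
    result = subprocess.run(
        hermes_argv(task, session),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


# Show the command that would be executed, without running it.
print(shlex.join(hermes_argv("scan the repo for TODOs", session="nightly")))
```

Passing the task as a list element rather than a shell string avoids quoting problems when the task text contains quotes or spaces.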

Page Agent — browser automation. Agents can navigate pages, extract data, fill forms.

claude mcp add --scope user page-agent \
  -e LLM_BASE_URL=http://YOUR_LLM_HOST:PORT/v1 \
  -e LLM_MODEL_NAME=your-model \
  -e LLM_API_KEY=local \
  -- npx @page-agent/mcp
# Then install the Chrome extension

The tool stack

Beyond the core agents, we run these on top:

Tool	What it does	Install
fabric	250+ prompt patterns (summarize, extract wisdom, analyze, etc.)	`brew install fabric-ai`
pydantic-ai	Type-safe agent framework	`pip install pydantic-ai`
Swarm	Multi-agent orchestration with handoffs	`pip install Swarm`
mem0	In-code agent memory layer	`pip install mem0ai`
browser-use	Python browser automation for agents	`pip install browser-use`
screenpipe	Screen recording + OCR + searchable history	`brew install screenpipe`
GraphRAG	Knowledge graph from your documents	`pip install graphrag`
netdata	Real-time system metrics	`brew install netdata`
glance	Self-hosted status dashboard	`brew install glance`
context-hub	Curated API docs for agents	`pip install context-hub`

All tools are configured to use your local LLM — no external API calls.

The data pipeline

The mesh re-indexes its own history every night, automatically:

Atlas sessions ───┐
Hermes logs ──────┤──► graphrag-producer.py (2am) ──► ~/.graphrag/workspace/input/
OpenViking memory ┤                                           │
Shell history ────┘                                          ▼
                                                    graphrag index (via llm-proxy)
                                                           │
                                                           ▼
                                              Knowledge graph you can query

Every session, every shell command, and every stored memory is captured nightly and indexed into a single graph you can query across all of it.

Run the producer manually any time:

python3 scripts/graphrag-producer.py

Run GraphRAG indexing overnight (use the proxy to avoid flooding your LLM):

# Terminal 1 — rate-limiting proxy (4 req/min)
python3 scripts/llm-proxy.py --port 6699 --rpm 4

# Terminal 2 — indexer
cd ~/.graphrag/workspace
GRAPHRAG_API_KEY=local GRAPHRAG_API_BASE=http://localhost:6699/v1 \
  ~/.graphrag/venv/bin/graphrag index --root .
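
The point of the proxy is pacing: at 4 requests per minute, each request gets a 15-second slot. The sketch below illustrates that idea only; it is not the code in `scripts/llm-proxy.py`, and the `Pacer` class with its injectable `clock` and `sleep` is hypothetical.

```python
import time


class Pacer:
    """Space calls evenly so that at most `rpm` of them start per minute."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / rpm
        self.clock = clock
        self.sleep = sleep
        self.next_slot = None  # earliest time the next call may start

    def wait(self):
        """Block until the next free slot and return the seconds waited."""
        now = self.clock()
        if self.next_slot is None or now >= self.next_slot:
            delay, start = 0.0, now
        else:
            delay = self.next_slot - now
            self.sleep(delay)
            start = self.next_slot
        self.next_slot = start + self.interval
        return delay


print(Pacer(rpm=4).interval)  # 15.0 seconds between requests
```

A proxy built this way would call `wait()` before forwarding each request upstream, so a burst from the indexer turns into a steady trickle at the LLM.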

Services & Ports

Service	Port	Purpose
OpenViking	1933	Vector memory server
Memory MCP	2033	MCP tools for Atlas (memory_recall, memory_store, memory_forget)
Hermes Gateway	18789	Hermes messaging gateway (Telegram, Discord)
AI Maestro	23000	Multi-agent orchestration (optional)
Mission Control backend	8000	Dashboard WebSocket + REST API
Mission Control frontend	3000	Dashboard UI
Page Agent hub	38401	Chrome extension bridge (optional)
Screenpipe	3030	Screen history API
Netdata	19999	System metrics
MLX Server	8081	Local LLM — Qwen3.5-35B-A3B-4bit via mlx_lm (chat + completions)
Hermes sidecar	8083	Qwen2.5-7B-Instruct-4bit
Hermes code-specialist	8084	Qwen2.5-Coder-32B-Instruct-4bit
Hermes reasoning-specialist	8085	DeepSeek-R1-Distill-Qwen-32B-4bit
Glance	8080	Service health dashboard
Ollama	11434	Embeddings only — nomic-embed-text
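
For a quick look at which of these ports are accepting connections, a plain TCP connect is enough. The sketch below checks a handful of the services from the table; it only tells you something is listening, not that the service is healthy, and it is separate from what `scripts/start-mesh.sh` does.

```python
import socket

# A subset of the table above; extend as needed.
SERVICES = {
    "OpenViking": 1933,
    "Memory MCP": 2033,
    "Mission Control backend": 8000,
    "MLX Server": 8081,
    "Ollama": 11434,
}


def is_listening(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_all(services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(port) for name, port in services.items()}


if __name__ == "__main__":
    for name, up in check_all().items():
        print(f"{name:<26} {'up' if up else 'down'}")
```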

If memory search breaks

# Wipe broken index
rm -rf ~/.openviking/data/vectordb

# Restart OpenViking, then rebuild
OV_API_KEY=your-key OV_ACCOUNT=your-account OV_USER=$(whoami) \
  python scripts/rebuild-index.py

Roadmap

Linux systemd unit files
Docker compose for full mesh
Web UI for memory browser / management
Multi-machine mesh (agents on different hosts, shared memory store)
GraphRAG query interface in the dashboard
Agent-to-agent messaging with signatures (AMP protocol)
Hermes (NousResearch) as primary local agent — full tool-call support on local models
Config safety net — MLX-reviewed hook/settings changes, zero Claude API cost
Dashboard alerts + notification routing

Contributing

Issues and PRs welcome. If you build something with this, open a discussion — we want to see it.

Built with Hermes by @NousResearch — the agent that actually handles tool calling correctly on local models.

License

MIT
