Commit c063d3e
Fix build prerequisites, tool count, and link PRIVACY.md
1 parent 675ee19
2 files changed: 9 additions, 8 deletions

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ cd shadow
 ./scripts/build-rust.sh
 
 # Download CLIP models (~190 MB, one-time)
-pip3 install huggingface_hub
+pip3 install huggingface_hub open_clip_torch
 python3 scripts/provision-clip-models.py
 
 # Generate Xcode project and build
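The added prerequisite can be sanity-checked before running the provisioning script; a minimal sketch, assuming the `open_clip_torch` package exposes the `open_clip` import name (its usual module name):

```python
import importlib.util

def missing_deps(modules=("huggingface_hub", "open_clip")):
    """Return the required modules that are not importable in this environment."""
    return [m for m in modules if importlib.util.find_spec(m) is None]

# Warn up front rather than letting provision-clip-models.py fail mid-download
missing = missing_deps()
if missing:
    print("missing, run: pip3 install " + " ".join(missing))
```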

README.md

Lines changed: 8 additions & 7 deletions
@@ -17,7 +17,7 @@
 
 Shadow is a personal intelligence engine for macOS. It captures every signal your computer produces while you work, turns raw behavior into structured understanding, and acts on what it learns. Screen, audio, keystrokes, the full accessibility tree, clipboard, files, git, terminal, search queries, notifications, calendar, system context. All of it, synchronized by timestamp, stored locally, processed on-device. Crash-proof recording that loses at most ten seconds on a force quit. Automatic sleep/wake recovery with display hot-plug detection. Under 3% CPU average. Under 600 MB per day.
 
-This is not a screen recorder. Shadow generates episodes from your work, runs a continuous heartbeat that pushes proactive observations, operates vision models and LLMs entirely on Apple Silicon, fine-tunes its own grounding models on your behavior, replays learned procedures through a safety-gated computer-use engine, and exposes a 25-tool agent runtime with streaming UI. It captures how you work, learns why, and starts helping before you ask.
+This is not a screen recorder. Shadow generates episodes from your work, runs a continuous heartbeat that pushes proactive observations, operates vision models and LLMs entirely on Apple Silicon, fine-tunes its own grounding models on your behavior, replays learned procedures through a safety-gated computer-use engine, and exposes a 26-tool agent runtime with streaming UI. It captures how you work, learns why, and starts helping before you ask.
 
 We are open-sourcing Shadow because the capture layer is the hardest problem to solve and we have solved it. The next layer, memory graphs, MCP servers, personal models, agents trained on real human behavior, belongs to the community. Build on top of what is here.
 
@@ -69,7 +69,7 @@ Every existing tool looks at one or two modalities. Shadow captures fourteen.
 | Proactive intelligence | Heartbeat with push suggestions | No | No | No | No |
 | On-device LLM | Qwen 7B/32B via MLX | No | No | No | Cloud API |
 | Vision grounding | ShowUI-2B + LoRA fine-tuning | No | No | No | Screenshot-only |
-| Computer-use agent | 25-tool agent + Mimicry system | No | No | No | Yes (cloud) |
+| Computer-use agent | 26-tool agent + Mimicry system | No | No | No | Yes (cloud) |
 | Safety gates | Pre-action checks + undo manager | No | No | No | No |
 | Meeting intelligence | Whisper + summaries + speaker attribution | No | No | No | No |
 | Learned procedures | Workflow replay from observation | No | No | No | No |
@@ -86,7 +86,7 @@ Shadow records your Mac like a studio records a band. Each signal gets its own t
 
 **Understand.** Episode generation detects activity boundaries and produces structured work units with LLM summaries. A proactive heartbeat runs two-tier analysis and pushes suggestions without being asked. Semantic search combines CLIP vector embeddings (search by meaning), Tantivy full-text search, and timeline queries. Meeting intelligence transcribes, summarizes, and attributes speakers. Pattern detection over weeks reflects how you actually work: when your focus happens, how you communicate, what you consistently underestimate. A two-tier local LLM system (7B for fast tasks, 32B for deep reasoning) runs entirely on Apple Silicon with KV-cache session reuse that drops first-token latency from 14 seconds to under 1 second across multi-turn conversations.
 
-**Act.** A 25-tool agent runtime with streaming UI handles search, context retrieval, visual analysis, AX-based actions, and memory operations. The Mimicry system watches how you perform tasks, synthesizes replayable procedures, and executes them through a safety-gated pipeline with pre-action checks, post-action verification, and undo support. A grounding oracle cascades through four strategies: AX exact match, AX fuzzy match, on-device VLM (ShowUI-2B), and cloud vision. 70-80% of interactions are resolved by the free, instant AX path. Built-in LoRA training generates grounding data from your actual clicks and fine-tunes the vision model to your specific apps and workflows. When the agent takes actions, those events are tagged and excluded from recording. Shadow learns from you, not from itself.
+**Act.** A 26-tool agent runtime with streaming UI handles search, context retrieval, visual analysis, AX-based actions, and memory operations. The Mimicry system watches how you perform tasks, synthesizes replayable procedures, and executes them through a safety-gated pipeline with pre-action checks, post-action verification, and undo support. A grounding oracle cascades through four strategies: AX exact match, AX fuzzy match, on-device VLM (ShowUI-2B), and cloud vision. 70-80% of interactions are resolved by the free, instant AX path. Built-in LoRA training generates grounding data from your actual clicks and fine-tunes the vision model to your specific apps and workflows. When the agent takes actions, those events are tagged and excluded from recording. Shadow learns from you, not from itself.
 
@@ -142,7 +142,7 @@ Shadow (macOS menu bar app, Swift + Rust)
 | |-- nomic-embed text embeddings
 | |-- Episode engine boundary detection + summarization
 | |-- Proactive heartbeat fast 10min / deep 30min, push suggestions
-| |-- Agent runtime 25 tools, streaming UI, task decomposition
+| |-- Agent runtime 26 tools, streaming UI, task decomposition
 | +-- Mimicry procedure learning, safety gates, undo support
 |
 +-- UI (SwiftUI, native macOS)
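The two-tier heartbeat cadence shown in the tree (fast every 10 minutes, deep every 30) amounts to a scheduling check; a hypothetical helper, not the actual scheduler:

```python
def due_tiers(minute, fast_every=10, deep_every=30):
    """Return which heartbeat tiers fire at a given minute mark."""
    return {
        "fast": minute % fast_every == 0,
        "deep": minute % deep_every == 0,
    }
```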
@@ -162,7 +162,8 @@ cd shadow
 # Build Rust storage engine and generate Swift bindings
 ./scripts/build-rust.sh
 
-# Download CLIP models (~190 MB)
+# Install Python dependencies and download CLIP models (~190 MB)
+pip3 install huggingface_hub open_clip_torch
 python3 scripts/provision-clip-models.py
 
 # Generate Xcode project and build
@@ -173,7 +174,7 @@ xcodebuild -project Shadow/Shadow.xcodeproj -scheme Shadow -configuration Debug
 open ~/Library/Developer/Xcode/DerivedData/Shadow-*/Build/Products/Debug/Shadow.app
 ```
 
-Requires Apple Silicon (M1 or later), macOS 14+, Xcode 16.4+, Rust via rustup, XcodeGen (`brew install xcodegen`). Grant permissions when prompted. After granting Screen Recording, quit and relaunch.
+Requires Apple Silicon (M1 or later), macOS 14+, Xcode 16.4+, Rust via rustup, Python 3.8+, XcodeGen (`brew install xcodegen`). The Qwen 32B model requires 48 GB+ RAM. Grant permissions when prompted. After granting Screen Recording, quit and relaunch.
 
 ## Privacy
 
@@ -183,7 +184,7 @@ Passwords and sensitive fields are detected at the CGEventTap level and excluded
 
 Cloud LLM features (Claude, GPT) are opt-in with your own API key. When disabled, all intelligence runs locally via MLX on Apple Silicon.
 
-This is open source. You do not need to trust a privacy policy. Read the code.
+This is open source. You do not need to trust a privacy policy. Read the code. See [PRIVACY.md](PRIVACY.md) for the full data handling details.
 
 ## Contributing
 