
docs: tutorial foundation — architecture, browse engine, snapshots, commands #177

Open
johnxie wants to merge 1 commit into garrytan:main from johnxie:docs/tutorial-foundation

johnxie commented Mar 18, 2026

Summary

Adds the first half of a comprehensive tutorial for gstack — five interconnected chapters that take a newcomer from "what is this?" to understanding every browse command. Each chapter builds on the previous, uses real code examples from the codebase, and includes Mermaid diagrams for visual learners.

This PR covers the infrastructure layer — what gstack is, how the browser works, how snapshots see pages, and the full command vocabulary. A follow-up PR will cover the workflow layer (skills, templates, planning, shipping, QA, tests).

Motivation

| Signal | Evidence |
| --- | --- |
| Onboarding gap | No tutorial content exists beyond README and CONTRIBUTING.md |
| Community ask | Issue #63 ("Is this repo maintained?") suggests newcomers struggle to orient |
| Setup confusion | Issue #147 ("Bun command not found during ./setup") — docs could prevent this |
| Precedent | PR #146 (docs restructure) was merged — docs contributions are valued |
| 22K stars | A project this popular deserves documentation matching its quality |

What's included

| File | Lines | Topic | Key Visuals |
| --- | --- | --- | --- |
| docs/index.md | 65 | Tutorial entry point | Full architecture Mermaid flowchart |
| docs/01_architecture.md | 208 | Big picture: 3 layers, virtual team, design decisions | Layer diagram, dev workflow sequence diagram |
| docs/02_browse_engine.md | 302 | Client-server model, lifecycle, security, buffers | Client-server sequence, lifecycle states, buffer flow |
| docs/03_snapshot_and_refs.md | 305 | Accessibility tree, @ref assignment, staleness, flags | 4-stage pipeline, ref lifecycle, staleness flow |
| docs/04_command_system.md | 390 | All 52 commands: read, write, meta categories | Registry → consumers dependency diagram |

Total: ~1,270 lines across 5 files, ~46KB

Architecture diagram (from index.md)

flowchart TD
    subgraph Planning["Planning Phase"]
        CEO["/plan-ceo-review\n(CEO/Founder)"]
        ENG["/plan-eng-review\n(Eng Manager)"]
        DESIGN_PLAN["/plan-design-review\n(Senior Designer)"]
    end

    subgraph Implementation["Implementation Phase"]
        BROWSE["/browse\n(Headless Browser)"]
        QA["/qa & /qa-only\n(QA Lead)"]
        DESIGN_FIX["/design-review\n(Designer Who Codes)"]
    end

    subgraph Shipping["Shipping Phase"]
        REVIEW["/review\n(Staff Engineer)"]
        SHIP["/ship\n(Release Engineer)"]
        DOCS["/document-release\n(Technical Writer)"]
    end

    CEO --> ENG --> DESIGN_PLAN
    DESIGN_PLAN --> BROWSE
    BROWSE --> QA
    QA --> DESIGN_FIX
    DESIGN_FIX --> REVIEW
    REVIEW --> SHIP
    SHIP --> DOCS

Browse engine deep dive (from chapter 2)

sequenceDiagram
    participant Skill as Skill ($B goto ...)
    participant CLI as CLI (cli.ts)
    participant State as .gstack/browse.json
    participant Server as HTTP Server (server.ts)
    participant Browser as Playwright + Chromium

    Skill->>CLI: $B goto https://example.com
    CLI->>State: Read pid, port, token
    CLI->>Server: POST /command {cmd: "goto", args: [...]}
    Server->>Browser: page.goto("https://example.com")
    Browser-->>Server: Page loaded
    Server-->>CLI: {output: "Navigated to https://example.com"}
    CLI-->>Skill: Navigated to https://example.com
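
The CLI half of this exchange can be sketched roughly as follows. This is a minimal illustration, not the code in cli.ts: the `BrowseState` and `CommandRequest` shapes, the `parseState` and `buildCommandRequest` helpers, the loopback host, and the bearer-token header are all assumptions made for the example. Only the `pid`, `port`, `token` fields, the `POST /command` route, and the `{cmd, args}` body come from the diagram.

```typescript
// Hypothetical shape of .gstack/browse.json; the diagram names only these fields.
interface BrowseState {
  pid: number;
  port: number;
  token: string;
}

// Hypothetical description of the HTTP request the CLI would send.
interface CommandRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Parse the state file contents and reject anything missing a field.
function parseState(json: string): BrowseState {
  const raw = JSON.parse(json) as Partial<BrowseState>;
  if (
    typeof raw.pid !== "number" ||
    typeof raw.port !== "number" ||
    typeof raw.token !== "string"
  ) {
    throw new Error("state file is missing pid, port, or token");
  }
  return { pid: raw.pid, port: raw.port, token: raw.token };
}

// Build the POST /command request. The 127.0.0.1 host and the
// Authorization header format are assumptions for illustration.
function buildCommandRequest(
  state: BrowseState,
  cmd: string,
  args: string[]
): CommandRequest {
  return {
    url: `http://127.0.0.1:${state.port}/command`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${state.token}`,
    },
    body: JSON.stringify({ cmd, args }),
  };
}

// Usage with made-up state values.
const state = parseState('{"pid": 4242, "port": 34567, "token": "example-token"}');
const req = buildCommandRequest(state, "goto", ["https://example.com"]);
```

Keeping request construction separate from the network call makes the step easy to test without a running server; the resulting object could then be handed to `fetch`.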

Snapshot pipeline (from chapter 3)

flowchart LR
    PAGE["Web Page\n(HTML + DOM)"]
    A11Y["Accessibility Tree\n(Playwright)"]
    PARSE["Parse & Assign\n@e1, @e2, ..."]
    OUTPUT["YAML-like\nSnapshot Text"]
    REFS["Ref Map\n@e1 → Locator\n@e2 → Locator"]

    PAGE --> A11Y --> PARSE
    PARSE --> OUTPUT
    PARSE --> REFS
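
The "Parse & Assign" stage can be illustrated with a small sketch. It assumes a simplified, hypothetical `A11yNode` type in place of Playwright's accessibility tree, assigns a ref to every node in depth-first order, and stores the node itself where the real ref map would hold a Locator. The output line format is also an assumption; the actual logic in snapshot.ts may filter nodes and format lines differently.

```typescript
// Hypothetical, simplified accessibility node.
interface A11yNode {
  role: string;
  name: string;
  children?: A11yNode[];
}

interface Snapshot {
  text: string;                // indented, YAML-like listing
  refs: Map<string, A11yNode>; // "@e1" -> node (stand-in for a Locator)
}

// Walk the tree depth-first, numbering nodes @e1, @e2, ... and
// producing both the text output and the ref map in one pass.
function buildSnapshot(root: A11yNode): Snapshot {
  const lines: string[] = [];
  const refs = new Map<string, A11yNode>();
  let counter = 0;

  const visit = (node: A11yNode, depth: number): void => {
    counter += 1;
    const ref = `@e${counter}`;
    refs.set(ref, node);
    lines.push(`${"  ".repeat(depth)}- ${node.role} "${node.name}" [${ref}]`);
    for (const child of node.children ?? []) {
      visit(child, depth + 1);
    }
  };

  visit(root, 0);
  return { text: lines.join("\n"), refs };
}

// Usage with a made-up three-node page.
const snap = buildSnapshot({
  role: "main",
  name: "Example",
  children: [
    { role: "button", name: "Submit" },
    { role: "link", name: "Home" },
  ],
});
```

Producing the text and the ref map in the same traversal keeps the two outputs consistent: every ref that appears in the text has an entry in the map.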

Command system coverage (from chapter 4)

┌───────────────────────────────────────────────────────────┐
│                        commands.ts                        │
│                 (Single Source of Truth)                  │
├───────────────────┬──────────────────┬────────────────────┤
│  READ (16 cmds)   │  WRITE (21 cmds) │  META (15 cmds)    │
│  text, html,      │  goto, click,    │  tabs, screenshot, │
│  links, forms,    │  fill, select,   │  pdf, responsive,  │
│  js, eval, css,   │  hover, type,    │  chain, diff,      │
│  attrs, console,  │  press, scroll,  │  snapshot, status, │
│  network, cookies,│  wait, viewport, │  stop, restart,    │
│  storage, perf,   │  cookie, header, │  url, tab, newtab, │
│  dialog, is       │  useragent, ...  │  closetab          │
└───────────────────┴──────────────────┴────────────────────┘
        │                    │                   │
        ▼                    ▼                   ▼
   Consumed by: server.ts dispatch, gen-skill-docs,
                skill-validation tests, CLI help text
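
The single-source-of-truth idea can be sketched as one registry that every consumer reads. The example below is illustrative only: the `COMMANDS` map, `categoryOf`, `commandsIn`, and `helpText` are hypothetical names, and the registry holds just nine of the command names from the diagram, not the full set in commands.ts.

```typescript
type Category = "read" | "write" | "meta";

// One registry; dispatch, help text, and validation all derive from it.
// Only a small subset of the command names above is included here.
const COMMANDS = new Map<string, Category>([
  ["text", "read"],
  ["html", "read"],
  ["links", "read"],
  ["goto", "write"],
  ["click", "write"],
  ["fill", "write"],
  ["tabs", "meta"],
  ["screenshot", "meta"],
  ["snapshot", "meta"],
]);

// Consumer 1: dispatch looks up the category and rejects unknown names.
function categoryOf(cmd: string): Category {
  const category = COMMANDS.get(cmd);
  if (category === undefined) {
    throw new Error(`Unknown command: ${cmd}`);
  }
  return category;
}

// Consumer 2: list the commands in one category, in registry order.
function commandsIn(category: Category): string[] {
  const names: string[] = [];
  COMMANDS.forEach((cat, name) => {
    if (cat === category) names.push(name);
  });
  return names;
}

// Consumer 3: help text generated from the same data.
function helpText(): string {
  const categories: Category[] = ["read", "write", "meta"];
  return categories
    .map((c) => `${c.toUpperCase()}: ${commandsIn(c).join(", ")}`)
    .join("\n");
}
```

Because the help text and the dispatch lookup read the same map, adding a command in one place updates both, which is the property the diagram describes.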

Verification methodology

Every technical claim was verified against source code:

| Claim | Verified Against | Status |
| --- | --- | --- |
| 52 commands (16 + 21 + 15) | browse/src/commands.ts | ✅ Exact match |
| 8 snapshot flags (-i, -c, -d, -s, -D, -a, -o, -C) | browse/src/snapshot.ts SNAPSHOT_FLAGS | ✅ Exact match |
| Buffer capacity 50,000 | browse/src/buffers.ts HIGH_WATER_MARK | ✅ Exact match |
| Port range 10000-60000 | browse/src/server.ts | ✅ Exact match |
| Idle timeout 30 min | browse/src/server.ts BROWSE_IDLE_TIMEOUT | ✅ Exact match |
| Default viewport 1280×720 | browse/src/browser-manager.ts | ✅ Exact match |
| Interaction timeout 15s | browse/src/write-commands.ts | ✅ Exact match |
| 14 skill directories | Glob for */SKILL.md.tmpl | ✅ Exact match |

Design decisions

  • Tutorial style: Progressive complexity — each chapter builds on the previous
  • Real code: Examples pulled from actual source files, not hypothetical
  • Mermaid diagrams: Every chapter has at least one visual (GitHub renders natively)
  • Cross-references: Chapters link to each other with relative Markdown links
  • Jekyll front matter: Ready for GitHub Pages / docs site if desired
  • No generated content markers: Clean prose, no "AI generated" badges

Relationship to existing docs

docs/
├── images/              # Existing (unchanged)
├── skills.md            # Existing (unchanged)
├── index.md             # NEW — tutorial entry point
├── 01_architecture.md   # NEW — big picture
├── 02_browse_engine.md  # NEW — browser deep dive
├── 03_snapshot_and_refs.md  # NEW — snapshot system
└── 04_command_system.md # NEW — command reference

No existing files are modified. The new docs complement skills.md (which documents skill usage) with deep technical content about how gstack works under the hood.

Follow-up

A second PR (docs/tutorial-workflows) will add chapters 5-10 covering:

  • Skill system anatomy and the 14 roles
  • Template engine and placeholder resolution
  • Planning skills (CEO, Eng, Design reviews)
  • Ship & review pipeline
  • QA & design review with real browsers
  • 3-tier test infrastructure

Test plan

  • All $B commands referenced match commands.ts registry
  • All snapshot flags match SNAPSHOT_FLAGS array
  • Mermaid diagrams use valid syntax (tested in GitHub preview)
  • Cross-links between chapters resolve correctly
  • No stale file paths or function names
  • No modifications to existing files
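
The first check in this list could be automated along these lines. This is a sketch under stated assumptions, not the project's skill-validation test: `referencedCommands` and `unknownCommands` are hypothetical helpers, the regular expression assumes command names are lowercase letters and hyphens following `$B`, and the registry is passed in as a plain array of names.

```typescript
// Collect every command name that follows "$B " in a chunk of documentation.
function referencedCommands(markdown: string): string[] {
  const found = new Set<string>();
  const pattern = /\$B\s+([a-z][a-z-]*)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    found.add(match[1]);
  }
  const names: string[] = [];
  found.forEach((name) => names.push(name));
  return names.sort();
}

// Return the referenced commands that are missing from the registry.
function unknownCommands(markdown: string, registry: string[]): string[] {
  const known = new Set(registry);
  return referencedCommands(markdown).filter((name) => !known.has(name));
}

// Usage with made-up documentation text and a two-command registry.
const sampleDoc =
  "Run `$B goto https://example.com`, then `$B snapshot -i` and `$B teleport`.";
const missing = unknownCommands(sampleDoc, ["goto", "snapshot"]);
// missing is ["teleport"]
```

A check like this fails loudly when a chapter mentions a command that was renamed or removed, which is the drift the test plan is guarding against.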


Add comprehensive onboarding documentation covering gstack's core
infrastructure. Five files forming a self-contained tutorial that takes
readers from "what is gstack?" to understanding every browse command.

- index.md: Tutorial entry point with Mermaid architecture flowchart
- 01_architecture.md: Three-layer design, virtual team, project structure
- 02_browse_engine.md: Client-server model, lifecycle, security, buffers
- 03_snapshot_and_refs.md: Accessibility tree, @ref system, staleness
- 04_command_system.md: All 52 commands (read/write/meta), error handling

All technical claims verified against source code (commands.ts,
snapshot.ts, server.ts, browser-manager.ts, buffers.ts).