Pawl is an exploded view of the macOS Seatbelt sandbox built with search. It operates through mechanical exercise of the operating system, pulling the same policy along two directions: forward, from profile records through policy bytes to runtime denial; and reverse, from compiled blobs back to inputs—across multiple lanes and recording where they align or diverge.
Seatbelt policy is enforced through a confounded, censored stack: multiple layers contribute to outcomes. Closing the loop from structure → meaning → runtime behavior is hard even when you have good tools and strong priors. Furthermore, the reward for concerted effort is a fragile, host-bound understanding that can be undone with the next OS update. Rebuilding that understanding involves a new reverse engineering effort.
Pawl is different on purpose: a unitary inspectable account with enough internal wiring that it can notice when it stops being true. Instead of re-fitting reverse engineering tools to macOS when it updates, then exploring the Seatbelt, we instrument policy from the inside out. We find out what is different about a new version of macOS the same way we learn this one: each step in pawl is instrumented and diffed against system outputs. We converge on system features by keeping those outputs in productive doubt while testing new configurations.
The IR is operational. The evidence layer extracts byte-faithful, hash-stable structure from compiled profiles with explicitly marked unknown regions. The derived policy layer interprets evidence into operations, predicates, and control flow while retaining provenance links (origin_offsets, origin_spans) back to underlying bytes. Both layers are stable and exercised across synthetic fixtures and corpus profiles.
The five-point harness validates policy equivalence through roundtrip: source SBPL compiles to a blob, reverse renders back to SBPL, which recompiles to a roundtrip blob, with equality checks at each stage. The regex harness is mathematically precise—DFA equivalence plus a runtime file-read oracle that validates without trusting static comparisons. The profile harness has structural checks working (op-table classification, graph comparison); runtime checks are implemented via PolicyWitness but blocked for entitled or message-filter profiles.
Compiler-implicit operations are fully handled. A baseline artifact tracks ~54 implicit allows (deny-default profiles) and ~5 implicit denies (allow-default profiles); the reverse lane suppresses these, guarded by a dedicated test suite. Message-filter framing (0x4000 variant) is structurally decoded—allow-inner forms are canonicalized away (the compiler discards them), deny-inner forms reconstruct correctly. The remaining roundtrip gap is multi-regex predicate ordering in rules with two regex filters.
Known capability boundaries are characterized and scoped. Unicode casefolding is a runtime rule not encoded in bytecode—the harness expands it and tags the rewrite. POSIX bracket classes compile but do not match at runtime; the harness marks them unsupported. Message-filter profiles require a private entitlement (com.apple.private.security.message-filter) for runtime application; structural comparisons proceed, runtime checks report blocked with reason.
Pawl's core validation structure is the five-point harness: a roundtrip circuit that closes only when multiple independent paths agree, making failures localizable instead of mysterious. The regex harness (integration/ir/regex/) is the tightest instantiation: DFA equivalence is mathematically precise, and the runtime file-read checker provides a genuinely independent oracle. The profile harness (integration/ir/profile/) extends this pattern to full sandbox profiles, with op-table classification and graph comparison replacing DFA checks.
1 ◄═══ SBPL eq ═══► 3
Source Reversed
│ ↗ │
│ reverse │ compile
▼ ╱ ▼
2 ─────────────╯ 4
Source ◄═══ DFA eq ═══► Roundtrip
blob blob
│ │
└──────────┬───────────┘
│
▼
5
Oracle
Points 1–4 form a square of artifacts: source SBPL compiles to source blob, reverse renders back to SBPL, which recompiles to the roundtrip blob. The horizontal edges are cross-checks—SBPL surface equality above, DFA equivalence below. Point 5 is the runtime oracle, which validates behavior without trusting any of the static comparisons. When the circuit disagrees, the location of the disagreement tells you which transformation is at fault.
| Level | What it catches |
|---|---|
| Surface equality | Normalization drift, parser bugs |
| Structural hash | Op-table divergence, node graph differences |
| DFA equivalence | Language acceptance mismatches |
| Runtime oracle | Host disagrees despite internal consensus |
Everything is scoped to a host via world_id.
Core loop
pawl/README.md— workbench overview and CLIpawl/LOOP.md— cross-check and solver closure contractpawl/casefile/CASEFILE.md— casefile schema and receipts
IR and harnesses
integration/ir/ir_schema.md— IR specification (evidence + policy layers)integration/ir/PLAN.md— IR development plan and statusintegration/ir/regex/FIVE_POINT_HARNESS.md— regex harness specintegration/ir/profile/FIVE_POINT_HARNESS.md— profile harness spec
Runtime
runtime/PolicyWitness.md— runtime oracle and probe executionprobe/SPEC.md— probe plan schema
Fixtures and evidence
integration/fixtures/pawl/d-ring/— synthetic fixture families for compiler behaviorintegration/carton/README.md— frozen mappings and reviewable contracts
Supporting
api/README.md— Ghidra/Frida surfacestools/README.md— host-bound CLIs
Agent instructions live in local AGENTS.md files; closer is more authoritative.
The project has many moving parts; static artifacts, decoders, mappings, harnesses, probes, and their surrounding macOS environment all interact. It is a living, interdependent mesh, reaching into the agents who work on it as intimately as it does the sandbox.
Your part in this tangle may be to glue some of these parts together or to unstick them, depending. Trace inputs and outputs across the boundary you are touching, and update the generator or contract that defines the interface rather than patching artifacts directly. Leave a clear, rerunnable path from source to output so the wiring stays explicit for the next change. That is how to proceed generally; in particular, proceed first by reading CONTRIBUTING.md and AGENTS.md.