Skip to content
/ PAWL Public

enumerate the macOS Seatbelt sandbox through the magic of science

License

Notifications You must be signed in to change notification settings

Protonk/PAWL

Repository files navigation

Pawl: sandbox instrumentation through search

Pawl is an exploded view of the macOS Seatbelt sandbox built with search. It operates through mechanical exercise of the operating system, pulling the same policy along two directions: forward, from profile records through policy bytes to runtime denial; and reverse, from compiled blobs back to inputs—across multiple lanes and recording where they align or diverge.

Seatbelt policy is enforced through a confounded, censored stack: multiple layers contribute to outcomes. Closing the loop from structure → meaning → runtime behavior is hard even when you have good tools and strong priors. Furthermore, the reward for concerted effort is a fragile, host-bound understanding that can be undone with the next OS update. Rebuilding that understanding involves a new reverse engineering effort.

Pawl is different on purpose: a unitary inspectable account with enough internal wiring that it can notice when it stops being true. Instead of re-fitting reverse engineering tools to macOS when it updates, then exploring the Seatbelt, we instrument policy from the inside out. We find out what is different about a new version of macOS the same way we learn this one: each step in pawl is instrumented and diffed against system outputs. We converge on system features by keeping those outputs in productive doubt while testing new configurations.

Where we are today

The IR is operational. The evidence layer extracts byte-faithful, hash-stable structure from compiled profiles with explicitly marked unknown regions. The derived policy layer interprets evidence into operations, predicates, and control flow while retaining provenance links (origin_offsets, origin_spans) back to underlying bytes. Both layers are stable and exercised across synthetic fixtures and corpus profiles.

The five-point harness validates policy equivalence through roundtrip: source SBPL compiles to a blob, reverse renders back to SBPL, which recompiles to a roundtrip blob, with equality checks at each stage. The regex harness is mathematically precise—DFA equivalence plus a runtime file-read oracle that validates without trusting static comparisons. The profile harness has structural checks working (op-table classification, graph comparison); runtime checks are implemented via PolicyWitness but blocked for entitled or message-filter profiles.

Compiler-implicit operations are fully handled. A baseline artifact tracks ~54 implicit allows (deny-default profiles) and ~5 implicit denies (allow-default profiles); the reverse lane suppresses these, guarded by a dedicated test suite. Message-filter framing (0x4000 variant) is structurally decoded—allow-inner forms are canonicalized away (the compiler discards them), deny-inner forms reconstruct correctly. The remaining roundtrip gap is multi-regex predicate ordering in rules with two regex filters.

Known capability boundaries are characterized and scoped. Unicode casefolding is a runtime rule not encoded in bytecode—the harness expands it and tags the rewrite. POSIX bracket classes compile but do not match at runtime; the harness marks them unsupported. Message-filter profiles require a private entitlement (com.apple.private.security.message-filter) for runtime application; structural comparisons proceed, runtime checks report blocked with reason.

Five-point harness

Pawl's core validation structure is the five-point harness: a roundtrip circuit that closes only when multiple independent paths agree, making failures localizable instead of mysterious. The regex harness (integration/ir/regex/) is the tightest instantiation: DFA equivalence is mathematically precise, and the runtime file-read checker provides a genuinely independent oracle. The profile harness (integration/ir/profile/) extends this pattern to full sandbox profiles, with op-table classification and graph comparison replacing DFA checks.

     1 ◄═══ SBPL eq ═══► 3
  Source              Reversed
     │                 ↗  │
     │          reverse   │ compile
     ▼               ╱    ▼
     2 ─────────────╯     4
  Source ◄═══ DFA eq ═══► Roundtrip
   blob                    blob
     │                      │
     └──────────┬───────────┘
                │
                ▼
                5
             Oracle

Points 1–4 form a square of artifacts: source SBPL compiles to source blob, reverse renders back to SBPL, which recompiles to the roundtrip blob. The horizontal edges are cross-checks—SBPL surface equality above, DFA equivalence below. Point 5 is the runtime oracle, which validates behavior without trusting any of the static comparisons. When the circuit disagrees, the location of the disagreement tells you which transformation is at fault.

Level What it catches
Surface equality Normalization drift, parser bugs
Structural hash Op-table divergence, node graph differences
DFA equivalence Language acceptance mismatches
Runtime oracle Host disagrees despite internal consensus

Everything is scoped to a host via world_id.

Docs index

Core loop

IR and harnesses

Runtime

Fixtures and evidence

Supporting

Agent instructions live in local AGENTS.md files; closer is more authoritative.

Welcome aboard

The project has many moving parts; static artifacts, decoders, mappings, harnesses, probes, and their surrounding macOS environment all interact. It is a living, interdependent mesh, reaching into the agents who work on it as intimately as it does the sandbox.

Your part in this tangle may be to glue some of these parts together or to unstick them, depending. Trace inputs and outputs across the boundary you are touching, and update the generator or contract that defines the interface rather than patching artifacts directly. Leave a clear, rerunnable path from source to output so the wiring stays explicit for the next change. That is how to proceed generally; in particular, proceed first by reading CONTRIBUTING.md and AGENTS.md.

About

enumerate the macOS Seatbelt sandbox through the magic of science

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Contributors 2

  •  
  •