This document defines the current executable architecture for RADE's deterministic UI intelligence wedge and the boundaries between the primary proof path and secondary experimental paths.
- Primary path: deterministic report generation from a structured JSON payload or a Playwright-collected public web page.
- Secondary path: accessibility-like tree collection -> construction graph -> blueprint SVG / scrubbed graph ingest helpers.
- Thin shells: worker, web, and agent surfaces exist to keep entrypoints explicit without claiming hosted product completeness. The served API surface is real, but intentionally narrow.
The current wedge is proof-backed interface analysis for repeated structure, accessibility gaps, and modernization risk. The architecture is designed to keep those outputs deterministic and traceable while preserving a broader path toward interface intelligence infrastructure.
src/core/cli.py is the main entrypoint.
- Input is either a local JSON payload or a public http/https URL passed through `src/collectors/web_dom_adapter.py`.
- The same CLI also accepts two existing RADE JSON reports for deterministic report-to-report comparison.
`src/core/schemas.py` validates required fields, uniqueness, parent references, slab layers, and scalar types.
- Only `ios`, `android`, and `web` are valid platforms.
- The web collector uses Playwright `locator("body").aria_snapshot()` as the primary collection source and falls back to a semantic DOM walk when the ARIA snapshot is empty.
src/core/normalizer.py converts validated payloads into a stable project model.
- every node is stamped with `app_id`, `platform`, `screen_id`, and `screen_name`
- traits are normalized and sorted
- bounds are normalized to integer arrays
- slab layers are inferred or normalized through `src/core/layering.py`
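The stamping and normalization rules above can be sketched as a single function. Field handling here (lowercasing traits, rounding bounds) is an assumption for illustration; the shipped logic lives in `src/core/normalizer.py`.

```python
def normalize_node(raw: dict, *, app_id: str, platform: str,
                   screen_id: str, screen_name: str) -> dict:
    # Hypothetical sketch of the documented normalization steps.
    return {
        # every node is stamped with the project context
        "app_id": app_id,
        "platform": platform,
        "screen_id": screen_id,
        "screen_name": screen_name,
        "node_id": raw["node_id"],
        # traits are normalized (trimmed, lowercased here) and sorted
        "traits": sorted(t.strip().lower() for t in raw.get("traits", [])),
        # bounds become an integer [x, y, w, h] array
        "bounds": [int(round(v)) for v in raw.get("bounds", [0, 0, 0, 0])],
    }
```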
The deterministic structural path is:
- fingerprint each node in `src/core/fingerprint.py`
- deduplicate nodes into ordered clusters in `src/core/deduplicator.py`
- score the project in `src/core/scoring.py`
- build recommendations in `src/core/recommendation_engine.py`
- build a roadmap in `src/core/roadmap_generator.py`
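The first two steps can be sketched as follows. The fingerprint key (platform + sorted traits + slab layer) is an assumption for illustration, not the shipped `src/core/fingerprint.py` definition; the point is that identical structure hashes identically and clusters keep first-seen order, which keeps downstream scoring deterministic.

```python
import hashlib
from collections import OrderedDict

def fingerprint(node: dict) -> str:
    # Illustrative stand-in: hash the structural identity, not instance data.
    key = "|".join([node["platform"],
                    ",".join(sorted(node["traits"])),
                    str(node.get("slab_layer", ""))])
    return hashlib.sha256(key.encode()).hexdigest()[:16]

def dedupe(nodes: list) -> OrderedDict:
    # Ordered clusters: first-seen fingerprint order is preserved so the
    # scoring and recommendation steps see a stable input.
    clusters = OrderedDict()
    for node in nodes:
        clusters.setdefault(fingerprint(node), []).append(node)
    return clusters
```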
Current score outputs are:
- `complexity`
- `reusability`
- `accessibility_risk`
- `migration_risk`
Current recommendations are standards-backed and derived from current deterministic rules rather than model inference.
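A deterministic rule of this kind can be sketched over the duplicate clusters. The rule, threshold, and ID formats below are hypothetical illustrations, not the shipped `src/core/recommendation_engine.py` schema.

```python
def recommend(clusters: dict) -> list:
    # Hypothetical rule: any fingerprint repeated across three or more
    # nodes is a componentization candidate. No model inference involved;
    # the same input always yields the same recommendations.
    recs = []
    for fp, nodes in clusters.items():
        if len(nodes) >= 3:
            recs.append({
                "rule_id": "RADE-REUSE-001",          # illustrative ID
                "recommendation_id": f"rec-{fp}",      # stable per fingerprint
                "summary": f"Extract a shared component for {len(nodes)} repeated nodes",
            })
    return recs
```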
src/core/report_generator.py writes:
- JSON report
- Markdown report
- HTML report
Before write, report artifacts are scrubbed by src/scrubber/pii_scrubber.py.
- stable identifiers such as `node_ref`, `rule_id`, `recommendation_id`, and fingerprints are intentionally preserved
- emitted artifacts receive public repository metadata from `src/core/compliance.py`
src/core/report_diff.py compares two existing RADE JSON reports.
- score deltas reuse the directional semantics from `src/core/pr_score_diff.py`
- all four current scores are compared deterministically: `complexity`, `reusability`, `accessibility_risk`, and `migration_risk`
- recommendation additions and removals are traced by stable `recommendation_id`
- repeated-structure changes are traced by duplicate-cluster `fingerprint` and node references
- output artifacts are local JSON and Markdown files, not hosted history or persistence claims
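The directional comparison can be sketched as below. The assumed semantics (higher `reusability` is an improvement; higher `complexity`, `accessibility_risk`, or `migration_risk` is a regression) are an illustration; the authoritative semantics live in `src/core/pr_score_diff.py`.

```python
HIGHER_IS_BETTER = {"reusability"}  # assumed direction, see lead-in

def score_delta(base: dict, head: dict) -> dict:
    deltas = {}
    for name in ("complexity", "reusability",
                 "accessibility_risk", "migration_risk"):
        delta = head[name] - base[name]
        improved = delta > 0 if name in HIGHER_IS_BETTER else delta < 0
        deltas[name] = {
            "delta": delta,
            "direction": ("improved" if improved
                          else "regressed" if delta else "unchanged"),
        }
    return deltas
```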
- `agent/cli.py` forwards `scan` to the core CLI analyze path for either `--input` or `--url`
- `src/api/wsgi.py` is the served entrypoint for `/`, `/healthz`, and `POST /analyze`; it wraps the core `src/api/app.py` handler with API key auth middleware
- `src/worker/main.py` emits staged telemetry but performs no real queue work
- `web/lib/shell.mjs` serves the active web shell; `web/app/` is dormant scaffold only
- Root `action.yml` defines a GitHub Action boundary that compares PR base/head fixture reports, posts score deltas for `reusability` and `accessibility_risk` with explicit regression-gate status, validates strict boolean regression-gate input, exports deterministic gate/delta/reason/flag outputs for workflow reuse, and can optionally fail on score regression after commenting
This path is implemented and tested, but it is not the default product workflow.
src/engine/rade_orchestrator.py can collect from:
- an in-memory accessibility-like root object
- a caller-provided driver
It produces a ConstructionGraph containing normalized nodes, containment edges, and plumbing edges for interactive destinations.
The file also defines managed-session configuration and capability building for Appium-style providers, but the repo does not ship a full provider integration.
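The graph shape described above can be sketched as a small data structure. Field and method names are assumptions for illustration, not the shipped `src/engine/rade_orchestrator.py` model.

```python
from dataclasses import dataclass, field

@dataclass
class ConstructionGraph:
    # Minimal sketch of the documented graph shape.
    nodes: dict = field(default_factory=dict)        # node_id -> normalized node
    containment: list = field(default_factory=list)  # (parent_id, child_id)
    plumbing: list = field(default_factory=list)     # (source_id, destination_id)

    def add_child(self, parent_id: str, child: dict) -> None:
        self.nodes[child["node_id"]] = child
        self.containment.append((parent_id, child["node_id"]))

    def wire(self, source_id: str, destination_id: str) -> None:
        # plumbing edges record interactive destinations (e.g. tap targets)
        self.plumbing.append((source_id, destination_id))
```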
src/demo/run_raid_visualizer.py renders a deterministic SVG blueprint.
Current blueprint behavior includes:
- slab-layer-aware node styling
- `data-rade-dna` and `data-slab-layer` metadata on groups
- public metadata and visible watermark text
- deterministic demo outputs validated against a golden SVG fixture
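Emitting those group-level metadata attributes can be sketched like this; the function is illustrative only, and the real styling and layout live in `src/demo/run_raid_visualizer.py`.

```python
from xml.sax.saxutils import quoteattr

def blueprint_group(node_id: str, dna: str, slab_layer: int, body: str) -> str:
    # Attach the data-rade-dna / data-slab-layer metadata to an SVG <g>.
    # quoteattr escapes and double-quotes attribute values, keeping the
    # output byte-stable for comparison against a golden SVG fixture.
    return (f"<g id={quoteattr(node_id)} data-rade-dna={quoteattr(dna)} "
            f"data-slab-layer={quoteattr(str(slab_layer))}>{body}</g>")
```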
src/database/graph_ingestor.py persists scrubbed construction graphs to Neo4j Aura.
Current status:
- schema creation and write queries exist
- scrub-before-write is enforced
- pattern IDs and plumbing edges are derived deterministically
- the ingestor is library-level proof, not a user-facing persistence workflow
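The scrub-before-write contract can be sketched as a query builder that only accepts scrubbed free-form text into Cypher parameters. The query text and property names are illustrative, not the shipped `src/database/graph_ingestor.py` schema.

```python
def build_node_merge(node: dict, scrub) -> tuple:
    # Every free-form string passes through the scrubber before it can
    # reach a Cypher parameter; deterministic pattern IDs are preserved.
    query = (
        "MERGE (n:UINode {pattern_id: $pattern_id}) "
        "SET n.label = $label, n.slab_layer = $slab_layer"
    )
    params = {
        "pattern_id": node["pattern_id"],       # deterministic, preserved
        "label": scrub(node.get("label", "")),  # scrubbed free-form text
        "slab_layer": node.get("slab_layer"),
    }
    return query, params
```

Parameterized queries of this shape can then be executed through the Neo4j driver's session API without string interpolation.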
RADE currently supports three proven deconstruction inputs:
- structured JSON payloads for the report path
- public unauthenticated web pages converted through the Playwright web collector
- accessibility-like trees for the blueprint / graph path
Pixel-first analysis is not the current core architecture.
Allowed collection sources:
- authorized customer payloads
- customer-consented accessibility trees or simulators when the caller provides them
- public unauthenticated surfaces where collection is permitted
Forbidden collection behaviors:
- login bypass
- fake accounts
- stealth collection
- access-control circumvention
- Graph / blueprint persistence uses `src/scrubber/edge_shield.py`.
- Report artifact emission uses `src/scrubber/pii_scrubber.py`.
Required behavior:
- regex-first PII removal
- optional second-pass Presidio escalation for ambiguous free-form text
- preserve structural nodes and edges where needed for proof
- emit audit metadata for persistence-side scrubbing
- neutralize persistence-side sensitive strings into placeholders such as
DATA_SLOT_01 - preserve public-page collector outputs as structured schema payloads before report scrubbing
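The regex-first pass with placeholder neutralization can be sketched as follows. The two patterns are a minimal illustration, not the full rule set in `src/scrubber/pii_scrubber.py`, and the optional Presidio escalation for ambiguous text is deliberately not shown.

```python
import re

# Minimal sketch of a regex-first PII pass (see lead-in for caveats).
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email addresses
    re.compile(r"\+?\d[\d\s().-]{7,}\d"),     # phone-like numbers
]

def scrub_text(text: str) -> str:
    slot = 0
    def replace(match):
        nonlocal slot
        slot += 1
        # neutralize into numbered placeholders rather than deleting,
        # so the structural shape of the text survives for proof
        return f"DATA_SLOT_{slot:02d}"
    for pattern in PII_PATTERNS:
        text = pattern.sub(replace, text)
    return text
```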
The labels 5-Slab Taxonomy and Ambient Engine are retained as project terminology in this repository.
Use hiQ v. LinkedIn as a narrow CFAA-domain reference for public-facing collection logic.
Do not describe the system as if hiQ were a blanket authorization to bypass gates, ignore terms, or collect private user content.
These are not current architecture claims:
- hosted auth
- tenant-aware persistence
- queue-backed execution
- build scanning beyond deterministic local repo metadata
- a real Next.js runtime
- a shipped AWS Device Farm / Appium integration
- pixel-first reverse engineering
- LLM involvement in deterministic scoring
- a stealth collection model
- hosted persistence assumptions inside the engine layer