Framework detection architecture: spec-driven horizontal layer (like tree-sitter) #418

@peteromallet

Context

PR #414 adds Next.js framework detection — the first framework-specific detector in the codebase. The scanners and detection logic are solid, but before we establish the pattern that future frameworks will follow, we should think about what scales to 20 languages × 20 frameworks.

Current plugin architecture

The codebase has two plugin depth levels and several horizontal layers:

                        Shallow                              Deep
                        (generic_lang)                       (LangConfig class)
                        ─────────────────────────────────────────────────
Language depth:         JS, Nim, Zig, Bash                   TS, Go, Rust, C++

Horizontal layers (work at any depth):
  Tree-sitter:          ✓ (via treesitter_spec=)             ✓ (via *all_treesitter_phases())
  Tool specs:           ✓ (via tools=[...])                  ✓ (via make_tool_phase())

Tree-sitter is the key precedent: it's spec-driven, generates DetectorPhase objects, and works with both shallow and deep plugins. Adding a new language's tree-sitter support = adding a spec, not modifying infrastructure.

The question: what are frameworks?

Most framework-specific issues fall into four categories:

  1. File convention checks — "error.tsx needs use client", "Django views belong in views.py". File pattern + content pattern. Mechanical.
  2. API misuse — "use server in a client module", "useEffect only to sync state". String or AST pattern matches.
  3. External tool output — next lint, django check. Tool running + output parsing.
  4. Architectural violations — "mixing pages/ and app/ router", "circular model imports". Needs dep graph / cross-file analysis.

Categories 1-3 are shallow — pattern matching and tool running. Category 4 needs the language plugin's deep infrastructure (dep graph, zone map), but doesn't need deep framework-specific wiring — it just consumes what the language plugin already provides.

Frameworks aren't a new depth level. They're a horizontal layer, like tree-sitter.
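To make category 1 concrete, here is a minimal sketch of the "error.tsx needs use client" check as a file pattern plus a content pattern. The function name and its `(path, content)` signature are assumptions for illustration; the issue leaves the check signature as an open question, and this is not the code from PR #414.

```python
import re
from pathlib import PurePosixPath

# Leading 'use client' / "use client" directive, allowing whitespace and
# // comments before it. Hypothetical helper, not taken from PR #414.
_USE_CLIENT = re.compile(r"""\A(?:\s|//[^\n]*\n)*(['"])use client\1\s*;?""")


def check_error_boundary_use_client(path: str, content: str) -> bool:
    """Return True when an error.tsx/error.jsx file lacks 'use client'."""
    name = PurePosixPath(path).name
    if name not in {"error.tsx", "error.jsx"}:
        return False  # file pattern does not apply
    return _USE_CLIENT.match(content) is None  # content pattern
```

Nothing here needs a dep graph or a parser, which is why checks of this kind fit a shallow, spec-driven layer.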

Proposed architecture

_framework/
  frameworks/
    types.py              # FrameworkSpec, ScannerRule, DetectionConfig
    registry.py           # registered framework specs
    detection.py          # generic "is this framework present?"
                          #   reads package.json, requirements.txt, Cargo.toml, go.mod
                          #   keyed by ecosystem
    phases.py             # framework_phases(lang_name) → list[DetectorPhase]
                          #   analogous to all_treesitter_phases()
    specs/
      nextjs.py           # Next.js: detection config + scanner rules + issue templates
      django.py           # Django
      rails.py            # Rails
      spring.py           # Spring
      gin.py              # etc.
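As a rough sketch of what detection.py could look like for the node ecosystem only: read package.json, then fall back to config files. The names `node_dependencies`, `DEPENDENCY_READERS`, and `is_framework_present` are hypothetical, and the `markers` directories from the proposal are left out to keep it short.

```python
import json
from pathlib import Path


def node_dependencies(project_root: Path) -> set[str]:
    """Collect dependency names from package.json, or an empty set."""
    try:
        text = (project_root / "package.json").read_text(encoding="utf-8")
        data = json.loads(text)
    except (OSError, json.JSONDecodeError):
        return set()
    if not isinstance(data, dict):
        return set()
    names: set[str] = set()
    for key in ("dependencies", "devDependencies"):
        section = data.get(key)
        if isinstance(section, dict):
            names.update(section)
    return names


# Hypothetical table: ecosystem name -> manifest reader. Other ecosystems
# (requirements.txt, Cargo.toml, go.mod) would register their own reader.
DEPENDENCY_READERS = {"node": node_dependencies}


def is_framework_present(project_root, ecosystem, dependencies, config_files):
    """True if a declared dependency or a known config file is found."""
    reader = DEPENDENCY_READERS.get(ecosystem)
    declared = reader(project_root) if reader else set()
    if declared.intersection(dependencies):
        return True
    return any((project_root / name).is_file() for name in config_files)
```

Keying readers by ecosystem means a new framework in an already supported ecosystem needs no detection code of its own, only a list of dependency names and config files.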

A framework spec would be data-driven, like a tree-sitter spec:

FrameworkSpec(
    id="nextjs",
    ecosystem="node",
    detection=DetectionConfig(
        dependencies=["next"],
        config_files=["next.config.js", "next.config.mjs"],
        markers=["app/", "pages/"],
    ),
    scanners=[
        ScannerRule(
            id="use_server_in_client",
            extensions=[".tsx", ".jsx", ".ts", ".js"],
            check=check_use_server_in_client,
            tier=2,
            confidence="high",
            summary_template="'use server' directive in client module {file}",
        ),
        # ...more scanner rules
    ],
    tool_integration=ToolConfig(
        cmd="npx next lint --format json",
        parser=parse_next_lint,
        slow=True,    # respects --skip-slow
    ),
)
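For the spec above to be constructible, types.py would need definitions along these lines. This is a sketch under assumptions: the field names are read off the example, while the defaults, the `check` and `parser` signatures, and the placement of `ToolConfig` in the same module are guesses.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class DetectionConfig:
    dependencies: list[str] = field(default_factory=list)
    config_files: list[str] = field(default_factory=list)
    markers: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class ScannerRule:
    id: str
    extensions: list[str]
    # Assumed signature: (path, content) -> True when the issue is present.
    check: Callable[[str, str], bool]
    tier: int
    confidence: str
    summary_template: str  # formatted with file=<path>


@dataclass(frozen=True)
class ToolConfig:
    cmd: str
    parser: Callable[[str], list]  # assumed: raw tool output -> issues
    slow: bool = False


@dataclass(frozen=True)
class FrameworkSpec:
    id: str
    ecosystem: str
    detection: DetectionConfig
    scanners: list[ScannerRule] = field(default_factory=list)
    tool_integration: Optional[ToolConfig] = None
```

With plain dataclasses, a spec file stays pure data plus a few check functions, and `rule.summary_template.format(file=path)` is all the generic issue mapping needs to build a summary line.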

Language plugins opt in with one line:

# Deep plugin (typescript/__init__.py)
phases = [
    # ...existing phases...
    *framework_phases("typescript"),
]

# Shallow plugin (javascript/__init__.py)
generic_lang(
    name="javascript",
    # ...other options...
    frameworks=True,
)
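One possible shape for `framework_phases(lang_name)`, sketched with stand-in types so it runs on its own. `DetectorPhase` here is a simplified placeholder for the real class, `LANG_EXTENSIONS` and `REGISTRY` are hypothetical, and the "is this framework present?" gate is omitted.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ScannerRule:  # stand-in, reduced to what this sketch needs
    id: str
    extensions: tuple[str, ...]
    check: Callable[[str, str], bool]


@dataclass(frozen=True)
class FrameworkSpec:  # stand-in
    id: str
    scanners: tuple[ScannerRule, ...]


@dataclass(frozen=True)
class DetectorPhase:  # placeholder shape; the real class lives in the codebase
    label: str
    run: Callable[[dict[str, str]], list[str]]


# Hypothetical: extensions owned by each language plugin.
LANG_EXTENSIONS = {"typescript": (".ts", ".tsx"), "javascript": (".js", ".jsx")}
REGISTRY: list[FrameworkSpec] = []


def framework_phases(lang_name: str) -> list[DetectorPhase]:
    """Build one phase per registered spec that has rules for this language."""
    owned = LANG_EXTENSIONS.get(lang_name, ())
    phases = []
    for spec in REGISTRY:
        rules = [r for r in spec.scanners if set(r.extensions) & set(owned)]
        if not rules:
            continue

        def run(files, rules=rules, owned=owned):
            # files maps path -> content; only this language's files are scanned.
            hits = []
            for path, content in files.items():
                if not path.endswith(owned):
                    continue
                for rule in rules:
                    if path.endswith(rule.extensions) and rule.check(path, content):
                        hits.append(f"{rule.id}:{path}")
            return hits

        phases.append(DetectorPhase(label=f"framework:{spec.id}", run=run))
    return phases
```

Because rules are filtered by extension, the same Next.js spec yields a TypeScript phase and a JavaScript phase without either plugin importing from the other.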

Why this scales

  • Adding a framework = adding a spec file. No language plugin modifications needed.
  • Cross-language coverage is automatic. Scanner rules declare which extensions they apply to, so TS and JS both get Next.js checks from the same spec. No cross-plugin imports, no duplication.
  • Detection is ecosystem-based. Node reads package.json, Python reads requirements.txt/pyproject.toml, Go reads go.mod. Centralized and consistent.
  • The copy-paste problem disappears. The spec→issue mapping is generic (like how generic_support maps tool specs to issues), not 25 hand-coded blocks.
  • Specs are testable in isolation. Test the spec, not the wiring.
  • Works at any plugin depth. Shallow plugins get framework support via one flag. Deep plugins get it via one line. Scanners can optionally consume the dep graph if available.

The updated spectrum

                        Shallow                              Deep
                        (generic_lang)                       (LangConfig class)
                        ─────────────────────────────────────────────────
Language depth:         JS, Nim, Zig, Bash                   TS, Go, Rust, C++

Horizontal layers (work at any depth):
  Tree-sitter:          ✓ (via treesitter_spec=)             ✓ (via *all_treesitter_phases())
  Frameworks:           ✓ (via frameworks=True)              ✓ (via *framework_phases())
  Tool specs:           ✓ (via tools=[...])                  ✓ (via make_tool_phase())

Migration path

  1. MacHatter1's Next.js scanners (PR #414, "feat(frameworks): add shared Next.js detector pipeline + enforced next lint (TS/JS)") become a nextjs.py framework spec
  2. Existing React detectors in typescript/detectors/react/ eventually become a react.py spec
  3. Both TS and JS get framework coverage from the same specs
  4. The hand-coded phases_smells.py React wiring gets replaced by *framework_phases("typescript")

Open questions

  • Should the check function in scanner rules receive the full LangRuntimeContract (for dep graph access), or just file content + path?
  • How do we handle framework-specific fixers? (e.g., auto-adding 'use client' to error boundaries)
  • Should framework specs support "requires deep plugin" scanners that gracefully skip for shallow plugins?
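One way the first and third questions could fit together, offered as a sketch rather than a proposal: pass checks a small context object whose `dep_graph` is `None` under shallow plugins, and let rules declare that they need it. `ScanContext`, `requires_dep_graph`, and `run_rules` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ScanContext:  # hypothetical; a narrower view than LangRuntimeContract
    path: str
    content: str
    # None when the language plugin is shallow and has no dep graph.
    dep_graph: Optional[dict[str, set[str]]] = None


@dataclass(frozen=True)
class ScannerRule:  # stand-in, reduced to what this sketch needs
    id: str
    check: Callable[[ScanContext], bool]
    requires_dep_graph: bool = False


def run_rules(rules, ctx: ScanContext) -> tuple[list[str], list[str]]:
    """Return (ids of rules that fired, ids of rules skipped as deep-only)."""
    found, skipped = [], []
    for rule in rules:
        if rule.requires_dep_graph and ctx.dep_graph is None:
            skipped.append(rule.id)  # graceful skip, reported rather than raised
            continue
        if rule.check(ctx):
            found.append(rule.id)
    return found, skipped
```

Returning the skipped ids lets the caller report reduced coverage under a shallow plugin instead of silently dropping category 4 checks.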

Feedback welcome — especially from @MacHatter1 who has already done the hard work of building the first framework detector and has the best sense of what the scanner authoring experience should feel like.
