Hypothesis
An agent using domain classification will place new code in architecturally appropriate locations more often than an agent without domain context.
The Problem
When agents add new code (functions, classes, files), they often:
- Put code in the wrong module/directory
- Create new files when they should extend existing ones
- Violate architectural boundaries (e.g., UI code in data layer)
- Miss existing patterns (e.g., there's already a `utils/` folder)
This happens because agents lack understanding of the codebase's architectural organization.
Rationale
Baseline approach:
- Agent greps for similar code
- Guesses location based on file names
- May find one example but miss the pattern
- No understanding of domain boundaries
Domain graph approach:
- Agent sees architectural domains (e.g., "Authentication", "Billing", "API")
- Knows which files/functions belong to each domain
- Can ask: "Where does authentication code live?"
- Understands existing organizational patterns
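The question "Where does authentication code live?" can be sketched as a lookup over a domain graph. This is only an illustration: the actual output of `get_domain_graph` is not specified in this issue, so the `DomainGraph` and `DomainNode` shapes and the `directoriesForDomain` helper below are assumed, hypothetical names.

```typescript
// Hypothetical shape for a domain graph; the real get_domain_graph output may differ.
interface DomainNode {
  name: string;    // e.g. "Authentication"
  files: string[]; // files classified into this domain
}

interface DomainGraph {
  domains: DomainNode[];
}

// Answer "where does <domain> code live?" by returning the directories that
// hold the domain's files, ordered from most to fewest files.
function directoriesForDomain(graph: DomainGraph, domainName: string): string[] {
  const node = graph.domains.find(
    (d) => d.name.toLowerCase() === domainName.toLowerCase(),
  );
  if (!node) return [];

  const counts = new Map<string, number>();
  for (const file of node.files) {
    const slash = file.lastIndexOf("/");
    const dir = slash >= 0 ? file.slice(0, slash) : ".";
    counts.set(dir, (counts.get(dir) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([dir]) => dir);
}

// Made-up example data, not taken from any real repository.
const exampleGraph: DomainGraph = {
  domains: [
    {
      name: "Authentication",
      files: [
        "src/auth/login.ts",
        "src/auth/session.ts",
        "src/middleware/requireAuth.ts",
      ],
    },
    { name: "Billing", files: ["src/billing/invoice.ts"] },
  ],
};

const authDirs = directoriesForDomain(exampleGraph, "authentication");
console.log(authDirs);
```

Ranking directories by file count gives the agent a default placement candidate while still exposing secondary locations.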
Proposed Eval
Task Types
- "Add a new utility function for X" - Should go in existing utils, not new file
- "Add a new API endpoint for Y" - Should follow existing API patterns
- "Add validation for Z" - Should go in validation layer, not scattered
- "Add logging for W" - Should use existing logging patterns
Ground Truth
- Human expert labels the "correct" location for each task
- Or: Use real PRs where reviewers requested code be moved
Metrics
- Placement accuracy: Did code end up in the right module/directory?
- Pattern adherence: Did it follow existing conventions?
- Boundary violations: Did it cross architectural boundaries?
- File proliferation: Did it create unnecessary new files?
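One way these four metrics might be computed from per-task results is sketched below. The `PlacementResult` record and its fields are assumptions made for illustration; in particular, `followedPattern` and `crossedBoundary` are assumed to come from a human or automated judgment made elsewhere.

```typescript
// Hypothetical per-task record for one agent run.
interface PlacementResult {
  taskId: string;
  expectedDir: string;      // ground-truth directory for the new code
  actualFile: string;       // file the agent actually wrote to
  createdNewFile: boolean;  // did the agent create a new file?
  newFileExpected: boolean; // did the ground truth call for a new file?
  followedPattern: boolean; // judged adherence to existing conventions
  crossedBoundary: boolean; // judged architectural boundary violation
}

interface EvalSummary {
  placementAccuracy: number;
  patternAdherence: number;
  boundaryViolationRate: number;
  unnecessaryFileRate: number;
}

// True when `file` sits in `dir` or one of its subdirectories.
function isInDirectory(file: string, dir: string): boolean {
  const normalized = dir.replace(/\/+$/, "");
  return file.startsWith(normalized + "/");
}

function summarize(results: PlacementResult[]): EvalSummary {
  const total = results.length;
  if (total === 0) {
    return {
      placementAccuracy: 0,
      patternAdherence: 0,
      boundaryViolationRate: 0,
      unnecessaryFileRate: 0,
    };
  }
  const rate = (pred: (r: PlacementResult) => boolean): number =>
    results.filter(pred).length / total;

  return {
    placementAccuracy: rate((r) => isInDirectory(r.actualFile, r.expectedDir)),
    patternAdherence: rate((r) => r.followedPattern),
    boundaryViolationRate: rate((r) => r.crossedBoundary),
    unnecessaryFileRate: rate((r) => r.createdNewFile && !r.newFileExpected),
  };
}

// Made-up results for illustration only.
const sampleResults: PlacementResult[] = [
  { taskId: "t1", expectedDir: "src/validation", actualFile: "src/validation/email.ts",
    createdNewFile: true, newFileExpected: true, followedPattern: true, crossedBoundary: false },
  { taskId: "t2", expectedDir: "src/validation/", actualFile: "src/emailValidator.ts",
    createdNewFile: true, newFileExpected: false, followedPattern: false, crossedBoundary: false },
  { taskId: "t3", expectedDir: "src/utils", actualFile: "src/api/users.ts",
    createdNewFile: false, newFileExpected: false, followedPattern: false, crossedBoundary: true },
  { taskId: "t4", expectedDir: "src/logging", actualFile: "src/logging/index.ts",
    createdNewFile: false, newFileExpected: false, followedPattern: true, crossedBoundary: false },
];

const summary = summarize(sampleResults);
console.log(summary);
```

Running `summarize` once per agent (Baseline and MCP) over the same task set would give directly comparable numbers.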
Agents
- Baseline: Standard tools (grep, read, glob)
- MCP: Baseline + `get_domain_graph`
Example Scenario
Task: "Add a function to validate email addresses"
Baseline agent might:
- Create `src/emailValidator.ts` (new file, wrong location)
- Or put it in `src/api/users.ts` (wrong layer)
Domain-aware agent should:
- See in the domain graph that `src/validation/` exists with other validators
- See `StringValidator`, `PhoneValidator` already there
- Add `EmailValidator` to `src/validation/` following the pattern
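"Following the pattern" might look roughly like the sketch below. The issue does not show how `StringValidator` or `PhoneValidator` are implemented, so the shared `Validator` interface and the `PhoneValidator` body here are assumed stand-ins, and the email check is intentionally simplistic.

```typescript
// Hypothetical shared interface; the real validators in src/validation/ may differ.
interface Validator {
  validate(input: string): boolean;
}

// Stand-in for an existing validator, shown only to illustrate the convention.
class PhoneValidator implements Validator {
  validate(input: string): boolean {
    return /^\+?[0-9]{7,15}$/.test(input);
  }
}

// New validator: same interface and naming scheme as its siblings.
class EmailValidator implements Validator {
  // Simplified check: non-empty local part, one "@", and a dot in the domain.
  // This is not a full RFC 5322 validation.
  validate(input: string): boolean {
    return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(input);
  }
}

const emailValidator = new EmailValidator();
console.log(emailValidator.validate("user@example.com"));
console.log(emailValidator.validate("not-an-email"));
```

For the eval, what matters is that the new class sits beside its siblings and matches their interface, not the strictness of the regular expression.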
Success Criteria
MCP agent should show:
- Higher placement accuracy (code in right module)
- Fewer boundary violations
- Better pattern adherence
- Fewer unnecessary new files
Dataset Ideas
- Synthetic: Take well-organized repos, create tasks that have clear "right" answers
- Historical: Find PRs where code was moved during review (original = wrong, final = right)
- Expert-labeled: Have developers label correct locations for hypothetical additions
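For the historical option, a labeled example could be derived by comparing where an addition first appeared in a PR with where it lived at merge time. The sketch below assumes that information has already been extracted; `ReviewedAddition` and `toLabeledTasks` are hypothetical names, and the data is made up.

```typescript
// Hypothetical record extracted from a reviewed PR.
interface ReviewedAddition {
  prId: string;            // placeholder identifier
  symbol: string;          // function or class that was added
  firstCommitPath: string; // where the author first put it
  mergedPath: string;      // where it lived when the PR merged
}

interface LabeledTask {
  symbol: string;
  wrongLocation: string;
  rightLocation: string;
}

// Keep only additions that moved during review: original = wrong, final = right.
function toLabeledTasks(additions: ReviewedAddition[]): LabeledTask[] {
  return additions
    .filter((a) => a.firstCommitPath !== a.mergedPath)
    .map((a) => ({
      symbol: a.symbol,
      wrongLocation: a.firstCommitPath,
      rightLocation: a.mergedPath,
    }));
}

// Made-up example data.
const additions: ReviewedAddition[] = [
  { prId: "example-1", symbol: "validateEmail",
    firstCommitPath: "src/api/users.ts", mergedPath: "src/validation/email.ts" },
  { prId: "example-2", symbol: "formatDate",
    firstCommitPath: "src/utils/date.ts", mergedPath: "src/utils/date.ts" },
];

const labeledTasks = toLabeledTasks(additions);
console.log(labeledTasks);
```

A move during review can happen for reasons other than placement, so candidates produced this way would likely still need a human check before being used as ground truth.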
Related
- #83: Consider focusing on call graph + classification as primary tools
- Domain graph provides the architectural context needed for this eval