Open
Description
Hypothesis
An agent using call graphs and dependency graphs will find dead code more accurately and efficiently than an agent using grep/search alone.
Rationale
Finding dead code requires answering: "Is this function/class reachable from any entry point?"
Grep approach:
- Search for function name as string
- Many spurious matches (comments, strings, similar names), which make dead functions look used
- Misses indirect calls (callbacks, dynamic dispatch)
- Can't trace through call chains
- Agent burns iterations on false leads
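To make the spurious-match problem concrete, here is a minimal sketch of a grep-style search. The source snippet and the function names in it (`old_parse`, `parse_config`) are hypothetical, made up purely for illustration.

```python
import re

# Hypothetical fixture source, not taken from any real repository.
SOURCE = '''\
def parse_config(path):
    return path

def old_parse(path):
    # old_parse was replaced by parse_config
    return path

def main():
    print("falling back to old_parse is not supported")
    parse_config("app.toml")
'''


def grep_mentions(name: str, source: str) -> list[str]:
    """Return every line mentioning `name` as a whole word, skipping its own def line."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    return [
        line.strip()
        for line in source.splitlines()
        if pattern.search(line) and not line.lstrip().startswith(f"def {name}(")
    ]


# Both hits are a comment and a string literal; neither is a call site.
hits = grep_mentions("old_parse", SOURCE)
for hit in hits:
    print(hit)
```

A text-only agent has to open and read each hit to learn that `old_parse` is never actually called, which is where the extra iterations go.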
Graph approach:
- Query call graph for incoming edges to function
- Zero false positives from comments/strings
- Can trace full call chain to entry points
- Dependency graph shows if module is imported anywhere
- Direct answer: "nothing calls this" vs "these 3 functions call this"
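The graph queries above can be sketched as follows. This assumes the call graph is available as a plain caller-to-callees mapping; the actual output shape of `get_call_graph` is not specified in this issue, and the function names in `CALL_GRAPH` are hypothetical.

```python
from collections import deque

# Hypothetical call graph: caller -> set of callees.
CALL_GRAPH = {
    "main": {"parse_config", "run"},
    "run": {"log"},
    "parse_config": set(),
    "log": set(),
    "old_parse": {"old_helper"},
    "old_helper": set(),
}


def callers_of(graph: dict[str, set[str]], target: str) -> list[str]:
    """Incoming edges: every function that calls `target` directly."""
    return sorted(caller for caller, callees in graph.items() if target in callees)


def reachable_from(graph: dict[str, set[str]], entry_points: list[str]) -> set[str]:
    """Breadth-first traversal from the entry points along call edges."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        for callee in graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def dead_functions(graph: dict[str, set[str]], entry_points: list[str]) -> list[str]:
    """Every known function that no entry point can reach."""
    all_fns = set(graph) | {c for callees in graph.values() for c in callees}
    return sorted(all_fns - reachable_from(graph, entry_points))


dead = dead_functions(CALL_GRAPH, ["main"])
print(callers_of(CALL_GRAPH, "old_parse"))   # no callers at all
print(callers_of(CALL_GRAPH, "old_helper"))  # one caller, but that caller is dead
print(dead)
```

Note that `old_helper` has an incoming edge yet is still dead, so checking incoming edges alone is not enough; reachability from entry points is what settles the question.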
Proposed Eval
Setup
- Select repos with known dead code (or inject dead functions into test repos)
- Ground truth: list of functions that are actually unreachable
Metrics
- Precision: What % of reported dead code is actually dead?
- Recall: What % of actual dead code was found?
- Iterations: How many tool calls to reach conclusion?
- False positives: Number of live functions incorrectly flagged as dead
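A minimal sketch of how precision, recall, and false positives could be computed from an agent's report and the ground-truth list. The function name `score` and the example function names are hypothetical; returning 0.0 for an empty report is one possible convention, not something this issue prescribes.

```python
def score(reported: set[str], ground_truth: set[str]) -> dict:
    """Compare the functions an agent flagged as dead against the known dead set."""
    true_positives = reported & ground_truth
    false_positives = reported - ground_truth
    # Convention chosen here: an empty report (or empty ground truth) scores 0.0.
    precision = len(true_positives) / len(reported) if reported else 0.0
    recall = len(true_positives) / len(ground_truth) if ground_truth else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "false_positives": sorted(false_positives),
    }


# Hypothetical run: two correct findings, one live function wrongly flagged,
# and one dead function missed.
result = score(
    reported={"old_parse", "old_helper", "log"},
    ground_truth={"old_parse", "old_helper", "unused_util"},
)
print(result)
```

Iterations would be tracked separately, for example by counting tool calls in the agent's transcript.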
Test Cases
- Simple unused function (no callers)
- Function only called by other dead code (transitively dead)
- Function with name that appears in comments but no actual calls
- Function called via callback/dynamic dispatch (tricky for both approaches)
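As an illustration, the four cases could live in one small fixture like the sketch below. All names are hypothetical, and the direct-call check uses Python's `ast` module as a stand-in for a naive "does anything call this by name?" test; it is not the proposed graph tooling.

```python
import ast

# Hypothetical fixture covering the four test cases.
FIXTURE = '''\
def unused_simple():          # case 1: no callers
    return 1

def dead_leaf():              # case 2: only called by dead_root, which is dead
    return 2

def dead_root():
    return dead_leaf()

# legacy_export used to be called from main
def legacy_export():          # case 3: name appears only in a comment
    return 3

def on_event():               # case 4: live, but only called via the dispatch table
    return 4

HANDLERS = {"event": on_event}

def main():                   # entry point
    return HANDLERS["event"]()
'''

GROUND_TRUTH_DEAD = {"unused_simple", "dead_leaf", "dead_root", "legacy_export"}

tree = ast.parse(FIXTURE)
defined = {n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
# Only calls of the form name(...) count; HANDLERS["event"]() is not one of them.
direct_calls = {
    n.func.id
    for n in ast.walk(tree)
    if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
}
never_called_directly = defined - direct_calls

print(sorted(never_called_directly - GROUND_TRUTH_DEAD))  # live, but looks dead
print(sorted(GROUND_TRUTH_DEAD - never_called_directly))  # dead, but looks used
```

The naive check both flags the dynamically dispatched `on_event` and misses the transitively dead `dead_leaf`, which is why cases 2 and 4 are worth including separately.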
Agents
- Baseline: Standard tools (grep, read, glob)
- MCP: Baseline + get_call_graph, get_dependency_graph
Success Criteria
MCP agent should show:
- Higher precision (fewer false positives)
- Comparable or better recall
- Fewer iterations to reach conclusion
Related
- #81: Expose individual graph types as separate MCP tools (enables this eval)
- dead-code-hunter GitHub Action uses this approach
Notes
This eval tests a concrete use case where graphs should clearly outperform text search. Unlike SWE-bench (bug fixing), dead code detection fundamentally requires understanding call relationships.