A cognitive science-inspired system for detecting reusable code patterns in novice Python programs
This project implements a computational model of analogical reasoning to help novice programmers recognize when their manually written code corresponds to existing Python library abstractions—transforming verbose, loop-heavy implementations into elegant, idiomatic Python.
The Problem: Novice programmers often reinvent the wheel, writing explicit loops for operations that Python's standard library already provides (sum(), max(), filter(), sorted(), etc.).
Our Solution: An intelligent system that:
- ✨ Parses Python code using AST (Abstract Syntax Tree) analysis
- 🔍 Extracts relational features (loops, accumulators, comparisons, append patterns, etc.)
- 🧩 Applies a rule-based cognitive engine that detects high-level patterns (aggregation, filtering, mapping, search, sorting)
- 🎯 Matches detected patterns with a curated database of 15+ library analogies
- 💡 Outputs ranked suggestions with explanations and example usage
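
As a rough illustration of the first step in the list above, the sketch below parses a snippet with Python's standard `ast` module and counts the node types it contains. The helper name `node_type_counts` is hypothetical and is not part of this project's code.

```python
import ast
from collections import Counter


def node_type_counts(source: str) -> Counter:
    """Count AST node types in a snippet (hypothetical helper, not from this repo)."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


snippet = """
def total(nums):
    s = 0
    for n in nums:
        s += n
    return s
"""

counts = node_type_counts(snippet)
# A `For` node together with an `AugAssign` node hints at loop-based accumulation.
print(counts["For"], counts["AugAssign"], counts["Return"])
```

Raw node counts like these are only a starting point; the relational features described in this README depend on where nodes appear relative to each other, not just on whether they occur.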
This tool was developed as part of CS 6795: Cognitive Science at Georgia Tech (Fall 2025), implementing principles from cognitive science research on analogical reasoning:
- Analogical Mapping (Gentner's Structure-Mapping Theory)
- Rule-Based Reasoning (Production Systems)
- Feature-Based Similarity (Holyoak and Thagard's ACME model)
The system models how expert programmers recognize code patterns through analogical transfer from well-known library functions (the source domain) to novel code snippets (the target domain).
| Feature | Description |
|---|---|
| 🖥️ Interactive CLI Mode | Paste code directly and receive instant analogies |
| 📁 Batch Processing | Analyze files from disk for bulk evaluation |
| 🌳 AST-Based Analysis | Structural code parsing for robust pattern detection |
| 🧠 Cognitive Rule Engine | IF-THEN rules that mimic human analogical reasoning |
| 📚 Rich Analogy Database | 15+ library functions with explanations + examples |
| 📊 Evaluation Suite | 10 novice-like test snippets with ground truth validation |
| 🔧 Fully Extensible | Add custom rules, patterns, or analogies with ease |
analogical_code_tool/
├── 📄 analogy_db.json # Library function analogies (source domain)
├── 🔍 parser_features.py # Extracts relational structure from AST
├── 📋 rules.py # IF-THEN rules (analogical mapping logic)
├── ⚙️ engine.py # Core reasoning engine
├── 💻 cli.py # Interactive + batch CLI interface
├── 📊 evaluate.py # Accuracy evaluation script
├── 📝 example_snippet.py # Demo snippet (sum implementation)
├── 📂 test_snippets/ # Novice-like code snippets for testing
│ ├── snippet1_sum.py # Loop-based sum → sum()
│ ├── snippet2_mean.py # Average calculation → statistics.mean()
│ ├── snippet3_filter.py # Conditional filtering → filter()
│ ├── snippet4_map.py # Element transformation → map()
│ ├── snippet5_max.py # Manual maximum search → max()
│ ├── snippet6_any.py # Existence check → any()
│ ├── snippet7_sort.py # Bubble sort → sorted()
│ ├── snippet8_filter_negatives.py # Filter pattern variant
│ ├── snippet9_map_square.py # Map pattern variant
│ └── snippet10_custom_max.py # Max pattern variant
└── 🎯 ground_truth.json # Expected analogies for evaluation
Zero external dependencies required! Just Python 3.9+:
git clone https://github.com/toyeade1/CS-6795-Project-.git
cd analogical_code_tool

That's it! The tool uses only Python's standard library.
Paste any Python snippet and type END when finished:
python cli.py --interactive

Example Session:
Analogical Code Tool - Interactive Mode
Paste your Python code snippet below.
When you are done, enter a line containing only 'END' and press Enter.
Enter snippet (type 'END' on its own line to finish):
def total(nums):
    s = 0
    for n in nums:
        s += n
    return s
END
Analyzing snippet...
Detected analogies (rule-based):
1. sum (module: builtins, pattern: aggregation)
Why: Sums an iterable of numbers and returns the total.
Example usage:
total = sum(nums)
Score: 4.00
Analyze another snippet? [y/N]:
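
The `END` convention in the session above could be implemented along the following lines. This is a minimal sketch under assumptions: `read_until_end` is a hypothetical helper, and the actual input handling in `cli.py` may differ.

```python
from typing import Iterable, List


def read_until_end(lines: Iterable[str], sentinel: str = "END") -> str:
    """Collect lines until one that contains only the sentinel (hypothetical helper)."""
    collected: List[str] = []
    for line in lines:
        if line.strip() == sentinel:
            break
        collected.append(line.rstrip("\n"))
    return "\n".join(collected)


# In a real CLI the iterable could be sys.stdin; a list keeps the example testable.
pasted = ["def total(nums):\n", "    s = 0\n", "    return s\n", "END\n", "ignored\n"]
snippet = read_until_end(pasted)
print(snippet)
```

Only trailing newlines are stripped, so the indentation of the pasted code is preserved for the parser.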
python cli.py --file path/to/your_code.py

Example:

python cli.py --file test_snippets/snippet5_max.py

python cli.py --code "def square(nums): result=[n*n for n in nums]; return result"

Output:
Detected analogies (rule-based):
1. map (module: builtins, pattern: map)
Why: Applies a function to each item of an iterable.
Example usage:
squares = list(map(lambda x: x * x, nums))
Score: 2.00
2. list-comprehension-map (module: syntax, pattern: map)
Why: List comprehension for mapping elements.
Example usage:
squares = [x * x for x in nums]
Score: 2.00
python cli.py --file test.py --top-k 10

Evaluate the tool's accuracy on 10 curated novice-like snippets:

python evaluate.py

Sample Output:
Snippet: snippet1_sum.py
Expected: {'sum'}
Suggested: ['sum', 'statistics.mean', 'numpy.mean']
Snippet: snippet5_max.py
Expected: {'max'}
Suggested: ['max', 'min', 'any']
...
Total snippets: 10
Top-1 accuracy: 0.80
Top-3 accuracy: 0.90
The evaluation compares system suggestions against human-defined ground truth in ground_truth.json.
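
Top-k accuracy of the kind reported above can be computed as sketched below. The function name, the dictionary shapes, and the toy data are assumptions made for illustration; they are not taken from `evaluate.py` or `ground_truth.json`.

```python
from typing import Dict, List, Set


def top_k_accuracy(
    suggested: Dict[str, List[str]],
    expected: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of snippets whose first k suggestions include an expected name."""
    if not expected:
        return 0.0
    hits = sum(
        1
        for name, truth in expected.items()
        if truth & set(suggested.get(name, [])[:k])
    )
    return hits / len(expected)


# Toy data for illustration only (not the project's real results).
expected = {"a.py": {"sum"}, "b.py": {"max"}}
suggested = {"a.py": ["sum", "statistics.mean"], "b.py": ["min", "max"]}

print(top_k_accuracy(suggested, expected, 1))  # 0.5
print(top_k_accuracy(suggested, expected, 3))  # 1.0
```

A snippet counts as a hit if any expected name appears among its first k suggestions, which is why the second toy snippet only scores at k = 3.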
┌─────────────────┐
│  Code Snippet   │  ← Novice-written Python code
└────────┬────────┘
         ↓
┌─────────────────┐
│   AST Parser    │  ← Converts code to Abstract Syntax Tree
└────────┬────────┘
         ↓
┌─────────────────┐
│    Feature      │  ← Extracts relational features:
│   Extractor     │     • has_loop, loop_type
└────────┬────────┘     • uses_accumulator, accumulator_operation
         ↓              • appends_to_list, conditional_inside_loop
┌─────────────────┐     • uses_comparison, has_swap_pattern
│  Rule Engine    │  ← Applies IF-THEN rules:
└────────┬────────┘     • IF accumulator+add+loop → "aggregation"
         ↓              • IF append+conditional → "filter"
┌─────────────────┐     • IF swap+comparison → "sorting"
│   Analogy DB    │  ← Matches patterns to library functions
└────────┬────────┘
         ↓
┌─────────────────┐
│     Ranked      │  ← Scored by feature overlap
│   Suggestions   │
└─────────────────┘
| Component | Domain | Description |
|---|---|---|
| Target Domain | Novice Code | Loop-based implementations with explicit control flow |
| Source Domain | Library Functions | High-level abstractions like sum(), filter(), sorted() |
| Mapping Rules | Production Rules | IF accumulator + add operation + loop → aggregation pattern |
| Similarity Measure | Feature Overlap | Scores candidates by matching structural features |
Example Rule:
Rule(
    name="aggregation_rule",
    pattern_label="aggregation",
    condition_fn=lambda f: (
        f.get("has_loop") and
        f.get("uses_accumulator") and
        f.get("accumulator_operation") == "add" and
        f.get("returns_accumulator")
    )
)

This models analogical reasoning as studied in cognitive science literature (Gentner, Holyoak, Thagard, etc.).
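
To show how a rule like the one above might be evaluated against extracted features, here is a minimal sketch. The `Rule` dataclass shape, the `matches` method, and `detect_patterns` are assumptions; the real definitions in `rules.py` and `engine.py` may differ.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Features = Dict[str, Any]


@dataclass
class Rule:
    """Assumed shape of a rule; the real class in rules.py may differ."""

    name: str
    pattern_label: str
    condition_fn: Callable[[Features], Any]

    def matches(self, features: Features) -> bool:  # hypothetical method
        return bool(self.condition_fn(features))


aggregation_rule = Rule(
    name="aggregation_rule",
    pattern_label="aggregation",
    condition_fn=lambda f: (
        f.get("has_loop")
        and f.get("uses_accumulator")
        and f.get("accumulator_operation") == "add"
        and f.get("returns_accumulator")
    ),
)


def detect_patterns(rules: List[Rule], features: Features) -> List[str]:
    """Return the pattern labels of every rule whose condition holds."""
    return [rule.pattern_label for rule in rules if rule.matches(features)]


loop_sum_features = {
    "has_loop": True,
    "uses_accumulator": True,
    "accumulator_operation": "add",
    "returns_accumulator": True,
}

print(detect_patterns([aggregation_rule], loop_sum_features))   # ['aggregation']
print(detect_patterns([aggregation_rule], {"has_loop": True}))  # []
```

Using `f.get(...)` means a missing feature evaluates as falsy instead of raising `KeyError`, so a rule simply fails to fire on snippets that lack the features it checks.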
Edit analogy_db.json:
{
  "name": "any",
  "module": "builtins",
  "pattern": "search",
  "features": {
    "uses_comparison": true,
    "returns_accumulator": true
  },
  "example_usage": "has_positive = any(x > 0 for x in nums)",
  "explanation": "Returns True if any element of the iterable is truthy."
}

Edit rules.py:
Rule(
    name="comprehension_filter",
    pattern_label="filter",
    condition_fn=lambda f: (
        f.get("uses_comparison") and
        f.get("appends_to_list")
    )
)

Modify parser_features.py to detect additional structural patterns:
def visit_ListComp(self, node: ast.ListComp) -> Any:
    """Detect list comprehension patterns."""
    self.features["has_list_comprehension"] = True
    self.generic_visit(node)

| Pattern | Description | Example Libraries |
|---|---|---|
| aggregation | Accumulates values using add/multiply operations | sum(), math.prod(), statistics.mean() |
| filter | Conditionally selects elements | filter(), list comprehensions with if |
| map | Transforms each element | map(), list comprehensions |
| search | Finds elements via comparison | max(), min(), any(), all() |
| sorting | Reorders elements | sorted(), list.sort() |
| iteration | Generic loop patterns | for/while loops |
# Before: Manual accumulation
def total(nums):
    s = 0
    for n in nums:
        s += n
    return s

# After: Built-in function
total = sum(nums)

# Before: Conditional append
def get_positives(nums):
    result = []
    for n in nums:
        if n > 0:
            result.append(n)
    return result

# After: filter() or comprehension
positives = [x for x in nums if x > 0]

# Before: Transform and append
def squares(nums):
    result = []
    for n in nums:
        result.append(n * n)
    return result

# After: List comprehension
squares = [x * x for x in nums]

The FeatureExtractor (AST visitor) detects:
- Loop Structures: has_loop, loop_type (for/while)
- Accumulator Patterns: uses_accumulator, accumulator_operation (add/mult)
- List Building: appends_to_list
- Conditional Logic: conditional_inside_loop, uses_comparison
- Sorting Cues: has_swap_pattern, iterates_over_indices
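
Several of the features listed above can be detected with an `ast.NodeVisitor` along the following lines. This is a simplified sketch, not the project's actual extractor: the detection heuristics are assumptions, and swap or index-iteration detection is omitted.

```python
import ast
from typing import Any, Dict


class FeatureExtractor(ast.NodeVisitor):
    """Simplified sketch; the real extractor in parser_features.py may differ."""

    def __init__(self) -> None:
        self.features: Dict[str, Any] = {
            "has_loop": False,
            "loop_type": None,
            "uses_accumulator": False,
            "accumulator_operation": None,
            "appends_to_list": False,
            "conditional_inside_loop": False,
            "uses_comparison": False,
        }
        self._loop_depth = 0  # > 0 while visiting a loop body

    def _visit_loop(self, node: ast.AST, kind: str) -> None:
        self.features["has_loop"] = True
        self.features["loop_type"] = kind  # records the last loop visited
        self._loop_depth += 1
        self.generic_visit(node)
        self._loop_depth -= 1

    def visit_For(self, node: ast.For) -> None:
        self._visit_loop(node, "for")

    def visit_While(self, node: ast.While) -> None:
        self._visit_loop(node, "while")

    def visit_AugAssign(self, node: ast.AugAssign) -> None:
        # `s += n` or `p *= n` inside a loop is treated as accumulation.
        if self._loop_depth and isinstance(node.op, (ast.Add, ast.Mult)):
            self.features["uses_accumulator"] = True
            self.features["accumulator_operation"] = (
                "add" if isinstance(node.op, ast.Add) else "mult"
            )
        self.generic_visit(node)

    def visit_If(self, node: ast.If) -> None:
        if self._loop_depth:
            self.features["conditional_inside_loop"] = True
        self.generic_visit(node)

    def visit_Compare(self, node: ast.Compare) -> None:
        self.features["uses_comparison"] = True
        self.generic_visit(node)

    def visit_Call(self, node: ast.Call) -> None:
        # Matches calls of the form `<anything>.append(...)`.
        if isinstance(node.func, ast.Attribute) and node.func.attr == "append":
            self.features["appends_to_list"] = True
        self.generic_visit(node)


def extract_features(source: str) -> Dict[str, Any]:
    extractor = FeatureExtractor()
    extractor.visit(ast.parse(source))
    return extractor.features


source = """
def get_positives(nums):
    result = []
    for n in nums:
        if n > 0:
            result.append(n)
    return result
"""
features = extract_features(source)
print(features)
```

For this snippet the sketch flags a `for` loop, a conditional inside the loop, a comparison, and a list append, while `uses_accumulator` stays `False`.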
7 production rules map features to high-level patterns:
- Aggregation Rule (add-based accumulation)
- Product Aggregation Rule (multiply-based accumulation)
- Filter Rule (conditional append)
- Map Rule (unconditional append)
- Search Rule (comparison + accumulator)
- Sorting Rule (swap pattern + comparison)
- Generic Iteration Rule (fallback)
score = 1.0  (base score for pattern match)
for each matching feature:
    score += 1.0
Candidates are ranked by descending score, with top-k returned.
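
The scoring scheme above can be sketched in Python as follows. Two assumptions are made here: a feature "matches" when the candidate's entry lists it with the same value the snippet has, and database entries follow the `name` / `pattern` / `features` shape used in `analogy_db.json`. The `max` and `sum` feature sets in the toy database are invented for illustration.

```python
from typing import Any, Dict, List, Tuple

Entry = Dict[str, Any]


def score_candidate(snippet_features: Dict[str, Any], candidate: Entry) -> float:
    """Base score of 1.0 plus 1.0 per candidate feature the snippet also has."""
    score = 1.0
    for key, value in candidate.get("features", {}).items():
        if snippet_features.get(key) == value:
            score += 1.0
    return score


def rank(
    snippet_features: Dict[str, Any],
    pattern: str,
    database: List[Entry],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score entries that share the detected pattern and keep the top k."""
    candidates = [entry for entry in database if entry["pattern"] == pattern]
    scored = [(e["name"], score_candidate(snippet_features, e)) for e in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy database; only the "any" feature set mirrors the example entry in this README.
database = [
    {"name": "any", "pattern": "search",
     "features": {"uses_comparison": True, "returns_accumulator": True}},
    {"name": "max", "pattern": "search",
     "features": {"uses_comparison": True, "uses_accumulator": True,
                  "returns_accumulator": True}},
    {"name": "sum", "pattern": "aggregation",
     "features": {"uses_accumulator": True}},
]
snippet_features = {
    "uses_comparison": True,
    "uses_accumulator": True,
    "returns_accumulator": True,
}

ranked = rank(snippet_features, "search", database)
print(ranked)  # [('max', 4.0), ('any', 3.0)]
```

Because every matching feature adds the same amount, entries that list more features can outrank simpler ones; weighting features differently would be one way to adjust that behavior.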
| Metric | Score |
|---|---|
| Top-1 Accuracy | ~40-50% |
| Top-3 Accuracy | ~80% |
| Test Suite Size | 10 novice snippets |
- Support for more complex patterns (nested loops, multiple return paths)
- Integration with IDEs (VS Code extension, PyCharm plugin)
- Machine learning-based feature extraction
- Natural language explanations powered by LLMs
- Multi-language support (JavaScript, Java, C++)
- Real-time code suggestions during typing
MIT License
Copyright (c) 2025
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
This tool was created as part of:
CS 6795: Cognitive Science — Georgia Tech (Fall 2025)
For questions, suggestions, or collaboration opportunities, please open an issue or reach out via the course portal.