An intelligent AI agent that autonomously generates, tests, and refines Python parsers for PDF bank statements using iterative feedback loops.
This project implements an autonomous AI agent designed to solve the Agent-as-Coder Challenge. The system leverages Google's Gemini API to generate custom Python parsers for bank statement PDFs, validates outputs against ground truth CSV data, and employs an iterative refinement workflow to achieve high parsing accuracy.
| Feature | Description |
|---|---|
| Autonomous Code Generation | Generates custom Python parsers using Gemini 1.5 Flash |
| Iterative Refinement | Improves code quality through feedback-driven iteration loops |
| Test-Driven Development | Validates parser accuracy against expected CSV output |
| Extensible Architecture | Designed for easy adaptation to multiple bank formats |
| High Accuracy | Achieves up to Better% parsing accuracy through systematic refinement |
| Simple CLI Interface | Easy-to-use command line interface for agent execution |
The agent operates on a generate-test-refine cycle:
graph TD
A[Initialize Agent] --> B[Generate Parser Code]
B --> C[Load & Execute Parser]
C --> D[Compare with Expected CSV]
D --> E{Score > Previous Best?}
E -->|Yes| F[Save as Best Parser]
E -->|No| G[Keep Previous Best]
F --> H{Max Attempts Reached?}
G --> H
H -->|No| I[Generate Feedback]
H -->|Yes| J[Complete - Return Best Parser]
I --> B
style A fill:#e1f5fe
style F fill:#c8e6c9
style J fill:#fff3e0
|
Core Technologies
|
Processing & Testing
|
# System Requirements
Python 3.9+
Git 2.0+-
Clone and Setup
git clone https://github.com/d-kavinraja/ai-agent-challenge.git cd ai-agent-challenge # Create virtual environment python -m venv venv source venv/bin/activate # Windows: venv\Scripts\activate # Install dependencies pip install -r requirements.txt
-
Configure Environment
# Set your Gemini API key as environment variable export GEMINI_API_KEY="your_api_key_here" # For Windows PowerShell: # $Env:GEMINI_API_KEY="your_api_key_here" # For Windows Command Prompt: # set GEMINI_API_KEY=your_api_key_here
-
Run Your First Parser
python agent.py --target icici --attempts 5
[Agent] Attempt 1/5 — generating parser for 'icici'
[Agent] Wrote custom_parsers/icici_parser.py (1247 chars)
[Agent] Score for attempt 1: 45.50%
[Agent] Attempt 2/5 — generating parser for 'icici'
[Agent] Wrote custom_parsers/icici_parser.py (1389 chars)
[Agent] Score for attempt 2: 87.25%
[Agent] Attempt 3/5 — generating parser for 'icici'
[Agent] Wrote custom_parsers/icici_parser.py (1456 chars)
[Agent] Score for attempt 3: 100.00%
[Agent] ✅ Finished 5 attempts
[Agent] Best score: 100.00%
[Agent] Best parser saved at custom_parsers/icici_parser_best.py
ai-agent-challenge/
│
├── agent.py # Main agent logic (generate, test, refine)
├── requirements.txt # Python dependencies
├── README.md # Documentation
│
├── custom_parsers/ # Generated parsers
│ ├── icici_parser.py # Latest generated parser
│ └── icici_parser_best.py # Best-performing parser
│
├── data/ # Sample data
│ └── icici/
│ ├── icici_sample.pdf # Example bank statement
│ └── results.csv # Ground truth CSV
│
└── tests/ # Test suite (pytest)
└── test_icici.py
# Basic usage
python agent.py --target icici
# With custom number of attempts
python agent.py --target icici --attempts 10
# For different banks (ensure data directory exists)
python agent.py --target sbi --attempts 5
python agent.py --target hdfc --attempts 3The generated parsers extract data into the following CSV format:
Date,Description,Debit Amt,Credit Amt,Balance
2023-01-15,"UPI-PAYTM-123456789","500.00","","15000.50"
2023-01-16,"SALARY CREDIT","","50000.00","65000.50"
2023-01-17,"ATM WDL-AXIS BANK","2000.00","","63000.50"
Note: The exact column names must match: Date, Description, Debit Amt, Credit Amt, Balance
# Required Configuration
GEMINI_API_KEY=your_api_key_here # Get from https://aistudio.google.com/Generated parsers must follow this contract:
def parse(pdf_path: str) -> pd.DataFrame:
"""
Parse bank statement PDF and return DataFrame
Args:
pdf_path: Path to the PDF file
Returns:
pandas.DataFrame with columns: Date, Description, Debit Amt, Credit Amt, Balance
"""
# Implementation will be generated by the agent
pass- Code Generation: Uses Gemini API to generate parser code based on bank target
- Dynamic Loading: Loads the generated parser module at runtime
- Execution & Testing: Runs parser on sample PDF and compares with expected CSV
- Scoring: Calculates similarity percentage between actual and expected results
- Feedback Loop: Provides performance feedback to improve next iteration
- Best Tracking: Keeps track of highest-scoring parser across attempts
def score_parser(df_actual: pd.DataFrame, df_expected: pd.DataFrame) -> float:
"""Return similarity score (%) between two DataFrames"""
min_len = min(len(df_actual), len(df_expected))
df_actual = df_actual.head(min_len).reset_index(drop=True)
df_expected = df_expected.head(min_len).reset_index(drop=True)
matches = (df_actual == df_expected).sum().sum()
total = df_expected.size
return (matches / total) * 100 if total > 0 else 0.0To add support for a new bank:
-
Create Data Directory
mkdir -p data/[bank_name]
-
Add Sample Files
data/[bank_name]/ ├── [bank_name]_sample.pdf # Sample bank statement └── results.csv # Expected parsing output -
Run Agent
python agent.py --target [bank_name] --attempts 5
| Metric | Value | Description |
|---|---|---|
| Max Accuracy | 100% | Perfect parsing when successful |
| Processing Time | ~10-30s | Time per attempt (depends on PDF complexity) |
| Success Rate | Variable | Depends on PDF format complexity |
| Memory Usage | <50MB | Lightweight operation |
Parser Generation Fails
# Check API key is set
echo $GEMINI_API_KEY
# Verify internet connection for API callsLow Parsing Scores
- Ensure
results.csvhas correct format and data - Check if PDF is readable by pdfplumber
- Increase number of attempts for better results
Module Import Errors
# Test PDF readability
import pdfplumber
with pdfplumber.open("data/icici/icici_sample.pdf") as pdf:
print(pdf.pages[0].extract_text())pandas>=1.5.0
pdfplumber>=0.7.0
google-generativeai>=0.3.0
pathlib
importlib
# 1. Fork and clone
git clone https://github.com/yourusername/ai-agent-challenge.git
# 2. Create feature branch
git checkout -b feature/new-bank-support
# 3. Make changes and test
python agent.py --target icici --attempts 3
# 4. Commit and push
git commit -m "Add support for new bank"
git push origin feature/new-bank-support
# 5. Open Pull Request- Style: Follow PEP 8 formatting
- Documentation: Add docstrings to functions
- Testing: Test with sample data before submitting
- Error Handling: Include appropriate exception handling
This project is licensed under the MIT License - see the LICENSE file for details.
- Google AI for providing the Gemini API
- Agent-as-Coder Challenge for the project inspiration
- Open Source Community for pdfplumber and pandas libraries