Agent Smith

An AI coding agent CLI tool that can autonomously read, write, and execute Python files within a sandboxed working directory. Built with Python and Google's Gemini API.

Overview

Agent Smith is an AI agent that uses function calling to interact with a filesystem. It can iteratively work through complex multi-step tasks by:

Listing and exploring files in a directory
Reading file contents
Executing Python files with arguments
Writing or modifying files
Building context across multiple iterations to solve problems

The agent operates within a sandboxed working directory (./calculator) to ensure security.

Features

Agent Loop: Iteratively calls the LLM up to 20 times, maintaining conversation history to work through complex tasks
Function Calling: Four core functions for filesystem and code execution operations
Conversation History: Full context maintained across iterations for intelligent decision-making
Sandboxed Execution: All operations confined to a specified working directory
Verbose Mode: Optional detailed logging of token usage and function calls

Prerequisites

Python 3.10 or higher
uv package manager
Google Gemini API key

Setup

Clone the repository:

git clone <repository-url>
cd agent-smith

Install dependencies:
```
uv sync
```
Set up environment variables: Create a .env file in the project root:
```
GEMINI_API_KEY=your_api_key_here
```

Usage

Basic Usage

uv run python main.py "your prompt here"

Verbose Mode

Get detailed output including token counts and function call details:

uv run python main.py "your prompt here" --verbose

Example Prompts

# Ask questions about the code
uv run python main.py "How does the calculator render results to the console?"

# List and explore files
uv run python main.py "List all Python files in the working directory"

# Fix bugs
uv run python main.py "Fix the calculator and run the tests"

# Read and analyze code
uv run python main.py "Read the calculator code and explain how it works"

# Execute code
uv run python main.py "Run the calculator tests and tell me if they pass"

Architecture

Directory Structure

agent-smith/
├── main.py                 # Entry point and agent loop
├── call_function.py        # Function dispatcher
├── prompts.py             # System prompt
├── config.py              # Configuration constants
├── .env                   # Environment variables (API key)
├── tests.py               # Test suite for agent
├── functions/             # Function implementations
│   ├── get_files_info.py
│   ├── get_file_content.py
│   ├── run_python_file.py
│   └── write_file.py
└── calculator/            # Example project (working directory)
    ├── main.py           # Calculator CLI
    ├── tests.py          # Calculator tests
    └── pkg/
        ├── calculator.py # Calculator logic
        └── render.py     # Output formatting

Core Components

main.py - Entry point:

Parses CLI arguments (prompt and optional --verbose flag)
Initializes Gemini client with API key
Implements the agent loop (up to 20 iterations)
Maintains conversation history across iterations
Prints final response or error message

call_function.py - Function dispatcher:

Defines available_functions tool with 4 function declarations
Routes function calls to appropriate handlers in functions/ directory
Injects WORKING_DIR from config for security
Returns function results in Gemini's expected format

prompts.py - System prompt:

Instructs the AI on available operations
Defines security model (relative paths only)
Guides the agent's behavior

config.py - Configuration:

MAX_CHARS = 10000 - File read character limit
WORKING_DIR = "./calculator" - Sandboxed working directory

Agent Loop

The agent loop in generate_content() works as follows:

Call Gemini API with current conversation history
Add response candidates to conversation history
Check for function calls:
- If none: Print final response and exit
- If present: Execute all function calls
Add function results to conversation history as user messages
Repeat until final response or max iterations (20) reached

This allows the agent to iteratively build context and work through complex multi-step tasks.

Available Functions

All functions enforce security by validating paths stay within working_directory:

get_files_info(directory=None)

Lists files in a directory with sizes and is_dir flags
Default: lists working directory contents
Returns file metadata in a formatted string

get_file_content(file_path)

Reads first 10,000 characters of a file
Appends truncation notice if file is larger
Only accepts relative paths within working directory

run_python_file(file_path, arguments=None)

Executes Python files with optional arguments
30 second timeout
Returns STDOUT, STDERR, and exit code
Only executes .py files

write_file(file_path, content)

Creates or overwrites files
Auto-creates parent directories if needed
Validates path is within working directory

Security Model

All function calls require paths relative to the working directory. The working_directory parameter is automatically injected by call_function() and validated by each function using os.path.abspath() to prevent directory traversal attacks.

Configuration

Environment Variables

GEMINI_API_KEY (required)

Your Google Gemini API key
Loaded from .env file via python-dotenv

Iteration Limit

The agent loop is limited to 20 iterations to prevent infinite loops and excessive token usage. This can be adjusted in main.py:

for _ in range(20):  # Change this number to adjust the limit

Working Directory

The sandboxed working directory is set in config.py:

WORKING_DIR = "./calculator"

Change this to point to a different directory if needed, but be cautious about filesystem access.

Example: Calculator Project

The repository includes a sample calculator project in ./calculator/ that the agent can work with:

calculator/main.py - CLI calculator:

Takes mathematical expressions as arguments
Uses Calculator class for evaluation
Outputs JSON-formatted results

calculator/pkg/calculator.py - Core logic:

Supports +, -, *, / operators
Implements operator precedence
Two-stack algorithm for expression evaluation

calculator/tests.py - Test suite for the calculator

Example Session

$ uv run python main.py "How do I fix the calculator?"
 - Calling function: get_files_info
 - Calling function: get_file_content
 - Calling function: run_python_file
 - Calling function: get_file_content
 - Calling function: write_file
 - Calling function: run_python_file

I found a bug in the calculator where it wasn't handling operator precedence correctly.
I've fixed the issue in pkg/calculator.py and verified the fix by running the tests.
All tests now pass!

Safety Considerations

IMPORTANT: This tool gives an LLM access to your filesystem and Python interpreter. Use with caution:

Always work in a sandboxed directory
Commit your changes before running the agent on important codebases
Review any file modifications the agent suggests
Don't give the agent access to sensitive directories
Be aware of API rate limits and costs
Monitor the agent's actions, especially in verbose mode
If you used the Gemini API on the paid tier, be sure to delete your API key when you're all finished to avoid unexpected charges

Extending the Project

You've completed the required steps, but have some fun with it! (Carefully, though... be very cautious about giving an LLM access to your filesystem and Python interpreter.) See if you can get it to:

Fix harder and more complex bugs
Refactor sections of code
Add entirely new features

You can also try:

Other LLM providers (OpenAI, Anthropic, etc.)
Other Gemini models (gemini-2.0-pro, etc.)
Giving it more functions to call (install packages, run git commands, etc.)
Other codebases (commit your changes before running the agent on a codebase, so you can always revert)

Remember: What we've built is a toy version of something like Cursor/Zed's Agentic Mode, or Claude Code. Even their tools aren't perfectly secure, so be careful what you give them access to. And don't encourage anyone to use this toy agent as-is!

Troubleshooting

Rate Limit Errors:

Gemini free tier has a limit of 5 requests per minute
Wait ~30 seconds between tests if you hit the limit
Consider upgrading to a paid tier for higher limits

Module Not Found:

Ensure you're running with uv run python main.py to use the virtual environment
Run uv sync to install dependencies

API Key Issues:

Verify your .env file exists and contains GEMINI_API_KEY=...
Check your API key is valid at https://ai.google.dev/

License

This project was created as part of a Boot.dev course exercise.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.vscode		.vscode
calculator		calculator
functions		functions
.gitignore		.gitignore
.python-version		.python-version
CLAUDE.md		CLAUDE.md
README.md		README.md
call_function.py		call_function.py
config.py		config.py
main.py		main.py
prompts.py		prompts.py
pyproject.toml		pyproject.toml
tests.py		tests.py
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Agent Smith

Overview

Features

Prerequisites

Setup

Usage

Basic Usage

Verbose Mode

Example Prompts

Architecture

Directory Structure

Core Components

Agent Loop

Available Functions

Security Model

Configuration

Environment Variables

Iteration Limit

Working Directory

Example: Calculator Project

Example Session

Safety Considerations

Extending the Project

Troubleshooting

License

About

Uh oh!

Releases

Packages

Languages

jasonbland/agent-smith

Folders and files

Latest commit

History

Repository files navigation

Agent Smith

Overview

Features

Prerequisites

Setup

Usage

Basic Usage

Verbose Mode

Example Prompts

Architecture

Directory Structure

Core Components

Agent Loop

Available Functions

Security Model

Configuration

Environment Variables

Iteration Limit

Working Directory

Example: Calculator Project

Example Session

Safety Considerations

Extending the Project

Troubleshooting

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages