AgentArenaModular

This is the first prototype of an "end-to-end" cycle of Agent Arena.

It covers five stages: (1) data curation, (2) a hacky API for retrieving the data, (3) a fully independent "agent" doing the job, (4) "submitting" the job, and (5) grading via an LLM. The demo is meant to show the full flow of Agent Arena and should not be taken as actual work on the implementation of Agent Arena. To be clear: THIS DEMO IS ONLY MEANT TO BE A COMMUNICATION TOOL.
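To make the five stages concrete, here is a minimal sketch of how they might chain together. Every name in it (curate_data, fetch_task, run_agent, submit, grade, Submission) is hypothetical and does not come from this repository, the task record is made up for illustration, and the exact-match grader is only a stand-in for the LLM grading step.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    task_id: int
    answer: str


def curate_data() -> list[dict]:
    # (1) Data curation: stand-in for loading the curated task records.
    return [{"task_id": 1, "prompt": "Add 2 and 3", "expected": "5"}]


def fetch_task(tasks: list[dict], task_id: int) -> dict:
    # (2) Data API: stand-in for the API the agent calls to get a task.
    return next(t for t in tasks if t["task_id"] == task_id)


def run_agent(task: dict) -> str:
    # (3) Agent: placeholder that always answers "5".
    # A real agent would work on the task independently.
    return "5"


def submit(task: dict, answer: str) -> Submission:
    # (4) Submission: wrap the agent's answer with the task it belongs to.
    return Submission(task_id=task["task_id"], answer=answer)


def grade(task: dict, submission: Submission) -> bool:
    # (5) Grading: exact match stands in for the LLM grader.
    return submission.answer.strip() == task["expected"]


def run_cycle(task_id: int = 1) -> bool:
    # Run one task through all five stages in order.
    tasks = curate_data()
    task = fetch_task(tasks, task_id)
    answer = run_agent(task)
    submission = submit(task, answer)
    return grade(task, submission)


if __name__ == "__main__":
    print("graded correct:", run_cycle())
```

Keeping each stage behind its own function means any one of them (for example the grader) can be swapped out without touching the rest of the cycle.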

Setup Instructions

Prerequisites

Python 3.12.9
pip (Python package installer)
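
If you are unsure which interpreter you are running, a quick check along these lines may help. It is a sketch that assumes any 3.12.x patch release is acceptable, which this README does not confirm; python_is_supported is a hypothetical helper, not part of the project.

```python
import sys

# Assumption: only the major and minor version matter (3.12), not the patch level.
REQUIRED = (3, 12)


def python_is_supported(version_info=sys.version_info, required=REQUIRED) -> bool:
    # Compare only (major, minor) against the required version.
    return tuple(version_info[:2]) == required


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {found} looks fine")
    else:
        print(f"Python {found} found, but 3.12 is expected")
```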

Virtual Environment Setup

Create a virtual environment:

python3.12 -m venv venv

Activate the virtual environment:

On macOS/Linux:

source venv/bin/activate

On Windows:

.\venv\Scripts\activate

Install required packages:

pip install -r requirements.txt

To deactivate the virtual environment when you're done:

deactivate

Note: Make sure to activate the virtual environment before running any Python scripts or notebooks in this project.
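
One way to confirm the environment is active from inside Python is to compare sys.prefix with sys.base_prefix, which differ inside an environment created by the venv module. The helper name in_virtualenv is hypothetical and not part of this project.

```python
import sys


def in_virtualenv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print("Virtual environment is active:", sys.prefix)
    else:
        print("No virtual environment detected; activate venv first.")
```

Running this as the first cell of a notebook is a cheap way to catch a kernel that was started outside the environment.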

Data and Output

Create two directories, data and output. Move df_randomized.csv into data; everything else in the project requires that file to be there.
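
The step above could be scripted roughly as follows. This is a sketch that assumes both directories live at the top level of the repository, which the README does not state explicitly; prepare_dirs is a hypothetical helper. It creates the directories and reports whether the CSV is in place, but does not move the file for you.

```python
from pathlib import Path


def prepare_dirs(root: Path = Path(".")) -> bool:
    """Create data/ and output/ under root; return True if the CSV is in place."""
    data_dir = root / "data"
    output_dir = root / "output"
    # exist_ok=True makes the call safe to repeat on an existing checkout.
    data_dir.mkdir(parents=True, exist_ok=True)
    output_dir.mkdir(parents=True, exist_ok=True)
    return (data_dir / "df_randomized.csv").is_file()


if __name__ == "__main__":
    if not prepare_dirs():
        print("Copy df_randomized.csv into data/ before running the notebooks.")
```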

