IOI Agent

An AI agent for solving International Olympiad in Informatics (IOI) competitive programming problems. This is Vals AI's evaluation harness for measuring large language model performance on the IOI. It's based on our Finance Agent harness, which we've also open-sourced.

Overview

The IOI Agent evaluates AI models on competitive programming problems from the International Olympiad in Informatics, testing their ability to:

Understand complex algorithmic problem statements
Design efficient solutions with appropriate data structures and algorithms
Implement correct C++ code that passes all test cases
Work within IOI constraints (subtask-based scoring, time/memory limits)

How It Works

Problem Loading: The agent loads IOI problem statements and test cases
Conversation Flow: The AI model reasons through the problem in structured turns
Code Testing: Built-in C++ executor allows experimentation and debugging
Submission: Submitted solutions are evaluated against official IOI test cases
Scoring: Uses IOI's subtask-based scoring system (all tests in a subtask must pass)
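
The subtask rule in the last step can be illustrated with a minimal sketch. It assumes a subtask awards its full points only when every one of its tests passes, as described above. The `Subtask` class, the `score_submission` function, and the sample data are hypothetical names made up for illustration; they are not taken from the harness code.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    """Hypothetical container: a named group of tests worth a fixed number of points."""
    name: str
    points: float
    test_ids: list[str]


def score_submission(subtasks: list[Subtask], passed: dict[str, bool]) -> float:
    """Sum the points of every subtask whose tests all passed.

    A test missing from `passed` is treated as a failure.
    """
    total = 0.0
    for subtask in subtasks:
        if all(passed.get(test_id, False) for test_id in subtask.test_ids):
            total += subtask.points
    return total


# Illustrative data only: three subtasks that share some tests.
subtasks = [
    Subtask("small", 20, ["t1", "t2"]),
    Subtask("medium", 30, ["t1", "t2", "t3"]),
    Subtask("full", 50, ["t1", "t2", "t3", "t4"]),
]
results = {"t1": True, "t2": True, "t3": True, "t4": False}

print(score_submission(subtasks, results))  # 50.0 (small + medium; full fails on t4)
```

A single failing test (`t4` here) forfeits the whole subtask that contains it, which is why partially correct solutions can still score well below the maximum.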

Evaluation Limits

Maximum 50 submissions per problem
Maximum 100 conversation turns per session
C++20 compilation with standard IOI time/memory constraints
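
As a rough sketch of how the two caps above could interact, the loop below stops a session once either limit is reached. It is an assumption that exhausting submissions ends the session; the harness may handle that case differently. `run_session` and the `step` callback are hypothetical and not part of the repository.

```python
from typing import Callable

MAX_SUBMISSIONS = 50  # per problem, from the limits above
MAX_TURNS = 100       # per session, from the limits above


def run_session(step: Callable[[int], str]) -> tuple[int, int]:
    """Drive a hypothetical agent until it stops or hits a limit.

    `step(turn)` stands in for one model turn and returns one of
    "submit", "continue", or "stop". Returns (turns_used, submissions_used).
    """
    turns = 0
    submissions = 0
    while turns < MAX_TURNS and submissions < MAX_SUBMISSIONS:
        action = step(turns)
        turns += 1
        if action == "submit":
            submissions += 1
        elif action == "stop":
            break
    return turns, submissions


# An agent that submits on every turn runs out of submissions first.
print(run_session(lambda turn: "submit"))    # (50, 50)
# An agent that never submits runs out of turns instead.
print(run_session(lambda turn: "continue"))  # (100, 0)
```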

Quick Start

Requirements

Python 3.11+
g++ compiler with support for C++20 and the bits/stdc++.h header
Access to Vals model proxy for LLM integration
Git LFS for test cases
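
To confirm the compiler requirement before running the agent, a small probe like the following may help. It is a sketch, not part of the repository: `gxx_supports_cpp20` is a hypothetical helper that tries to compile a tiny program using `bits/stdc++.h` and a C++20 library feature with `-std=c++20`, and reports whether compilation succeeded.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Tiny program that needs both bits/stdc++.h and a C++20 feature (std::ranges).
PROBE_SOURCE = """\
#include <bits/stdc++.h>
int main() {
    std::vector<int> v{3, 1, 2};
    std::ranges::sort(v);
    return v[0] == 1 ? 0 : 1;
}
"""


def gxx_supports_cpp20(compiler: str = "g++") -> bool:
    """Return True if `compiler` is on PATH and compiles the probe with -std=c++20."""
    if shutil.which(compiler) is None:
        return False
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "probe.cpp"
        binary = Path(tmp) / "probe"
        source.write_text(PROBE_SOURCE)
        result = subprocess.run(
            [compiler, "-std=c++20", str(source), "-o", str(binary)],
            capture_output=True,
            text=True,
        )
        return result.returncode == 0


if __name__ == "__main__":
    print("g++ with C++20 and bits/stdc++.h:", gxx_supports_cpp20())
```

On macOS, `g++` is often an alias for Clang, which does not ship `bits/stdc++.h`, so this check may fail there even when a compiler is installed.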

Installing the Model Proxy

Test Agent

The test_agent.py file runs a demo:

# Run a test
python test_agent.py

# ... with a specific model
python test_agent.py --model openai/gpt-5-2025-08-07

# ... on a specific question
python test_agent.py --test 2024/sphinx

# ... with verbose output
python test_agent.py --verbose

# Save detailed results
python test_agent.py --save-results

We've also included a --cheat flag that gives the model access to the official solution. Use this to test the infrastructure: most models we tested achieved a full score while cheating (by submitting the provided solution code).

Output

test_agent.py prints the final score. Results are automatically saved to the logs directory.

Available Problems

The IOI is an annual competition held over two days, with three problems per day. Within each day, higher-numbered problems are harder: problems 1 and 4 are easier than problems 2 and 5, which are in turn easier than problems 3 and 6. Our results also corroborate evidence from student scores that the 2024 exam was slightly more difficult across the board.

2024 IOI Problems:

Day 1:

1. Nile
2. Message
3. Tree

Day 2:

4. Hieroglyphs
5. Mosaic
6. Sphinx

2025 IOI Problems:

Day 1:

1. Souvenirs
2. Triples
3. Worldmap

Day 2:

4. Festival
5. Migrations
6. Obstacles

Results

IOI benchmark results are published on vals.ai, where you can see how different AI models perform on competitive programming tasks. Recent evaluations show significant variation in model capabilities, with top performers achieving ~25% of the maximum score.
