AI Agent CTF Challenge Series 🎯

A series of Capture The Flag (CTF) challenges demonstrating security risks in AI agents with command execution capabilities.

Overview

This repository contains multiple CTF levels, each demonstrating a different security concept around AI agents and command execution:

Level 1: Command Execution via API
- Demonstrates basic command injection risks
- Shows why blindly executing commands is dangerous
- Uses a restricted command allowlist

Level 2: Multi-Stage Command Injection
- Demonstrates input validation bypass techniques
- Shows attack chaining through a note → report → summary workflow
- Features naive security filtering that can be circumvented
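To make the "naive security filtering" idea concrete, here is a minimal sketch of a substring blocklist and why it is easy to sidestep. The blocklist contents, the function name `naive_filter`, and the sample strings are all hypothetical; the actual Level 2 filter may look quite different.

```python
# Hypothetical blocklist, chosen only for illustration.
# The real Level 2 filter is not shown in this README.
BLOCKED_SUBSTRINGS = ("rm ", "cat ", ";")


def naive_filter(text: str) -> bool:
    """Return True if the text passes a simple substring blocklist."""
    lowered = text.lower()
    return not any(bad in lowered for bad in BLOCKED_SUBSTRINGS)


# An input containing a listed substring is rejected...
print(naive_filter("notes; cat /tmp/example.txt"))

# ...but shell syntax that avoids every listed substring passes the same
# check, even though a shell would still treat it as a second command.
print(naive_filter("notes && head /tmp/example.txt"))
```

A blocklist has to anticipate every dangerous spelling, which is why it tends to fail. If text validated at one stage (a note) is later reused at another stage (a report or summary) without being re-checked or escaped, anything the filter missed travels along with it.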

Prerequisites

- Docker Desktop for Mac
- curl
- jq (for pretty JSON output); on macOS it can be installed with Homebrew:

brew install jq

Quick Start

Clone the repository:

git clone https://github.com/Adimarogonas/agent-ctf.git
cd agent-ctf

Start a level (example with Level 1):

cd level1
docker compose up --build

Try the challenge by interacting with the API endpoint.
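As a rough illustration of what "interacting with the API endpoint" could look like from Python instead of curl, the sketch below builds a JSON POST request. The port (`8000`), the path (`/run`), and the `command` field are placeholders, not the documented API; check the level's own README for the real values.

```python
import json
import urllib.request

# Assumptions: host, port, path, and JSON field name are placeholders.
BASE_URL = "http://localhost:8000"


def build_request(command: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying one command."""
    payload = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/run",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request without touching the network.
req = build_request("ls /tmp")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Calling `send(req)` only makes sense once the level's container is running and the placeholders have been replaced with the real endpoint details.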

Challenge Levels

Level 1: "Helpful... and Root?"

- Goal: Find and read a hidden flag file
- Concept: Command execution via API endpoints
- Target: /tmp/ctf/level1/flag.txt

Level 1 Details: see README.md in the level1 directory.
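Level 1 is described as using a restricted command allowlist. The following is a minimal sketch of how such a check might work, assuming a made-up set of permitted commands; the names `ALLOWED_COMMANDS` and `is_allowed` are illustrative and not taken from the repository's `safe_sh` or `agent.py`.

```python
import shlex

# Hypothetical allowlist; the commands Level 1 actually permits may differ.
ALLOWED_COMMANDS = {"ls", "cat", "echo"}

# Characters that let a shell chain, substitute, or redirect commands.
SHELL_METACHARACTERS = set(";|&$`<>\n")


def is_allowed(command_line: str) -> bool:
    """Accept a command line only if its first word is on the allowlist."""
    if any(ch in SHELL_METACHARACTERS for ch in command_line):
        return False
    try:
        argv = shlex.split(command_line)
    except ValueError:
        # shlex raises ValueError on unbalanced quotes.
        return False
    return bool(argv) and argv[0] in ALLOWED_COMMANDS


print(is_allowed("ls /tmp"))
print(is_allowed("rm -rf /tmp/x"))
print(is_allowed("ls /tmp; rm -rf /tmp/x"))
```

An allowlist keyed on the command name says nothing about the arguments: a permitted command can still be pointed at any path the process is able to read.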

Security Features

- Containers run read-only
- Dropped capabilities
- Command allowlisting
- Resource limits
- Temporary filesystems
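Several of the controls listed above correspond to standard Docker options. As a hedged sketch, the helper below assembles a `docker run` command line with those options; the image name, memory limit, and process limit are invented for illustration, and the repository itself configures its containers through docker-compose.yml, whose exact settings are not shown here.

```python
def hardened_run_args(image: str, memory: str = "256m", pids_limit: int = 64) -> list[str]:
    """Build a docker run argument list with common hardening flags.

    The default limits are illustrative values, not the project's settings.
    """
    return [
        "docker", "run", "--rm",
        "--read-only",                    # root filesystem mounted read-only
        "--cap-drop", "ALL",              # drop all Linux capabilities
        "--tmpfs", "/tmp",                # writable scratch space in memory
        "--memory", memory,               # cap memory usage
        "--pids-limit", str(pids_limit),  # cap the number of processes
        image,
    ]


# "agent-ctf-level1" is a hypothetical image name.
print(" ".join(hardened_run_args("agent-ctf-level1")))
```

Docker Compose supports comparable settings through keys such as `read_only`, `cap_drop`, `tmpfs`, `mem_limit`, and `pids_limit`.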

Development

Each level follows this structure:

levelN/
├── docker-compose.yml
├── Dockerfile
├── seed.sh
├── safe_sh
├── app/
│   ├── agent.py
│   └── run.sh
└── README.md
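When adding a level, a quick check that the directory matches this layout can catch missing files early. The helper below is a small sketch of such a check; the function name `missing_entries` is hypothetical and no such script is claimed to exist in the repository.

```python
from pathlib import Path

# Entries taken from the level layout shown above.
EXPECTED_ENTRIES = [
    "docker-compose.yml",
    "Dockerfile",
    "seed.sh",
    "safe_sh",
    "app/agent.py",
    "app/run.sh",
    "README.md",
]


def missing_entries(level_dir: str) -> list[str]:
    """Return the expected entries that are absent from a level directory."""
    root = Path(level_dir)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


# Run from the repository root; an empty list means the layout is complete.
print(missing_entries("level1"))
```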

Contributing

Want to add a level? PRs welcome! Each level should:

- Be self-contained in Docker
- Have clear learning objectives
- Include proper security controls
- Document solution methods

License

MIT License - See LICENSE for details
