An automated bug reproduction research agent powered by large language models (LLMs) and a code knowledge graph. It clones target repositories, builds a knowledge graph, analyzes and attempts to reproduce issues inside a container, and outputs reproduction commands and results.
Prerequisites:
- Python 3.11+
- Neo4j 5.x (local or remote)
- Docker
- Git
Install the Python dependencies using one of the following methods:
- Method 1 (recommended): pinned dependencies via requirements.txt
pip install -r requirements.txt
- Method 2: install via pyproject.toml
pip install hatchling
pip install .
- Create a working directory (for logs and the cache of cloned repositories):
mkdir -p working_dir
Copy the example env file and adjust as needed:
cp example.env .env
Key settings (from example.env):
- `PROMETHEUS_NEO4J_URI`, e.g. `bolt://localhost:7687`
- `PROMETHEUS_NEO4J_USERNAME` / `PROMETHEUS_NEO4J_PASSWORD`
- `PROMETHEUS_WORKING_DIRECTORY`, e.g. `working_dir/`
- `PROMETHEUS_OPENAI_FORMAT_API_KEY` and other LLM keys (Anthropic/Gemini as needed)
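Before starting a run, it can help to confirm that the settings listed above are actually visible to the process. The following is a minimal sketch, not part of the project: the helper name `missing_settings` is hypothetical, treating these four settings as required is an assumption, and it only inspects the process environment (so values kept in `.env` must already be exported or loaded).

```python
import os

# Setting names come from example.env; treating all four as required is an assumption.
REQUIRED_SETTINGS = (
    "PROMETHEUS_NEO4J_URI",
    "PROMETHEUS_NEO4J_USERNAME",
    "PROMETHEUS_NEO4J_PASSWORD",
    "PROMETHEUS_WORKING_DIRECTORY",
)


def missing_settings(env=None):
    """Return the names of required settings that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

Passing a plain dictionary instead of the real environment makes the check easy to try out without touching your shell.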
If you need to access private repositories, prepare a GitHub token:
export GITHUB_TOKEN="github_pat_xxxx"
Before running, ensure Docker is available on the host and start the following services.
Neo4j default ports: HTTP 7474, Bolt 7687.
Minimal startup (ports consistent with example.env):
docker run -d \
--name neo4j_prometheus \
-p 7474:7474 \
-p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
neo4j:5
Recommended (enable APOC and set memory):
docker run -d \
--name neo4j_prometheus \
-p 7474:7474 \
-p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
-e NEO4J_PLUGINS='["apoc"]' \
-e NEO4J_dbms_memory_heap_initial__size=2G \
-e NEO4J_dbms_memory_heap_max__size=4G \
-e NEO4J_dbms_memory_pagecache_size=2G \
neo4j:5
If you map different host ports (e.g., 7475:7474, 7688:7687), update `PROMETHEUS_NEO4J_URI` in .env accordingly, e.g., `bolt://localhost:7688`.
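After changing the port mapping, a quick way to confirm that the URI in `.env` points at a listening port is to open a TCP connection to it. This is a rough sketch using only the Python standard library; the function names are hypothetical, and a successful connection shows only that something is listening on the port, not that it is Neo4j or that the credentials are valid.

```python
import socket
from urllib.parse import urlparse


def bolt_endpoint(uri):
    """Split a Bolt URI such as bolt://localhost:7688 into (host, port)."""
    parsed = urlparse(uri)
    # Fall back to the default Bolt port (7687) when the URI omits one.
    return parsed.hostname or "localhost", parsed.port or 7687


def is_reachable(uri, timeout=2.0):
    """Return True if a TCP connection to the Bolt endpoint can be opened."""
    host, port = bolt_endpoint(uri)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    uri = "bolt://localhost:7688"  # example value matching a 7688:7687 mapping
    print(uri, "reachable:", is_reachable(uri))
```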
Postgres default port: 5432.
docker run -d \
--name postgres_prometheus \
-p 5432:5432 \
-e POSTGRES_USER=app2_user \
-e POSTGRES_PASSWORD=app2_password \
-e POSTGRES_DB=app2_db \
postgres:16
You can skip this if you don't use Postgres checkpoints.
Command-line execution (required argument: --dataset_file_path):
python -m app.main --dataset_file_path projects/swe-polybench-verified.txt --github_token "$GITHUB_TOKEN"
Optional arguments:
- `--file`, `-f`: output path for predictions, defaults to `projects/predictions_YYYYmmdd_HHMMSS.json`
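The timestamp pattern in the default file name corresponds to year, month, day, hour, minute, and second. As an illustration only (the function name is hypothetical, and whether the project uses local time is an assumption), such a name could be produced like this:

```python
from datetime import datetime


def default_predictions_path(now=None):
    """Build a path of the form projects/predictions_YYYYmmdd_HHMMSS.json."""
    # Assumption: local time is used when no timestamp is supplied.
    now = datetime.now() if now is None else now
    return now.strftime("projects/predictions_%Y%m%d_%H%M%S.json")


if __name__ == "__main__":
    print(default_predictions_path())
```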
Argument description:
- `--dataset_file_path`, `-d`: path to the project list file (required). Each line should look like `<name> <git_https_url>`. Lines starting with `#` are treated as comments.
- `--github_token`, `-g`: token for accessing private repositories (optional).
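To make the project list format concrete, here is a small sketch of a parser for it. It is not the project's own parser: the function name is hypothetical, and skipping blank lines and rejecting lines that do not have exactly two fields are assumptions made for the example.

```python
def parse_project_list(text):
    """Parse '<name> <git_https_url>' lines into (name, url) pairs."""
    projects = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Assumption: blank lines are ignored, like '#' comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(
                f"line {line_number}: expected '<name> <git_https_url>', got {raw!r}"
            )
        projects.append((parts[0], parts[1]))
    return projects


if __name__ == "__main__":
    sample = "# demo list\nexample https://github.com/example/example.git\n"
    for name, url in parse_project_list(sample):
        print(name, url)
```

Validating the file this way before a long run catches malformed lines early, since cloning and graph building only start after the list is read.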
Each run performs the following steps:
- Parse the project list file and iterate over the projects
- Clone/update repositories and build the knowledge graph
- Start a Docker container (with volume mapping for real-time sync)
- Auto-generate testsuite commands and attempt to run them
- Auto-generate/execute environment setup commands to fix env/dependency issues
- Record and export reproduction results and related information
Project layout:
- `app/main.py`: CLI entry (arguments: `--dataset_file_path` / `--github_token` / `--file`)
- `app/configuration/`: configuration loading
- `app/container/`: container management
- `app/lang_graph/`: language graphs/subgraphs
- `app/services/`: knowledge graph, repository, LLM, Neo4j services
- `projects/`: sample project lists and outputs
Notes:
- Ensure Docker is running and you have permission to build images
- Ensure Neo4j is reachable and `PROMETHEUS_NEO4J_*` are set correctly
- Building the knowledge graph for large repos may take time and memory
export GITHUB_TOKEN='github_pat_xxxx'
python3 -m app.main \
--dataset_file_path projects/swe-polybench-verified.txt \
--github_token "$GITHUB_TOKEN"
After the program starts, it launches a general-purpose build container and prints the command to enter it, for example:
To enter container, run: docker exec -it <container_short_id> /bin/bash
Inside the container (workdir /app), two key files are generated:
- `/app/prometheus_setup.sh`: initial environment setup script auto-generated by EnvAgent
- `/app/prometheus_testsuite_commands.txt`: the extracted/generated testsuite command list (one command per line)
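Because the testsuite command file holds one command per line, it can be replayed from the host with `docker exec`. The sketch below is one possible way to do that and is not how the tool itself runs the commands; the function names are hypothetical, and it assumes `/bin/bash` and `cat` exist in the container and that every command should run from `/app`.

```python
import subprocess

COMMANDS_FILE = "/app/prometheus_testsuite_commands.txt"


def parse_commands(text):
    """Split the file contents into commands, one per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def docker_exec_argv(container_id, command, workdir="/app"):
    """Build the argv that runs one shell command inside the container."""
    return ["docker", "exec", "-w", workdir, container_id, "/bin/bash", "-c", command]


def run_testsuite_commands(container_id):
    """Run every listed command in order; return (command, exit_code) pairs."""
    # Read the command list from inside the container.
    listing = subprocess.run(
        ["docker", "exec", container_id, "cat", COMMANDS_FILE],
        check=True, capture_output=True, text=True,
    )
    results = []
    for command in parse_commands(listing.stdout):
        completed = subprocess.run(docker_exec_argv(container_id, command))
        results.append((command, completed.returncode))
    return results
```

Call `run_testsuite_commands("<container_short_id>")` with the ID printed in the logs; a non-zero exit code in the returned pairs marks a command that failed.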
Example: enter the container and run the environment script:
# Enter the container (use the command printed in logs)
docker exec -it <container_short_id> /bin/bash
# Run the environment setup script (idempotent/re-runnable)
bash /app/prometheus_setup.sh
Tip: Edits you make on the host are reflected in `/app` inside the container in real time, which makes it easy to iterate on the environment script and testsuite commands.