@alt-glitch alt-glitch commented Jan 9, 2026

PR Type

  • RL Environment PR - Complete Environment Snapshot & Zero-Training sections
  • Non-Environment PR - Complete Description, Related Issues & Type of Change sections

📝 General Information

Description

Adds a verifiers server to Atropos so that Prime Intellect's Environments can be used with Atropos.

Completes #258

  • verifiers_server.py: Supports serve (RL training), process (SFT data generation with any API), and evaluate modes
  • verifiers_eval.py: Standalone evaluation environment with detailed metrics and retry logic
  • Automatically loads system prompts, rubrics, and datasets from Prime environments
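The retry logic mentioned for verifiers_eval.py is not shown in this conversation. As a rough illustration only, a retry wrapper with exponential backoff might look like the sketch below. The name `call_with_retries` and its parameters are hypothetical and are not taken from the PR's code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponential backoff.

    Hypothetical helper for illustration; not the PR's implementation.
    The sleep function is injectable so the wrapper can be tested
    without real waiting.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, then 4x, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")  # keeps type checkers satisfied
```

Wrapping a single model call this way lets one transient API failure be retried instead of aborting a whole evaluation run; a real implementation would likely catch only specific error types rather than every exception.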

Type of Change

  • New feature (non-breaking change which adds functionality)

🔖 Environment Snapshot

| Field | Your Entry |
| --- | --- |
| Environment Name | Prime Intellect verifiers |
| Dataset Needed? | No |
| External Deps | verifiers |
| Environment/CLI Args required | vf_env_name, env_args (the Prime environment slug and any arguments that environment needs) |
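To make the two required arguments concrete, here is a minimal sketch of how they could be modeled and validated. The class `VerifiersEnvSettings` and its `split_slug` method are hypothetical names used for illustration; the config class actually added in this PR may be shaped differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class VerifiersEnvSettings:
    """Hypothetical container for the two arguments listed above."""

    # Prime environment slug, e.g. "primeintellect/gsm8k".
    vf_env_name: str
    # Extra keyword arguments forwarded to the environment, if it needs any.
    env_args: Dict[str, Any] = field(default_factory=dict)

    def split_slug(self) -> Tuple[str, str]:
        """Split an "owner/name" slug into its two parts."""
        owner, sep, name = self.vf_env_name.partition("/")
        if not sep or not owner or not name:
            raise ValueError(
                f"expected a slug of the form owner/name, got {self.vf_env_name!r}"
            )
        return owner, name


settings = VerifiersEnvSettings(vf_env_name="primeintellect/gsm8k")
owner, name = settings.split_slug()
```

Validating the slug up front gives a clear error for a malformed value before any environment download is attempted.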

Results

Eval run

W&B: https://wandb.ai/sidbin/atropos-environments/runs/1aowtj75/overview?nw=nwusersidbin

Command

python environments/eval_environments/verifiers_eval.py evaluate \
      --env.vf_env_name primeintellect/gsm8k \
      --env.max_eval_items 250 \
      --openai.model_name gpt-4.1-nano \
      --openai.api_key $OPENAI_API_KEY

SFT Datagen

W&B: https://wandb.ai/sidbin/atropos-environments/runs/224hu4te?nw=nwusersidbin

Command

python environments/verifiers_server.py process \
    --env.vf_env_name primeintellect/gsm8k \
    --env.data_path_to_save_groups gpt-4.1-nano-gsm8k-sft-dataset \
    --openai.base_url https://api.openai.com/v1 \
    --openai.api_key $OPENAI_API_KEY \
    --env.total_steps 100 \
    --env.group_size 10 \
    --env.use_wandb true

Multi-turn Environment:

W&B: https://wandb.ai/sidbin/atropos-environments/runs/dotetook?nw=nwusersidbin

Command

python environments/verifiers_server.py process \
      --env.vf_env_name primeintellect/mini-swe-agent-plus \
      --env.data_path_to_save_groups gpt-5.2-swe-agent-sft-dataset \
      --openai.base_url https://api.openai.com/v1 \
      --openai.api_key $OPENAI_API_KEY \
      --openai.model_name gpt-5.2 \
      --env.total_steps 5 \
      --env.group_size 2 \
      --env.use_wandb true

✅ Developer & Reviewer Checklist

  • Code follows project style (black, isort, flake8 pass with pre-commit)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes
  • Docstrings added for all new public classes / functions
  • If .env vars required, did you add it to the .env.example in repo root?

- verifiers_server.py: consistent dataset column selection for train/test,
  remove redundant comments, preserve float precision for scores
- verifiers_eval.py: add env_config_cls, fix constructor signature to match
  BaseEnv (slurm bool), make stub methods raise NotImplementedError
@alt-glitch alt-glitch changed the title from "[WIP]: Verifiers Integration" to "Verifiers Integration" Jan 9, 2026
@alt-glitch alt-glitch marked this pull request as ready for review January 9, 2026 14:00
@alt-glitch (Author)

paging @teknium1 for a quick review!

alt-glitch commented Jan 10, 2026

Adding multi-turn runs using primeintellect/mini-swe-agent-plus.

Command

python environments/verifiers_server.py process \
    --env.vf_env_name primeintellect/mini-swe-agent-plus \
    --env.data_path_to_save_groups gpt-5.2-swe-agent-sft-dataset \
    --openai.base_url https://api.openai.com/v1 \
    --openai.api_key $OPENAI_API_KEY \
    --openai.model_name gpt-5.2 \
    --env.total_steps 5 \
    --env.group_size 1 \
    --env.use_wandb true

W&B Run: https://wandb.ai/sidbin/atropos-environments/runs/r2115ttr?nw=nwusersidbin
The run contains both passing and failing examples, which indicates that multi-turn environments work.

@alt-glitch alt-glitch marked this pull request as draft January 10, 2026 07:56
@alt-glitch (Author)

Marking as draft while I finalize the PR

@alt-glitch alt-glitch marked this pull request as ready for review January 10, 2026 09:27
@alt-glitch (Author)

Added multi-turn rollout using process.

Ready for review @teknium1

@teknium1 teknium1 mentioned this pull request Jan 13, 2026
17 tasks
@alt-glitch alt-glitch requested a review from dmahan93 January 14, 2026 11:41