ob1-s commented Jan 10, 2026

PR Type

  • RL Environment PR - Complete Environment Snapshot & Zero-Training sections
  • Non-Environment PR - Complete Description, Related Issues & Type of Change sections

📝 General Information

Description

Adds a new environment that bridges Atropos with the verifiers library, so that any environment or eval built with verifiers works with Atropos, including the 100+ environments on the Environments Hub.

Related Issues

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Code refactor (no functional changes)
  • Build/CI/CD related changes
  • Other (please describe):

🔖 Environment Snapshot

| Field | Your Entry |
| --- | --- |
| Environment Name | verifiers_server |
| Short Description | Verifiers environment integration |
| Category | RLRF, evals |
| Dataset Needed? | The dataset associated with the chosen verifiers env |
| External Deps | |
| Environmental Variables | |
| Compute Footprint Estimate | |

🧪 Zero-Training Test Results

W&B Link: https://wandb.ai/brunocabeludo321/atropos-environments/runs/9q6cjxum

Examples of the Environment scoring a good example and a bad example:


✅ Developer & Reviewer Checklist

  • Code follows project style (black, isort, flake8 pass with pre-commit)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes
  • Docstrings added for all new public classes / functions
  • If .env vars required, did you add it to the .env.example in repo root?

ob1-s commented Jan 10, 2026

The bridge integrates verifiers into Atropos by delegating rollout and scoring logic to the native verifiers environment. Evaluation is likewise delegated to the original environment's vf.Env.evaluate method, so any environment built with verifiers is supported out of the box.
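For readers unfamiliar with the pattern, the delegation described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `NativeEnv`, `BridgeEnv`, `ScoredItem`, `EchoEnv`, and every method name are hypothetical stand-ins and do not reflect the actual verifiers or Atropos APIs.

```python
from dataclasses import dataclass
from typing import Protocol


class NativeEnv(Protocol):
    """Hypothetical stand-in for a native environment (NOT the real verifiers API)."""

    def rollout(self, prompt: str) -> str: ...
    def score(self, prompt: str, completion: str) -> float: ...
    def evaluate(self) -> dict[str, float]: ...


@dataclass
class ScoredItem:
    """Hypothetical container for one scored rollout."""

    prompt: str
    completion: str
    reward: float


class BridgeEnv:
    """Hypothetical trainer-facing wrapper that owns no rollout or scoring logic."""

    def __init__(self, native: NativeEnv) -> None:
        self.native = native

    def collect(self, prompts: list[str]) -> list[ScoredItem]:
        items = []
        for prompt in prompts:
            # Both generation and scoring are delegated to the wrapped env.
            completion = self.native.rollout(prompt)
            reward = self.native.score(prompt, completion)
            items.append(ScoredItem(prompt, completion, reward))
        return items

    def evaluate(self) -> dict[str, float]:
        # Evaluation is forwarded unchanged to the wrapped env.
        return self.native.evaluate()


class EchoEnv:
    """Toy native env: 'generates' the uppercased prompt and rewards an exact match."""

    def rollout(self, prompt: str) -> str:
        return prompt.upper()

    def score(self, prompt: str, completion: str) -> float:
        return 1.0 if completion == prompt.upper() else 0.0

    def evaluate(self) -> dict[str, float]:
        return {"accuracy": 1.0}
```

Because the wrapper only forwards calls, swapping `EchoEnv` for any other object with the same three methods requires no change to `BridgeEnv`, which is the property that lets a single bridge cover many environments.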

I successfully validated evaluation, atropos-sft-gen, and GRPO training. However, after a few steps of training, I ran into a NumPy version issue.

ob1-s commented Jan 10, 2026

fixed the dependency issue and added documentation.

complete training run

@teknium1 a quick review would be great

ob1-s marked this pull request as ready for review January 10, 2026 23:34
ob1-s mentioned this pull request Jan 13, 2026
17 tasks