PR Type

  • RL Environment PR - Complete Environment Snapshot & Zero-Training sections

📝 General Information

Description

Complete implementation of the VerifiersEnv adapter for Prime Intellect's Environment Hub, enabling Atropos to run environments installed via the verifiers library and Prime CLI.

Key Features:

  • Full adapter implementing collect_trajectories(), score(), evaluate(), and wandb_log()
  • RubricGroup support for multi-rubric environments (such as wordle)
  • Optional dependency: a clean ImportError is raised if verifiers is not installed (see the sketch after this list)
  • 26 unit tests; CI passes without Prime credentials
  • Comprehensive documentation in environments/README.md
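
As a rough illustration of the import guard and adapter surface, here is a minimal sketch. It assumes BaseEnv is importable from atroposlib.envs.base (the same module as APIServerConfig used later in this PR); the actual verifiers_server.py may be structured differently.

try:
    import verifiers as vf  # optional dependency
except ImportError:
    vf = None

from atroposlib.envs.base import BaseEnv  # assumed import path


class VerifiersEnv(BaseEnv):
    """Adapter exposing a Prime Hub environment to Atropos (sketch)."""

    def __init__(self, config, server_configs, **kwargs):
        # Fail fast with an actionable message when verifiers is missing.
        if vf is None:
            raise ImportError(
                "VerifiersEnv requires the 'verifiers' package: pip install verifiers"
            )
        super().__init__(config, server_configs, **kwargs)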

This PR completes and extends the work started in #258.


🔖 Environment Snapshot

| Field | Your Entry |
| --- | --- |
| Environment Name | VerifiersEnv (Prime Intellect Hub Adapter) |
| Short Description | Adapter to run Prime Intellect Environment Hub environments in Atropos via the verifiers library. |
| Category | Verifiable-Reasoning |
| Dataset Needed? | No (datasets are provided by Prime Hub environments) |
| External Deps | verifiers (optional), Prime CLI (uv tool install prime) |
| Environment Variables | OPENAI_API_KEY (for LLM inference) |
| Compute Footprint Estimate | <1 GB RAM; depends on the loaded environment |

🧪 Zero-Training Test Results

W&B Link: N/A (tested locally)

Verification test with the will/wordle environment:

✓ rubric type: RubricGroup
✓ has rubrics attr: True
✓ num rubrics: 2
✓ reward_funcs: ['__score_rollout__']
✓ using score_rollout: True
✓ Training items: 2000
✓ Eval items: 20
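
An inspection snippet along the following lines reproduces these checks. The attribute names used (rubric, rubrics, score_rollout) follow the RubricGroup pattern described later in this PR, but they are assumptions here, not confirmed API:

import verifiers as vf

# Load the environment installed via `prime env install will/wordle`.
env = vf.load_environment("wordle")
rubric = env.rubric

print(f"rubric type: {type(rubric).__name__}")
print(f"has rubrics attr: {hasattr(rubric, 'rubrics')}")
if hasattr(rubric, "rubrics"):
    print(f"num rubrics: {len(rubric.rubrics)}")
print(f"using score_rollout: {hasattr(rubric, 'score_rollout')}")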

Unit Test Results:

======================== 26 passed, 3 skipped in 7.41s =========================
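
The 3 skipped tests are the Prime integration tests, gated behind the @pytest.mark.prime marker added in conftest.py (see the commit log below). A minimal sketch of that gating using the standard pytest hook pattern; the actual conftest.py may differ in details:

import pytest


def pytest_addoption(parser):
    # Opt-in flag for tests that need Prime credentials.
    parser.addoption(
        "--runprime", action="store_true", default=False,
        help="run Prime integration tests",
    )


def pytest_configure(config):
    config.addinivalue_line("markers", "prime: requires Prime CLI login")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--runprime"):
        return  # Prime tests were explicitly requested
    skip_prime = pytest.mark.skip(reason="need --runprime option to run")
    for item in items:
        if "prime" in item.keywords:
            item.add_marker(skip_prime)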

Examples of the Environment scoring:

The adapter successfully:

  1. Loads environments from Prime Hub via vf.load_environment()
  2. Extracts rubrics (single or RubricGroup)
  3. Normalizes datasets to the Atropos format (see the sketch after this list)
  4. Provides scoring via score_rollout() or reward functions
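
For step 3, a sketch of what the field normalization might look like. The fallback field names (question, completion) are illustrative assumptions; the real adapter handles more formats.

def _normalize_item(raw: dict) -> dict:
    """Map one verifiers dataset row onto the fields Atropos expects (sketch)."""
    # Hypothetical fallbacks for common field-name variants.
    prompt = raw.get("prompt") or raw.get("question") or ""
    answer = raw.get("answer") or raw.get("completion") or ""
    return {"prompt": prompt, "answer": answer}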

✅ Developer & Reviewer Checklist

  • Code follows project style (black, isort, flake8 pass with pre-commit)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes
  • Docstrings added for all new public classes / functions
  • If new .env variables are required, they have been added to .env.example in the repo root

Verification Steps for Maintainers

# 1. Install dependencies
pip install verifiers
uv tool install prime

# 2. Login to Prime
prime login

# 3. Install test environment
prime env install will/wordle

# 4. Test the adapter
python -c "
import asyncio
from environments.verifiers_server import VerifiersEnv, VfEnvConfig
from atroposlib.envs.base import APIServerConfig

config = VfEnvConfig(vf_env_name='wordle')
env = VerifiersEnv(config, [APIServerConfig(model_name='test')])
asyncio.run(env.setup())
print(f'Train: {len(env.train)}, Eval: {len(env.test)}')
"
# Expected: Train: 2000, Eval: 20
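
Optionally, run the unit tests as a fifth step; the test path and the --runprime flag come from this PR's test changes (see the commit log below):

# 5. Run the unit tests (Prime integration tests are skipped by default)
pytest atroposlib/tests/test_verifiers_env.py
pytest atroposlib/tests/test_verifiers_env.py --runprime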

Commit log (cdreetz and others added 9 commits, January 9, 2026):
## Summary
Complete implementation of VerifiersEnv adapter for Prime Intellect's
Environment Hub, enabling Atropos to use environments installed via
the 'verifiers' library and Prime CLI.

## Changes
- environments/verifiers_server.py: Complete rewrite (~580 lines)
  - Implement collect_trajectories() for RL training
  - Implement score() for batch scoring with rubrics
  - Implement evaluate() for model evaluation
  - Add comprehensive error handling and validation
  - Add dataset normalization for various field formats
  - Add proper async method signatures
  - Add wandb_log() for metrics tracking

- atroposlib/tests/test_verifiers_env.py: New test file (26 tests)
  - Configuration tests
  - Import guard tests
  - Initialization validation tests
  - Dataset normalization tests
  - Reward calculation tests
  - Async signature verification
  - Class attribute verification
  - Prime integration tests (skipped in CI)

- atroposlib/tests/conftest.py: Add @pytest.mark.prime marker
  - Skip Prime tests unless --runprime flag is passed
  - CI passes without Prime credentials

- environments/README.md: Add documentation
  - Prerequisites and installation
  - Configuration options
  - Usage examples
  - Troubleshooting guide

## Testing
- 117 tests pass, 6 skipped (Prime integration tests)
- All existing tests continue to pass
- CI-compatible (no Prime credentials required)

- Enhanced _setup_rubric() to detect and handle RubricGroup (see the sketch below)
- Extract reward functions from each rubric in the group
- Use score_rollout() when available (the RubricGroup pattern)
- Added self.rubrics list for individual rubric access
- Improved logging for rubric detection
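
A rough sketch of that detection logic follows; self.vf_env and the use_score_rollout flag are illustrative names, not the confirmed implementation in verifiers_server.py.

def _setup_rubric(self):  # method on VerifiersEnv (sketch)
    # self.vf_env: the environment returned by vf.load_environment().
    rubric = self.vf_env.rubric
    if hasattr(rubric, "rubrics"):
        # RubricGroup: keep each member rubric individually accessible.
        self.rubrics = list(rubric.rubrics)
    else:
        self.rubrics = [rubric]
    # Prefer score_rollout() when the rubric provides it (RubricGroup pattern).
    self.use_score_rollout = hasattr(rubric, "score_rollout")
    self.rubric = rubric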

Tested with the will/wordle environment, which uses RubricGroup:
- rubric type: RubricGroup
- num rubrics: 2
- using score_rollout: True
- 2000 training items, 20 eval items loaded