Complete Environment Snapshot & Zero-Training sections #306

carabistouflette · 2026-01-09T11:22:08Z

PR Type

RL Environment PR - Complete Environment Snapshot & Zero-Training sections
Non-Environment PR - Complete Description, Related Issues & Type of Change sections

📝 General Information

Description

This PR continues the work started in PR #258 to create a Verifiers Environment (VerifiersEnv) that integrates the external verifiers library into the Atropos ecosystem.

Key Changes:

Refactored Architecture: Moved the environment implementation from environments/ to atroposlib/envs/ to align with the project's module structure.
Robust Error Handling: Added import guards for the optional verifiers dependency, defensive checks for empty API responses and None content, and division-by-zero protection for reward weight normalization.
W&B Tracking: Implemented percent_correct and cumulative accuracy metrics for Weights & Biases logging.
Configurable Thresholds: Added an optional reward_threshold config option for binary reward conversion.
Compatibility Layer: Created a _call_reward_func wrapper to handle API differences across verifiers versions.
Full Test Coverage: Added a comprehensive test suite (test_verifiers.py) covering initialization, trajectory collection, and item fetching.
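The division-by-zero protection and the optional reward_threshold described above can be illustrated with a minimal sketch. This is not the code from this PR: the function names, the uniform-weight fallback, and the `>=` comparison are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def normalize_weights(weights: Sequence[float]) -> list[float]:
    """Scale weights so they sum to 1.

    Assumption: when the weights sum to zero, fall back to uniform weights
    instead of dividing by zero. The PR may handle this case differently.
    """
    if not weights:
        return []
    total = sum(weights)
    if total == 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def combine_rewards(
    rewards: Sequence[float],
    weights: Sequence[float],
    reward_threshold: Optional[float] = None,
) -> float:
    """Weighted sum of rewards, optionally converted to a binary reward.

    Assumption: a score at or above reward_threshold maps to 1.0,
    anything below maps to 0.0. With no threshold the raw score is returned.
    """
    normalized = normalize_weights(weights)
    score = sum(r * w for r, w in zip(rewards, normalized))
    if reward_threshold is None:
        return score
    return 1.0 if score >= reward_threshold else 0.0
```

With two equally weighted reward functions returning 1.0 and 0.0, `combine_rewards([1.0, 0.0], [1, 1])` yields 0.5, and passing `reward_threshold=0.5` turns that into 1.0.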

Related Issues

Continues PR #258

Type of Change

New feature (non-breaking change which adds functionality)
This change requires a documentation update

🔖 Environment Snapshot

Field	Your Entry
Environment Name	`VerifiersEnv`
Short Description	Wrapper for loading and running RL tasks from the `verifiers` library ecosystem.
Category	Verifiable-Reasoning
Dataset Needed?	No (datasets provided by the `verifiers` library dynamically)
External Deps	`verifiers` (pip install verifiers)
Environment Variables	None
Compute Footprint Estimate	<1 GB RAM, standard CPU/GPU for inference
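Since verifiers is an optional external dependency, the import guard mentioned in the description could look roughly like the sketch below. The helper name `load_optional` and the error wording are hypothetical; the PR may simply wrap `import verifiers` in a try/except instead.

```python
import importlib
import importlib.util
from types import ModuleType


def load_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency, or fail with an actionable message.

    Hypothetical helper: checks whether the module can be found before
    importing it, so the caller gets an install hint rather than a bare
    ModuleNotFoundError.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this environment. "
            f"Install it with: {install_hint}"
        )
    return importlib.import_module(module_name)
```

An environment module could then call `load_optional("verifiers", "pip install verifiers")` at initialization time, so that the rest of the package still imports cleanly when verifiers is not installed.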

🧪 Zero-Training Test Results

Details

W&B Link: N/A - Requires external verifiers package with specific environment configuration.

Examples of the Environment scoring a good example and a bad example:
Tested via unit tests in test_verifiers.py: the mocked environment scores valid responses as 1.0 and handles empty or error responses gracefully.
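As a rough illustration of the behavior those tests check, the sketch below scores a matching response as 1.0 and treats None or empty content as a zero-reward case instead of raising. The function `score_response`, the exact-match comparison, and the 0.0 fallback are assumptions for illustration, not the scoring logic in this PR.

```python
from typing import Optional


def score_response(content: Optional[str], expected: str) -> float:
    """Score one model response against an expected answer.

    Assumptions: None or whitespace-only content (for example from an
    empty API response) scores 0.0, and correctness is a plain
    whitespace-insensitive exact match.
    """
    if content is None or not content.strip():
        return 0.0
    return 1.0 if content.strip() == expected.strip() else 0.0


# A good example, a bad example, and two degenerate responses.
good = score_response("42", "42")
bad = score_response("41", "42")
empty = score_response("", "42")
missing = score_response(None, "42")
```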

✅ Developer & Reviewer Checklist

Code follows project style (black, isort, flake8 pass with pre-commit)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
New and existing unit tests pass locally with my changes
Docstrings added for all new public classes / functions
If .env vars required, did you add it to the .env.example in repo root?

carabistouflette · 2026-01-09T12:06:15Z

pre-commit.ci seems to be complaining about a comment in example_trainer/vllm_api_server.py.
I didn't touch this file (check files changed), so it's safe to ignore.

cdreetz and others added 17 commits October 9, 2025 23:48

verifiers env

8a02156

[pre-commit.ci] auto fixes from pre-commit.com hooks

5b2f860

for more information, see https://pre-commit.ci

verifiers evaluate

0bff846

fix

0a814b2

[pre-commit.ci] auto fixes from pre-commit.com hooks

28c41f6

for more information, see https://pre-commit.ci

fix config

6620f1b

Merge branch 'verifiers' of https://github.com/cdreetz/atropos into verifiers

78b8248

[pre-commit.ci] auto fixes from pre-commit.com hooks

e4b28d6

for more information, see https://pre-commit.ci

refactor verifiers environment and add tests

4e03b7b

fix(verifiers): add import guard, defensive checks, and W&B tracking

4923e63

removed useless comment

ece7e11

Merge branch 'main' into verifiers

65e4d82

add Verifiers Environment section to README

4998f22

removed the invalid # noqa: F824 comment in vllm_api_server

799e18e

readded noqa F824 in vllm_api_server even if it causes Ruff error as it's needed for flake8

d139fe5

added verifiers in pyproject

e4eaff4

update verifiers version to 0.1.9.post0

0ffa53a

Upgrade verifiers dependency to version 0.1.9.post0

ed3c9b7

Updated verifiers dependency version in pyproject.toml.
