refactor: split eval-env into separate metaworld and procthor environments #7

Shr1ftyy · 2026-01-29T14:22:34Z

Summary

- Split the monolithic `eval-env/` into separate per-environment packages under `environments/`
- Add new `kinitro build-env <family>` CLI command for building individual environment images
- Remove deprecated `eval-env/` directory and `build-eval-env` CLI command

Changes

New Structure

environments/
├── README.md              # Documentation
├── metaworld/             # MuJoCo-based manipulation tasks
│   ├── Dockerfile
│   ├── env.py
│   └── requirements.txt
└── procthor/              # AI2-THOR procedural house tasks
    ├── Dockerfile
    ├── env.py
    └── requirements.txt

New CLI Command

# Build MetaWorld environment (~1GB image)
kinitro build-env metaworld --tag kinitro/metaworld:v1

# Build ProcTHOR environment (~3GB image, x86_64 Linux only)
kinitro build-env procthor --tag kinitro/procthor:v1
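
As a rough illustration of what a per-family build involves, the sketch below assembles a `docker build` invocation from a family name. It assumes each family's build context lives at `environments/<family>`, as in the structure above. The helper name `build_command` and the contents of `AVAILABLE_ENV_FAMILIES` are illustrative assumptions, not the actual `kinitro/cli.py` code.

```python
from __future__ import annotations

from pathlib import Path

# Assumed contents: the PR names AVAILABLE_ENV_FAMILIES but does not show its value.
AVAILABLE_ENV_FAMILIES = ("metaworld", "procthor")


def build_command(family: str, tag: str, root: Path = Path("environments")) -> list[str]:
    """Return the docker build argv for one environment family (hypothetical helper)."""
    if family not in AVAILABLE_ENV_FAMILIES:
        raise ValueError(f"unknown family {family!r}; expected one of {AVAILABLE_ENV_FAMILIES}")
    context = root / family
    # Each family directory carries its own Dockerfile and requirements.txt,
    # so the directory itself serves as the build context.
    return [
        "docker", "build",
        "-t", tag,
        "-f", (context / "Dockerfile").as_posix(),
        context.as_posix(),
    ]


if __name__ == "__main__":
    print(" ".join(build_command("metaworld", "kinitro/metaworld:v1")))
```

Returning the argument list rather than running it keeps the sketch testable without Docker installed; a real command would hand the list to `subprocess.run`.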

Benefits

| Aspect | Before (Monolithic) | After (Split) |
|---|---|---|
| Docker image size | ~3GB+ | MetaWorld: ~1GB, ProcTHOR: ~3GB |
| Platform support | x86_64 Linux only | MetaWorld: any platform |
| Build time | Long | Faster per-environment |
| Dependency isolation | Possible conflicts | Fully isolated |

Testing

- Built and tested MetaWorld environment
- Built and tested ProcTHOR environment with observation image verification
- Verified CLI commands work correctly
- Ruff format and lint pass

Summary by CodeRabbit

New Features
- Added MetaWorld and ProcTHOR evaluation environments with containerized runtimes and packaged dependencies
- New build-env CLI to build and tag environment-specific images by family
Updates
- Reorganized environments into family-based structure (metaworld, procthor) and adjusted build/output paths
- ProcTHOR-focused defaults and dependency updates for embodied-AI workflows
Documentation
- Added environments README with build, configuration, and extension guidance
Chores
- Build artifact tracking updated (previous ignore removed)

…ments

- Create environments/ directory with per-environment packages
- Add environments/metaworld/ with MuJoCo-based manipulation tasks
- Add environments/procthor/ with AI2-THOR procedural house tasks
- Add new 'kinitro build-env <family>' CLI command
- Remove deprecated monolithic eval-env/ directory
- Remove deprecated 'kinitro build-eval-env' command
- Add scipy to procthor requirements (needed for house generation)

Benefits:

- Smaller Docker images (metaworld ~1GB vs ~3GB monolithic)
- Better platform compatibility (metaworld works on any platform)
- Faster builds (only build what you need)
- Cleaner separation of dependencies

coderabbitai · 2026-01-29T14:22:54Z

📝 Walkthrough

Adds MetaWorld and ProcTHOR evaluation environments (Dockerfiles, requirements, env Actors), introduces family-oriented build command and registry metadata, updates CLI and docs for per-family images, and removes eval-env .gitignore entries. No behavioral changes to existing non-environment code.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Top-level docs: `environments/README.md`, `README.md`, `docs/backend-guide.md` | New README for environments; document per-family builds and image variables; update overview and backend guide to reference metaworld/procthor image variables and requirements. |
| MetaWorld env: `environments/metaworld/Dockerfile`, `environments/metaworld/requirements.txt`, `environments/metaworld/env.py` | Add MetaWorld Docker context, pinned Python deps, and an Affinetes-compatible `Actor` implementing evaluation flow, miner HTTP interactions, env caching, timeouts, and structured results. |
| ProcTHOR env: `environments/procthor/Dockerfile`, `environments/procthor/requirements.txt`, `environments/procthor/env.py` | Add/adjust ProcTHOR Docker context and deps (remove MuJoCo/metaworld deps, add scipy); update Actor and defaults to enforce `procthor/` env_id and ProcTHOR-focused docs and examples. |
| CLI & registry: `kinitro/cli.py`, `kinitro/environments/registry.py` | Replace `build_eval_env` with `build-env` and `AVAILABLE_ENV_FAMILIES`; build per-family from `environments/<family>`; add `FAMILY_METADATA` and `get_family_metadata` and adjust output/paths accordingly. |
| Eval-env cleanup: `eval-env/.gitignore` | Removed ignore entries for the kinitro build artifact so the kinitro/ build artifact directory will be tracked. |
| Misc docs/examples: other touched docs and examples | Update examples, docs, and config references to use per-family image names, architecture notes, and ProcTHOR/MuJoCo distinctions. |
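
To make the registry row concrete, here is a minimal sketch of what a family metadata table with a lookup function might look like. Only the names `FAMILY_METADATA` and `get_family_metadata` come from this PR; the `FamilyMetadata` dataclass, its fields, the `None` return for unknown families, and the `family_of` helper are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FamilyMetadata:
    """Hypothetical record shape; the real registry may store different fields."""

    display_name: str
    description: str


# Illustrative entries; descriptions reuse the directory comments from this PR.
FAMILY_METADATA: dict[str, FamilyMetadata] = {
    "metaworld": FamilyMetadata("MetaWorld", "MuJoCo-based manipulation tasks"),
    "procthor": FamilyMetadata("ProcTHOR", "AI2-THOR procedural house tasks"),
}


def get_family_metadata(family: str) -> FamilyMetadata | None:
    """Look up metadata for a family, returning None for unknown names."""
    return FAMILY_METADATA.get(family)


def family_of(env_id: str) -> str:
    """Extract the family prefix from an ID such as 'metaworld/pick-place-v3'."""
    return env_id.split("/", 1)[0]
```

Keeping display names and descriptions in one table lets CLI listings derive their output from the registry instead of hardcoding per-family strings.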

Sequence Diagram(s)

mermaid
sequenceDiagram
rect rgba(240,240,255,0.5)
participant Affinetes as "Affinetes (Actor RPC)"
end
rect rgba(240,255,240,0.5)
participant Actor as "Actor (env.py)"
end
rect rgba(255,240,240,0.5)
participant Env as "MetaWorld / ProcTHOR Env"
end
rect rgba(255,255,240,0.5)
participant Miner as "Miner (policy HTTP)"
end

Affinetes->>Actor: evaluate(task_id, model, base_url, env_id, ...)
Actor->>Env: _get_env(env_id) (lazy init / cache)
Actor->>Miner: POST /reset {task_id, seed}
Miner-->>Actor: initial observation
loop per timestep
    Actor->>Miner: POST /act {observation, meta}
    Miner-->>Actor: action
    Actor->>Env: step(action)
    Env-->>Actor: observation, reward, done, info
    Actor->>Actor: accumulate reward, record timings
    alt done or timeout
        break
    end
end
Actor-->>Affinetes: result {score, success, time_taken, extra, error?}
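
The per-timestep loop in the diagram can be sketched in plain Python. This is a simplified stand-in rather than the code in `env.py`: the miner is replaced by a plain callable instead of HTTP calls, the initial observation comes from a toy environment's `reset`, and the way `score` and `success` are derived here (summed reward, `info["success"]`) is an assumption. `CountdownEnv` and `run_episode` are hypothetical names.

```python
from __future__ import annotations

import time
from typing import Callable


class CountdownEnv:
    """Toy stand-in for a MetaWorld/ProcTHOR env: finishes after `horizon` steps."""

    def __init__(self, horizon: int) -> None:
        self.horizon = horizon
        self.t = 0

    def reset(self, seed: int) -> list[float]:
        self.t = 0
        return [0.0]

    def step(self, action: list[float]) -> tuple[list[float], float, bool, dict]:
        self.t += 1
        done = self.t >= self.horizon
        return [float(self.t)], 1.0, done, {"success": done}


def run_episode(
    env,
    act: Callable[[list[float]], list[float]],
    seed: int,
    max_timesteps: int = 500,
    timeout: float = 300.0,
) -> dict:
    """Roll out one episode, stopping on done, step limit, or wall-clock timeout."""
    start = time.monotonic()
    obs = env.reset(seed)
    score, success, steps = 0.0, False, 0
    for _ in range(max_timesteps):
        if time.monotonic() - start > timeout:
            break  # wall-clock budget exhausted
        action = act(obs)  # stands in for the POST /act round trip to the miner
        obs, reward, done, info = env.step(action)
        score += reward
        steps += 1
        if done:
            success = bool(info.get("success", False))
            break
    return {
        "score": score,
        "success": success,
        "time_taken": time.monotonic() - start,
        "extra": {"steps": steps},
    }
```

For example, `run_episode(CountdownEnv(3), lambda obs: [0.0], seed=0)` completes in three steps and reports success, while passing `max_timesteps=2` ends the rollout early without success.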

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat(procthor): add procedural environment with scene-grounded tasks #5 — overlaps ProcTHOR environment support and registry/CLI adjustments.

Suggested reviewers

rishiad

Poem

"I hopped through Docker, code, and log,
Built families, miners, a tiny cog.
Env steps and actions danced in tune,
Tags stamped bright beneath the moon.
A rabbit's cheer — the images bloom! 🐇"

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main refactoring: splitting a monolithic eval-env into separate metaworld and procthor environments, which aligns with the comprehensive changes throughout the codebase. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@environments/metaworld/env.py`:
- Around line 93-95: list_environments currently returns
get_all_environment_ids() which includes non-MetaWorld environments; update
environments/metaworld/env.py so MetaWorld.list_environments filters that full
list to only MetaWorld IDs (e.g., by matching the MetaWorld naming/prefix or
consulting a MetaWorld registry) before returning. Locate the list_environments
method and replace the direct return of get_all_environment_ids() with a
filtered comprehension or helper that selects only environment IDs belonging to
MetaWorld, preserving the async signature and docstring.

In `@environments/metaworld/requirements.txt`:
- Around line 10-11: The metaworld git dependency is currently tracking the
repository default branch which risks non-reproducible builds; update the
dependency line "metaworld @
git+https://github.com/Farama-Foundation/Metaworld.git" to pin it to the stable
v3.0.0 release (e.g. add `@v3.0.0` to the git URL or use an exact version
specifier) so installs are reproducible, then re-lock/reinstall dependencies to
verify.
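
For the first item above (filtering `list_environments`), a prefix filter is one way to do it. The sketch below is an assumption about shape, not the merged fix: `get_all_environment_ids` is stubbed with sample data, `procthor/example-task` is a made-up placeholder ID, and `FAMILY_PREFIX` is a hypothetical attribute name.

```python
from __future__ import annotations

import asyncio


def get_all_environment_ids() -> list[str]:
    """Stub for the registry call; the second ID is a made-up placeholder."""
    return ["metaworld/pick-place-v3", "procthor/example-task"]


class Actor:
    # Hypothetical attribute: each family's actor would declare its own prefix.
    FAMILY_PREFIX = "metaworld/"

    async def list_environments(self) -> list[str]:
        """List only the environment IDs served by this actor."""
        return [
            env_id
            for env_id in get_all_environment_ids()
            if env_id.startswith(self.FAMILY_PREFIX)
        ]


if __name__ == "__main__":
    print(asyncio.run(Actor().list_environments()))
```

The method keeps its async signature, as the review asks, even though the filter itself does no I/O.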

🧹 Nitpick comments (8)

environments/procthor/requirements.txt (1)
9-11: Consider pinning the procthor git dependency to a specific commit for reproducible builds.

Similar to the metaworld requirements, installing procthor from the default branch can lead to non-reproducible builds. The comment mentions a Dec 2022 fix, so pinning to that commit would ensure consistency:
 # Install procthor from git to get the latest house specification fix (Dec 2022)
 # The PyPI version 0.0.1.dev2 has incompatible material format with newer AI2-THOR builds
-procthor @ git+https://github.com/allenai/procthor.git
+procthor @ git+https://github.com/allenai/procthor.git@<commit-sha-from-dec-2022>
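
The pinning concern raised for both requirements files could be checked mechanically. The following is a small sketch, assuming requirement lines use the `name @ git+URL` direct-reference form with spaces around `@`; the function names are hypothetical and the `abc123` ref in the usage note is a placeholder, not a real commit.

```python
from __future__ import annotations


def has_git_ref(url: str) -> bool:
    """Return True if a git+ URL names a ref after the repository path."""
    # Drop the scheme and host first so an '@' in user info (git@host)
    # is not mistaken for a ref.
    path = url.split("://", 1)[-1].split("/", 1)[-1]
    return "@" in path


def unpinned_git_requirements(lines: list[str]) -> list[str]:
    """Return the names of git requirements that track the default branch."""
    flagged = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # ignore comments
        name, sep, url = line.partition(" @ ")
        url = url.strip()
        if sep and url.startswith("git+") and not has_git_ref(url):
            flagged.append(name.strip())
    return flagged
```

Given the unpinned metaworld line quoted in the review, a procthor line ending in `@abc123`, and a plain `scipy` entry, the function flags only `metaworld`.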
environments/README.md (1)
7-18: Add a language identifier to the fenced code block.

Per static analysis (markdownlint MD040), fenced code blocks should have a language specified:
-```
+```text
 environments/
 ├── metaworld/          # MuJoCo-based manipulation tasks
environments/procthor/env.py (1)
143-147: Consider documenting the **kwargs parameter for forward compatibility.

The unused kwargs parameter (flagged by Ruff ARG002) appears intentional for API extensibility. A brief docstring note would clarify this:
         timeout: int = 300,
         **kwargs,
     ) -> dict:
         """
         Evaluate a miner's policy on a ProcTHOR task.
+
+        Note: **kwargs is accepted for forward compatibility with future parameters.
environments/metaworld/env.py (4)
97-109: Use explicit | None type annotations for optional parameters.

Ruff flags implicit Optional types (RUF013). PEP 484 discourages implicit Optional; the explicit T | None syntax (PEP 604, Python 3.10+) makes the intent clear.
✨ Proposed fix for type hints
     async def evaluate(
         self,
         task_id: int,
-        seed: int = None,
-        model: str = None,
-        base_url: str = None,
+        seed: int | None = None,
+        model: str | None = None,
+        base_url: str | None = None,
         env_id: str = "metaworld/pick-place-v3",
         max_timesteps: int = 500,
         action_timeout: float = 0.5,
         use_images: bool = True,
         timeout: int = 300,
         **kwargs,
     ) -> dict:
284-291: Prefix unused info variable with underscore.

The info variable from env.step() is unpacked but never used. Prefix it with an underscore to indicate it's intentionally unused.
✨ Proposed fix
-            obs, reward, done, info = env.step(action)
+            obs, reward, done, _info = env.step(action)
315-323: Use explicit | None for extra parameter type hint.

Same Ruff RUF013 issue: the extra parameter defaults to None but lacks an explicit | None annotation.
✨ Proposed fix
     def _build_error_result(
         self,
         env_id: str,
         task_id: int,
         seed: int,
         start_time: float,
         error: str,
-        extra: dict = None,
+        extra: dict | None = None,
     ) -> dict:
341-351: Consider logging exceptions during cleanup instead of silently swallowing them.

The try-except-pass pattern can hide issues. While cleanup shouldn't propagate exceptions, logging at debug/warning level aids troubleshooting.
🔧 Proposed fix to log cleanup errors
     async def cleanup(self):
         """Cleanup resources."""
         # HTTP clients are now created per-request, no need to close here

         # Close environments
         for env in self._env_cache.values():
             try:
                 env.close()
-            except Exception:
-                pass
+            except Exception as e:
+                logger.debug("Error closing environment", error=str(e))
         self._env_cache.clear()
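
A self-contained version of this cleanup pattern, using the standard library logger, might look like the sketch below. The proposed fix passes `error=str(e)` as a keyword argument, which suits structlog-style loggers; whether the project uses one is not shown here, and the stdlib `logging.Logger.debug` would reject that keyword, so this sketch uses `exc_info=True` instead. The minimal `Actor` class is a stand-in for the real one.

```python
from __future__ import annotations

import asyncio
import logging

logger = logging.getLogger(__name__)


class Actor:
    """Minimal stand-in holding only the environment cache."""

    def __init__(self) -> None:
        self._env_cache: dict[str, object] = {}

    async def cleanup(self) -> None:
        """Close cached environments, logging failures instead of hiding them."""
        for env_id, env in self._env_cache.items():
            try:
                env.close()
            except Exception:
                # exc_info=True attaches the traceback to the log record.
                logger.debug("Error closing environment %s", env_id, exc_info=True)
        self._env_cache.clear()
```

A failing `close()` on one environment does not stop the others from being closed, and the cache is emptied either way.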
environments/metaworld/Dockerfile (1)
27-48: Consider whether git is needed at runtime.

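The git package is installed along with rendering dependencies. If it's only needed during pip install for VCS-based dependencies, you could potentially remove it after pip install to reduce image size. However, if metaworld or other packages need it at runtime, this is fine. Note that purging it in a later RUN layer than the one that installed it does not shrink the image, because the earlier layer still contains the files; the apt install, pip install, and purge would have to share a single RUN instruction, or the build would need multiple stages.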

If git is only a build-time dependency:
✨ Optional: Remove git after pip install
 RUN apt-get update && apt-get install -y --no-install-recommends \
     git \
     # OpenGL/Mesa libraries for MuJoCo rendering
     ...
     && rm -rf /var/lib/apt/lists/*

 # Copy requirements and install dependencies
 COPY requirements.txt /app/
-RUN pip install --no-cache-dir -r /app/requirements.txt
+RUN pip install --no-cache-dir -r /app/requirements.txt \
+    && apt-get purge -y git \
+    && apt-get autoremove -y

- MetaWorld actor now only lists metaworld/* environments
- ProcTHOR actor now only lists procthor/* environments
- Pin metaworld dependency to v3.0.0 for reproducible builds

Addresses review feedback from CodeRabbit

- Add FAMILY_METADATA to registry with display names and descriptions
- Add get_family_metadata() function to registry
- Refactor list-envs CLI to use registry functions instead of hardcoding
- Derives family info from single source of truth in registry

- Update README.md with new build-env command and ProcTHOR environments
- Update backend-guide.md with per-family image configuration
- Update environments/README.md with accurate image sizes
- Replace deprecated build-eval-env references with build-env

coderabbitai bot reviewed Jan 29, 2026

Shr1ftyy added 2 commits January 29, 2026 14:48

fix: filter list_environments to return only environment-specific IDs

e6eaaa1

Shr1ftyy force-pushed the feat/seperate-env-images branch from 90e644b to 48fe38b on January 29, 2026 22:52

Shr1ftyy requested a review from rishiad January 29, 2026 23:03
