Add ApptainerWorkspace implementation for rootless container support #892
This commit implements ApptainerWorkspace, a container-based workspace that uses Apptainer (formerly Singularity) instead of Docker. This addresses the need for rootless container execution in HPC and shared computing environments where Docker may not be available or permitted.

Key features:
- No root privileges required for container execution
- Converts Docker images to Apptainer SIF format with caching
- Full RemoteWorkspace API compatibility
- Automatic port management and health checking
- Support for directory mounting and environment forwarding
- Comprehensive documentation and examples

Files added:
- openhands-workspace/openhands/workspace/apptainer/workspace.py (implementation)
- openhands-workspace/openhands/workspace/apptainer/__init__.py (module init)
- openhands-workspace/openhands/workspace/apptainer/README.md (documentation)
- examples/02_remote_agent_server/05_convo_with_apptainer_sandboxed_server.py (usage example)
- tests/workspace/test_apptainer_workspace.py (test suite)
- APPTAINER_WORKSPACE_TEST_LOG.md (test results and validation)

Files modified:
- openhands-workspace/openhands/workspace/__init__.py (export ApptainerWorkspace)

Closes #891

Co-authored-by: openhands <openhands@all-hands.dev>
The ApptainerWorkspace implementation could not be tested end-to-end in the development environment because Apptainer is not installed. This commit adds transparency about testing limitations and provides clear guidance for users who want to test the implementation themselves.

Changes:
- Updated APPTAINER_WORKSPACE_TEST_LOG.md to explicitly state testing limitations
- Added clear distinction between what was tested (code structure, types, API) and what requires Apptainer (runtime execution)
- Added testing instructions to README.md for users with Apptainer installed
- Clarified that validation focused on code correctness rather than runtime behavior

This ensures users understand the implementation is structurally sound and type-correct, but requires Apptainer installation for full validation.

Co-authored-by: openhands <openhands@all-hands.dev>
- Remove Docker dependency from _prepare_sif_image()
- Use 'apptainer pull docker://image' instead of 'apptainer build ... docker-daemon://image'
- This eliminates the need for the Docker daemon, which is the main value of Apptainer
- Remove unused imports (build, BuildOptions)
- Add comprehensive test demonstrating Apptainer functionality
- Successfully tested image pull and container execution
- Document testing results and limitations

Co-authored-by: openhands <openhands@all-hands.dev>
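The pull-and-cache step described in this commit could look roughly like the sketch below. It is not the PR's `_prepare_sif_image()`; the helper names `sif_cache_path`, `build_pull_command`, and `plan_sif_image` are hypothetical, and the file-naming rule (replace `/` and `:` with `_`) is only an assumption inferred from the cached path shown in the PR description. The only real CLI syntax used is `apptainer pull <output.sif> docker://<image>`.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def sif_cache_path(image: str, cache_dir: Path) -> Path:
    """Map a Docker image reference to a SIF file path (assumed naming rule)."""
    safe_name = image.replace("/", "_").replace(":", "_")
    return cache_dir / f"{safe_name}.sif"


def build_pull_command(image: str, sif_path: Path) -> List[str]:
    """Build the argv that pulls from a registry without a Docker daemon."""
    return ["apptainer", "pull", str(sif_path), f"docker://{image}"]


def plan_sif_image(image: str, cache_dir: Path) -> Tuple[Path, Optional[List[str]]]:
    """Return the SIF path plus the pull command, or None if already cached."""
    sif_path = sif_cache_path(image, cache_dir)
    if sif_path.exists():
        return sif_path, None  # cache hit: nothing to pull
    return sif_path, build_pull_command(image, sif_path)
```

A caller would pass the returned command to `subprocess.run(cmd, check=True)` only when it is not `None`, so repeated runs reuse the cached SIF file.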
Co-authored-by: openhands <openhands@all-hands.dev>
- Switch ApptainerWorkspace from instance mode to exec mode for better compatibility
- Fix RemoteWorkspace to include API key in default HTTP client headers
- Add authentication support via SESSION_API_KEY environment variable
- Include demo log showing successful Apptainer workspace operation

Co-authored-by: openhands <openhands@all-hands.dev>
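As a rough illustration of exec mode, the sketch below assembles an `apptainer exec` command line from a SIF file, environment variables, and bind mounts. The function `build_exec_command` and the server command in the usage note are hypothetical and not taken from the PR; the flags `--fakeroot`, `--env`, and `--bind` are standard `apptainer exec` options.

```python
from typing import Dict, List, Optional


def build_exec_command(
    sif_path: str,
    server_command: List[str],
    env: Optional[Dict[str, str]] = None,
    binds: Optional[Dict[str, str]] = None,
    use_fakeroot: bool = False,
) -> List[str]:
    """Assemble an `apptainer exec` argv (illustrative, not the PR's code)."""
    cmd = ["apptainer", "exec"]
    if use_fakeroot:
        cmd.append("--fakeroot")
    # Forward selected environment variables into the container.
    for key, value in (env or {}).items():
        cmd += ["--env", f"{key}={value}"]
    # Mount host directories at the given container paths.
    for host_dir, container_dir in (binds or {}).items():
        cmd += ["--bind", f"{host_dir}:{container_dir}"]
    cmd.append(sif_path)
    cmd += server_command
    return cmd
```

Passing the result to `subprocess.Popen` starts the server as an ordinary child process, which is why this mode needs no instance daemon; a placeholder such as `["python", "-m", "my_server", "--port", "8010"]` stands in for the real server command here.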
Keep only the essential implementation and demo log as requested in issue. Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
- Fix missing dependency that caused import errors for openhands.agent_server modules
- Add assertion for cache_dir to help type checking
- This allows ApptainerWorkspace to correctly import BuildOptions and related classes

Co-authored-by: openhands <openhands@all-hands.dev>
[Automatic Post]: It has been a while since there was any activity on this PR. @neubig, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up. |
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Add comprehensive documentation for ApptainerWorkspace, showing how to run agent servers in rootless Apptainer containers for HPC and shared computing environments.

Includes:
- When to use Apptainer vs Docker
- Configuration options (pre-built image, base image, SIF file)
- Key features and differences from Docker
- Troubleshooting guide

Relates to OpenHands/software-agent-sdk#892
Fixed the The issue was that the docs branch name ( I've:
The check-examples workflow should now pass once GitHub Actions picks up the new branch. You may need to trigger a re-run of the workflow. |
Just to clarify, the docs check is not required for CI to pass. It’s just for humans or agents, to remind us 😅 |
xingyaoww
left a comment
@neubig Looks like we are actually able to setup-apptainer in CI 👀
@OpenHands We have a workflow which runs in CI, but it’s not required by CI for merge. It’s the check-examples workflow. Find it and rename its human-facing title to “[Optional] Docs example / check-examples”. I mean, we want the visible name in CI on GitHub to signal clearly that it’s not a required job (for PR merge). Open a new branch from main and a new PR for this specific task, don’t mess with this PR.
I'm on it! enyst can track my progress at all-hands.dev
Summary of work completed
What I changed
Branch and PR
Quality checks
Checklist against request
No behavioral or logic changes: purely a visible name update to clarify that the job is optional.
@OpenHands set up apptainer in CI and iterate until you have a test that demonstrates that this example passes: https://github.com/marketplace/actions/setup-apptainer
I'm on it! neubig can track my progress at all-hands.dev
- Delete apptainer-tests.yml workflow (tests removed)
- Rename example from 05 to 07 (06 already exists)
- Remove unnecessary _cached_state invalidation in example
- Reuse find_available_tcp_port and check_port_available from docker/workspace.py
- Remove mocked test file (was not testing real functionality)
- Add setup-apptainer to run-examples.yml for running apptainer example

Co-authored-by: openhands <openhands@all-hands.dev>
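For readers unfamiliar with the two port helpers named in this commit, here is a minimal standard-library sketch of what such functions typically do. The names come from the commit message, but the signatures and bodies below are assumptions and may differ from the ones in docker/workspace.py.

```python
import socket


def check_port_available(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def find_available_tcp_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a free ephemeral port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]
```

Both helpers release the socket before returning, so another process can still claim the port before the server binds it; callers should be prepared to retry.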
Addressed all review comments from @xingyaoww:
🔄 Running Examples with
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 1m 7s | $0.12 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 16.1s | $0.02 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 8.5s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 25.6s | $0.03 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 11.3s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 20.6s | $0.02 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 25.6s | $0.03 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 14.6s | $0.01 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 15.9s | $0.01 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 2m 29s | $0.33 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 15.5s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 21.2s | $0.01 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 15.1s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 14.6s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 9.0s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 15.1s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 48.2s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 6m 46s | $0.50 |
| 01_standalone_sdk/25_agent_delegation.py | ❌ FAIL (exit code 1) | 21.1s | -- |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 15.9s | $0.02 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 31.3s | $0.03 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 35.4s | $0.03 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 8.2s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 5m 7s | $0.39 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 14.2s | $0.02 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 1m 2s | $0.06 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 1m 5s | $0.05 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 1m 12s | $0.07 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ❌ FAIL (exit code 1) | 5m 9s | -- |
| 02_remote_agent_server/06_convo_with_cloud_workspace.py | ✅ PASS | 23.2s | $0.02 |
| 02_remote_agent_server/07_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 2m 5s | $0.03 |
❌ Some tests failed
Total: 31 | Passed: 29 | Failed: 2 | Total Cost: $1.92
Failed examples:
- examples/01_standalone_sdk/25_agent_delegation.py: Exit code 1
- examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py: Exit code 1
@OpenHands download the artifact and help me understand why these example tests failed
I'm on it! xingyaoww can track my progress at all-hands.dev
Analysis of Failed Example Tests
I downloaded and analyzed the CI artifact logs from workflow run #20554219653. Here's what I found:
Failed Tests Summary
2 tests failed out of 31 total (29 passed):
1.
xingyaoww
left a comment
LGTM since example test is passing (the failed ones are not related to apptainer)
# Option 2: Build from a base image (more flexible)
with ApptainerWorkspace(
    base_image="nikolaik/python-nodejs:python3.12-nodejs22",
    host_port=8010,
) as workspace:
    result = workspace.execute_command("python --version")
    print(result.stdout)
Actually, I think this is not implemented yet. Shall we remove base_image from ApptainerWorkspace?
Please refer to the difference between DockerWorkspace and DockerDevWorkspace. I think we can simplify ApptainerWorkspace to be an equivalent of DockerWorkspace that can only launch Docker images that have already been pre-built, since it currently doesn't contain logic to build Docker images.
@OpenHands check xingyaoww's comments and reflect them. Then make sure that all of the examples in the docs actually run by installing apptainer and running them yourself. Save the logs to a file and tell me the locations of the logs so I can verify your work.
I'm on it! neubig can track my progress at all-hands.dev
Address xingyaoww's review comments:
- Remove base_image field (ApptainerWorkspace now only supports pre-built images)
- Remove target field (TargetType) as it's no longer needed
- Update import to use PlatformType from sdk.workspace
- Simplify _validate_images to check only server_image or sif_file
- Simplify _prepare_sif_image to only handle server_image
- Update class docstring to match DockerWorkspace pattern
- Update README.md to remove base_image documentation and examples
- Update example file comment to remove base_image reference

Co-authored-by: openhands <openhands@all-hands.dev>
Summary
I've addressed xingyaoww's review comments on PR #892 and pushed the changes. Here's what was done:
Changes Made (commit 812df71)
1. Simplified ApptainerWorkspace (workspace.py)
2. Updated README.md
3. Updated Example File (07_convo_with_apptainer_sandboxed_server.py)
Testing
Log Files for Verification
The following log files have been saved:
The changes have been pushed to the
- Add note that ApptainerWorkspace only works with pre-built images (like DockerWorkspace)
- Update section headers to 'Option 1' and 'Option 2' for clarity
- Add use_fakeroot to configuration options table

Co-authored-by: openhands <openhands@all-hands.dev>
Move mutual exclusivity check from @model_validator to model_post_init because model_post_init runs before model_validator in Pydantic. This was causing confusing errors when both server_image and sif_file were set - instead of getting a clear 'exactly one must be set' error, users would get 'SIF file not found' error. Tested end-to-end with an existing SIF file to confirm the sif_file option works correctly. Co-authored-by: openhands <openhands@all-hands.dev>
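The ordering problem described in this commit can be shown without Pydantic. The sketch below uses a plain dataclass, with `__post_init__` standing in for whichever hook runs first; the class name `ImageSource` and the error messages are hypothetical. What matters is that the "exactly one must be set" check runs before any filesystem check, so a misconfiguration is reported as such rather than as a missing file.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ImageSource:
    """Illustrative stand-in for the server_image / sif_file pair."""

    server_image: Optional[str] = None
    sif_file: Optional[str] = None

    def __post_init__(self) -> None:
        # Step 1: mutual exclusivity. Both set or both unset is a config error.
        if (self.server_image is None) == (self.sif_file is None):
            raise ValueError("Exactly one of server_image or sif_file must be set")
        # Step 2: only now inspect the filesystem.
        if self.sif_file is not None and not Path(self.sif_file).is_file():
            raise FileNotFoundError(f"SIF file not found: {self.sif_file}")
```

With the checks in the opposite order, passing both fields with a non-existent SIF path would raise the "not found" error first, which is the confusing behavior the commit reports.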
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment. Feel free to include any additional details that might help me get this PR into a better state.
HUMAN: I had openhands test this out and it reports the following: 1.
|
* docs: Add Apptainer sandbox documentation

  Add comprehensive documentation for ApptainerWorkspace, showing how to run agent servers in rootless Apptainer containers for HPC and shared computing environments.

  Includes:
  - When to use Apptainer vs Docker
  - Configuration options (pre-built image, base image, SIF file)
  - Key features and differences from Docker
  - Troubleshooting guide

  Relates to OpenHands/software-agent-sdk#892

* Update example file path from 05 to 07

  The example file was renamed in the SDK repo to 07_convo_with_apptainer_sandboxed_server.py since example 06 already exists.

  Co-authored-by: openhands <openhands@all-hands.dev>

* sync: update code blocks from agent-sdk

  Co-authored-by: openhands <openhands@all-hands.dev>

* Update sdk/guides/agent-server/apptainer-sandbox.mdx

---------

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: enyst <engel.nyst@gmail.com>
HUMAN: this has been tested
Description
This PR implements `ApptainerWorkspace`, a container-based workspace that uses Apptainer (formerly Singularity) instead of Docker. This addresses the need for rootless container execution in HPC and shared computing environments where Docker may not be available or permitted.

✨ Critical Bug Fix (2025-10-24): Discovered and fixed a bug where the initial implementation incorrectly used `apptainer build ... docker-daemon://image`, which required Docker to be running. This defeated the entire purpose of Apptainer! The fix changes to `apptainer pull docker://image`, which pulls directly from Docker registries without needing the Docker daemon. This is the key feature that makes Apptainer valuable.

✨ Additional Fixes (2025-10-24):
- Switched to `apptainer exec` for better compatibility in environments without systemd/FUSE
- Discovers `SESSION_API_KEY` from environment and passes it to RemoteWorkspace

Fixes #891
Key Features
- `apptainer pull`

Implementation Details
Files Added
- `openhands-workspace/openhands/workspace/apptainer/workspace.py` (378 lines)
  - `ApptainerWorkspace` class implementation
  - `apptainer pull` (no Docker required!)
  - `apptainer exec`
- `openhands-workspace/openhands/workspace/apptainer/__init__.py`
- `openhands-workspace/openhands/workspace/apptainer/README.md`
- `examples/02_remote_agent_server/05_convo_with_apptainer_sandboxed_server.py`
- `tests/workspace/test_apptainer_workspace.py`
- `apptainer_workspace_demo.log`

Files Modified
- `openhands-workspace/openhands/workspace/__init__.py`
  - Adds `ApptainerWorkspace` to exports
- `openhands-workspace/openhands/workspace/apptainer/workspace.py`
  - Uses `apptainer pull` instead of the Docker daemon
  - Uses `apptainer exec` instead of instance mode for better compatibility
- `openhands-sdk/openhands/sdk/workspace/remote/base.py`

Usage
Option 1: Pre-built Server Image (Recommended for HPC)
Option 2: Build from Base Image (Requires Docker for initial build)
Option 3: Use Existing SIF File
Testing
All tests pass successfully:
$ uv run pytest tests/workspace/test_apptainer_workspace.py -v
tests/workspace/test_apptainer_workspace.py::test_apptainer_workspace_import PASSED [ 33%]
tests/workspace/test_apptainer_workspace.py::test_apptainer_workspace_inheritance PASSED [ 66%]
tests/workspace/test_apptainer_workspace.py::test_apptainer_workspace_field_definitions PASSED [100%]
============================== 3 passed in 0.13s ===============================

End-to-End Testing with Actual Apptainer
Successfully tested the complete example with Apptainer 1.3.5. See `apptainer_workspace_demo.log` for full details:

✅ Image Preparation
- `/root/.apptainer_cache/ghcr.io_openhands_agent-server_main-python.sif`

✅ Container Execution
- `apptainer exec` mode

✅ Command Execution

✅ Authentication

✅ API Endpoints
- `/health` endpoint: ✅ Working
- `/api/bash/start_bash_command`: ✅ Working
- `/api/conversations`: ✅ Working (with auth)
- `/api/conversations/{id}/run`: ✅ Working (with auth)

All pre-commit hooks pass:
Comparison: ApptainerWorkspace vs DockerWorkspace
Prerequisites
Users need to install Apptainer: https://apptainer.org/docs/user/main/quick_start.html
On Ubuntu/Debian:
Or build from source:
Why Apptainer?
As mentioned in issue #891, Docker requires root access which is often not available or permitted in:
Apptainer was specifically designed for these use cases and provides:
Technical Implementation Notes
Exec Mode vs Instance Mode
Initially implemented using Apptainer instance mode (`apptainer instance start`), but discovered this requires systemd and/or FUSE, which may not be available in all environments. Switched to direct execution mode (`apptainer exec`), which:

Authentication Flow
ApptainerWorkspace discovers `SESSION_API_KEY` from the environment and passes it to RemoteWorkspace, which now properly includes it in the HTTP client's default headers. This ensures all API requests (including conversation creation) are properly authenticated.
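The authentication flow above can be sketched with two small helpers. This is not the code in `remote/base.py`: the function names are hypothetical, and the header name `X-Session-API-Key` is an assumption used only for illustration. The idea is that the key is read once from the environment and then attached as a default header, so every request carries it.

```python
import os
from typing import Dict, Mapping, Optional


def discover_api_key(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Read SESSION_API_KEY from the environment; empty values count as unset."""
    return environ.get("SESSION_API_KEY") or None


def default_headers(
    api_key: Optional[str],
    header_name: str = "X-Session-API-Key",  # assumed header name
) -> Dict[str, str]:
    """Build default headers for an HTTP client; empty when no key is set."""
    return {header_name: api_key} if api_key else {}
```

An HTTP client constructed with these default headers sends the key on every call, including conversation creation, without each call site having to add it.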
Demo Log
See `apptainer_workspace_demo.log` for the complete end-to-end test output showing:

Checklist
- `DockerWorkspace`

Next Steps
After merging, users can:
- Use `ApptainerWorkspace` as a drop-in replacement for `DockerWorkspace`

Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- `eclipse-temurin:17-jdk`
- `nikolaik/python-nodejs:python3.12-nodejs22`
- `golang:1.21-bookworm`

Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:1109833-python

Run

All tags pushed for this build

About Multi-Architecture Support
- Each variant tag (e.g., `1109833-python`) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g., `1109833-python-amd64`) are also available if needed