feat: add Rust tee-launcher crate #2621

Merged
netrome merged 12 commits into main from barak/rust-launcher-crate
Mar 27, 2026

Conversation

@barakeinav1
Contributor

@barakeinav1 barakeinav1 commented Mar 26, 2026

Summary

  • Add crates/tee-launcher/ — Rust binary that replaces the Python launcher for running MPC nodes in TDX CVMs
  • Validates MPC image hashes against the contract's approved list
  • Extends RTMR3 with the validated image digest
  • Writes MPC node config to shared volume and launches via Docker Compose
  • Pin reqwest to 0.12 with bundled webpki-roots for reproducible builds
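The hash validation step above can be illustrated with a small typed-digest sketch. This is not the crate's code: `Sha256Digest`, `DigestError`, and `is_approved` are hypothetical names, and the accepted format (`sha256:` followed by 64 lowercase hex characters) is an assumption about what a validated digest looks like.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for a typed digest; the crate's real type may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Sha256Digest(String);

/// Reasons a string is rejected as a digest (names are illustrative).
#[derive(Debug, PartialEq, Eq)]
pub enum DigestError {
    MissingPrefix,
    BadLength(usize),
    NonHex,
}

impl FromStr for Sha256Digest {
    type Err = DigestError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumed format: "sha256:" followed by exactly 64 lowercase hex characters.
        let hex = s.strip_prefix("sha256:").ok_or(DigestError::MissingPrefix)?;
        if hex.len() != 64 {
            return Err(DigestError::BadLength(hex.len()));
        }
        if !hex.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f')) {
            return Err(DigestError::NonHex);
        }
        Ok(Sha256Digest(s.to_string()))
    }
}

impl fmt::Display for Sha256Digest {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

/// Returns true only if `candidate` appears in the approved list.
pub fn is_approved(candidate: &Sha256Digest, approved: &[Sha256Digest]) -> bool {
    approved.contains(candidate)
}
```

Parsing into a dedicated type means any value that reaches the allow-list check has already passed the format check, so malformed input is rejected before it is compared or substituted anywhere.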

The Python launcher in tee_launcher/ is preserved and will be removed separately in #2615.
Deployment configs, CI, test assets, and localnet scripts will follow in separate PRs.

Closes #2622
Split from #2618.

Test plan

  • cargo check --locked -p tee-launcher compiles
  • Deployed 2-node cluster with Rust launcher on localnet — Running state, real Dstack attestation, ECDSA signatures working

@claude

claude bot commented Mar 26, 2026

Code Review

Overall: Well-structured Rust replacement for the Python launcher with good test coverage and security hardening (read-only FS, no-new-privileges, typed digest validation). A few items to address:

Issues

  • expect on untrusted external input (main.rs:475): docker inspect output is parsed with .expect("is valid digest"). If Docker returns unexpected output (empty string, error text, etc.), this panics the launcher instead of returning a recoverable error. This should use .map_err(...) and ? like the rest of the error handling in the function:

    let pulled_image_id: DockerSha256Digest = String::from_utf8_lossy(&inspect.stdout)
        .trim()
        .to_string()
        .parse()
        .map_err(|_| ImageDigestValidationFailed::DockerInspectFailed(
            format!("unexpected image ID from docker inspect: {}", String::from_utf8_lossy(&inspect.stdout).trim())
        ))?;
  • image_name is user-controlled and injected into YAML via string replacement (main.rs:507): render_compose_file does template.replace("{{IMAGE_NAME}}", image_name), where image_name comes from the user config TOML. A malicious image_name containing YAML syntax (e.g., newlines + extra keys) could alter the compose file semantics. Unlike DockerSha256Digest, which is validated, image_name is a plain String. Consider validating that it matches a Docker image name regex (alphanumeric, /, -, ., _) or at minimum rejecting values containing newlines.

  • DockerTokenResponse derives Debug + Serialize (docker_types.rs:5-8) — The token field contains a bearer auth token. While short-lived, the Debug impl means any accidental tracing::debug!(?token_response, ...) would leak it to logs. Consider a manual Debug impl that redacts the token, or wrapping it in a Secret<String> type.
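A manual Debug impl that redacts the token could look roughly like the sketch below. `TokenResponse` is a hypothetical stand-in with a single assumed `token` field; the real DockerTokenResponse may carry more fields and derives that are not shown here.

```rust
use std::fmt;

/// Hypothetical shape of a registry token response; the field layout is assumed.
pub struct TokenResponse {
    pub token: String,
}

impl fmt::Debug for TokenResponse {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print a fixed placeholder so the bearer token never reaches log output.
        f.debug_struct("TokenResponse")
            .field("token", &"[REDACTED]")
            .finish()
    }
}
```

With this impl, formatting the value with `{:?}` shows the struct name and field name but never the token itself. A wrapper type for secrets would give the same protection to every struct that holds one, at the cost of an extra dependency or helper type.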

Minor / Non-blocking

  • EXTERNAL_MPC_FUTURE_PORT is referenced in deploy-launcher.sh:225 and in the port check list but not defined in default.env. This will cause deployment failures for anyone using the default env file.

✅ Approved — the expect on docker output is the only item I'd consider blocking; the rest are hardening suggestions.

@barakeinav1 barakeinav1 force-pushed the barak/rust-launcher-crate branch from bd8a122 to 32d01bb Compare March 26, 2026 10:38
Add a Rust implementation of the TEE launcher for running MPC nodes
inside TDX CVMs. The launcher validates MPC image hashes against the
contract's approved list, extends RTMR3, writes MPC node config to
shared volume, and launches the node via Docker Compose.

Pin reqwest to 0.12 with bundled webpki-roots for reproducible builds
(reqwest 0.13 uses rustls-platform-verifier which requires system CA
certs not present in the minimal launcher Docker container).

Deployment configs, CI, test assets, and scripts will follow in
separate PRs.
@barakeinav1 barakeinav1 force-pushed the barak/rust-launcher-crate branch from 32d01bb to 5201ba0 Compare March 26, 2026 10:38
@barakeinav1 barakeinav1 changed the title from "feat: add Rust tee-launcher crate and deployment configs" to "feat: add Rust tee-launcher crate" Mar 26, 2026
@barakeinav1 barakeinav1 requested a review from Copilot March 26, 2026 11:03
Contributor

Copilot AI left a comment

Pull request overview

Adds a new Rust-based tee-launcher binary crate intended to replace the legacy Python TEE launcher for running MPC nodes in TDX CVMs, including image-hash validation and TEE attestation event emission.

Changes:

  • Introduces crates/tee-launcher/ with CLI/env parsing, TOML config loading, registry manifest resolution, digest validation, and docker-compose launch logic.
  • Adds templates for non-TEE and TEE docker-compose manifests rendered at runtime.
  • Registers the crate in the workspace and locks dependencies (including a pinned reqwest 0.12 + rustls-tls).

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 11 comments.

| File | Description |
|------|-------------|
| `crates/tee-launcher/src/main.rs` | Core launcher flow: config load, hash selection/validation, RTMR3 event emission, docker-compose rendering and launch |
| `crates/tee-launcher/src/types.rs` | CLI + TOML config types and unit tests |
| `crates/tee-launcher/src/error.rs` | Error types for launcher and digest validation |
| `crates/tee-launcher/src/docker_types.rs` | Registry API response types and unit tests |
| `crates/tee-launcher/src/constants.rs` | Runtime file/socket/container constants |
| `crates/tee-launcher/mpc-node-docker-compose*.template.yml` | docker-compose templates for TEE vs non-TEE |
| `crates/tee-launcher/README.md` | Operator/developer documentation for config and usage |
| `crates/tee-launcher/Cargo.toml` | New crate manifest (incl. pinned reqwest + feature gate for external tests) |
| `Cargo.toml` | Adds crates/tee-launcher to workspace members |
| `Cargo.lock` | Adds lock entries for the new crate and its dependency graph |

@barakeinav1 barakeinav1 force-pushed the barak/rust-launcher-crate branch from 6fd38b3 to 45f12c8 Compare March 26, 2026 12:02
- Only fall back to DEFAULT_IMAGE_DIGEST on NotFound; return error on
  other I/O failures (permission denied, disk errors) since they could
  indicate a security issue
- Replace .expect() with error return on docker inspect parse failure
- Add timeout to registry auth token request
- Only retry transient errors (timeouts, connect, 5xx); fail fast on 4xx
- Fix error messages: "docker run" → "docker compose up"
- Fix README: [mpc_config] → [mpc_node_config], wrong feature flag,
  mark fields as required (no serde defaults in code)
- Fix field doc comments: remove stale env var references
- Fix test comment: mpc_config → mpc_node_config
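The first item in this list (fall back only on NotFound) can be sketched with the standard library alone. The function name `read_digest_or_default`, its signature, and the idea that the digest is read from a plain text file are assumptions for illustration, not the launcher's actual code.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads a digest from `path`, falling back to `default_digest` only when the
/// file does not exist. Any other I/O error is returned to the caller.
pub fn read_digest_or_default(path: &Path, default_digest: &str) -> io::Result<String> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(contents.trim().to_string()),
        // A missing file is the one case where the default is acceptable.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(default_digest.to_string()),
        // Permission or disk errors are surfaced instead of being masked by the default.
        Err(err) => Err(err),
    }
}
```

Matching on `io::ErrorKind::NotFound` keeps the fallback narrow: a file that exists but cannot be read produces an error rather than silently selecting the default digest.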
@barakeinav1 barakeinav1 force-pushed the barak/rust-launcher-crate branch from 45f12c8 to 522758a Compare March 26, 2026 12:03
Contributor

@gilcu3 gilcu3 left a comment

Shallow first pass of reviews.

A few of the comments might be blockers, for example the use of reqwest 0.12.

Comment on lines +21 to +26
# Pin reqwest 0.12 with bundled webpki-roots for reproducible builds.
# The workspace uses reqwest 0.13 which defaults to rustls-platform-verifier
# (loads CA certs from the system), but the launcher Docker image is a minimal
# container without system CA certs. Using rustls-tls bundles Mozilla's root
# certs into the binary, making TLS work without any system dependencies.
reqwest = { version = "0.12.28", default-features = false, features = ["rustls-tls", "json"] }
Contributor

I don't see why depending on system TLS is a problem for reproducibility. All TLS certificates are eventually rejected anyway. Therefore I think we must use 0.13 (recently upgraded in the node), with some feature change if needed.

Contributor Author

When I used 0.13, I got a panic at launcher startup:

  thread 'main' panicked at
  /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/reqwest-0.13.2/src/async_impl/client.rs:2478:38:
  Client::new(): reqwest::Error { kind: Builder, source: General("No CA certificates were loaded from the system") }

This happened because reqwest 0.13's default TLS backend (rustls-platform-verifier) calls into the OS to find CA
certificates. Inside the minimal Docker container there are none, so it panics when trying to construct the HTTP
client — before it even makes any request.

If I add the ca-certificates package, its contents may vary across builds, so I'm not sure we can still support reproducible builds.

Contributor

I believe we can achieve reproducibility with CA certificates in the OS; we do it for the mpc-node, for example. Also, we don't need to use reqwest's default TLS backend, since it offers several.

Contributor Author

I'm not sure about this; I will need to look into it a bit more. Is it OK with you if I create a follow-up issue for this?

Contributor

I will give it a try tomorrow and merge here if I manage to fix this

Contributor

Just pushed a fix for this in d41dc42. It includes a new Dockerfile which should make this solution testable; let me know if it works.

- Validate image_name contains only safe characters [a-zA-Z0-9/_.-]
  to prevent YAML injection when substituted into compose templates
- Redact bearer token in DockerTokenResponse Debug impl to prevent
  accidental leakage to logs
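The character-set check described in this commit could be sketched as follows. The allowed set is taken from the commit message; the function name `is_safe_image_name` and the rejection of empty input are assumptions, and the crate's actual validation may be stricter.

```rust
/// Accepts only ASCII letters, digits, and `/`, `_`, `.`, `-`.
/// Rejecting the empty string is an extra assumption of this sketch.
pub fn is_safe_image_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'/' | b'_' | b'.' | b'-'))
}
```

Because newlines, colons, spaces, and quotes all fall outside the allowed set, a value that passes this check cannot introduce new YAML keys when substituted into a compose template.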
@barakeinav1 barakeinav1 requested a review from DSharifi March 26, 2026 16:08
- Move compose templates to assets/ folder
- Match specific errors in tests instead of broad Err(_)
- Use test-release profile in README test commands
- Inline platform check instead of intermediate variable (#10)
- Atomic file write via temp file + rename for config (#11)
- Fix duplicate doc comment on insert_reserved (#12)
- Extract Docker auth URL to constant (#13)
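The atomic write mentioned in item #11 follows a common pattern: write to a temporary file in the same directory, then rename it over the target. The sketch below uses only the standard library; `write_atomically` and the `.tmp` suffix are hypothetical choices, not necessarily what the launcher does.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `contents` to `path` by writing a sibling temp file first and then
/// renaming it into place, so readers never observe a half-written file.
pub fn write_atomically(path: &Path, contents: &[u8]) -> io::Result<()> {
    // Keep the temp file in the same directory so the rename stays on one filesystem.
    let mut tmp_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = path.with_file_name(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(contents)?;
    // Flush file data to disk before the rename exposes it under the final name.
    tmp.sync_all()?;
    drop(tmp);

    fs::rename(&tmp_path, path)
}
```

A process that reads the config concurrently sees either the old file or the complete new one, since the rename replaces the target in a single step on the same filesystem.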
@barakeinav1 barakeinav1 force-pushed the barak/rust-launcher-crate branch from 84514ff to 6befaf0 Compare March 26, 2026 17:36
gilcu3 previously approved these changes Mar 27, 2026
Contributor

@gilcu3 gilcu3 left a comment

LGTM, left some nits

| Variable | Required | Description |
|----------|----------|-------------|
| `PLATFORM` | Yes | `TEE` or `NONTEE` |
| `DOCKER_CONTENT_TRUST` | Yes | Must be `1` |
Contributor

This one is a bit weird. Why does the user need to set it if it is always 1? The others are user choices; this one is a requirement, so technically it should be in the section where we explain how to run this.

Contributor Author

Maybe it is not explained well enough: those are values the launcher sets, not the user.

Contributor

Ahh, I thought the user sets them as well. For example, PLATFORM is user-set, no?

Contributor Author

Or rather: DOCKER_CONTENT_TRUST should not be touched, but PLATFORM and DEFAULT_IMAGE_DIGEST can be changed.

Contributor

this is an example Dockerfile I added that should contain everything needed and be reproducible.

Comment on lines +1294 to +1297
// the expected config digest. On native Linux, `.ID` returns the config digest
// (sha256 of the image config blob), but on macOS, Docker Desktop's containerd
// image store returns the manifest digest instead, causing a spurious mismatch.
#[cfg(target_os = "linux")]
Contributor

indeed, really painful difference

Contributor Author

Hmm, we may need to add this to the operator guide.

netrome previously approved these changes Mar 27, 2026
Collaborator

@netrome netrome left a comment

Shallow review. Mostly nits, created follow-ups so we can tackle these separately.

Comment on lines +538 to +540
let rendered = template
    .replace("{{IMAGE_NAME}}", image_name)
    .replace("{{IMAGE}}", &manifest_digest.to_string())
Collaborator

This feels very hacky. Why don't we use a proper templating engine like Askama? This would give us better type safety when rendering the templates: https://docs.rs/askama/latest/askama/

Collaborator

Created #2630

@gilcu3 gilcu3 dismissed stale reviews from netrome and themself via 7b8fae0 March 27, 2026 10:22
Contributor

@gilcu3 gilcu3 left a comment

LGTM

@netrome netrome added this pull request to the merge queue Mar 27, 2026
Merged via the queue into main with commit c0e44b3 Mar 27, 2026
45 of 46 checks passed
@netrome netrome deleted the barak/rust-launcher-crate branch March 27, 2026 12:06

Development

Successfully merging this pull request may close these issues.

feat: Add Rust tee-launcher crate

4 participants