Skip to content

mv sim to tile and add ssz support to simulator#415

Merged
0w3n-d merged 3 commits intodevelopfrom
od/refactor_sim
Mar 13, 2026
Merged

mv sim to tile and add ssz support to simulator#415
0w3n-d merged 3 commits intodevelopfrom
od/refactor_sim

Conversation

@0w3n-d
Copy link
Collaborator

@0w3n-d 0w3n-d commented Mar 11, 2026

devin-ai-integration[bot]

This comment was marked as resolved.

@0w3n-d 0w3n-d requested review from ninaiiad and vladimir-ea March 11, 2026 09:36
pub version: SubmissionVersion,
pub withdrawals_root: B256,
pub trace: SubmissionTrace,
/// Decompressed bytes to forward verbatim to the simulator, avoiding re-encoding.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it important that these are decompressed already? also, titan sends bids without compression so for them we can save a reference into the dcache here instead of copying out the bytes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - in the case where the bytes can be forwarded as is, a reference to the network bytes can be used?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's unfortunate that some submissions are compressed. I did initially have it send the network bytes direct to the simulator but then later changed my mind. The simulator had to perform the same full decode process as the relay which just felt like too big a change for one PR. It's still my goal though.

@vladimir-ea
Copy link
Contributor

Looks good - a comment on the overall design: could we not just forward all received bytes to the simulator nodes?
i.e. round robin bytes from the dcache to the sim nodes (and run a decoder tile on the sim nodes?) and then handle responses only in the relay itself. This would potentially offload some processing?

Maybe as a future iteration of this?

@0w3n-d
Copy link
Collaborator Author

0w3n-d commented Mar 11, 2026

Looks good - a comment on the overall design: could we not just forward all received bytes to the simulator nodes? i.e. round robin bytes from the dcache to the sim nodes (and run a decoder tile on the sim nodes?) and then handle responses only in the relay itself. This would potentially offload some processing?

Maybe as a future iteration of this?

Yeah that's what I initially implemented last week but then changed my mind. It just felt like too big a change to do in one go. I wanted to make sure that the existing path would still work as is with current infra. I want to get this merged and deployed. Test it. And move the remaining geths to reths in production and so on before rolling out full change.

Copy link
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 12 additional findings in Devin Review.

Open in Devin Review

Comment on lines +62 to +65
match api.validate_builder_submission_v5(ext).await {
Ok(()) => StatusCode::OK.into_response(),
Err(e) => (StatusCode::BAD_REQUEST, e.message().to_string()).into_response(),
}
Copy link
Contributor

@devin-ai-integration devin-ai-integration bot Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 SSZ server returns HTTP 400 for internal simulator errors, preventing temporary error classification

The SSZ validation server handler returns StatusCode::BAD_REQUEST (400) for all errors from validate_builder_submission_v5, including internal server errors like ProviderError, GetParentError, etc. On the relay client side (crates/relay/src/simulator/client.rs:126-131), HTTP 400 maps to BlockSimError::BlockValidationFailed, which is is_demotable() == true. In contrast, the _ => branch maps any non-200/400/424 status (e.g., 500) to BlockSimError::RpcError, which is is_temporary() == true and thus not demotable. This means internal simulator errors that should be transient (and should NOT trigger builder demotion) will instead be treated as permanent validation failures. While the old JSON-RPC path had a similar limitation (JSON-RPC errors are all in HTTP 200 bodies), the SSZ path has the HTTP status code mechanism available and should use it to properly classify errors.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

}

fn handle_sim_request(&mut self, req: crate::simulator::ValidationRequest, fast_track: bool) {
assert_eq!(req.bid_slot, self.last_bid_slot);
Copy link
Collaborator

@ninaiiad ninaiiad Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be an assert? maybe just an error log? unless we expect this to have been filtered out earlier?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this came from the sim manager. I think it's never supposed to happen. Should be caught earlier unless there's a critical bug in the code.

Having said that I think it has happened on mainnet and we have a ticket open to look into it at some point.

@0w3n-d 0w3n-d merged commit df426cf into develop Mar 13, 2026
2 of 3 checks passed
@0w3n-d 0w3n-d deleted the od/refactor_sim branch March 13, 2026 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants