fix: skip pasta port probe during snapshot restore to prevent 0-byte responses #554
Closed
claude-claude[bot] wants to merge 1 commit into fuse-pipe-restore
…responses

During snapshot restore, post_start() runs BEFORE the VM snapshot is loaded into Firecracker. The port forwarding probe in post_start() forces pasta to accept a TCP connection and attempt L2 forwarding to a non-existent guest. This poisoned connection attempt corrupts pasta's internal connection tracking state, causing all subsequent data-bearing connections through pasta to return 0 bytes (TCP connect succeeds but HTTP responses are empty).

The fix adds a restore_mode flag to PastaNetwork that skips the premature port probe in post_start(). Port forwarding is still properly verified later via verify_port_forwarding(), which runs after the VM is resumed and fc-agent has sent its gratuitous ARP.

Root cause: common.rs calls network.post_start() at line 997, then loads the snapshot at line 1033+. The wait_for_port_forwarding() inside post_start() probes pasta before any guest exists, poisoning pasta's L4 translation state.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Owner
Superseded by PR #555, which applies the same fix to main (instead of the fuse-pipe-restore branch) and adds a stress test.
CI Fix
Fixes CI #22648352091
Problem
The clone_http/rootless/nginx benchmark panicked. TCP connections through pasta succeeded but returned 0 bytes. The root cause is a sequencing bug in the snapshot restore path:

1. restore_from_snapshot() in common.rs calls network.post_start() (line 997) before loading the VM snapshot (line 1033+).
2. post_start() calls wait_for_port_forwarding(), which does TcpStream::connect("127.0.0.3:8080").

Because no guest exists yet, pasta accepts the connection and attempts L2 forwarding to a non-existent guest, which corrupts its connection tracking state.

The health monitor doesn't catch this because it uses nsenter + curl via the bridge (L2 path), completely bypassing pasta's L4 translation. So the VM reports "healthy" while pasta's port forwarding is broken.

Solution
Added a restore_mode flag to PastaNetwork that skips the premature port forwarding probe in post_start() during snapshot restore. Port forwarding is still properly verified later via verify_port_forwarding(), which runs after the VM is resumed and fc-agent has sent its gratuitous ARP.

This is the correct ordering: ports should only be probed when a guest actually exists to respond.

The normal VM boot path (podman/vm_config.rs) is unaffected: post_start() still probes ports there because the VM is already running.

Generated by Claude | Fix Run
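The gating described in the Solution section could look roughly like the sketch below. This is an illustration only, not the code from this PR: the struct layout, the constructor, and the probes_run counter are hypothetical, and the real TCP probe is replaced by a counter so the example runs without pasta or a guest.

```rust
// Hypothetical stand-in for the real PastaNetwork. The restore_mode
// field name comes from the PR text; everything else is assumed.
struct PastaNetwork {
    // True when the VM is being restored from a snapshot.
    restore_mode: bool,
    // Illustration only: counts how many times the probe ran.
    probes_run: u32,
}

impl PastaNetwork {
    fn new(restore_mode: bool) -> Self {
        Self { restore_mode, probes_run: 0 }
    }

    // Stand-in for the real probe, which the PR says opens a TCP
    // connection through pasta. Here it only records that it ran.
    fn wait_for_port_forwarding(&mut self) {
        self.probes_run += 1;
    }

    // Runs before the snapshot is loaded on the restore path, so the
    // probe is skipped there: no guest exists yet to answer it.
    fn post_start(&mut self) {
        if self.restore_mode {
            return;
        }
        self.wait_for_port_forwarding();
    }

    // Runs after the VM has resumed, when probing is safe.
    fn verify_port_forwarding(&mut self) {
        self.wait_for_port_forwarding();
    }
}
```

With this shape, a network built with PastaNetwork::new(true) runs no probe in post_start() and exactly one in verify_port_forwarding(), while PastaNetwork::new(false) still probes in post_start() as on the normal boot path.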