Description
Summary
Long exfil transfers can become disproportionately slow compared to downloads, even on loopback. The behavior appears protocol‑level: polls (which are needed to elicit ACKs/data) are only sent when picoquic produces a packet, but pacing/cwnd collapse suppresses sends, starving ACKs and locking the connection into a low‑throughput regime.
Environment
- Slipstream: local build from main
- Mode: direct connection (client points at server as resolver)
- Network: loopback, no netem
Reproduction (direct connection, loopback)
- Start a TCP sink for the server target:
  nc -l 5201 > /dev/null
- Start the slipstream server:
  slipstream-server \
    --dns-listen-port=8853 \
    --target-address=127.0.0.1:5201 \
    --domain=test.com
- Start the slipstream client:
  slipstream-client \
    --congestion-control=bbr \
    --tcp-listen-port=7000 \
    --resolver=127.0.0.1:8853 \
    --domain=test.com
- Exfiltrate 100MB and time it:
  /usr/bin/time -f "elapsed=%e" sh -c \
    "dd if=/dev/zero bs=1M count=100 2>/dev/null | nc 127.0.0.1 7000"
- (Optional) Repeat with 10MB (count=10) for comparison.
Observed
Example 100MB run (loopback):
- end‑to‑end exfil: ~1.65 MiB/s
- end‑to‑end download: ~46.7 MiB/s
Despite zero loss and minimal RTT, exfil remains much slower than download and does not recover over time. Short transfers often complete before the low‑throughput regime is visible.
Expected
Exfil throughput should stay in the same order of magnitude as download for local direct‑connection tests, and should not collapse purely as transfer size increases.
Suspected Cause (protocol‑level)
Polling in slipstream depends on the QUIC stack producing a packet:
- The client requests a poll by setting cnx->is_poll_requested and calling picoquic_prepare_packet_ex.
- In src/slipstream_sockloop.c, if send_length == 0, the poll loop stops.
- In subprojects/picoquic/picoquic/sender.c, poll insertion only happens when pacing allows sending.
This creates a feedback loop under sustained exfil:
- Congestion controller shrinks cwin/pacing_rate.
- picoquic_prepare_packet_ex yields send_length == 0.
- Polls stop (no DNS queries), so ACKs/data are not elicited.
- RTT samples worsen and pacing remains low.
The behavior aligns with docs/protocol.md (poll frames are non‑ACK‑eliciting), but in the DNS tunnel this can suppress the only mechanism that drives ACKs back from the server.
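The loop described above can be illustrated with a toy model. This is a sketch under stated assumptions, not slipstream or picoquic code: toy_cnx_t, toy_prepare_packet and toy_poll_loop are hypothetical names, and the window check is a crude stand-in for the real pacing/cwin logic. It only shows how a blocked send ends the poll loop before any query leaves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical. */
typedef struct {
    uint64_t cwin;             /* congestion window, in bytes */
    uint64_t bytes_in_transit; /* unacknowledged bytes */
    bool     poll_requested;   /* stands in for cnx->is_poll_requested */
} toy_cnx_t;

/* Stand-in for picoquic_prepare_packet_ex: returns the number of bytes
 * produced, or 0 when the window is full and the send is blocked. */
static size_t toy_prepare_packet(toy_cnx_t *c, size_t payload)
{
    if (c->bytes_in_transit + payload > c->cwin) {
        return 0; /* blocked: the requested poll is not emitted either */
    }
    c->bytes_in_transit += payload;
    return payload;
}

/* Stand-in for the poll loop: it stops as soon as send_length == 0.
 * Returns how many polls (DNS queries) actually went out. */
static int toy_poll_loop(toy_cnx_t *c, int polls_wanted, size_t payload)
{
    int sent = 0;
    for (int i = 0; i < polls_wanted; i++) {
        c->poll_requested = true;
        size_t send_length = toy_prepare_packet(c, payload);
        /* Assumed from this issue: the request is dropped whether or
         * not a packet was produced. */
        c->poll_requested = false;
        if (send_length == 0) {
            break; /* no packet, so no query, so no ACK can come back */
        }
        sent++;
    }
    return sent;
}
```

With a full window (bytes_in_transit equal to cwin) the loop sends zero polls, and since nothing in the model acknowledges data without a poll, the window never drains.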
Proposed Fix (primary)
Introduce a rate‑limited poll bypass when there is in‑flight data:
- Allow emitting a poll frame even if pacing would normally block sends.
- Rate‑limit the bypass (e.g., one poll per RTT/4 or a small fixed interval).
- Keep polls non‑ACK‑eliciting to avoid ping‑pong storms.
This preserves congestion safety while preventing the “no poll → no ACK → low cwnd → no poll” deadlock.
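A minimal sketch of the rate limiter such a bypass would need, assuming microsecond timestamps from a monotonic clock. poll_bypass_t, poll_bypass_allowed and the 1 ms floor are illustrative choices, not existing slipstream or picoquic symbols; where the check would hook into sender.c is left open.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bypass state; times are in microseconds. */
typedef struct {
    uint64_t last_bypass_us; /* when the last bypass poll was emitted */
    bool     has_bypassed;   /* false until the first bypass poll */
} poll_bypass_t;

/* Assumed floor so a tiny loopback RTT does not allow a poll flood. */
#define POLL_BYPASS_MIN_INTERVAL_US 1000u

/* Decide whether one poll may go out although pacing blocks normal sends.
 * Allowed only with data in flight, at most once per max(RTT/4, floor).
 * Assumes now_us never goes backwards. */
static bool poll_bypass_allowed(poll_bypass_t *s, uint64_t now_us,
                                uint64_t smoothed_rtt_us,
                                uint64_t bytes_in_transit)
{
    if (bytes_in_transit == 0) {
        return false; /* nothing awaiting an ACK: keep normal pacing */
    }
    uint64_t interval = smoothed_rtt_us / 4;
    if (interval < POLL_BYPASS_MIN_INTERVAL_US) {
        interval = POLL_BYPASS_MIN_INTERVAL_US;
    }
    if (s->has_bypassed && now_us - s->last_bypass_us < interval) {
        return false; /* too soon after the previous bypass poll */
    }
    s->last_bypass_us = now_us;
    s->has_bypassed = true;
    return true;
}
```

The bypass only decides whether a poll frame may be sent. Data packets would still go through the usual pacing and cwin checks, which is what keeps the change congestion-safe.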
Alternative / Additional Mitigations
- Keep is_poll_requested sticky until a poll is actually emitted (do not drop on send_length == 0).
- Trigger periodic poll requests while bytes_in_transit > 0, rather than tying them solely to inbound DNS responses.
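Both mitigations can be sketched in a few lines. The struct below is a simplified stand-in for the connection state, not the real picoquic context: only is_poll_requested and bytes_in_transit are named in this issue, while next_poll_us, after_prepare, on_timer and period_us are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the connection state. */
typedef struct {
    bool     is_poll_requested; /* flag named in this issue */
    uint64_t bytes_in_transit;  /* unacknowledged bytes */
    uint64_t next_poll_us;      /* hypothetical: next timed poll deadline */
} sketch_cnx_t;

/* Sticky flag: clear the request only when a packet actually went out. */
static void after_prepare(sketch_cnx_t *c, size_t send_length)
{
    if (send_length > 0) {
        c->is_poll_requested = false;
    }
    /* send_length == 0: leave the request pending for the next attempt */
}

/* Periodic trigger: while data is in flight, request a poll every
 * period_us instead of waiting for an inbound DNS response. */
static void on_timer(sketch_cnx_t *c, uint64_t now_us, uint64_t period_us)
{
    if (c->bytes_in_transit > 0 && now_us >= c->next_poll_us) {
        c->is_poll_requested = true;
        c->next_poll_us = now_us + period_us;
    }
}
```

Neither function sends anything by itself; they only keep a poll request alive until the sender is next able to emit a packet.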
Files referenced
- src/slipstream_sockloop.c (poll request + send loop)
- docs/protocol.md (poll frame behavior)
- subprojects/picoquic/picoquic/sender.c (poll insertion gated by pacing)