From 8b74d660b3bf70398370c8b4f15e9a9b8bee70e2 Mon Sep 17 00:00:00 2001 From: huitema Date: Fri, 30 Jan 2026 12:08:22 -0800 Subject: [PATCH 1/3] Update test and doc before next submission. --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 0451bcb..def471e 100644 --- a/.gitignore +++ b/.gitignore @@ -22,3 +22,4 @@ /doc/draft.txt /doc/draft.xml /doc/tracker.log +/scripts/mem_subset.csv From 1b3d32d5ee200025427514b642a9aab792c3baa6 Mon Sep 17 00:00:00 2001 From: huitema Date: Fri, 30 Jan 2026 12:08:40 -0800 Subject: [PATCH 2/3] Update text and doc before next drafts --- doc/c4-design.md | 76 ++++++ doc/c4-design.txt | 324 ++++++++++++++++-------- doc/c4-spec.md | 113 +++++++-- doc/c4-spec.txt | 406 ++++++++++++++++++++----------- doc/c4-tests.md | 37 +-- papers/cpu_bound.md | 112 +++++++++ papers/perf-loopback-fix-cpu.PNG | Bin 0 -> 14093 bytes scripts/memory_log.py | 77 ++++++ 8 files changed, 857 insertions(+), 288 deletions(-) create mode 100644 papers/cpu_bound.md create mode 100644 papers/perf-loopback-fix-cpu.PNG create mode 100644 scripts/memory_log.py diff --git a/doc/c4-design.md b/doc/c4-design.md index c850e7b..7964aef 100644 --- a/doc/c4-design.md +++ b/doc/c4-design.md @@ -1051,6 +1051,82 @@ and then maybe switch to startup mode if a lot of capacity is available. This is something that we intend to test, but have not implemented yet. +# Revisiting the Initial Phase + +Our November 2025 design of C4 included a "rate based" +initial phase, during which C4 will send at twice the "nominal rate", +monitor acknowledgments and increase the nominal rate if measurements +increase, and exit if congestion is detected or if the measurements +do not increase for 3 consecutive RTTs. That algorithm works +well in most scenarios, but we observed early exits in +"high delay jitter" scenarios, such as Wi-Fi networks with lots of +packet collisions. 
+ +After observing that phenomenon, we realized that the +rate based algorithm was failing in case of high delay jitter +because it was setting the CWND to the product of pacing rate +and the "nominal" max RTT. The nominal Max RTT was set to a fixed +value, observed either before the initial phase or on the first +roundtrip in that phase. It would work if the initial phase +started during a high jitter event and the initial RTT was large +enough, but in many cases it was not and became a limiting +factor. + +## Why not increasing Max RTT during Initial phase? + +In the initial phase, the algorithm tries to discover the bandwidth +and does not yet have a good estimate of delay jitter, which typically +requires a series of measurements. In these conditions, it is +easy to underestimate the max RTT. On the other hand, the flow is +deliberately probing at a high data rate. If the algorithm +allows updates of max RTT during that phase, the risks of +spiraling into bufferbloat are very high, but if the CWND +remains too low, the risk of exiting startup with a severely +underestimated data rate is also very high. + +We tried to develop simple rules to classify the delay measurements +as caused either by jitter or by congestion. If we could do that, +we would be able to increase the max RTT safely, when appropriate. +However, we could not find variables that were both easy to monitor +and well correlated with the actual cause of the delay. + + +## Building a robust initial estimator + +The "rate based" initial estimator requires estimating both the +data rate and the max RTT simultaneously. In contrast, the "CWND based" +initial estimator used in algorithms like Reno or Cubic +only requires estimating the CWND, plus a possibly +loose estimate of the data rate. The Reno algorithm is remarkably +simple: just increase the CWND by the number of bytes acknowledged, +without any explicit dependency on the measured latency. 
+ +The Reno algorithm terminates only when packet losses are observed, +which leads to bufferbloat. Hystart improves that by terminating when +the measured delays start increasing, but this can lead to early +exit in case of delay jitter. The rate based algorithm terminates when +the measured bandwidth stops growing, which provides good +results. Our proposal is to combine a Reno like growth of the +CWND with a rate-control like exit condition. + +Of course, things are not that simple. The "rate" test only stops the +growth of the CWND after the third "non growing" round. If CWND doubles +after each round it becomes excessive, buffers fill up, and lots +of packets are lost. We dealt with that problem by essentially +freezing the increases after the first "non growing" round. +If a larger measurement happens before 3 RTT, the increases +resume, otherwise, C4 exits the initial phase. + +When the initial phase completes, we retain as estimate of the +data rate the highest value measured so far. +We also want to obtain a reasonable estimate of the "max RTT". +In the Reno logic, the "ssthresh" is set to half the CWND +value before congestion is detected. C4 will not use the +ssthresh variable after exiting the Initial phase, but it +can set the max RTT to the quotient of ssthresh by the +final rate estimate. + + # State Machine The state machine for C4 has the following states: diff --git a/doc/c4-design.txt b/doc/c4-design.txt index ae0ed0c..103286b 100644 --- a/doc/c4-design.txt +++ b/doc/c4-design.txt @@ -5,9 +5,9 @@ Network Working Group C. Huitema Internet-Draft Private Octopus Inc. Intended status: Informational S. Nandakumar -Expires: 5 June 2026 C. Jennings +Expires: 3 August 2026 C. Jennings Cisco - 2 December 2025 + 30 January 2026 Design of Christian's Congestion Control Code (C4) @@ -43,19 +43,19 @@ Status of This Memo time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." 
- This Internet-Draft will expire on 5 June 2026. + This Internet-Draft will expire on 3 August 2026. Copyright Notice - Copyright (c) 2025 IETF Trust and the persons identified as the + Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved. -Huitema, et al. Expires 5 June 2026 [Page 1] +Huitema, et al. Expires 3 August 2026 [Page 1] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 This document is subject to BCP 78 and the IETF Trust's Legal @@ -97,21 +97,21 @@ Table of Contents 7.1. Coordinated Pushing . . . . . . . . . . . . . . . . . . . 20 7.2. Variable Pushing Rate . . . . . . . . . . . . . . . . . . 21 7.3. Pushing rate and Cascades . . . . . . . . . . . . . . . . 22 - 8. State Machine . . . . . . . . . . . . . . . . . . . . . . . . 22 - 9. Security Considerations . . . . . . . . . . . . . . . . . . . 24 - 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 24 - 11. Informative References . . . . . . . . . . . . . . . . . . . 24 - Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 26 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 26 + 8. Revisiting the Initial Phase . . . . . . . . . . . . . . . . 22 + 8.1. Why not increasing Max RTT during Initial phase? . . . . 23 + 8.2. Building a robust initial estimator . . . . . . . . . . . 23 + 9. State Machine . . . . . . . . . . . . . . . . . . . . . . . . 24 + 10. Security Considerations . . . . . . . . . . . . . . . . . . . 25 + 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 + 12. Informative References . . . . . . . . . . . . . . . . . . . 25 + Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 28 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 28 - - - -Huitema, et al. Expires 5 June 2026 [Page 2] +Huitema, et al. Expires 3 August 2026 [Page 2] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 1. 
Introduction @@ -165,9 +165,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 3] +Huitema, et al. Expires 3 August 2026 [Page 3] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 2. Studying the reaction to delays @@ -221,9 +221,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 4] +Huitema, et al. Expires 3 August 2026 [Page 4] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 In our initial deployments, we detected competition when delay based @@ -277,9 +277,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 5] +Huitema, et al. Expires 3 August 2026 [Page 5] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 2.2. Handling Chaotic Delays @@ -333,9 +333,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 6] +Huitema, et al. Expires 3 August 2026 [Page 6] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 Using the pacing rate that way prevents the larger window to cause @@ -389,9 +389,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 7] +Huitema, et al. Expires 3 August 2026 [Page 7] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 3. Simplifying the initial design @@ -445,9 +445,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 8] +Huitema, et al. Expires 3 August 2026 [Page 8] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 To avoid that, we can have periodic periods in which the endpoint @@ -501,9 +501,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 9] +Huitema, et al. 
Expires 3 August 2026 [Page 9] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 * C4 notices high jitter, increases Nominal Max RTT accordingly, set @@ -557,9 +557,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 10] +Huitema, et al. Expires 3 August 2026 [Page 10] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 3.3. Monitoring the nominal rate @@ -613,9 +613,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 11] +Huitema, et al. Expires 3 August 2026 [Page 11] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 We use the data rate measurement to update the nominal rate, but only @@ -669,9 +669,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 12] +Huitema, et al. Expires 3 August 2026 [Page 12] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 inverse of these times per byte, effectively computing an harmonic @@ -725,9 +725,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 13] +Huitema, et al. Expires 3 August 2026 [Page 13] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 4. Competition with other algorithms @@ -781,9 +781,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 14] +Huitema, et al. Expires 3 August 2026 [Page 14] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 1. Excessive increase of measured RTT (above the nominal Max RTT), @@ -837,9 +837,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 15] +Huitema, et al. Expires 3 August 2026 [Page 15] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 and only react to those losses that are detected by gaps in @@ -893,9 +893,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. 
Expires 5 June 2026 [Page 16] +Huitema, et al. Expires 3 August 2026 [Page 16] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 proportional to its size. This drives very good long term fairness, @@ -949,9 +949,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 17] +Huitema, et al. Expires 3 August 2026 [Page 17] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 measurements, before any of the big jitter events had occured. This @@ -1005,9 +1005,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 18] +Huitema, et al. Expires 3 August 2026 [Page 18] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 There is no doubt that the current curve will have to be refined. We @@ -1061,9 +1061,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 19] +Huitema, et al. Expires 3 August 2026 [Page 19] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 The second feature is the "make before break" nature of the rate @@ -1117,9 +1117,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 20] +Huitema, et al. Expires 3 August 2026 [Page 20] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 7.2. Variable Pushing Rate @@ -1173,9 +1173,9 @@ Internet-Draft C4 Design December 2025 -Huitema, et al. Expires 5 June 2026 [Page 21] +Huitema, et al. Expires 3 August 2026 [Page 21] -Internet-Draft C4 Design December 2025 +Internet-Draft C4 Design January 2026 7.3. Pushing rate and Cascades @@ -1204,7 +1204,98 @@ Internet-Draft C4 Design December 2025 to startup mode if a lot of capacity is available. This is something that we intend to test, but have not implemented yet. -8. State Machine +8. 
Revisiting the Initial Phase + + Our November 2025 design of C4 included a "rate based" initial phase, + during which C4 will send at twice the "nominal rate", monitor + acknowledgments and increase the nominal rate if measurements + increase, and exit if congestion is detected or if the measurements + do not increase for 3 consecutive RTTs. That algorithm works well in + most scenarios, but we observed early exits in "high delay + jitter" scenarios, such as Wi-Fi networks with lots of packet + collisions. + + After observing that phenomenon, we realized that the rate based + algorithm was failing in case of high delay jitter because it was + setting the CWND to the product of pacing rate and the "nominal" max + RTT. The nominal Max RTT was set to a fixed value, observed either + before the initial phase or on the first roundtrip in that phase. It + would work if the initial phase started during a high jitter event + and the initial RTT was large enough, but in many cases it was not and + became a limiting factor. + + + + + + +Huitema, et al. Expires 3 August 2026 [Page 22] + +Internet-Draft C4 Design January 2026 + + +8.1. Why not increasing Max RTT during Initial phase? + + In the initial phase, the algorithm tries to discover the bandwidth + and does not yet have a good estimate of delay jitter, which + typically requires a series of measurements. In these conditions, it + is easy to underestimate the max RTT. On the other hand, the flow is + deliberately probing at a high data rate. If the algorithm allows + updates of max RTT during that phase, the risks of spiraling into + bufferbloat are very high, but if the CWND remains too low, the risk + of exiting startup with a severely underestimated data rate is also + very high. + + We tried to develop simple rules to classify the delay measurements + as caused either by jitter or by congestion. If we could do + that, we would be able to increase the max RTT safely, when + appropriate. 
However, we could not find variables that were both + easy to monitor and well correlated with the actual cause of the + delay. + +8.2. Building a robust initial estimator + + The "rate based" initial estimator requires estimating both the data + rate and the max RTT simultaneously. In contrast, the "CWND based" + initial estimator used in algorithms like Reno or Cubic only requires + estimating the CWND, plus a possibly loose estimate of the data rate. + The Reno algorithm is remarkably simple: just increase the CWND by + the number of bytes acknowledged, without any explicit dependency on + the measured latency. + + The Reno algorithm terminates only when packet losses are observed, + which leads to bufferbloat. Hystart improves that by terminating when + the measured delays start increasing, but this can lead to early exit + in case of delay jitter. The rate based algorithm terminates when the + measured bandwidth stops growing, which provides good results. Our + proposal is to combine a Reno like growth of the CWND with a rate- + control like exit condition. + + Of course, things are not that simple. The "rate" test only stops + the growth of the CWND after the third "non growing" round. If CWND + doubles after each round it becomes excessive, buffers fill up, and + lots of packets are lost. We dealt with that problem by essentially + freezing the increases after the first "non growing" round. If a + larger measurement happens before 3 RTT, the increases resume, + otherwise, C4 exits the initial phase. + + When the initial phase completes, we retain as estimate of the data + rate the highest value measured so far. We also want to obtain a + reasonable estimate of the "max RTT". In the Reno logic, the + + + +Huitema, et al. Expires 3 August 2026 [Page 23] + +Internet-Draft C4 Design January 2026 + + + "ssthresh" is set to half the CWND value before congestion is + detected. 
C4 will not use the ssthresh variable after exiting the + Initial phase, but it can set the max RTT to the quotient of ssthresh + by the final rate estimate. + +9. State Machine The state machine for C4 has the following states: @@ -1221,19 +1312,6 @@ Internet-Draft C4 Design December 2025 have resulted in increases of "nominal rate", or enters "cruising" otherwise. - - - - - - - - -Huitema, et al. Expires 5 June 2026 [Page 22] - -Internet-Draft C4 Design December 2025 - - * "cruising": the connection is sending using the "nominal_rate" and "nominal_max_rtt" value. If congestion is detected, the connection exits cruising and enters "recovery" after lowering the @@ -1248,6 +1326,26 @@ Internet-Draft C4 Design December 2025 These transitions are summarized in the following state diagram. + + + + + + + + + + + + + + + +Huitema, et al. Expires 3 August 2026 [Page 24] + +Internet-Draft C4 Design January 2026 + + Start | v @@ -1279,41 +1377,37 @@ Internet-Draft C4 Design December 2025 | | +<------------------+ - - - - - - -Huitema, et al. Expires 5 June 2026 [Page 23] - -Internet-Draft C4 Design December 2025 - - -9. Security Considerations +10. Security Considerations We do not believe that C4 introduce new security issues. Or maybe there are, such as what happen if applications can be fooled in going to fast and overwhelming the network, or going to slow and underwhelming the application. Discuss! -10. IANA Considerations +11. IANA Considerations This document has no IANA actions. -11. Informative References +12. Informative References [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, DOI 10.17487/RFC9000, May 2021, . + + +Huitema, et al. Expires 3 August 2026 [Page 25] + +Internet-Draft C4 Design January 2026 + + [I-D.ietf-moq-transport] Nandakumar, S., Vasiliev, V., Swett, I., and A. 
Frindell, "Media over QUIC Transport", Work in Progress, Internet- - Draft, draft-ietf-moq-transport-15, 20 October 2025, + Draft, draft-ietf-moq-transport-16, 13 January 2026, . + transport-16>. [RFC9438] Xu, L., Ha, S., Rhee, I., Goel, V., and L. Eggert, Ed., "CUBIC for Fast and Long-Distance Networks", RFC 9438, @@ -1337,15 +1431,6 @@ Internet-Draft C4 Design December 2025 RFC 6582, DOI 10.17487/RFC6582, April 2012, . - - - - -Huitema, et al. Expires 5 June 2026 [Page 24] - -Internet-Draft C4 Design December 2025 - - [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", RFC 3649, DOI 10.17487/RFC3649, December 2003, . @@ -1361,6 +1446,18 @@ Internet-Draft C4 Design December 2025 start", Computer Networks vol. 55, no. 9, pp. 2092-2110 , June 2011, . + + + + + + + +Huitema, et al. Expires 3 August 2026 [Page 26] + +Internet-Draft C4 Design January 2026 + + [Cubic-QUIC-Blog] Huitema, C., "Implementing Cubic congestion control in Quic", Christian Huitema's blog , November 2019, @@ -1377,9 +1474,9 @@ Internet-Draft C4 Design December 2025 Balasubramanian, P., Ertugay, O., Havey, D., and M. Bagnulo, "LEDBAT++: Congestion Control for Background Traffic", Work in Progress, Internet-Draft, draft-irtf- - iccrg-ledbat-plus-plus-04, 18 November 2025, + iccrg-ledbat-plus-plus-06, 29 January 2026, . + ledbat-plus-plus-06>. [RFC9330] Briscoe, B., Ed., De Schepper, K., Bagnulo, M., and G. White, "Low Latency, Low Loss, and Scalable Throughput @@ -1393,15 +1490,6 @@ Internet-Draft C4 Design December 2025 DOI 10.17487/RFC9331, January 2023, . - - - - -Huitema, et al. Expires 5 June 2026 [Page 25] - -Internet-Draft C4 Design December 2025 - - [I-D.briscoe-iccrg-prague-congestion-control] De Schepper, K., Tilmans, O., Briscoe, B., and V. Goel, "Prague Congestion Control", Work in Progress, Internet- @@ -1418,6 +1506,14 @@ Internet-Draft C4 Design December 2025 slides-122-iccrg-mind-the-misleading-effects-of-leo- mobility-on-end-to-end-congestion-control-00>. 
+ + + +Huitema, et al. Expires 3 August 2026 [Page 27] + +Internet-Draft C4 Design January 2026 + + Acknowledgments TODO acknowledge. @@ -1453,4 +1549,20 @@ Authors' Addresses -Huitema, et al. Expires 5 June 2026 [Page 26] + + + + + + + + + + + + + + + + +Huitema, et al. Expires 3 August 2026 [Page 28] diff --git a/doc/c4-spec.md b/doc/c4-spec.md index 1bc8536..ff1b1c5 100644 --- a/doc/c4-spec.md +++ b/doc/c4-spec.md @@ -347,7 +347,7 @@ packets in a single transaction, which improves performance. But sending large batches of packets creates "instant queues" and causes some Active Queue Management mechanisms to mark packets as ECN/CE, or drop them. As a compromise, we set the quantum to -4 milliseconds worth of transmission. +4 milliseconds worth of transmission, while capping it to 64KB. ~~~ quantum = max ( min (pacing_rate*4_milliseconds, 64KB), 2*MTU) @@ -355,37 +355,55 @@ quantum = max ( min (pacing_rate*4_milliseconds, 64KB), 2*MTU) ## Initial state {#c4-initial} -When the flow is initialized, it enters the Initial state, +When the flow is first initialized, it enters the Initial state, during which it does a first assessment of the "nominal rate" and "nominal max RTT". The coefficient `alpha_current` is set to 2. The "nominal rate" and "nominal max RTT" are initialized to zero, -which will cause pacing rate and CWND to be set default -initial values. The nominal max RTT will be set to the +which will cause pacing rate to be set to a default +initial value. The nominal max RTT will be set to the first assessed RTT value, but is not otherwise changed -during the initial phase. The nominal rate is updated -after receiving acknowledgements, see {#nominal-rate}. +before the end of the initial phase. +The CWND will be set to the default initial value, +corresponding to 10 packets. + +During the initial state, the nominal rate is updated +after receiving acknowledgements, see {{nominal-rate}}. 
+The value of CWND is increased after each acknowledgement +by the number of bytes newly acknowledged by this +acknowledgement. C4 will exit the Initial state and enter Recovery if the nominal rate does not increase for 3 consecutive eras, omitting the eras for which the transmission was "application limited". -C4 exit the Initial if receiving a congestion signal and the +C4 exits the Initial state when receiving a congestion signal if the following conditions are true: -1- If the signal is due to "delay", C4 will only exit the +1- If the signal is due to "delay" or "ECN", C4 will only exit the initial state if the `nominal_rate` did not increase in the last 2 eras. 2- If the signal is due to "loss", C4 will only exit the initial state if more than 20 packets have been received. -The restriction on delay signals is meant to prevent spurious exit -due to delay jitter. The restriction on loss signals is meant -to ensure that enough packets have been received to properly +The restriction on delay and ECN signals is meant to prevent spurious exit +due to delay jitter or competing connections. The restriction on loss +signals is meant to ensure that enough packets have been received to properly assess the loss rate. +On exiting the Initial state, C4 computes an estimate of the nominal +max RTT as half the last CWND divided by the last +nominal rate, and updates the "nominal max RTT" accordingly. + +### Reentering the initial state + +When reentering the initial state, C4 already has an estimate of the +current nominal rate and nominal max RTT. CWND is set to the product of +nominal rate and nominal max RTT. The initial state then operates as +specified in {{c4-initial}}. 
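As a reading aid, the Initial state rules described above can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the patch and not the authors' implementation: the class name `InitialState`, its method names, and the 1500-byte `MTU` are hypothetical; "packets received" is interpreted here as packets acknowledged; and the rate measurement itself is assumed to be supplied by the caller.

```python
from dataclasses import dataclass

MTU = 1500               # assumed packet size in bytes (hypothetical value)
INITIAL_CWND = 10 * MTU  # default initial CWND, corresponding to 10 packets


@dataclass
class InitialState:
    """Hypothetical bookkeeping for the C4 Initial state (illustration only)."""
    cwnd: int = INITIAL_CWND
    nominal_rate: float = 0.0     # bytes per second
    nominal_max_rtt: float = 0.0  # seconds
    eras_without_increase: int = 0
    rate_grew_this_era: bool = False
    packets_acked: int = 0

    def on_ack(self, bytes_acked: int, measured_rate: float, packets: int = 1) -> None:
        # Reno-like growth: add the newly acknowledged bytes to CWND.
        self.cwnd += bytes_acked
        self.packets_acked += packets
        # The nominal rate only moves up during the Initial state.
        if measured_rate > self.nominal_rate:
            self.nominal_rate = measured_rate
            self.rate_grew_this_era = True

    def on_era_end(self, app_limited: bool = False) -> bool:
        """Close an era; return True if the Initial state should be exited."""
        if self.rate_grew_this_era:
            self.eras_without_increase = 0
        elif not app_limited:
            # Application-limited eras are omitted from the count.
            self.eras_without_increase += 1
        self.rate_grew_this_era = False
        return self.eras_without_increase >= 3

    def should_exit_on_signal(self, signal: str) -> bool:
        """Apply the restrictions on congestion signals."""
        if signal in ("delay", "ecn"):
            # Assumed reading of "did not increase in the last 2 eras".
            return self.eras_without_increase >= 2 and not self.rate_grew_this_era
        if signal == "loss":
            return self.packets_acked > 20
        return False

    def exit_initial(self) -> float:
        """Estimate the nominal max RTT as (CWND / 2) / nominal rate."""
        if self.nominal_rate > 0:
            self.nominal_max_rtt = (self.cwnd / 2) / self.nominal_rate
        return self.nominal_max_rtt
```

For example, with a final CWND of 19500 bytes and a nominal rate of 1,000,000 bytes per second, `exit_initial()` would yield a nominal max RTT of about 9.75 ms. Note that CWND growth in this sketch never reads a measured RTT, which is the property the rewritten Initial state relies on.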
+ ## Recovery state {#c4-recovery} The recovery state is entered from the Initial or Pushing state, @@ -637,12 +655,70 @@ the acceptable margin, capped to `1/4`: ~~~ If the signal is an ECN/CE rate, the coefficient is proportional -to the difference between `ecn_alpha` and `ecn_threshold`, capped to '1/4': - -~~~ - beta = min(1/4, (ecn_alpha - ecn_threshold)/ ecn_threshold)) -~~~ - +to the difference between `ecn_alpha` and `ecn_threshold`, capped to `1/4`: + +~~~ + beta = min(1/4, (ecn_alpha - ecn_threshold)/ ecn_threshold) +~~~ + +# Implementation considerations + +Implementing C4 ought to be straightforward, but developers need to pay +attention to measurement of data rates and to pacing issues when the +CPU load is high. + +## Rate measurement should be conservative + +The standard algorithm for rate measurement is to consider the amount +of data acknowledged in an interval of time, and divide that amount +by the duration of the interval. This algorithm can result in +over-estimates of the rate in the presence of jitter. These +excessive estimates could cause C4 to set a nominal rate higher +than the network path bandwidth, resulting in queue build-up and +excessive delays. + +There are two known ways to reduce the effect of jitter: filter out +measurements in which the data rate measured through acknowledgements +is larger than the send rate; and make sure that the measurement +intervals are long enough that jitter only has a small influence. Cautious +implementations should use both strategies. + +## Pacing and CPU load + +C4 relies on pacing to avoid sending data too fast during the +recovery, cruising and pushing states. Pacing is often implemented +using a "leaky bucket" algorithm, which refills the bucket at the +pacing rate, allows transmission as long as there are enough tokens +in the bucket, and forces transmission to wait when all tokens are +consumed. 
The wait time is computed based on the pacing rate +and the number of desired tokens, and is implemented using +operating system commands such as `select()`, `poll()`, +`epoll()` or `sleep()`. In high CPU load conditions, we observe +that these commands often return after more than the specified +wait time, resulting in a lower sending rate than the desired +pacing rate. + +This phenomenon is particularly visible on low-latency paths. +The generic solution would probably be to estimate how much slower +the actual pacing is compared to the desired rate, and increase the +programmed pacing rate by a value proportional to these measurements. +This generic solution is not yet specified. In the meantime, implementations +had success with a simple fix: increase the pacing rate by 3/64th in +"cruising" state when the RTT is less than 1ms. This definitely +improved performance in low-latency environments, in particular +loopback interfaces. + +## Nominal max RTT on low latency links + +When doing tests on low latency links, we observed on some systems +a lot of measurement jitter. The measured RTT is the sum of the +actual RTT and some system wakeup delay, which can vary between a +few microseconds and maybe 1 millisecond. The default algorithm +will adapt the nominal RTT after each roundtrip, which can lead +to excessively low values, causing a slowdown of the transmission. +A solution is to set a "floor" value for the nominal max RTT, +updating it to the maximum of the measured value and the floor. +Setting the floor value to 1ms did improve performance. # Security Considerations @@ -670,6 +746,9 @@ This section should be deleted before publication as an RFC ## Changes since draft-huitema-ccwg-c4-spec-00 {:numbered="false"} +Rewrote the description of the Initial state in {{c4-initial}} +to remove dependency on nominal max RTT. + Added the specification of reaction to ECN in {{process-ecn}} and in {{rate-reduction}}. 
Update section {{c4-pushing}} to modulate pushing rate based on observed rate of ECN/CE marks. diff --git a/doc/c4-spec.txt b/doc/c4-spec.txt index 5700a25..4c152bc 100644 --- a/doc/c4-spec.txt +++ b/doc/c4-spec.txt @@ -5,9 +5,9 @@ Network Working Group C. Huitema Internet-Draft Private Octopus Inc. Intended status: Experimental S. Nandakumar -Expires: 5 June 2026 C. Jennings +Expires: 3 August 2026 C. Jennings Cisco - 2 December 2025 + 30 January 2026 Specification of Christian's Congestion Control Code (C4) @@ -40,11 +40,11 @@ Status of This Memo time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on 5 June 2026. + This Internet-Draft will expire on 3 August 2026. Copyright Notice - Copyright (c) 2025 IETF Trust and the persons identified as the + Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal @@ -53,9 +53,9 @@ Copyright Notice -Huitema, et al. Expires 5 June 2026 [Page 1] +Huitema, et al. Expires 3 August 2026 [Page 1] -Internet-Draft C4 Specification December 2025 +Internet-Draft C4 Specification January 2026 Please review these documents carefully, as they describe your rights @@ -66,53 +66,60 @@ Internet-Draft C4 Specification December 2025 Table of Contents - 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 + 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Key Words . . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. C4 variables . . . . . . . . . . . . . . . . . . . . . . . . 3 - 3.1. Nominal rate . . . . . . . . . . . . . . . . . . . . . . 3 + 3.1. Nominal rate . . . . . . . . . . . . . . . . . . . . . . 4 3.2. Nominal max RTT . . . . . . . . . . . . . . . . . . . . . 4 3.3. Global variables . . . . . . . . . . . . . . . . . . . . 5 3.4. Per era variables . . . . . . . . . . . . . . . . . . . . 
6 4. States and Transition . . . . . . . . . . . . . . . . . . . . 6 4.1. Setting pacing rate, congestion window and quantum . . . 7 - 4.2. Initial state . . . . . . . . . . . . . . . . . . . . . . 8 - 4.3. Recovery state . . . . . . . . . . . . . . . . . . . . . 9 + 4.2. Initial state . . . . . . . . . . . . . . . . . . . . . . 9 + 4.2.1. Reentering the initial state . . . . . . . . . . . . 9 + 4.3. Recovery state . . . . . . . . . . . . . . . . . . . . . 10 4.3.1. Restarting Initial if High Jitter . . . . . . . . . . 10 - 4.4. Cruising state {#c4-cruising } . . . . . . . . . . . . . 10 - 4.5. Pushing state . . . . . . . . . . . . . . . . . . . . . . 10 + 4.4. Cruising state {#c4-cruising } . . . . . . . . . . . . . 11 + 4.5. Pushing state . . . . . . . . . . . . . . . . . . . . . . 11 5. Handling of congestion signals . . . . . . . . . . . . . . . 11 - 5.1. Variable Sensitivity . . . . . . . . . . . . . . . . . . 11 + 5.1. Variable Sensitivity . . . . . . . . . . . . . . . . . . 12 5.2. Detecting Excessive Delays . . . . . . . . . . . . . . . 12 - 5.3. Detecting Excessive Losses . . . . . . . . . . . . . . . 12 - 5.3.1. Do not react to Probe Time Out . . . . . . . . . . . 12 + 5.3. Detecting Excessive Losses . . . . . . . . . . . . . . . 13 + 5.3.1. Do not react to Probe Time Out . . . . . . . . . . . 13 5.4. Detecting Excessive CE Marks . . . . . . . . . . . . . . 13 - 5.5. Applying congestion signals . . . . . . . . . . . . . . . 13 - 5.5.1. Rate Reduction on Congestion . . . . . . . . . . . . 13 - 6. Security Considerations . . . . . . . . . . . . . . . . . . . 14 - 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 - 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 8.1. Normative References . . . . . . . . . . . . . . . . . . 14 - 8.2. Informative References . . . . . . . . . . . . . . . . . 14 - Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 15 - Changes since previous versions . . . . . . . . . . . . . 
. . . . 15 - Changes since draft-huitema-ccwg-c4-spec-00 . . . . . . . . . . 15 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 15 + 5.5. Applying congestion signals . . . . . . . . . . . . . . . 14 + 5.5.1. Rate Reduction on Congestion . . . . . . . . . . . . 14 + 6. Implementation considerations . . . . . . . . . . . . . . . . 15 + 6.1. Rate measurement should be conservative . . . . . . . . . 15 + 6.2. Pacing and CPU load . . . . . . . . . . . . . . . . . . . 15 + 6.3. Nominal max RTT on low latency links . . . . . . . . . . 16 + 7. Security Considerations . . . . . . . . . . . . . . . . . . . 16 + 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 + 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 16 + 9.1. Normative References . . . . . . . . . . . . . . . . . . 16 + 9.2. Informative References . . . . . . . . . . . . . . . . . 16 + Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 17 + Changes since previous versions . . . . . . . . . . . . . . . . . 17 + Changes since draft-huitema-ccwg-c4-spec-00 . . . . . . . . . . 17 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 17 -1. Introduction - Christian's Congestion Control Code (C4) is a congestion control - algorithm designed to support Real-Time multimedia applications, - specifically multimedia applications using QUIC [RFC9000] and the - Media over QUIC transport [I-D.ietf-moq-transport]. -Huitema, et al. Expires 5 June 2026 [Page 2] +Huitema, et al. Expires 3 August 2026 [Page 2] -Internet-Draft C4 Specification December 2025 +Internet-Draft C4 Specification January 2026 + +1. Introduction + + Christian's Congestion Control Code (C4) is a congestion control + algorithm designed to support Real-Time multimedia applications, + specifically multimedia applications using QUIC [RFC9000] and the + Media over QUIC transport [I-D.ietf-moq-transport]. 
The two main variables describing the state of a flow are the "nominal rate" (see Section 3.1) and the "nominal max RTT" (see @@ -155,20 +162,20 @@ Internet-Draft C4 Specification December 2025 a set a variables per flow (see Section 3.3) and per era (see Section 3.4). -3.1. Nominal rate - The nominal rate is an estimate of the bandwidth available to the - flow. On initialization, the nominal rate is set to zero, and - default values are used when setting the pacing rate and CWND for the - flow. +Huitema, et al. Expires 3 August 2026 [Page 3] + +Internet-Draft C4 Specification January 2026 -Huitema, et al. Expires 5 June 2026 [Page 3] - -Internet-Draft C4 Specification December 2025 +3.1. Nominal rate + The nominal rate is an estimate of the bandwidth available to the + flow. On initialization, the nominal rate is set to zero, and + default values are used when setting the pacing rate and CWND for the + flow. C4 evaluates the nominal rate after acknowledgements are received using the number of bytes acknowledged since the packet was sent @@ -212,19 +219,19 @@ Internet-Draft C4 Specification December 2025 on the path in the absence of queues. The RTT samples observed for the flow are the sum of four components: - * the latency of the path - * the jitter introduced by processes like link layer contention or - link layer retransmission - * queuing delays caused by competing applications +Huitema, et al. Expires 3 August 2026 [Page 4] + +Internet-Draft C4 Specification January 2026 + * the latency of the path -Huitema, et al. Expires 5 June 2026 [Page 4] - -Internet-Draft C4 Specification December 2025 + * the jitter introduced by processes like link layer contention or + link layer retransmission + * queuing delays caused by competing applications * queuing delays introduced by C4 itself. 
@@ -267,20 +274,20 @@ Internet-Draft C4 Specification December 2025 * running min RTT, an approximation of the min RTT for the flow, - * number of eras without increase (see Section 4.2), - * number of successful pushes, - * current state of the algorithm, which can be Initial, Recovery, - Cruising or Pushing. +Huitema, et al. Expires 3 August 2026 [Page 5] + +Internet-Draft C4 Specification January 2026 + * number of eras without increase (see Section 4.2), -Huitema, et al. Expires 5 June 2026 [Page 5] - -Internet-Draft C4 Specification December 2025 + * number of successful pushes, + * current state of the algorithm, which can be Initial, Recovery, + Cruising or Pushing. 3.4. Per era variables @@ -324,19 +331,14 @@ Internet-Draft C4 Specification December 2025 one round trip, i.e., until the first packet send while "pushing" is acknowledged. At that point, it enters the "recovery" state. - These transitions are summarized in the following state diagram. - - - - - - -Huitema, et al. Expires 5 June 2026 [Page 6] +Huitema, et al. Expires 3 August 2026 [Page 6] -Internet-Draft C4 Specification December 2025 +Internet-Draft C4 Specification January 2026 + + These transitions are summarized in the following state diagram. Start | @@ -382,17 +384,19 @@ Internet-Draft C4 Specification December 2025 * congestion quantum: set to zero. - If the nominal rate or the nominal max RTT are both assessed, C4 sets - pacing rate, and congestion window to values that depends on these - variables and on a coefficient alpha_current: -Huitema, et al. Expires 5 June 2026 [Page 7] + +Huitema, et al. 
Expires 3 August 2026 [Page 7] -Internet-Draft C4 Specification December 2025 +Internet-Draft C4 Specification January 2026 + + If the nominal rate or the nominal max RTT are both assessed, C4 sets + pacing rate, and congestion window to values that depends on these + variables and on a coefficient alpha_current: if (c4_state == initial): margin = 0 @@ -429,46 +433,78 @@ Internet-Draft C4 Specification December 2025 sending large batches of packets creates "instant queues" and causes some Active Queue Management mechanisms to mark packets as ECN/CE, or drop them. As a compromise, we set the quantum to 4 milliseconds - worth of transmission. + worth of transmission, while capping it to 64KB. quantum = max ( min (pacing_rate*4_milliseconds, 64KB), 2*MTU) -4.2. Initial state - When the flow is initialized, it enters the Initial state, during - which it does a first assessment of the "nominal rate" and "nominal - max RTT". The coefficient alpha_current is set to 2. The "nominal - rate" and "nominal max RTT" are initialized to zero, which will cause - pacing rate and CWND to be set default initial values. The nominal - max RTT will be set to the first assessed RTT value, but is not - otherwise changed during the initial phase. The nominal rate is -Huitema, et al. Expires 5 June 2026 [Page 8] + + + + +Huitema, et al. Expires 3 August 2026 [Page 8] -Internet-Draft C4 Specification December 2025 +Internet-Draft C4 Specification January 2026 - updated after receiving acknowledgements, see {#nominal-rate}. +4.2. Initial state + + When the flow is first initialized, it enters the Initial state, + during which it does a first assessment of the "nominal rate" and + "nominal max RTT". The coefficient alpha_current is set to 2. The + "nominal rate" and "nominal max RTT" are initialized to zero, which + will cause pacing rate to be set to a default initial value. 
The + nominal max RTT will be set to the first assessed RTT value, but is + not otherwise changed before the end of the initial phase. The CWND + will be set to the default initial value, corresponding to 10 + packets. + + During the initial state, the nominal rate is updated after receiving + acknowledgements, see Section 3.1. The value of CWND is increased + after each acknowledgement by the number of bytes newly acknowledged + by this acknowledgement. C4 will exit the Initial state and enter Recovery if the nominal rate does not increase for 3 consecutive eras, omitting the eras for which the transmission was "application limited". - C4 exit the Initial if receiving a congestion signal and the + C4 exits the Initial state when receiving a congestion signal if the following conditions are true: - 1- If the signal is due to "delay", C4 will only exit the initial - state if the nominal_rate did not increase in the last 2 eras. + 1- If the signal is due to "delay" or "ECN", C4 will only exit the + initial state if the nominal_rate did not increase in the last 2 + eras. 2- If the signal is due to "loss", C4 will only exit the initial state if more than 20 packets have been received. - The restriction on delay signals is meant to prevent spurious exit - due to delay jitter. The restriction on loss signals is meant to - ensure that enough packets have been received to properly assess the - loss rate. + The restriction on delay and ECN signals is meant to prevent spurious + exit due to delay jitter or competing connections. The restriction + on loss signals is meant to ensure that enough packets have been + received to properly assess the loss rate. + + On exiting the Initial state, C4 computes an estimate of the nominal + max RTT as half the last CWND divided by the last nominal rate, + and updates the "nominal max RTT" accordingly. + +4.2.1.
Reentering the initial state + + When reentering the initial state, C4 already has an estimate of the + current nominal rate and nominal max RTT. CWND is set to the product + of nominal rate and nominal max RTT. The initial state then operates + as specified in Section 4.2. + + + + + +Huitema, et al. Expires 3 August 2026 [Page 9] + +Internet-Draft C4 Specification January 2026 + 4.3. Recovery state @@ -498,14 +534,6 @@ Internet-Draft C4 Specification December 2025 * Any increase if the prior pushing rate (alpha_prior) was 17/16 or less, - - - -Huitema, et al. Expires 5 June 2026 [Page 9] - -Internet-Draft C4 Specification December 2025 - - * An increase of at least 1/4th of the expected increase otherwise, for example an increase of 1/16th if alpha_previous was 5/4. @@ -524,6 +552,16 @@ Internet-Draft C4 Specification December 2025 detection. This can lead to underestimation of the "nominal rate" if the flow is operating on a path with high jitter. + + + + + +Huitema, et al. Expires 3 August 2026 [Page 10] + +Internet-Draft C4 Specification January 2026 + + C4 will reenter the "initial" phase on the first time high jitter is detected for the flow. The high jitter is detected after updating the "nominal max RTT" at the end of the recovery era, if: @@ -555,13 +593,6 @@ Internet-Draft C4 Specification December 2025 alpha_current = 17/16 + 17/16 * (1 - ecn_alpha / ecn_threshold) - - -Huitema, et al. Expires 5 June 2026 [Page 10] - -Internet-Draft C4 Specification December 2025 - - C4 exits the pushing state after one era, or if a congestion signal is received before that. In an exception to standard congestion processing, the reduction in nominal_rate and nominal_max_RTT are not @@ -579,6 +610,14 @@ Internet-Draft C4 Specification December 2025 2. Excessive rate of packet losses (but not mere Probe Time Out, see Section 5.3.1), + + + +Huitema, et al. Expires 3 August 2026 [Page 11] + +Internet-Draft C4 Specification January 2026 + + 3. 
Excessive rate of ECN/CE marks C4 monitors successive RTT measurements and compare them to a @@ -609,15 +648,6 @@ Internet-Draft C4 Specification December 2025 * linear interpolation between 0 and 0.92 for values between 50,000 and 1,000,000B/s. - - - - -Huitema, et al. Expires 5 June 2026 [Page 11] - -Internet-Draft C4 Specification December 2025 - - * linear interpolation between 0.92 and 1 for values between 1,000,000 and 10,000,000B/s. @@ -636,6 +666,14 @@ Internet-Draft C4 Specification December 2025 A delay congestion signal is detected if: + + + +Huitema, et al. Expires 3 August 2026 [Page 12] + +Internet-Draft C4 Specification January 2026 + + rtt_sample > nominal_max_rtt + delay_threshold 5.3. Detecting Excessive Losses @@ -667,13 +705,6 @@ Internet-Draft C4 Specification December 2025 and only react to those losses that are detected by gaps in acknowledgements. - - -Huitema, et al. Expires 5 June 2026 [Page 12] - -Internet-Draft C4 Specification December 2025 - - 5.4. Detecting Excessive CE Marks The way we handle ECN signals is designed to be compatible with L4S @@ -687,6 +718,18 @@ Internet-Draft C4 Specification December 2025 The ratio ecn_alpha is updated each time an acknowledgement is received, as follow: + + + + + + + +Huitema, et al. Expires 3 August 2026 [Page 13] + +Internet-Draft C4 Specification January 2026 + + delta_ce = increase in the reported CE marks delta_ect1 = increase in the reported ECT(1) marks frac = delta_ce / (delta_ce + delta_ect1) @@ -721,15 +764,6 @@ Internet-Draft C4 Specification December 2025 nominal_rate = (1-beta)*nominal_rate - - - - -Huitema, et al. Expires 5 June 2026 [Page 13] - -Internet-Draft C4 Specification December 2025 - - The coefficient beta differs depending on the nature of the congestion signal. For packet losses, it is set to 1/4, similar to the value used in Cubic. 
@@ -743,24 +777,97 @@ Internet-Draft C4 Specification December 2025 delay_threshold)) If the signal is an ECN/CE rate, the coefficient is proportional to - the difference between ecn_alpha and ecn_threshold, capped to '1/4': + the difference between ecn_alpha and ecn_threshold, capped to 1/4: + + + +Huitema, et al. Expires 3 August 2026 [Page 14] + +Internet-Draft C4 Specification January 2026 + + + beta = min(1/4, (ecn_alpha - ecn_threshold)/ ecn_threshold) + +6. Implementation considerations + + Implementing C4 ought to be straightforward, but developers need to + pay attention to measurement of data rates and to pacing issues when + the CPU load is high. + +6.1. Rate measurement should be conservative + + The standard algorithm for rate measurement is to consider the amount + of data acknowledged in an interval of time, and divide that amount + by the duration of the interval. This algorithm can result in + over-estimates of the rate in the presence of data jitter. These excessive + estimates could cause C4 to set a nominal rate higher than the + network path bandwidth, resulting in queue build-up and excessive + delays. + + There are two known ways to reduce the effect of jitter: filter out + measurements in which the data rate measured through acknowledgements + is larger than the send rate; and make sure that the measurement + intervals are long enough that jitter only has a small influence. + Cautious implementations should use both strategies. + +6.2. Pacing and CPU load + + C4 relies on pacing to avoid sending data too fast during the + recovery, cruising and pushing states. Pacing is often implemented + using a "leaky bucket" algorithm, which refills the bucket at the + pacing rate, allows transmission as long as there are enough tokens + in the bucket, and forces transmission to wait when all tokens are + consumed.
The wait time is computed based on the pacing rate and the + number of desired tokens, and is implemented using operating system + commands such as select(), poll(), epoll() or sleep(). In high CPU + load conditions, we observe that these commands often return after + more than the specified wait time, resulting in a lower sending rate + than the desired pacing rate. - beta = min(1/4, (ecn_alpha - ecn_threshold)/ ecn_threshold)) + This phenomenon is particularly visible in low-latency paths. The + generic solution would probably be to estimate how much slower the + actual pacing is compared to the desired rate, and increase the + programmed pacing rate by a value proportional to these measurements. + This generic solution is not yet specified. In the meantime, + implementations had success with a simple fix: increase the pacing + rate by 3/64th in "cruising" state when the RTT is less than 1ms. This + definitely improved performance in low-latency environments, in + particular loopback interfaces. -6. Security Considerations + + + +Huitema, et al. Expires 3 August 2026 [Page 15] + +Internet-Draft C4 Specification January 2026 + + +6.3. Nominal max RTT on low latency links + + When doing tests on low latency links, we observed on some systems a + lot of measurement jitter. The measured RTT is the sum of the actual + RTT and some system wakeup delay, which can vary between a few + microseconds and maybe 1 millisecond. The default algorithm will + adapt the nominal max RTT after each roundtrip, which can lead to + excessively low values, causing a slowdown of the transmission. A + solution is to set a "floor" value to the nominal max RTT, updating + it to the maximum of the measured value and the floor. Setting the + floor value to 1ms did improve performance. + +7. Security Considerations We do not believe that C4 introduce new security issues.
Or maybe there are, such as what happen if applications can be fooled in going to fast and overwhelming the network, or going too slow and underwhelming the application. Discuss! -7. IANA Considerations +8. IANA Considerations This document has no IANA actions. -8. References +9. References -8.1. Normative References +9.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, @@ -771,7 +878,7 @@ Internet-Draft C4 Specification December 2025 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, . -8.2. Informative References +9.2. Informative References [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, @@ -781,17 +888,22 @@ Internet-Draft C4 Specification December 2025 -Huitema, et al. Expires 5 June 2026 [Page 14] + + + + + +Huitema, et al. Expires 3 August 2026 [Page 16] -Internet-Draft C4 Specification December 2025 +Internet-Draft C4 Specification January 2026 [I-D.ietf-moq-transport] Nandakumar, S., Vasiliev, V., Swett, I., and A. Frindell, "Media over QUIC Transport", Work in Progress, Internet- - Draft, draft-ietf-moq-transport-15, 20 October 2025, + Draft, draft-ietf-moq-transport-16, 13 January 2026, . + transport-16>. [RFC9331] De Schepper, K. and B. Briscoe, Ed., "The Explicit Congestion Notification (ECN) Protocol for Low Latency, @@ -809,6 +921,9 @@ Changes since previous versions Changes since draft-huitema-ccwg-c4-spec-00 + Rewrote the description of the Initial state in Section 4.2 to remove + dependency on nominal max RTT. + Added the specification of reaction to ECN in Section 5.4 and in Section 5.5.1. Update section Section 4.5 to modulate pushing rate based on observed rate of ECN/CE marks. @@ -832,16 +947,15 @@ Authors' Addresses Email: huitema@huitema.net - Suhas Nandakumar - Cisco - -Huitema, et al. Expires 5 June 2026 [Page 15] +Huitema, et al. 
Expires 3 August 2026 [Page 17] -Internet-Draft C4 Specification December 2025 +Internet-Draft C4 Specification January 2026 + Suhas Nandakumar + Cisco Email: snandaku@cisco.com @@ -891,6 +1005,4 @@ Internet-Draft C4 Specification December 2025 - - -Huitema, et al. Expires 5 June 2026 [Page 16] +Huitema, et al. Expires 3 August 2026 [Page 18] diff --git a/doc/c4-tests.md b/doc/c4-tests.md index 41072b0..9317b14 100644 --- a/doc/c4-tests.md +++ b/doc/c4-tests.md @@ -145,21 +145,20 @@ when it is the sole user of a link. The list of test includes: This scenario simulates a 10MB download over a 20 Mbps link, with an 80ms RTT, and a bottlneck buffer capacity corresponding to 1 BDP. The test verifies that 100 simulations all complete -in less than 5 seconds. +in less than 4.7 seconds. In a typical simulation, we see a initial phase complete in less than 800ms, followed by a recovery phase in which the transmission rate stabilizes to the line rate. After that, the RTT remains very close to the path RTT, except for -periodic small bumps during the "push" transitions. The typical -test completes in 4.85 seconds. +periodic small bumps during the "push" transitions. ### Simulation of a simple 200Mbps connection This scenario simulates a 20MB download over a 200 Mbps link, with a 40ms RTT, and a bottleneck buffer capacity corresponding to 1 BDP. The test verifies that 100 simulations all complete -in less than 1.25 seconds. +in less than 1.31 seconds. This short test shows that the initial phase correctly discover the path capacity, and that the transmission operates at @@ -175,7 +174,7 @@ The scenario also tests the support for careful resume the remembered CWND to 18750000 bytes and the remembered RTT to 600.123ms. The test verifies that 100 simulations all complete -in less than 7.4 seconds. +in less than 7.7 seconds. ### Low and up @@ -238,8 +237,7 @@ exercises the "slow down" mechanism to discover the new RTT. 
The "ECN" test simulates a 20 Mbps link, with an 80ms RTT, and a bottleneck buffer capacity corresponding to 1 BDP. The test verifies that 100 simulated downloads of -10 MB all complete in less than 5.6 seconds. - +10 MB all complete in less than 5 seconds. ## Handling of High Jitter Environments {#c4-wifi} @@ -288,7 +286,7 @@ The "bad Wi-Fi" test simulates a connection experiencing a high level of jitter. The average jitter is set to 7ms, which implies multiple spikes of 100 to 200ms every second. The data rate is set to 10Mbps, and the base RTT before jitter is set to 2ms, i.e., simulating a local server. The test -pass if 100 different 10MB downloads each complete in less than 4.5 seconds. +pass if 100 different 10MB downloads each complete in less than 4.3 seconds. ### Wifi fade trial {#wifi-fade} @@ -362,7 +360,7 @@ same time and using the same path. The path has a 20Mbps data rate and 80ms RTT. The background connection tries to download 30MB, the main connection downloads 20MB. The test pass if in 100 trials the main connection completes -in less than 22.8 seconds. +in less than 23 seconds. ### Long background C4 connection last @@ -372,7 +370,7 @@ with the background connection starting and 70ms RTT. The background connection tries to download 15MB, the main connection downloads 10MB. The test pass if in 100 trials the main connection completes -in less than 22.2 seconds after the beginning of the trial. +in less than 23 seconds after the beginning of the trial. ### Compete with C4 over bad Wi-Fi @@ -385,7 +383,7 @@ the same jitter characteristics as in the "bad Wi-Fi" test (see {{bad-wifi}}). The background connection tries to download 10MB, the main connection downloads 4MB. The test pass if in each of 100 trials the main connection completes -in less than 13 seconds after the beginning of the trial. +in less than 11 seconds after the beginning of the trial. ## Competition with Cubic @@ -408,7 +406,7 @@ same time and using the same path. 
The path has a 20Mbps data rate and 80ms RTT. The background Cubic connection tries to download 10MB, the main connection downloads 5MB. The test pass if in 100 trials the main connection completes -in less than 6.7 seconds. +in less than 6.8 seconds. ### Two long C4 and Cubic connections @@ -417,7 +415,7 @@ same time and using the same path. The path has a 20Mbps data rate and 80ms RTT. The background connection tries to download 30MB, the main connection downloads 20MB. The test pass if in 100 trials the main connection completes -in less than 22.2 seconds. +in less than 23 seconds. ### Long Cubic background connection last @@ -428,7 +426,7 @@ with the background Cubic connection starting and 70ms RTT. The background connection tries to download 15MB, the main connection downloads 10MB. The test pass if in 100 trials the main connection completes -in less than 22 seconds after the beginning of the trial. +in less than 23 seconds after the beginning of the trial. ### Compete with Cubic over bad Wi-Fi @@ -497,7 +495,7 @@ the same jitter characteristics as in the "bad Wi-Fi" test (see {{bad-wifi}}). The background connection tries to download 10MB, the main connection downloads 4MB. The test pass if in each of 100 trials the main connection completes -in less than 13.5 seconds after the beginning of the trial. +in less than 14.5 seconds after the beginning of the trial. ## Handling of Multimedia Applications @@ -611,8 +609,8 @@ of 100 to 200ms every second. The data rate is set to 20Mbps, and the base RTT before jitter is set to 2ms, i.e., simulating a local server. The test lasts for 5 video groups of frames, i.e. 5 seconds. The measurements start 200ms after the -start of the connection. The expected average delay is set to 120ms, -and the maximum delay is set to 675ms. The test is successful if +start of the connection. The expected average delay is set to 100ms, +and the maximum delay is set to 680ms. The test is successful if 100 trials are all successful. 
# Tests @@ -621,7 +619,10 @@ We need real life tests as well. ## Loopback tests -To do. Write down. +Loopback tests were performed on Windows, downloading 10GB of data over +a loopback connection. They showed picoquic using C4 achieving a data rate +of 3Gbps, slightly more than the 2.9Gbps achieved when using Cubic or the +2.6 Gbps achieved when using BBR. ## Webex prototype deployments diff --git a/papers/cpu_bound.md b/papers/cpu_bound.md new file mode 100644 index 0000000..d9ceacb --- /dev/null +++ b/papers/cpu_bound.md @@ -0,0 +1,112 @@ +# Performance of C4 when CPU bound + +Back in November 2025, when doing tests of C4 on a loopback address, +I observed that C4 achieved lower data rates than Cubic or even BBR. +Since "performance under loopback" was not a high priority scenario, +I filed that in the long pile of issues to deal with later. Then, in +early January 2026, I read a preprint of a paper by Kathrin Elmenhorst +and Nils Aschenbruck titled "2BRobust -- Overcoming TCP BBR Performance +Degradation in Virtual Machines under CPU Contention" +(see: https://arxiv.org/abs/2601.05665). +In that paper, they point out that BBR achieves lower than nominal +performance when running on VMs under high CPU load, and +trace that to pacing issues. Pacing in an application process involves +periodically waiting until the pacing system acquires enough tokens. +If the CPU is highly loaded, the system call can take longer than the +specified maximum wait time, and pacing thus slows the connection more +than expected. In such conditions, they suggest increasing the programmed +pacing rate above the nominal rate, and show that it helps performance. + +C4 is different from BBR, but like BBR it relies on pacing. After reading +the paper, I wondered whether similar fixes might be working for C4 as +well, and I quickly set up a series of tests. 
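The pacing slowdown described above can be illustrated with a deliberately simplified model. This is only a sketch, not code from picoquic or from the paper: the function name, the fixed `quantum` per wakeup, and the assumption that time lost in a late wakeup is never credited back to the token bucket are all hypothetical choices made for illustration.

```python
def effective_pacing_rate(pacing_rate, quantum, wakeup_delay):
    """Return the sending rate achieved when every wait returns late.

    pacing_rate:  programmed pacing rate, in bytes per second
    quantum:      bytes sent after each wakeup (hypothetical burst size)
    wakeup_delay: extra time, in seconds, added by the OS to each wait
    """
    # Time the sender asks to wait before it has tokens for one quantum.
    requested_wait = quantum / pacing_rate
    # Simplifying assumption: the late time is lost, not credited back.
    actual_wait = requested_wait + wakeup_delay
    return quantum / actual_wait


if __name__ == "__main__":
    rate = 125_000_000  # 1 Gbps expressed in bytes per second
    for delay_us in (0, 10, 50, 200):
        achieved = effective_pacing_rate(rate, 65536, delay_us * 1e-6)
        print(f"wakeup delay {delay_us:>3} us: "
              f"{achieved / rate:.1%} of programmed rate")
```

Under these assumptions, the shortfall grows as the requested wait shrinks relative to the wakeup delay, so the effect is larger at high pacing rates.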
+ +## First diagnostic + +As mentioned, the first performance tests on C4 over the loopback +interface showed lower performance than both BBR and Cubic. I repeated +the tests after modifying the test program to capture a log of the +connection in memory, so as to interfere as little as possible +with performance. + +The memory log showed that the connection was almost always +saturating the allocated CWND. That's not the expected behavior +with C4. The CWND is computed as the product of the "nominal +rate" and the "nominal max RTT", which is supposedly higher +than the average RTT. When that works correctly, traces show +the bytes in flight constantly lower than the programmed CWND. +The CWND is mostly used as some kind of "safety belt", to limit +excess sending in abnormal network conditions. + +## First fix, set a floor on max RTT + +The details of the logs showed that the pacing rate was +set at a reasonable value, but that the "nominal max RTT" was +closely tracking the RTT variations, which oscillated between a few +microseconds and up to 1ms. C4 has an adaptive algorithm to draw down +the nominal max RTT over time, and after a few RTTs that value was +getting small, resulting in small values of CWND and low performance. + +Our first fix was to set a 1ms floor on max RTT. This limits +the effect of measurement errors and ensures that the CWND always +enables 1ms of continuous transmission. We can do this fix safely +because C4 enforces pacing at or near the nominal rate, and thus +avoids excessive queues that could lead to packet losses. + +The effect was immediate. Before the fix, the throughput values observed +during loopback tests were both very low and variable. After the +fix of the max RTT, the observed throughputs became systematically +better than what we observe with BBR on the same computer, and within +88% of the values achieved with Cubic. + +## Could it be a pacing issue? 
+ +Bringing the performance of C4 on loopback within 88% of the performance of +Cubic was a great improvement, but why not improve further and reach parity +or better? + +Kathrin Elmenhorst and Nils Aschenbruck observed that the poor performance +of BBR in some scenarios was due to excessive pacing. C4 uses a pacing +logic very similar to BBR, and we see that both C4 and BBR performed less +well than Cubic in loopback tests. Using memory logs, we verified that +the rate estimates were generally lower than the pacing rate, while the +theory assumed that they would be almost the same. This is a strong +indication that excessive wait time in system calls makes pacing too +slow when the CPU is overloaded. + +We considered measuring these waiting times, and then developing an automated +way to increase the pacing rate above the nominal rate when the waiting times +are excessive. The problem is that such adaptive systems are difficult to +test properly, and the time allocated to completing this study was limited. +Instead, we used a simple series of tests to check what happens if we increase +the pacing rate. + +We tested pacing at a fraction above the nominal rate, and we tried three +fractions: 1/32, 1/16, and 3/64. Each test showed an improvement. The smallest +fraction, 1/32, brought the performance close to that of Cubic. The +largest fraction, 1/16, improved on that but was probably excessive: +we observed a large number of packet losses, due to the build-up of +queues. The middle ground, 3/64, actually improved over both previous tests: +the pacing rate was high enough to raise +throughput, but not so high that it would cause excessive losses that slow +the transmission. In fact, performance was now better with C4 than +with Cubic. 
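As a small illustration of the three fractions tried above, the sketch below computes the programmed pacing rate for each one. The helper name and the example nominal rate are hypothetical, chosen only to make the arithmetic easy to follow; this is not the picoquic implementation.

```python
from fractions import Fraction

# The three boost fractions mentioned in the text, in increasing order.
CANDIDATE_BOOSTS = (Fraction(1, 32), Fraction(3, 64), Fraction(1, 16))


def boosted_pacing_rate(nominal_rate, boost):
    """Return the nominal rate increased by the given fraction."""
    return nominal_rate * (1 + boost)


if __name__ == "__main__":
    nominal_rate = 64_000_000  # bytes per second, arbitrary example value
    for boost in CANDIDATE_BOOSTS:
        rate = int(boosted_pacing_rate(nominal_rate, boost))
        print(f"boost {boost}: pacing rate {rate} bytes/s")
```

Exact fractions are used here so that the three candidates stay easy to compare; an implementation working with integer rates could equally compute `rate + (3 * rate) / 64`.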
+ +![graph showing performance on loopback for C4, Cubic and BBR](./perf-loopback-fix-cpu.PNG "Performance on loopback for C4, Cubic and BBR") + +The graph above shows the resulting performance measured on loopback on a +Windows laptop. We did three tests for each congestion control variant, +each time measuring the time needed to transfer 10GB and computing the +resulting throughput. In the first series of tests, before improvements, +C4's performance was very bad. After setting a floor to the max RTT, +performance became better than BBR, but still lower than Cubic. After +adding a 3/64th increase to the pacing rate, performance became better +than Cubic. + +The C4 code in picoquic was fixed to always apply the 1ms floor to the max RTT, +but to only apply the 3/64th increase if the average RTT is less than 1ms. +This restriction ensures that the increase is only used in the narrow +conditions under which it was tested, without impeding other scenarios. +We could replace it with a more general solution in the future, +once we have validated an automated algorithm to detect and compensate for +excessive pacing. \ No newline at end of file diff --git a/papers/perf-loopback-fix-cpu.PNG b/papers/perf-loopback-fix-cpu.PNG new file mode 100644 index 0000000000000000000000000000000000000000..5f9fe4861eadb87065d1830f917e1ccbaaf68b65 GIT binary patch literal 14093 zcmd6O2UJt-wrvE&0-|)JY7{{_2qGOd6zO0>?+7SL3%xgyCh#NDJ4g%CrFRhl>C(Fe z2!tj=fB=E_h5wxY-gnP!?~QZcc#J{c*?WKcTWhX4=UNHVP*b>ck?|q~0=e`^5vB!!
m:
+        v[0] = m
+
+def trace_graphs(tdfs, df_names, f_name=""):
+    colors1 = ["blue", "green", "violet", "red", "orange"]
+    colors2 = ["turquoise", "lime", "magenta", "pink", "yellow" ]
+    colors3 = ["black", "gray", "brown", "blue", "magenta" ]
+    dashes = ['solid', 'dashed', 'dashdot', 'dotted', 'dotted' ]
+    markers = [ 'o', '+', 'x', '^', '.' ]
+    i_max = len(tdfs)
+    if i_max > 5:
+        i_max = 5
+    legends = []
+    # Prepare a subtrace with cwnd and bytes in flight
+    fig, axes = plt.subplots(3, gridspec_kw={'height_ratios': [1, 1, 1]}, figsize=(8, 6), sharex=True, layout='constrained')
+    axes = axes.flatten()
+    #fig.tight_layout()
+    for i in range(0, i_max):
+        l1 = "cwin, " + df_names[i]
+        l2 = "bytes in flight, " + df_names[i]
+        l3 = "rtt, " + df_names[i]
+        l4 = "min rtt, " + df_names[i]
+        l5 = "pacing (B/s), " + df_names[i]
+        l6 = "estimate (B/s), " + df_names[i]
+        l7 = "max_RTT, " + df_names[i]
+        mb = [ 0 ]
+        mw = [ 0 ]
+        # Scan the trace being plotted rather than the global data frame.
+        tdfs[i].apply(lambda x: max_x(mb, x["bytes_in_transit"]), axis=1)
+        tdfs[i].apply(lambda x: max_x(mw, x["cwin"]), axis=1)
+
+        print("Max bytes in transit = " + str(mb[0]))
+        print("Max CWIN = " + str(mw[0]))
+
+        tdfs[i].plot.scatter(ax=axes[0], x="current_time", y="bytes_in_transit", s=15, marker=markers[i], xlabel="time(us)", ylabel="bytes", alpha=0.5, color=colors2[i], label=l2)
+        tdfs[i].plot.line(ax=axes[0], x="current_time", y="cwin", linewidth=2, linestyle=dashes[i], alpha=0.75, xlabel="time(us)", ylabel="bytes", color=colors1[i], label=l1)
+        tdfs[i].plot.line(ax=axes[1], x="current_time", y="rtt_sample", linewidth=2, linestyle=dashes[i], alpha=0.75, xlabel="time(us)", ylabel="us", color=colors1[i], label=l3)
+        tdfs[i].plot.line(ax=axes[1], x="current_time", y="rtt_min", linewidth=1, linestyle=dashes[i], alpha=0.75, xlabel="time(us)", ylabel="us", color=colors2[i], label=l4)
+        tdfs[i].plot.line(ax=axes[1], x="current_time", y="cc_param", linewidth=1, linestyle=dashes[i], alpha=0.75, xlabel="time(us)", ylabel="us", color=colors3[i], label=l7)
+        tdfs[i].plot.line(ax=axes[2], x="current_time", y="pacing_rate", linewidth=2, linestyle=dashes[i], alpha=0.75, xlabel="time(us)", ylabel="bytes/s", color=colors1[i], label=l5)
+        tdfs[i].plot.line(ax=axes[2], x="current_time", y="bw_e", linewidth=2, linestyle=dashes[i], alpha=0.75, xlabel="time(us)", ylabel="bytes/s", color=colors2[i], label=l6)
+    #plt.legend(legends)
+    if len(f_name) == 0:
+        plt.show()
+    else:
+        plt.savefig(f_name)
+
+
+# main
+
+df = pd.read_csv(sys.argv[1], skipinitialspace=True)
+print(df.columns.tolist())
+dfg = df[df['bw_e'] > 0]
+dfg.to_csv("mem_subset.csv")
+tdfs = [ dfg ]
+df_names = [ "c4" ]
+trace_graphs(tdfs, df_names, f_name="")
\ No newline at end of file
From 8e7b8d97f0d4d425ffdd382d2b023055ac6e3789 Mon Sep 17 00:00:00 2001
From: huitema
Date: Sat, 21 Feb 2026 16:17:03 -0800
Subject: [PATCH 3/3] Update specs before next draft version.

---
 doc/c4-design.md  |  55 ++++++++
 doc/c4-design.txt | 278 ++++++++++++++++++++++++++++-------------
 doc/c4-spec.md    |  60 +++++----
 doc/c4-spec.txt   | 310 +++++++++++++++++++++++----------------------
 doc/c4-tests.md   |   6 +-
 doc/c4-tests.txt  | 154 +++++++++++------------
 6 files changed, 522 insertions(+), 341 deletions(-)

diff --git a/doc/c4-design.md b/doc/c4-design.md
index 7964aef..68516aa 100644
--- a/doc/c4-design.md
+++ b/doc/c4-design.md
@@ -1051,6 +1051,61 @@ and then maybe switch to startup mode if a lot of capacity is available.
 This is something that we intend to test, but have not implemented yet.

+## Adaptation to ECN/L4S
+
+Tests with L4S active queue management showed the tension between the
+periodic updates and the L4S goal to minimize queue sizes. Typical L4S deployments
+start marking packets with ECN/CE when the queue size is about 1.5ms, and
+increase the mark rate progressively as the queue size increases,
+reaching 100% when the queue size is about 2ms. If C4 pushes at 25% every 6 RTT,
+and if the bandwidth estimate is accurate,
+the queue size will increase by 25% of the RTT during the first roundtrip,
+before any correction signal can be applied. The increased marking
+rate will affect all connections sharing the bottleneck, which is
+not desirable.
+
+L4S is tuned for the "Prague" algorithm, which increases CWIN by one packet every
+RTT. In a typical trial with a 20ms RTT and a 100 Mbps data rate, it takes 0.12ms
+to send a packet, and thus 12.5 RTT before building a queue of 1.5ms. In the same
+conditions, C4 would have increased the rate by 25% after 6 RTT in the
+aggressive scenario, thus triggering a high rate of marking.
+
+The cascade process made the problem even worse. If a push at 6.25% does increase
+the nominal rate, the next push will be at 25%. If that push and the next one
+did increase the nominal rate, C4 will reenter the initial phase, even if some
+of the pushes did cause ECN/CE marks. The initial phase will then cause a lot
+of packet losses, which will degrade performance.
+
+
+To mitigate this issue, we had to add a "very low" pushing mode, setting the
+pushing rate to only 3.125% if the previous push resulted in a high rate of ECN/CE marks.
+We also replaced the somewhat ad hoc "count of successive probes" with the management
+of a "probe level", defining 4 levels:
+
+- level 0: pushing at 3.125%, spend 1 cycle in cruising before pushing.
+- level 1: pushing at 6.25%, spend 4 cycles in cruising before pushing.
+- level 2: pushing at 25%, spend at most 1 cycle in cruising before pushing.
+- level 3: pushing at 25%, spend at most 1 cycle in cruising before pushing.
+
+The "probe level" is updated after the recovery phase as follows:
+
+- if the previous probe was successful and did not result in a high rate of ECN/CE marks,
+  increase the probe level by 1. If the probe level was already at 3, reenter the startup phase.
+- if the previous probe was successful but did result in a high rate of ECN/CE marks,
+  remain at the same probe level.
+- if the previous probe was not successful but did not result in a high rate of ECN/CE marks,
+  stay at probe level 0 if already at that level, otherwise move back to probe level 1.
+- if the previous probe was not successful and did result in a high rate of ECN/CE marks,
+  move to probe level 0.
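The level list and update rules above can be read as a small lookup table plus a transition function. The following is only a sketch under stated assumptions: the names `PUSH_INCREASE`, `CRUISE_CYCLES`, `MAX_PROBE_LEVEL` and `next_probe_level` are hypothetical and do not come from picoquic, and the choice to leave the level unchanged when startup is reentered is an assumption, since the text does not say what the level becomes at that point.

```python
# Illustrative sketch only: names and structure are assumptions, not picoquic code.
from typing import Tuple

# Pushing increase and cruising cycles per probe level, copied from the level list.
PUSH_INCREASE = {0: 1 / 32, 1: 1 / 16, 2: 1 / 4, 3: 1 / 4}  # 3.125%, 6.25%, 25%, 25%
CRUISE_CYCLES = {0: 1, 1: 4, 2: 1, 3: 1}
MAX_PROBE_LEVEL = 3


def next_probe_level(level: int, success: bool, high_ce: bool) -> Tuple[int, bool]:
    """Return (new_level, reenter_startup), evaluated after the recovery phase.

    success: the previous probe increased the nominal rate.
    high_ce: the previous probe caused a high rate of ECN/CE marks.
    """
    if success and not high_ce:
        if level >= MAX_PROBE_LEVEL:
            # Already at the top level: ask the caller to reenter startup.
            # Keeping the level unchanged here is an assumption.
            return level, True
        return level + 1, False
    if success and high_ce:
        # Capacity was found, but the network complained: hold the level.
        return level, False
    if not high_ce:
        # Unsuccessful without CE marks: stay at 0, otherwise fall back to 1.
        return (0 if level == 0 else 1), False
    # Unsuccessful and heavily marked: drop to the "very low" pushing mode.
    return 0, False
```

With this reading, a flow at level 1 needs three consecutive clean, successful probes before it reenters startup, while a single unsuccessful probe with heavy marking sends any flow back to level 0.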
+
+This logic treats the CE marking differently from other congestion signals, because
+the CE marks are an intentional indication of congestion by the network, and are thus
+less ambiguous than delay increases or packet losses, which can be caused by other
+factors such as delay jitter or random transmission issues. Simulations show that
+this logic allows C4 to quickly discover the available capacity in L4S networks, without spuriously
+reentering the startup phase and causing packet losses. It is equivalent to the
+previous logic when the network does not support L4S.
+
 # Revisiting the Initial Phase

 Our November 2025 design of C4 included a "rate based"
diff --git a/doc/c4-design.txt b/doc/c4-design.txt
index 103286b..7d510bd 100644
--- a/doc/c4-design.txt
+++ b/doc/c4-design.txt
@@ -5,9 +5,9 @@
 Network Working Group                                         C. Huitema
 Internet-Draft                                      Private Octopus Inc.
 Intended status: Informational                             S. Nandakumar
-Expires: 3 August 2026                                       C. Jennings
+Expires: 25 August 2026                                      C. Jennings
                                                                    Cisco
-                                                         30 January 2026
+                                                        21 February 2026


            Design of Christian's Congestion Control Code (C4)
@@ -43,7 +43,7 @@ Status of This Memo
    time. It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

-   This Internet-Draft will expire on 3 August 2026.
+   This Internet-Draft will expire on 25 August 2026.

 Copyright Notice

@@ -53,9 +53,9 @@ Copyright Notice



-Huitema, et al.           Expires 3 August 2026                 [Page 1]
+Huitema, et al.          Expires 25 August 2026                 [Page 1]

-Internet-Draft                  C4 Design                   January 2026
+Internet-Draft                  C4 Design                  February 2026


    This document is subject to BCP 78 and the IETF Trust's Legal
@@ -97,23 +97,25 @@ Table of Contents
      7.1. Coordinated Pushing . . . . . . . . . . . . . . . . . . . 20
      7.2. Variable Pushing Rate . . . . . . . . . . . . . . . . . . 21
      7.3. Pushing rate and Cascades . . . . . . . . . . . . . . . . 22
-   8. Revisiting the Initial Phase . . . . . . . . . . . . . . . . 22
-     8.1.
Why not increasing Max RTT during Initial phase? . . . . 23 - 8.2. Building a robust initial estimator . . . . . . . . . . . 23 - 9. State Machine . . . . . . . . . . . . . . . . . . . . . . . . 24 - 10. Security Considerations . . . . . . . . . . . . . . . . . . . 25 - 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 - 12. Informative References . . . . . . . . . . . . . . . . . . . 25 - Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 28 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 28 + 7.4. Adaptation to ECN/L4S . . . . . . . . . . . . . . . . . . 22 + 8. Revisiting the Initial Phase . . . . . . . . . . . . . . . . 24 + 8.1. Why not increasing Max RTT during Initial phase? . . . . 24 + 8.2. Building a robust initial estimator . . . . . . . . . . . 25 + 9. State Machine . . . . . . . . . . . . . . . . . . . . . . . . 25 + 10. Security Considerations . . . . . . . . . . . . . . . . . . . 27 + 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27 + 12. Informative References . . . . . . . . . . . . . . . . . . . 27 + Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 30 -Huitema, et al. Expires 3 August 2026 [Page 2] +Huitema, et al. Expires 25 August 2026 [Page 2] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 30 + 1. Introduction Christian's Congestion Control Code (C4) is a new congestion control @@ -163,11 +165,9 @@ Internet-Draft C4 Design January 2026 - - -Huitema, et al. Expires 3 August 2026 [Page 3] +Huitema, et al. Expires 25 August 2026 [Page 3] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 2. Studying the reaction to delays @@ -221,9 +221,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 4] +Huitema, et al. 
Expires 25 August 2026 [Page 4] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 In our initial deployments, we detected competition when delay based @@ -277,9 +277,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 5] +Huitema, et al. Expires 25 August 2026 [Page 5] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 2.2. Handling Chaotic Delays @@ -333,9 +333,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 6] +Huitema, et al. Expires 25 August 2026 [Page 6] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 Using the pacing rate that way prevents the larger window to cause @@ -389,9 +389,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 7] +Huitema, et al. Expires 25 August 2026 [Page 7] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 3. Simplifying the initial design @@ -445,9 +445,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 8] +Huitema, et al. Expires 25 August 2026 [Page 8] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 To avoid that, we can have periodic periods in which the endpoint @@ -501,9 +501,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 9] +Huitema, et al. Expires 25 August 2026 [Page 9] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 * C4 notices high jitter, increases Nominal Max RTT accordingly, set @@ -557,9 +557,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 10] +Huitema, et al. Expires 25 August 2026 [Page 10] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 3.3. Monitoring the nominal rate @@ -613,9 +613,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. 
Expires 3 August 2026 [Page 11] +Huitema, et al. Expires 25 August 2026 [Page 11] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 We use the data rate measurement to update the nominal rate, but only @@ -669,9 +669,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 12] +Huitema, et al. Expires 25 August 2026 [Page 12] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 inverse of these times per byte, effectively computing an harmonic @@ -725,9 +725,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 13] +Huitema, et al. Expires 25 August 2026 [Page 13] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 4. Competition with other algorithms @@ -781,9 +781,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 14] +Huitema, et al. Expires 25 August 2026 [Page 14] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 1. Excessive increase of measured RTT (above the nominal Max RTT), @@ -837,9 +837,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 15] +Huitema, et al. Expires 25 August 2026 [Page 15] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 and only react to those losses that are detected by gaps in @@ -893,9 +893,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 16] +Huitema, et al. Expires 25 August 2026 [Page 16] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 proportional to its size. This drives very good long term fairness, @@ -949,9 +949,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 17] +Huitema, et al. 
Expires 25 August 2026                [Page 17]

-Internet-Draft                  C4 Design                   January 2026
+Internet-Draft                  C4 Design                  February 2026


    measurements, before any of the big jitter events had occured. This
@@ -1005,9 +1005,9 @@ Internet-Draft C4 Design January 2026



-Huitema, et al.           Expires 3 August 2026                [Page 18]
+Huitema, et al.          Expires 25 August 2026                [Page 18]

-Internet-Draft                  C4 Design                   January 2026
+Internet-Draft                  C4 Design                  February 2026


    There is no doubt that the current curve will have to be refined. We
@@ -1061,9 +1061,9 @@ Internet-Draft C4 Design January 2026



-Huitema, et al.           Expires 3 August 2026                [Page 19]
+Huitema, et al.          Expires 25 August 2026                [Page 19]

-Internet-Draft                  C4 Design                   January 2026
+Internet-Draft                  C4 Design                  February 2026


    The second feature is the "make before break" nature of the rate
@@ -1117,9 +1117,9 @@ Internet-Draft C4 Design January 2026



-Huitema, et al.           Expires 3 August 2026                [Page 20]
+Huitema, et al.          Expires 25 August 2026                [Page 20]

-Internet-Draft                  C4 Design                   January 2026
+Internet-Draft                  C4 Design                  February 2026


 7.2. Variable Pushing Rate
@@ -1173,9 +1173,9 @@ Internet-Draft C4 Design January 2026



-Huitema, et al.           Expires 3 August 2026                [Page 21]
+Huitema, et al.          Expires 25 August 2026                [Page 21]

-Internet-Draft                  C4 Design                   January 2026
+Internet-Draft                  C4 Design                  February 2026


 7.3. Pushing rate and Cascades
@@ -1204,6 +1204,102 @@ Internet-Draft C4 Design January 2026
    to startup mode if a lot of capacity is available. This is something
    that we intend to test, but have not implemented yet.

+7.4. Adaptation to ECN/L4S
+
+   Tests with L4S active queue management showed the tension between the
+   periodic updates and the L4S goal to minimize queue sizes. Typical L4S
+   deployments start marking packets with ECN/CE when the queue size is
+   about 1.5ms, and increase the mark rate progressively as the queue
+   size increases, reaching 100% when the queue size is about 2ms. If
+   C4 pushes at 25% every 6 RTT, and if the bandwidth estimate is
+   accurate, the queue size will increase by 25% of the RTT during the
+   first roundtrip, before any correction signal can be applied. The
+   increased marking rate will affect all connections sharing the
+   bottleneck, which is not desirable.
+
+   L4S is tuned for the "Prague" algorithm, which increases CWIN by one
+   packet every RTT. In a typical trial with a 20ms RTT and a 100 Mbps
+   data rate, it takes 0.12ms to send a packet, and thus 12.5 RTT before
+   building a queue of 1.5ms. In the same conditions, C4 would have
+   increased the rate by 25% after 6 RTT in the aggressive scenario,
+   thus triggering a high rate of marking.
+
+
+
+
+
+
+Huitema, et al.          Expires 25 August 2026                [Page 22]
+
+Internet-Draft                  C4 Design                  February 2026
+
+
+   The cascade process made the problem even worse. If a push at 6.25%
+   does increase the nominal rate, the next push will be at 25%. If that
+   push and the next one did increase the nominal rate, C4 will reenter
+   the initial phase, even if some of the pushes did cause ECN/CE marks.
+   The initial phase will then cause a lot of packet losses, which will
+   degrade performance.
+
+   To mitigate this issue, we had to add a "very low" pushing mode,
+   setting the pushing rate to only 3.125% if the previous push resulted
+   in a high rate of ECN/CE marks. We also replaced the somewhat ad hoc
+   "count of successive probes" with the management of a "probe level",
+   defining 4 levels:
+
+   * level 0: pushing at 3.125%, spend 1 cycle in cruising before
+     pushing.
+
+   * level 1: pushing at 6.25%, spend 4 cycles in cruising before
+     pushing.
+
+   * level 2: pushing at 25%, spend at most 1 cycle in cruising before
+     pushing.
+
+   * level 3: pushing at 25%, spend at most 1 cycle in cruising before
+     pushing.
+
+   The "probe level" is updated after the recovery phase as follows:
+
+   * if the previous probe was successful and did not result in a high
+     rate of ECN/CE marks, increase the probe level by 1. If the probe
+     level was already at 3, reenter the startup phase.
+
+   * if the previous probe was successful but did result in a high rate
+     of ECN/CE marks, remain at the same probe level.
+
+   * if the previous probe was not successful but did not result in a
+     high rate of ECN/CE marks, stay at probe level 0 if already at
+     that level, otherwise move back to probe level 1.
+
+   * if the previous probe was not successful and did result in a high
+     rate of ECN/CE marks, move to probe level 0.
+
+
+
+
+
+
+
+
+
+
+
+
+Huitema, et al.          Expires 25 August 2026                [Page 23]
+
+Internet-Draft                  C4 Design                  February 2026
+
+
+   This logic treats the CE marking differently from other congestion
+   signals, because the CE marks are an intentional indication of
+   congestion by the network, and are thus less ambiguous than delay
+   increases or packet losses, which can be caused by other factors such
+   as delay jitter or random transmission issues. Simulations show that
+   this logic allows C4 to quickly discover the available capacity in L4S
+   networks, without spuriously reentering the startup phase and
+   causing packet losses. It is equivalent to the previous logic when
+   the network does not support L4S.
+
 8. Revisiting the Initial Phase

    Our November 2025 design of C4 included a "rate based" initial phase,
@@ -1224,16 +1320,6 @@ Internet-Draft C4 Design January 2026
    and the initial RTT was large enough, but in many case it was not and
    became a limiting factor.

-
-
-
-
-
-Huitema, et al.           Expires 3 August 2026                [Page 22]
-
-Internet-Draft                  C4 Design                   January 2026
-
-
 8.1. Why not increasing Max RTT during Initial phase?
In the initial phase, the algorithm tries to discover the bandwidth @@ -1253,6 +1339,13 @@ Internet-Draft C4 Design January 2026 easy to monitor and well correlated with the actual cause of the delay. + + +Huitema, et al. Expires 25 August 2026 [Page 24] + +Internet-Draft C4 Design February 2026 + + 8.2. Building a robust initial estimator The "rate based" initial estimator requires estimating both the data @@ -1282,14 +1375,6 @@ Internet-Draft C4 Design January 2026 When the initial phase completes, we retain as estimate of the data rate the highest value measured so far. We also want to obtain a reasonable estimate of the "max RTT". In the Reno logic, the - - - -Huitema, et al. Expires 3 August 2026 [Page 23] - -Internet-Draft C4 Design January 2026 - - "ssthresh" is set to half the CWND value before congestion is detected. C4 will not use the ssthresh variable after exiting the Initial phase, but it can set the max RTT to the quotient of ssthresh @@ -1304,6 +1389,19 @@ Internet-Draft C4 Design January 2026 "nominal_cwnd" does not increase for 3 consecutive round trips. When the connection exits startup, it enters "recovery". + + + + + + + + +Huitema, et al. Expires 25 August 2026 [Page 25] + +Internet-Draft C4 Design February 2026 + + * "recovery": the connection enters that state after "startup", "pushing", or a congestion detection in a "cruising" state. It remains in that state for at least one roundtrip, until the first @@ -1341,9 +1439,23 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 24] + + + + + + + + + + + + + + +Huitema, et al. Expires 25 August 2026 [Page 26] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 Start @@ -1397,9 +1509,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 25] +Huitema, et al. 
Expires 25 August 2026 [Page 27] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 [I-D.ietf-moq-transport] @@ -1453,9 +1565,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 26] +Huitema, et al. Expires 25 August 2026 [Page 28] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 [Cubic-QUIC-Blog] @@ -1509,9 +1621,9 @@ Internet-Draft C4 Design January 2026 -Huitema, et al. Expires 3 August 2026 [Page 27] +Huitema, et al. Expires 25 August 2026 [Page 29] -Internet-Draft C4 Design January 2026 +Internet-Draft C4 Design February 2026 Acknowledgments @@ -1565,4 +1677,4 @@ Authors' Addresses -Huitema, et al. Expires 3 August 2026 [Page 28] +Huitema, et al. Expires 25 August 2026 [Page 30] diff --git a/doc/c4-spec.md b/doc/c4-spec.md index ff1b1c5..2323a48 100644 --- a/doc/c4-spec.md +++ b/doc/c4-spec.md @@ -210,9 +210,12 @@ C4 maintains a set of variables tracking the evolution of the flow: - running min RTT, an approximation of the min RTT for the flow, - number of eras without increase (see {{c4-initial}}), -- number of successful pushes, - current state of the algorithm, which can be Initial, Recovery, Cruising or Pushing. +- probe level. + +The probe level determines how aggressive the pushing phase is, and also +how long to wait between recovery and pushing. ## Per era variables {#era-variables} @@ -395,7 +398,8 @@ assess the loss rate. On exiting the Initial state, C4 computes an estimate of the nominal max RTT as the quotient of the half the last CWND divided by the last -nominal rate, and updates the "nominal max RTT" accordingly. +nominal rate, and updates the "nominal max RTT" accordingly. The probe level +is set to 1. ### Reentering the initial state @@ -434,8 +438,20 @@ less, * An increase of at least 1/4th of the expected increase otherwise, for example an increase of 1/16th if `alpha_previous` was 5/4. 
-C4 re-enters "Initial" at the end of the recovery period if the evaluation
-shows 3 successive rate increases without congestion, or if
+The probe level is updated as follows:
+
+* If the prior pushing was successful, and did not trigger an excessive rate of ECN/CE marks,
+  the probe level is increased by 1.
+* If the prior pushing was successful but did trigger an excessive rate of ECN/CE marks,
+  the probe level remains unchanged.
+* If the prior pushing was not successful but did not trigger an excessive rate of ECN/CE marks,
+  the probe level is left unchanged if it was 0, and set to 1 otherwise.
+* If the prior pushing was not successful and did trigger an excessive rate of ECN/CE marks,
+  the probe level is set to 0.
+
+
+C4 re-enters "Initial" at the end of the recovery period if the probe level
+has reached a value of 4 or larger, or if
 high jitter requires restarting the Initial phase (see {{restart-high-jitter}}.
 Otherwise, C4 enters cruising.

@@ -465,27 +481,25 @@ This will be done at most once per flow.
 The Cruising state is entered from the Recovery state. The coefficient `alpha_current` is set to 1.

-C4 will normally transition from Cruising state to Pushing state
-after 4 eras. It will transition to Recovery before that if
-a congestion signal is received.
+C4 will transition from Cruising state to Pushing state
+after a number of eras that depends on the probe level:
+
+- 1 era if the probe level is 0,
+- 4 eras if the probe level is 1,
+- 1 era if the probe level is 2 or 3.
+
+C4 will transition to Recovery earlier if
+a congestion signal is received before the transition to Pushing.

 ## Pushing state {#c4-pushing}

 The Pushing state is entered from the Cruising state.
-The coefficient `alpha_current` depend on whether the -previous -pushing attempt was successful (see {{c4-recovery}}), -and also of the current value of `ecn_alpha` -(see {{process-ecn}}): +The coefficient `alpha_current` depends on the probe level: -~~~ - if not previous_attempt_successful: - alpha_current = 17/16 - else: - alpha_current = 17/16 + - 17/16 * (1 - ecn_alpha / ecn_threshold) -~~~ +- If the probe level is 0, `alpha_current` is set to 33/32. +- If the probe level is 1, `alpha_current` is set to 17/16. +- If the probe level is 2 or higher, `alpha_current` is set to 5/4. C4 exits the pushing state after one era, or if a congestion signal is received before that. In an exception to @@ -527,13 +541,13 @@ drive these flows towards sharing the available resource evenly. The sensitivity coefficient varies from 0 to 1, according to a simple curve: -* set sensitivity to 0 if data rate is lower than 50000B/s +* set sensitivity to 0 if data rate is lower than 50,000 B/s * linear interpolation between 0 and 0.92 for values - between 50,000 and 1,000,000B/s. + between 50,000 and 1,000,000 B/s. * linear interpolation between 0.92 and 1 for values - between 1,000,000 and 10,000,000B/s. + between 1,000,000 and 10,000,000 B/s. * set sensitivity to 1 if data rate is higher than - 10,000,000B/s + 10,000,000 B/s The sensitivity index is then used to set the value of delay and loss and CE thresholds. diff --git a/doc/c4-spec.txt b/doc/c4-spec.txt index 4c152bc..24226f6 100644 --- a/doc/c4-spec.txt +++ b/doc/c4-spec.txt @@ -5,9 +5,9 @@ Network Working Group C. Huitema Internet-Draft Private Octopus Inc. Intended status: Experimental S. Nandakumar -Expires: 3 August 2026 C. Jennings +Expires: 25 August 2026 C. Jennings Cisco - 30 January 2026 + 21 February 2026 Specification of Christian's Congestion Control Code (C4) @@ -40,7 +40,7 @@ Status of This Memo time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." 
- This Internet-Draft will expire on 3 August 2026. + This Internet-Draft will expire on 25 August 2026. Copyright Notice @@ -53,9 +53,9 @@ Copyright Notice -Huitema, et al. Expires 3 August 2026 [Page 1] +Huitema, et al. Expires 25 August 2026 [Page 1] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 Please review these documents carefully, as they describe your rights @@ -78,30 +78,30 @@ Table of Contents 4.2. Initial state . . . . . . . . . . . . . . . . . . . . . . 9 4.2.1. Reentering the initial state . . . . . . . . . . . . 9 4.3. Recovery state . . . . . . . . . . . . . . . . . . . . . 10 - 4.3.1. Restarting Initial if High Jitter . . . . . . . . . . 10 + 4.3.1. Restarting Initial if High Jitter . . . . . . . . . . 11 4.4. Cruising state {#c4-cruising } . . . . . . . . . . . . . 11 4.5. Pushing state . . . . . . . . . . . . . . . . . . . . . . 11 - 5. Handling of congestion signals . . . . . . . . . . . . . . . 11 + 5. Handling of congestion signals . . . . . . . . . . . . . . . 12 5.1. Variable Sensitivity . . . . . . . . . . . . . . . . . . 12 - 5.2. Detecting Excessive Delays . . . . . . . . . . . . . . . 12 + 5.2. Detecting Excessive Delays . . . . . . . . . . . . . . . 13 5.3. Detecting Excessive Losses . . . . . . . . . . . . . . . 13 5.3.1. Do not react to Probe Time Out . . . . . . . . . . . 13 - 5.4. Detecting Excessive CE Marks . . . . . . . . . . . . . . 13 + 5.4. Detecting Excessive CE Marks . . . . . . . . . . . . . . 14 5.5. Applying congestion signals . . . . . . . . . . . . . . . 14 5.5.1. Rate Reduction on Congestion . . . . . . . . . . . . 14 6. Implementation considerations . . . . . . . . . . . . . . . . 15 6.1. Rate measurement should be conservative . . . . . . . . . 15 - 6.2. Pacing and CPU load . . . . . . . . . . . . . . . . . . . 15 + 6.2. Pacing and CPU load . . . . . . . . . . . . . . . . . . . 16 6.3. Nominal max RTT on low latency links . . . . . . . . . . 16 7. 
Security Considerations . . . . . . . . . . . . . . . . . . . 16 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 9.1. Normative References . . . . . . . . . . . . . . . . . . 16 - 9.2. Informative References . . . . . . . . . . . . . . . . . 16 + 9.1. Normative References . . . . . . . . . . . . . . . . . . 17 + 9.2. Informative References . . . . . . . . . . . . . . . . . 17 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 17 Changes since previous versions . . . . . . . . . . . . . . . . . 17 Changes since draft-huitema-ccwg-c4-spec-00 . . . . . . . . . . 17 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 17 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18 @@ -109,9 +109,9 @@ Table of Contents -Huitema, et al. Expires 3 August 2026 [Page 2] +Huitema, et al. Expires 25 August 2026 [Page 2] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 1. Introduction @@ -165,9 +165,9 @@ Internet-Draft C4 Specification January 2026 -Huitema, et al. Expires 3 August 2026 [Page 3] +Huitema, et al. Expires 25 August 2026 [Page 3] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 3.1. Nominal rate @@ -221,9 +221,9 @@ Internet-Draft C4 Specification January 2026 -Huitema, et al. Expires 3 August 2026 [Page 4] +Huitema, et al. Expires 25 August 2026 [Page 4] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 * the latency of the path @@ -277,18 +277,21 @@ Internet-Draft C4 Specification January 2026 -Huitema, et al. Expires 3 August 2026 [Page 5] +Huitema, et al. 
Expires 25 August 2026 [Page 5] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 * number of eras without increase (see Section 4.2), - * number of successful pushes, - * current state of the algorithm, which can be Initial, Recovery, Cruising or Pushing. + * probe level. + + The probe level determines how aggressive the pushing phase is, and + also how long to wait between recovery and pushing. + 3.4. Per era variables C4 keeps variables per era: @@ -326,18 +329,20 @@ Internet-Draft C4 Specification January 2026 "cruising" state until at least 4 RTT and the connection is not "app limited". At that point, it enters "pushing". - * "Pushing": the connection is using a rate and CWND 25% larger than - "nominal_rate" and "nominal_CWND". It remains in that state for - one round trip, i.e., until the first packet send while "pushing" - is acknowledged. At that point, it enters the "recovery" state. -Huitema, et al. Expires 3 August 2026 [Page 6] + +Huitema, et al. Expires 25 August 2026 [Page 6] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 + * "Pushing": the connection is using a rate and CWND 25% larger than + "nominal_rate" and "nominal_CWND". It remains in that state for + one round trip, i.e., until the first packet send while "pushing" + is acknowledged. At that point, it enters the "recovery" state. + These transitions are summarized in the following state diagram. Start @@ -382,18 +387,15 @@ Internet-Draft C4 Specification January 2026 * congestion window: set to the equivalent of 10 packets, - * congestion quantum: set to zero. - - - - -Huitema, et al. Expires 3 August 2026 [Page 7] +Huitema, et al. Expires 25 August 2026 [Page 7] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 + * congestion quantum: set to zero. 
+ If the nominal rate or the nominal max RTT are both assessed, C4 sets pacing rate, and congestion window to values that depends on these variables and on a coefficient alpha_current: @@ -443,11 +445,9 @@ Internet-Draft C4 Specification January 2026 - - -Huitema, et al. Expires 3 August 2026 [Page 8] +Huitema, et al. Expires 25 August 2026 [Page 8] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 4.2. Initial state @@ -488,7 +488,8 @@ Internet-Draft C4 Specification January 2026 On exiting the Initial state, C4 computes an estimate of the nominal max RTT as the quotient of the half the last CWND divided by the last - nominal rate, and updates the "nominal max RTT" accordingly. + nominal rate, and updates the "nominal max RTT" accordingly. The + probe level is set to 1. 4.2.1. Reentering the initial state @@ -500,10 +501,9 @@ Internet-Draft C4 Specification January 2026 - -Huitema, et al. Expires 3 August 2026 [Page 9] +Huitema, et al. Expires 25 August 2026 [Page 9] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 4.3. Recovery state @@ -537,31 +537,46 @@ Internet-Draft C4 Specification January 2026 * An increase of at least 1/4th of the expected increase otherwise, for example an increase of 1/16th if alpha_previous was 5/4. - C4 re-enters "Initial" at the end of the recovery period if the - evaluation shows 3 successive rate increases without congestion, or - if high jitter requires restarting the Initial phase (see - Section 4.3.1. Otherwise, C4 enters cruising. + The probe level is updated as follow: - Reception of a congestion signal during the Initial phase does not - cause a change in the nominal_rate or nominal_max_RTT. + * If the prior pushing was successful, and did not trigger an + excessive rate of ECN/CE marks, the probe level is increased by 1. -4.3.1. 
Restarting Initial if High Jitter + * If the prior pushing was successful but did trigger an excessive + rate of ECN/CE marks, the probe level remains unchanged. - The "nominal max RTT" is not updated during the Initial phase, - because doing so would prevent exiting Initial on high delay - detection. This can lead to underestimation of the "nominal rate" if - the flow is operating on a path with high jitter. + * If the prior pushing was not successful but did not trigger an + excessive rate of ECN/CE marks, the probe level left unchanged if + it was 0, set to 1 otherwise. + + * If the prior pushing was not successful and did trigger an + excessive rate of ECN/CE marks, the probe level is set to 0. -Huitema, et al. Expires 3 August 2026 [Page 10] +Huitema, et al. Expires 25 August 2026 [Page 10] -Internet-Draft C4 Specification January 2026 +Internet-Draft C4 Specification February 2026 + C4 re-enters "Initial" at the end of the recovery period if the probe + level as reached a value 4 or larger, or if high jitter requires + restarting the Initial phase (see Section 4.3.1. Otherwise, C4 + enters cruising. + + Reception of a congestion signal during the Initial phase does not + cause a change in the nominal_rate or nominal_max_RTT. + +4.3.1. Restarting Initial if High Jitter + + The "nominal max RTT" is not updated during the Initial phase, + because doing so would prevent exiting Initial on high delay + detection. This can lead to underestimation of the "nominal rate" if + the flow is operating on a path with high jitter. + C4 will reenter the "initial" phase on the first time high jitter is detected for the flow. The high jitter is detected after updating the "nominal max RTT" at the end of the recovery era, if: @@ -575,23 +590,37 @@ Internet-Draft C4 Specification January 2026 The Cruising state is entered from the Recovery state. The coefficient alpha_current is set to 1. - C4 will normally transition from Cruising state to Pushing state - after 4 eras. 
It will transition to Recovery before that if a - congestion signal is received. + C4 will transition from Cruising state to Pushing state after a + number of eras that depend on the probe level: + + * 1 era if the probe level is 0, + + * 4 eras if the probe level is 1, + + * 1 era if the probe level is 2 or 3. + + C4 will transition to Recovery before that if a congestion signal is + received before transition to Pushing. 4.5. Pushing state The Pushing state is entered from the Cruising state. - The coefficient alpha_current depend on whether the previous pushing - attempt was successful (see Section 4.3), and also of the current - value of ecn_alpha (see Section 5.4): + The coefficient alpha_current depend on the probe level: + + * If the probe level is 0, alpha_current is set to 33/32. + + + + +Huitema, et al. Expires 25 August 2026 [Page 11] + +Internet-Draft C4 Specification February 2026 - if not previous_attempt_successful: - alpha_current = 17/16 - else: - alpha_current = 17/16 + - 17/16 * (1 - ecn_alpha / ecn_threshold) + + * If the probe level is 1, alpha_current is set to 17/16. + + * If the probe level is 2 or higher, alpha_current is set to 5/4. C4 exits the pushing state after one era, or if a congestion signal is received before that. In an exception to standard congestion @@ -610,14 +639,6 @@ Internet-Draft C4 Specification January 2026 2. Excessive rate of packet losses (but not mere Probe Time Out, see Section 5.3.1), - - - -Huitema, et al. Expires 3 August 2026 [Page 11] - -Internet-Draft C4 Specification January 2026 - - 3. Excessive rate of ECN/CE marks C4 monitors successive RTT measurements and compare them to a @@ -643,15 +664,23 @@ Internet-Draft C4 Specification January 2026 The sensitivity coefficient varies from 0 to 1, according to a simple curve: - * set sensitivity to 0 if data rate is lower than 50000B/s + * set sensitivity to 0 if data rate is lower than 50000 B/s + + + + +Huitema, et al. 
Expires 25 August 2026 [Page 12] + +Internet-Draft C4 Specification February 2026 + * linear interpolation between 0 and 0.92 for values between 50,000 - and 1,000,000B/s. + and 1,000,000 B/s. * linear interpolation between 0.92 and 1 for values between - 1,000,000 and 10,000,000B/s. + 1,000,000 and 10,000,000 B/s. - * set sensitivity to 1 if data rate is higher than 10,000,000B/s + * set sensitivity to 1 if data rate is higher than 10,000,000 B/s The sensitivity index is then used to set the value of delay and loss and CE thresholds. @@ -666,14 +695,6 @@ Internet-Draft C4 Specification January 2026 A delay congestion signal is detected if: - - - -Huitema, et al. Expires 3 August 2026 [Page 12] - -Internet-Draft C4 Specification January 2026 - - rtt_sample > nominal_max_rtt + delay_threshold 5.3. Detecting Excessive Losses @@ -701,6 +722,14 @@ Internet-Draft C4 Specification January 2026 sent packet has not been acknowledged. This is not a robust congestion signal, because delay jitter may also cause PTO timeouts. When testing in "high jitter" conditions, we realized that we should + + + +Huitema, et al. Expires 25 August 2026 [Page 13] + +Internet-Draft C4 Specification February 2026 + + not change the state of C4 for losses detected solely based on timer, and only react to those losses that are detected by gaps in acknowledgements. @@ -718,18 +747,6 @@ Internet-Draft C4 Specification January 2026 The ratio ecn_alpha is updated each time an acknowledgement is received, as follow: - - - - - - - -Huitema, et al. Expires 3 August 2026 [Page 13] - -Internet-Draft C4 Specification January 2026 - - delta_ce = increase in the reported CE marks delta_ect1 = increase in the reported ECT(1) marks frac = delta_ce / (delta_ce + delta_ect1) @@ -762,6 +779,13 @@ Internet-Draft C4 Specification January 2026 nominal_rate by the factor "beta" corresponding to the congestion signal: + + +Huitema, et al. 
Expires 25 August 2026 [Page 14] + +Internet-Draft C4 Specification February 2026 + + nominal_rate = (1-beta)*nominal_rate The coefficient beta differs depending on the nature of the @@ -779,13 +803,6 @@ Internet-Draft C4 Specification January 2026 If the signal is an ECN/CE rate, the coefficient is proportional to the difference between ecn_alpha and ecn_threshold, capped to 1/4: - - -Huitema, et al. Expires 3 August 2026 [Page 14] - -Internet-Draft C4 Specification January 2026 - - beta = min(1/4, (ecn_alpha - ecn_threshold)/ ecn_threshold) 6. Implementation considerations @@ -810,6 +827,21 @@ Internet-Draft C4 Specification January 2026 interval are long enough so jitter only has a small influence. Cautious implementations should use both strategies. + + + + + + + + + + +Huitema, et al. Expires 25 August 2026 [Page 15] + +Internet-Draft C4 Specification February 2026 + + 6.2. Pacing and CPU load C4 relies on pacing during to avoid sending data too fast during the @@ -834,14 +866,6 @@ Internet-Draft C4 Specification January 2026 definitely improved performance in low-latency environment, in particular loopback interfaces. - - - -Huitema, et al. Expires 3 August 2026 [Page 15] - -Internet-Draft C4 Specification January 2026 - - 6.3. Nominal max RTT on low latency links When doing tests on low latency links, we observed on some systems a @@ -867,6 +891,13 @@ Internet-Draft C4 Specification January 2026 9. References + + +Huitema, et al. Expires 25 August 2026 [Page 16] + +Internet-Draft C4 Specification February 2026 + + 9.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate @@ -885,19 +916,6 @@ Internet-Draft C4 Specification January 2026 DOI 10.17487/RFC9000, May 2021, . - - - - - - - - -Huitema, et al. Expires 3 August 2026 [Page 16] - -Internet-Draft C4 Specification January 2026 - - [I-D.ietf-moq-transport] Nandakumar, S., Vasiliev, V., Swett, I., and A. 
Frindell, "Media over QUIC Transport", Work in Progress, Internet- @@ -928,6 +946,14 @@ Changes since draft-huitema-ccwg-c4-spec-00 Section 5.5.1. Update section Section 4.5 to modulate pushing rate based on observed rate of ECN/CE marks. + + + +Huitema, et al. Expires 25 August 2026 [Page 17] + +Internet-Draft C4 Specification February 2026 + + Added the RTT margin consideration in Section 4.1, and changed the computation of the "quantum" from: @@ -947,13 +973,6 @@ Authors' Addresses Email: huitema@huitema.net - - -Huitema, et al. Expires 3 August 2026 [Page 17] - -Internet-Draft C4 Specification January 2026 - - Suhas Nandakumar Cisco Email: snandaku@cisco.com @@ -986,23 +1005,4 @@ Internet-Draft C4 Specification January 2026 - - - - - - - - - - - - - - - - - - - -Huitema, et al. Expires 3 August 2026 [Page 18] +Huitema, et al. Expires 25 August 2026 [Page 18] diff --git a/doc/c4-tests.md b/doc/c4-tests.md index 9317b14..ff4d41b 100644 --- a/doc/c4-tests.md +++ b/doc/c4-tests.md @@ -158,7 +158,7 @@ periodic small bumps during the "push" transitions. This scenario simulates a 20MB download over a 200 Mbps link, with a 40ms RTT, and a bottleneck buffer capacity corresponding to 1 BDP. The test verifies that 100 simulations all complete -in less than 1.31 seconds. +in less than 1.3 seconds. This short test shows that the initial phase correctly discover the path capacity, and that the transmission operates at @@ -237,7 +237,7 @@ exercises the "slow down" mechanism to discover the new RTT. The "ECN" test simulates a 20 Mbps link, with an 80ms RTT, and a bottleneck buffer capacity corresponding to 1 BDP. The test verifies that 100 simulated downloads of -10 MB all complete in less than 5 seconds. +10 MB all complete in less than 4.5 seconds. ## Handling of High Jitter Environments {#c4-wifi} @@ -383,7 +383,7 @@ the same jitter characteristics as in the "bad Wi-Fi" test (see {{bad-wifi}}). 
The background connection tries to download 10MB, the main connection downloads 4MB. The test pass if in each of 100 trials the main connection completes -in less than 11 seconds after the beginning of the trial. +in less than 12 seconds after the beginning of the trial. ## Competition with Cubic diff --git a/doc/c4-tests.txt b/doc/c4-tests.txt index ae82da5..64c81c9 100644 --- a/doc/c4-tests.txt +++ b/doc/c4-tests.txt @@ -5,9 +5,9 @@ Network Working Group C. Huitema Internet-Draft Private Octopus Inc. Intended status: Informational S. Nandakumar -Expires: 6 June 2026 C. Jennings +Expires: 25 August 2026 C. Jennings Cisco - 3 December 2025 + 21 February 2026 Testing of Christian's Congestion Control Code (C4) @@ -40,11 +40,11 @@ Status of This Memo time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on 6 June 2026. + This Internet-Draft will expire on 25 August 2026. Copyright Notice - Copyright (c) 2025 IETF Trust and the persons identified as the + Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal @@ -53,9 +53,9 @@ Copyright Notice -Huitema, et al. Expires 6 June 2026 [Page 1] +Huitema, et al. Expires 25 August 2026 [Page 1] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 Please review these documents carefully, as they describe your rights @@ -109,20 +109,20 @@ Table of Contents -Huitema, et al. Expires 6 June 2026 [Page 2] +Huitema, et al. Expires 25 August 2026 [Page 2] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 2.8.7. Media over bad Wi-Fi . . . . . . . . . . . . . . . . 14 3. Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3.1. Loopback tests . . . . . . . . . . . . . . . . . . . . . 14 3.2. Webex prototype deployments . . . . . . . . . . . . . . . 14 - 4. 
Security Considerations . . . . . . . . . . . . . . . . . . . 14 + 4. Security Considerations . . . . . . . . . . . . . . . . . . . 15 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 6. Informative References . . . . . . . . . . . . . . . . . . . 15 - Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 15 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 15 + Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 16 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 16 1. Introduction @@ -165,9 +165,9 @@ Internet-Draft C4 Tests December 2025 -Huitema, et al. Expires 6 June 2026 [Page 3] +Huitema, et al. Expires 25 August 2026 [Page 3] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 That implementation is designed so that the same code can be used in @@ -221,29 +221,29 @@ Internet-Draft C4 Tests December 2025 -Huitema, et al. Expires 6 June 2026 [Page 4] +Huitema, et al. Expires 25 August 2026 [Page 4] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 2.2.1. Simulation of a simple 20Mbps connection This scenario simulates a 10MB download over a 20 Mbps link, with an 80ms RTT, and a bottlneck buffer capacity corresponding to 1 BDP. - The test verifies that 100 simulations all complete in less than 5 + The test verifies that 100 simulations all complete in less than 4.7 seconds. In a typical simulation, we see a initial phase complete in less than 800ms, followed by a recovery phase in which the transmission rate stabilizes to the line rate. After that, the RTT remains very close to the path RTT, except for periodic small bumps during the "push" - transitions. The typical test completes in 4.85 seconds. + transitions. 2.2.2. Simulation of a simple 200Mbps connection This scenario simulates a 20MB download over a 200 Mbps link, with a 40ms RTT, and a bottleneck buffer capacity corresponding to 1 BDP. 
- The test verifies that 100 simulations all complete in less than 1.25 + The test verifies that 100 simulations all complete in less than 1.3 seconds. This short test shows that the initial phase correctly discover the @@ -258,7 +258,7 @@ Internet-Draft C4 Tests December 2025 also tests the support for careful resume [I-D.ietf-tsvwg-careful-resume] by setting the remembered CWND to 18750000 bytes and the remembered RTT to 600.123ms. The test - verifies that 100 simulations all complete in less than 7.4 seconds. + verifies that 100 simulations all complete in less than 7.7 seconds. 2.2.4. Low and up @@ -277,9 +277,9 @@ Internet-Draft C4 Tests December 2025 -Huitema, et al. Expires 6 June 2026 [Page 5] +Huitema, et al. Expires 25 August 2026 [Page 5] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 2.2.5. Drop and back @@ -323,7 +323,7 @@ Internet-Draft C4 Tests December 2025 The "ECN" test simulates a 20 Mbps link, with an 80ms RTT, and a bottleneck buffer capacity corresponding to 1 BDP. The test verifies - that 100 simulated downloads of 10 MB all complete in less than 5.6 + that 100 simulated downloads of 10 MB all complete in less than 4.5 seconds. @@ -333,9 +333,9 @@ Internet-Draft C4 Tests December 2025 -Huitema, et al. Expires 6 June 2026 [Page 6] +Huitema, et al. Expires 25 August 2026 [Page 6] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 2.4. Handling of High Jitter Environments @@ -383,15 +383,15 @@ Internet-Draft C4 Tests December 2025 spikes of 100 to 200ms every second. The data rate is set to 10Mbps, and the base RTT before jitter is set to 2ms, i.e., simulating a local server. The test pass if 100 different 10MB downloads each - complete in less than 4.5 seconds. + complete in less than 4.3 seconds. -Huitema, et al. Expires 6 June 2026 [Page 7] +Huitema, et al. Expires 25 August 2026 [Page 7] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 2.4.2. 
Wifi fade trial @@ -445,9 +445,9 @@ Internet-Draft C4 Tests December 2025 -Huitema, et al. Expires 6 June 2026 [Page 8] +Huitema, et al. Expires 25 August 2026 [Page 8] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 2.5.2. Short background C4 connection first @@ -474,7 +474,7 @@ Internet-Draft C4 Tests December 2025 same time and using the same path. The path has a 20Mbps data rate and 80ms RTT. The background connection tries to download 30MB, the main connection downloads 20MB. The test pass if in 100 trials the - main connection completes in less than 22.8 seconds. + main connection completes in less than 23 seconds. 2.5.5. Long background C4 connection last @@ -483,7 +483,7 @@ Internet-Draft C4 Tests December 2025 the main connection. The path has a 10Mbps data rate and 70ms RTT. The background connection tries to download 15MB, the main connection downloads 10MB. The test pass if in 100 trials the main connection - completes in less than 22.2 seconds after the beginning of the trial. + completes in less than 23 seconds after the beginning of the trial. 2.5.6. Compete with C4 over bad Wi-Fi @@ -494,16 +494,16 @@ Internet-Draft C4 Tests December 2025 the same jitter characteristics as in the "bad Wi-Fi" test (see Section 2.4.1). The background connection tries to download 10MB, the main connection downloads 4MB. The test pass if in each of 100 - trials the main connection completes in less than 13 seconds after + trials the main connection completes in less than 12 seconds after the beginning of the trial. -Huitema, et al. Expires 6 June 2026 [Page 9] +Huitema, et al. Expires 25 August 2026 [Page 9] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 2.6. Competition with Cubic @@ -525,7 +525,7 @@ Internet-Draft C4 Tests December 2025 same time and using the same path. The path has a 20Mbps data rate and 80ms RTT. The background Cubic connection tries to download 10MB, the main connection downloads 5MB. 
The test pass if in 100 - trials the main connection completes in less than 6.7 seconds. + trials the main connection completes in less than 6.8 seconds. 2.6.2. Two long C4 and Cubic connections @@ -533,8 +533,7 @@ Internet-Draft C4 Tests December 2025 starting at the same time and using the same path. The path has a 20Mbps data rate and 80ms RTT. The background connection tries to download 30MB, the main connection downloads 20MB. The test pass if - in 100 trials the main connection completes in less than 22.2 - seconds. + in 100 trials the main connection completes in less than 23 seconds. 2.6.3. Long Cubic background connection last @@ -543,7 +542,7 @@ Internet-Draft C4 Tests December 2025 starting 1 second after the main connection. The path has a 10Mbps data rate and 70ms RTT. The background connection tries to download 15MB, the main connection downloads 10MB. The test pass if in 100 - trials the main connection completes in less than 22 seconds after + trials the main connection completes in less than 23 seconds after the beginning of the trial. 2.6.4. Compete with Cubic over bad Wi-Fi @@ -554,15 +553,15 @@ Internet-Draft C4 Tests December 2025 10Mbps data rate and 2ms RTT, plus Wi-Fi jitter set to 7ms average -- the same jitter characteristics as in the "bad Wi-Fi" test (see Section 2.4.1). The background connection tries to download 10MB, + the main connection downloads 4MB. The test pass if in each of 100 -Huitema, et al. Expires 6 June 2026 [Page 10] +Huitema, et al. Expires 25 August 2026 [Page 10] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 - the main connection downloads 4MB. The test pass if in each of 100 trials the main connection completes in less than 12.5 seconds after the beginning of the trial. @@ -613,9 +612,10 @@ Internet-Draft C4 Tests December 2025 -Huitema, et al. Expires 6 June 2026 [Page 11] + +Huitema, et al. 
Expires 25 August 2026 [Page 11] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 2.7.4. Compete with BBR over bad Wi-Fi @@ -627,7 +627,7 @@ Internet-Draft C4 Tests December 2025 -- the same jitter characteristics as in the "bad Wi-Fi" test (see Section 2.4.1). The background connection tries to download 10MB, the main connection downloads 4MB. The test pass if in each of 100 - trials the main connection completes in less than 13.5 seconds after + trials the main connection completes in less than 14.5 seconds after the beginning of the trial. 2.8. Handling of Multimedia Applications @@ -669,9 +669,9 @@ Internet-Draft C4 Tests December 2025 -Huitema, et al. Expires 6 June 2026 [Page 12] +Huitema, et al. Expires 25 August 2026 [Page 12] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 2.8.1. Media on High Speed Connection @@ -725,9 +725,9 @@ Internet-Draft C4 Tests December 2025 -Huitema, et al. Expires 6 June 2026 [Page 13] +Huitema, et al. Expires 25 August 2026 [Page 13] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 to 110ms, and the maximum delay is set to 350ms. The test is @@ -757,8 +757,8 @@ Internet-Draft C4 Tests December 2025 before jitter is set to 2ms, i.e., simulating a local server. The test lasts for 5 video groups of frames, i.e. 5 seconds. The measurements start 200ms after the start of the connection. The - expected average delay is set to 120ms, and the maximum delay is set - to 675ms. The test is successful if 100 trials are all successful. + expected average delay is set to 100ms, and the maximum delay is set + to 680ms. The test is successful if 100 trials are all successful. 3. Tests @@ -766,26 +766,33 @@ Internet-Draft C4 Tests December 2025 3.1. Loopback tests - To do. Write down. + Loopback tests were performed on Windows, downloading 10GB of data + over a loopback connection. 
They showed picoquic using C4 achieving + a data rate of 3Gbps, slightly more than the 2.9Gbps achieved when + using Cubic or the 2.6 Gbps achieved when using BBR. 3.2. Webex prototype deployments To do. Write down. -4. Security Considerations - This documentation of protocol testing does not have any particular - security considerations. - We did not include specific security oriented tests in this document. -Huitema, et al. Expires 6 June 2026 [Page 14] + +Huitema, et al. Expires 25 August 2026 [Page 14] -Internet-Draft C4 Tests December 2025 +Internet-Draft C4 Tests February 2026 +4. Security Considerations + + This documentation of protocol testing does not have any particular + security considerations. + + We did not include specific security oriented tests in this document. + 5. IANA Considerations This document has no IANA actions. @@ -800,9 +807,9 @@ Internet-Draft C4 Tests December 2025 [I-D.ietf-moq-transport] Nandakumar, S., Vasiliev, V., Swett, I., and A. Frindell, "Media over QUIC Transport", Work in Progress, Internet- - Draft, draft-ietf-moq-transport-15, 20 October 2025, + Draft, draft-ietf-moq-transport-16, 13 January 2026, . + transport-16>. [I-D.ietf-tsvwg-careful-resume] Kuhn, N., Stephan, E., Fairhurst, G., Secchi, R., and C. @@ -825,6 +832,16 @@ Internet-Draft C4 Tests December 2025 Repository , 2025, . + + + + + +Huitema, et al. Expires 25 August 2026 [Page 15] + +Internet-Draft C4 Tests February 2026 + + Acknowledgments TODO acknowledge. @@ -836,12 +853,6 @@ Authors' Addresses Email: huitema@huitema.net - -Huitema, et al. Expires 6 June 2026 [Page 15] - -Internet-Draft C4 Tests December 2025 - - Suhas Nandakumar Cisco Email: snandaku@cisco.com @@ -882,15 +893,4 @@ Internet-Draft C4 Tests December 2025 - - - - - - - - - - - -Huitema, et al. Expires 6 June 2026 [Page 16] +Huitema, et al. Expires 25 August 2026 [Page 16]