--- 1/draft-ietf-tsvwg-aqm-dualq-coupled-14.txt 2021-05-23 15:13:34.962591998 -0700
+++ 2/draft-ietf-tsvwg-aqm-dualq-coupled-15.txt 2021-05-23 15:13:35.070594700 -0700
@@ -1,22 +1,22 @@
 Transport Area working group (tsvwg) K. De Schepper
 Internet-Draft Nokia Bell Labs
 Intended status: Experimental B. Briscoe, Ed.
-Expires: September 11, 2021 Independent
+Expires: November 22, 2021 Independent
 G. White
 CableLabs
- March 10, 2021
+ May 21, 2021
 DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput (L4S)
- draft-ietf-tsvwg-aqm-dualq-coupled-14
+ draft-ietf-tsvwg-aqm-dualq-coupled-15
 Abstract
 The Low Latency Low Loss Scalable Throughput (L4S) architecture allows data flows over the public Internet to achieve consistent low queuing latency, generally zero congestion loss and scaling of per-flow throughput without the scaling problems of standard TCP Reno-friendly congestion controls. To achieve this, L4S data flows have to use one of the family of 'Scalable' congestion controls (TCP Prague and Data Center TCP are examples) and a form of Explicit
@@ -51,21 +51,21 @@
 Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
 Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
- This Internet-Draft will expire on September 11, 2021.
+ This Internet-Draft will expire on November 22, 2021.
 Copyright Notice
 Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved.
 This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents
@@ -116,39 +116,39 @@
 B.2. Efficient Implementation of Curvy RED . . . . . . . . . . 50
 Appendix C. Choice of Coupling Factor, k . . . . . . . . . . . . 52
 C.1. RTT-Dependence . . . . . . . . . . . . . . . . . . . . . 52
 C.2. Guidance on Controlling Throughput Equivalence . . . . . 53
 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 54
 1. Introduction
 This document specifies a framework for DualQ Coupled AQMs, which is the network part of the L4S architecture [I-D.ietf-tsvwg-l4s-arch].
- L4S enables both ultra-low queuing latency (sub-millisecond on
+ L4S enables both very low queuing latency (sub-millisecond on
 average) and high throughput at the same time, for ad hoc numbers of capacity-seeking applications all sharing the same capacity.
 1.1. Outline of the Problem
 Latency is becoming the critical performance factor for many (most?) applications on the public Internet, e.g. interactive Web, Web services, voice, conversational video, interactive video, interactive remote presence, instant messaging, online gaming, remote desktop, cloud-based applications, and video-assisted remote control of machinery and industrial processes. In the developed world, further increases in access network bit-rate offer diminishing returns, whereas latency is still a multi-faceted problem. In the last decade or so, much has been done to reduce propagation time by placing caches or servers closer to users. However, queuing remains a major intermittent component of latency.
- Traditionally ultra-low latency has only been available for a few
+ Traditionally very low latency has only been available for a few
 selected low rate applications, that confine their sending rate within a specially carved-off portion of capacity, which is prioritized over other traffic, e.g. Diffserv EF [RFC3246]. Up to now it has not been possible to allow any number of low latency, high throughput applications to seek to fully utilize available capacity, because the capacity-seeking process itself causes too much queuing delay.
 To reduce this queuing delay caused by the capacity seeking process, changes either to the network alone or to end-systems alone are in
@@ -392,21 +392,21 @@
 ECT stands for ECN-Capable Transport and CE stands for Congestion Experienced.
 1.4. Features
 The AQM couples marking and/or dropping from the Classic queue to the L4S queue in such a way that a flow will get roughly the same throughput whichever it uses. Therefore both queues can feed into the full capacity of a link and no rates need to be configured for the queues. The L4S queue enables Scalable congestion controls like
- DCTCP or TCP Prague to give ultra-low and predictably low latency,
+ DCTCP or TCP Prague to give very low and predictably low latency,
 without compromising the performance of competing 'Classic' Internet traffic.
 Thousands of tests have been conducted in a typical fixed residential broadband setting. Experiments used a range of base round trip delays up to 100ms and link rates up to 200 Mb/s between the data centre and home network, with varying amounts of background traffic in both queues. For every L4S packet, the AQM kept the average queuing delay below 1ms (or 2 packets where serialization delay exceeded 1ms on slower links), with 99th percentile no worse than
@@ -421,22 +421,22 @@
 on the fly in 'the cloud' from a football match.
 Another user wearing VR goggles was remotely receiving a feed from a 360-degree camera in a racing car, again with the sub-window in their field of vision generated on the fly in 'the cloud' dependent on their head movements. Even though other users were also downloading large amounts of L4S and Classic data, playing a gaming benchmark and watching videos over the same 40Mb/s downstream broadband link, latency was so low that the football picture appeared to stick to the user's finger on the touch pad and the experience fed from the remote camera did not noticeably lag head movements. All the L4S data (even
- including the downloads) achieved the same ultra-low latency. With
- an alternative AQM, the video noticeably lagged behind the finger
+ including the downloads) achieved the same very low latency. With an
+ alternative AQM, the video noticeably lagged behind the finger
 gestures and head movements.
 Unlike Diffserv Expedited Forwarding, the L4S queue does not have to be limited to a small proportion of the link capacity in order to achieve low delay. The L4S queue can be filled with a heavy load of capacity-seeking flows (TCP Prague etc.) and still achieve low delay. The L4S queue does not rely on the presence of other traffic in the Classic queue that can be 'overtaken'. It gives low latency to L4S traffic whether or not there is Classic traffic, and the latency of Classic traffic does not suffer when a proportion of the traffic is
@@ -542,23 +542,22 @@
 2.3. Traffic Classification
 Both the Coupled AQM and DualQ mechanisms need an identifier to distinguish L4S (L) and Classic (C) packets. Then the coupling algorithm can achieve coexistence without having to inspect flow identifiers, because it can apply the appropriate marking or dropping probability to all flows of each type.
 A separate specification [I-D.ietf-tsvwg-ecn-l4s-id] requires the network to treat the ECT(1) and CE codepoints of the ECN field as this
- identifier, having assessed various alternatives. An additional
- process document has proved necessary to make the ECT(1) codepoint
- available for experimentation [RFC8311].
+ identifier. An additional process document has proved necessary to
+ make the ECT(1) codepoint available for experimentation [RFC8311].
 For policy reasons, an operator might choose to steer certain packets (e.g. from certain flows or with certain addresses) out of the L queue, even though they identify themselves as L4S by their ECN codepoints. In such cases, [I-D.ietf-tsvwg-ecn-l4s-id] says that the device "MUST NOT alter the end-to-end L4S ECN identifier", so that it is preserved end-to-end. The aim is that each operator can choose how it treats L4S traffic locally, but an individual operator does not alter the identification of L4S packets, which would prevent other operators downstream from making their own choices on how to
@@ -1173,24 +1172,24 @@
 Ing Jyh (Inton) Tsang of Nokia, Belgium built the End-to-End Data Centre to the Home broadband testbed on which DualQ Coupled AQM implementations were tested.
 7. References
 7.1. Normative References
 [I-D.ietf-tsvwg-ecn-l4s-id]
- Schepper, K. and B. Briscoe, "Identifying Modified
- Explicit Congestion Notification (ECN) Semantics for
- Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s-
- id-12 (work in progress), November 2020.
+ Schepper, K. D. and B. Briscoe, "Explicit Congestion
+ Notification (ECN) Protocol for Ultra-Low Queuing Delay
+ (L4S)", draft-ietf-tsvwg-ecn-l4s-id-14 (work in progress),
+ March 2021.
 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, .
 [RFC3168] Ramakrishnan, K., Floyd, S., and D.
 Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September 2001, .
@@ -1265,34 +1264,34 @@
 Low Latency", draft-briscoe-docsis-q-protection-00 (work in progress), July 2019.
 [I-D.briscoe-tsvwg-l4s-diffserv]
 Briscoe, B., "Interactions between Low Latency, Low Loss, Scalable Throughput (L4S) and Differentiated Services", draft-briscoe-tsvwg-l4s-diffserv-02 (work in progress), November 2018.
 [I-D.cardwell-iccrg-bbr-congestion-control]
- Cardwell, N., Cheng, Y., Yeganeh, S., and V. Jacobson,
+ Cardwell, N., Cheng, Y., Yeganeh, S. H., and V. Jacobson,
 "BBR Congestion Control", draft-cardwell-iccrg-bbr-congestion-control-00 (work in progress), July 2017.
 [I-D.ietf-tsvwg-l4s-arch]
- Briscoe, B., Schepper, K., Bagnulo, M., and G. White, "Low
- Latency, Low Loss, Scalable Throughput (L4S) Internet
+ Briscoe, B., Schepper, K. D., Bagnulo, M., and G. White,
+ "Low Latency, Low Loss, Scalable Throughput (L4S) Internet
 Service: Architecture", draft-ietf-tsvwg-l4s-arch-08 (work in progress), November 2020.
 [I-D.ietf-tsvwg-nqb]
 White, G. and T. Fossati, "A Non-Queue-Building Per-Hop Behavior (NQB PHB) for Differentiated Services", draft-
- ietf-tsvwg-nqb-03 (work in progress), November 2020.
+ ietf-tsvwg-nqb-05 (work in progress), March 2021.
 [L4Sdemo16]
 Bondarenko, O., De Schepper, K., Tsang, I., and B. Briscoe, "Ultra-Low Delay for All: Live Experience, Live Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, .
 [LLD] White, G., Sundaresan, K., and B. Briscoe, "Low Latency
@@ -2352,21 +2351,21 @@
 Where Classic flows compete for the same capacity, their relative flow rates depend not only on the congestion probability, but also on their end-to-end RTT (= base RTT + queue delay). The rates of competing Reno [RFC5681] flows are roughly inversely proportional to their RTTs. Cubic exhibits similar RTT-dependence when in Reno-compatibility mode, but is less RTT-dependent otherwise.
 Until the early experiments with the DualQ Coupled AQM, the importance of the reasonably large Classic queue in mitigating RTT-
- dependence had not been appreciated. Appendix A.1.5 of
+ dependence had not been appreciated. Appendix A.1.6 of
 [I-D.ietf-tsvwg-ecn-l4s-id] uses numerical examples to explain why bloated buffers had concealed the RTT-dependence of Classic congestion controls before that time. Then it explains why, the more that queuing delays have reduced, the more that RTT-dependence has surfaced as a potential starvation problem for long RTT flows.
 Given that congestion control on end-systems is voluntary, there is no reason why it has to be voluntarily RTT-dependent. Therefore [I-D.ietf-tsvwg-ecn-l4s-id] requires L4S congestion controls to be significantly less RTT-dependent than the standard Reno congestion