draft-ietf-tsvwg-aqm-dualq-coupled-13.txt | draft-ietf-tsvwg-aqm-dualq-coupled-14.txt | |||
---|---|---|---|---|
Transport Area working group (tsvwg) K. De Schepper | Transport Area working group (tsvwg) K. De Schepper | |||
Internet-Draft Nokia Bell Labs | Internet-Draft Nokia Bell Labs | |||
Intended status: Experimental B. Briscoe, Ed. | Intended status: Experimental B. Briscoe, Ed. | |||
Expires: May 19, 2021 Independent | Expires: September 11, 2021 Independent | |||
G. White | G. White | |||
CableLabs | CableLabs | |||
November 15, 2020 | March 10, 2021 | |||
DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput | DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput | |||
(L4S) | (L4S) | |||
draft-ietf-tsvwg-aqm-dualq-coupled-13 | draft-ietf-tsvwg-aqm-dualq-coupled-14 | |||
Abstract | Abstract | |||
The Low Latency Low Loss Scalable Throughput (L4S) architecture | The Low Latency Low Loss Scalable Throughput (L4S) architecture | |||
allows data flows over the public Internet to achieve consistent low | allows data flows over the public Internet to achieve consistent low | |||
queuing latency, generally zero congestion loss and scaling of per- | queuing latency, generally zero congestion loss and scaling of per- | |||
flow throughput without the scaling problems of standard TCP Reno- | flow throughput without the scaling problems of standard TCP Reno- | |||
friendly congestion controls. To achieve this, L4S data flows have | friendly congestion controls. To achieve this, L4S data flows have | |||
to use one of the family of 'Scalable' congestion controls (TCP | to use one of the family of 'Scalable' congestion controls (TCP | |||
Prague and Data Center TCP are examples) and a form of Explicit | Prague and Data Center TCP are examples) and a form of Explicit | |||
skipping to change at page 2, line 20 ¶ | skipping to change at page 2, line 20 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on May 19, 2021. | This Internet-Draft will expire on September 11, 2021. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2021 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
skipping to change at page 10, line 39 ¶ | skipping to change at page 10, line 39 ¶ | |||
Classic traffic, | Classic traffic, | |||
The L4S queue has latency priority, but the coupling from the Classic | The L4S queue has latency priority, but the coupling from the Classic | |||
to the L4S AQM (explained below) ensures that it does not have | to the L4S AQM (explained below) ensures that it does not have | |||
bandwidth priority over the Classic queue. | bandwidth priority over the Classic queue. | |||
2. DualQ Coupled AQM | 2. DualQ Coupled AQM | |||
There are two main aspects to the approach: | There are two main aspects to the approach: | |||
o the Coupled AQM that addresses throughput equivalence between | o The Coupled AQM that addresses throughput equivalence between | |||
Classic (e.g. Reno, Cubic) flows and L4S flows (that satisfy the | Classic (e.g. Reno, Cubic) flows and L4S flows (that satisfy the | |||
Prague L4S requirements). | Prague L4S requirements). | |||
o the Dual Queue structure that provides latency separation for L4S | o The Dual Queue structure that provides latency separation for L4S | |||
flows to isolate them from the typically large Classic queue. | flows to isolate them from the typically large Classic queue. | |||
2.1. Coupled AQM | 2.1. Coupled AQM | |||
In the 1990s, the `TCP formula' was derived for the relationship | In the 1990s, the `TCP formula' was derived for the relationship | |||
between the steady-state congestion window, cwnd, and the drop | between the steady-state congestion window, cwnd, and the drop | |||
probability, p of standard Reno congestion control [RFC5681] . To a | probability, p of standard Reno congestion control [RFC5681] . To a | |||
first order approximation, the steady-state cwnd of Reno is inversely | first order approximation, the steady-state cwnd of Reno is inversely | |||
proportional to the square root of p. | proportional to the square root of p. | |||
skipping to change at page 15, line 43 ¶ | skipping to change at page 15, line 43 ¶ | |||
Indeed, this Base AQM with just the squared output and no L4S queue | Indeed, this Base AQM with just the squared output and no L4S queue | |||
can be used as a drop-in replacement for PIE [RFC8033], in which case | can be used as a drop-in replacement for PIE [RFC8033], in which case | |||
it is just called PI2 [PI2]. PI2 is a principled simplification of | it is just called PI2 [PI2]. PI2 is a principled simplification of | |||
PIE that is both more responsive and more stable in the face of | PIE that is both more responsive and more stable in the face of | |||
dynamically varying load. | dynamically varying load. | |||
Curvy RED is derived from RED [RFC2309], but its configuration | Curvy RED is derived from RED [RFC2309], but its configuration | |||
parameters are insensitive to link rate and it requires less | parameters are insensitive to link rate and it requires less | |||
operations per packet. However, DualPI2 is more responsive and | operations per packet. However, DualPI2 is more responsive and | |||
stable over a wider range of RTTs than Curvy RED. As a consequence, | stable over a wider range of RTTs than Curvy RED. As a consequence, | |||
DualPI2 has attracted more development and evaluation attention than | at the time of writing, DualPI2 has attracted more development and | |||
Curvy RED, leaving the Curvy RED design incomplete and not so fully | evaluation attention than Curvy RED, leaving the Curvy RED design | |||
evaluated. | incomplete and not so fully evaluated. | |||
Both AQMs regulate their queue in units of time rather than bytes. | Both AQMs regulate their queue in units of time rather than bytes. | |||
As already explained, this ensures configuration can be invariant for | As already explained, this ensures configuration can be invariant for | |||
different drain rates. With AQMs in a dualQ structure this is | different drain rates. With AQMs in a dualQ structure this is | |||
particularly important because the drain rate of each queue can vary | particularly important because the drain rate of each queue can vary | |||
rapidly as flows for the two queues arrive and depart, even if the | rapidly as flows for the two queues arrive and depart, even if the | |||
combined link rate is constant. | combined link rate is constant. | |||
It would be possible to control the queues with other alternative | It would be possible to control the queues with other alternative | |||
AQMs, as long as the normative requirements (those expressed in | AQMs, as long as the normative requirements (those expressed in | |||
capitals) in Section 2.5 are observed. | capitals) in Section 2.5 are observed. | |||
2.5. Normative Requirements for a DualQ Coupled AQM | 2.5. Normative Requirements for a DualQ Coupled AQM | |||
The following requirements are intended to capture only the essential | The following requirements are intended to capture only the essential | |||
aspects of a DualQ Coupled AQM. They are intended to be independent | aspects of a DualQ Coupled AQM. They are intended to be independent | |||
of the particular AQMs used for each queue. | of the particular AQMs used for each queue. | |||
2.5.1. Functional Requirements | 2.5.1. Functional Requirements | |||
A Dual Queue Coupled AQM implementation MUST comply with the | ||||
prerequisite L4S behaviours for any L4S network node (not just a | ||||
DualQ) as specified in section 5 of [I-D.ietf-tsvwg-ecn-l4s-id]. | ||||
These primarily concern classification and remarking as briefly | ||||
summarized in Section 2.3 earlier. But there is also a subsection | ||||
(5.5) giving guidance on reducing the burstiness of the link | ||||
technology underlying any L4S AQM. | ||||
A Dual Queue Coupled AQM implementation MUST utilize two queues, each | A Dual Queue Coupled AQM implementation MUST utilize two queues, each | |||
with an AQM algorithm. The two queues can be part of a larger | with an AQM algorithm. The two queues can be part of a larger | |||
queuing hierarchy [I-D.briscoe-tsvwg-l4s-diffserv]. | queuing hierarchy [I-D.briscoe-tsvwg-l4s-diffserv]. | |||
The AQM algorithm for the low latency (L) queue MUST be able to apply | The AQM algorithm for the low latency (L) queue MUST be able to apply | |||
ECN marking to ECN-capable packets. | ECN marking to ECN-capable packets. | |||
The scheduler draining the two queues MUST give L4S packets priority | The scheduler draining the two queues MUST give L4S packets priority | |||
over Classic, although priority MUST be bounded in order not to | over Classic, although priority MUST be bounded in order not to | |||
starve Classic traffic. The scheduler SHOULD be work-conserving. | starve Classic traffic. The scheduler SHOULD be work-conserving. | |||
skipping to change at page 25, line 42 ¶ | skipping to change at page 25, line 46 ¶ | |||
implementations were tested. | implementations were tested. | |||
7. References | 7. References | |||
7.1. Normative References | 7.1. Normative References | |||
[I-D.ietf-tsvwg-ecn-l4s-id] | [I-D.ietf-tsvwg-ecn-l4s-id] | |||
Schepper, K. and B. Briscoe, "Identifying Modified | Schepper, K. and B. Briscoe, "Identifying Modified | |||
Explicit Congestion Notification (ECN) Semantics for | Explicit Congestion Notification (ECN) Semantics for | |||
Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s- | Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s- | |||
id-11 (work in progress), November 2020. | id-12 (work in progress), November 2020. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
of Explicit Congestion Notification (ECN) to IP", | of Explicit Congestion Notification (ECN) to IP", | |||
RFC 3168, DOI 10.17487/RFC3168, September 2001, | RFC 3168, DOI 10.17487/RFC3168, September 2001, | |||
<https://www.rfc-editor.org/info/rfc3168>. | <https://www.rfc-editor.org/info/rfc3168>. | |||
skipping to change at page 27, line 48 ¶ | skipping to change at page 27, line 48 ¶ | |||
November 2018. | November 2018. | |||
[I-D.cardwell-iccrg-bbr-congestion-control] | [I-D.cardwell-iccrg-bbr-congestion-control] | |||
Cardwell, N., Cheng, Y., Yeganeh, S., and V. Jacobson, | Cardwell, N., Cheng, Y., Yeganeh, S., and V. Jacobson, | |||
"BBR Congestion Control", draft-cardwell-iccrg-bbr- | "BBR Congestion Control", draft-cardwell-iccrg-bbr- | |||
congestion-control-00 (work in progress), July 2017. | congestion-control-00 (work in progress), July 2017. | |||
[I-D.ietf-tsvwg-l4s-arch] | [I-D.ietf-tsvwg-l4s-arch] | |||
Briscoe, B., Schepper, K., Bagnulo, M., and G. White, "Low | Briscoe, B., Schepper, K., Bagnulo, M., and G. White, "Low | |||
Latency, Low Loss, Scalable Throughput (L4S) Internet | Latency, Low Loss, Scalable Throughput (L4S) Internet | |||
Service: Architecture", draft-ietf-tsvwg-l4s-arch-07 (work | Service: Architecture", draft-ietf-tsvwg-l4s-arch-08 (work | |||
in progress), October 2020. | in progress), November 2020. | |||
[I-D.ietf-tsvwg-nqb] | [I-D.ietf-tsvwg-nqb] | |||
White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | |||
Behavior (NQB PHB) for Differentiated Services", draft- | Behavior (NQB PHB) for Differentiated Services", draft- | |||
ietf-tsvwg-nqb-03 (work in progress), November 2020. | ietf-tsvwg-nqb-03 (work in progress), November 2020. | |||
[L4Sdemo16] | [L4Sdemo16] | |||
Bondarenko, O., De Schepper, K., Tsang, I., and B. | Bondarenko, O., De Schepper, K., Tsang, I., and B. | |||
Briscoe, "Ultra-Low Delay for All: Live Experience, Live | Briscoe, "Ultra-Low Delay for All: Live Experience, Live | |||
Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, | Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, | |||
skipping to change at page 31, line 27 ¶ | skipping to change at page 31, line 27 ¶ | |||
[DualPI2Linux]. The specification of the DualQ Coupled AQM for | [DualPI2Linux]. The specification of the DualQ Coupled AQM for | |||
DOCSIS cable modems and CMTSs is available in [DOCSIS3.1] and | DOCSIS cable modems and CMTSs is available in [DOCSIS3.1] and | |||
explained in [LLD]. | explained in [LLD]. | |||
A.1. Pass #1: Core Concepts | A.1. Pass #1: Core Concepts | |||
The pseudocode manipulates three main structures of variables: the | The pseudocode manipulates three main structures of variables: the | |||
packet (pkt), the L4S queue (lq) and the Classic queue (cq). The | packet (pkt), the L4S queue (lq) and the Classic queue (cq). The | |||
pseudocode consists of the following six functions: | pseudocode consists of the following six functions: | |||
o the initialization function dualpi2_params_init(...) (Figure 2) | o The initialization function dualpi2_params_init(...) (Figure 2) | |||
that sets parameter defaults (the API for setting non-default | that sets parameter defaults (the API for setting non-default | |||
values is omitted for brevity) | values is omitted for brevity) | |||
o the enqueue function dualpi2_enqueue(lq, cq, pkt) (Figure 3) | o The enqueue function dualpi2_enqueue(lq, cq, pkt) (Figure 3) | |||
o the dequeue function dualpi2_dequeue(lq, cq, pkt) (Figure 4) | o The dequeue function dualpi2_dequeue(lq, cq, pkt) (Figure 4) | |||
o recur(q, likelihood) for de-randomized ECN marking (shown at the | o The recurrence function recur(q, likelihood) for de-randomized ECN | |||
end of Figure 4). | marking (shown at the end of Figure 4). | |||
o the L4S AQM function laqm(qdelay) (Figure 5) used to calculate the | o The L4S AQM function laqm(qdelay) (Figure 5) used to calculate the | |||
ECN-marking probability for the L4S queue | ECN-marking probability for the L4S queue | |||
o the base AQM function that implements the PI algorithm | o The base AQM function that implements the PI algorithm | |||
dualpi2_update(lq, cq) (Figure 6) used to regularly update the | dualpi2_update(lq, cq) (Figure 6) used to regularly update the | |||
base probability (p'), which is squared for the Classic AQM as | base probability (p'), which is squared for the Classic AQM as | |||
well as being coupled across to the L4S queue. | well as being coupled across to the L4S queue. | |||
It also uses the following functions that are not shown in full here: | It also uses the following functions that are not shown in full here: | |||
o scheduler(), which selects between the head packets of the two | o scheduler(), which selects between the head packets of the two | |||
queues; the choice of scheduler technology is discussed later; | queues; the choice of scheduler technology is discussed later; | |||
o cq.len() or lq.len() returns the current length (aka. backlog) of | o cq.len() or lq.len() returns the current length (aka. backlog) of | |||
skipping to change at page 35, line 42 ¶ | skipping to change at page 35, line 42 ¶ | |||
signals, but it still desynchronizes the saw-teeth of the flows. | signals, but it still desynchronizes the saw-teeth of the flows. | |||
Line 6 calculates p_L as the maximum of the coupled L4S | Line 6 calculates p_L as the maximum of the coupled L4S | |||
probability p_CL and the probability from the native L4S AQM p'_L. | probability p_CL and the probability from the native L4S AQM p'_L. | |||
This implements the max() function shown in Figure 1 to couple the | This implements the max() function shown in Figure 1 to couple the | |||
outputs of the two AQMs together. Of the two probabilities input | outputs of the two AQMs together. Of the two probabilities input | |||
to p_L in line 6: | to p_L in line 6: | |||
* p'_L is calculated per packet in line 5 by the laqm() function | * p'_L is calculated per packet in line 5 by the laqm() function | |||
(see Figure 5), | (see Figure 5), | |||
* whereas p_CL is maintained by the dualpi2_update() function | * Whereas p_CL is maintained by the dualpi2_update() function | |||
which runs every Tupdate (Tupdate is set in line 13 of | which runs every Tupdate (Tupdate is set in line 13 of | |||
Figure 2. It defaults to 16 ms in the reference Linux | Figure 2. It defaults to 16 ms in the reference Linux | |||
implementation because it has to be rounded to a multiple of 4 | implementation because it has to be rounded to a multiple of 4 | |||
ms). | ms). | |||
o If a Classic packet is scheduled, lines 10 to 17 drop or mark the | o If a Classic packet is scheduled, lines 10 to 17 drop or mark the | |||
packet with probability p_C. | packet with probability p_C. | |||
The Native L4S AQM algorithm (Figure 5) is a ramp function, similar | The Native L4S AQM algorithm (Figure 5) is a ramp function, similar | |||
to the RED algorithm, but simplified as follows: | to the RED algorithm, but simplified as follows: | |||
skipping to change at page 44, line 31 ¶ | skipping to change at page 44, line 31 ¶ | |||
pseudocode (Figure 11). To aid comparison, the line numbers are kept | pseudocode (Figure 11). To aid comparison, the line numbers are kept | |||
in step between the two by using letter suffixes where the longer | in step between the two by using letter suffixes where the longer | |||
code needs extra lines. | code needs extra lines. | |||
B.1. Curvy RED in Pseudocode | B.1. Curvy RED in Pseudocode | |||
The pseudocode manipulates three main structures of variables: the | The pseudocode manipulates three main structures of variables: the | |||
packet (pkt), the L4S queue (lq) and the Classic queue (cq) and | packet (pkt), the L4S queue (lq) and the Classic queue (cq) and | |||
consists of the following five functions: | consists of the following five functions: | |||
o the initialization function cred_params_init(...) (Figure 2) that | o The initialization function cred_params_init(...) (Figure 2) that | |||
sets parameter defaults (the API for setting non-default values is | sets parameter defaults (the API for setting non-default values is | |||
omitted for brevity); | omitted for brevity); | |||
o the dequeue function cred_dequeue(lq, cq, pkt) (Figure 4); | o The dequeue function cred_dequeue(lq, cq, pkt) (Figure 4); | |||
o the scheduling function scheduler(), which selects between the | o The scheduling function scheduler(), which selects between the | |||
head packets of the two queues. | head packets of the two queues. | |||
It also uses the following functions that are either shown elsewhere, | It also uses the following functions that are either shown elsewhere, | |||
or not shown in full here: | or not shown in full here: | |||
o the enqueue function, which is identical to that used for DualPI2, | o The enqueue function, which is identical to that used for DualPI2, | |||
dualpi2_enqueue(lq, cq, pkt) in Figure 3; | dualpi2_enqueue(lq, cq, pkt) in Figure 3; | |||
o mark(pkt) and drop(pkt) for ECN-marking and dropping a packet; | o mark(pkt) and drop(pkt) for ECN-marking and dropping a packet; | |||
o cq.len() or lq.len() returns the current length (aka. backlog) of | o cq.len() or lq.len() returns the current length (aka. backlog) of | |||
the relevant queue in bytes; | the relevant queue in bytes; | |||
o cq.time() or lq.time() returns the current queuing delay | o cq.time() or lq.time() returns the current queuing delay | |||
(aka. sojourn time or service time) of the relevant queue in units | (aka. sojourn time or service time) of the relevant queue in units | |||
of time (see Note a in Appendix A.1). | of time (see Note a in Appendix A.1). | |||
End of changes. 22 change blocks. | ||||
25 lines changed or deleted | 33 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |