--- 1/draft-ietf-tsvwg-aqm-dualq-coupled-11.txt 2020-07-27 16:13:09.290904174 -0700
+++ 2/draft-ietf-tsvwg-aqm-dualq-coupled-12.txt 2020-07-27 16:13:09.398906934 -0700
@@ -1,22 +1,22 @@
Transport Area working group (tsvwg) K. De Schepper
Internet-Draft Nokia Bell Labs
Intended status: Experimental B. Briscoe, Ed.
-Expires: September 10, 2020 Independent
+Expires: January 28, 2021 Independent
G. White
CableLabs
- March 9, 2020
+ July 27, 2020
DualQ Coupled AQMs for Low Latency, Low Loss and Scalable Throughput
(L4S)
- draft-ietf-tsvwg-aqm-dualq-coupled-11
+ draft-ietf-tsvwg-aqm-dualq-coupled-12
Abstract
The Low Latency Low Loss Scalable Throughput (L4S) architecture
allows data flows over the public Internet to achieve consistent low
queuing latency, generally zero congestion loss and scaling of per-
flow throughput without the scaling problems of standard TCP Reno-
friendly congestion controls. To achieve this, L4S data flows have
to use one of the family of 'Scalable' congestion controls (TCP
Prague and Data Center TCP are examples) and a form of Explicit
@@ -51,21 +51,21 @@
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
- This Internet-Draft will expire on September 10, 2020.
+ This Internet-Draft will expire on January 28, 2021.
Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
@@ -1162,21 +1162,21 @@
implementations were tested.
7. References
7.1. Normative References
[I-D.ietf-tsvwg-ecn-l4s-id]
Schepper, K. and B. Briscoe, "Identifying Modified
Explicit Congestion Notification (ECN) Semantics for
Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s-
- id-09 (work in progress), February 2020.
+ id-10 (work in progress), March 2020.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
.
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP",
RFC 3168, DOI 10.17487/RFC3168, September 2001,
.
@@ -1253,27 +1253,27 @@
[I-D.briscoe-tsvwg-l4s-diffserv]
Briscoe, B., "Interactions between Low Latency, Low Loss,
Scalable Throughput (L4S) and Differentiated Services",
draft-briscoe-tsvwg-l4s-diffserv-02 (work in progress),
November 2018.
[I-D.ietf-tsvwg-l4s-arch]
Briscoe, B., Schepper, K., Bagnulo, M., and G. White, "Low
Latency, Low Loss, Scalable Throughput (L4S) Internet
- Service: Architecture", draft-ietf-tsvwg-l4s-arch-05 (work
- in progress), February 2020.
+ Service: Architecture", draft-ietf-tsvwg-l4s-arch-06 (work
+ in progress), March 2020.
[I-D.ietf-tsvwg-nqb]
White, G. and T. Fossati, "A Non-Queue-Building Per-Hop
Behavior (NQB PHB) for Differentiated Services", draft-
- ietf-tsvwg-nqb-00 (work in progress), November 2019.
+ ietf-tsvwg-nqb-01 (work in progress), March 2020.
[L4Sdemo16]
Bondarenko, O., De Schepper, K., Tsang, I., and B.
Briscoe, "Ultra-Low Delay for All: Live Experience, Live
Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016,
.
[LLD] White, G., Sundaresan, K., and B. Briscoe, "Low Latency
@@ -2075,27 +2075,27 @@
18: S_L = S_C - k' % L4S ramp scaling factor as a power of 2
19: range_L = 2^S_L % Range of L4S ramp
20: }
Figure 9: Example Header Pseudocode for DualQ Coupled Curvy RED AQM
1: cred_dequeue(lq, cq, pkt) { % Couples L4S & Classic queues
2: while ( lq.len() + cq.len() > 0 ) {
3: if ( scheduler() == lq ) {
4: lq.dequeue(pkt) % L4S scheduled
- 5a: p_CL = (cq.time() - minTh) / range_L
+ 5a: p_CL = (Q_C - minTh) / range_L
5b: if ( ( lq.time() > T )
5c: OR ( p_CL > maxrand(U) ) )
6: mark(pkt)
7: } else {
8: cq.dequeue(pkt) % Classic scheduled
- 9a: Q_C = gamma * qc.time() + (1-gamma) * Q_C % Classic Q EWMA
+ 9a: Q_C = gamma * cq.time() + (1-gamma) * Q_C % Classic Q EWMA
10a: sqrt_p_C = (Q_C - minTh) / range_C
10b: if ( sqrt_p_C > maxrand(2*U) ) {
11: if ( (ecn(pkt) == 0) { % ECN field = not-ECT
12: drop(pkt) % Squared drop, redo loop
13: continue % continue to the top of the while loop
14: }
15: mark(pkt)
16: }
17: }
18: return(pkt) % return the packet and stop here
@@ -2142,49 +2142,48 @@
Within each queue, the decision whether to forward, drop or mark is
taken as follows (to simplify the explanation, it is assumed that
U=1):
L4S: If the test at line 3 determines there is an L4S packet to
dequeue, the tests at lines 5b and 5c determine whether to mark
it. The first is a simple test of whether the L4S queue delay
(lq.time()) is greater than a step threshold T (Note 3). The
second test is similar to the random ECN marking in RED, but with
- the following differences: ii) marking depends on queuing time,
- not bytes, in order to scale for any link rate without being
- reconfigured; ii) marking of the L4S queue does not depend on
- itself, it depends on the queuing time of the _other_ (Classic)
- queue, where cq.time() is the queuing time of the packet at the
- head of the Classic queue (zero if empty); iii) marking depends on
- the instantaneous queuing time (of the other Classic queue), not a
- smoothed average; iv) the queue is compared with the maximum of U
- random numbers (but if U=1, this is the same as the single random
- number used in RED).
+ the following differences: i) marking depends on queuing time, not
+ bytes, in order to scale for any link rate without being
+ reconfigured; ii) marking of the L4S queue depends on a logical OR
+ of two tests; one against its own queuing time and one against the
+ queuing time of the _other_ (Classic) queue; iii) the tests are
+ against the instantaneous queuing time of the L4S queue, but a
+ smoothed average of the other (Classic) queue; iv) the queue is
+ compared with the maximum of U random numbers (but if U=1, this is
+ the same as the single random number used in RED).
Specifically, in line 5a the coupled marking probability p_CL is
- set to the excess of the Classic queueing delay qc.time() above
- the minimum queuing delay threshold (minTh) all divided by the L4S
- scaling parameter range_L. range_L represents the queuing delay
- (in seconds) added to minTh at which marking probability would hit
- 100%. Then in line 5c (if U=1) the result is compared with a
- uniformly distributed random number between 0 and 1, which ensures
- that marking probability will linearly increase with queueing
- time.
+ set to the amount by which the averaged Classic queueing delay Q_C
+ exceeds the minimum queuing delay threshold (minTh) all divided by
+ the L4S scaling parameter range_L. range_L represents the queuing
+ delay (in seconds) added to minTh at which marking probability
+ would hit 100%. Then in line 5c (if U=1) the result is compared
+ with a uniformly distributed random number between 0 and 1, which
+ ensures that, over range_L, marking probability will linearly
+ increase with queueing time.
Classic: If the scheduler at line 3 chooses to dequeue a Classic
packet and jumps to line 7, the test at line 10b determines
whether to drop or mark it. But before that, line 9a updates Q_C,
which is an exponentially weighted moving average (Note 4) of the
- queuing time in the Classic queue, where qc.time() is the current
- instantaneous queueing time of the Classic queue and gamma is the
- EWMA constant (default 1/32, see line 12 of the initialization
- function).
+ queuing time of the Classic queue, where cq.time() is the current
+ instantaneous queueing time of the packet at the head of the
+ Classic queue (zero if empty) and gamma is the EWMA constant
+ (default 1/32, see line 12 of the initialization function).
Lines 10a and 10b implement the Classic AQM. In line 10a the
averaged queuing time Q_C is divided by the Classic scaling
parameter range_C, in the same way that queuing time was scaled
for L4S marking. This scaled queuing time will be squared to
compute Classic drop probability so, before it is squared, it is
effectively the square root of the drop probability, hence it is
given the variable name sqrt_p_C. The squaring is done by
comparing it with the maximum out of two random numbers (assuming
U=1). Comparing it with the maximum out of two is the same as the
@@ -2201,23 +2200,24 @@
rather than queuing becomes the dominant cause of delay for short
flows, due to timeouts and tail losses.
Curvy RED constrains delay with a softened target that allows some
increase in delay as load increases. This is achieved by increasing
drop probability on a convex curve relative to queue growth (the
square curve in the Classic queue, if U=1). Like RED, the curve hugs
the zero axis while the queue is shallow. Then, as load increases,
it introduces a growing barrier to higher delay. But, unlike RED, it
requires only two parameters, not three. The disadvantage of Curvy
- RED is that it is not adapted to a wide range of RTTs. Curvy RED can
- be used as is when the RTT range to be supported is limited,
- otherwise an adaptation mechanism is required.
+ RED (compared to a PI controller for example) is that it is not
+ adapted to a wide range of RTTs. Curvy RED can be used as is when
+ the RTT range to be supported is limited, otherwise an adaptation
+ mechanism is required.
From our limited experiments with Curvy RED so far, recommended
values of these parameters are: S_C = -1; g_C = 5; T = 5 * MTU at the
link rate (about 1ms at 60Mb/s) for the range of base RTTs typical on
the public Internet. [CRED_Insights] explains why these parameters
are applicable whatever rate link this AQM implementation is deployed
on and how the parameters would need to be adjusted for a scenario
with a different range of RTTs (e.g. a data centre). The setting of
k depends on policy (see Section 2.5 and Appendix C.2 respectively
for its recommended setting and guidance on alternatives).
@@ -2278,25 +2278,25 @@
in parallel out of band. For instance, if U=1, the Classic queue
would require the maximum of two random numbers. So, instead of
calling maxrand(2*U) in-band, the maximum of every pair of values
from a pseudorandom number generator could be generated out-of-band,
and held in a buffer ready for the Classic queue to consume.
1: cred_dequeue(lq, cq, pkt) { % Couples L4S & Classic queues
2: while ( lq.len() + cq.len() > 0 ) {
3: if ( scheduler() == lq ) {
4: lq.dequeue(pkt) % L4S scheduled
- 5: if ((lq.time() > T) OR (cq.ns() >> (S_L-2) > maxrand(U)))
+ 5: if ((lq.time() > T) OR (Q_C >> (S_L-2) > maxrand(U)))
6: mark(pkt)
7: } else {
8: cq.dequeue(pkt) % Classic scheduled
- 9: Q_C += (cq.ns() - Q_C) >> g_C % Classic Q EWMA
+ 9: Q_C += (qc.ns() - Q_C) >> g_C % Classic Q EWMA
10: if ( (Q_C >> (S_C-2) ) > maxrand(2*U) ) {
11: if ( (ecn(pkt) == 0) { % ECN field = not-ECT
12: drop(pkt) % Squared drop, redo loop
13: continue % continue to the top of the while loop
14: }
15: mark(pkt)
16: }
17: }
18: return(pkt) % return the packet and stop here
19: }
@@ -2310,25 +2310,25 @@
that division can be implemented as a right bit-shift (>>) in lines 5
and 10 of the integer variant of the pseudocode (Figure 11).
For the integer variant of the pseudocode, an integer version of the
rand() function used at line 25 of the maxrand(function) in Figure 10
would be arranged to return an integer in the range 0 <= maxrand() <
2^32 (not shown). This would scale up all the floating point
probabilities in the range [0,1] by 2^32.
Queuing delays are also scaled up by 2^32, but in two stages: i) In
- lines 5 and 10 queuing times cq.ns() and pkt.ns() are returned in
- integer nanoseconds, making the values about 2^30 times larger than
- when the units were seconds, ii) then in lines 3 and 9 an adjustment
- of -2 to the right bit-shift multiplies the result by 2^2, to
- complete the scaling by 2^32.
+ line 9 queuing time qc.ns() is returned in integer nanoseconds,
+ making the value about 2^30 times larger than when the units were
+ seconds, ii) then in lines 5 and 10 an adjustment of -2 to the right
+ bit-shift multiplies the result by 2^2, to complete the scaling by
+ 2^32.
In line 8 of the initialization function, the EWMA constant gamma is
represented as an integer power of 2, g_C, so that in line 9 of the
integer code the division needed to weight the moving average can be
implemented by a right bit-shift (>> g_C).
Appendix C. Choice of Coupling Factor, k
C.1. RTT-Dependence