draft-ietf-tcpm-rto-consider-14.txt   draft-ietf-tcpm-rto-consider-15.txt 
Internet Engineering Task Force M. Allman Internet Engineering Task Force M. Allman
INTERNET-DRAFT ICSI INTERNET-DRAFT ICSI
File: draft-ietf-tcpm-rto-consider-14.txt May 13, 2020 File: draft-ietf-tcpm-rto-consider-15.txt June 8, 2020
Intended Status: Best Current Practice Intended Status: Best Current Practice
Expires: November 13, 2020 Expires: December 8, 2020
Requirements for Time-Based Loss Detection Requirements for Time-Based Loss Detection
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. Internet-Drafts are working provisions of BCP 78 and BCP 79. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts. working documents as Internet-Drafts.
skipping to change at page 1, line 30 skipping to change at page 1, line 30
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress." reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on November 13, 2020. This Internet-Draft will expire on December 8, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 18 skipping to change at page 2, line 18
Terminology Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, RFC 2119 document are to be interpreted as described in BCP 14, RFC 2119
[RFC2119]. [RFC2119].
1 Introduction 1 Introduction
Loss detection is a crucial activity for many protocols and As a network of networks, the Internet consists of a large variety
applications and is generally undertaken for two major reasons: of links and systems that combine to form "best effort" network
paths. The path that traffic takes through the network is generally
unknown a priori. Further, the path and the path properties that
traffic experiences dynamically vary over time. As two examples,
consider delay and loss. In the general case, delay across a
network path depends not only on distance, but also a number of
variable components such as the route and the level of buffering in
intermediate devices. Since our wide-area network paths are best
effort, packet loss is a regular occurrence. While there are
numerous causes of packet loss, the conservative general approach
that has historically served us well---and we use in this
document---is to treat loss as an implicit indication of network
congestion.
Given that packet loss is routine in best effort networks, loss
detection is a crucial activity for many protocols and applications
and is generally undertaken for two major reasons:
(1) Ensuring reliable data delivery. (1) Ensuring reliable data delivery.
This requires a data sender to develop an understanding of This requires a data sender to develop an understanding of
which transmitted packets have not arrived at the receiver. which transmitted packets have not arrived at the receiver.
This knowledge allows the sender to retransmit missing data. This knowledge allows the sender to retransmit missing data.
(2) Congestion control. (2) Congestion control.
Packet loss is often taken as an indication that the sender As we mention above, packet loss is often taken as an
is transmitting too fast and is overwhelming some portion of implicit indication that the sender is transmitting too fast
the network path. Data senders can therefore use loss to and is overwhelming some portion of the network path. Data
trigger transmission rate reductions. senders can therefore use loss to trigger transmission rate
reductions.
Various mechanisms are used to detect losses in a packet stream. Various mechanisms are used to detect losses in a packet stream.
Often we use continuous or periodic acknowledgments from the Often we use continuous or periodic acknowledgments from the
recipient to inform the sender's notion of which pieces of data are recipient to inform the sender's notion of which pieces of data are
missing. However, despite our best intentions and most robust missing. However, despite our best intentions and most robust
mechanisms we cannot place ultimate faith in receiving such mechanisms we cannot place ultimate faith in receiving such
acknowledgments, but can only truly depend on the passage of time. acknowledgments, but can only truly depend on the passage of time.
Therefore, our ultimate backstop to ensuring that we detect all loss Therefore, our ultimate backstop to ensuring that we detect all loss
is a timeout. That is, the sender sets some expectation for how is a timeout. That is, the sender sets some expectation for how
long to wait for confirmation of delivery for a given piece of data. long to wait for confirmation of delivery for a given piece of data.
skipping to change at page 3, line 4 skipping to change at page 3, line 21
wish to simultaneously: wish to simultaneously:
- wait long enough to ensure the detection of loss is correct, and - wait long enough to ensure the detection of loss is correct, and
- minimize the amount of delay we impose on applications (before - minimize the amount of delay we impose on applications (before
repairing loss) and the network (before we reduce the repairing loss) and the network (before we reduce the
congestion). congestion).
Serving both of these goals is difficult as they pull in opposite Serving both of these goals is difficult as they pull in opposite
directions [AP99]. By not waiting long enough to accurately directions [AP99]. By not waiting long enough to accurately
determine a packet has been lost we risk sending unnecessary determine a packet has been lost we may provide a needed
retransmission in a timely manner, but risk sending unnecessary
("spurious") retransmissions and needlessly lowering the ("spurious") retransmissions and needlessly lowering the
transmission rate. By waiting long enough that we are unambiguously transmission rate. By waiting long enough that we are unambiguously
certain a packet has been lost we cannot repair losses in a timely certain a packet has been lost we cannot repair losses in a timely
manner and we risk prolonging network congestion. manner and we risk prolonging network congestion.
Many protocols and applications use their own time-based loss Many protocols and applications use their own time-based loss
detection mechanisms (e.g., TCP [RFC6298], SCTP [RFC4960], SIP detection mechanisms (e.g., TCP [RFC6298], SCTP [RFC4960], SIP
[RFC3261]). At this point, our experience leads to a recognition [RFC3261]). At this point, our experience leads to a recognition
that often specific tweaks that deviate from standardized time-based that often specific tweaks that deviate from standardized time-based
loss detectors do not materially impact network safety. Therefore, loss detectors do not materially impact network safety with respect
in this document we outline a set of high-level protocol-agnostic to congestion control. Therefore, in this document we outline a set
requirements for time-based loss detection. The intent is to of high-level protocol-agnostic requirements for time-based loss
provide a safe foundation on which implementations have the detection. The intent is to provide a safe foundation on which
flexibility to instantiate mechanisms that best realize their implementations have the flexibility to instantiate mechanisms that
specific goals. best realize their specific goals.
2 Context 2 Context
This document is different from the way we ideally like to This document is different from the way we ideally like to
engineer systems. Usually, we strive to understand high-level engineer systems. Usually, we strive to understand high-level
requirements as a starting point. We then methodically engineer requirements as a starting point. We then methodically engineer
specific protocols, algorithms and systems that meet these specific protocols, algorithms and systems that meet these
requirements. Within the standards process we have derived many requirements. Within the IETF standards process we have derived
time-based loss detection schemes without benefit from some many time-based loss detection schemes without benefit from some
over-arching requirements document---because we had no idea how to over-arching requirements document---because we had no idea how to
write such a document! Therefore, we made the best specific write such a document! Therefore, we made the best specific
decisions we could in response to specific needs. decisions we could in response to specific needs.
At this point, however, the community's experience has matured to At this point, however, the community's experience has matured to
the point where we can define a set of high-level requirements for the point where we can define a set of general, high-level
time-based loss detection schemes. We now understand how to requirements for time-based loss detection schemes. We now
separate the strategies these mechanisms use that are crucial for understand how to separate the strategies these mechanisms use that
network safety from those small details that do not materially are crucial for network safety from those small details that do not
impact network safety. However, adding a requirements umbrella to a materially impact network safety. The requirements in this document
body of existing specifications is inherently messy and we run the may not be appropriate in all cases. In particular, the guidelines
risk of creating inconsistencies with both past and future in section 4 are concerned with the general case, but specific
mechanisms. The correct way to view this document is as the default situations may allow for more flexibility in terms of loss detection
case. Specifically: because specific facets of the environment are known (e.g., when
operating over a single physical link or within a tightly controlled
data center). Therefore, variants, deviations or wholly different
time-based loss detectors may be necessary or useful in some cases.
The correct way to view this document is as the default case and not
as a one-size-fits-all that is optimal in all cases.
Adding a requirements umbrella to a body of existing specifications
is inherently messy and we run the risk of creating inconsistencies
with both past and future mechanisms. Therefore, we make the
following statements about the relationship of this document to past
and future specifications:
- This document does not update or obsolete any existing RFC. - This document does not update or obsolete any existing RFC.
These previous specifications---while generally consistent with These previous specifications---while generally consistent with
the requirements in this document---reflect community consensus the requirements in this document---reflect community consensus
and this document does not change that consensus. and this document does not change that consensus.
- The requirements in this document are meant to provide for - The requirements in this document are meant to provide for
network safety and, as such, SHOULD be used by all time-based network safety and, as such, SHOULD be used by all time-based
loss detection mechanisms. loss detection mechanisms.
- The requirements in this document may not be appropriate in all - The requirements in this document may not be appropriate in all
cases and, therefore, inconsistent deviations may be necessary cases and, therefore, inconsistent deviations and variants may
(hence the "SHOULD" in the last bullet). However, be necessary (hence the "SHOULD" in the last bullet). However,
inconsistencies MUST be (a) explained and (b) gather consensus. inconsistencies MUST be (a) explained and (b) gather consensus.
3 Scope 3 Scope
The principles we outline in this document are protocol-agnostic and The principles we outline in this document are protocol-agnostic and
widely applicable. We make the following scope statements about widely applicable. We make the following scope statements about
the application of the requirements discussed in Section 4: the application of the requirements discussed in Section 4:
(S.1) The requirements in this document apply only to the primary or (S.1) The requirements in this document apply only to the primary or
last resort time-based loss detection. last resort time-based loss detection.
skipping to change at page 7, line 11 skipping to change at page 7, line 41
communication. communication.
As an example, TCP takes an FT sample roughly once per RTT, As an example, TCP takes an FT sample roughly once per RTT,
or if using the timestamp option [RFC7323] on each or if using the timestamp option [RFC7323] on each
acknowledgment arrival. [AP99] shows that both these acknowledgment arrival. [AP99] shows that both these
approaches result in roughly equivalent performance for the approaches result in roughly equivalent performance for the
RTO estimator. RTO estimator.
(c) FT observations MAY be taken from non-data exchanges. (c) FT observations MAY be taken from non-data exchanges.
Some protocols use keepalives, heartbeats or other messages Some protocols use non-data exchanges for various
to exchange control information. To the extent that the reasons---e.g., keepalives, heartbeats, control messages.
latency of these transactions mirrors data exchange, they To the extent that the latency of these exchanges mirrors
can be leveraged to take FT samples within the RTO data exchange, they can be leveraged to take FT samples
mechanism. Such samples can help protocols keep their RTO within the RTO mechanism. Such samples can help protocols
accurate during lulls in data transmission. However, given keep their RTO accurate during lulls in data transmission.
that these messages may not be subject to the same delays as However, given that these messages may not be subject to the
data transmission, we do not take a general view on whether same delays as data transmission, we do not take a general
this is useful or not. view on whether this is useful or not.
(d) An RTO mechanism MUST NOT use ambiguous FT samples. (d) An RTO mechanism MUST NOT use ambiguous FT samples.
Assume two copies of some segment X are transmitted at times Assume two copies of some segment X are transmitted at times
t0 and t1 and then at time t2 the sender receives t0 and t1 and then at time t2 the sender receives
confirmation that X in fact arrived. In some cases, it is confirmation that X in fact arrived. In some cases, it is
not clear which copy of X triggered the confirmation and not clear which copy of X triggered the confirmation and
hence the actual FT is either t2-t1 or t2-t0, but which is a hence the actual FT is either t2-t1 or t2-t0, but which is a
mystery. Therefore, in this situation an implementation mystery. Therefore, in this situation an implementation
MUST use Karn's algorithm [KP87,RFC6298] and use neither MUST use Karn's algorithm [KP87,RFC6298] and use neither
version of the FT sample and hence not update the RTO. version of the FT sample and hence not update the RTO.
There are cases where two copies of some data are There are cases where two copies of some data are
transmitted in a way whereby the sender can tell which is transmitted in a way whereby the sender can tell which is
being acknowledged by an incoming ACK. E.g., TCP's being acknowledged by an incoming ACK. E.g., TCP's
timestamp option [RFC7323] allows for segments to be timestamp option [RFC7323] allows for segments to be
uniquely identified and hence avoid the ambiguity. In such uniquely identified and hence avoid the ambiguity. In such
cases there is no ambiguity and the resulting samples can cases there is no ambiguity and the resulting samples can
update the RTO. update the RTO.
(3) Each time the RTO is used to detect a loss, the value of the RTO (3) Loss detected by the RTO mechanism MUST be taken as an
MUST be exponentially backed off such that the next firing
requires a longer interval. The backoff SHOULD be removed after
either (a) the subsequent successful transmission of
non-retransmitted data, or (b) an RTO passes without detecting
additional losses. The former will generally be quicker. The
latter covers cases where loss is detected, but not repaired.
A maximum value MAY be placed on the RTO. The maximum RTO MUST
NOT be less than 60 seconds (as specified in [RFC6298]).
This ensures network safety.
(4) Loss detected by the RTO mechanism MUST be taken as an
indication of network congestion and the sending rate adapted indication of network congestion and the sending rate adapted
using a standard mechanism (e.g., TCP collapses the congestion using a standard mechanism (e.g., TCP collapses the congestion
window to one segment [RFC5681]). window to one segment [RFC5681]).
This ensures network safety. This ensures network safety.
An exception to this rule is if an IETF standardized mechanism An exception to this rule is if an IETF standardized mechanism
determines that a particular loss is due to a non-congestion determines that a particular loss is due to a non-congestion
event (e.g., packet corruption). In such a case a congestion event (e.g., packet corruption). In such a case a congestion
control action is not required. Additionally, congestion control action is not required. Additionally, congestion
control actions taken based on time-based loss detection could control actions taken based on time-based loss detection could
be reversed when a standard mechanism post-facto determines that be reversed when a standard mechanism post-facto determines that
the cause of the loss was not congestion (e.g., [RFC5682]). the cause of the loss was not congestion (e.g., [RFC5682]).
(4) Each time the RTO is used to detect a loss, the value of the RTO
MUST be exponentially backed off such that the next firing
requires a longer interval. The backoff SHOULD be removed after
either (a) the subsequent successful transmission of
non-retransmitted data, or (b) an RTO passes without detecting
additional losses. The former will generally be quicker. The
latter covers cases where loss is detected, but not repaired.
A maximum value MAY be placed on the RTO. The maximum RTO MUST
NOT be less than 60 seconds (as specified in [RFC6298]).
This ensures network safety.
As with guideline (3), an exception to this rule exists if an
IETF standardized mechanism determines that a particular loss is
not due to congestion.
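The sample-handling rule in (2)(d) and the backoff guideline above can be illustrated with a minimal sketch. It is not taken from either draft revision. The names `RtoTimer`, `ft_sample`, `base_rto` and `acked_copy` are hypothetical, doubling is assumed as the exponential factor (the draft only requires that the backoff be exponential), and the estimator that turns FT samples into a base RTO is left out. The exception for losses that a standardized mechanism attributes to a non-congestion cause is also not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# The draft says a maximum RTO, if one is used, MUST NOT be less than 60 s.
MIN_ALLOWED_MAX_RTO = 60.0


def ft_sample(send_times: Sequence[float], ack_time: float,
              acked_copy: Optional[int] = None) -> Optional[float]:
    """Return a usable FT sample, or None if the sample is ambiguous.

    send_times holds the transmission time of every copy of the data.
    acked_copy is the index of the copy the confirmation refers to, when
    the protocol can tell (e.g. via a timestamp option); None otherwise.
    """
    if len(send_times) == 1:
        # Only one copy was sent, so there is no ambiguity.
        return ack_time - send_times[0]
    if acked_copy is not None:
        # The confirmation identifies the copy, so the sample is usable.
        return ack_time - send_times[acked_copy]
    # Karn's algorithm: several copies and no way to tell them apart,
    # so no sample is produced and the RTO is not updated.
    return None


@dataclass
class RtoTimer:
    """Hypothetical sender-side RTO state (illustrative only)."""
    base_rto: float          # seconds, from an estimator that is not shown
    max_rto: float = 60.0    # optional cap on the backed-off RTO
    backoff: int = 0         # consecutive losses detected by the RTO

    def __post_init__(self) -> None:
        if self.max_rto < MIN_ALLOWED_MAX_RTO:
            raise ValueError("maximum RTO must not be less than 60 seconds")

    def current_rto(self) -> float:
        # Assumption: double the interval per timeout, up to the cap.
        return min(self.base_rto * (2 ** self.backoff), self.max_rto)

    def on_timeout(self) -> float:
        # Each loss detected by the RTO lengthens the next interval.
        self.backoff += 1
        return self.current_rto()

    def on_new_data_acked(self) -> None:
        # Condition (a): non-retransmitted data was delivered.
        self.backoff = 0

    def on_rto_without_loss(self) -> None:
        # Condition (b): an RTO passed without detecting more loss.
        self.backoff = 0
```

In this sketch a timer created with `RtoTimer(base_rto=1.0)` yields intervals of 2, 4, 8, ... seconds on successive calls to `on_timeout()` until the 60 second cap is reached, and returns to 1 second once either removal condition fires.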
5 Discussion 5 Discussion
We note that research has shown the tension between the We note that research has shown the tension between the
responsiveness and correctness of time-based loss detection seems to responsiveness and correctness of time-based loss detection seems to
be a fundamental tradeoff in the context of TCP [AP99]. That is, be a fundamental tradeoff in the context of TCP [AP99]. That is,
making the RTO more aggressive (e.g., via changing TCP's making the RTO more aggressive (e.g., via changing TCP's
exponentially weighted moving average (EWMA) gains, lowering the exponentially weighted moving average (EWMA) gains, lowering the
minimum RTO, etc.) can reduce the time required to detect actual minimum RTO, etc.) can reduce the time required to detect actual
loss. However, at the same time, such aggressiveness leads to more loss. However, at the same time, such aggressiveness leads to more
cases of mistakenly declaring packets lost that ultimately arrived cases of mistakenly declaring packets lost that ultimately arrived
skipping to change at page 10, line 33 skipping to change at page 11, line 11
"SIP: Session Initiation Protocol", RFC 3261, June 2002. "SIP: Session Initiation Protocol", RFC 3261, June 2002.
[RFC3522] Ludwig, R., M. Meyer, "The Eifel Detection Algorithm for [RFC3522] Ludwig, R., M. Meyer, "The Eifel Detection Algorithm for
TCP", RFC 3522, April 2003. TCP", RFC 3522, April 2003.
[RFC3708] Blanton, E., M. Allman, "Using TCP Duplicate Selective [RFC3708] Blanton, E., M. Allman, "Using TCP Duplicate Selective
Acknowledgement (DSACKs) and Stream Control Transmission Acknowledgement (DSACKs) and Stream Control Transmission
Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs) Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs)
to Detect Spurious Retransmissions", RFC 3708, February 2004. to Detect Spurious Retransmissions", RFC 3708, February 2004.
[RFC3940] Adamson, B., C. Bormann, M. Handley, J. Macker,
"Negative-acknowledgment (NACK)-Oriented Reliable Multicast
(NORM) Protocol", November 2004, RFC 3940.
[RFC4340] Kohler, E., M. Handley, S. Floyd, "Datagram Congestion
Control Protocol (DCCP)", March 2006, RFC 4340.
[RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC [RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC
4960, September 2007. 4960, September 2007.
[RFC5681] Allman, M., V. Paxson, E. Blanton, "TCP Congestion
Control", RFC 5681, September 2009.
[RFC5682] Sarolahti, P., M. Kojo, K. Yamamoto, M. Hata, "Forward [RFC5682] Sarolahti, P., M. Kojo, K. Yamamoto, M. Hata, "Forward
RTO-Recovery (F-RTO): An Algorithm for Detecting Spurious RTO-Recovery (F-RTO): An Algorithm for Detecting Spurious
Retransmission Timeouts with TCP", RFC 5682, September 2009. Retransmission Timeouts with TCP", RFC 5682, September 2009.
[RFC5740] Adamson, B., C. Bormann, M. Handley, J. Macker, [RFC5740] Adamson, B., C. Bormann, M. Handley, J. Macker,
"NACK-Oriented Reliable Multicast (NORM) Transport Protocol", "NACK-Oriented Reliable Multicast (NORM) Transport Protocol",
November 2009, RFC 5740. RFC 5740, November 2009.
[RFC6182] Ford, A., C. Raiciu, M. Handley, S. Barre, J. Iyengar, [RFC6182] Ford, A., C. Raiciu, M. Handley, S. Barre, J. Iyengar,
"Architectural Guidelines for Multipath TCP Development", March "Architectural Guidelines for Multipath TCP Development", March
2011, RFC 6182. 2011, RFC 6182.
[RFC6298] Paxson, V., M. Allman, H.K. Chu, M. Sargent, "Computing [RFC6298] Paxson, V., M. Allman, H.K. Chu, M. Sargent, "Computing
TCP's Retransmission Timer", June 2011, RFC 6298. TCP's Retransmission Timer", June 2011, RFC 6298.
[RFC6582] Henderson, T., S. Floyd, A. Gurtov, Y. Nishida, "The
NewReno Modification to TCP's Fast Recovery Algorithm", April
2012, RFC 6582.
[RFC6675] Blanton, E., M. Allman, L. Wang, I. Jarvinen, M. Kojo, [RFC6675] Blanton, E., M. Allman, L. Wang, I. Jarvinen, M. Kojo,
Y. Nishida, "A Conservative Loss Recovery Algorithm Based on Y. Nishida, "A Conservative Loss Recovery Algorithm Based on
Selective Acknowledgment (SACK) for TCP", August 2012, RFC 6675. Selective Acknowledgment (SACK) for TCP", August 2012, RFC 6675.
[RFC7323] Borman, D., B. Braden, V. Jacobson, R. Scheffenegger, "TCP [RFC7323] Borman, D., B. Braden, V. Jacobson, R. Scheffenegger, "TCP
Extensions for High Performance", September 2014, RFC 7323. Extensions for High Performance", September 2014, RFC 7323.
Authors' Addresses Authors' Addresses
Mark Allman Mark Allman
 End of changes. 17 change blocks. 
64 lines changed or deleted 89 lines changed or added
