draft-ietf-tcpm-rto-consider-12.txt   draft-ietf-tcpm-rto-consider-13.txt 
Internet Engineering Task Force M. Allman Internet Engineering Task Force M. Allman
INTERNET-DRAFT ICSI INTERNET-DRAFT ICSI
File: draft-ietf-tcpm-rto-consider-12.txt May 4, 2020 File: draft-ietf-tcpm-rto-consider-13.txt May 8, 2020
Intended Status: Best Current Practice Intended Status: Best Current Practice
Expires: November 4, 2020 Expires: November 8, 2020
Requirements for Time-Based Loss Detection Requirements for Time-Based Loss Detection
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. Internet-Drafts are working provisions of BCP 78 and BCP 79. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts. working documents as Internet-Drafts.
skipping to change at page 1, line 30 skipping to change at page 1, line 30
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress." reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on November 4, 2020. This Internet-Draft will expire on November 8, 2020.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 23 skipping to change at page 3, line 23
that often specific tweaks that deviate from standardized time-based that often specific tweaks that deviate from standardized time-based
loss detectors do not materially impact network safety. Therefore, loss detectors do not materially impact network safety. Therefore,
in this document we outline a set of high-level protocol-agnostic in this document we outline a set of high-level protocol-agnostic
requirements for time-based loss detection. The intent is to requirements for time-based loss detection. The intent is to
provide a safe foundation on which implementations have the provide a safe foundation on which implementations have the
flexibility to instantiate mechanisms that best realize their flexibility to instantiate mechanisms that best realize their
specific goals. specific goals.
2 Context 2 Context
This document is different from other standards documents in that it This document is different from the way we ideally like to
is backwards from the way we generally like to engineer systems. engineer systems. Usually, we strive to understand high-level
Usually, we strive to understand high-level requirements as a requirements as a starting point. We then methodically engineer
starting point. We then methodically engineer specific protocols, specific protocols, algorithms and systems that meet these
algorithms and systems that meet these requirements. Within the requirements. Within the standards process we have derived many
standards process we have derived many time-based loss detection time-based loss detection schemes without benefit from some
schemes without benefit from some over-arching requirements over-arching requirements document---because we had no idea how to
document---because we had no idea how to write such a document! write such a document! Therefore, we made the best specific
Therefore, we made the best specific decisions we could in response decisions we could in response to specific needs.
to specific needs.
At this point, however, the community's experience has matured to At this point, however, the community's experience has matured to
the point where we can define a set of high-level requirements for the point where we can define a set of high-level requirements for
time-based loss detection schemes. We now understand how to time-based loss detection schemes. We now understand how to
separate the strategies these mechanisms use that are crucial for separate the strategies these mechanisms use that are crucial for
network safety from those small details that do not materially network safety from those small details that do not materially
impact network safety. However, adding a requirements umbrella to a impact network safety. However, adding a requirements umbrella to a
body of existing specifications is inherently messy and we run the body of existing specifications is inherently messy and we run the
risk of creating inconsistencies with both past and future risk of creating inconsistencies with both past and future
mechanisms. The correct way to view this document is as the default mechanisms. The correct way to view this document is as the default
skipping to change at page 4, line 11 skipping to change at page 4, line 11
cases and, therefore, inconsistent deviations may be necessary cases and, therefore, inconsistent deviations may be necessary
(hence the "SHOULD" in the last bullet). However, (hence the "SHOULD" in the last bullet). However,
inconsistencies MUST (a) be explained and (b) gather consensus. inconsistencies MUST (a) be explained and (b) gather consensus.
3 Scope 3 Scope
The principles we outline in this document are protocol-agnostic and The principles we outline in this document are protocol-agnostic and
widely applicable. We make the following scope statements about widely applicable. We make the following scope statements about
the application of the requirements discussed in Section 4: the application of the requirements discussed in Section 4:
(S.1) The requirements in this document apply only to time-based (S.1) The requirements in this document apply only to the primary or
loss detection. last resort time-based loss detection.
While there are a bevy of uses for timers in protocols---from While there are a bevy of uses for timers in protocols---from
rate-based pacing to connection failure detection and rate-based pacing to connection failure detection and
beyond---these are outside the scope of this document. beyond---these are outside the scope of this document.
(S.2) The requirements in this document apply only to endpoint-to- (S.2) The requirements for time-based loss detection mechanisms in
this document are for the primary or "last resort" loss
detection mechanism whether the mechanism is the sole loss
repair strategy or works in concert with other mechanisms.
While a straightforward time-based loss detector is sufficient
for simple protocols like DNS [RFC1034,RFC1035], more complex
protocols often use more advanced loss detectors to aid
performance. For instance, TCP and SCTP have methods to
detect (and repair) loss based on explicit endpoint state
sharing [RFC2018,RFC4960,RFC6675]. Such mechanisms often
provide more timely and precise results than time-based loss
detectors. However, these mechanisms do not obviate the need
for a "retransmission timeout" or "RTO" because---as we
discuss in Section 1---only the passage of time can ultimately
be relied upon to detect loss. In cases such as these, the
time-based loss detector functions as a "last resort".
Also, note that some recent proposals have incorporated time
as a component of advanced loss detection methods---either as
an aggressive first loss detector or in conjunction with
endpoint state sharing [DCCM13,CCDJ20,IS20]. Since these
timers are not used as "last resort" the requirements in this
document need not be directly used in these cases. However,
we expect that many of the requirements are useful for these
situations, as well.
(S.3) The requirements in this document apply only to endpoint-to-
endpoint unicast communication. Reliable multicast (e.g., endpoint unicast communication. Reliable multicast (e.g.,
[RFC5740]) protocols are explicitly outside the scope of this [RFC5740]) protocols are explicitly outside the scope of this
document. document.
Protocols such as SCTP [RFC4960] and MP-TCP [RFC6182] that Protocols such as SCTP [RFC4960] and MP-TCP [RFC6182] that
communicate in a unicast fashion with multiple specific communicate in a unicast fashion with multiple specific
endpoints can leverage the requirements in this document endpoints can leverage the requirements in this document
provided they track state and follow the requirements for each provided they track state and follow the requirements for each
endpoint independently. I.e., if host A communicates with endpoint independently. I.e., if host A communicates with
addresses B and C, A needs to use independent time-based loss addresses B and C, A needs to use independent time-based loss
detector instances for traffic sent to B and C. detector instances for traffic sent to B and C.
(S.3) There are cases where state is shared across connections (S.4) There are cases where state is shared across connections
or flows (e.g., [RFC2140], [RFC3124]). State pertaining to or flows (e.g., [RFC2140], [RFC3124]). State pertaining to
time-based loss detection is often discussed as sharable. time-based loss detection is often discussed as sharable.
These situations raise issues that the simple flow-oriented These situations raise issues that the simple flow-oriented
time-based loss detection mechanism discussed in this document time-based loss detection mechanism discussed in this document
does not consider (e.g., how long to preserve state between does not consider (e.g., how long to preserve state between
connections). Therefore, while the general principles given connections). Therefore, while the general principles given
in Section 4 are likely applicable, sharing time-based loss in Section 4 are likely applicable, sharing time-based loss
detection information across flows is outside the scope of detection information across flows is outside the scope of
this document. this document.
(S.4) The requirements for time-based loss detection mechanisms in
this document can be applied regardless of whether the
mechanism is the sole loss repair strategy or works in concert
with other mechanisms.
E.g., for a simple protocol like UDP-based DNS
[RFC1034,RFC1035] a timeout and re-try mechanism is likely to
act alone to ensure reliability.
E.g., complex protocols like TCP or SCTP have methods to
detect (and repair) loss based on explicit endpoint state
sharing [RFC2018,RFC4960,RFC6675]. These mechanisms are
preferred over a time-based loss detection as they are often
more timely and precise than time-based schemes. In these
cases, a time-based scheme---called a "retransmission timeout"
or "RTO"---becomes a last resort when the more advanced
mechanisms fail.
E.g., some protocols may leverage more than one time-based
loss detector simultaneously. In these cases, the general
guidance in this document can be applied to all such timers.
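The per-endpoint independence described in the SCTP/MP-TCP scope statement above (host A keeping separate detector instances for B and C) can be sketched as follows. This is only an illustration, not part of the draft: the names LossDetectorState and PerDestinationDetectors are hypothetical, and the state kept per destination is deliberately minimal.

```python
from dataclasses import dataclass, field


@dataclass
class LossDetectorState:
    """Hypothetical per-destination state; the 1.0 s default mirrors the
    conservative initial RTO used when nothing is known about the path."""
    rto: float = 1.0
    ft_samples: list = field(default_factory=list)


class PerDestinationDetectors:
    """Keeps one independent loss-detector state per destination address."""

    def __init__(self):
        self._by_dest = {}

    def for_destination(self, addr):
        # Create the state lazily on first use; never share it across addresses.
        if addr not in self._by_dest:
            self._by_dest[addr] = LossDetectorState()
        return self._by_dest[addr]
```

With this layout, feedback observed on traffic to B updates only B's state, so a slow path to C cannot inflate (or deflate) the timeout used for B.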
4 Requirements 4 Requirements
We now list the requirements that apply when designing time-based We now list the requirements that apply when designing primary or
loss detection mechanisms. For historical reasons and ease of last resort time-based loss detection mechanisms. For historical
exposition, we refer to the time between sending a packet and reasons and ease of exposition, we refer to the time between sending
determining the packet has been lost due to lack of delivery a packet and determining the packet has been lost due to lack of
confirmation as the "retransmission timeout" or "RTO". However, the delivery confirmation as the "retransmission timeout" or "RTO".
After the RTO passes without delivery confirmation, the sender may
safely assume the packet is lost. However, as discussed above, the
detected loss need not be repaired (i.e., the loss could be detected detected loss need not be repaired (i.e., the loss could be detected
only for congestion control and not reliability purposes). only for congestion control and not reliability purposes).
(1) As we note above, loss detection happens when a sender does not (1) As we note above, loss detection happens when a sender does not
receive delivery confirmation within some expected period of receive delivery confirmation within some expected period of
time. In the absence of any knowledge about the latency of a time. In the absence of any knowledge about the latency of a
path, the initial RTO MUST be conservatively set to no less than path, the initial RTO MUST be conservatively set to no less than
1 second. 1 second.
Correctness is of the utmost importance when transmitting into a Correctness is of the utmost importance when transmitting into a
skipping to change at page 6, line 23 skipping to change at page 6, line 29
resolver will request the needed information from one or more resolver will request the needed information from one or more
authoritative DNS servers, which will non-trivially increase the authoritative DNS servers, which will non-trivially increase the
FT compared to the network RTT between the client and resolver. FT compared to the network RTT between the client and resolver.
Therefore, we express the requirements in terms of FT. Again, Therefore, we express the requirements in terms of FT. Again,
for ease of exposition we use "RTO" to indicate the interval for ease of exposition we use "RTO" to indicate the interval
between a packet transmission and the decision the packet has between a packet transmission and the decision the packet has
been lost---regardless of whether the packet will be been lost---regardless of whether the packet will be
retransmitted. retransmitted.
(a) In steady state the RTO SHOULD be set based on observations (a) If/when available, the RTO SHOULD be set based on multiple
of both the FT and the variance of the FT. observations of the FT.
In other words, the RTO should represent an empirically- In other words, the RTO should represent an empirically-
derived reasonable amount of time that the sender should derived reasonable amount of time that the sender should
wait for delivery confirmation before deciding the given wait for delivery confirmation before deciding the given
data is lost. Networks are inherently dynamic and therefore data is lost. Network paths are inherently dynamic and
it is crucial to allow for some variance in the FT when therefore it is crucial to incorporate multiple FT samples
developing the expectation. in the RTO to take into account the delay variation across
time.
For example, TCP's RTO [RFC6298] would satisfy this
requirement due to its use of an EWMA to combine multiple FT
samples into a "smoothed RTT". In the name of
conservativeness, TCP goes further to also include an
explicit variance term when computing the RTO.
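As a hedged illustration of the TCP example above, the following sketch combines multiple FT samples using the smoothing constants from [RFC6298] (alpha = 1/8, beta = 1/4, K = 4) and that RFC's 1 second floor. The class name RtoEstimator, the method on_sample, and the granularity parameter are assumptions made for this sketch. It also assumes the caller feeds in only unambiguous samples, i.e., not ones taken from retransmitted packets.

```python
class RtoEstimator:
    """Sketch of an RFC 6298-style RTO computation over FT samples (seconds)."""

    ALPHA = 1 / 8   # gain for the smoothed FT (SRTT in RFC 6298)
    BETA = 1 / 4    # gain for the variation term (RTTVAR in RFC 6298)
    K = 4           # multiplier applied to the variation term

    def __init__(self, initial_rto=1.0, min_rto=1.0, granularity=0.001):
        self.srtt = None          # no FT sample seen yet
        self.rttvar = None
        self.rto = initial_rto    # conservative value used before any sample
        self.min_rto = min_rto
        self.granularity = granularity  # assumed clock granularity

    def on_sample(self, ft):
        """Fold one FT observation into the estimate and return the new RTO."""
        if self.srtt is None:
            # First sample initializes both terms.
            self.srtt = ft
            self.rttvar = ft / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - ft)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * ft
        candidate = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = max(self.min_rto, candidate)
        return self.rto
```

Under these assumptions, a first FT sample of 0.5 seconds yields an RTO of 1.5 seconds, and a second identical sample lowers it to 1.25 seconds as the variation term decays. The requirement itself only asks for multiple FT observations; the explicit variance term is TCP's additional conservatism.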
(b) FT observations SHOULD be taken and incorporated into the (b) FT observations SHOULD be taken and incorporated into the
RTO at least once per RTT or as frequently as data is RTO at least once per RTT or as frequently as data is
exchanged in cases where that happens less frequently than exchanged in cases where that happens less frequently than
once per RTT. once per RTT.
Internet measurements show that taking only a single FT Internet measurements show that taking only a single FT
sample per TCP connection results in a relatively poorly sample per TCP connection results in a relatively poorly
performing RTO mechanism [AP99], hence this requirement that performing RTO mechanism [AP99], hence this requirement that
the FT be sampled continuously throughout the lifetime of the FT be sampled continuously throughout the lifetime of
skipping to change at page 9, line 22 skipping to change at page 9, line 34
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
Informative References Informative References
[AP99] Allman, M., V. Paxson, "On Estimating End-to-End Network Path [AP99] Allman, M., V. Paxson, "On Estimating End-to-End Network Path
Properties", Proceedings of the ACM SIGCOMM Technical Symposium, Properties", Proceedings of the ACM SIGCOMM Technical Symposium,
September 1999. September 1999.
[CCDJ20] Cheng, Y., N. Cardwell, N. Dukkipati, P. Jha, "RACK: a
time-based fast loss detection algorithm for TCP",
Internet-Draft draft-ietf-tcpm-rack-08.txt (work in progress),
March 2020.
[DCCM13] Dukkipati, N., N. Cardwell, Y. Cheng, M. Mathis, "Tail Loss
Probe (TLP): An Algorithm for Fast Recovery of Tail Losses",
Internet-Draft draft-dukkipati-tcpm-tcp-loss-probe-01.txt (work
in progress), February 2013.
[IS20] Iyengar, J., I. Swett, "QUIC Loss Detection and Congestion
Control", Internet-Draft
draft-ietf-quic-recovery-27.txt (work in progress), March 2020.
[KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time
Estimates in Reliable Transport Protocols", SIGCOMM 87. Estimates in Reliable Transport Protocols", SIGCOMM 87.
[RFC1034] Mockapetris, P. "Domain Names - Concepts and Facilities", [RFC1034] Mockapetris, P. "Domain Names - Concepts and Facilities",
RFC 1034, November 1987. RFC 1034, November 1987.
[RFC1035] Mockapetris, P. "Domain Names - Implementation and [RFC1035] Mockapetris, P. "Domain Names - Implementation and
Specification", RFC 1035, November 1987. Specification", RFC 1035, November 1987.
[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
 End of changes. 12 change blocks. 
49 lines changed or deleted 76 lines changed or added
