--- 1/draft-ietf-tcpm-early-rexmt-03.txt 2010-01-28 01:11:01.000000000 +0100
+++ 2/draft-ietf-tcpm-early-rexmt-04.txt 2010-01-28 01:11:01.000000000 +0100
@@ -1,23 +1,23 @@
 Internet Engineering Task Force                              Mark Allman
 INTERNET DRAFT                                                      ICSI
-File: draft-ietf-tcpm-early-rexmt-03.txt          Konstantin Avrachenkov
+File: draft-ietf-tcpm-early-rexmt-04.txt          Konstantin Avrachenkov
 Intended Status: Experimental                                      INRIA
                                                             Urtzi Ayesta
-                                                               LAAS-CNRS
+                                           BCAM-IKERBASQUE and LAAS-CNRS
                                                             Josh Blanton
                                                          Ohio University
                                                               Per Hurtig
                                                      Karlstad University
-                                                           November 2009
-                                                       Expires: May 2010
+                                                            January 2010
+                                                      Expires: July 2010

                    Early Retransmit for TCP and SCTP

 Status of this Memo

    This Internet-Draft is submitted to IETF in full conformance with
    the provisions of BCP 78 and BCP 79.

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups. Note that
@@ -28,21 +28,21 @@
    months and may be updated, replaced, or obsoleted by other documents
    at any time. It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt.

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html.

-   This Internet-Draft will expire on May 18, 2010.
+   This Internet-Draft will expire on July 27, 2010.

 Copyright Notice

    Copyright (c) 2009 IETF Trust and the persons identified as the
    document authors. All rights reserved.

    This document is subject to BCP 78 and the IETF Trust's Legal
    Provisions Relating to IETF Documents
    (http://trustee.ietf.org/license-info) in effect on the date of
    publication of this document. Please review these documents
@@ -66,23 +66,23 @@
    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
    document are to be interpreted as described in RFC 2119 [RFC2119].
    The reader is expected to be familiar with the definitions given in
    [RFC5681].

 1 Introduction

-   Many researchers have studied problems with TCP [RFC793,RFC5681]
-   when the congestion window is small and have outlined possible
-   mechanisms to mitigate these problems
+   Many researchers have studied problems with TCP's loss recovery
+   [RFC793,RFC5681] when the congestion window is small and have
+   outlined possible mechanisms to mitigate these problems
    [Mor97,BPS+98,Bal98,LK98,RFC3150,AA02]. SCTP's [RFC4960] loss
    recovery and congestion control mechanisms are based on TCP and
    therefore the same problems impact the performance of SCTP
    connections. When the transport detects a missing segment, the
    connection enters a loss recovery phase. There are several variants
    of the loss recovery phase depending on the TCP implementation. TCP
    can use slow start based recovery or Fast Recovery [RFC5681],
    NewReno [RFC3782], and loss recovery based on selective
    acknowledgments (SACKs) [RFC2018,FF96,RFC3517]. SCTP's loss
    recovery is not as varied due to the built-in selective
@@ -164,20 +164,28 @@
    use "Limited Transmit" to include both TCP and SCTP mechanisms for
    sending in response to the first two duplicate ACKs. By sending
    these two new segments the sender is attempting to induce additional
    duplicate ACKs (if appropriate) so that Fast Retransmit will be
    triggered before the retransmission timeout expires. The
    sender-side "Early Retransmit" mechanism outlined in this document
    covers the case when previously unsent data is not available for
    transmission (case (2) above) or cannot be transmitted due to an
    advertised window limitation (case (3) above).

+   Note: This document is being published as an experimental RFC as
+   part of the process for the TCPM WG and the IETF to assess whether
+   the proposed change is useful and safe in the heterogeneous
+   environments, including which variants of the mechanism are the most
+   effective. In the future, this specification may be updated and put
+   on the standards track if the safeness and efficacy can be
+   demonstrated.
+
 2 Early Retransmit Algorithm

    The Early Retransmit algorithm calls for lowering the threshold for
    triggering Fast Retransmit when the amount of outstanding data is
    small and when no previously unsent data can be transmitted (such
    that Limited Transmit could be used). Duplicate ACKs are triggered
    by each arriving out-of-order segment. Therefore, Fast Retransmit
    will not be invoked when there are less than four outstanding
    segments (assuming only one segment loss in the window). However,
    TCP and SCTP are not required to track the number of outstanding
@@ -227,23 +235,23 @@
        ER_thresh = ceiling (ownd/SMSS) - 1                    (1)

    duplicate ACKs, where ownd is in terms of bytes. We call this
    reduced ACK threshold enabling "Early Retransmission".

    When conditions (2.a) and (2.b) hold and a TCP connection does
    support SACK or SCTP is in use, Early Retransmit MUST be used only
    when "ownd - SMSS" bytes have been SACKed.

-   When conditions (2.a) and (2.b) do not hold, the transport MUST NOT
-   use Early Retransmit, but rather prefer the standard mechanisms,
-   including Fast Retransmit and Limited Transmit.
+   If either (or both) condition (2.a) or (2.b) does not hold, the
+   transport MUST NOT use Early Retransmit, but rather prefer the
+   standard mechanisms, including Fast Retransmit and Limited Transmit.

    As noted above, the drawback of this byte-based variant is precision
    [HB08]. We illustrate this with two examples:

      + Consider a non-SACK TCP sender that uses an SMSS of 1460 bytes
        and transmits three segments each with 400 bytes of payload. This
        is a case where Early Retransmit could aid loss recovery if one
        segment is lost. However, in this case ER_thresh will become
        zero, per equation (1), because the number of outstanding bytes
        is a poor estimate of the number of outstanding segments.
@@ -283,23 +290,23 @@
    segments. (We discuss tracking the number of outstanding segments
    below.) We call this reduced ACK threshold enabling "Early
    Retransmission".

    When conditions (3.a) and (3.b) hold and a TCP connection does
    support SACK or SCTP is in use, Early Retransmit MUST be used only
    when "oseg - 1" segments have been SACKed. A segment is considered
    to be SACKed when all its data bytes (TCP) or data chunks (SCTP)
    have been indicated as arrived by the receiver.

-   When conditions (3.a) and (3.b) do not hold, the transport MUST NOT
-   use Early Retransmit, but rather prefer the standard mechanisms,
-   including Fast Retransmit and Limited Transmit.
+   If either (or both) conditions (3.a) or (3.b) does not hold, the
+   transport MUST NOT use Early Retransmit, but rather prefer the
+   standard mechanisms, including Fast Retransmit and Limited Transmit.

    This version of Early Retransmit solves the precision issues
    discussed in the previous section. As noted previously, the cost is
    that the implementation will have to track segment boundaries to
    form an understanding as to how many actual segments have been
    transmitted, but not acknowledged. This can be done by the sender
    tracking the boundaries of the three segments on the right side of
    the current window (which involves tracking four sequence numbers in
    TCP). This could be done by keeping a circular list of the segment
    boundaries, for instance. Cumulative ACKs that do not fall within