draft-ietf-tsvwg-highspeed-01.txt | rfc3649.txt | |||
---|---|---|---|---|
Internet Engineering Task Force Sally Floyd | ||||
INTERNET-DRAFT ICSI | Network Working Group S. Floyd | |||
Expires: February 2004 | Request for Comments: 3649 ICSI | |||
Category: Experimental December 2003 | ||||
HighSpeed TCP for Large Congestion Windows | HighSpeed TCP for Large Congestion Windows | |||
Status of this Memo | Status of this Memo | |||
This memo defines an Experimental Protocol for the Internet | ||||
community. It does not specify an Internet standard of any kind. | ||||
Discussion and suggestions for improvement are requested. | ||||
Distribution of this memo is unlimited. | ||||
Copyright Notice | ||||
Copyright (C) The Internet Society (2003). All Rights Reserved. | ||||
Abstract | ||||
The proposals in this document are experimental. While they may be | The proposals in this document are experimental. While they may be | |||
deployed in the current Internet, they do not represent a consensus | deployed in the current Internet, they do not represent a consensus | |||
that this is the best method for high-speed congestion control. In | that this is the best method for high-speed congestion control. In | |||
particular, we note that alternative experimental proposals are | particular, we note that alternative experimental proposals are | |||
likely to be forthcoming, and it is not well understood how the | likely to be forthcoming, and it is not well understood how the | |||
proposals in this document will interact with such alternative | proposals in this document will interact with such alternative | |||
proposals. | proposals. | |||
Status of this Document | ||||
This document is an Internet-Draft and is in full conformance with | ||||
all provisions of Section 10 of RFC2026. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF), its areas, and its working groups. Note that | ||||
other groups may also distribute working documents as Internet- | ||||
Drafts. | ||||
Internet-Drafts are draft documents valid for a maximum of six | ||||
months and may be updated, replaced, or obsoleted by other documents | ||||
at any time. It is inappropriate to use Internet-Drafts as | ||||
reference material or to cite them other than as "work in progress." | ||||
The list of current Internet-Drafts can be accessed at | ||||
http://www.ietf.org/ietf/1id-abstracts.txt | ||||
The list of Internet-Draft Shadow Directories can be accessed at | ||||
http://www.ietf.org/shadow.html. | ||||
Abstract | ||||
This document proposes HighSpeed TCP, a modification to TCP's | This document proposes HighSpeed TCP, a modification to TCP's | |||
congestion control mechanism for use with TCP connections with | congestion control mechanism for use with TCP connections with large | |||
large congestion windows. The congestion control mechanisms | congestion windows. The congestion control mechanisms of the current | |||
of the current Standard TCP constrain the congestion windows | Standard TCP constrain the congestion windows that can be achieved | |||
that can be achieved by TCP in realistic environments. For | by TCP in realistic environments. For example, for a Standard TCP | |||
example, for a Standard TCP connection with 1500-byte packets | connection with 1500-byte packets and a 100 ms round-trip time, | |||
and a 100 ms round-trip time, achieving a steady-state | achieving a steady-state throughput of 10 Gbps would require an | |||
throughput of 10 Gbps would require an average congestion | average congestion window of 83,333 segments, and a packet drop rate | |||
window of 83,333 segments, and a packet drop rate of at most | of at most one congestion event every 5,000,000,000 packets (or | |||
one congestion event every 5,000,000,000 packets (or | equivalently, at most one congestion event every 1 2/3 hours). This | |||
equivalently, at most one congestion event every 1 2/3 hours). | is widely acknowledged as an unrealistic constraint. To address this | |||
This is widely acknowledged as an unrealistic constraint. To | limitation of TCP, this document proposes HighSpeed TCP, and solicits | |||
address this limitation of TCP, this document proposes | experimentation and feedback from the wider community. | |||
HighSpeed TCP, and solicits experimentation and feedback from | ||||
the wider community. | ||||
TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION: | ||||
Changes from draft-ietf-tsvwg-highspeed-00.txt: | ||||
Changed: | ||||
"The proposals in this document are experimental. We believe | ||||
they are safe for deployment in the current Internet, " | ||||
To: | ||||
"The proposals in this document are experimental. While they | ||||
may be deployed in the current Internet, " | ||||
Changes from draft-floyd-tcp-highspeed-03.txt: | ||||
Added the section on "Status of this Memo". | ||||
Added a paragraph to the end of the section on "Deployment | ||||
issues of HighSpeed TCP" about possible interactions between | ||||
HighSpeed TCP and other alternative experimental proposals. | ||||
Changes from draft-floyd-tcp-highspeed-02.txt: | ||||
* Added a section on "Deployment issues." | ||||
* Added a short section on "Implementation issues." | ||||
* Added a section on "Limiting burstiness on short time | ||||
scales". | ||||
* Added to the discussion on convergence times. | ||||
* Clarified that "log" is "log base 10". | ||||
* Clarified that W = Low_window and W_1 = High_window, in the | ||||
equation for b(w). | ||||
Changes from draft-floyd-tcp-highspeed-01.txt: | ||||
* Added a section on "Tradeoffs for Choosing Congestion | ||||
Control Parameters". | ||||
* Added mention of Scalable TCP from Tom Kelly. | ||||
Changes from draft-floyd-tcp-highspeed-00.txt: | ||||
* Added a discussion on related work about changing the PMTU. | ||||
* Added a discussion of an alternate, linear response | ||||
function. | ||||
* Added a discussion of the TCP window scale option. | ||||
* Added a discussion of HighSpeed TCP as roughly emulating the | ||||
congestion control response of N parallel TCP connections. | ||||
* Added a discussion of the time to converge to fairness. | ||||
* Expanded the Introduction. | ||||
Table of Contents | Table of Contents | |||
1. Introduction. . . . . . . . . . . . . . . . . . . . . . 5 | 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. The Problem Description.. . . . . . . . . . . . . . . . 6 | 2. The Problem Description.. . . . . . . . . . . . . . . . . . . . 3 | |||
3. Design Guidelines.. . . . . . . . . . . . . . . . . . . 6 | 3. Design Guidelines.. . . . . . . . . . . . . . . . . . . . . . . 4 | |||
4. Non-Goals.. . . . . . . . . . . . . . . . . . . . . . . 7 | 4. Non-Goals.. . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
5. Modifying the TCP Response Function.. . . . . . . . . . 8 | 5. Modifying the TCP Response Function.. . . . . . . . . . . . . . 6 | |||
6. Fairness Implications of the HighSpeed Response | 6. Fairness Implications of the HighSpeed Response | |||
Function.. . . . . . . . . . . . . . . . . . . . . . . . . 11 | Function. . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
7. Translating the HighSpeed Response Function into | 7. Translating the HighSpeed Response Function into | |||
Congestion Control Parameters. . . . . . . . . . . . . . . 14 | Congestion Control Parameters . . . . . . . . . . . . . . . . . 12 | |||
8. An alternate, linear response functions.. . . . . . . . 16 | 8. An alternate, linear response functions.. . . . . . . . . . . . 13 | |||
9. Tradeoffs for Choosing Congestion Control Parame- | 9. Tradeoffs for Choosing Congestion Control Parameters. . . . . . 16 | |||
ters.. . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 9.1. The Number of Round-Trip Times between Loss Events . . . . 17 | |||
9.1. The Number of Round-Trip Times between Loss | 9.2. The Number of Packet Drops per Loss Event, with Drop-Tail. 17 | |||
Events. . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 10. Related Issues . . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
9.2. The Number of Packet Drops per Loss Event, | 10.1. Slow-Start. . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
with Drop-Tail. . . . . . . . . . . . . . . . . . . . . . 19 | 10.2. Limiting burstiness on short time scales. . . . . . . . . 19 | |||
10. Related Issues . . . . . . . . . . . . . . . . . . . . 20 | 10.3. Other limitations on window size. . . . . . . . . . . . . 19 | |||
10.1. Slow-Start. . . . . . . . . . . . . . . . . . . . . 20 | 10.4. Implementation issues.. . . . . . . . . . . . . . . . . . 19 | |||
10.2. Limiting burstiness on short time scales. . . . . . 21 | 11. Deployment issues. . . . . . . . . . . . . . . . . . . . . . . 20 | |||
10.3. Other limitations on window size. . . . . . . . . . 22 | 11.1. Deployment issues of HighSpeed TCP. . . . . . . . . . . . 20 | |||
10.4. Implementation issues.. . . . . . . . . . . . . . . 22 | 11.2. Deployment issues of Scalable TCP . . . . . . . . . . . . 22 | |||
11. Deployment issues. . . . . . . . . . . . . . . . . . . 22 | 12. Related Work in HighSpeed TCP. . . . . . . . . . . . . . . . . 23 | |||
11.1. Deployment issues of HighSpeed TCP. . . . . . . . . 22 | 13. Relationship to other Work.. . . . . . . . . . . . . . . . . . 25 | |||
11.2. Deployment issues of Scalable TCP . . . . . . . . . 24 | 14. Conclusions. . . . . . . . . . . . . . . . . . . . . . . . . . 25 | |||
12. Related Work in HighSpeed TCP. . . . . . . . . . . . . 26 | 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 25 | |||
13. Relationship to other Work.. . . . . . . . . . . . . . 27 | 16. Normative References . . . . . . . . . . . . . . . . . . . . . 26 | |||
14. Conclusions. . . . . . . . . . . . . . . . . . . . . . 28 | 17. Informative References . . . . . . . . . . . . . . . . . . . . 26 | |||
15. Acknowledgements . . . . . . . . . . . . . . . . . . . 28 | 18. Security Considerations. . . . . . . . . . . . . . . . . . . . 28 | |||
16. Normative References . . . . . . . . . . . . . . . . . 29 | 19. IANA Considerations. . . . . . . . . . . . . . . . . . . . . . 28 | |||
17. Informative References . . . . . . . . . . . . . . . . 29 | A. TCP's Loss Event Rate in Steady-State. . . . . . . . . . . . . 29 | |||
18. Security Considerations. . . . . . . . . . . . . . . . 31 | B. A table for a(w) and b(w). . . . . . . . . . . . . . . . . . . 30 | |||
19. IANA Considerations. . . . . . . . . . . . . . . . . . 31 | C. Exploring the time to converge to fairness . . . . . . . . . . 32 | |||
20. TCP's Loss Event Rate in Steady-State. . . . . . . . . 31 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . 33 | |||
Full Copyright Statement . . . . . . . . . . . . . . . . . . . 34 | ||||
1. Introduction. | 1. Introduction | |||
This document proposes HighSpeed TCP, a modification to TCP's | This document proposes HighSpeed TCP, a modification to TCP's | |||
congestion control mechanism for use with TCP connections with large | congestion control mechanism for use with TCP connections with large | |||
congestion windows. In a steady-state environment, with a packet | congestion windows. In a steady-state environment, with a packet | |||
loss rate p, the current Standard TCP's average congestion window is | loss rate p, the current Standard TCP's average congestion window is | |||
roughly 1.2/sqrt(p) segments. This places a serious constraint on | roughly 1.2/sqrt(p) segments. This places a serious constraint on | |||
the congestion windows that can be achieved by TCP in realistic | the congestion windows that can be achieved by TCP in realistic | |||
environments. For example, for a Standard TCP connection with | environments. For example, for a Standard TCP connection with 1500- | |||
1500-byte packets and a 100 ms round-trip time, achieving a steady- | byte packets and a 100 ms round-trip time, achieving a steady-state | |||
state throughput of 10 Gbps would require an average congestion | throughput of 10 Gbps would require an average congestion window of | |||
window of 83,333 segments, and a packet drop rate of at most one | 83,333 segments, and a packet drop rate of at most one congestion | |||
congestion event every 5,000,000,000 packets (or equivalently, at | event every 5,000,000,000 packets (or equivalently, at most one | |||
most one congestion event every 1 2/3 hours). The average packet | congestion event every 1 2/3 hours). The average packet drop rate of | |||
drop rate of at most 2*10^(-10) needed for full link utilization in | at most 2*10^(-10) needed for full link utilization in this | |||
this environment corresponds to a bit error rate of at most | environment corresponds to a bit error rate of at most 2*10^(-14), | |||
2*10^(-14), and this is an unrealistic requirement for current | and this is an unrealistic requirement for current networks. | |||
networks. | ||||
To address this fundamental limitation of TCP and of the TCP | To address this fundamental limitation of TCP and of the TCP response | |||
response function (the function mapping the steady-state packet drop | function (the function mapping the steady-state packet drop rate to | |||
rate to TCP's average sending rate in packets per round-trip time), | TCP's average sending rate in packets per round-trip time), this | |||
this document describes a modified TCP response function for regimes | document describes a modified TCP response function for regimes with | |||
with higher congestion windows. This document also solicits | higher congestion windows. This document also solicits | |||
experimentation and feedback on HighSpeed TCP from the wider | experimentation and feedback on HighSpeed TCP from the wider | |||
community. | community. | |||
Because HighSpeed TCP's modified response function would only take | Because HighSpeed TCP's modified response function would only take | |||
effect with higher congestion windows, HighSpeed TCP does not modify | effect with higher congestion windows, HighSpeed TCP does not modify | |||
TCP behavior in environments with mild to heavy congestion, and | TCP behavior in environments with heavy congestion, and therefore | |||
therefore does not introduce any new dangers of congestion collapse. | does not introduce any new dangers of congestion collapse. However, | |||
However, if relative fairness between HighSpeed TCP connections is | if relative fairness between HighSpeed TCP connections is to be | |||
to be preserved, then in our view any modification to the TCP | preserved, then in our view any modification to the TCP response | |||
response function should be addressed in the IETF, rather than made | function should be addressed in the IETF, rather than made as ad hoc | |||
as ad hoc decisions by individual implementors or TCP senders. | decisions by individual implementors or TCP senders. Modifications | |||
Modifications to the TCP response function would also have | to the TCP response function would also have implications for | |||
implications for transport protocols that use TFRC and other forms | transport protocols that use TFRC and other forms of equation-based | |||
of equation-based congestion control, as these congestion control | congestion control, as these congestion control mechanisms directly | |||
mechanisms directly use the TCP response function [RFC3448]. | use the TCP response function [RFC3448]. | |||
This proposal for HighSpeed TCP focuses specifically on a proposed | This proposal for HighSpeed TCP focuses specifically on a proposed | |||
change to the TCP response function, and its implications for TCP. | change to the TCP response function, and its implications for TCP. | |||
This document does not address what we view as a separate | This document does not address what we view as a separate fundamental | |||
fundamental issue, of the mechanisms required to enable best-effort | issue, of the mechanisms required to enable best-effort connections | |||
connections to *start* with large initial windows. In our view, | to *start* with large initial windows. In our view, while HighSpeed | |||
while HighSpeed TCP proposes a somewhat fundamental change to the | TCP proposes a somewhat fundamental change to the TCP response | |||
TCP response function, at the same time it is a relatively simple | function, at the same time it is a relatively simple change to | |||
change to implement in a single TCP sender, and presents no dangers | implement in a single TCP sender, and presents no dangers in terms of | |||
in terms of congestion collapse. In contrast, in our view, the | congestion collapse. In contrast, in our view, the problem of | |||
problem of enabling connections to *start* with large initial | enabling connections to *start* with large initial windows is | |||
windows is inherently more risky and structurally more difficult, | inherently more risky and structurally more difficult, requiring some | |||
requiring some form of explicit feedback from all of the routers | form of explicit feedback from all of the routers along the path. | |||
along the path. This is another reason why we would propose | This is another reason why we would propose addressing the problem of | |||
addressing the problem of starting with large initial windows | starting with large initial windows separately, and on a separate | |||
separately, and on a separate timetable, from the problem of | timetable, from the problem of modifying the TCP response function. | |||
modifying the TCP response function. | ||||
2. The Problem Description. | 2. The Problem Description | |||
This section describes the number of round-trip times between | This section describes the number of round-trip times between | |||
congestion events required for a Standard TCP flow to achieve an | congestion events required for a Standard TCP flow to achieve an | |||
average throughput of B bps, given packets of D bytes and a round- | average throughput of B bps, given packets of D bytes and a round- | |||
trip time of R seconds. A congestion event refers to a window of | trip time of R seconds. A congestion event refers to a window of | |||
data with one or more dropped or ECN-marked packets (where ECN | data with one or more dropped or ECN-marked packets (where ECN stands | |||
stands for Explicit Congestion Notification). | for Explicit Congestion Notification). | |||
From Appendix A, achieving an average TCP throughput of B bps | From Appendix A, achieving an average TCP throughput of B bps | |||
requires a loss event at most every BR/(12D) round-trip times. This | requires a loss event at most every BR/(12D) round-trip times. This | |||
is illustrated in Table 1, for R = 0.1 seconds and D = 1500 bytes. | is illustrated in Table 1, for R = 0.1 seconds and D = 1500 bytes. | |||
The table also gives the average congestion window W of BR/(8D), and | The table also gives the average congestion window W of BR/(8D), and | |||
the steady-state packet drop rate P of 1.5/W^2. | the steady-state packet drop rate P of 1.5/W^2. | |||
TCP Throughput (Mbps) RTTs Between Losses W P | TCP Throughput (Mbps) RTTs Between Losses W P | |||
--------------------- ------------------- ---- ----- | --------------------- ------------------- ---- ----- | |||
1 5.5 8.3 0.02 | 1 5.5 8.3 0.02 | |||
10 55.5 83.3 0.0002 | 10 55.5 83.3 0.0002 | |||
100 555.5 833.3 0.000002 | 100 555.5 833.3 0.000002 | |||
1000 5555.5 8333.3 0.00000002 | 1000 5555.5 8333.3 0.00000002 | |||
10000 55555.5 83333.3 0.0000000002 | 10000 55555.5 83333.3 0.0000000002 | |||
Table 1: RTTs Between Congestion Events for Standard TCP, for | Table 1: RTTs Between Congestion Events for Standard TCP, for | |||
1500-Byte Packets and a Round-Trip Time of 0.1 Seconds. | 1500-Byte Packets and a Round-Trip Time of 0.1 Seconds. | |||
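The rows of Table 1 follow directly from the three expressions quoted in the text: BR/(12D) round-trip times between losses, W = BR/(8D), and P = 1.5/W^2. The following is a minimal Python sketch of that arithmetic and is not part of either version of the document. The function name `standard_tcp_requirements` and its parameter names are illustrative assumptions.

```python
def standard_tcp_requirements(throughput_bps, rtt_s=0.1, packet_bytes=1500):
    """Return (rtts_between_losses, avg_window, drop_rate) for Standard TCP.

    throughput_bps is B in bits per second, rtt_s is R in seconds, and
    packet_bytes is D in bytes, matching the symbols used in the text.
    """
    # A loss event at most every BR/(12D) round-trip times.
    rtts_between_losses = throughput_bps * rtt_s / (12 * packet_bytes)
    # Average congestion window W = BR/(8D), in segments.
    avg_window = throughput_bps * rtt_s / (8 * packet_bytes)
    # Steady-state packet drop rate P = 1.5/W^2.
    drop_rate = 1.5 / avg_window ** 2
    return rtts_between_losses, avg_window, drop_rate


if __name__ == "__main__":
    # Same throughputs as the rows of Table 1, given in Mbps.
    for mbps in (1, 10, 100, 1000, 10000):
        rtts, w, p = standard_tcp_requirements(mbps * 1e6)
        print(f"{mbps:>6} Mbps  RTTs={rtts:10.1f}  W={w:10.1f}  P={p:.1e}")
```

The sketch rounds where the table appears to truncate, so printed values may differ from the table in the last digit.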
This document proposes HighSpeed TCP, a minimal modification to | This document proposes HighSpeed TCP, a minimal modification to TCP's | |||
TCP's increase and decrease parameters, for TCP connections with | increase and decrease parameters, for TCP connections with larger | |||
larger congestion windows, to allow TCP to achieve high throughput | congestion windows, to allow TCP to achieve high throughput with more | |||
with more realistic requirements for the steady-state packet drop | realistic requirements for the steady-state packet drop rate. | |||
rate. Equivalently, HighSpeed TCP has more realistic requirements | Equivalently, HighSpeed TCP has more realistic requirements for the | |||
for the number of round-trip times between loss events. | number of round-trip times between loss events. | |||
3. Design Guidelines. | 3. Design Guidelines | |||
Our proposal for HighSpeed TCP is motivated by the following | Our proposal for HighSpeed TCP is motivated by the following | |||
requirements: | requirements: | |||
* Achieve high per-connection throughput without requiring | * Achieve high per-connection throughput without requiring | |||
unrealistically low packet loss rates. | unrealistically low packet loss rates. | |||
* Reach high throughput reasonably quickly when in slow-start. | * Reach high throughput reasonably quickly when in slow-start. | |||
* Reach high throughput without overly long delays when recovering | * Reach high throughput without overly long delays when recovering | |||
from multiple retransmit timeouts, or when ramping-up from a period | from multiple retransmit timeouts, or when ramping-up from a | |||
with small congestion windows. | period with small congestion windows. | |||
* No additional feedback or support required from routers: | * No additional feedback or support required from routers: | |||
For example, the goal is for acceptable performance in both ECN- | For example, the goal is for acceptable performance in both ECN- | |||
capable and non-ECN-capable environments, and with Drop-Tail as well | capable and non-ECN-capable environments, and with Drop-Tail as well | |||
as with Active Queue Management such as RED in the routers. | as with Active Queue Management such as RED in the routers. | |||
* No additional feedback required from TCP receivers. | * No additional feedback required from TCP receivers. | |||
* TCP-compatible performance in environments with moderate or high | * TCP-compatible performance in environments with moderate or high | |||
congestion: | congestion (e.g., packet drop rates of 1% or higher): | |||
Equivalently, the requirement is that there be no additional load on | Equivalently, the requirement is that there be no additional load on | |||
the network (in terms of increased packet drop rates) in | the network (in terms of increased packet drop rates) in environments | |||
environments with moderate or high congestion. | with moderate or high congestion. | |||
* Performance at least as good as Standard TCP in environments with | * Performance at least as good as Standard TCP in environments with | |||
moderate or high congestion. | moderate or high congestion. | |||
* Acceptable transient performance, in terms of increases in the | * Acceptable transient performance, in terms of increases in the | |||
congestion window in one round-trip time, responses to severe | congestion window in one round-trip time, responses to severe | |||
congestion, and convergence times to fairness. | congestion, and convergence times to fairness. | |||
Currently, users wishing to achieve throughputs of 1 Gbps or more | Currently, users wishing to achieve throughputs of 1 Gbps or more | |||
typically open up multiple TCP connections in parallel, or use | typically open up multiple TCP connections in parallel, or use MulTCP | |||
MulTCP [CO98,GRK99], which behaves roughly like the aggregate of N | [CO98,GRK99], which behaves roughly like the aggregate of N virtual | |||
virtual TCP connections. While this approach suffices for the | TCP connections. While this approach suffices for the occasional | |||
occasional user on well-provisioned links, it leaves the parameter N | user on well-provisioned links, it leaves the parameter N to be | |||
to be determined by the user, and results in more aggressive | determined by the user, and results in more aggressive performance | |||
performance and higher steady-state packet drop rates if used in | and higher steady-state packet drop rates if used in environments | |||
environments with periods of moderate or high congestion. We | with periods of moderate or high congestion. We believe that a new | |||
believe that a new approach is needed that offers more flexibility, | approach is needed that offers more flexibility, more effectively | |||
more effectively scales to a wide range of available bandwidths, and | scales to a wide range of available bandwidths, and competes more | |||
competes more fairly with Standard TCP in congested environments. | fairly with Standard TCP in congested environments. | |||
4. Non-Goals. | 4. Non-Goals | |||
The following are explicitly *not* goals of our work: | The following are explicitly *not* goals of our work: | |||
* Non-goal: TCP-compatible performance in environments with very low | * Non-goal: TCP-compatible performance in environments with very low | |||
packet drop rates. | packet drop rates. | |||
We note that our proposal does not require, or deliver, TCP- | We note that our proposal does not require, or deliver, TCP- | |||
compatible performance in environments with very low packet drop | compatible performance in environments with very low packet drop | |||
rates, e.g., with packet loss rates of 10^-5 or 10^-6. As we | rates, e.g., with packet loss rates of 10^-5 or 10^-6. As we discuss | |||
discuss later in this document, we assume that Standard TCP is | later in this document, we assume that Standard TCP is unable to make | |||
unable to make effective use of the available bandwidth in | effective use of the available bandwidth in environments with loss | |||
environments with loss rates of 10^-6 in any case, so that it is | rates of 10^-6 in any case, so that it is acceptable and appropriate | |||
acceptable and appropriate for HighSpeed TCP to perform more | for HighSpeed TCP to perform more aggressively than Standard TCP in | |||
aggressively than Standard TCP is such an environment. | such an environment. | |||
* Non-goal: Ramping-up more quickly than allowed by slow-start. | * Non-goal: Ramping-up more quickly than allowed by slow-start. | |||
It is our belief that ramping-up more quickly than allowed by slow- | It is our belief that ramping-up more quickly than allowed by slow- | |||
start would necessitate more explicit feedback from routers along | start would necessitate more explicit feedback from routers along the | |||
the path. The proposal for HighSpeed TCP is focused on changes to | path. The proposal for HighSpeed TCP is focused on changes to TCP | |||
TCP that could be effectively deployed in the current Internet | that could be effectively deployed in the current Internet | |||
environment. | environment. | |||
* Non-goal: Avoiding oscillations in environments with only one-way, | * Non-goal: Avoiding oscillations in environments with only one-way, | |||
long-lived flows all with the same round-trip times. | long-lived flows all with the same round-trip times. | |||
While we agree that attention to oscillatory behavior is useful, | While we agree that attention to oscillatory behavior is useful, | |||
avoiding oscillations in aggregate throughput has not been our | avoiding oscillations in aggregate throughput has not been our | |||
primary consideration, particularly for simplified environments | primary consideration, particularly for simplified environments | |||
limited to one-way, long-lived flows all with the same, large round- | limited to one-way, long-lived flows all with the same, large round- | |||
trip times. Our assessment is that some oscillatory behavior in | trip times. Our assessment is that some oscillatory behavior in | |||
these extreme environments is an acceptable price to pay for the | these extreme environments is an acceptable price to pay for the | |||
other benefits of HighSpeed TCP. | other benefits of HighSpeed TCP. | |||
5. Modifying the TCP Response Function. | 5. Modifying the TCP Response Function | |||
The TCP response function, w = 1.2/sqrt(p), gives TCP's average | The TCP response function, w = 1.2/sqrt(p), gives TCP's average | |||
congestion window w in MSS-sized segments, as a function of the | congestion window w in MSS-sized segments, as a function of the | |||
steady-state packet drop rate p [FF98]. This TCP response function | steady-state packet drop rate p [FF98]. This TCP response function | |||
is a direct consequence of TCP's Additive Increase Multiplicative | is a direct consequence of TCP's Additive Increase Multiplicative | |||
Decrease (AIMD) mechanisms of increasing the congestion window by | Decrease (AIMD) mechanisms of increasing the congestion window by | |||
roughly one segment per round-trip time in the absence of | roughly one segment per round-trip time in the absence of congestion, | |||
congestion, and halving the congestion window in response to a | and halving the congestion window in response to a round-trip time | |||
round-trip time with a congestion event. This response function for | with a congestion event. This response function for Standard TCP is | |||
Standard TCP is reflected in the table below. In this proposal we | reflected in the table below. In this proposal we restrict our | |||
restrict our attention to TCP performance in environments with | attention to TCP performance in environments with packet loss rates | |||
packet loss rates of at most 10^-2, and so we can ignore the more | of at most 10^-2, and so we can ignore the more complex response | |||
complex response functions that are required to model TCP | functions that are required to model TCP performance in more | |||
performance in more congested environments with retransmit timeouts. | congested environments with retransmit timeouts. From Appendix A, an | |||
From Appendix A, an average congestion window of W corresponds to an | average congestion window of W corresponds to an average of 2/3 W | |||
average of 2/3 W round-trip times between loss events for Standard | round-trip times between loss events for Standard TCP (with the | |||
TCP (with the congestion window varying from 2/3 W to 4/3 W). | congestion window varying from 2/3 W to 4/3 W). | |||
Packet Drop Rate P Congestion Window W RTTs Between Losses | Packet Drop Rate P Congestion Window W RTTs Between Losses | |||
------------------ ------------------- ------------------- | ------------------ ------------------- ------------------- | |||
10^-2 12 8 | 10^-2 12 8 | |||
10^-3 38 25 | 10^-3 38 25 | |||
10^-4 120 80 | 10^-4 120 80 | |||
10^-5 379 252 | 10^-5 379 252 | |||
10^-6 1200 800 | 10^-6 1200 800 | |||
10^-7 3795 2530 | 10^-7 3795 2530 | |||
10^-8 12000 8000 | 10^-8 12000 8000 | |||
10^-9 37948 25298 | 10^-9 37948 25298 | |||
10^-10 120000 80000 | 10^-10 120000 80000 | |||
Table 2: TCP Response Function for Standard TCP. The average | Table 2: TCP Response Function for Standard TCP. The average | |||
congestion window W in MSS-sized segments is given as a function of | congestion window W in MSS-sized segments is given as a function of | |||
the packet drop rate P. | the packet drop rate P. | |||
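As a rough check on Table 2, the Standard TCP response function w = 1.2/sqrt(p) and the 2/3 W round-trip times between loss events can be evaluated directly. The following is a minimal Python sketch and is not part of either version of the document. The function names `standard_tcp_window` and `rtts_between_losses` are illustrative assumptions.

```python
import math


def standard_tcp_window(drop_rate):
    """Average congestion window in segments, from w = 1.2/sqrt(p)."""
    return 1.2 / math.sqrt(drop_rate)


def rtts_between_losses(avg_window):
    """Average round-trip times between loss events, 2/3 W."""
    return 2.0 * avg_window / 3.0


if __name__ == "__main__":
    # Packet drop rates 10^-2 through 10^-10, as in Table 2.
    for exponent in range(2, 11):
        p = 10.0 ** -exponent
        w = standard_tcp_window(p)
        print(f"10^-{exponent:<3} W={w:10.0f}  RTTs={rtts_between_losses(w):10.0f}")
```

Some entries may differ from the table by one unit, depending on whether intermediate values are rounded or truncated.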
To specify a modified response function for HighSpeed TCP, we use | To specify a modified response function for HighSpeed TCP, we use | |||
three parameters, Low_Window, High_Window, and High_P. To ensure | three parameters, Low_Window, High_Window, and High_P. To ensure TCP | |||
TCP compatibility, the HighSpeed response function uses the same | compatibility, the HighSpeed response function uses the same response | |||
response function as Standard TCP when the current congestion window | function as Standard TCP when the current congestion window is at | |||
is at most Low_Window, and uses the HighSpeed response function when | most Low_Window, and uses the HighSpeed response function when the | |||
the current congestion window is greater than Low_Window. In this | current congestion window is greater than Low_Window. In this | |||
document we set Low_Window to 38 MSS-sized segments, corresponding | document we set Low_Window to 38 MSS-sized segments, corresponding to | |||
to a packet drop rate of 10^-3 for TCP. | a packet drop rate of 10^-3 for TCP. | |||
To specify the upper end of the HighSpeed response function, we | To specify the upper end of the HighSpeed response function, we | |||
specify the packet drop rate needed in the HighSpeed response | specify the packet drop rate needed in the HighSpeed response | |||
function to achieve an average congestion window of 83000 segments. | function to achieve an average congestion window of 83000 segments. | |||
This is roughly the window needed to sustain 10 Gbps throughput, for | This is roughly the window needed to sustain 10 Gbps throughput, for | |||
a TCP connection with the default packet size and round-trip time | a TCP connection with the default packet size and round-trip time | |||
used earlier in this document. For High_Window set to 83000, we | used earlier in this document. For High_Window set to 83000, we | |||
specify High_P of 10^-7; that is, with HighSpeed TCP a packet drop | specify High_P of 10^-7; that is, with HighSpeed TCP a packet drop | |||
rate of 10^-7 allows the HighSpeed TCP connection to achieve an | rate of 10^-7 allows the HighSpeed TCP connection to achieve an | |||
average congestion window of 83000 segments. We believe that this | average congestion window of 83000 segments. We believe that this | |||
loss rate sets an achievable target for high-speed environments, | loss rate sets an achievable target for high-speed environments, | |||
while still allowing acceptable fairness for the HighSpeed response | while still allowing acceptable fairness for the HighSpeed response | |||
function when competing with Standard TCP in environments with | function when competing with Standard TCP in environments with packet | |||
packet drop rates of 10^-4 or 10^-5. | drop rates of 10^-4 or 10^-5. | |||
For simplicity, for the HighSpeed response function we maintain the | For simplicity, for the HighSpeed response function we maintain the | |||
property that the response function gives a straight line on a log- | property that the response function gives a straight line on a log- | |||
log scale (as does the response function for Standard TCP, for low | log scale (as does the response function for Standard TCP, for low to | |||
to moderate congestion). This results in the following response | moderate congestion). This results in the following response | |||
function, for values of the average congestion window W greater than | function, for values of the average congestion window W greater than | |||
Low_Window: | Low_Window: | |||
W = (p/Low_P)^S Low_Window, | W = (p/Low_P)^S Low_Window, | |||
for Low_P the packet drop rate corresponding to Low_Window, and for S | ||||
for Low_P the packet drop rate corresponding to Low_Window, and for | as the following constant [FRS02]: | |||
S as the following constant [FRS02]: | ||||
S = (log High_Window - log Low_Window)/(log High_P - log Low_P). | S = (log High_Window - log Low_Window)/(log High_P - log Low_P). | |||
(In this document, "log x" refers to the log base 10.) For example, | (In this document, "log x" refers to the log base 10.) For example, for | |||
for Low_Window set to 38, we have Low_P of 10^-3 (for compatibility | Low_Window set to 38, we have Low_P of 10^-3 (for compatibility with | |||
with Standard TCP). Thus, for High_Window set to 83000 and High_P | Standard TCP). Thus, for High_Window set to 83000 and High_P set to | |||
set to 10^-7, we get the following response function: | 10^-7, we get the following response function: | |||
W = 0.12/p^0.835. (1) | W = 0.12/p^0.835. (1) | |||
This HighSpeed response function is illustrated in Table 3 below. | This HighSpeed response function is illustrated in Table 3 below. | |||
For HighSpeed TCP, the number of round-trip times between losses, | For HighSpeed TCP, the number of round-trip times between losses, | |||
1/(pW), equals 12.7 W^0.2, for W > 38 segments. | 1/(pW), equals 12.7 W^0.2, for W > 38 segments. | |||
Packet Drop Rate P Congestion Window W RTTs Between Losses | Packet Drop Rate P Congestion Window W RTTs Between Losses | |||
------------------ ------------------- ------------------- | ------------------ ------------------- ------------------- | |||
10^-2 12 8 | 10^-2 12 8 | |||
skipping to change at page 10, line 45 | skipping to change at page 8, line 42 | |||
Table 3: TCP Response Function for HighSpeed TCP. The average | Table 3: TCP Response Function for HighSpeed TCP. The average | |||
congestion window W in MSS-sized segments is given as a function of | congestion window W in MSS-sized segments is given as a function of | |||
the packet drop rate P. | the packet drop rate P. | |||
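
The HighSpeed response function can be reproduced from the four parameters given in this section. The following is a minimal sketch under stated assumptions: the constant and function names are invented for illustration, and the switch to the Standard TCP formula is expressed in terms of the drop rate (p at or above Low_P) rather than the window, which is an approximation of the rule stated in the text.

```python
import math

LOW_WINDOW = 38.0      # segments
LOW_P = 1e-3           # drop rate corresponding to Low_Window
HIGH_WINDOW = 83000.0  # segments
HIGH_P = 1e-7          # drop rate corresponding to High_Window

# Slope of the response function on a log-log scale (negative: W falls as p rises).
S = (math.log10(HIGH_WINDOW) - math.log10(LOW_WINDOW)) / (
    math.log10(HIGH_P) - math.log10(LOW_P))

def highspeed_window(p):
    """Average congestion window in segments for packet drop rate p."""
    if p >= LOW_P:
        # Assumed boundary handling: fall back to the Standard TCP response.
        return 1.2 / math.sqrt(p)
    return (p / LOW_P) ** S * LOW_WINDOW

# Coefficient c in W = c / p^(-S); it should be close to the 0.12 of Equation (1).
COEFFICIENT = LOW_WINDOW * LOW_P ** (-S)
```

With these parameters S comes out near -0.835 and COEFFICIENT near 0.12, which is how Equation (1) arises.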
We believe that the problem of backward compatibility with Standard | We believe that the problem of backward compatibility with Standard | |||
TCP requires a response function that is quite close to that of | TCP requires a response function that is quite close to that of | |||
Standard TCP for loss rates of 10^-1, 10^-2, or 10^-3. We believe, | Standard TCP for loss rates of 10^-1, 10^-2, or 10^-3. We believe, | |||
however, that such stringent TCP-compatibility is not required for | however, that such stringent TCP-compatibility is not required for | |||
smaller loss rates, and that an appropriate response function is one | smaller loss rates, and that an appropriate response function is one | |||
that gives a plausible packet drop rate for a connection throughput | that gives a plausible packet drop rate for a connection throughput | |||
of 10 Gbps. This also gives a slowly increasing number of round- | of 10 Gbps. This also gives a slowly increasing number of round-trip | |||
trip times between loss events as a function of a decreasing packet | times between loss events as a function of a decreasing packet drop | |||
drop rate. | rate. | |||
Another way to look at the HighSpeed response function is to | Another way to look at the HighSpeed response function is to consider | |||
consider that HighSpeed TCP is roughly emulating the congestion | that HighSpeed TCP is roughly emulating the congestion control | |||
control response of N parallel TCP connections, where N is initially | response of N parallel TCP connections, where N is initially one, and | |||
one, and where N increases as a function of the HighSpeed TCP's | where N increases as a function of the HighSpeed TCP's congestion | |||
congestion window. Thus for the HighSpeed response function in | window. Thus for the HighSpeed response function in Equation (1) | |||
Equation (1) above, the response function can be viewed as | above, the response function can be viewed as equivalent to that of | |||
equivalent to that of N(W) parallel TCP connections, where N(W) | N(W) parallel TCP connections, where N(W) varies as a function of the | |||
varies as a function of the congestion window W. Recall that for a | congestion window W. Recall that for a single standard TCP | |||
single standard TCP connection, the average congestion window equals | connection, the average congestion window equals 1.2/sqrt(p). For N | |||
1.2/sqrt(p). For N parallel TCP connections, the aggregate | parallel TCP connections, the aggregate congestion window for the N | |||
congestion window for the N connections equals N*1.2/sqrt(p). From | connections equals N*1.2/sqrt(p). From the HighSpeed response | |||
the HighSpeed response function in Equation (1) and the relationship | function in Equation (1) and the relationship above, we can derive | |||
above, we can derive the following: | the following: | |||
N(W) = 0.23*W^(0.4) | N(W) = 0.23*W^(0.4) | |||
for N(W) the number of parallel TCP connections emulated by the | for N(W) the number of parallel TCP connections emulated by the | |||
HighSpeed TCP response function, and for N(W) >= 1. This is shown | HighSpeed TCP response function, and for N(W) >= 1. This is shown in | |||
in Table 4 below. | Table 4 below. | |||
Congestion Window W Number N(W) of Parallel TCPs | Congestion Window W Number N(W) of Parallel TCPs | |||
------------------- ------------------------- | ------------------- ------------------------- | |||
1 1 | 1 1 | |||
10 1 | 10 1 | |||
100 1.4 | 100 1.4 | |||
1,000 3.6 | 1,000 3.6 | |||
10,000 9.2 | 10,000 9.2 | |||
100,000 23.0 | 100,000 23.0 | |||
Table 4: Number N(W) of parallel TCP connections roughly emulated by | Table 4: Number N(W) of parallel TCP connections roughly emulated by | |||
the HighSpeed TCP response function. | the HighSpeed TCP response function. | |||
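
The derivation of N(W) can be sketched numerically: invert Equation (1) to obtain the drop rate for a given window, then ask how many Standard TCP connections would be needed to reach the same aggregate window at that drop rate. The function name is hypothetical. Because the sketch uses the rounded constants 0.12 and 0.835 directly, its results agree with Table 4 only to within a few percent.

```python
import math

def parallel_tcps(w):
    """Approximate number of Standard TCP connections emulated at window w."""
    # Invert W = 0.12 / p^0.835 to find the drop rate at this window.
    p = (0.12 / w) ** (1.0 / 0.835)
    # N connections have aggregate window N * 1.2 / sqrt(p); solve for N.
    n = w * math.sqrt(p) / 1.2
    # The text restricts the result to N(W) >= 1.
    return max(1.0, n)

if __name__ == "__main__":
    for w in (1, 10, 100, 1000, 10000, 100000):
        print(f"{w:>7d}  {parallel_tcps(w):5.1f}")
```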
We do not in this document attempt to seriously evaluate the | In this document, we do not attempt to seriously evaluate the | |||
HighSpeed response function for congestion windows greater than | HighSpeed response function for congestion windows greater than | |||
100,000 packets. We believe that we will learn more about the | 100,000 packets. We believe that we will learn more about the | |||
requirements for sustaining the throughput of best-effort | requirements for sustaining the throughput of best-effort connections | |||
connections in that range as we gain more experience with HighSpeed | in that range as we gain more experience with HighSpeed TCP with | |||
TCP with congestion windows of thousands and tens of thousands of | congestion windows of thousands and tens of thousands of packets. | |||
packets. There also might be limitations to the per-connection | There also might be limitations to the per-connection throughput that | |||
throughput that can be realistically achieved for best-effort | can be realistically achieved for best-effort traffic, in terms of | |||
traffic, in terms of congestion window of hundreds of thousands of | congestion window of hundreds of thousands of packets or more, in the | |||
packets or more, in the absence of additional support or feedback | absence of additional support or feedback from the routers along the | |||
from the routers along the path. | path. | |||
6. Fairness Implications of the HighSpeed Response Function. | 6. Fairness Implications of the HighSpeed Response Function | |||
The Standard and HighSpeed Response Functions can be used directly | The Standard and HighSpeed Response Functions can be used directly to | |||
to infer the relative fairness between flows using the two response | infer the relative fairness between flows using the two response | |||
functions. For example, given a packet drop rate P, assume that | functions. For example, given a packet drop rate P, assume that | |||
Standard TCP has an average congestion window of W_Standard, and | Standard TCP has an average congestion window of W_Standard, and | |||
HighSpeed TCP has a higher average congestion window of W_HighSpeed. | HighSpeed TCP has a higher average congestion window of W_HighSpeed. | |||
In this case, a single HighSpeed TCP connection is receiving | In this case, a single HighSpeed TCP connection is receiving | |||
W_HighSpeed/W_Standard times the throughput of a single Standard TCP | W_HighSpeed/W_Standard times the throughput of a single Standard TCP | |||
connection competing in the same environment. | connection competing in the same environment. | |||
This relative fairness is illustrated below in Table 5, for the | This relative fairness is illustrated below in Table 5, for the | |||
parameters used for the HighSpeed response function in the section | parameters used for the HighSpeed response function in the section | |||
above. The second column gives the relative fairness, for the | above. The second column gives the relative fairness, for the | |||
steady-state packet drop rate specified in the first column. To | steady-state packet drop rate specified in the first column. To help | |||
help calibrate, the third column gives the aggregate average | calibrate, the third column gives the aggregate average congestion | |||
congestion window for the two TCP connections, and the fourth column | window for the two TCP connections, and the fourth column gives the | |||
gives the bandwidth that would be needed by the two connections to | bandwidth that would be needed by the two connections to achieve that | |||
achieve that aggregate window and packet drop rate, given 100 ms | aggregate window and packet drop rate, given 100 ms round-trip times | |||
round-trip times and 1500-byte packets. | and 1500-byte packets. | |||
Packet Drop Rate P Fairness Aggregate Window Bandwidth | Packet Drop Rate P Fairness Aggregate Window Bandwidth | |||
------------------ -------- ---------------- --------- | ------------------ -------- ---------------- --------- | |||
10^-2 1.0 24 2.8 Mbps | 10^-2 1.0 24 2.8 Mbps | |||
10^-3 1.0 76 9.1 Mbps | 10^-3 1.0 76 9.1 Mbps | |||
10^-4 2.2 383 45.9 Mbps | 10^-4 2.2 383 45.9 Mbps | |||
10^-5 4.7 2174 260.8 Mbps | 10^-5 4.7 2174 260.8 Mbps | |||
10^-6 10.2 13479 1.6 Gbps | 10^-6 10.2 13479 1.6 Gbps | |||
10^-7 22.1 87776 10.5 Gbps | 10^-7 22.1 87776 10.5 Gbps | |||
Table 5: Relative Fairness between the HighSpeed and Standard | Table 5: Relative Fairness between the HighSpeed and Standard | |||
Response Functions. | Response Functions. | |||
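
The columns of Table 5 follow from the two response functions. A minimal sketch, assuming the rounded form of Equation (1) and the 100 ms round-trip time and 1500-byte packets stated above; the function name and return layout are illustrative.

```python
import math

def fairness_row(p, rtt_seconds=0.1, packet_bytes=1500):
    """Return (fairness, aggregate window, bandwidth in bits/s) for drop rate p."""
    w_standard = 1.2 / math.sqrt(p)
    # HighSpeed TCP never does worse than Standard TCP at the same drop rate.
    w_highspeed = max(w_standard, 0.12 / p ** 0.835)
    aggregate = w_standard + w_highspeed
    # One aggregate window of packets is sent per round-trip time.
    bandwidth = aggregate * packet_bytes * 8 / rtt_seconds
    return w_highspeed / w_standard, aggregate, bandwidth

if __name__ == "__main__":
    for exponent in range(2, 8):
        ratio, window, bps = fairness_row(10.0 ** -exponent)
        print(f"10^-{exponent}  {ratio:5.1f}  {window:8.0f}  {bps / 1e6:10.1f} Mbps")
```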
Thus, for packet drop rates of 10^-4, a flow with the HighSpeed | Thus, for packet drop rates of 10^-4, a flow with the HighSpeed | |||
response function can expect to receive 2.2 times the throughput of | response function can expect to receive 2.2 times the throughput of a | |||
a flow using the Standard response function, given the same round- | flow using the Standard response function, given the same round-trip | |||
trip times and packet sizes. With packet drop rates of 10^-6 (or | times and packet sizes. With packet drop rates of 10^-6 (or 10^-7), | |||
10^-7), the unfairness is more severe, and we have entered the | the unfairness is more severe, and we have entered the regime where a | |||
regime where a Standard TCP connection requires at most one | Standard TCP connection requires at most one congestion event every | |||
congestion event every 800 (or 2530) round-trip times in order to | 800 (or 2530) round-trip times in order to make use of the available | |||
make use of the available bandwidth. Our judgement would be that | bandwidth. Our judgement would be that there are not a lot of TCP | |||
there are not a lot of TCP connections effectively operating in this | connections effectively operating in this regime today, with | |||
regime today, with congestion windows of thousands of packets, and | congestion windows of thousands of packets, and that therefore the | |||
that therefore the benefits of the HighSpeed response function would | benefits of the HighSpeed response function would outweigh the | |||
outweigh the unfairness that would be experienced by Standard TCP in | unfairness that would be experienced by Standard TCP in this regime. | |||
this regime. However, one purpose of this document is to solicit | However, one purpose of this document is to solicit feedback on this | |||
feedback on this issue. The parameter Low_Window determines | issue. The parameter Low_Window determines directly the point of | |||
directly the point of divergence between the Standard and HighSpeed | divergence between the Standard and HighSpeed Response Functions. | |||
Response Functions. | ||||
The third column of Table 5, the Aggregate Window, gives the | The third column of Table 5, the Aggregate Window, gives the | |||
aggregate congestion window of the two competing TCP connections, | aggregate congestion window of the two competing TCP connections, | |||
with HighSpeed and Standard TCP, given the packet drop rate | with HighSpeed and Standard TCP, given the packet drop rate specified | |||
specified in the first column. From Table 5, a HighSpeed TCP | in the first column. From Table 5, a HighSpeed TCP connection would | |||
connection would receive ten times the bandwidth of a Standard TCP | receive ten times the bandwidth of a Standard TCP in an environment | |||
in an environment with a packet drop rate of 10^-6. This would | with a packet drop rate of 10^-6. This would occur when the two | |||
occur when the two flows sharing a single pipe achieved an aggregate | flows sharing a single pipe achieved an aggregate window of 13479 | |||
window of 13479 packets. Given a round-trip time of 100 ms and a | packets. Given a round-trip time of 100 ms and a packet size of 1500 | |||
packet size of 1500 bytes, this would occur with an available | bytes, this would occur with an available bandwidth for the two | |||
bandwidth for the two competing flows of 1.6 Gbps. | competing flows of 1.6 Gbps. | |||
Next we consider the time that it takes a standard or HighSpeed TCP | Next we consider the time that it takes a standard or HighSpeed TCP | |||
flow to converge to fairness against a pre-existing HighSpeed TCP | flow to converge to fairness against a pre-existing HighSpeed TCP | |||
flow. The worst case for convergence to fairness occurs when a new | flow. The worst case for convergence to fairness occurs when a new | |||
flow is starting up, competing against a high-bandwidth existing | flow is starting up, competing against a high-bandwidth existing | |||
flow, and the new flow suffers a packet drop and exits slow-start | flow, and the new flow suffers a packet drop and exits slow-start | |||
while its window is still small. In the worst case, consider that | while its window is still small. In the worst case, consider that | |||
the new flow has entered the congestion avoidance phase while its | the new flow has entered the congestion avoidance phase while its | |||
window is only one packet. A standard TCP flow in congestion | window is only one packet. A standard TCP flow in congestion | |||
avoidance increases its window by at most one packet per round-trip | avoidance increases its window by at most one packet per round-trip | |||
time, and after N round-trip times has only achieved a window of N | time, and after N round-trip times has only achieved a window of N | |||
packets (when starting with a window of 1 in the first round-trip | packets (when starting with a window of 1 in the first round-trip | |||
time). In contrast, a HighSpeed TCP flow increases much faster | time). In contrast, a HighSpeed TCP flow increases much faster than | |||
than a standard TCP flow while in the congestion avoidance phase, | a standard TCP flow while in the congestion avoidance phase, and we | |||
and we can expect its convergence to fairness to be much better. | can expect its convergence to fairness to be much better. This is | |||
This is shown in Table 6 below. The script used to generate this | shown in Table 6 below. The script used to generate this table is | |||
table is given in Appendix C. | given in Appendix C. | |||
RTT HS_Window Standard_TCP_Window | RTT HS_Window Standard_TCP_Window | |||
--- --------- ------------------- | --- --------- ------------------- | |||
100 131 100 | 100 131 100 | |||
200 475 200 | 200 475 200 | |||
300 1131 300 | 300 1131 300 | |||
400 2160 400 | 400 2160 400 | |||
500 3601 500 | 500 3601 500 | |||
600 5477 600 | 600 5477 600 | |||
700 7799 700 | 700 7799 700 | |||
skipping to change at page 14, line 30 | skipping to change at page 11, line 51 | |||
1400 35856 1400 | 1400 35856 1400 | |||
1500 41336 1500 | 1500 41336 1500 | |||
1600 47115 1600 | 1600 47115 1600 | |||
1700 53170 1700 | 1700 53170 1700 | |||
1800 59477 1800 | 1800 59477 1800 | |||
1900 66013 1900 | 1900 66013 1900 | |||
2000 72754 2000 | 2000 72754 2000 | |||
Table 6: For a HighSpeed and a Standard TCP connection, the | Table 6: For a HighSpeed and a Standard TCP connection, the | |||
congestion window during congestion avoidance phase (starting with a | congestion window during congestion avoidance phase (starting with a | |||
congestion window of 1 packet during RTT 1. | congestion window of 1 packet during RTT 1). | |||
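
The growth shown in Table 6 can be approximated with a short simulation. This is only a sketch and not the Appendix C script: it assumes the window grows by a(w) segments once per round-trip time, uses the a(w) and b(w) definitions from Section 7, and assumes p(w) is the inverse of the response function W = (p/Low_P)^S Low_Window. All names are illustrative, and the numbers it produces need not match the table exactly.

```python
import math

LOW_WINDOW, HIGH_WINDOW = 38.0, 83000.0
LOW_P, HIGH_P = 1e-3, 1e-7
HIGH_DECREASE = 0.1
LOG_SPAN = math.log10(HIGH_WINDOW) - math.log10(LOW_WINDOW)
S = LOG_SPAN / (math.log10(HIGH_P) - math.log10(LOW_P))

def b(w):
    """Decrease parameter: 1/2 up to Low_Window, then linear in log(w)."""
    if w <= LOW_WINDOW:
        return 0.5
    fraction = (math.log10(w) - math.log10(LOW_WINDOW)) / LOG_SPAN
    return (HIGH_DECREASE - 0.5) * fraction + 0.5

def drop_rate(w):
    """Assumed p(w): the response function solved for p."""
    return LOW_P * (w / LOW_WINDOW) ** (1.0 / S)

def a(w):
    """Increase parameter in segments per round-trip time."""
    if w <= LOW_WINDOW:
        return 1.0
    return w * w * drop_rate(w) * 2.0 * b(w) / (2.0 - b(w))

def simulate(last_rtt=2000):
    """Map each RTT number to (HighSpeed window, Standard TCP window)."""
    highspeed = standard = 1.0  # both flows start at one packet in RTT 1
    windows = {1: (highspeed, standard)}
    for rtt in range(2, last_rtt + 1):
        highspeed += a(highspeed)  # HighSpeed TCP: a(w) segments per RTT
        standard += 1.0            # Standard TCP: one segment per RTT
        windows[rtt] = (highspeed, standard)
    return windows
```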
The classic paper on relative fairness is from Chiu and Jain [CJ89]. | The classic paper on relative fairness is from Chiu and Jain [CJ89]. | |||
This paper shows that AIMD (Additive Increase Multiplicative | This paper shows that AIMD (Additive Increase Multiplicative | |||
Decrease) converges to fairness in an environment with synchronized | Decrease) converges to fairness in an environment with synchronized | |||
congestion events. From [CJ89], it is easy to see that MIMD and | congestion events. From [CJ89], it is easy to see that MIMD and AIAD | |||
AIAD do not converge to fairness in this environment. However, the | do not converge to fairness in this environment. However, the | |||
results of [CJ89] do not apply to an asynchronous environment such | results of [CJ89] do not apply to an asynchronous environment such as | |||
as that of the current Internet, where the frequency of congestion | that of the current Internet, where the frequency of congestion | |||
feedback can be different for different flows. For example, it has | feedback can be different for different flows. For example, it has | |||
been shown that MIMD converges to fair states in a model with | been shown that MIMD converges to fair states in a model with | |||
proportional instead of synchronous feedback in terms of packet | proportional instead of synchronous feedback in terms of packet drops | |||
drops [GV02]. Thus, we are not concerned about abandoning a strict | [GV02]. Thus, we are not concerned about abandoning a strict model | |||
model of AIMD for HighSpeed TCP. | of AIMD for HighSpeed TCP. However, we note that in an environment | |||
with Drop-Tail queue management, there is likely to be some | ||||
synchronization of packet drops. In this environment, the model of | ||||
completely synchronous feedback does not hold, but the model of | ||||
completely asynchronous feedback is not accurate either. Fairness in | ||||
Drop-Tail environments is discussed in more detail in Sections 9 and | ||||
12. | ||||
7. Translating the HighSpeed Response Function into Congestion Control | 7. Translating the HighSpeed Response Function into Congestion Control | |||
Parameters. | Parameters | |||
For equation-based congestion control such as TFRC, the HighSpeed | For equation-based congestion control such as TFRC, the HighSpeed | |||
Response Function above could be used directly by the TFRC | Response Function above could be used directly by the TFRC congestion | |||
congestion control mechanism. However, for TCP the HighSpeed | control mechanism. However, for TCP the HighSpeed response function | |||
response function has to be translated into additive increase and | has to be translated into additive increase and multiplicative | |||
multiplicative decrease parameters. The HighSpeed response function | decrease parameters. The HighSpeed response function cannot be | |||
cannot be achieved by TCP with an additive increase of one segment | achieved by TCP with an additive increase of one segment per round- | |||
per round-trip time and a multiplicative decrease of halving the | trip time and a multiplicative decrease of halving the current | |||
current congestion window; HighSpeed TCP will have to modify either | congestion window; HighSpeed TCP will have to modify either the | |||
the increase or the decrease parameter, or both. We have concluded | increase or the decrease parameter, or both. We have concluded that | |||
that HighSpeed TCP is most likely to achieve an acceptable | HighSpeed TCP is most likely to achieve an acceptable compromise | |||
compromise between moderate increases and timely decreases by | between moderate increases and timely decreases by modifying both the | |||
modifying both the increase and the decrease parameter. | increase and the decrease parameter. | |||
That is, for HighSpeed TCP let the congestion window increase by | That is, for HighSpeed TCP let the congestion window increase by a(w) | |||
a(w) segments per round-trip time in the absence of congestion, and | segments per round-trip time in the absence of congestion, and let | |||
let the congestion window decrease to w(1-b(w)) segments in response | the congestion window decrease to w(1-b(w)) segments in response to a | |||
to a round-trip time with one or more loss events. Thus, in | round-trip time with one or more loss events. Thus, in response to a | |||
response to a single acknowledgement HighSpeed TCP increases its | single acknowledgement HighSpeed TCP increases its congestion window | |||
congestion window in segments as follows: | in segments as follows: | |||
w <- w + a(w)/w. | w <- w + a(w)/w. | |||
In response to a congestion event, HighSpeed TCP decreases as | In response to a congestion event, HighSpeed TCP decreases as | |||
follows: | follows: | |||
w <- (1-b(w))w. | w <- (1-b(w))w. | |||
For Standard TCP, a(w) = 1 and b(w) = 1/2, regardless of the value | For Standard TCP, a(w) = 1 and b(w) = 1/2, regardless of the value of | |||
of w. HighSpeed TCP uses the same values of a(w) and b(w) for w <= | w. HighSpeed TCP uses the same values of a(w) and b(w) for w <= | |||
Low_Window. This section specifies a(w) and b(w) for HighSpeed TCP | Low_Window. This section specifies a(w) and b(w) for HighSpeed TCP | |||
for larger values of w. | for larger values of w. | |||
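
The two update rules can be written down directly. In this sketch the increase and decrease parameters are passed in as functions of w; the defaults are the Standard TCP values a(w) = 1 and b(w) = 1/2 stated above. The function names are hypothetical.

```python
def on_ack(w, a=lambda w: 1.0):
    """Window after one acknowledgement: w <- w + a(w)/w."""
    return w + a(w) / w

def on_congestion(w, b=lambda w: 0.5):
    """Window after a congestion event: w <- (1 - b(w)) * w."""
    return (1.0 - b(w)) * w

if __name__ == "__main__":
    w = 10.0
    # Roughly one round-trip time of acknowledgements at a window of 10.
    for _ in range(10):
        w = on_ack(w)
    print(f"after one RTT of ACKs: {w:.3f}")
    print(f"after a congestion event: {on_congestion(w):.3f}")
```

Applying on_ack once per acknowledgement adds about a(w) segments over a full round-trip time, which is why the per-ACK increment is divided by w.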
For w = High_Window, we have specified a loss rate of High_P. From | For w = High_Window, we have specified a loss rate of High_P. From | |||
[FRS02], or from elementary calculations, this requires the | [FRS02], or from elementary calculations, this requires the following | |||
following relationship between a(w) and b(w) for w = High_Window: | relationship between a(w) and b(w) for w = High_Window: | |||
a(w) = High_Window^2 * High_P * 2 * b(w)/(2-b(w). (2) | a(w) = High_Window^2 * High_P * 2 * b(w)/(2-b(w)). (2) | |||
We use the parameter High_Decrease to specify the decrease parameter | We use the parameter High_Decrease to specify the decrease parameter | |||
b(w) for w = High_Window, and use Equation (2) to derive the | b(w) for w = High_Window, and use Equation (2) to derive the increase | |||
increase parameter a(w) for w = High_Window. Along with High_P = | parameter a(w) for w = High_Window. Along with High_P = 10^-7 and | |||
10^-7 and High_Window = 83000, for example, we specify High_Decrease | High_Window = 83000, for example, we specify High_Decrease = 0.1, | |||
= 0.1, specifying that b(83000) = 0.1, giving a decrease of 10% | specifying that b(83000) = 0.1, giving a decrease of 10% after a | |||
after a congestion event. Equation (2) then gives a(83000) = 72, | congestion event. Equation (2) then gives a(83000) = 72, for an | |||
for an increase of 72 segments, or just under 0.1%, within a round- | increase of 72 segments, or just under 0.1%, within a round-trip | |||
trip time, for w = 83000. | time, for w = 83000. | |||
This moderate decrease strikes us as acceptable, particularly when | This moderate decrease strikes us as acceptable, particularly when | |||
coupled with the role of TCP's ACK-clocking in limiting the sending | coupled with the role of TCP's ACK-clocking in limiting the sending | |||
rate in response to more severe congestion [BBFS01]. A more severe | rate in response to more severe congestion [BBFS01]. A more severe | |||
decrease would require a more aggressive increase in the congestion | decrease would require a more aggressive increase in the congestion | |||
window for a round-trip time without congestion. In particular, a | window for a round-trip time without congestion. In particular, a | |||
decrease factor High_Decrease of 0.5, as in Standard TCP, would | decrease factor High_Decrease of 0.5, as in Standard TCP, would | |||
require an increase of 459 segments per round-trip time when w = | require an increase of 459 segments per round-trip time when w = | |||
83000. | 83000. | |||
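
The two increase values quoted above can be checked against Equation (2). A small sketch; the function name is illustrative.

```python
def increase_for(window, drop_rate, decrease):
    """Equation (2): increase in segments per round-trip time."""
    return window ** 2 * drop_rate * 2.0 * decrease / (2.0 - decrease)

if __name__ == "__main__":
    # High_Decrease = 0.1, the value specified in this document.
    print(increase_for(83000, 1e-7, 0.1))
    # A decrease factor of 0.5, as in Standard TCP.
    print(increase_for(83000, 1e-7, 0.5))
```

The first call evaluates to about 72.5 and the second to about 459.3, consistent with the figures of 72 and 459 segments in the text.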
Given decrease parameters of b(w) = 1/2 for w = Low_Window, and b(w) | Given decrease parameters of b(w) = 1/2 for w = Low_Window, and b(w) | |||
= High_Decrease for w = High_Window, we are left to specify the | = High_Decrease for w = High_Window, we are left to specify the value | |||
value of b(w) for other values of w > Low_Window. From [FRS02], we | of b(w) for other values of w > Low_Window. From [FRS02], we let | |||
let b(w) vary linearly as the log of w, as follows: | b(w) vary linearly as the log of w, as follows: | |||
b(w) = (High_Decrease - 0.5) (log(w)-log(W)) / (log(W_1)-log(W)) + | b(w) = (High_Decrease - 0.5) (log(w)-log(W)) / (log(W_1)-log(W)) + | |||
0.5, | 0.5, | |||
for W = Low_Window and W_1 = High_Window. The increase parameter | for W = Low_Window and W_1 = High_Window. The increase parameter | |||
a(w) can then be computed as follows: | a(w) can then be computed as follows: | |||
a(w) = w^2 * p(w) * 2 * b(w)/(2-b(w)), | a(w) = w^2 * p(w) * 2 * b(w)/(2-b(w)), | |||
for p(w) the packet drop rate for congestion window w. From | for p(w) the packet drop rate for congestion window w. From | |||
skipping to change at page 16, line 36 | skipping to change at page 14, line 15 | |||
We assume that experimental implementations of HighSpeed TCP for | We assume that experimental implementations of HighSpeed TCP for | |||
further investigation will use a pre-computed look-up table for | further investigation will use a pre-computed look-up table for | |||
finding a(w) and b(w). For example, the implementation from Tom | finding a(w) and b(w). For example, the implementation from Tom | |||
Dunigan adjusts the a(w) and b(w) parameters every 0.1 seconds. In | Dunigan adjusts the a(w) and b(w) parameters every 0.1 seconds. In | |||
the appendix we give such a table for our default values of | the appendix we give such a table for our default values of | |||
Low_Window = 38, High_Window = 83,000, High_P = 10^-7, and | Low_Window = 38, High_Window = 83,000, High_P = 10^-7, and | |||
High_Decrease = 0.1. These are also the default values in the NS | High_Decrease = 0.1. These are also the default values in the NS | |||
simulator; example simulations in NS can be run with the command | simulator; example simulations in NS can be run with the command | |||
"./test-all-tcpHighspeed" in the directory tcl/test. | "./test-all-tcpHighspeed" in the directory tcl/test. | |||
8. An alternate, linear response function. | 8. An alternate, linear response function | |||
In this section we explore an alternate, linear response function | In this section we explore an alternate, linear response function for | |||
for HighSpeed TCP that has been proposed by a number of other | HighSpeed TCP that has been proposed by a number of other people, in | |||
people, in particular by Glenn Vinnicombe and Tom Kelly. Similarly, | particular by Glenn Vinnicombe and Tom Kelly. Similarly, it has been | |||
it has been suggested by others that a less "ad-hoc" guideline for a | suggested by others that a less "ad-hoc" guideline for a response | |||
response function for HighSpeed TCP would be to specify a constant | function for HighSpeed TCP would be to specify a constant value for | |||
value for the number of round-trip times between congestion events. | the number of round-trip times between congestion events. | |||
Assume that we keep the value of Low_Window as 38 MSS-sized | Assume that we keep the value of Low_Window as 38 MSS-sized segments, | |||
segments, indicating when the HighSpeed response function diverges | indicating when the HighSpeed response function diverges from the | |||
from the current TCP response function, but that we modify the | current TCP response function, but that we modify the High_Window and | |||
High_Window and High_P parameters that specify the upper range of | High_P parameters that specify the upper range of the HighSpeed | |||
the HighSpeed response function. In particular, consider the | response function. In particular, consider the response function | |||
response function given by High_Window = 380,000 and High_P = 10^-7, | given by High_Window = 380,000 and High_P = 10^-7, with Low_Window = | |||
with Low_Window = 38 and Low_P = 10^-3 as before. | 38 and Low_P = 10^-3 as before. | |||
Using the equations in Section 5, this would give the following | Using the equations in Section 5, this would give the following | |||
Linear response function, for w > Low_Window: | Linear response function, for w > Low_Window: | |||
W = 0.038/p. | W = 0.038/p. | |||
This Linear HighSpeed response function is illustrated in Table 7 | This Linear HighSpeed response function is illustrated in Table 7 | |||
below. For HighSpeed TCP, the number of round-trip times between | below. For HighSpeed TCP, the number of round-trip times between | |||
losses, 1/(pW), equals 1/0.038, or roughly 26, for W > 38 | losses, 1/(pW), equals 1/0.038, or roughly 26, for W > 38 | |||
segments. | segments. | |||
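As a check on the arithmetic above, the following Python sketch derives the Linear response function from the four parameters. It assumes the log-log interpolation of Section 5, W = (p/Low_P)^S * Low_Window with S = (log High_Window - log Low_Window)/(log High_P - log Low_P); the function names are illustrative only.

```python
import math

# Parameters of the Linear HighSpeed response function discussed above.
LOW_WINDOW, LOW_P = 38, 1e-3
HIGH_WINDOW, HIGH_P = 380_000, 1e-7

def response_window(p):
    """Window W in segments for drop rate p (meaningful where W > LOW_WINDOW)."""
    # Slope of the response function on a log-log scale; about -1 here.
    s = (math.log10(HIGH_WINDOW) - math.log10(LOW_WINDOW)) / (
        math.log10(HIGH_P) - math.log10(LOW_P))
    return LOW_WINDOW * (p / LOW_P) ** s

def rtts_between_losses(p):
    """Round-trip times between losses, 1/(pW)."""
    return 1.0 / (p * response_window(p))
```

With these parameters the slope is -1, so response_window(p) reduces to 0.038/p and rtts_between_losses(p) is the same (about 26.3) for every p in range.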
skipping to change at page 18, line 23 | skipping to change at page 15, line 51 | |||
Table 8: Relative Fairness between the Linear HighSpeed and Standard | Table 8: Relative Fairness between the Linear HighSpeed and Standard | |||
Response Functions. | Response Functions. | |||
One attraction of the linear response function is that it is scale- | One attraction of the linear response function is that it is scale- | |||
invariant, with a fixed increase in the congestion window per | invariant, with a fixed increase in the congestion window per | |||
acknowledgement, and a fixed number of round-trip times between loss | acknowledgement, and a fixed number of round-trip times between loss | |||
events. My own assumption would be that having a fixed length for | events. My own assumption would be that having a fixed length for | |||
the congestion epoch in round-trip times, regardless of the packet | the congestion epoch in round-trip times, regardless of the packet | |||
drop rate, would be a poor fit for an imprecise and imperfect world | drop rate, would be a poor fit for an imprecise and imperfect world | |||
with routers with a range of queue management mechanisms, such as | with routers with a range of queue management mechanisms, such as the | |||
the Drop-Tail queue management that is common today. For example, a | Drop-Tail queue management that is common today. For example, a | |||
response function with a fixed length for the congestion epoch in | response function with a fixed length for the congestion epoch in | |||
round-trip times might give less clearly-differentiated feedback in | round-trip times might give less clearly-differentiated feedback in | |||
an environment with steady-state background losses at fixed | an environment with steady-state background losses at fixed intervals | |||
intervals for all flows (as might occur with a wireless link with | for all flows (as might occur with a wireless link with occasional | |||
occasional short error bursts, giving losses for all flows every N | short error bursts, giving losses for all flows every N seconds | |||
seconds regardless of their sending rate). | regardless of their sending rate). | |||
While it is not a goal to have perfect fairness in an environment | While it is not a goal to have perfect fairness in an environment | |||
with synchronized losses, it would be good to have moderately | with synchronized losses, it would be good to have moderately | |||
acceptable performance in this regime. This goal might argue | acceptable performance in this regime. This goal might argue against | |||
against a response function with a constant number of round-trip | a response function with a constant number of round-trip times | |||
times between congestion events. However, this is a question that | between congestion events. However, this is a question that could | |||
could clearly use additional research and investigation. In | clearly use additional research and investigation. In addition, | |||
addition, flows with different round-trip times would have different | flows with different round-trip times would have different time | |||
time durations for congestion epochs even in the model with a linear | durations for congestion epochs even in the model with a linear | |||
response function. | response function. | |||
The third column of Table 8, the Aggregate Window, gives the | The third column of Table 8, the Aggregate Window, gives the | |||
aggregate congestion window of two competing TCP connections, one | aggregate congestion window of two competing TCP connections, one | |||
with Linear HighSpeed TCP and one with Standard TCP, given the | with Linear HighSpeed TCP and one with Standard TCP, given the packet | |||
packet drop rate specified in the first column. From Table 8, a | drop rate specified in the first column. From Table 8, a Linear | |||
Linear HighSpeed TCP connection would receive fifteen times the | HighSpeed TCP connection would receive fifteen times the bandwidth of | |||
bandwidth of a Standard TCP in an environment with a packet drop | a Standard TCP in an environment with a packet drop rate of 10^-5. | |||
rate of 10^-5. This would occur when the two flows sharing a single | This would occur when the two flows sharing a single pipe achieved an | |||
pipe achieved an aggregate window of 4179 packets. Given a round- | aggregate window of 4179 packets. Given a round-trip time of 100 ms | |||
trip time of 100 ms and a packet size of 1500 bytes, this would | and a packet size of 1500 bytes, this would occur with an available | |||
occur with an available bandwidth for the two competing flows of 501 | bandwidth for the two competing flows of 501 Mbps. Thus, because the | |||
Mbps. Thus, because the Linear HighSpeed TCP is more aggressive | Linear HighSpeed TCP is more aggressive than the HighSpeed TCP | |||
than the HighSpeed TCP proposed above, it also is less fair when | proposed above, it also is less fair when competing with Standard TCP | |||
competing with Standard TCP in a high-bandwidth environment. | in a high-bandwidth environment. | |||
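The aggregate window and bandwidth quoted above can be reproduced with a short Python sketch. It assumes the Standard TCP response function w = 1.2/sqrt(p) used earlier in this document and the Linear function W = 0.038/p from this section; the helper names are illustrative.

```python
import math

def linear_highspeed_window(p):
    """Linear HighSpeed response function from this section."""
    return 0.038 / p

def standard_tcp_window(p):
    """Assumed Standard TCP response function, w = 1.2/sqrt(p)."""
    return 1.2 / math.sqrt(p)

def window_to_mbps(window_pkts, rtt_s=0.1, pkt_bytes=1500):
    """Throughput in Mbps when window_pkts packets are sent per round-trip time."""
    return window_pkts * pkt_bytes * 8 / rtt_s / 1e6

# Two competing flows seeing the same packet drop rate of 10^-5.
aggregate = linear_highspeed_window(1e-5) + standard_tcp_window(1e-5)
bandwidth = window_to_mbps(int(aggregate))
```

Here int(aggregate) is the aggregate window in packets, and bandwidth is the corresponding rate for a 100 ms round-trip time and 1500-byte packets.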
9. Tradeoffs for Choosing Congestion Control Parameters. | 9. Tradeoffs for Choosing Congestion Control Parameters | |||
A range of metrics can be used for evaluating choices for congestion | A range of metrics can be used for evaluating choices for congestion | |||
control parameters for HighSpeed TCP. My assumption in this section | control parameters for HighSpeed TCP. My assumption in this section | |||
is that for a response function of the form w = c/p^d, for constant | is that for a response function of the form w = c/p^d, for constant c | |||
c and exponent d, the only response functions that would be | and exponent d, the only response functions that would be considered | |||
considered are response functions with 1/2 <= d <= 1. The two ends | are response functions with 1/2 <= d <= 1. The two ends of this | |||
of this spectrum are represented by current TCP, with d = 1/2, and | spectrum are represented by current TCP, with d = 1/2, and by the | |||
by the linear response function described in Section 8 above, with d | linear response function described in Section 8 above, with d = 1. | |||
= 1. HighSpeed TCP lies somewhere in the middle of the spectrum, | HighSpeed TCP lies somewhere in the middle of the spectrum, with d = | |||
with d = 0.835. | 0.835. | |||
Response functions with exponents less than 1/2 can be eliminated | Response functions with exponents less than 1/2 can be eliminated | |||
from consideration because they would be even worse than standard | from consideration because they would be even worse than standard TCP | |||
TCP in accomodating connections with high congestion windows. | in accommodating connections with high congestion windows. | |||
9.1. The Number of Round-Trip Times between Loss Events. | 9.1. The Number of Round-Trip Times between Loss Events | |||
Response functions with exponents greater than 1 can be eliminated | Response functions with exponents greater than 1 can be eliminated | |||
from consideration because for these response functions, the number | from consideration because for these response functions, the number | |||
of round-trip times between loss events decreases as congestion | of round-trip times between loss events decreases as congestion | |||
decreases. For a response function of w = c/p^d, with one loss | decreases. For a response function of w = c/p^d, with one loss event | |||
event or congestion event every 1/p packets, the number of round- | or congestion event every 1/p packets, the number of round-trip times | |||
trip times between loss events is w^((1/d)-1)/c^(1/d). Thus, for | between loss events is w^((1/d)-1)/c^(1/d). Thus, for standard TCP | |||
standard TCP the number of round-trip times between loss events is | the number of round-trip times between loss events is linear in w. | |||
linear in w. In contrast, one attraction of the linear response | In contrast, one attraction of the linear response function, as | |||
function, as described in Section 8 above, is that it is scale- | described in Section 8 above, is that it is scale-invariant, in terms | |||
invariant, in terms of a fixed increase in the congestion window per | of a fixed increase in the congestion window per acknowledgement, and | |||
acknowledgement, and a fixed number of round-trip times between loss | a fixed number of round-trip times between loss events. | |||
events. | ||||
However, for a response function with d > 1, the number of round- | However, for a response function with d > 1, the number of round- | |||
trip times between loss events would be proportional to w^((1/d)-1), | trip times between loss events would be proportional to w^((1/d)-1), | |||
for a negative exponent ((1/d)-1), getting smaller as w increases. | for a negative exponent ((1/d)-1), getting smaller as w increases. | |||
This would seem undesirable. | This would seem undesirable. | |||
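The expression above follows from p = (c/w)^(1/d): a loss event occurs every 1/p packets, and w packets are sent per round-trip time, so the number of round-trip times between loss events is (1/p)/w. A minimal Python sketch of the three regimes follows; the constant c = 1.2 for Standard TCP is an assumption taken from the response function w = 1.2/sqrt(p), and the value d = 1.25 is purely illustrative.

```python
def rtts_between_losses(w, c, d):
    """RTTs between loss events for the response function w = c / p**d."""
    return w ** (1.0 / d - 1.0) / c ** (1.0 / d)

# d = 1/2 (Standard TCP): grows linearly with w.
standard = [rtts_between_losses(w, 1.2, 0.5) for w in (100, 200, 400)]

# d = 1 (linear response function of Section 8): independent of w.
linear = [rtts_between_losses(w, 0.038, 1.0) for w in (100, 10_000)]

# d > 1 (hypothetical): shrinks as w grows.
steep = [rtts_between_losses(w, 0.038, 1.25) for w in (100, 10_000)]
```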
9.2. The Number of Packet Drops per Loss Event, with Drop-Tail. | 9.2. The Number of Packet Drops per Loss Event, with Drop-Tail | |||
A TCP connection increases its sending rate by a(w) packets per | A TCP connection increases its sending rate by a(w) packets per | |||
round-trip time, and in a Drop-Tail environment, this is likely to | round-trip time, and in a Drop-Tail environment, this is likely to | |||
result in a(w) dropped packets during a single loss event. One | result in a(w) dropped packets during a single loss event. One | |||
attraction of standard TCP is that it has a fixed increase per | attraction of standard TCP is that it has a fixed increase per | |||
round-trip time of one packet, minimizing the number of packets that | round-trip time of one packet, minimizing the number of packets that | |||
would be dropped in a Drop-Tail environment. For an environment | would be dropped in a Drop-Tail environment. For an environment with | |||
with some form of Active Queue Management, and in particular for an | some form of Active Queue Management, and in particular for an | |||
environment that uses ECN, the number of packets dropped in a single | environment that uses ECN, the number of packets dropped in a single | |||
congestion event would not be a problem. However, even in these | congestion event would not be a problem. However, even in these | |||
environments, larger increases in the sending rate per round-trip | environments, larger increases in the sending rate per round-trip | |||
time result in larger stresses on the ability of the queues in the | time result in larger stresses on the ability of the queues in the | |||
router to absorb the fluctuations. | router to absorb the fluctuations. | |||
HighSpeed TCP occupies a middle ground between the metrics of a | HighSpeed TCP occupies a middle ground between the metrics of a moderate | |||
moderate number of round-trip times between loss events, and a | number of round-trip times between loss events, and a moderate | |||
moderate increase in the sending rate per round-trip time. As shown | increase in the sending rate per round-trip time. As shown in | |||
in Appendix B, for a congestion window of 83,000 packets, HighSpeed | Appendix B, for a congestion window of 83,000 packets, HighSpeed TCP | |||
TCP increases its sending rate by 70 packets per round-trip time, | increases its sending rate by 70 packets per round-trip time, | |||
resulting in at most 70 packet drops when the buffer overflows in a | resulting in at most 70 packet drops when the buffer overflows in a | |||
Drop-Tail environment. This increased aggressiveness is the price | Drop-Tail environment. This increased aggressiveness is the price | |||
paid by HighSpeed TCP for its increased scalability. A large number | paid by HighSpeed TCP for its increased scalability. A large number | |||
of packets dropped per congestion event could result in synchronized | of packets dropped per congestion event could result in synchronized | |||
drops from multiple flows, with a possible loss of throughput as a | drops from multiple flows, with a possible loss of throughput as a | |||
result. | result. | |||
Scalable TCP has an increase a(w) of 0.005 w packets per round-trip | Scalable TCP has an increase a(w) of 0.005 w packets per round-trip | |||
time. For a congestion window of 83,000 packets, this gives an | time. For a congestion window of 83,000 packets, this gives an | |||
increase of 415 packets per round-trip time, resulting in roughly | increase of 415 packets per round-trip time, resulting in roughly 415 | |||
415 packet drops per congestion event in a Drop-Tail environment. | packet drops per congestion event in a Drop-Tail environment. | |||
Thus, HighSpeed TCP and its variants place increased demands on | Thus, HighSpeed TCP and its variants place increased demands on queue | |||
queue management in routers, relative to Standard TCP. (This is | management in routers, relative to Standard TCP. (This is rather | |||
rather similar to the increased demands on queue management that | similar to the increased demands on queue management that would | |||
would result from using N parallel TCP connections instead of a | result from using N parallel TCP connections instead of a single | |||
single Standard TCP connection.) | Standard TCP connection.) | |||
10. Related Issues | 10. Related Issues | |||
10.1. Slow-Start. | 10.1. Slow-Start | |||
An companion internet-draft on "Limited Slow-Start for TCP with | A companion internet-draft on "Limited Slow-Start for TCP with Large | |||
Large Congestion Windows" [F02b] proposes a modification to TCP's | Congestion Windows" [F02b] proposes a modification to TCP's slow- | |||
slow-start procedure that can significantly improve the performance | start procedure that can significantly improve the performance of TCP | |||
of TCP connections slow-starting up to large congestion windows. | connections slow-starting up to large congestion windows. For TCP | |||
For TCP connections that are able to use congestion windows of | connections that are able to use congestion windows of thousands (or | |||
thousands (or tens of thousands) of MSS-sized segments (for MSS the | tens of thousands) of MSS-sized segments (for MSS the sender's | |||
sender's MAXIMUM SEGMENT SIZE), the current slow-start procedure can | MAXIMUM SEGMENT SIZE), the current slow-start procedure can result in | |||
result in increasing the congestion window by thousands of segments | increasing the congestion window by thousands of segments in a single | |||
in a single round-trip time. Such an increase can easily result in | round-trip time. Such an increase can easily result in thousands of | |||
thousands of packets being dropped in one round-trip time. This is | packets being dropped in one round-trip time. This is often | |||
often counter-productive for the TCP flow itself, and is also hard | counter-productive for the TCP flow itself, and is also hard on the | |||
on the rest of the traffic sharing the congested link. | rest of the traffic sharing the congested link. | |||
[F02b] proposes Limited Slow-Start, limiting the number of segments | [F02b] proposes Limited Slow-Start, limiting the number of segments | |||
by which the congestion window is increased for one window of data | by which the congestion window is increased for one window of data | |||
during slow-start, in order to improve performance for TCP | during slow-start, in order to improve performance for TCP | |||
connections with large congestion windows. We have separated out | connections with large congestion windows. We have separated out | |||
Limited Slow-Start to a separate draft because it can be used both | Limited Slow-Start to a separate draft because it can be used both | |||
with Standard and with HighSpeed TCP. | with Standard and with HighSpeed TCP. | |||
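The Python sketch below shows one way such a cap on slow-start growth can work. The threshold name max_ssthresh, its value of 100 segments, and the per-ACK rule are assumptions made here for illustration and should be checked against [F02b]; the window is counted in segments rather than bytes.

```python
def limited_slow_start_ack(cwnd, max_ssthresh=100.0):
    """Return the congestion window (in segments) after one ACK in slow-start.

    Assumed rule: ordinary slow-start up to max_ssthresh, then an
    increase of 1/K segments per ACK, which caps the growth at about
    max_ssthresh/2 segments per round-trip time.
    """
    if cwnd <= max_ssthresh:
        return cwnd + 1.0          # classic slow-start: one segment per ACK
    k = int(cwnd / (0.5 * max_ssthresh))
    return cwnd + 1.0 / k          # damped increase above the threshold

def one_round_trip(cwnd, max_ssthresh=100.0):
    """Apply one window's worth of ACKs, roughly one round-trip time."""
    for _ in range(int(cwnd)):
        cwnd = limited_slow_start_ack(cwnd, max_ssthresh)
    return cwnd
```

Under these assumptions a window of 1000 segments grows by about 50 segments in one round-trip time, instead of doubling.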
Limited Slow-Start is illustrated in the NS simulator, for snapshots | Limited Slow-Start is illustrated in the NS simulator, for snapshots | |||
after May 1, 2002, in the tests "./test-all-tcpHighspeed tcp1A" and | after May 1, 2002, in the tests "./test-all-tcpHighspeed tcp1A" and | |||
"./test-all-tcpHighspeed tcpHighspeed1" in the subdirectory | "./test-all-tcpHighspeed tcpHighspeed1" in the subdirectory | |||
"tcl/test". | "tcl/test". | |||
In order for best-effort flows to safely start-up faster than slow- | In order for best-effort flows to safely start-up faster than slow- | |||
start, e.g., in future high-bandwidth networks, we believe that it | start, e.g., in future high-bandwidth networks, we believe that it | |||
would be necessary for the flow to have explicit feedback from the | would be necessary for the flow to have explicit feedback from the | |||
routers along the path. There are a number of proposals for this, | routers along the path. There are a number of proposals for this, | |||
ranging from a minimal proposal for an IP option that allows TCP SYN | ranging from a minimal proposal for an IP option that allows TCP SYN | |||
packets to collect information from routers along the path about the | packets to collect information from routers along the path about the | |||
allowed initial sending rate [J02], to proposals with more power | allowed initial sending rate [J02], to proposals with more power that | |||
that require more fine-tuned and continuous feedback from routers. | require more fine-tuned and continuous feedback from routers. These | |||
These proposals all are somewhat longer-term proposals than the | proposals are all somewhat longer-term proposals than the HighSpeed | |||
HighSpeed TCP proposal in this document, requiring longer lead times | TCP proposal in this document, requiring longer lead times and more | |||
and more coordination for deployment, and will be discussed in later | coordination for deployment, and will be discussed in later | |||
documents. | documents. | |||
10.2. Limiting burstiness on short time scales. | 10.2. Limiting burstiness on short time scales | |||
Because the congestion window achieved by a HighSpeed TCP connection | Because the congestion window achieved by a HighSpeed TCP connection | |||
could be quite large, there is a possibility for the sender to send | could be quite large, there is a possibility for the sender to send a | |||
a large burst of packets in response to a single acknowledgement. | large burst of packets in response to a single acknowledgement. This | |||
This could happen, for example, when there is congestion or | could happen, for example, when there is congestion or reordering on | |||
reordering on the reverse path, and the sender receives an | the reverse path, and the sender receives an acknowledgement | |||
acknowledgement acknowledging hundreds or thousands of new packets. | acknowledging hundreds or thousands of new packets. Such a burst | |||
Such a burst would also result if the application was idle for a | would also result if the application was idle for a short period of | |||
short period of time less than a round-trip time, and then suddenly | time less than a round-trip time, and then suddenly had lots of data | |||
had lots of data available to send. In this case, it would be | available to send. In this case, it would be useful for the | |||
useful for the HighSpeed TCP connection to have some method for | HighSpeed TCP connection to have some method for limiting bursts. | |||
limiting bursts. | ||||
We do not in this document specify TCP mechanisms for reducing the | In this document, we do not specify TCP mechanisms for reducing the | |||
short-term burstiness. One possible mechanism is to use some form | short-term burstiness. One possible mechanism is to use some form of | |||
of rate-based pacing, and another possibility is to use maxburst, | rate-based pacing, and another possibility is to use maxburst, which | |||
which limits the number of packets that are sent in response to a | limits the number of packets that are sent in response to a single | |||
single acknowledgement. We would caution, however, against a | acknowledgement. We would caution, however, against a permanent | |||
permanent reduction in the congestion window as a mechanism for | reduction in the congestion window as a mechanism for limiting | |||
limiting short-term bursts. Such a mechanism has been deployed in | short-term bursts. Such a mechanism has been deployed in some TCP | |||
some TCP stacks, and our view would be that using permanent | stacks, and our view would be that using permanent reductions of the | |||
reductions of the congestion window to reduce transient bursts would | congestion window to reduce transient bursts would be a bad idea | |||
be a bad idea [Fl03]. | [Fl03]. | |||
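To illustrate the distinction drawn above, the following Python sketch limits how many packets are released per acknowledgement without touching the congestion window itself. The function name and the maxburst value of 4 are hypothetical and are not taken from any particular TCP stack.

```python
def packets_allowed(cwnd, in_flight, maxburst=4):
    """Packets that may be sent in response to one acknowledgement.

    The burst is capped at maxburst, but cwnd is left unchanged, so the
    sender can still fill the window over the following acknowledgements.
    """
    window_room = max(0, cwnd - in_flight)
    return min(window_room, maxburst)
```

For example, an acknowledgement that opens room for 800 packets releases only maxburst packets immediately; the remaining room stays available rather than being discarded through a window reduction.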
10.3. Other limitations on window size. | 10.3. Other limitations on window size | |||
The TCP header uses a 16-bit field to report the receive window size | The TCP header uses a 16-bit field to report the receive window size | |||
to the sender. Unmodified, this allows a window size of at most | to the sender. Unmodified, this allows a window size of at most | |||
2**16 = 65K bytes. With window scaling, the maximum window size is | 2**16 = 65K bytes. With window scaling, the maximum window size is | |||
2**30 = 1073M bytes [RFC 1323]. Given 1500-byte packets, this | 2**30 = 1073M bytes [RFC 1323]. Given 1500-byte packets, this allows | |||
allows a window of up to 715,000 packets. | a window of up to 715,000 packets. | |||
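The figures above can be checked with a few lines of Python; the function name is illustrative and the 1500-byte packet size is the one assumed in the text.

```python
def max_window_packets(window_bits, pkt_bytes=1500):
    """Largest window, in whole packets, for a window of 2**window_bits bytes."""
    return (2 ** window_bits) // pkt_bytes

unscaled_bytes = 2 ** 16    # bare 16-bit window field
scaled_bytes = 2 ** 30      # with window scaling
scaled_packets = max_window_packets(30)
```

scaled_packets comes to a little over 715,000, matching the figure in the text.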
10.4. Implementation issues. | 10.4. Implementation issues | |||
One implementation issue that has been raised with HighSpeed TCP is | One implementation issue that has been raised with HighSpeed TCP is | |||
that with congestion windows of 4MB or more, the handling of | that with congestion windows of 4MB or more, the handling of | |||
successive SACK packets after a packet is dropped becomes very time- | successive SACK packets after a packet is dropped becomes very time- | |||
consuming at the TCP sender [S03]. Tom Kelly's Scalable TCP | consuming at the TCP sender [S03]. Tom Kelly's Scalable TCP includes | |||
includes a "SACK Fast Path" patch that addresses this problem. | a "SACK Fast Path" patch that addresses this problem. | |||
The issues addressed in the Web100 project, the Net100 project, and | The issues addressed in the Web100 project, the Net100 project, and | |||
related projects about the tuning necessary to achieve high | related projects about the tuning necessary to achieve high bandwidth | |||
bandwidth data rates with TCP apply to HighSpeed TCP as well | data rates with TCP apply to HighSpeed TCP as well [Net100, Web100]. | |||
[Net100, Web100]. | ||||
11. Deployment issues. | 11. Deployment issues | |||
11.1. Deployment issues of HighSpeed TCP | 11.1. Deployment issues of HighSpeed TCP | |||
We do not claim that the HighSpeed TCP modification to TCP described | We do not claim that the HighSpeed TCP modification to TCP described | |||
in this document is an optimal transport protocol for high-bandwidth | in this document is an optimal transport protocol for high-bandwidth | |||
environments. Based on our experiences with HighSpeed TCP in the NS | environments. Based on our experiences with HighSpeed TCP in the NS | |||
simulator [NS], on simulation studies [SA03], and on experimental | simulator [NS], on simulation studies [SA03], and on experimental | |||
reports [ABLLS03,D02,CC03,F03], we believe that HighSpeed TCP | reports [ABLLS03,D02,CC03,F03], we believe that HighSpeed TCP | |||
improves the performance of TCP in high-bandwidth environments, and | improves the performance of TCP in high-bandwidth environments, and | |||
we are documenting it for the benefit of the IETF community. We | we are documenting it for the benefit of the IETF community. We | |||
encourage the use of HighSpeed TCP, and of its underlying response | encourage the use of HighSpeed TCP, and of its underlying response | |||
function, and we further encourage feedback about operational | function, and we further encourage feedback about operational | |||
experiences with this or related modifications. | experiences with this or related modifications. | |||
We note that in environments typical of much of the current | We note that in environments typical of much of the current Internet, | |||
Internet, HighSpeed TCP behaves exactly as does Standard TCP today. | HighSpeed TCP behaves exactly as does Standard TCP today. This is | |||
This is the case any time the congestion window is less than 38 | the case any time the congestion window is less than 38 segments. | |||
segments. | ||||
Bandwidth Avg Cwnd w (pkts) Increase a(w) Decrease b(w) | Bandwidth Avg Cwnd w (pkts) Increase a(w) Decrease b(w) | |||
--------- ----------------- ------------- ------------- | --------- ----------------- ------------- ------------- | |||
1.5 Mbps 12.5 1 0.50 | 1.5 Mbps 12.5 1 0.50 | |||
10 Mbps 83 1 0.50 | 10 Mbps 83 1 0.50 | |||
100 Mbps 833 6 0.35 | 100 Mbps 833 6 0.35 | |||
1 Gbps 8333 26 0.22 | 1 Gbps 8333 26 0.22 | |||
10 Gbps 83333 70 0.10 | 10 Gbps 83333 70 0.10 | |||
Table 9: Performance of a HighSpeed TCP connection. | Table 9: Performance of a HighSpeed TCP connection | |||
To help calibrate, Table 9 considers a TCP connection with 1500-byte | To help calibrate, Table 9 considers a TCP connection with 1500-byte | |||
packets, an RTT of 100 ms (including average queueing delay), and no | packets, an RTT of 100 ms (including average queueing delay), and no | |||
competing traffic, and shows the average congestion window if that | competing traffic, and shows the average congestion window if that | |||
TCP connection had a pipe all to itself and fully used the link | TCP connection had a pipe all to itself and fully used the link | |||
bandwidth, for a range of bandwidths for the pipe. This assumes | bandwidth, for a range of bandwidths for the pipe. This assumes that | |||
that the TCP connection would use Table 12 in determining its | the TCP connection would use Table 12 in determining its increase and | |||
increase and decrease parameters. The first column of Table 9 gives | decrease parameters. The first column of Table 9 gives the | |||
the bandwidth, and the second column gives the average congestion | bandwidth, and the second column gives the average congestion window | |||
window w needed to utilize that bandwidth. The third column show | w needed to utilize that bandwidth. The third column shows the | |||
the increase a(w) in segments per RTT for window w. The fourth | increase a(w) in segments per RTT for window w. The fourth column | |||
column show the decrease b(w) for that window w (where the TCP | shows the decrease b(w) for that window w (where the TCP sender | |||
sender decreases the congestion window from w to w(1-b(w)) segments | decreases the congestion window from w to w(1-b(w)) segments after a | |||
after a loss event). We note that the actual congestion window when | loss event). When a loss occurs we note that the actual congestion | |||
a loss occurs is likely to be greater than the average congestion | window is likely to be greater than the average congestion window w | |||
window w in column 2, so the decrease parameter used could be | in column 2, so the decrease parameter used could be slightly smaller | |||
slightly smaller than the one given in column 4 of Table 9. | than the one given in column 4 of Table 9. | |||
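The second column of Table 9 is the bandwidth-delay product expressed in packets. A small Python sketch, using the 100 ms RTT and 1500-byte packets stated above (the function name is illustrative):

```python
def avg_cwnd_pkts(bandwidth_bps, rtt_s=0.1, pkt_bytes=1500):
    """Average congestion window, in packets, needed to fill the pipe."""
    return bandwidth_bps * rtt_s / (pkt_bytes * 8)

# Bandwidths from the first column of Table 9, in bits per second.
table9_windows = {bw: avg_cwnd_pkts(bw)
                  for bw in (1.5e6, 10e6, 100e6, 1e9, 10e9)}
```

The increase and decrease parameters in columns 3 and 4 are not computed here; they come from the lookup table referenced in the text.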
Table 9 shows that a HighSpeed TCP over a 10 Mbps link behaves | Table 9 shows that a HighSpeed TCP over a 10 Mbps link behaves | |||
exactly the same as a Standard TCP connection, even in the absence | exactly the same as a Standard TCP connection, even in the absence of | |||
of competing traffic. One can think of the congestion window | competing traffic. One can think of the congestion window staying | |||
staying generally in the range of 55 to 110 segments, with the | generally in the range of 55 to 110 segments, with the HighSpeed TCP | |||
HighSpeed TCP behavior being exactly the same as the behavior of | behavior being exactly the same as the behavior of Standard TCP. (If | |||
Standard TCP. (If the congestion window is ever 128 segments or | the congestion window is ever 128 segments or more, then the | |||
more, then the HighSpeed TCP increases by two segments per RTT | HighSpeed TCP increases by two segments per RTT instead of by one, | |||
instead of by one, and uses a decrease parameter of 0.44 instead of | and uses a decrease parameter of 0.44 instead of 0.50.) | |||
0.50.) | ||||
Table 9 shows that for a HighSpeed TCP connection over a 100 Mbps | Table 9 shows that for a HighSpeed TCP connection over a 100 Mbps | |||
link, with no competing traffic, HighSpeed TCP behaves roughly as | link, with no competing traffic, HighSpeed TCP behaves roughly as | |||
aggressively as six parallel TCP connections, increasing its | aggressively as six parallel TCP connections, increasing its | |||
congestion window by roughly six segments per round-trip time, and | congestion window by roughly six segments per round-trip time, and | |||
with a decrease parameter of roughly 1/3 (corresponding to | with a decrease parameter of roughly 1/3 (corresponding to decreasing | |||
decreasing down to two-thirds of its old congestion window, rather than | down to two-thirds of its old congestion window, rather than to half, in | |||
to half, in response to a loss event). | response to a loss event). | |||
For a Standard TCP connection in this environment, the congestion | For a Standard TCP connection in this environment, the congestion | |||
window could be thought of as varying generally in the range of 550 | window could be thought of as generally varying in the range of 550 | |||
to 1100 segments, with an average packet drop rate of 2.2 * 10^-6 | to 1100 segments, with an average packet drop rate of 2.2 * 10^-6 | |||
(corresponding to a bit error rate of 1.8 * 10^-10), or | (corresponding to a bit error rate of 1.8 * 10^-10), or equivalently, | |||
equivalently, roughly 55 seconds between congestion events. While a | roughly 55 seconds between congestion events. While a Standard TCP | |||
Standard TCP connection could sustain such a low packet drop rate in | connection could sustain such a low packet drop rate in a carefully | |||
a carefully controlled environment with minimal competing traffic, | controlled environment with minimal competing traffic, we would | |||
we would contend that in an uncontrolled best-effort environment | contend that in an uncontrolled best-effort environment with even a | |||
with even a small amount of competing traffic, the occasional | small amount of competing traffic, the occasional congestion events | |||
congestion events from smaller competing flows could easily be | from smaller competing flows could easily be sufficient to prevent a | |||
sufficient to prevent a Standard TCP flow with no lower-speed | Standard TCP flow with no lower-speed bottlenecks from fully | |||
bottlenecks from fully utilizing the available bandwidth of the | utilizing the available bandwidth of the underutilized 100 Mbps link. | |||
underutilized 100 Mbps link. | ||||
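The Standard TCP figures quoted above can be checked with a short calculation. The sketch below is not part of either version of the document. It assumes an idealized sawtooth (the window grows by one segment per round-trip time from half its peak back to the peak, with exactly one loss per cycle), a 100 ms round-trip time, and 1500-byte packets; the function name and its parameters are illustrative only.

```python
def standard_tcp_sawtooth(peak_window, rtt_s=0.1, packet_bytes=1500):
    """Idealized Standard TCP sawtooth: one loss per cycle, +1 segment per RTT.

    Assumptions (not stated in this part of the document): 100 ms RTT,
    1500-byte packets, window oscillating between peak/2 and peak.
    """
    # The window climbs from peak/2 back to peak, one segment per RTT.
    rtts_per_cycle = peak_window / 2
    # Packets per cycle: average window (3/4 of the peak) times RTTs per cycle.
    packets_per_cycle = 0.75 * peak_window * rtts_per_cycle
    drop_rate = 1.0 / packets_per_cycle            # one drop per cycle
    bit_error_rate = drop_rate / (8 * packet_bytes)
    seconds_between_events = rtts_per_cycle * rtt_s
    return drop_rate, bit_error_rate, seconds_between_events


p, ber, gap = standard_tcp_sawtooth(1100)
print(f"drop rate {p:.1e}, bit error rate {ber:.1e}, {gap:.0f} s between events")
```

With a peak window of 1100 segments, this model gives a drop rate of about 2.2 * 10^-6, a bit error rate of about 1.8 * 10^-10, and 55 seconds between congestion events, in line with the numbers in the text.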
That is, we would content that in the environment of 100 Mbps links | That is, we would contend that in the environment of 100 Mbps links | |||
with a significant amount of available bandwidth, Standard TCP would | with a significant amount of available bandwidth, Standard TCP would | |||
sometimes be unable to fully utilize the link bandwidth, and that | sometimes be unable to fully utilize the link bandwidth, and that | |||
HighSpeed TCP would be an improvement in this regard. We would | HighSpeed TCP would be an improvement in this regard. We would | |||
further contend that in this environment, the behavior of HighSpeed | further contend that in this environment, the behavior of HighSpeed | |||
TCP is sufficiently close to that of Standard TCP that HighSpeed TCP | TCP is sufficiently close to that of Standard TCP that HighSpeed TCP | |||
would be safe to deploy in the current Internet. | would be safe to deploy in the current Internet. We note that | |||
HighSpeed TCP can only use high congestion windows if allowed by the | ||||
receiver's advertised window size. As a result, even if HighSpeed | ||||
TCP was ubiquitously deployed in the Internet, the impact would be | ||||
limited to those TCP connections with an advertised window from the | ||||
receiver of 118 MSS or larger. | ||||
We do not believe that the deployment of HighSpeed TCP would serve | We do not believe that the deployment of HighSpeed TCP would serve as | |||
as a block to the possible deployment of alternate experimental | a block to the possible deployment of alternate experimental | |||
protocols for high-speed congestion control, such as Scalable TCP, | protocols for high-speed congestion control, such as Scalable TCP, | |||
XCP [KHR02], or FAST TCP [JWL03]. In particular, we don't expect | XCP [KHR02], or FAST TCP [JWL03]. In particular, we don't expect | |||
HighSpeed TCP to interact any more poorly with alternative | HighSpeed TCP to interact any more poorly with alternative | |||
experimental proposals that would the N parallel TCP connections | experimental proposals than would the N parallel TCP connections | |||
commonly used today in the absence of HighSpeed TCP. | commonly used today in the absence of HighSpeed TCP. | |||
11.2. Deployment issues of Scalable TCP | 11.2. Deployment issues of Scalable TCP | |||
We believe that Scalable TCP and HighSpeed TCP have sufficiently | We believe that Scalable TCP and HighSpeed TCP have sufficiently | |||
similar response functions that they could easily coexist in the | similar response functions that they could easily coexist in the | |||
Internet. However, we have not investigated Scalable TCP | Internet. However, we have not investigated Scalable TCP | |||
sufficiently to be able to claim, in this document, that Scalable | sufficiently to be able to claim, in this document, that Scalable TCP | |||
TCP is safe for a widespread deployment in the current Internet. | is safe for a widespread deployment in the current Internet. | |||
Bandwidth  Avg Cwnd w (pkts)  Increase a(w)  Decrease b(w) | Bandwidth  Avg Cwnd w (pkts)  Increase a(w)  Decrease b(w) | |||
---------  -----------------  -------------  ------------- | ---------  -----------------  -------------  ------------- | |||
1.5 Mbps   12.5               1              0.50          | 1.5 Mbps   12.5               1              0.50          | |||
10 Mbps    83                 0.4            0.125         | 10 Mbps    83                 0.4            0.125         | |||
100 Mbps   833                4.1            0.125         | 100 Mbps   833                4.1            0.125         | |||
1 Gbps     8333               41.6           0.125         | 1 Gbps     8333               41.6           0.125         | |||
10 Gbps    83333              416.5          0.125         | 10 Gbps    83333              416.5          0.125         | |||
Table 10: Performance of a Scalable TCP connection. | Table 10: Performance of a Scalable TCP connection. | |||
Table 10 shows the performance of a Scalable TCP connection with | Table 10 shows the performance of a Scalable TCP connection with | |||
1500-byte packets, an RTT of 100 ms (including average queueing | 1500-byte packets, an RTT of 100 ms (including average queueing | |||
delay), and no competing traffic. The TCP connection is assumed to | delay), and no competing traffic. The TCP connection is assumed to | |||
use delayed acknowledgements. The first column of Table 10 gives | use delayed acknowledgements. The first column of Table 10 gives the | |||
the bandwidth, the second column gives the average congestion window | bandwidth, the second column gives the average congestion window | |||
needed to utilize that bandwidth, and the third and fourth columns | needed to utilize that bandwidth, and the third and fourth columns | |||
give the increase and decrease parameters. | give the increase and decrease parameters. | |||
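The second column of Table 10 follows from the relation w = BR/(8D) given in the section on TCP's loss event rate in steady-state. The sketch below, which is not part of the document, applies that relation with the RTT of 100 ms and the 1500-byte packets stated above; the function name is hypothetical.

```python
def average_cwnd_segments(bandwidth_bps, rtt_s=0.1, packet_bytes=1500):
    """Average congestion window w = B*R/(8*D), in segments."""
    return bandwidth_bps * rtt_s / (8 * packet_bytes)


# Bandwidths from the first column of Table 10.
rows = [("1.5 Mbps", 1.5e6), ("10 Mbps", 10e6), ("100 Mbps", 100e6),
        ("1 Gbps", 1e9), ("10 Gbps", 10e9)]
for label, bps in rows:
    print(f"{label:>9}  {average_cwnd_segments(bps):10.1f}")
```

The results (12.5, 83.3, 833.3, 8333.3, and 83333.3 segments) match the table entries once the fractional part is dropped.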
Note that even in an environment with a 10 Mbps link, Scalable TCP's | Note that even in an environment with a 10 Mbps link, Scalable TCP's | |||
behavior is considerably different from that of Standard TCP. The | behavior is considerably different from that of Standard TCP. The | |||
increase parameter is smaller than that of Standard TCP, and the | increase parameter is smaller than that of Standard TCP, and the | |||
decrease is smaller also, 1/8-th instead of 1/2. That is, for 10 | decrease is smaller also, 1/8-th instead of 1/2. That is, for 10 | |||
Mbps links, Scalable TCP increases less aggressively than Standard | Mbps links, Scalable TCP increases less aggressively than Standard | |||
TCP or HighSpeed TCP, but decreases less aggressively as well. | TCP or HighSpeed TCP, but decreases less aggressively as well. | |||
In an environment with a 100 Mbps link, Scalable TCP has an increase | In an environment with a 100 Mbps link, Scalable TCP has an increase | |||
parameter of roughly four segments per round-trip time, with the | parameter of roughly four segments per round-trip time, with the same | |||
same decrease parameter of 1/8-th. A comparison of Tables 9 and 10 | decrease parameter of 1/8-th. A comparison of Tables 9 and 10 shows | |||
shows that for this scenario of 100 Mbps links, HighSpeed TCP | that for this scenario of 100 Mbps links, HighSpeed TCP increases | |||
increases more aggressively than Scalable TCP. | more aggressively than Scalable TCP. | |||
Next we consider the relative fairness between Standard TCP, | Next we consider the relative fairness between Standard TCP, | |||
HighSpeed TCP and Scalable TCP. The relative fairness between | HighSpeed TCP and Scalable TCP. The relative fairness between | |||
HighSpeed TCP and Standard TCP was shown in Table 5 earlier in this | HighSpeed TCP and Standard TCP was shown in Table 5 earlier in this | |||
document, and the relative fairness between Scalable TCP and | document, and the relative fairness between Scalable TCP and Standard | |||
Standard TCP was shown in Table 8. Following the approach in | TCP was shown in Table 8. Following the approach in Section 6, for a | |||
Section 6, for a given packet drop rate p, for p < 10^-3, we can | given packet drop rate p, for p < 10^-3, we can estimate the relative | |||
estimate the relative fairness between Scalable and HighSpeed TCP as | fairness between Scalable and HighSpeed TCP as | |||
W_Scalable/W_HighSpeed. This relative fairness is shown in Table 11 | W_Scalable/W_HighSpeed. This relative fairness is shown in Table 11 | |||
below. The bandwidth in the last column of Table 11 is the | below. The bandwidth in the last column of Table 11 is the aggregate | |||
aggregate bandwidth of the two competing flows given 100 ms round- | bandwidth of the two competing flows given 100 ms round-trip times | |||
trip times and 1500-byte packets. | and 1500-byte packets. | |||
Packet Drop Rate P  Fairness  Aggregate Window  Bandwidth  | Packet Drop Rate P  Fairness  Aggregate Window  Bandwidth  | |||
------------------  --------  ----------------  ---------  | ------------------  --------  ----------------  ---------  | |||
10^-2               1.0       24                2.8 Mbps   | 10^-2               1.0       24                2.8 Mbps   | |||
10^-3               1.0       76                9.1 Mbps   | 10^-3               1.0       76                9.1 Mbps   | |||
10^-4               1.4       643               77.1 Mbps  | 10^-4               1.4       643               77.1 Mbps  | |||
10^-5               2.1       5595              671.4 Mbps | 10^-5               2.1       5595              671.4 Mbps | |||
10^-6               3.1       50279             6.0 Gbps   | 10^-6               3.1       50279             6.0 Gbps   | |||
10^-7               4.5       463981            55.7 Gbps  | 10^-7               4.5       463981            55.7 Gbps  | |||
Table 11: Relative Fairness between the Scalable and HighSpeed | Table 11: Relative Fairness between the Scalable and HighSpeed | |||
Response Functions. | Response Functions. | |||
The second row of Table 11 shows that for a Scalable TCP and a | The second row of Table 11 shows that for a Scalable TCP and a | |||
HighSpeed TCP flow competing in an environment with 100 ms RTTs and | HighSpeed TCP flow competing in an environment with 100 ms RTTs and a | |||
a 10 Mbps pipe, the two flows would receive essentially the same | 10 Mbps pipe, the two flows would receive essentially the same | |||
bandwidth. The next row shows that for a Scalable TCP and a | bandwidth. The next row shows that for a Scalable TCP and a | |||
HighSpeed TCP flow competing in an environment with 100 ms RTTs and | HighSpeed TCP flow competing in an environment with 100 ms RTTs and a | |||
a 100 Mbps pipe, the Scalable TCP flow would receive roughly 50% | 100 Mbps pipe, the Scalable TCP flow would receive roughly 50% more | |||
more bandwidth than would HighSpeed TCP. Table 11 shows the | bandwidth than would HighSpeed TCP. Table 11 shows the relative | |||
relative fairness in higher-bandwidth environments as well. This | fairness in higher-bandwidth environments as well. This relative | |||
relative fairness seems sufficient that there should be no problems | fairness seems sufficient that there should be no problems with | |||
with Scalable TCP and HighSpeed TCP coexisting in the same | Scalable TCP and HighSpeed TCP coexisting in the same environment as | |||
environment as Experimental variants of TCP. | Experimental variants of TCP. | |||
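The higher-bandwidth rows of Table 11 can be approximated with the sketch below, which is not part of the document. It assumes the response functions W_HighSpeed = 0.12/p^0.835 and W_Scalable = 0.038/p (the latter for delayed acknowledgements). Neither formula appears in this part of the text, so both should be read as assumptions; they were chosen because they agree with the table. The function names are illustrative, and the sketch applies only to the rows with p < 10^-3.

```python
def highspeed_window(p):
    # Assumed HighSpeed response function; not stated in this section.
    return 0.12 / p ** 0.835


def scalable_window(p):
    # Assumed Scalable response function with delayed acknowledgements.
    return 0.038 / p


def relative_fairness(p, rtt_s=0.1, packet_bytes=1500):
    """Return (W_Scalable / W_HighSpeed, aggregate window, aggregate bps)."""
    ws = scalable_window(p)
    wh = highspeed_window(p)
    aggregate = ws + wh
    bandwidth_bps = aggregate * 8 * packet_bytes / rtt_s
    return ws / wh, aggregate, bandwidth_bps


for exponent in range(4, 8):
    p = 10.0 ** -exponent
    fairness, window, bps = relative_fairness(p)
    print(f"10^-{exponent}  {fairness:4.2f}  {window:9.0f}  {bps / 1e6:9.1f} Mbps")
```

For p = 10^-4 the sketch gives a ratio of about 1.45 and an aggregate window of about 643 segments (roughly 77.1 Mbps). This is close to the 1.4 shown in the table and sits between that entry and the "roughly 50%" figure used in the text.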
We note that one question that requires more investigation with | We note that one question that requires more investigation with | |||
Scalable TCP is that of convergence to fairness in environments with | Scalable TCP is that of convergence to fairness in environments with | |||
Drop-Tail queue management. | Drop-Tail queue management. | |||
12. Related Work in HighSpeed TCP. | 12. Related Work in HighSpeed TCP | |||
HighSpeed TCP has been separately investigated in simulations by | HighSpeed TCP has been separately investigated in simulations by | |||
Sylvia Ratnasamy and by Evandro de Souza [SA03]. The simulations in | Sylvia Ratnasamy and by Evandro de Souza [SA03]. The simulations in | |||
[SA03] verify the fairness properties of HighSpeed TCP when sharing | [SA03] verify the fairness properties of HighSpeed TCP when sharing a | |||
a link with Standard TCP. | link with Standard TCP. | |||
These simulations explore the relative fairness of HighSpeed TCP | These simulations explore the relative fairness of HighSpeed TCP | |||
flows when competing with Standard TCP. The simulation environment | flows when competing with Standard TCP. The simulation environment | |||
includes background forward and reverse-path TCP traffic limited by | includes background forward and reverse-path TCP traffic limited by | |||
the TCP receive window, along with a small amount of forward and | the TCP receive window, along with a small amount of forward and | |||
reverse-path traffic from the web traffic generator. Most of the | reverse-path traffic from the web traffic generator. Most of the | |||
simulations so far explore performance on a simple dumbbell topology | simulations so far explore performance on a simple dumbbell topology | |||
with a 1 Gbps link with a propagation delay of 50 ms. Simulations | with a 1 Gbps link with a propagation delay of 50 ms. Simulations | |||
have been run with Adaptive RED and with DropTail queue management. | have been run with Adaptive RED and with DropTail queue management. | |||
skipping to change at page 27, line 19 | skipping to change at page 24, line 22 | |||
HighSpeed TCP flows receiving an even larger share of the link | HighSpeed TCP flows receiving an even larger share of the link | |||
bandwidth. This is not surprising; with Active Queue Management at | bandwidth. This is not surprising; with Active Queue Management at | |||
the congested link, the fraction of packet drops received by each | the congested link, the fraction of packet drops received by each | |||
flow should be roughly proportional to that flow's share of the link | flow should be roughly proportional to that flow's share of the link | |||
bandwidth, while this property no longer holds with Drop Tail queue | bandwidth, while this property no longer holds with Drop Tail queue | |||
management. We also note that relative fairness in simulations with | management. We also note that relative fairness in simulations with | |||
Drop Tail queue management can sometimes depend on small details of | Drop Tail queue management can sometimes depend on small details of | |||
the simulation scenario, and that Drop Tail simulations need special | the simulation scenario, and that Drop Tail simulations need special | |||
care to avoid phase effects [F92]. | care to avoid phase effects [F92]. | |||
[SA03] explores the bandwidth `stolen' by HighSpeed TCP from | [SA03] explores the bandwidth `stolen' by HighSpeed TCP from standard | |||
standard TCP by exploring the fraction of the link bandwidth N | TCP by exploring the fraction of the link bandwidth N standard TCP | |||
standard TCP flows receive when competing against N other standard | flows receive when competing against N other standard TCP flows, and | |||
TCP flows, and comparing this to the fraction of the link bandwidth | comparing this to the fraction of the link bandwidth the N standard | |||
the N standard TCP flows receive when competing against N HighSpeed | TCP flows receive when competing against N HighSpeed TCP flows. For | |||
TCP flows. For the 1 Gbps simulation scenarios dominated by long- | the 1 Gbps simulation scenarios dominated by long-lived traffic, a | |||
lived traffic, a small number of standard TCP flows are able to | small number of standard TCP flows are able to achieve high link | |||
achieve high link utilization, and the HighSpeed TCP flows can be | utilization, and the HighSpeed TCP flows can be viewed as stealing | |||
viewed as stealing bandwidth from the competing standard TCP flows, | bandwidth from the competing standard TCP flows, as predicted in | |||
as predicted in Section 6 on the Fairness Implications of the | Section 6 on the Fairness Implications of the HighSpeed Response | |||
HighSpeed Response Function. However, [SA03] shows that when even a | Function. However, [SA03] shows that when even a small fraction of | |||
small fraction of the link bandwidth is used by more bursty, short | the link bandwidth is used by more bursty, short TCP connections, the | |||
TCP connections, the standard TCP flows are unable to achieve high | standard TCP flows are unable to achieve high link utilization, and | |||
link utilization, and the HighSpeed TCP flows in this case are not | the HighSpeed TCP flows in this case are not `stealing' bandwidth | |||
`stealing' bandwidth from the standard TCP flows, but instead are | from the standard TCP flows, but instead are using bandwidth that | |||
using bandwidth that otherwise would not be utilized. | otherwise would not be utilized. | |||
The conclusions of [SA03] are that "HighSpeed TCP behaved as forseen | The conclusions of [SA03] are that "HighSpeed TCP behaved as forseen | |||
by its response function, and appears to be a real and viable option | by its response function, and appears to be a real and viable option | |||
for use on high-speed wide area TCP connections." | for use on high-speed wide area TCP connections." | |||
Future work that could be explored in more detail includes | Future work that could be explored in more detail includes | |||
convergence times after new flows start-up; recovery time after a | convergence times after new flows start-up; recovery time after a | |||
transient outage; the response to sudden severe congestion, and | transient outage; the response to sudden severe congestion, and | |||
investigations of the potential for oscillations. We invite | investigations of the potential for oscillations. We invite | |||
contributions from others in this work. | contributions from others in this work. | |||
13. Relationship to other Work. | 13. Relationship to other Work | |||
Our assumption is that HighSpeed TCP will be used with the TCP SACK | Our assumption is that HighSpeed TCP will be used with the TCP SACK | |||
option, and also with the increased Initial Window of three or four | option, and also with the increased Initial Window of three or four | |||
segments, as allowed by [RFC3390]. For paths that have substantial | segments, as allowed by [RFC3390]. For paths that have substantial | |||
reordering, TCP performance would be greatly improved by some of the | reordering, TCP performance would be greatly improved by some of the | |||
mechanisms still in the research stages for robust performance in | mechanisms still in the research stages for robust performance in the | |||
the presence of reordered packets. | presence of reordered packets. | |||
Our view is that HighSpeed TCP is largely orthogonal to proposals | Our view is that HighSpeed TCP is largely orthogonal to proposals for | |||
for higher PMTU (Path MTU) values [M02]. Unlike changes to the | higher PMTU (Path MTU) values [M02]. Unlike changes to the PMTU, | |||
PMTU, HighSpeed TCP does not require any changes in the network or | HighSpeed TCP does not require any changes in the network or at the | |||
at the TCP receiver, and works well in the current Internet. Our | TCP receiver, and works well in the current Internet. Our assumption | |||
assumption is that HighSpeed TCP would be useful even with larger | is that HighSpeed TCP would be useful even with larger values for the | |||
values for the PMTU. Unlike the current congestion window, the PMTU | PMTU. Unlike the current congestion window, the PMTU gives no | |||
gives no information about the bandwidth-delay product available to | information about the bandwidth-delay product available to that | |||
that particular flow. | particular flow. | |||
A related approach is that of a virtual MTU, where the actual MTU of | A related approach is that of a virtual MTU, where the actual MTU of | |||
the path might be limited [VMSS,S02]. The virtual MTU approach has | the path might be limited [VMSS,S02]. The virtual MTU approach has | |||
not been fully investigated, and we do not explore the virtual MTU | not been fully investigated, and we do not explore the virtual MTU | |||
approach further in this document. | approach further in this document. | |||
14. Conclusions. | 14. Conclusions | |||
This document has proposed HighSpeed TCP, a modification to TCP's | This document has proposed HighSpeed TCP, a modification to TCP's | |||
congestion control mechanism for use with TCP connections with large | congestion control mechanism for use with TCP connections with large | |||
congestion windows. We have explored this proposal in simulations, | congestion windows. We have explored this proposal in simulations, | |||
and others have explored HighSpeed TCP with experiments, and we | and others have explored HighSpeed TCP with experiments, and we | |||
believe HighSpeed TCP to be safe to deploy on the current Internet. | believe HighSpeed TCP to be safe to deploy on the current Internet. | |||
We would welcome additional analysis, simulations, and particularly, | We would welcome additional analysis, simulations, and particularly, | |||
experimentation. More information on simuations and experiments is | experimentation. More information on simulations and experiments is | |||
available from the HighSpeed TCP Web Page [HSTCP]. There are | available from the HighSpeed TCP Web Page [HSTCP]. There are several | |||
several independent implementations of HighSpeed TCP [D02,F03] and | independent implementations of HighSpeed TCP [D02,F03] and of | |||
of Scalable TCP [K03] for further investigation. | Scalable TCP [K03] for further investigation. | |||
We are bringing this proposal to the IETF to be considered as an | ||||
Experimental RFC. | ||||
15. Acknowledgements | 15. Acknowledgements | |||
The HighSpeed TCP proposal is from joint work with Sylvia Ratnasamy | The HighSpeed TCP proposal is from joint work with Sylvia Ratnasamy | |||
and Scott Shenker (and was initiated by Scott Shenker). Additional | and Scott Shenker (and was initiated by Scott Shenker). Additional | |||
investigations of HighSpeed TCP were joint work with Evandro de | investigations of HighSpeed TCP were joint work with Evandro de Souza | |||
Souza and Deb Agarwal. We thank Tom Dunigan for the implementation | and Deb Agarwal. We thank Tom Dunigan for the implementation in the | |||
in the Linux 2.4.16 Web100 kernel, and for resulting experimentation | Linux 2.4.16 Web100 kernel, and for resulting experimentation with | |||
with HighSpeed TCP. We are grateful to the End-to-End Research | HighSpeed TCP. We are grateful to the End-to-End Research Group, the | |||
Group, the members of the Transport Area Working Group, and to | members of the Transport Area Working Group, and to members of the | |||
members of the IPAM program in Large Scale Communication Networks | IPAM program in Large Scale Communication Networks for feedback. We | |||
for feedback. We thank Glenn Vinnicombe for framing the Linear | thank Glenn Vinnicombe for framing the Linear response function in | |||
response function in the parameters of HighSpeed TCP. We are also | the parameters of HighSpeed TCP. We are also grateful for | |||
grateful for contributions and feedback from the following | contributions and feedback from the following individuals: Les | |||
individuals: Les Cottrell, Mitchell Erblich, Jeffrey Hsu, Tom Kelly, | Cottrell, Mitchell Erblich, Jeffrey Hsu, Tom Kelly, Chuck Jackson, | |||
Jitendra Padhye, Andrew Reiter, Stanislav Shalunov, Alex Solan, Paul | Matt Mathis, Jitendra Padhye, Andrew Reiter, Stanislav Shalunov, Alex | |||
Sutter, Brian Tierney, Joe Touch. | Solan, Paul Sutter, Brian Tierney, Joe Touch. | |||
16. Normative References | 16. Normative References | |||
[RFC2581] M. Allman, V. Paxson, and W. Stevens, "TCP Congestion | [RFC2581] Allman, M., Paxson, V. and W. Stevens, "TCP Congestion | |||
Control", RFC 2581, April 1999. | Control", RFC 2581, April 1999. | |||
17. Informative References | 17. Informative References | |||
[ABLLS03] A. Antony, J. Blom, C. de Laat, J. Lee, and W. Sjouw, | [ABLLS03] A. Antony, J. Blom, C. de Laat, J. Lee, and W. Sjouw, | |||
Macroscopic Examination of TCP Flows over Transatlantic Links, | "Microscopic Examination of TCP Flows over Transatlantic | |||
January 2003. URL | Links", iGrid2002 special issue, Future Generation | |||
"http://carol.wins.uva.nl/%7Edelaat/techrep-2003-2-tcp.pdf". | Computer Systems, volume 19 issue 6 (2003), URL | |||
"http://www.science.uva.nl/~delaat/techrep-2003-2- | ||||
tcp.pdf". | ||||
[BBFS01] Deepak Bansal, Hari Balakrishnan, Sally Floyd, and Scott | [BBFS01] Deepak Bansal, Hari Balakrishnan, Sally Floyd, and Scott | |||
Shenker, "Dynamic Behavior of Slowly-Responsive Congestion Control | Shenker, "Dynamic Behavior of Slowly-Responsive Congestion | |||
Algorithms", SIGCOMM 2001, August 2001. | Control Algorithms", SIGCOMM 2001, August 2001. | |||
[CC03] Fabrizio Coccetti and Les Cottrell, TCP Stack Measurements on | [CC03] Fabrizio Coccetti and Les Cottrell, "TCP Stack | |||
Lightly Loaded Testbeds, 2003. URL "http://www- | Measurements on Lightly Loaded Testbeds", 2003. URL | |||
iepm.slac.stanford.edu/monitoring/bulk/fast/". | "http://www-iepm.slac.stanford.edu/monitoring/bulk/fast/". | |||
[CJ89] D. Chiu and R. Jain, "Analysis of the Increase and Decrease | [CJ89] D. Chiu and R. Jain, "Analysis of the Increase and | |||
Algorithms for Congestion Avoidance in Computer Networks", Computer | Decrease Algorithms for Congestion Avoidance in Computer | |||
Networks and ISDN Systems, Vol. 17, pp. 1-14, 1989. | Networks", Computer Networks and ISDN Systems, Vol. 17, | |||
pp. 1-14, 1989. | ||||
[CO98] J. Crowcroft and P. Oechslin, "Differentiated End-to-end | [CO98] J. Crowcroft and P. Oechslin, "Differentiated End-to-end | |||
Services using a Weighted Proportional Fair Share TCP", Computer | Services using a Weighted Proportional Fair Share TCP", | |||
Communication Review, 28(3):53--69, 1998. | Computer Communication Review, 28(3):53--69, 1998. | |||
[D02] Tom Dunigan, Floyd's TCP slow-start and AIMD mods, URL | [D02] Tom Dunigan, "Floyd's TCP slow-start and AIMD mods", URL | |||
"http://www.csm.ornl.gov/~dunigan/net100/floyd.html". | "http://www.csm.ornl.gov/~dunigan/net100/floyd.html". | |||
[F03] Gareth Fairey, High-Speed TCP, 2003. URL | [F03] Gareth Fairey, "High-Speed TCP", 2003. URL | |||
"http://www.hep.man.ac.uk/u/garethf/hstcp/". | "http://www.hep.man.ac.uk/u/garethf/hstcp/". | |||
[F92] S. Floyd and V. Jacobson, On Traffic Phase Effects in Packet- | [F92] S. Floyd and V. Jacobson, "On Traffic Phase Effects in | |||
Switched Gateways, Internetworking: Research and Experience, V.3 | Packet-Switched Gateways, Internetworking: Research and | |||
N.3, September 1992, p.115-156. URL | Experience", V.3 N.3, September 1992, p.115-156. URL | |||
"http://www.icir.org/floyd/papers.html". | "http://www.icir.org/floyd/papers.html". | |||
[Fl03] Sally Floyd, "Re: [Tsvwg] taking NewReno (RFC 2582) to | [Fl03] Sally Floyd, "Re: [Tsvwg] taking NewReno (RFC 2582) to | |||
Proposed Standard", Email to the tsvwg mailing list, May 14, 2003, | Proposed Standard", Email to the tsvwg mailing list, May | |||
14, 2003. | ||||
URLs "http://www1.ietf.org/mail-archive/working- | URLs "http://www1.ietf.org/mail-archive/working- | |||
groups/tsvwg/current/msg04086.html" and "http://www1.ietf.org/mail- | groups/tsvwg/current/msg04086.html" and | |||
archive/working-groups/tsvwg/current/msg04087.html". | "http://www1.ietf.org/mail-archive/working- | |||
groups/tsvwg/current/msg04087.html". | ||||
[FF98] Floyd, S., and Fall, K., "Promoting the Use of End-to-End | [FF98] Floyd, S., and Fall, K., "Promoting the Use of End-to-End | |||
Congestion Control in the Internet", IEEE/ACM Transactions on | Congestion Control in the Internet", IEEE/ACM Transactions | |||
Networking, August 1999. | on Networking, August 1999. | |||
[FRS02] Sally Floyd, Sylvia Ratnasamy, and Scott Shenker, "Modifying | [FRS02] Sally Floyd, Sylvia Ratnasamy, and Scott Shenker, | |||
TCP's Congestion Control for High Speeds", May 2002. URL | "Modifying TCP's Congestion Control for High Speeds", May | |||
"http://www.icir.org/floyd/notes.html". | 2002. URL "http://www.icir.org/floyd/notes.html". | |||
[GRK99] Panos Gevros, Fulvio Risso and Peter Kirstein, "Analysis of | [GRK99] Panos Gevros, Fulvio Risso and Peter Kirstein, "Analysis | |||
a Method for Differential TCP Service" In Proceedings of the IEEE | of a Method for Differential TCP Service". In Proceedings | |||
GLOBECOM'99, Symposium on Global Internet , December 1999, Rio de | of the IEEE GLOBECOM'99, Symposium on Global Internet , | |||
Janeiro, Brazil. | December 1999, Rio de Janeiro, Brazil. | |||
[GV02] S. Gorinsky and H. Vin, "Extended Analysis of Binary | [GV02] S. Gorinsky and H. Vin, "Extended Analysis of Binary | |||
Adjustment Algorithms", Technical Report TR2002-39, Department of | Adjustment Algorithms", Technical Report TR2002-39, | |||
Computer Sciences, The University of Texas at Austin, August 2002. | Department of Computer Sciences, The University of Texas | |||
URL "http://www.cs.utexas.edu/users/gorinsky/pubs.html". | at Austin, August 2002. URL | |||
"http://www.cs.utexas.edu/users/gorinsky/pubs.html". | ||||
[HSTCP] HighSpeed TCP Web Page, URL | [HSTCP] HighSpeed TCP Web Page, URL | |||
"http://www.icir.org/floyd/hstcp.html". | "http://www.icir.org/floyd/hstcp.html". | |||
[J02] Amit Jain and Sally Floyd, "Quick-Start for TCP and IP", | [J02] Amit Jain and Sally Floyd, "Quick-Start for TCP and IP", | |||
internet draft draft-amit-quick-start-02.txt, work in progress, | Work in Progress, 2002. | |||
2002. | ||||
[JWL03] Cheng Jin, David X. Wei and Steven H. Low, FAST TCP for | [JWL03] Cheng Jin, David X. Wei and Steven H. Low, "FAST TCP for | |||
High-speed Long-distance Networks, internet-draft draft-jwl-tcp- | High-speed Long-distance Networks", Work in Progress, June | |||
fast-01.txt, work-in-progress, June 2003. | 2003. | |||
[K03] Tom Kelly, "Scalable TCP: Improving Performance in HighSpeed | [K03] Tom Kelly, "Scalable TCP: Improving Performance in | |||
Wide Area Networks", February 2003. URL "http://www- | HighSpeed Wide Area Networks", February 2003. URL | |||
lce.eng.cam.ac.uk/~ctk21/scalable/". | "http://www-lce.eng.cam.ac.uk/~ctk21/scalable/". | |||
[KHR02] Dina Katabi, Mark Handley, and Charlie Rohrs, Congestion | [KHR02] Dina Katabi, Mark Handley, and Charlie Rohrs, "Congestion | |||
Control for High Bandwidth-Delay Product Networks, SIGCOMM 2002. | Control for High Bandwidth-Delay Product Networks", | |||
SIGCOMM 2002. | ||||
[M02] Matt Mathis, "Raising the Internet MTU", Web Page, URL | [M02] Matt Mathis, "Raising the Internet MTU", Web Page, URL | |||
"http://www.psc.edu/~mathis/MTU/". | "http://www.psc.edu/~mathis/MTU/". | |||
[Net100] The DOE/MICS Net100 project. URL | [Net100] The DOE/MICS Net100 project. URL | |||
"http://www.csm.ornl.gov/~dunigan/net100/". | "http://www.csm.ornl.gov/~dunigan/net100/". | |||
[NS] The NS Simulator, "http://www.isi.edu/nsnam/ns/". | [NS] The NS Simulator, "http://www.isi.edu/nsnam/ns/". | |||
[RFC 1323] V. Jacobson, R. Braden, and D. Borman, TCP Extensions for | [RFC 1323] Jacobson, V., Braden, R. and D. Borman, "TCP Extensions | |||
High Performance, RFC 1323, May 1992. | for High Performance", RFC 1323, May 1992. | |||
[RFC3390] Allman, M., Floyd, S., and Partridge, C., "Increasing | [RFC3390] Allman, M., Floyd, S. and C., Partridge, "Increasing TCP's | |||
TCP's Initial Window", RFC 3390, October 2002. | Initial Window", RFC 3390, October 2002. | |||
[RFC3448] Mark Handley, Jitendra Padhye, Sally Floyd, and Joerg | [RFC3448] Handley, M., Padhye, J., Floyd, S. and J. Widmer, "TCP | |||
Widmer, TCP Friendly Rate Control (TFRC): Protocol Specification, | Friendly Rate Control (TFRC): Protocol Specification", RFC | |||
RFC 3448, January 2003. | 3448, January 2003. | |||
[SA03] Souza, E., and Agarwal, D.A., A HighSpeed TCP Study: | [SA03] Souza, E. and D.A., Agarwal, "A HighSpeed TCP Study: | |||
Characteristics and Deployment Issues, LBNL Technical Report | Characteristics and Deployment Issues", LBNL Technical | |||
LBNL-53215. URL "http://www.icir.org/floyd/hstcp.html". | Report LBNL-53215. URL | |||
"http://www.icir.org/floyd/hstcp.html". | ||||
[S02] Stanislav Shalunov, TCP Armonk, draft, 2002, URL | [S02] Stanislav Shalunov, "TCP Armonk", Work in Progress, 2002, | |||
"http://www.internet2.edu/~shalunov/tcpar/". | URL "http://www.internet2.edu/~shalunov/tcpar/". | |||
[S03] Alex Solan, private communication, 2003. | [S03] Alex Solan, private communication, 2003. | |||
[VMSS] "Web100 at ORNL", Web Page, | [VMSS] "Web100 at ORNL", Web Page, | |||
"http://www.csm.ornl.gov/~dunigan/netperf/web100.html". | "http://www.csm.ornl.gov/~dunigan/netperf/web100.html". | |||
[Web100] The Web100 project. URL "http://www.web100.org/". | [Web100] The Web100 project. URL "http://www.web100.org/". | |||
18. Security Considerations | 18. Security Considerations | |||
This proposal makes no changes to the underlying security of TCP. | This proposal makes no changes to the underlying security of TCP. | |||
19. IANA Considerations | 19. IANA Considerations | |||
There are no IANA considerations regarding this document. | There are no IANA considerations regarding this document. | |||
20. TCP's Loss Event Rate in Steady-State | A. TCP's Loss Event Rate in Steady-State | |||
This section gives the number of round-trip times between congestion | This section gives the number of round-trip times between congestion | |||
events for a TCP flow with D-byte packets, for D=1500, as a function | events for a TCP flow with D-byte packets, for D=1500, as a function | |||
of the connection's average throughput B in bps. To achieve this | of the connection's average throughput B in bps. To achieve this | |||
average throughput B, a TCP connection with round-trip time R in | average throughput B, a TCP connection with round-trip time R in | |||
seconds requires an average congestion window w of BR/(8D) segments. | seconds requires an average congestion window w of BR/(8D) segments. | |||
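As a rough illustration of the relation w = BR/(8D) stated above, the following minimal sketch computes the average congestion window for a few throughputs. The function name, the 100 ms round-trip time, and the chosen throughput values are assumptions made for this example only; D=1500 bytes is taken from the text.

```python
def avg_window_segments(throughput_bps, rtt_s, packet_bytes=1500):
    """Average congestion window w = B*R/(8*D), in segments.

    throughput_bps: average throughput B in bits per second
    rtt_s:          round-trip time R in seconds
    packet_bytes:   packet size D in bytes (1500 in the text)
    """
    return throughput_bps * rtt_s / (8 * packet_bytes)


if __name__ == "__main__":
    rtt_s = 0.1  # assumed round-trip time of 100 ms, for illustration only
    for mbps in (1, 10, 100, 1000, 10000):
        w = avg_window_segments(mbps * 1e6, rtt_s)
        print(f"{mbps:>6} Mbps -> w = {w:.1f} segments")
```

Under these assumptions, a 10 Gbps flow needs an average window of roughly 83,333 segments, since the window grows linearly with both B and R.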
In steady-state, TCP's average congestion window w is roughly | In steady-state, TCP's average congestion window w is roughly | |||
1.2/sqrt(p) segments. This is equivalent to a loss event at most | 1.2/sqrt(p) segments. This is equivalent to a loss event at most | |||
once every 1/p packets, or at most once every 1/(pw) = w/1.5 round- | once every 1/p packets, or at most once every 1/(pw) = w/1.5 round- | |||
skipping to change at page 36, line 33 | skipping to change at page 33, line 38 | |||
$lastrtt = $rtt; | $lastrtt = $rtt; | |||
} | } | |||
$hswin += $aw; | $hswin += $aw; | |||
$regwin += 1; | $regwin += 1; | |||
$rtt ++; | $rtt ++; | |||
} | } | |||
Table 14: Perl Program for computing the window in congestion | Table 14: Perl Program for computing the window in congestion | |||
avoidance. | avoidance. | |||
AUTHORS' ADDRESSES | Author's Address | |||
Sally Floyd | Sally Floyd | |||
Phone: +1 (510) 666-2989 | ||||
ICIR (ICSI Center for Internet Research) | ICIR (ICSI Center for Internet Research) | |||
Email: floyd@acm.org | ||||
Phone: +1 (510) 666-2989 | ||||
EMail: floyd@acm.org | ||||
URL: http://www.icir.org/floyd/ | URL: http://www.icir.org/floyd/ | |||
This draft was created in August 2003. | Full Copyright Statement | |||
Copyright (C) The Internet Society (2003). All Rights Reserved. | ||||
This document and translations of it may be copied and furnished to | ||||
others, and derivative works that comment on or otherwise explain it | ||||
or assist in its implementation may be prepared, copied, published | ||||
and distributed, in whole or in part, without restriction of any | ||||
kind, provided that the above copyright notice and this paragraph are | ||||
included on all such copies and derivative works. However, this | ||||
document itself may not be modified in any way, such as by removing | ||||
the copyright notice or references to the Internet Society or other | ||||
Internet organizations, except as needed for the purpose of | ||||
developing Internet standards in which case the procedures for | ||||
copyrights defined in the Internet Standards process must be | ||||
followed, or as required to translate it into languages other than | ||||
English. | ||||
The limited permissions granted above are perpetual and will not be | ||||
revoked by the Internet Society or its successors or assignees. | ||||
This document and the information contained herein is provided on an | ||||
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING | ||||
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING | ||||
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION | ||||
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF | ||||
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | ||||
Acknowledgement | ||||
Funding for the RFC Editor function is currently provided by the | ||||
Internet Society. | ||||