draft-ietf-mptcp-congestion-02.txt   draft-ietf-mptcp-congestion-03.txt 
Internet Engineering Task Force C. Raiciu Internet Engineering Task Force C. Raiciu
Internet-Draft M. Handley Internet-Draft M. Handley
Intended status: Experimental D. Wischik Intended status: Experimental D. Wischik
Expires: September 15, 2011 University College London Expires: October 13, 2011 University College London
March 14, 2011 April 11, 2011
Coupled Congestion Control for Multipath Transport Protocols Coupled Congestion Control for Multipath Transport Protocols
draft-ietf-mptcp-congestion-02 draft-ietf-mptcp-congestion-03
Abstract Abstract
Often endpoints are connected by multiple paths, but communications Often endpoints are connected by multiple paths, but communications
are usually restricted to a single path per connection. Resource are usually restricted to a single path per connection. Resource
usage within the network would be more efficient were it possible for usage within the network would be more efficient were it possible for
these multiple paths to be used concurrently. Multipath TCP is a these multiple paths to be used concurrently. Multipath TCP is a
proposal to achieve multipath transport in TCP. proposal to achieve multipath transport in TCP.
New congestion control algorithms are needed for multipath transport New congestion control algorithms are needed for multipath transport
skipping to change at page 2, line 13 skipping to change at page 2, line 13
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 15, 2011. This Internet-Draft will expire on October 13, 2011.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the BSD License. described in the BSD License.
Table of Contents Table of Contents
1. Requirements Language . . . . . . . . . . . . . . . . . . . . 4 1. Requirements Language . . . . . . . . . . . . . . . . . . . . 4
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Coupled Congestion Control Algorithm . . . . . . . . . . . . . 5 3. Coupled Congestion Control Algorithm . . . . . . . . . . . . . 6
4. Implementation Considerations . . . . . . . . . . . . . . . . 7 4. Implementation Considerations . . . . . . . . . . . . . . . . 7
4.1. Implementation Considerations when CWND is Expressed 4.1. Implementation Considerations when CWND is Expressed
in Packets . . . . . . . . . . . . . . . . . . . . . . . . 9 in Packets . . . . . . . . . . . . . . . . . . . . . . . . 9
5. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 10 5. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 10
6. Security Considerations . . . . . . . . . . . . . . . . . . . 10 6. Security Considerations . . . . . . . . . . . . . . . . . . . 10
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11
9.1. Normative References . . . . . . . . . . . . . . . . . . . 11 9.1. Normative References . . . . . . . . . . . . . . . . . . . 11
9.2. Informative References . . . . . . . . . . . . . . . . . . 11 9.2. Informative References . . . . . . . . . . . . . . . . . . 11
skipping to change at page 4, line 35 skipping to change at page 4, line 35
Bottleneck fairness is just one requirement multipath congestion Bottleneck fairness is just one requirement multipath congestion
control should meet. The following three goals capture the desirable control should meet. The following three goals capture the desirable
properties of a practical multipath congestion control algorithm: properties of a practical multipath congestion control algorithm:
o Goal 1 (Improve Throughput) A multipath flow should perform at o Goal 1 (Improve Throughput) A multipath flow should perform at
least as well as a single path flow would on the best of the paths least as well as a single path flow would on the best of the paths
available to it. available to it.
o Goal 2 (Do no harm) A multipath flow should not take up more o Goal 2 (Do no harm) A multipath flow should not take up more
capacity on any one of its paths than if it was a single path flow capacity from any of the resources shared by its different paths,
using only that route. This guarantees it will not unduly harm than if it was a single flow using only one of these paths. This
other flows. guarantees it will not unduly harm other flows.
o Goal 3 (Balance congestion) A multipath flow should move as much o Goal 3 (Balance congestion) A multipath flow should move as much
traffic as possible off its most congested paths, subject to traffic as possible off its most congested paths, subject to
meeting the first two goals. meeting the first two goals.
Goals 1 and 2 together ensure fairness at the bottleneck. Goal 3 Goals 1 and 2 together ensure fairness at the bottleneck. Goal 3
captures the concept of resource pooling [WISCHIK]: if each multipath captures the concept of resource pooling [WISCHIK]: if each multipath
flow sends more data through its least congested path, the traffic in flow sends more data through its least congested path, the traffic in
the network will move away from congested areas. This improves the network will move away from congested areas. This improves
robustness and overall throughput, among other things. The way to robustness and overall throughput, among other things. The way to
skipping to change at page 5, line 13 skipping to change at page 5, line 13
We propose an algorithm that couples only the additive increase We propose an algorithm that couples only the additive increase
function of the subflows, and uses unmodified TCP New Reno behavior function of the subflows, and uses unmodified TCP New Reno behavior
in case of a drop. The algorithm relies on the traditional TCP in case of a drop. The algorithm relies on the traditional TCP
mechanisms to detect drops, to retransmit data, etc. mechanisms to detect drops, to retransmit data, etc.
Detecting shared bottlenecks reliably is quite difficult, but is just Detecting shared bottlenecks reliably is quite difficult, but is just
one part of a bigger question. This bigger question is how much one part of a bigger question. This bigger question is how much
bandwidth a multipath user should use in total, even if there is no bandwidth a multipath user should use in total, even if there is no
shared bottleneck. shared bottleneck.
Our solution sets the multipath flow's aggregate bandwidth to be the The congestion controller aims to set the multipath flow's aggregate
same bandwidth a regular TCP flow would get on the best path bandwidth to be the same as a regular TCP flow would get on the best
available to the multipath flow. To estimate the bandwidth of a path available to the multipath flow. To estimate the bandwidth of a
regular TCP flow, the multipath flow estimates loss rates and round regular TCP flow, the multipath flow estimates loss rates and round
trip times and computes the target rate. Then it adjusts the overall trip times and computes the target rate. Then it adjusts the overall
aggresiveness (parameter alpha) to achieve the desired rate. aggresiveness (parameter alpha) to achieve the desired rate.
We note that in cases with low statistical multiplexing (where the While the mechanism above applies always, its effect depends on
multipath flow influences the loss rates on the path) the multipath whether the multipath TCP flow influences or does not influence the
throughput will be strictly higher than a single TCP would get on any link loss rates (high vs. low statistical multiplexing). If MPTCP
of the paths. In particular, if using two idle paths, multipath does not influence link loss rates, MPTCP will get the same
throughput will be sum of the two paths' throughput. throughput as TCP on the best path. In cases with low statistical
multiplexing, where the multipath flow influences the loss rates on
the path, the multipath throughput will be strictly higher than a
single TCP would get on any of the paths. In particular, if using
two idle paths, multipath throughput will be sum of the two paths'
throughput.
This algorithm ensures bottleneck fairness and fairness in the This algorithm ensures bottleneck fairness and fairness in the
broader, network sense. We acknowledge that current TCP fairness broader, network sense. We acknowledge that current TCP fairness
criteria are far from ideal, but a multipath TCP needs to be criteria are far from ideal, but a multipath TCP needs to be
deployable in the current Internet. If needed, new fairness criteria deployable in the current Internet. If needed, new fairness criteria
can be implemented by the same algorithm we propose by appropriately can be implemented by the same algorithm we propose by appropriately
scaling the overall aggressiveness. scaling the overall aggressiveness.
It is intended that the algorithm presented here can be applied to It is intended that the algorithm presented here can be applied to
other multipath transport protocols, such as alternative multipath other multipath transport protocols, such as alternative multipath
skipping to change at page 6, line 35 skipping to change at page 6, line 39
The increase formula takes the minimum between the computed increase The increase formula takes the minimum between the computed increase
for the multipath subflow (first argument to min), and the increase for the multipath subflow (first argument to min), and the increase
TCP would get in the same scenario (the second argument). In this TCP would get in the same scenario (the second argument). In this
way, we ensure that any multipath subflow cannot be more aggressive way, we ensure that any multipath subflow cannot be more aggressive
than a TCP flow in the same circumstances, hence achieving goal 2 (do than a TCP flow in the same circumstances, hence achieving goal 2 (do
no harm). no harm).
"alpha" is a parameter of the algorithm that describes the "alpha" is a parameter of the algorithm that describes the
aggresiveness of the multipath flow. To meet Goal 1 (improve aggresiveness of the multipath flow. To meet Goal 1 (improve
throughput), the value of alpha is chosen such that the aggregate throughput), the value of alpha is chosen such that the aggregate
throughput of the multipath flow is equal to the rate a TCP flow throughput of the multipath flow is equal to the throughput a TCP
would get if it ran on the best path. flow would get if it ran on the best path.
To get an intuition of what the algorithm is trying to do, let's take To get an intuition of what the algorithm is trying to do, let's take
the case where all the subflows have the same round trip time and the case where all the subflows have the same round trip time and
MSS. In this case the algorithm will grow the total window by MSS. In this case the algorithm will grow the total window by
approximately alpha*MSS per RTT. This increase is distributed to the approximately alpha*MSS per RTT. This increase is distributed to the
individual flows according to their instantaneous window size. individual flows according to their instantaneous window size.
Subflow i will increase by alpha*cwnd_i/tot_cwnd segments per RTT. Subflow i will increase by alpha*cwnd_i/tot_cwnd segments per RTT.
Note that, as in standard TCP, when tot_cwnd is large the increase Note that, as in standard TCP, when tot_cwnd is large the increase
may be 0. In this case the increase MUST be set to 1. We discuss may be 0. In this case the increase MUST be set to 1. We discuss
skipping to change at page 7, line 21 skipping to change at page 7, line 25
that are app limited, the computation should consider the minimum that are app limited, the computation should consider the minimum
between flight_size_i and cwnd_i, and flight_size_i and ssthresh_i between flight_size_i and cwnd_i, and flight_size_i and ssthresh_i
where appropriate. where appropriate.
The total throughput of a multipath flow depends on the value of The total throughput of a multipath flow depends on the value of
alpha and the loss rates, maximum segment sizes and round trip times alpha and the loss rates, maximum segment sizes and round trip times
of its paths. Since we require that the total throughput is no worse of its paths. Since we require that the total throughput is no worse
than the throughput a single TCP would get on the best path, it is than the throughput a single TCP would get on the best path, it is
impossible to choose a-priori a single value of alpha that achieves impossible to choose a-priori a single value of alpha that achieves
the desired throughput in every ocasion. Hence, alpha must be the desired throughput in every ocasion. Hence, alpha must be
computed for each multipath flow, based on the observed properties of computed based on the observed properties of the paths.
the paths.
The formula to compute alpha is: The formula to compute alpha is:
cwnd_i cwnd_i
max -------- max --------
i 2 i 2
rtt_i rtt_i
alpha = tot_cwnd * ---------------- alpha = tot_cwnd * ----------------
/ cwnd_i \ 2 / cwnd_i \ 2
| sum ---------| | sum ---------|
skipping to change at page 8, line 9 skipping to change at page 8, line 11
approximate the above calculations using integer arithmetic. approximate the above calculations using integer arithmetic.
Let alpha_scale be an integer. When computing alpha, use alpha_scale Let alpha_scale be an integer. When computing alpha, use alpha_scale
* tot_cwnd instead of tot_cwnd, and do all the operations in integer * tot_cwnd instead of tot_cwnd, and do all the operations in integer
arithmetic. arithmetic.
Then, scale down the increase per ack by alpha_scale. The algorithm Then, scale down the increase per ack by alpha_scale. The algorithm
is: is:
o For each ack received on subflow i, increase cwnd_i by min ( o For each ack received on subflow i, increase cwnd_i by min (
alpha*bytes_acked*mss_i/tot_cwnd/alpha_scale , bytes_acked*mss_i/ (alpha*bytes_acked*mss_i/tot_cwnd)/alpha_scale ,
cwnd_i ) bytes_acked*mss_i/cwnd_i )
Alpha scale denotes the precision we want for computing alpha. Alpha scale denotes the precision we want for computing alpha.
Observe that the errors in computing the numerator or the denominator Observe that the errors in computing the numerator or the denominator
in the formula for alpha are quite small, as the cwnd in bytes is in the formula for alpha are quite small, as the cwnd in bytes is
typically much larger than the RTT (measured in ms). typically much larger than the RTT (measured in ms).
With these changes, all the operations can be done using integer With these changes, all the operations can be done using integer
arithmetic. We propose alpha_scale be a small power of two, to allow arithmetic. We propose alpha_scale be a small power of two, to allow
using faster shift operations instead of multiplication and division. using faster shift operations instead of multiplication and division.
Our experiments show that using alpha_scale=512 works well in a wide Our experiments show that using alpha_scale=512 works well in a wide
skipping to change at page 9, line 24 skipping to change at page 9, line 26
4.1. Implementation Considerations when CWND is Expressed in Packets 4.1. Implementation Considerations when CWND is Expressed in Packets
When the congestion control algorithm maintains cwnd in packets When the congestion control algorithm maintains cwnd in packets
rather than bytes, the algorithms above must change to take into rather than bytes, the algorithms above must change to take into
account path mss. account path mss.
To compute the increase when an ack is received, the implementation To compute the increase when an ack is received, the implementation
for multipath congestion control is a simple extension of the TCP New for multipath congestion control is a simple extension of the TCP New
Reno code. In TCP New Reno cwnd_cnt is an additional state variable Reno code. In TCP New Reno cwnd_cnt is an additional state variable
that tracks the number of bytes acked since the last cwnd increment; that tracks the number of segments acked since the last cwnd
cwnd is incremented only when cwnd_cnt > cwnd; then cwnd_cnt is set increment; cwnd is incremented only when cwnd_cnt > cwnd; then
to 0. cwnd_cnt is set to 0.
In the multipath case, cwnd_cnt_i is maintained for each subflow as In the multipath case, cwnd_cnt_i is maintained for each subflow as
above, and cwnd_i is increased by 1 when cwnd_cnt_i > alpha_scale * above, and cwnd_i is increased by 1 when cwnd_cnt_i > max(alpha_scale
tot_cwnd / alpha. * tot_cwnd / alpha, cwnd_i).
When computing alpha for packet-based stacks, the errors in computing When computing alpha for packet-based stacks, the errors in computing
the terms in the denominator are larger (this is because cwnd is much the terms in the denominator are larger (this is because cwnd is much
smaller and rtt may be comparatively large). Let max be the index of smaller and rtt may be comparatively large). Let max be the index of
the subflow used in the numerator. To reduce errors, it is easiest the subflow used in the numerator. To reduce errors, it is easiest
to move rtt_max (once calculated) from the numerator to the to move rtt_max (once calculated) from the numerator to the
denominator, obtaining the equivalent formula below. denominator, obtaining the equivalent formula below.
cwnd_max cwnd_max
alpha = alpha_scale * tot_cwnd * ----------------------- alpha = alpha_scale * tot_cwnd * -----------------------
 End of changes. 12 change blocks. 
27 lines changed or deleted 31 lines changed or added

This html diff was produced by rfcdiff 1.41. The latest version is available from http://tools.ietf.org/tools/rfcdiff/