draft-ietf-tsvwg-byte-pkt-congest-08.txt | draft-ietf-tsvwg-byte-pkt-congest-09.txt | |||
---|---|---|---|---|
Transport Area Working Group B. Briscoe | Transport Area Working Group B. Briscoe | |||
Internet-Draft BT | Internet-Draft BT | |||
Updates: 2309 (if approved) J. Manner | Updates: 2309 (if approved) J. Manner | |||
Intended status: BCP Aalto University | Intended status: BCP Aalto University | |||
Expires: February 14, 2013 August 13, 2012 | Expires: May 11, 2013 November 7, 2012 | |||
Byte and Packet Congestion Notification | Byte and Packet Congestion Notification | |||
draft-ietf-tsvwg-byte-pkt-congest-08 | draft-ietf-tsvwg-byte-pkt-congest-09 | |||
Abstract | Abstract | |||
This document provides recommendations of best current practice for | This document provides recommendations of best current practice for | |||
dropping or marking packets using active queue management (AQM) such | dropping or marking packets using active queue management (AQM) such | |||
as random early detection (RED) or pre-congestion notification (PCN). | as random early detection (RED) or pre-congestion notification (PCN). | |||
We give three strong recommendations: (1) packet size should be taken | We give three strong recommendations: (1) packet size should be taken | |||
into account when transports read and respond to congestion | into account when transports read and respond to congestion | |||
indications, (2) packet size should not be taken into account when | indications, (2) packet size should not be taken into account when | |||
network equipment creates congestion signals (marking, dropping), and | network equipment creates congestion signals (marking, dropping), and | |||
skipping to change at page 1, line 41 | skipping to change at page 1, line 41 | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on February 14, 2013. | This Internet-Draft will expire on May 11, 2013. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2012 IETF Trust and the persons identified as the | Copyright (c) 2012 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 3, line 16 | skipping to change at page 3, line 16 | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.1. Terminology and Scoping . . . . . . . . . . . . . . . . . 6 | 1.1. Terminology and Scoping . . . . . . . . . . . . . . . . . 6 | |||
1.2. Example Comparing Packet-Mode Drop and Byte-Mode Drop . . 7 | 1.2. Example Comparing Packet-Mode Drop and Byte-Mode Drop . . 7 | |||
2. Recommendations . . . . . . . . . . . . . . . . . . . . . . . 8 | 2. Recommendations . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
2.1. Recommendation on Queue Measurement . . . . . . . . . . . 9 | 2.1. Recommendation on Queue Measurement . . . . . . . . . . . 9 | |||
2.2. Recommendation on Encoding Congestion Notification . . . . 9 | 2.2. Recommendation on Encoding Congestion Notification . . . . 9 | |||
2.3. Recommendation on Responding to Congestion . . . . . . . . 10 | 2.3. Recommendation on Responding to Congestion . . . . . . . . 10 | |||
2.4. Recommendation on Handling Congestion Indications when | 2.4. Recommendation on Handling Congestion Indications when | |||
Splitting or Merging Packets . . . . . . . . . . . . . . . 11 | Splitting or Merging Packets . . . . . . . . . . . . . . . 11 | |||
3. Motivating Arguments . . . . . . . . . . . . . . . . . . . . . 11 | 3. Motivating Arguments . . . . . . . . . . . . . . . . . . . . . 12 | |||
3.1. Avoiding Perverse Incentives to (Ab)use Smaller Packets . 12 | 3.1. Avoiding Perverse Incentives to (Ab)use Smaller Packets . 12 | |||
3.2. Small != Control . . . . . . . . . . . . . . . . . . . . . 13 | 3.2. Small != Control . . . . . . . . . . . . . . . . . . . . . 13 | |||
3.3. Transport-Independent Network . . . . . . . . . . . . . . 13 | 3.3. Transport-Independent Network . . . . . . . . . . . . . . 13 | |||
3.4. Scaling Congestion Control with Packet Size . . . . . . . 14 | 3.4. Partial Deployment of AQM . . . . . . . . . . . . . . . . 15 | |||
3.5. Implementation Efficiency . . . . . . . . . . . . . . . . 16 | 3.5. Implementation Efficiency . . . . . . . . . . . . . . . . 16 | |||
4. A Survey and Critique of Past Advice . . . . . . . . . . . . . 16 | 4. A Survey and Critique of Past Advice . . . . . . . . . . . . . 16 | |||
4.1. Congestion Measurement Advice . . . . . . . . . . . . . . 16 | 4.1. Congestion Measurement Advice . . . . . . . . . . . . . . 17 | |||
4.1.1. Fixed Size Packet Buffers . . . . . . . . . . . . . . 17 | 4.1.1. Fixed Size Packet Buffers . . . . . . . . . . . . . . 17 | |||
4.1.2. Congestion Measurement without a Queue . . . . . . . . 18 | 4.1.2. Congestion Measurement without a Queue . . . . . . . . 18 | |||
4.2. Congestion Notification Advice . . . . . . . . . . . . . . 19 | 4.2. Congestion Notification Advice . . . . . . . . . . . . . . 19 | |||
4.2.1. Network Bias when Encoding . . . . . . . . . . . . . . 19 | 4.2.1. Network Bias when Encoding . . . . . . . . . . . . . . 19 | |||
4.2.2. Transport Bias when Decoding . . . . . . . . . . . . . 21 | 4.2.2. Transport Bias when Decoding . . . . . . . . . . . . . 21 | |||
4.2.3. Making Transports Robust against Control Packet | 4.2.3. Making Transports Robust against Control Packet | |||
Losses . . . . . . . . . . . . . . . . . . . . . . . . 22 | Losses . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
4.2.4. Congestion Notification: Summary of Conflicting | 4.2.4. Congestion Notification: Summary of Conflicting | |||
Advice . . . . . . . . . . . . . . . . . . . . . . . . 22 | Advice . . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
5. Outstanding Issues and Next Steps . . . . . . . . . . . . . . 24 | 5. Outstanding Issues and Next Steps . . . . . . . . . . . . . . 24 | |||
5.1. Bit-congestible Network . . . . . . . . . . . . . . . . . 24 | 5.1. Bit-congestible Network . . . . . . . . . . . . . . . . . 24 | |||
5.2. Bit- & Packet-congestible Network . . . . . . . . . . . . 24 | 5.2. Bit- & Packet-congestible Network . . . . . . . . . . . . 24 | |||
6. Security Considerations . . . . . . . . . . . . . . . . . . . 24 | 6. Security Considerations . . . . . . . . . . . . . . . . . . . 25 | |||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 | |||
8. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 25 | 8. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 26 | |||
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 26 | 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 27 | |||
10. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 27 | 10. Comments Solicited . . . . . . . . . . . . . . . . . . . . . . 27 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 27 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 27 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . . 27 | 11.1. Normative References . . . . . . . . . . . . . . . . . . . 27 | |||
11.2. Informative References . . . . . . . . . . . . . . . . . . 27 | 11.2. Informative References . . . . . . . . . . . . . . . . . . 27 | |||
Appendix A. Survey of RED Implementation Status . . . . . . . . . 31 | Appendix A. Survey of RED Implementation Status . . . . . . . . . 31 | |||
Appendix B. Sufficiency of Packet-Mode Drop . . . . . . . . . . . 32 | Appendix B. Sufficiency of Packet-Mode Drop . . . . . . . . . . . 32 | |||
B.1. Packet-Size (In)Dependence in Transports . . . . . . . . . 33 | B.1. Packet-Size (In)Dependence in Transports . . . . . . . . . 33 | |||
B.2. Bit-Congestible and Packet-Congestible Indications . . . . 36 | B.2. Bit-Congestible and Packet-Congestible Indications . . . . 36 | |||
Appendix C. Byte-mode Drop Complicates Policing Congestion | Appendix C. Byte-mode Drop Complicates Policing Congestion | |||
Response . . . . . . . . . . . . . . . . . . . . . . 37 | Response . . . . . . . . . . . . . . . . . . . . . . 38 | |||
Appendix D. Changes from Previous Versions . . . . . . . . . . . 38 | Appendix D. Changes from Previous Versions . . . . . . . . . . . 39 | |||
1. Introduction | 1. Introduction | |||
This memo concerns how we should correctly scale congestion control | This memo concerns how we should correctly scale congestion control | |||
functions with packet size for the long term. It also recognises | functions with packet size for the long term. It also recognises | |||
that expediency may be necessary to deal with existing widely | that expediency may be necessary to deal with existing widely | |||
deployed protocols that don't live up to the long term goal. | deployed protocols that don't live up to the long term goal. | |||
When notifying congestion, the problem of how (and whether) to take | When notifying congestion, the problem of how (and whether) to take | |||
packet sizes into account has exercised the minds of researchers and | packet sizes into account has exercised the minds of researchers and | |||
skipping to change at page 10, line 5 | skipping to change at page 10, line 5 | |||
necessary to drop every packet with probability 0.1% without regard | necessary to drop every packet with probability 0.1% without regard | |||
to the size of each packet. | to the size of each packet. | |||
This approach ensures the network layer offers sufficient congestion | This approach ensures the network layer offers sufficient congestion | |||
information for all known and future transport protocols and also | information for all known and future transport protocols and also | |||
ensures no perverse incentives are created that would encourage | ensures no perverse incentives are created that would encourage | |||
transports to use inappropriately small packet sizes. | transports to use inappropriately small packet sizes. | |||
What this advice means for the case of RED: | What this advice means for the case of RED: | |||
1. AQM algorithms such as RED SHOULD NOT use byte-mode drop, which | 1. AQM algorithms such as RED SHOULD use packet-mode drop, i.e. they | |||
deflates RED's drop probability for smaller packet sizes. RED's | SHOULD NOT use byte-mode drop. The latter is more complex, it | |||
byte-mode drop has no enduring advantages. It is more complex, | creates the perverse incentive to fragment segments into tiny | |||
it creates the perverse incentive to fragment segments into tiny | pieces and it is vulnerable to floods of small packets. | |||
pieces and it reopens the vulnerability to floods of small- | ||||
packets that drop-tail queues suffered from and AQM was designed | ||||
to remove. | ||||
2. If a vendor has implemented byte-mode drop, and an operator has | 2. If a vendor has implemented byte-mode drop, and an operator has | |||
turned it on, it is RECOMMENDED to turn it off. Note that RED as | turned it on, it is RECOMMENDED to turn it off, after | |||
a whole SHOULD NOT be turned off, as without it, a drop tail | establishing if there are any implications on the relative | |||
queue also biases against large packets. But note also that | performance of applications using different packet sizes. | |||
turning off byte-mode drop may alter the relative performance of | RED as a whole SHOULD NOT be turned off. Without RED, a drop | |||
applications using different packet sizes, so it would be | tail queue biases against large packets and is vulnerable to | |||
advisable to establish the implications before turning it off. | floods of small packets. | |||
Note well that RED's byte-mode queue drop is completely | Note well that RED's byte-mode queue drop is completely orthogonal to | |||
orthogonal to byte-mode queue measurement and should not be | byte-mode queue measurement and should not be confused with it. If a | |||
confused with it. If a RED implementation has a byte-mode but | RED implementation has a byte-mode but does not specify what sort of | |||
does not specify what sort of byte-mode, it is most probably | byte-mode, it is most probably byte-mode queue measurement, which is | |||
byte-mode queue measurement, which is fine. However, if in | fine. However, if in doubt, the vendor should be consulted. | |||
doubt, the vendor should be consulted. | ||||
A survey (Appendix A) showed that there appears to be little, if any, | A survey (Appendix A) showed that there appears to be little, if any, | |||
installed base of the byte-mode drop variant of RED. This suggests | installed base of the byte-mode drop variant of RED. This suggests | |||
that deprecating byte-mode drop will have little, if any, incremental | that deprecating byte-mode drop will have little, if any, incremental | |||
deployment impact. | deployment impact. | |||
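The difference between the two drop modes can be sketched in a few lines. This is an illustration only, not the formula of any particular RED implementation: it assumes that byte-mode drop scales a base drop probability by the ratio of packet size to the local MTU, and the function names, the 60 and 1500 octet sizes and the default MTU are all hypothetical.

```python
def packet_mode_drop_prob(p_base: float, size_bytes: int) -> float:
    """Packet-mode drop: the probability ignores packet size."""
    return p_base


def byte_mode_drop_prob(p_base: float, size_bytes: int,
                        mtu_bytes: int = 1500) -> float:
    """Byte-mode drop (assumed form): the base probability is scaled
    by packet size relative to the local MTU, so smaller packets see
    a deflated drop probability."""
    return p_base * size_bytes / mtu_bytes


# Compare a small and a full-size packet at a 0.1% base probability.
for size in (60, 1500):
    print(size,
          packet_mode_drop_prob(0.001, size),
          byte_mode_drop_prob(0.001, size))
```

Under this assumed formula a 60 octet packet is 25 times less likely to be dropped than a 1500 octet packet in byte mode, which is the bias that rewards fragmenting data into tiny packets.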
2.3. Recommendation on Responding to Congestion | 2.3. Recommendation on Responding to Congestion | |||
When a transport detects that a packet has been lost or congestion | When a transport detects that a packet has been lost or congestion | |||
marked, it SHOULD consider the strength of the congestion indication | marked, it SHOULD consider the strength of the congestion indication | |||
as proportionate to the size in octets (bytes) of the missing or | as proportionate to the size in octets (bytes) of the missing or | |||
marked packet. | marked packet. | |||
In other words, when a packet indicates congestion (by being lost or | In other words, when a packet indicates congestion (by being lost or | |||
marked) it can be considered conceptually as if there is a congestion | marked) it can be considered conceptually as if there is a congestion | |||
indication on every octet of the packet, not just one indication per | indication on every octet of the packet, not just one indication per | |||
packet. | packet. | |||
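One way a transport might apply this interpretation is to count congestion in octets rather than in packets. The following is a minimal sketch under that assumption; the class and method names are hypothetical, and it only measures the strength of the congestion indications without prescribing any rate response.

```python
from dataclasses import dataclass


@dataclass
class ByteCongestionMeter:
    """Tracks the fraction of octets that carried a congestion indication."""
    bytes_seen: int = 0
    bytes_congested: int = 0

    def on_packet(self, size_bytes: int, congested: bool) -> None:
        # Every octet of a lost or marked packet counts as an indication.
        self.bytes_seen += size_bytes
        if congested:
            self.bytes_congested += size_bytes

    def congestion_level(self) -> float:
        """Congested octets divided by all octets observed so far."""
        if self.bytes_seen == 0:
            return 0.0
        return self.bytes_congested / self.bytes_seen


meter = ByteCongestionMeter()
meter.on_packet(1500, congested=True)   # one large marked packet
meter.on_packet(60, congested=False)    # one small unmarked packet
print(meter.congestion_level())
```

A per-packet count would report half of these two packets as congested, whereas the per-octet count reports 1500 of 1560 octets.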
Therefore, the IETF transport area should continue its programme of: | To be clear, the above recommendation solely describes how a | |||
transport should interpret the meaning of a congestion indication. | ||||
It makes no recommendation on whether a transport should act | ||||
differently based on this interpretation. It merely aids | ||||
interoperability between transports, if they choose to make their | ||||
actions depend on the strength of congestion indications. | ||||
This definition will be useful as the IETF transport area | ||||
continues its programme of: | ||||
o updating host-based congestion control protocols to take account | o updating host-based congestion control protocols to take account | |||
of packet size | of packet size | |||
o making transports less sensitive to losing control packets like | o making transports less sensitive to losing control packets like | |||
SYNs and pure ACKs. | SYNs and pure ACKs. | |||
What this advice means for the case of TCP: | What this advice means for the case of TCP: | |||
1. If two TCP flows with different packet sizes are required to run | 1. If two TCP flows with different packet sizes are required to run | |||
at equal bit rates under the same path conditions, this should be | at equal bit rates under the same path conditions, this should be | |||
done by altering TCP (Section 4.2.2), not network equipment (the | done by altering TCP (Section 4.2.2), not network equipment (the | |||
latter affects other transports besides TCP). | latter affects other transports besides TCP). | |||
2. If it is desired to improve TCP performance by reducing the | 2. If it is desired to improve TCP performance by reducing the | |||
chance that a SYN or a pure ACK will be dropped, this should be | chance that a SYN or a pure ACK will be dropped, this should be | |||
done by modifying TCP (Section 4.2.3), not network equipment. | done by modifying TCP (Section 4.2.3), not network equipment. | |||
To be clear, we are not recommending at all that TCPs under | ||||
equivalent conditions should aim for equal bit-rates. We are merely | ||||
saying that anyone trying to do such a thing should modify their TCP | ||||
algorithm, not the network. | ||||
2.4. Recommendation on Handling Congestion Indications when Splitting | 2.4. Recommendation on Handling Congestion Indications when Splitting | |||
or Merging Packets | or Merging Packets | |||
Packets carrying congestion indications may be split or merged in | Packets carrying congestion indications may be split or merged in | |||
some circumstances (e.g. at an RTCP transcoder or during IP fragment | some circumstances (e.g. at an RTCP transcoder or during IP fragment | |||
reassembly). Splitting and merging only make sense in the context of | reassembly). Splitting and merging only make sense in the context of | |||
ECN, not loss. | ECN, not loss. | |||
The general rule to follow is that the number of octets in packets | The general rule to follow is that the number of octets in packets | |||
with congestion indications SHOULD be equivalent before and after | with congestion indications SHOULD be equivalent before and after | |||
skipping to change at page 13, line 40 | skipping to change at page 13, line 48 | |||
packets (see 'Making Transports Robust against Control Packet Losses' | packets (see 'Making Transports Robust against Control Packet Losses' | |||
in Section 4.2.3). | in Section 4.2.3). | |||
3.3. Transport-Independent Network | 3.3. Transport-Independent Network | |||
TCP congestion control ensures that flows competing for the same | TCP congestion control ensures that flows competing for the same | |||
resource each maintain the same number of segments in flight, | resource each maintain the same number of segments in flight, | |||
irrespective of segment size. So under similar conditions, flows | irrespective of segment size. So under similar conditions, flows | |||
with different segment sizes will get different bit-rates. | with different segment sizes will get different bit-rates. | |||
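The arithmetic behind this effect is simple: with the same number of segments in flight over the same round-trip time, the bit-rate is proportional to segment size. A sketch with illustrative numbers follows; the function name, the window of 10 segments and the 100 ms round-trip time are assumptions chosen for the example, not values from this document.

```python
def bit_rate_bps(segments_in_flight: int, segment_bytes: int,
                 rtt_seconds: float) -> float:
    """Approximate bit-rate of a window-limited flow: one window of
    segments is delivered per round-trip time."""
    return segments_in_flight * segment_bytes * 8 / rtt_seconds


# Two flows with the same window and RTT but different segment sizes.
large = bit_rate_bps(10, 1500, 0.1)
small = bit_rate_bps(10, 60, 0.1)
print(large, small, large / small)
```

The ratio of the two bit-rates equals the ratio of the segment sizes, whatever window and round-trip time are chosen.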
One motivation for the network biasing congestion notification by | To counter this effect it seems tempting not to follow our | |||
packet size is to counter this effect and try to equalise the bit- | recommendation, and instead for the network to bias congestion | |||
rates of flows with different packet sizes. However, in order to do | notification by packet size in order to equalise the bit-rates of | |||
this, the queuing algorithm has to make assumptions about the | flows with different packet sizes. However, in order to do this, the | |||
transport, which become embedded in the network. Specifically: | queuing algorithm has to make assumptions about the transport, which | |||
become embedded in the network. Specifically: | ||||
o The queuing algorithm has to assume how aggressively the transport | o The queuing algorithm has to assume how aggressively the transport | |||
will respond to congestion (see Section 4.2.4). If the network | will respond to congestion (see Section 4.2.4). If the network | |||
assumes the transport responds as aggressively as TCP NewReno, it | assumes the transport responds as aggressively as TCP NewReno, it | |||
will be wrong for Compound TCP and differently wrong for Cubic | will be wrong for Compound TCP and differently wrong for Cubic | |||
TCP, etc. To achieve equal bit-rates, each transport then has to | TCP, etc. To achieve equal bit-rates, each transport then has to | |||
guess what assumption the network made, and work out how to | guess what assumption the network made, and work out how to | |||
replace this assumed aggressiveness with its own aggressiveness. | replace this assumed aggressiveness with its own aggressiveness. | |||
o Also, if the network biases congestion notification by packet size | o Also, if the network biases congestion notification by packet size | |||
it has to assume a baseline packet size--all proposed algorithms | it has to assume a baseline packet size--all proposed algorithms | |||
use the local MTU. Then transports have to guess which link was | use the local MTU (for example see the byte-mode loss probability | |||
congested and what its local MTU was, in order to know how to | formula in Table 1). Then if the non-Reno transports mentioned | |||
tailor their congestion response to that link. | above are trying to reverse engineer what the network assumed, | |||
they also have to guess the MTU of the congested link. | ||||
Even though reducing the drop probability of small packets (e.g. | Even though reducing the drop probability of small packets (e.g. | |||
RED's byte-mode drop) helps ensure TCP flows with different packet | RED's byte-mode drop) helps ensure TCP flows with different packet | |||
sizes will achieve similar bit rates, we argue this correction should | sizes will achieve similar bit rates, we argue this correction should | |||
be made to any future transport protocols based on TCP, not to the | be made to any future transport protocols based on TCP, not to the | |||
network in order to fix one transport, no matter how predominant it | network in order to fix one transport, no matter how predominant it | |||
is. Effectively, favouring small packets is reverse engineering of | is. Effectively, favouring small packets is reverse engineering of | |||
network equipment around one particular transport protocol (TCP), | network equipment around one particular transport protocol (TCP), | |||
contrary to the excellent advice in [RFC3426], which asks designers | contrary to the excellent advice in [RFC3426], which asks designers | |||
to question "Why are you proposing a solution at this layer of the | to question "Why are you proposing a solution at this layer of the | |||
skipping to change at page 14, line 41 | skipping to change at page 15, line 5 | |||
scenarios. | scenarios. | |||
When the network does not take account of packet size, it allows | When the network does not take account of packet size, it allows | |||
transport protocols to choose whether to take account of packet size | transport protocols to choose whether to take account of packet size | |||
or not. However, if the network were to bias congestion notification | or not. However, if the network were to bias congestion notification | |||
by packet size, transport protocols would have no choice; those that | by packet size, transport protocols would have no choice; those that | |||
did not take account of packet size themselves would unwittingly | did not take account of packet size themselves would unwittingly | |||
become dependent on packet size, and those that already took account | become dependent on packet size, and those that already took account | |||
of packet size would end up taking account of it twice. | of packet size would end up taking account of it twice. | |||
3.4. Scaling Congestion Control with Packet Size | 3.4. Partial Deployment of AQM | |||
Having so far justified only our recommendations for the network, | In overview, the argument in this section runs as follows: | |||
this section focuses on the host. We construct a scaling argument to | ||||
justify the recommendation that a host should respond to a dropped or | ||||
marked packet in proportion to its size, not just as a single | ||||
congestion event. | ||||
The argument assumes that we have already sufficiently justified our | o Because the network does not and cannot always drop packets in | |||
recommendation that the network should not take account of packet | proportion to their size, it shouldn't be given the task of making | |||
size. | drop signals depend on packet size at all. | |||
Also, we assume bit-congestible links are the predominant source of | o Transports on the other hand don't always want to make their rate | |||
congestion. As the Internet stands, it is hard if not impossible to | response proportional to the size of dropped packets, but if they | |||
know whether congestion notification is from a bit-congestible or a | want to, they always can. | |||
packet-congestible resource (see Appendix B.2) so we have to assume | ||||
the most prevalent case (see Section 1.1). If this assumption is | ||||
wrong, and particular congestion indications are actually due to | ||||
overload of packet-processing, there is no issue of safety at stake. | ||||
Any congestion control that triggers a multiplicative decrease in | ||||
response to a congestion indication will bring packet processing back | ||||
to its operating point just as quickly. The only issue at stake is | ||||
that the resource could be utilised more efficiently if packet- | ||||
congestion could be separately identified. | ||||
Imagine a bit-congestible link shared by many flows, so that each | The argument is similar to the end-to-end argument that says "Don't | |||
busy period tends to cause packets to be lost from different flows. | do X in the network if end-systems can do X by themselves, and they | |||
Consider further two sources that have the same data rate but break | want to be able to choose whether to do X anyway." Actually the | |||
the load into large packets in one application (A) and small packets | following argument is stronger; in addition it says "Don't give the | |||
in the other (B). Of course, because the load is the same, there | network task X that could be done by the end-systems, if X is not | |||
will be proportionately more packets in the small packet flow (B). | deployed on all network nodes, and end-systems won't be able to tell | |||
whether their network is doing X, or whether they need to do X | ||||
themselves." In this case, the X in question is "making the response | ||||
to congestion depend on packet size". | ||||
If a congestion control scales with packet size it should respond in | We will now re-run this argument taking each step in more depth. The | |||
the same way to the same congestion notification, irrespective of the | argument applies solely to drop, not to ECN marking. | |||
size of the packets containing the bytes that contribute to | ||||
congestion. | ||||
A bit-congestible queue suffering congestion has to drop or mark the | A queue drops packets for either of two reasons: a) to signal to host | |||
same excess bytes whether they are in a few large packets (A) or many | congestion controls that they should reduce the load and b) because | |||
small packets (B). So for the same amount of congestion overload, | there is no buffer left to store the packets. Active queue | |||
the same amount of bytes has to be shed to get the load back to its | management tries to use drops as a signal for hosts to slow down | |||
operating point. For smaller packets (B) more packets will have to | (case a) so that drop due to buffer exhaustion (case b) should not be | |||
be discarded to shed the same bytes. | necessary. | |||
If both the transports interpret each drop/mark as a single loss | AQM is not universally deployed in every queue in the Internet; many | |||
event irrespective of the size of the packet dropped, the flow of | cheap Ethernet bridges, software firewalls, NATs on consumer devices, | |||
smaller packets (B) will respond more times to the same congestion. | etc. implement simple tail-drop buffers. Even if AQM were universal, | |||
On the other hand, if a transport responds proportionately less when | it has to be able to cope with buffer exhaustion (by switching to a | |||
smaller packets are dropped/marked, overall it will be able to | behaviour like tail-drop), in order to cope with unresponsive or | |||
respond the same to the same amount of congestion. | excessive transports. For these reasons networks will sometimes be | |||
dropping packets as a last resort (case b) rather than under AQM | ||||
control (case a). | ||||
Therefore, for a congestion control to scale with packet size it | When buffers are exhausted (case b), they don't naturally drop | |||
should respond to dropped or marked bytes (as TFRC-SP [RFC4828] | packets in proportion to their size. The network can only reduce the | |||
effectively does), instead of dropped or marked packets (as TCP | probability of dropping smaller packets if it has enough space to | |||
does). | store them somewhere while it waits for a larger packet that it can | |||
drop. If the buffer is exhausted, it does not have this choice. | ||||
Admittedly tail-drop does naturally drop somewhat fewer small | ||||
packets, but exactly how few depends more on the mix of sizes than | ||||
the size of the packet in question. Nonetheless, in general, if we | ||||
wanted networks to do size-dependent drop, we would need universal | ||||
deployment of (packet-size dependent) AQM code, which is currently | ||||
unrealistic. | ||||
For the avoidance of doubt, this is not a recommendation that TCP | A host transport cannot know whether any particular drop was a | |||
should be changed so that it scales with packet size. It is a | deliberate signal from an AQM or a sign of a queue shedding packets | |||
recommendation that any future transport protocol proposal should | due to buffer exhaustion. Therefore, because the network cannot | |||
respond to dropped or marked bytes if it wishes to claim that it is | universally do size-dependent drop, it should not do it at all. | |||
scalable. | ||||
Whereas universality is desirable in the network, diversity is | ||||
desirable between different transport layer protocols - some, like | ||||
NewReno TCP [RFC5681], may not choose to make their rate response | ||||
proportionate to the size of each dropped packet, while others will | ||||
(e.g. TFRC-SP [RFC4828]). | ||||
3.5. Implementation Efficiency | 3.5. Implementation Efficiency | |||
Allowing for packet size at the transport rather than in the network | Biasing against large packets typically requires an extra multiply | |||
ensures that neither the network nor the transport needs to do a | and divide in the network (see the example byte-mode drop formula in | |||
multiply operation--multiplication by packet size is effectively | Table 1). Allowing for packet size at the transport rather than in | |||
achieved as a repeated add when the transport adds to its count of | the network ensures that neither the network nor the transport needs | |||
marked bytes as each congestion event is fed to it. This isn't a | to do a multiply operation--multiplication by packet size is | |||
principled reason in itself, but it is a happy consequence of the | effectively achieved as a repeated add when the transport adds to its | |||
other principled reasons. | count of marked bytes as each congestion event is fed to it. Also | |||
the work to do the biasing is spread over many hosts, rather than | ||||
concentrated in just the congested network element. These aren't | ||||
principled reasons in themselves, but they are a happy consequence of | ||||
the other principled reasons. | ||||
4. A Survey and Critique of Past Advice | 4. A Survey and Critique of Past Advice | |||
This section is informative, not normative. | This section is informative, not normative. | |||
The original 1993 paper on RED [RED93] proposed two options for the | The original 1993 paper on RED [RED93] proposed two options for the | |||
RED active queue management algorithm: packet mode and byte mode. | RED active queue management algorithm: packet mode and byte mode. | |||
Packet mode measured the queue length in packets and dropped (or | Packet mode measured the queue length in packets and dropped (or | |||
marked) individual packets with a probability independent of their | marked) individual packets with a probability independent of their | |||
size. Byte mode measured the queue length in bytes and marked an | size. Byte mode measured the queue length in bytes and marked an | |||
skipping to change at page 28, line 38 | skipping to change at page 29, line 4 | |||
[I-D.ietf-avtcore-ecn-for-rtp] Westerlund, M., Johansson, I., | [I-D.ietf-avtcore-ecn-for-rtp] Westerlund, M., Johansson, I., | |||
Perkins, C., O'Hanlon, P., and K. | Perkins, C., O'Hanlon, P., and K. | |||
Carlberg, "Explicit Congestion | Carlberg, "Explicit Congestion | |||
Notification (ECN) for RTP over UDP", | Notification (ECN) for RTP over UDP", | |||
draft-ietf-avtcore-ecn-for-rtp-08 | draft-ietf-avtcore-ecn-for-rtp-08 | |||
(work in progress), May 2012. | (work in progress), May 2012. | |||
[I-D.ietf-conex-concepts-uses] Briscoe, B., Woundy, R., and A. | [I-D.ietf-conex-concepts-uses] Briscoe, B., Woundy, R., and A. | |||
Cooper, "ConEx Concepts and Use | Cooper, "ConEx Concepts and Use | |||
Cases", | Cases", | |||
draft-ietf-conex-concepts-uses-04 | ||||
(work in progress), March 2012. | (work in progress), March 2012. | |||
[IOSArch] Bollapragada, V., White, R., and C. | [IOSArch] Bollapragada, V., White, R., and C. | |||
Murphy, "Inside Cisco IOS Software | Murphy, "Inside Cisco IOS Software | |||
Architecture", Cisco Press: CCIE | Architecture", Cisco Press: CCIE | |||
Professional Development ISBN13: 978- | Professional Development ISBN13: 978- | |||
1-57870-181-0, July 2000. | 1-57870-181-0, July 2000. | |||
[PktSizeEquCC] Vasallo, P., "Variable Packet Size | [PktSizeEquCC] Vasallo, P., "Variable Packet Size | |||
Equation-Based Congestion Control", | Equation-Based Congestion Control", | |||
End of changes. 29 change blocks. | ||||
100 lines changed or deleted | 116 lines changed or added | |||
This html diff was produced by rfcdiff 1.41. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |