IP Performance Working Group                                   M. Mathis
Internet-Draft                                               Google, Inc
Intended status: Experimental                                  A. Morton
Expires: August 18, 2014                                       AT&T Labs
                                                       February 14, 2014

                  Model Based Bulk Performance Metrics
               draft-ietf-ippm-model-based-metrics-02.txt
Abstract

We introduce a new class of model based metrics designed to determine
if an end-to-end Internet path can meet predefined transport
performance targets by applying a suite of IP diagnostic tests to
successive subpaths.  The subpath-at-a-time tests are designed to
accurately detect if any subpath will prevent the full end-to-end path
from meeting the specified target performance.  Each IP diagnostic
test consists of a precomputed traffic pattern and statistical
criteria for evaluating packet delivery.

The IP diagnostic tests are based on traffic patterns that are
precomputed to mimic TCP or another transport protocol over a long
path, but are independent of the actual details of the subpath under
test.  Likewise the success criteria depend on the target performance
and not on the actual performance of the subpath.  This makes the
measurements open loop, eliminating nearly all of the difficulties
encountered by traditional bulk transport metrics.

This document does not fully define the diagnostic tests, but provides
a framework for designing suites of diagnostic tests that are tailored
to confirming the target performance.

By making the tests open loop, we eliminate standard congestion
control equilibrium behavior, which otherwise causes every measured
parameter to be sensitive to every component of the system.  Because
the tests are open loop, various measurable properties become
independent, and potentially subject to an algebra enabling several
important new uses.
Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF).  Note that other groups may also distribute
working documents as Internet-Drafts.  The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 18, 2014.

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the
document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document.  Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.  Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents

   1. Introduction
      1.1. TODO
   2. Terminology
   3. New requirements relative to RFC 2330
   4. Background
      4.1. TCP properties
      4.2. Diagnostic Approach
   5. Common Models and Parameters
      5.1. Target End-to-end parameters
      5.2. Common Model Calculations
      5.3. Parameter Derating
   6. Common testing procedures
      6.1. Traffic generating techniques
         6.1.1. Paced transmission
         6.1.2. Constant window pseudo CBR
         6.1.3. Scanned window pseudo CBR
         6.1.4. Concurrent or channelized testing
         6.1.5. Intermittent Testing
         6.1.6. Intermittent Scatter Testing
      6.2. Interpreting the Results
         6.2.1. Test outcomes
         6.2.2. Statistical criteria for measuring run_length
            6.2.2.1. Alternate criteria for measuring run_length
         6.2.3. Reordering Tolerance
      6.3. Test Qualifications
   7. Diagnostic Tests
      7.1. Basic Data Rate and Run Length Tests
         7.1.1. Run Length at Paced Full Data Rate
         7.1.2. Run Length at Full Data Windowed Rate
         7.1.3. Background Run Length Tests
      7.2. Standing Queue tests
         7.2.1. Congestion Avoidance
         7.2.2. Bufferbloat
         7.2.3. Non excessive loss
         7.2.4. Duplex Self Interference
      7.3. Slowstart tests
         7.3.1. Full Window slowstart test
         7.3.2. Slowstart AQM test
      7.4. Sender Rate Burst tests
      7.5. Combined Tests
         7.5.1. Sustained burst test
         7.5.2. Live Streaming Media
   8. Examples
      8.1. Near serving HD streaming video
      8.2. Far serving SD streaming video
      8.3. Bulk delivery of remote scientific data
   9. Validation
   10. Acknowledgements
   11. Informative References
   Appendix A. Model Derivations
      A.1. Queueless Reno
      A.2. CUBIC
   Appendix B. Complex Queueing
   Appendix C. Version Control
   Authors' Addresses
1. Introduction

Bulk performance metrics evaluate an Internet path's ability to carry
bulk data.  Model based bulk performance metrics rely on mathematical
TCP models to design a targeted diagnostic suite (TDS) of IP
performance tests which can be applied independently to each subpath
of the full end-to-end path.  These targeted diagnostic suites allow
independent tests of subpaths to accurately detect if any subpath will
prevent the full end-to-end path from delivering bulk data at the
specified performance target, independent of the measurement vantage
points or other details of the test procedures used for each
measurement.

The end-to-end target performance is determined by the needs of the
user or application, and is outside the scope of this document.  For
bulk data transport, the primary performance parameter of interest is
the target data rate.  However, since TCP's ability to compensate for
less than ideal network conditions is fundamentally affected by the
Round Trip Time (RTT) and the Maximum Transmission Unit (MTU) of the
entire end-to-end path that the data traverses, these parameters must
also be specified in advance.  They may reflect a specific real path
through the Internet or an idealized path representing a typical user
community.  The target values for these three parameters, Data Rate,
RTT and MTU, inform the mathematical models used to design the TDS.
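To make the role of these parameters concrete, the short sketch below
is illustrative only; the variable names and the example values for
rate, RTT, MTU and header overhead are assumptions of this example,
not values defined by this document.  It shows how the three target
parameters translate into the window a single TCP flow would need in
order to sustain the target data rate:

   # Illustrative sketch: converting the three end-to-end target
   # parameters into the window needed by a single TCP flow.  All
   # names and numbers here are example assumptions.
   import math

   target_data_rate = 3e6    # Target Data Rate: 3 Mb/s of payload
   target_rtt = 0.050        # Target RTT: 50 ms
   target_mtu = 1500         # Target MTU: 1500 Bytes
   header_overhead = 64      # assumed TCP/IP header bytes per packet

   mss = target_mtu - header_overhead          # payload per packet
   window_packets = math.ceil(
       target_data_rate * target_rtt / (8 * mss))

   print(window_packets)     # about 14 packets in flight per RTT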
Each IP diagnostic test in a TDS consists of a precomputed traffic
pattern and statistical criteria for evaluating packet delivery.

Mathematical models are used to design traffic patterns that mimic TCP
or another bulk transport protocol operating at the target data rate,
MTU and RTT over a full range of conditions, including flows that are
bursty at multiple time scales.  The traffic patterns are computed in
advance based on the three target parameters of the end-to-end path
and independent of the properties of individual subpaths.  As much as
possible the measurement traffic is generated deterministically in
ways that minimize the extent to which test methodology, measurement
points, measurement vantage or path partitioning affect the details of
the measurement traffic.

Mathematical models are also used to compute the bounds on the packet
delivery statistics for acceptable IP performance.  Since these
statistics, such as packet loss, are typically aggregated from all
subpaths of the end-to-end path, the end-to-end statistical bounds
need to be apportioned as a separate bound for each subpath.  Note
that links that are expected to be bottlenecks are also expected to
contribute more packet loss and/or delay.  In compensation, other
links have to be constrained to contribute less packet loss and delay.
The criteria for passing each test of a TDS are an apportioned share
of the total bound determined by the mathematical model from the
end-to-end target performance.
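The framework does not prescribe how the apportionment is done.  As a
hedged sketch of one possible policy (proportional shares chosen by
the test designer, with the assumed weights and loss budget below
being inputs to this example only), a per-subpath loss bound could be
derived from the end-to-end bound as follows:

   # Illustrative sketch: apportioning an end-to-end packet loss
   # budget across subpaths.  Proportional shares are only one
   # possible policy.

   def apportion_loss_budget(end_to_end_loss_rate, weights):
       """Return a per-subpath loss bound for each weight.

       weights gives the relative share of the loss budget assigned
       to each subpath (e.g. a larger weight for the expected
       bottleneck).  For small loss rates the end-to-end loss is
       approximately the sum of the per-subpath losses, so the bounds
       below accumulate to the end-to-end bound.
       """
       total = sum(weights)
       return [end_to_end_loss_rate * w / total for w in weights]

   # Example: three subpaths, with the expected bottleneck allowed
   # three times the loss of the two access segments.
   print(apportion_loss_budget(1e-4, [1, 3, 1]))
   # roughly [2e-05, 6e-05, 2e-05]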
In addition to passing or failing, a test can be deemed inconclusive
for a number of reasons, including that the precomputed traffic
pattern was not accurately generated, that the measurement results
were not statistically significant, or that some test preconditions
were not met.

This document describes a framework for deriving traffic patterns and
delivery statistics for model based metrics.  It does not fully
specify any measurement techniques.  Important details such as packet
type-p selection, sampling techniques, vantage selection, etc. are not
specified here.  We imagine Fully Specified Targeted Diagnostic Suites
(FSTDS) that define all of these details.  We use TDS to refer to the
subset of such a specification that is in scope for this document.  A
TDS includes the target parameters, documentation of the models and
assumptions used to derive the diagnostic test parameters,
specifications for the traffic and delivery statistics for the tests
themselves, and a description of a test setup that can be used to
validate the tests and models.

Section 2 defines terminology used throughout this document.

It has been difficult to develop Bulk Transport Capacity [RFC3148]
metrics due to some overlooked requirements described in Section 3 and
some intrinsic problems with using protocols for measurement,
described in Section 4.

In Section 5 we describe the models and common parameters used to
derive the targeted diagnostic suite.  In Section 6 we describe common
testing procedures.  Each subpath is evaluated using a suite of far
simpler and more predictable diagnostic tests described in Section 7.
In Section 8 we present three example TDSes: one that might be
representative of HD video served fairly close to the user, a second
that might be representative of standard video served from a greater
distance, and a third that might be representative of high performance
bulk data delivered over a transcontinental path.

There exists a small risk that the model based metrics themselves
might yield a false pass result, in the sense that every subpath of an
end-to-end path passes every IP diagnostic test and yet a real
application fails to attain the performance target over the end-to-end
path.  If this happens, then the validation procedure described in
Section 9 needs to be used to prove and potentially revise the models.

Future documents will define model based metrics for other traffic
classes and application types, such as real time streaming media.

1.1. TODO

Please send comments on this draft to ippm@ietf.org.  See
http://goo.gl/02tkD for more information including: interim drafts, an
up to date todo list and information on contributing.
2. Terminology

Terminology about paths, etc.  See [RFC2330] and
[I-D.morton-ippm-lmap-path].

[data] sender  Host sending data and receiving ACKs.
[data] receiver  Host receiving data and sending ACKs.
subpath  A portion of the full path.  Note that there is no
   requirement that subpaths be non-overlapping.
Measurement Point  Measurement points as described in
   [I-D.morton-ippm-lmap-path].
test path  A path between two measurement points that includes a
   subpath of the end-to-end path under test, and could include
   infrastructure between the measurement points and the subpath.
[Dominant] Bottleneck  The bottleneck that generally dominates traffic
   statistics for the entire path.  It typically determines a flow's
   self clock timing, packet loss and ECN marking rate.  See
   Section 4.1.
front path  The subpath from the data sender to the dominant
   bottleneck.
back path  The subpath from the dominant bottleneck to the receiver.
return path  The path taken by the ACKs from the data receiver to the
   data sender.
cross traffic  Other, potentially interfering, traffic competing for
   resources (network and/or queue capacity).

Properties determined by the end-to-end path and application.  They
are described in more detail in Section 5.1.

Application Data Rate  General term for the data rate as seen by the
   application above the transport layer.  This is the payload data
   rate, and excludes transport and lower level headers (TCP/IP or
   other protocols) as well as retransmissions and other data that
   does not contribute to the total quantity of data delivered to the
   application.
Link Data Rate  General term for the data rate as seen by the link or
   lower layers.  The link data rate includes transport and IP
   headers, retransmits and other transport layer overhead.  This
   document is agnostic as to whether the link data rate includes or
   excludes framing, MAC, or other lower layer overheads, except that
   they must be treated uniformly.
end-to-end target parameters:  Application or transport performance
   goals for the end-to-end path.  They include the target data rate,
   RTT and MTU described below.
Target Data Rate:  The application data rate, typically the ultimate
   user's performance goal.
Target RTT (Round Trip Time):  The baseline (minimum) RTT of the
   longest end-to-end path over which the application expects to meet
   the target performance.  The ability of TCP and other transport
   protocols to compensate for path problems is generally proportional
   to the number of round trips per second.  The Target RTT determines
   both key parameters of the traffic patterns (e.g. burst sizes) and
   the thresholds on acceptable traffic statistics.  The Target RTT
   must be specified considering authentic packet sizes: MTU sized
   packets on the forward path, ACK sized packets (typically the
   header_overhead) on the return path.
Target MTU (Maximum Transmission Unit):  The maximum MTU supported by
   the end-to-end path over which the application expects to meet the
   target performance.  Assume 1500 Byte packets unless otherwise
   specified.  If some subpath forces a smaller MTU, then it becomes
   the target MTU, and all model calculations and subpath tests must
   use the same smaller MTU.
Effective Bottleneck Data Rate:  This is the bottleneck data rate
   inferred from the ACK stream, by looking at how much data the ACK
   stream reports delivered per unit time.  If the path is thinning
   ACKs or batching packets the effective bottleneck rate can be much
   higher than the average link rate.  See Section 4.1 and Appendix B
   for more details.
[sender | interface] rate:  The burst data rate, constrained by the
   data sender's interfaces.  Today 1 or 10 Gb/s are typical.
Header_overhead:  The IP and TCP header sizes, which are the portion
   of each MTU not available for carrying application payload.
   Without loss of generality this is assumed to be the size of the
   returning acknowledgements (ACKs).  For TCP, the Maximum Segment
   Size (MSS) is the Target MTU minus the header_overhead.
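The Effective Bottleneck Data Rate defined above can be estimated
directly from an ACK stream.  The sketch below is illustrative only:
the (timestamp, cumulative bytes acknowledged) samples are assumed to
come from whatever instrumentation an FSTDS specifies, and the
function name is invented for this example.

   # Illustrative sketch: inferring the effective bottleneck data
   # rate from an ACK stream.  Each sample is (arrival time in
   # seconds, cumulative bytes acknowledged).

   def effective_bottleneck_rate(ack_samples):
       """Return per-ACK delivery rates in bits per second."""
       rates = []
       for (t0, b0), (t1, b1) in zip(ack_samples, ack_samples[1:]):
           if t1 > t0:
               rates.append(8 * (b1 - b0) / (t1 - t0))
       return rates

   # Two ACKs covering 2896 bytes 1 ms apart imply ~23 Mb/s for that
   # interval, even if the long-term average link rate is much lower;
   # this is how ACK thinning or batching inflates the effective
   # bottleneck data rate seen by the sender.
   print(effective_bottleneck_rate([(0.000, 0), (0.001, 2896)]))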
Basic parameters common to models and subpath tests.  They are
described in more detail in Section 5.2.  Note that these are mixed
between application transport performance (excludes headers) and link
IP performance (includes headers).

pipe size  A general term for the number of packets needed in flight
   (the window size) to exactly fill some network path or subpath.
   This is the window size at which queueing normally begins.
target_pipe_size:  The number of packets in flight (the window size)
   needed to exactly meet the target rate, with a single stream and no
   cross traffic, for the specified application target data rate, RTT,
   and MTU.  It is the amount of circulating data required to meet the
   target data rate, and implies the scale of the bursts that the
   network might experience.
run length  A general term for the observed, measured, or specified
   number of packets that are (to be) delivered between losses or ECN
   marks.  Nominally one over the loss or ECN marking probability, if
   losses and marks are independently and identically distributed.
target_run_length  The target_run_length is an estimate of the minimum
   required headway between losses or ECN marks necessary to attain
   the target_data_rate over a path with the specified target_RTT and
   target_MTU, as computed by a mathematical model of TCP congestion
   control.  A reference calculation is shown in Section 5.2 and
   alternatives in Appendix A.
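The normative reference calculation lives in Section 5.2 and is not
reproduced here.  As a rough sketch of the style of model involved
(the specific form and the constant below are assumptions of this
example, patterned on Reno-style loss-rate models, and are not a
restatement of Section 5.2), target_run_length can be derived from the
same three target parameters:

   # Rough sketch only: a Reno-style relationship between the target
   # pipe size and the run length (packets between losses) needed to
   # sustain it.  The form and constant are illustrative assumptions.
   import math

   def sketch_target_run_length(target_data_rate, target_rtt,
                                target_mtu, header_overhead):
       mss = target_mtu - header_overhead
       target_pipe_size = math.ceil(
           target_data_rate * target_rtt / (8 * mss))
       # A Reno-style sawtooth loses roughly one packet per cycle
       # while delivering on the order of the square of the pipe
       # size; the factor of 3 is an illustrative choice.
       return 3 * target_pipe_size ** 2

   # Example: 3 Mb/s over a 50 ms RTT path with 1500 Byte MTU needs a
   # run length on the order of several hundred packets.
   print(sketch_target_run_length(3e6, 0.050, 1500, 64))   # 588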
Ancillary parameters used for some tests:

derating:  Under some conditions the standard models are too
   conservative.  The modeling framework permits some latitude in
   relaxing or derating some test parameters, as described in
   Section 5.3, in exchange for more stringent TDS validation
   procedures, described in Section 9.
subpath_data_rate  The maximum IP data rate supported by a subpath.
   This typically includes TCP/IP overhead, including headers,
   retransmits, etc.
test_path_RTT  The RTT between two measurement points using
   appropriate data and ACK packet sizes.
test_path_pipe  The amount of data necessary to fill a test path.
   Nominally the test path RTT times the subpath_data_rate (which
   should be part of the end-to-end subpath).
test_window  The window necessary to meet the target_rate over a
   subpath.  Typically test_window=target_data_rate*test_RTT/
   (target_MTU - header_overhead).
Tests can be classified into groups according to their applicability.

Capacity tests determine if a network subpath has sufficient capacity
   to deliver the target performance.  As long as the test traffic is
   within the proper envelope for the target end-to-end performance,
   the average packet losses or ECN marks must be below the threshold
   computed by the model.  As such, capacity tests reflect parameters
   that can transition from passing to failing as a consequence of
   cross traffic, additional presented load or the actions of other
   network users.  By definition, capacity tests also consume
   significant network resources (data capacity and/or buffer space),
   and the test schedules must be balanced by their cost.
Monitoring tests are designed to capture the most important aspects of
   a capacity test, but without presenting excessive ongoing load
   themselves.  As such they may miss some details of the network's
   performance, but can serve as a useful reduced-cost proxy for a
   capacity test.
Engineering tests evaluate how network algorithms (such as AQM and
   channel allocation) interact with TCP-style self clocked protocols
   and adaptive congestion control based on packet loss and ECN marks.
   These tests are likely to have complicated interactions with other
   traffic and under some conditions can be inversely sensitive to
   load.  For example a test to verify that an AQM algorithm causes
   ECN marks or packet drops early enough to limit queue occupancy may
   experience a false pass result in the presence of bursty cross
   traffic.  It is important that engineering tests be performed under
   a wide range of conditions, including both in situ and bench
   testing, and over a wide variety of load conditions.  Ongoing
   monitoring is less likely to be useful for engineering tests,
   although sparse in situ testing might be appropriate.
General Terminology:

Targeted Diagnostic Suite (TDS)  A set of IP diagnostic tests designed
   to determine if a subpath can sustain flows at a specific
   target_data_rate over a path that has a target_RTT, using
   target_MTU sized packets.
Fully Specified Targeted Diagnostic Suite (FSTDS)  A TDS together with
   additional specifications such as "type-p", etc., which are out of
   scope for this document but need to be drawn from other standards
   documents.
apportioned  To divide and allocate, as in budgeting packet loss rates
   across multiple subpaths so that they accumulate to less than a
   specified end-to-end loss rate.
open loop  A control theory term used to describe a class of
   techniques where systems that exhibit circular dependencies can be
   analyzed by suppressing some of the dependencies, such that the
   resulting dependency graph is acyclic.
3. New requirements relative to RFC 2330

Model Based Metrics are designed to fulfill some additional
requirements that were not recognized at the time RFC 2330 was written
[RFC2330].  These missing requirements may have significantly
contributed to policy difficulties in the IP measurement space.  Some
additional requirements are:

o  IP metrics must be actionable by the ISP - they have to be
   interpreted in terms of behaviors or properties at the IP or lower
   layers that an ISP can test, repair and verify.
o  Metrics must be vantage point invariant over a significant range of
   measurement point choices, including off path measurement points.
   The only requirements on MP selection should be that the portion of
   the test path that is not under test is effectively ideal (or is
   non-ideal in ways that can be calibrated out of the measurements)
   and that the test RTT between the MPs is below some reasonable
   bound.
o  Metrics must be repeatable by multiple parties with no specialized
   access to MPs or diagnostic infrastructure.  It must be possible
   for different parties to make the same measurement and observe the
   same results.  In particular it is specifically important that both
   a consumer (or their delegate) and the ISP be able to perform the
   same measurement and get the same result.

NB: All of the metric requirements in RFC 2330 should be reviewed and
potentially revised.  If such a document is opened soon enough, this
entire section should be dropped.
4. Background

At the time the IPPM WG was chartered, sound Bulk Transport Capacity
measurement was known to be beyond our capabilities.  In hindsight it
is now clear why it is such a hard problem:

o  TCP is a control system with circular dependencies - everything
   affects performance, including components that are explicitly not
   part of the test.
o  Congestion control is an equilibrium process, such that transport
   protocols change the network (raise the loss probability and/or
   RTT) to conform to their behavior.
o  TCP's ability to compensate for network flaws is directly
   proportional to the number of round trips per second (i.e.
   inversely proportional to the RTT).  As a consequence a flawed link
   may pass a short RTT local test even though it fails when the path
   is extended by a perfect network to some larger RTT.
o  TCP has a meta Heisenberg problem - measurement and cross traffic
   interact in unknown and ill defined ways.  The situation is
   actually worse than the traditional physics problem where you can
   at least estimate the relative momentum of the measurement and
   measured particles.  For network measurement you can not in general
   determine the relative "elasticity" of the measurement traffic and
   cross traffic, so you can not even gauge the relative magnitude of
   their effects on each other.
These properties are a consequence of the equilibrium behavior
intrinsic to how all throughput optimizing protocols interact with the
network.  The protocols rely on control systems based on multiple
network estimators to regulate the quantity of data sent into the
network.  The data in turn alters the network and the properties
observed by the estimators, such that there are circular dependencies
between every component and every property.  Since some of these
estimators are non-linear, the entire system is nonlinear, and any
change anywhere causes difficult-to-predict changes in every
parameter.
Model Based Metrics overcome these problems by forcing the measurement
system to be open loop: the delivery statistics (akin to the network
estimators) do not affect the traffic.  The traffic and traffic
patterns (bursts) are computed on the basis of the target performance.
In order for a network to pass, the resulting delivery statistics and
corresponding network estimators have to be such that they would not
cause the control systems to slow the traffic below the target rate.
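The open loop structure can be summarized in a short skeleton.  This
is purely illustrative (the function names, the outcome strings, and
the simplistic significance check are assumptions of this example, not
definitions from this document): the traffic pattern and the pass
threshold are fixed in advance from the target parameters, and nothing
observed on the subpath alters what is sent.

   # Illustrative open loop test skeleton.  The traffic pattern and
   # the pass threshold are precomputed from the target parameters;
   # nothing measured on the subpath alters the traffic that is sent.

   def run_open_loop_test(pattern, target_run_length, send, observe):
       """Sketch of an open loop capacity test.

       pattern and target_run_length are precomputed from the target
       parameters; send() transmits the pattern exactly as specified;
       observe() returns (packets_delivered, losses, pattern_ok).
       """
       send(pattern)
       delivered, losses, pattern_ok = observe()
       if not pattern_ok or delivered < target_run_length:
           # Traffic was not authentic, or too little data was
           # delivered for the result to be statistically meaningful.
           return "inconclusive"
       measured_run_length = delivered / max(losses, 1)
       return ("pass" if measured_run_length >= target_run_length
               else "fail")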
4.1. TCP properties

TCP and SCTP are self clocked protocols.  The dominant steady state
behavior is to have an approximately fixed quantity of data and
acknowledgements (ACKs) circulating in the network.  The receiver
reports arriving data by returning ACKs to the data sender, and the
data sender typically responds by sending exactly the same quantity of
data back into the network.  The total quantity of data plus the data
represented by ACKs circulating in the network is referred to as the
window.  The mandatory congestion control algorithms incrementally
adjust the window by sending slightly more or less data in response to
each ACK.  The fundamentally important property of this system is that
it is entirely self clocked: the data transmissions are a reflection
of the ACKs that were delivered by the network, and the ACKs are a
reflection of the data arriving from the network.

A number of phenomena can cause bursts of data, even in idealized
networks that are modeled as simple queueing systems.

During slowstart the data rate is doubled on each RTT by sending twice
as much data as was delivered to the receiver on the prior RTT.  For
slowstart to be able to fill such a network the network must be able
to tolerate slowstart bursts up to the full pipe size inflated by the
anticipated window reduction on the first loss or ECN mark.  For
example, with classic Reno congestion control, an optimal slowstart
has to end with a burst that is twice the bottleneck rate for exactly
one RTT in duration.  This burst causes a queue which is exactly equal
to the pipe size (i.e. the window is exactly twice the pipe size) so
when the window is halved in response to the first loss, the new
window will be exactly the pipe size.
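To make the Reno example concrete, the following arithmetic (with
assumed example values for the bottleneck rate, RTT and MSS) shows the
size of the final slowstart burst and of the queue it leaves at the
dominant bottleneck:

   # Illustrative arithmetic for the final slowstart burst under
   # classic Reno.  The rate, RTT and MSS are example assumptions.

   bottleneck_rate = 3e6     # bits per second
   rtt = 0.050               # seconds
   mss = 1436                # payload bytes per packet

   pipe_size = bottleneck_rate * rtt / (8 * mss)    # ~13 packets

   # The last slowstart round runs at twice the bottleneck rate for
   # one RTT, i.e. a window of about 2*pipe_size packets ...
   final_burst = 2 * pipe_size
   # ... of which only pipe_size packets drain during that RTT, so
   # the queue at the dominant bottleneck peaks at about pipe_size
   # packets.
   peak_queue = final_burst - pipe_size

   print(round(final_burst), round(peak_queue))     # 26 13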
Note that if the bottleneck data rate is significantly slower than the
rest of the path, the slowstart bursts will not cause significant
queues anywhere else along the path; they primarily exercise the queue
at the dominant bottleneck.
For many network technologies a simple queueing model does not apply: | Other sources of bursts include application pauses and channel | |||
the network schedules, thins or otherwise alters the timing of ACKs | allocation mechanisms. Appendix B describes the treatment of channel | |||
and data, generally to raise the efficiency of the channel allocation | allocation systems. If the application pauses (stops reading or | |||
process when confronted with relatively widely spaced small ACKs. | writing data) for some fraction of one RTT, state-of-the-art TCP | |||
These efficiency strategies are ubiquitous for half duplex, wireless | catches up to the earlier window size by sending a burst of data at | |||
or broadcast media. | the full sender interface rate. To fill such a network with a | |||
realistic application, the network has to be able to tolerate | ||||
interface rate bursts from the data sender large enough to cover | ||||
application pauses. | ||||
Altering the ACK stream generally has two consequences: raising the | Although the interface rate bursts are typically smaller than the last | |||
effective bottleneck data rate making slowstart burst at higher rates | burst of a slowstart, they are at a higher data rate so they | |||
(possibly as high as the sender's interface rate) and effectively | potentially exercise queues at arbitrary points along the front path | |||
raising the RTT by the time that the ACKs were postponed. The first | from the data sender up to and including the queue at the dominant | |||
effect can be partially mitigated by reclocking ACKs once they are | bottleneck. There is no model for how frequent or what sizes of | |||
beyond the bottleneck on the return path to the sender, however this | sender rate bursts should be tolerated. | |||
further raises the effective RTT. The most extreme example of this | ||||
class of behaviors is a half duplex channel that is never released | ||||
until the current end point has no pending traffic. Such | ||||
environments cause self clocked protocols to revert to extremely | ||||
inefficient stop and wait behavior, where they send an entire window | ||||
of data as a single burst, followed by the entire window of ACKs on | ||||
the return path. | ||||
If a particular end-to-end path contains a link or device that alters | To verify that a path can meet a performance target, it is necessary | |||
the ACK stream, then the entire path from the sender up to the | to independently confirm that the path can tolerate bursts in the | |||
bottleneck must be tested at the burst parameters implied by the ACK | dimensions that can be caused by these mechanisms. Three cases are | |||
scheduling algorithm. The most important parameter is the Effective | likely to be sufficient: | |||
Bottleneck Data Rate, which is the average rate at which the ACKs | ||||
advance snd.una. Note that thinning the ACKs (relying on the | ||||
cumulative nature of seg.ack to permit discarding some ACKs) | they can be assumed not to significantly affect delivery | |||
implies an effectively infinite bottleneck data rate. | ||||
To verify that a path can meet the performance target, it is | o Slowstart bursts sufficient to get connections started properly. | |||
necessary to independently confirm that the entire path can tolerate | o Frequent sender interface rate bursts that are small enough that | |||
bursts in the dimensions that are likely to be induced by the | they can be assumed not to significantly affect delivery | |||
application and any data or ACK scheduling anywhere in the path. Two | statistics. (Implicitly derated by selecting the burst size). | |||
common cases are the most important: slowstart bursts at twice the | o Infrequent sender interface rate full target_pipe_size bursts that | |||
effective bottleneck data rate; and somewhat smaller sender interface | do affect the delivery statistics. (Target_run_length is | |||
rate bursts. | derated). | |||
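The sketch below restates the three burst cases above in terms of the
target parameters. The specific burst sizes and rates shown are
assumed examples chosen for illustration; the document itself does not
prescribe them.

   # Illustrative sketch: the three classes of bursts a path should
   # tolerate, in terms of the target parameters.  All specific numbers
   # are assumed examples, not requirements from this document.
   target_pipe_size = 86            # packets, assumed example value

   bursts = {
       # Slowstart bursts large enough to get a connection started:
       # roughly the full target_pipe_size at twice the effective
       # bottleneck rate.
       "slowstart": {
           "size_pkts": target_pipe_size,
           "rate": "2 * effective bottleneck rate"},
       # Frequent interface rate bursts, small enough to be assumed not
       # to affect delivery statistics (size is the derating knob).
       "frequent interface rate": {
           "size_pkts": 8,
           "rate": "sender interface rate"},
       # Infrequent full-sized interface rate bursts; here the
       # target_run_length is the derated quantity instead.
       "infrequent interface rate": {
           "size_pkts": target_pipe_size,
           "rate": "sender interface rate"},
   }

   for name, b in bursts.items():
       print("%-26s %4d packets at %s" % (name, b["size_pkts"], b["rate"]))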
The slowstart rate bursts must be at least as large as | 4.2. Diagnostic Approach | |||
target_pipe_size packets and should be twice as large (so the peak | ||||
queue occupancy at the dominant bottleneck would be approximately | ||||
target_pipe_size). | ||||
There is no general model for how well the network needs to tolerate | The MBM approach is to open loop TCP by precomputing traffic patterns | |||
sender interface rate bursts. All existing TCP implementations send | that are typically generated by TCP operating at the given target | |||
full sized full rate bursts under some typically uncommon conditions, | parameters, and evaluating delivery statistics (packet loss, ECN | |||
such as application pauses that approximately match the RTT, or when | marks and delay). In this approach the measurement software | |||
ACKs are lost or thinned. Strawman: partial window bursts (some | explicitly controls the data rate, transmission pattern or cwnd | |||
fraction of target_pipe_size) should be tolerated without | (TCP's primary congestion control state variables) to create | |||
significantly raising the loss probability. Full target_pipe_size | repeatable traffic patterns that mimic TCP behavior but are | |||
bursts may slightly increase the loss probability. Interface rate | independent of the actual behavior of the subpath under test. These | |||
bursts as large as twice target_pipe_size should not cause | patterns are manipulated to probe the network to verify that it can | |||
deterministic packet drops. | deliver all of the traffic patterns that a transport protocol is | |||
likely to generate under normal operation at the target rate and RTT. | ||||
By opening the protocol control loops, we remove most sources of | ||||
temporal and spatial correlation in the traffic delivery statistics, | ||||
such that each subpath's contribution to the end-to-end statistics | ||||
can be assumed to be independent and stationary (The delivery | ||||
statistics depend on the fine structure of the data transmissions, | ||||
but not on long time scale state embedded in the sender, receiver or | ||||
other network components.) Therefore each subpath's contribution to | ||||
the end-to-end delivery statistics can be assumed to be independent, | ||||
and spatial composition techniques such as [RFC5835] apply. | ||||
In typical networks, the dominant bottleneck contributes the majority | ||||
of the packet loss and ECN marks. Often the rest of the path makes | ||||
insignificant contribution to these properties. A TDS should | ||||
apportion the end-to-end budget for the specified parameters | ||||
(primarily packet loss and ECN marks) to each subpath or group of | ||||
subpaths. For example the dominant bottleneck may be permitted to | ||||
contribute 90% of the loss budget, while the rest of the path is only | ||||
permitted to contribute 10%. | ||||
A TDS or FSTDS MUST apportion all relevant packet delivery statistics | ||||
between different subpaths, such that the spatial composition of the | ||||
metrics yields end-to-end statistics which are within the bounds | ||||
determined by the models. | ||||
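As an illustration of such an apportionment, the following sketch
splits an end-to-end loss budget 90%/10% between the dominant
bottleneck and the rest of the path, assuming that for small
probabilities the subpath loss ratios compose approximately additively
(in the spirit of [RFC5835]); the run length value is an assumed
example.

   # Illustrative sketch: apportioning an end-to-end loss budget across
   # subpaths, assuming small per-subpath loss ratios add approximately.
   target_run_length = 22000                  # assumed packets between losses
   e2e_loss_budget = 1.0 / target_run_length  # end-to-end loss ratio budget

   # Example apportionment mirroring the text: 90% of the budget to the
   # dominant bottleneck, 10% to the rest of the path.
   apportionment = {"dominant bottleneck": 0.90, "rest of path": 0.10}

   subpath_budgets = {name: share * e2e_loss_budget
                      for name, share in apportionment.items()}

   composed = sum(subpath_budgets.values())
   assert composed <= e2e_loss_budget * 1.0001, "composition exceeds budget"

   for name, budget in subpath_budgets.items():
       print("%-20s loss ratio <= %.2e (run length >= %d packets)"
             % (name, budget, int(1 / budget)))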
A network is expected to be able to sustain a Bulk TCP flow of a | ||||
given data rate, MTU and RTT when the following conditions are met: | ||||
o The raw link rate is higher than the target data rate. | ||||
o The observed run length is larger than required by a suitable TCP | ||||
performance model | ||||
o There is sufficient buffering at the dominant bottleneck to absorb | ||||
a slowstart rate burst large enough to get the flow out of | ||||
slowstart at a suitable window size. | ||||
o There is sufficient buffering in the front path to absorb and | ||||
smooth sender interface rate bursts at all scales that are likely | ||||
to be generated by the application, any channel arbitration in the | ||||
ACK path or other mechanisms. | ||||
o When there is a standing queue at a bottleneck for a shared media | ||||
subpath, there are suitable bounds on how the data and ACKs | ||||
interact, for example due to the channel arbitration mechanism. | ||||
o When there is a slowly rising standing queue at the bottleneck the | ||||
onset of packet loss has to be at an appropriate point (time or | ||||
queue depth) and progressive. This typically requires some form | ||||
of Automatic Queue Management [RFC2309]. | ||||
We are developing a tool that can perform many of the tests described | ||||
here[MBMSource]. | ||||
5. Common Models and Parameters | 5. Common Models and Parameters | |||
5.1. Target End-to-end parameters | 5.1. Target End-to-end parameters | |||
The target end to end parameters are the target data rate, target RTT | The target end-to-end parameters are the target data rate, target RTT | |||
and target MTU as defined in Section 2 These parameters are | and target MTU as defined in Section 2. These parameters are | |||
determined by the needs of the application or the ultimate end user | determined by the needs of the application or the ultimate end user | |||
and the end-to-end Internet path over which the application is | and the end-to-end Internet path over which the application is | |||
expected to operate. The target parameters are in units that make | expected to operate. The target parameters are in units that make | |||
sense to the upper layer: payload bytes delivered to the application, | sense to upper layers: payload bytes delivered to the application, | |||
above TCP. They exclude overheads associated with TCP and IP | above TCP. They exclude overheads associated with TCP and IP | |||
headers, retransmitts and other protocols (e.g. DNS). In addition, | headers, retransmits and other protocols (e.g. DNS). | |||
other end-to-end parameters include the effective bottleneck data | ||||
rate, the sender interface data rate and the TCP/IP header sizes | Other end-to-end parameters defined in Section 2 include the | |||
(overhead). | effective bottleneck data rate, the sender interface data rate and | |||
the TCP/IP header sizes (overhead). | ||||
The target data rate must be smaller than all link data rates by | ||||
enough headroom to carry the transport protocol overhead, explicitly | ||||
including retransmissions and an allowance for fluctuations in the actual | ||||
data rate, needed to meet the specified average rate. Specifying a | ||||
target rate with insufficient headroom is likely to result in | ||||
brittle measurements having little predictive value. | ||||
Note that the target parameters can be specified for a hypothetical | Note that the target parameters can be specified for a hypothetical | |||
path, for example to construct TDS designed for bench testing in the | path, for example to construct TDS designed for bench testing in the | |||
absence of a real application, or for a real physical test, for in | absence of a real application, or for a real physical test, for in | |||
situ testing of production infrastructure. | situ testing of production infrastructure. | |||
The number of concurrent connections is explicitly not a parameter to | The number of concurrent connections is explicitly not a parameter to | |||
this model [unlike earlier drafts]. If a subpath requires multiple | this model. If a subpath requires multiple connections in order to | |||
connections in order to meet the specified performance, that must be | meet the specified performance, that must be stated explicitly and | |||
stated explicitly and the procedure described in Section 6.1.4 | the procedure described in Section 6.1.4 applies. | |||
applies. | ||||
5.2. Common Model Calculations | 5.2. Common Model Calculations | |||
The most important derived parameter is target_pipe_size (in | The end-to-end target parameters are used to derive the | |||
packets), which is the window size --- the number of packets needed | target_pipe_size and the reference target_run_length. | |||
exactly meet the target rate, with no cross traffic for the specified | ||||
target RTT and MTU. It is given by: | The target_pipe_size, is the average window size in packets needed to | |||
meet the target rate, for the specified target RTT and MTU. It is | ||||
given by: | ||||
target_pipe_size = target_rate * target_RTT / ( target_MTU - | target_pipe_size = target_rate * target_RTT / ( target_MTU - | |||
header_overhead ) | header_overhead ) | |||
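For example, the following sketch evaluates this formula for an
assumed target of 10 Mb/s, 100 ms RTT and a 1500 byte MTU with 52
bytes of header overhead; none of these numbers are requirements of
this document.

   # Illustrative sketch: target_pipe_size for an assumed example target.
   target_rate = 10e6        # bits per second, assumed
   target_rtt = 0.100        # seconds, assumed
   target_mtu = 1500         # bytes, assumed
   header_overhead = 52      # bytes of TCP/IP headers per packet, assumed

   target_pipe_size = (target_rate * target_rtt /
                       (8 * (target_mtu - header_overhead)))
   print("target_pipe_size = %.1f packets" % target_pipe_size)
   # -> roughly 86 packets for these example values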
Target_run_length is an estimate of the minimum required headway | ||||
between losses or ECN marks, as computed by a mathematical model of | ||||
TCP congestion control. The derivation here follows [MSMO97], and by | ||||
design is quite conservative. The alternate models described in | ||||
Appendix A generally yield smaller run_lengths (higher loss rates), | ||||
but may not apply in all situations. In any case alternate models | ||||
should be compared to the reference target_run_length computed here. | ||||
If the transport protocol (e.g. TCP) average window size is smaller | Reference target_run_length is derived as follows: assume the | |||
than this, it will not meet the target rate. | ||||
The reference target_run_length, is a very conservative model for the | ||||
minimum required spacing between losses or ECN marks. The reference | ||||
target_run_length can derived as follows: assume the | ||||
subpath_data_rate is infinitesimally larger than the target_data_rate | subpath_data_rate is infinitesimally larger than the target_data_rate | |||
plus the required header overheads. Then target_pipe_size also | plus the required header_overhead. Then target_pipe_size also | |||
predicts the onset of queueing. If the transport protocol (e.g. | predicts the onset of queueing. A larger window will cause a | |||
TCP) has a window size that is larger than the target_pipe_size, the | standing queue at the bottleneck. | |||
excess packets will raise the RTT, typically by forming a standing | ||||
queue at the bottleneck. | ||||
Assume the transport protocol is using standard Reno style Additive | Assume the transport protocol is using standard Reno style Additive | |||
Increase, Multiplicative Decrease congestion control [RFC5681] and | Increase, Multiplicative Decrease congestion control [RFC5681] (but | |||
the receiver is using standard delayed ACKs. With delayed ACKs there | not Appropriate Byte Counting [RFC3465]) and the receiver is using | |||
must be 2*target_pipe_size roundtrips between losses. Otherwise the | standard delayed ACKs. Reno increases the window by one packet every | |||
multiplicative window reduction triggered by a loss would cause the | pipe_size worth of ACKs. With delayed ACKs this takes 2 Round Trip | |||
network to be underfilled. We derive the number of packets between | Times per increase. To exactly fill the pipe, losses must be no | |||
losses from the area under the AIMD sawtooth following [MSMO97]. | closer together than the spacing at which each AIMD sawtooth peaks at | |||
They must be no more frequent than every 1 in | exactly twice the target_pipe_size; otherwise the multiplicative | |||
(3/2)*target_pipe_size*(2*target_pipe_size) packets. This simplifies | window reduction triggered by the loss would cause the network to be underfilled. | |||
to: | Following [MSMO97] the number of packets between losses must be the | |||
area under the AIMD sawtooth. They must be no more frequent than | ||||
every 1 in ((3/2)*target_pipe_size)*(2*target_pipe_size) packets, | ||||
which simplifies to: | ||||
target_run_length = 3*(target_pipe_size^2) | target_run_length = 3*(target_pipe_size^2) | |||
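Continuing the assumed example above (a target_pipe_size of roughly 86
packets), the sketch below evaluates the reference target_run_length;
the values are illustrative only.

   # Illustrative sketch: the reference target_run_length implied by the
   # model above, for the assumed example pipe size.
   target_pipe_size = 86     # packets, assumed example value

   target_run_length = 3 * target_pipe_size ** 2
   print("target_run_length = %d packets" % target_run_length)
   # -> 22188 packets, i.e. a loss ratio of about 4.5e-5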
Note that this calculation is very conservative and is based on a | Note that this calculation is very conservative and is based on a | |||
number of assumptions that may not apply. Appendix A discusses these | number of assumptions that may not apply. Appendix A discusses these | |||
assumptions and provides some alternative models. If a less | assumptions and provides some alternative models. If a less | |||
conservative model is used, a fully specified TDS or FSTDS MUST | conservative model is used, a fully specified TDS or FSTDS MUST | |||
document the actual method for computing target_run_length along with | document the actual method for computing target_run_length along with | |||
the rationale for the underlying assumptions and the ratio of chosen | the rationale for the underlying assumptions and the ratio of chosen | |||
target_run_length to the reference target_run_length calculated | target_run_length to the reference target_run_length calculated | |||
above. | above. | |||
These two parameters, target_pipe_size and target_run_length, | These two parameters, target_pipe_size and target_run_length, | |||
directly imply most of the individual parameters for the tests below. | directly imply most of the individual parameters for the tests in | |||
Target_pipe_size is the window size, the amount of circulating data | Section 7. | |||
required to meet the target data rate, and implies the scale of the | ||||
bursts that the network might experience. Target_run_length is the | ||||
amount of data required between losses or ECN marks standard for | ||||
standard congestion control. | ||||
The individual parameters for each diagnostic test are described | ||||
below. In a few cases there are no well established models for what | ||||
is considered correct network operation. In many of these cases the | ||||
problems might be partially mitigated by future improvements | ||||
to TCP implementations. | ||||
5.3. Parameter Derating | 5.3. Parameter Derating | |||
Since some aspects of the models are very conservative, this | Since some aspects of the models are very conservative, this | |||
framework permits some latitude in derating test parameters. Rather | framework permits some latitude in derating test parameters. Rather | |||
than trying to formalize more complicated models we permit some test | than trying to formalize more complicated models we permit some test | |||
parameters to be relaxed as long as they meet some additional | parameters to be relaxed as long as they meet some additional | |||
procedural constraints: | procedural constraints: | |||
o The TDS or FSTDS MUST document and justify the actual method used | o The TDS or FSTDS MUST document and justify the actual method used | |||
to compute the derated metric parameters. | to compute the derated metric parameters. | |||
o The validation procedures described in Section 9 must be used to | o The validation procedures described in Section 9 must be used to | |||
demonstrate the feasibility of meeting the performance targets | demonstrate the feasibility of meeting the performance targets | |||
with infrastructure that infinitessimally passes the derated | with infrastructure that infinitesimally passes the derated tests. | |||
tests. | ||||
o The validation process itself must be documented in such a way | o The validation process itself must be documented in such a way | |||
that other researchers can duplicate the validation experiments. | that other researchers can duplicate the validation experiments. | |||
Except as noted, all tests below assume no derating. Tests where | Except as noted, all tests below assume no derating. Tests where | |||
there is not currently a well established model for the required | there is not currently a well established model for the required | |||
parameters include derating as a way to indicate flexibility in the | parameters explicitly include derating as a way to indicate | |||
parameters. | flexibility in the parameters. | |||
6. Common testing procedures | 6. Common testing procedures | |||
6.1. Traffic generating techniques | 6.1. Traffic generating techniques | |||
6.1.1. Paced transmission | 6.1.1. Paced transmission | |||
Paced (burst) transmissions: send bursts of data on a timer to meet a | Paced (burst) transmissions: send bursts of data on a timer to meet a | |||
particular target rate and pattern. In all cases the specified data | particular target rate and pattern. In all cases the specified data | |||
rate can either be the application or link rates. Header overheads | rate can either be the application or link rates. Header overheads | |||
skipping to change at page 17, line 15 | skipping to change at page 17, line 39 | |||
Paced single packets: Send individual packets at the specified rate | Paced single packets: Send individual packets at the specified rate | |||
or headway. | or headway. | |||
Burst: Send sender interface rate bursts on a timer. Specify any 3 | Burst: Send sender interface rate bursts on a timer. Specify any 3 | |||
of: average rate, packet size, burst size (number of packets) and | of: average rate, packet size, burst size (number of packets) and | |||
burst headway (burst start to start). These bursts are typically | burst headway (burst start to start). These bursts are typically | |||
sent as back-to-back packets at the tester's interface rate. | sent as back-to-back packets at the tester's interface rate. | |||
Slowstart bursts: Send 4 packet sender interface rate bursts at an | Slowstart bursts: Send 4 packet sender interface rate bursts at an | |||
average data rate equal to twice effective bottleneck link rate | average data rate equal to twice effective bottleneck link rate | |||
(but not more than the sender interface rate). This corresponds | (but not more than the sender interface rate). This corresponds | |||
to the average rate during a TCP slowstart when Appropriate Byte | to the average rate during a TCP slowstart when Appropriate Byte | |||
Counting [ABC] is present or delayed ack is disabled. | Counting [RFC3465] is present or delayed ack is disabled. Note | |||
that if the effective bottleneck link rate is more than half of | ||||
the sender interface rate, slowstart bursts become sender | ||||
interface rate bursts. | ||||
Repeated Slowstart bursts: Slowstart bursts are typically part of | Repeated Slowstart bursts: Slowstart bursts are typically part of | |||
a larger scale pattern of repeated bursts, such as sending | a larger scale pattern of repeated bursts, such as sending | |||
target_pipe_size packets as slowstart bursts on a target_RTT | target_pipe_size packets as slowstart bursts on a target_RTT | |||
headway (burst start to burst start). Such a stream has three | headway (burst start to burst start). Such a stream has three | |||
different average rates, depending on the averaging time scale. | different average rates, depending on the averaging interval. At | |||
At the finest time scale the average rate is the same as the | the finest time scale the average rate is the same as the sender | |||
sender interface rate, at a medium scale the average rate is twice | interface rate, at a medium scale the average rate is twice the | |||
the effective bottleneck link rate and at the longest time scales | effective bottleneck link rate and at the longest time scales the | |||
the average rate is the target data rate. | average rate is equal to the target data rate. | |||
Note that if the effective bottleneck link rate is more than half of | Note that in conventional measurement theory exponential | |||
the sender interface rate, slowstart bursts become sender interface | distributions are often used to eliminate many sorts of correlations. | |||
rate bursts. | For the procedures above, the correlations are created by the network | |||
elements and accurately reflect their behavior. At some point in the | ||||
future, it may be desirable to introduce noise sources into the above | ||||
pacing models, but they are not warranted at this time. | ||||
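The following sketch generates one RTT's worth of a repeated slowstart
burst pattern for an assumed example path, showing how 4 packet
mini-bursts at the sender interface rate average out to twice the
effective bottleneck rate and, over a full RTT, to the target data
rate. All rates and sizes are assumptions chosen for illustration.

   # Illustrative sketch: one RTT of a repeated slowstart burst stream.
   interface_rate = 1e9          # sender interface rate, bits/s (assumed)
   bottleneck_rate = 20e6        # effective bottleneck rate, bits/s (assumed)
   target_rate = 10e6            # target data rate, bits/s (assumed)
   target_rtt = 0.100            # seconds (assumed)
   packet_size = 1500 * 8        # bits per packet

   pipe = int(target_rate * target_rtt / packet_size)  # packets per RTT

   send_times = []
   t_burst = 0.0
   packets_sent = 0
   while packets_sent < pipe:
       # 4 packet mini-bursts at the interface rate, spaced so that the
       # average rate across mini-bursts is twice the bottleneck rate.
       mini = min(4, pipe - packets_sent)
       for i in range(mini):
           send_times.append(t_burst + i * packet_size / interface_rate)
       packets_sent += mini
       t_burst += mini * packet_size / (2 * bottleneck_rate)

   duration = send_times[-1] if send_times else 0.0
   print("%d packets sent in %.4f s; long term rate %.1f Mb/s per RTT"
         % (pipe, duration, pipe * packet_size / target_rtt / 1e6))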
6.1.2. Constant window pseudo CBR | 6.1.2. Constant window pseudo CBR | |||
Implement pseudo constant bit rate by running a standard protocol | Implement pseudo constant bit rate by running a standard protocol | |||
such as TCP with a fixed bound on the window size. The rate is only | such as TCP with a fixed bound on the window size. The rate is only | |||
maintained on average over each RTT, and is subject to limitations of | maintained on average over each RTT, and is subject to limitations of | |||
the transport protocol. | the transport protocol. | |||
The bound on the window size is computed from the target_data_rate | The bound on the window size is computed from the target_data_rate | |||
and the actual RTT of the test path. | and the actual RTT of the test path. | |||
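For example, a sketch of the window computation, using assumed example
values for the target rate, the measured RTT and the payload size:

   # Illustrative sketch: window clamp for a constant window pseudo CBR
   # test, from the target rate and the measured RTT of the test path.
   import math

   target_data_rate = 10e6      # bits/s, assumed
   measured_rtt = 0.045         # seconds, actual RTT of the test path (assumed)
   payload_bytes = 1448         # MTU minus header overhead, assumed

   window_clamp = math.ceil(target_data_rate * measured_rtt /
                            (8 * payload_bytes))
   print("window clamp = %d packets" % window_clamp)
   # The transport then averages the target rate over each RTT, subject
   # to its own limitations (loss recovery, receiver window, etc).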
If the transport protocol fails to maintain the test rate within | If the transport protocol fails to maintain the test rate within | |||
prescribed data rates, the test MUST NOT be considered passing. If | prescribed limits, the test would typically be considered inconclusive | |||
there is a signature of a network problem (e.g. the run length is too | or failing, depending on what mechanism caused the reduced | |||
small) then the test can be considered to fail. Since packet loss | rate. See the discussion of test outcomes in Section 6.2.1. | |||
and ECN marks are required to reduce the data rate for standard | ||||
transport protocols, the test specification must include suitable | ||||
allowances in the prescribed data rates. If there is not sufficient | ||||
signature of a network problem, then failing to make the prescribed | ||||
data rate must be considered inconclusive. Otherwise there are some | ||||
cases where tester failures might cause false negative test results. | ||||
6.1.3. Scanned window pseudo CBR | 6.1.3. Scanned window pseudo CBR | |||
Same as the above, except the window is scanned across a range of | Same as the above, except the window is scanned across a range of | |||
sizes designed to include two key events, the onset of queueing and | sizes designed to include two key events, the onset of queueing and | |||
the onset of packet loss or ECN marks. The window is scanned by | the onset of packet loss or ECN marks. The window is scanned by | |||
incrementing it by one packet for every 2*target_pipe_size delivered | incrementing it by one packet for every 2*target_pipe_size delivered | |||
packets. This mimics the additive increase phase of standard | packets. This mimics the additive increase phase of standard | |||
congestion avoidance and normally separates the window increases | congestion avoidance and normally separates the window increases | |||
by approximately twice the target_RTT. | by approximately twice the target_RTT. | |||
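The sketch below illustrates one possible scan schedule; the starting
and ending window sizes are assumptions chosen for illustration, since
the scan range is not fixed here.

   # Illustrative sketch: a scanned window schedule.  The window grows
   # by one packet for every 2*target_pipe_size delivered packets,
   # mimicking one additive increase per two round trips.
   target_pipe_size = 86                 # packets, assumed example value
   start_window = target_pipe_size // 2  # assumed scan start
   end_window = 2 * target_pipe_size     # assumed scan end

   delivered = 0
   schedule = []                         # (delivered packets, window)
   window = start_window
   while window <= end_window:
       schedule.append((delivered, window))
       delivered += 2 * target_pipe_size # packets delivered at this window
       window += 1

   for delivered_pkts, w in schedule[:3] + schedule[-1:]:
       print("after %6d delivered packets, window = %3d packets"
             % (delivered_pkts, w))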
There are two versions of this test: one built by applying a window | There are two versions of this test: one built by applying a window | |||
clamp to standard congestion control and one built by stiffening | clamp to standard congestion control and one built by stiffening | |||
a non-standard transport protocol. When standard congestion control | a non-standard transport protocol. When standard congestion control | |||
is in effect, any losses or ECN marks cause the transport to revert | is in effect, any losses or ECN marks cause the transport to revert | |||
to a window smaller than the clamp such that the scanning clamp | to a window smaller than the clamp such that the scanning clamp loses | |||
loses control of the window size. The NPAD pathdiag tool is an example | control of the window size. The NPAD pathdiag tool is an example of | |||
of this class of algorithms [Pathdiag]. | this class of algorithms [Pathdiag]. | |||
Alternatively a non-standard congestion control algorithm can respond | Alternatively a non-standard congestion control algorithm can respond | |||
to losses by transmitting extra data, such that it (attempts) to | to losses by transmitting extra data, such that it maintains the | |||
maintain the specified window size independent of losses or ECN | specified window size independent of losses or ECN marks. Such a | |||
marks. Such a stiffened transport explicitly violates mandatory | stiffened transport explicitly violates mandatory Internet congestion | |||
Internet congestion control and is not suitable for in situ testing. | control and is not suitable for in situ testing. It is only | |||
It is only appropriate for engineering testing under laboratory | appropriate for engineering testing under laboratory conditions. The | |||
conditions. The Windowed Ping tool implemented such a test [WPING]. | Windowed Ping tool implemented such a test [WPING]. This tool has | |||
This tool has been updated and is under test [mpingSource]. | been updated and is under test [mpingSource]. | |||
The test procedures in Section 7.2 describe how to partition the | The test procedures in Section 7.2 describe how to partition the | |||
scans into regions and how to interpret the results. | scans into regions and how to interpret the results. | |||
6.1.4. Concurrent or channelized testing | 6.1.4. Concurrent or channelized testing | |||
The procedures described in this document are only directly applicable | The procedures described in this document are only directly applicable | |||
to single stream performance measurement, e.g. one TCP connection. | to single stream performance measurement, e.g. one TCP connection. | |||
In an Ideal world, we would disallow all performance claims based | In an ideal world, we would disallow all performance claims based | |||
on multiple concurrent streams, but this is not practical due to at least | on multiple concurrent streams, but this is not practical due to at least | |||
two different issues. First, many very high rate link technologies | two different issues. First, many very high rate link technologies | |||
are channelized, and pin individual flows to specific channels to | are channelized and pin individual flows to specific channels to | |||
minimize reordering or solve other problems and second TCP itself has | minimize reordering or other problems and second, TCP itself has | |||
scaling limits. Although the former problem might be overcome | scaling limits. Although the former problem might be overcome | |||
through different design decisions, the latter problem is more deeply | through different design decisions, the latter problem is more deeply | |||
rooted. | rooted. | |||
All standard [RFC 5681] and de facto standard [CUBIC] congestion | All standard [RFC5681] and de facto standard congestion control | |||
control algorithms have scaling limits, in the sense that as a | algorithms [CUBIC] have scaling limits, in the sense that as a long | |||
network over a fixed RTT and MTU gets faster all congestion control | fast network (LFN) with a fixed RTT and MTU gets faster, all | |||
algorithms get less accurate. In general their noise immunity drops | congestion control algorithms get less accurate and as a consequence | |||
(a single packet drop should have less effect as individual packets | have difficulty filling the network [SLowScaling]. These properties | |||
become smaller relative to the window size) and the control frequency | are a consequence of the original Reno AIMD congestion control design | |||
of the AIMD sawtooth also drops, meaning that as TCP is using more | and the requirement in RFC 5681 that all transport protocols have | |||
total capacity it gets less information about the state of the | uniform response to congestion. | |||
network and other traffic. These properties are a direct consequence | ||||
of the original Reno design and are implicitly required by the | ||||
requirement that all transport protocols be "TCP friendly" | ||||
[Guidelines] There are a number of reason to want to specify | ||||
performance in term of multiple concurrent flows. Although there are | ||||
a number of downsides to @@@@ | ||||
The use of multiple connections in the Internet has been very | ||||
controversial since the beginning of the World-Wide-Web[first | ||||
complaint]. Modern browsers open many connections [BScope]. Experts | ||||
associated with IETF transport area have frequently spoken against | ||||
this practice [long list]. It is not inappropriate to assume some | ||||
small number of concurrent connections (e.g. 4 or 6), to compensate | ||||
for limitation in TCP. However, choosing too large a number is at | ||||
risk of being interpreted as a signal by the web browser community | ||||
that this practice has been embraced by the Internet service provider | ||||
community. It may not be desirable to send such a signal. | ||||
Note that the current proposal for httpbis [SPDY] is specifically | There are a number of reasons to want to specify performance in terms | |||
designed to work best with a single TCP connection per client server | of multiple concurrent flows; however, this approach is not | |||
pair, because it uses adaptive compression which requires sending | recommended for data rates below several Mb/s, which can be attained | |||
separate compression dictionaries per connection. As long as TCP can | with run lengths under 10000 packets. Since run length goes as the | |||
use IW10 and some of the transport parameter can be cached, multiple | square of the data rate, at higher rates the run lengths can be | |||
connections provide a negative gain, due to the replicated | unfeasibly large, and multiple connection might be the only feasible | |||
compression overhead. | approach. For an example of this problem see Section 8.3. | |||
The specification to use multiple connections is not recommended for | If multiple connections are deemed necessary to meet aggregate | |||
data rates below several Mb/s, which can be attained with run lengths | performance targets, then this MUST be stated both in the design of the | |||
under 10000. Since run length goes as the square of the data rates, | TDS and in any claims about network performance. The tests MUST be | |||
at higher rates (see Section 8.3) the run lengths can be unfeasibly | performed concurrently with the specified number of connections. For | |||
large, and multiple connections might be the only feasible approach. | tests that use bursty traffic, the bursts should be | |||
synchronized across flows. | ||||
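One plausible way to do the bookkeeping when an aggregate target is
split evenly over N connections with a common RTT and MTU is sketched
below. The even split and the resulting per-flow run lengths are an
illustrative assumption, not a procedure defined by this document.

   # Illustrative sketch (assumption, not a rule from this document):
   # how the reference model scales if an aggregate target is split
   # evenly over N concurrent connections sharing the same RTT and MTU.
   def reference_model(rate_bps, rtt_s, mtu=1500, overhead=52):
       pipe = rate_bps * rtt_s / (8 * (mtu - overhead))
       return pipe, 3 * pipe ** 2

   aggregate_rate = 1e9      # bits/s, assumed example aggregate target
   rtt = 0.100               # seconds, assumed

   for n in (1, 4, 16):
       per_flow_rate = aggregate_rate / n
       pipe, run_length = reference_model(per_flow_rate, rtt)
       print("%2d flows: per-flow pipe %7.0f pkts, run length %12.0f pkts"
             % (n, pipe, run_length))
   # Splitting the rate over N flows reduces the per-flow pipe by N and
   # the per-flow reference run length by roughly N squared, which is
   # why very high aggregate rates may only be feasible with multiple
   # connections.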
6.1.5. Intermittent Testing | 6.1.5. Intermittent Testing | |||
Any test which does not depend on queueing (e.g. the CBR tests) or | Any test which does not depend on queueing (e.g. the CBR tests) or | |||
experiences periodic zero outstanding data during normal operation | experiences periodic zero outstanding data during normal operation | |||
(e.g. between bursts for the various burst tests), can be formulated | (e.g. between bursts for the various burst tests), can be formulated | |||
as an intermittent test. | as an intermittent test, to reduce the perceived impact on other | |||
traffic. The approach is to insert periodic pauses in the test at | ||||
The Intermittent testing can be used for ongoing monitoring for | any point when there is no expected queue occupancy. | |||
changes in subpath quality with minimal disruption users. It should | ||||
be used in conjunction with the full rate test because this method | ||||
assesses an average_run_length over a long time interval w.r.t. user | ||||
sessions. It may false fail due to other legitimate congestion | ||||
causing traffic or may false pass changes in underlying link | ||||
properties (e.g. a modem retraining to an out of contract lower | ||||
rate). | ||||
[Need text about bias (false pass) in the shadow of loss caused by | Intermittent testing can be used for ongoing monitoring for changes | |||
excessive bursts] | in subpath quality with minimal disruption users. However it is not | |||
suitable in environments where there are reactive links[REACTIVE]. | ||||
6.1.6. Intermittent Scatter Testing | 6.1.6. Intermittent Scatter Testing | |||
Intermittent scatter testing: when testing the network path to or | Intermittent scatter testing is a technique for non-disruptively | |||
from an ISP subscriber aggregation point (CMTS, DSLAM, etc), | evaluating the front path from a sender to a subscriber aggregation | |||
intermittent tests can be spread across a pool of users such that no | point within an ISP at full load by intermittently testing across a | |||
one users experiences the full impact of the testing, even though the | pool of subscriber access links, such that each subscriber sees | |||
traffic to or from the ISP subscriber aggregation point is sustained | tolerable test traffic loads. The load on the front path should be | |||
at full rate. | limited to be no more than that which would be caused by a single | |||
test to an known to otherwise be idle subscriber. This test in | ||||
aggregate mimics a full load test from a content provider to the | ||||
aggregation point. | ||||
Intermittent scatter testing can be used to reduce the measurement | ||||
noise introduced by unknown traffic on customer access links. | ||||
6.2. Interpreting the Results | 6.2. Interpreting the Results | |||
6.2.1. Test outcomes | 6.2.1. Test outcomes | |||
A singleton is a pass/fail measurement of a subpath. If any subpath | To perform an exhaustive test of an end-to-end network path, each | |||
fails any test then the end-to-end path is also expected to fail to | test of the TDS is applied to each subpath of an end-to-end path. If | |||
attain the target performance under some conditions. | any subpath fails any test then an application running over the end- | |||
to-end path can also be expected to fail to attain the target | ||||
performance under some conditions. | ||||
In addition we use "inconclusive outcome" to indicate that a test | In addition to passing or failing, a test can be deemed to be | |||
failed to attain the required test conditions. A test is | inconclusive for a number of reasons. Proper instrumentation and | |||
inconclusive if the precomputed traffic pattern was not authentically | treatment of inconclusive outcomes is critical to the accuracy and | |||
generated, test preconditions were not met or the measurement results | robustness of Model Based Metrics. Tests can be inconclusive if the | |||
were not statistically significantly. | precomputed traffic pattern was not accurately generated; the | |||
measurement results were not statistically significant; or for other | ||||
causes, such as failing to meet some required preconditions for the | ||||
test. | ||||
This is important to the extent that the diagnostic tests use | For example consider a test that implements Constant Window Pseudo | |||
protocols which themselves include built in control systems which | CBR (Section 6.1.2) by adding rate controls and detailed traffic | |||
might interfere with some aspect of the test. For example consider a | instrumentation to TCP (e.g. [RFC4898]). TCP includes built in | |||
test that is implemented by adding rate controls and loss | control systems which might interfere with the sending data rate. If | |||
instrumentation to TCP: meeting the run length specification while | such a test meets the the run length specification while failing to | |||
failing to attain the specified data rate must be treated as an | attain the specified data rate it must be treated as an inconclusive | |||
inconclusive result, because we can not a priori determine if the | result, because we can not a priori determine if the reduced data | |||
reduced data rate was caused by a TCP problem or a network problem, | rate was caused by a TCP problem or a network problem, or if the | |||
or if the reduced data rate had a material effect on the run length | reduced data rate had a material effect on the run length measurement | |||
measurement. (Note that if the measured run length was too small, | itself. | |||
the test can be considered to have failed because it doesn't really | ||||
matter that the test didn't attain the required data rate). | ||||
The vantage independence properties of Model Based Metrics depends on | Note that for load tests such as this example, an observed run length | |||
the accuracy of the distinction between conclusive (pass or fail) and | that is too small can be considered to have failed the test because | |||
inconclusive tests. One way to view inconclusive tests is that they | it doesn't really matter that the test didn't attain the required | |||
reflect situations where the signature is ambiguous between problems | data rate. | |||
with the the subpath and problems with the diagnostic test itself. | ||||
One of the goals for evolving diagnostic test designs will be to keep | ||||
sharpening this distinction. | ||||
One of the goals of evolving the testing process, procedures and | The really important new properties of MBM, such as vantage | |||
measurement point selection should be to minimize the number of | independence, are a direct consequence of opening the control loops | |||
inconclusive tests. | in the protocols, such that the test traffic does not depend on | |||
network conditions or traffic received. Any mechanism that | ||||
introduces feedback between the traffic measurements and the traffic | ||||
generation is at risk of introducing nonlinearities that spoil these | ||||
properties. Any exceptional event that indicates that such feedback | ||||
has happened should cause the test to be considered inconclusive. | ||||
One way to view inconclusive tests is that they reflect situations | ||||
where a test outcome is ambiguous between limitations of the network | ||||
and some unknown limitation of the diagnostic test itself, which was | ||||
presumably caused by some uncontrolled feedback from the network. | ||||
Note that procedures that attempt to sweep the target parameter space | Note that procedures that attempt to sweep the target parameter space | |||
to find the bounds on some parameter (for example to find the highest | to find the bounds on some parameter (for example to find the highest | |||
data rate for a subpath) are likely to break the location independent | data rate for a subpath) are likely to break the location independent | |||
properties of Model Based Metrics, because the boundary between | properties of Model Based Metrics, because the boundary between | |||
passing and inconclusive is extremely likely to be RTT sensitive, | passing and inconclusive is sensitive to the RTT because TCP's | |||
because TCP's ability to compensate for problems scales with the | ability to compensate for problems scales with the number of round | |||
number of round trips per second. | trips per second. Repeating the same procedure from another vantage | |||
point with a different RTT is likely to get a different result, because | ||||
TCP will get lower performance on the path with the longer RTT. | ||||
One of the goals for evolving TDS designs will be to keep sharpening | ||||
the distinction between inconclusive, passing and failing tests. The | ||||
criteria for passing, failing and inconclusive tests MUST be | ||||
explicitly stated for every test in the TDS or FSTDS. | ||||
One of the goals of evolving the testing process, procedures, tools | ||||
and measurement point selection should be to minimize the number of | ||||
inconclusive tests. | ||||
It may be useful to keep raw data delivery statistics for deeper | ||||
study of the behavior of the network path and to measure the tools. | ||||
This can help to drive tool evolution. Under some conditions it | ||||
might be possible to reevaluate the raw data for satisfying alternate | ||||
performance targets. However such procedures are likely to introduce | ||||
sampling bias and other implicit feedback which can cause false | ||||
results and exhibit MP vantage sensitivity. | ||||
6.2.2. Statistical criteria for measuring run_length | 6.2.2. Statistical criteria for measuring run_length | |||
When evaluating the observed run_length, we need to determine | When evaluating the observed run_length, we need to determine | |||
appropriate packet stream sizes and acceptable error levels for | appropriate packet stream sizes and acceptable error levels for | |||
efficient methods of measurement. In practice, can we compare the | efficient measurement. In practice, can we compare the empirically | |||
empirically estimated loss probabilities with the targets as the | estimated packet loss and ECN marking probabilities with the targets | |||
sample size grows? How large a sample is needed to say that the | as the sample size grows? How large a sample is needed to say that | |||
measurements of packet transfer indicate a particular run-length is | the measurements of packet transfer indicate a particular run length | |||
present? | is present? | |||
The generalized measurement can be described as recursive testing: | The generalized measurement can be described as recursive testing: | |||
send packets (individually or in patterns) and observe the packet | send packets (individually or in patterns) and observe the packet | |||
transfer performance (loss ratio or other metric, any defect we | delivery performance (loss ratio or other metric, any marking we | |||
define). | define). | |||
As each packet is sent and measured, we have an ongoing estimate of | As each packet is sent and measured, we have an ongoing estimate of | |||
the performance in terms of defect to total packet ratio (or an | the performance in terms of the ratio of packet loss or ECN mark to | |||
empirical probability). We continue to send until conditions support | total packets (i.e. an empirical probability). We continue to send | |||
a conclusion or a maximum sending limit has been reached. | until conditions support a conclusion or a maximum sending limit has | |||
been reached. | ||||
We have a target_defect_probability, 1 defect per target_run_length, | We have a target_mark_probability, 1 mark per target_run_length, | |||
where a "defect" is defined as a lost packet, a packet with ECN mark, | where a "mark" is defined as a lost packet, a packet with ECN mark, | |||
or other impairment. This constitutes the null Hypothesis: | or other signal. This constitutes the null Hypothesis: | |||
H0: no more than one defect in target_run_length = | H0: no more than one mark in target_run_length = | |||
3*(target_pipe_size)^2 packets | 3*(target_pipe_size)^2 packets | |||
and we can stop sending packets if on-going measurements support | and we can stop sending packets if on-going measurements support | |||
accepting H0 with the specified Type I error = alpha (= 0.05 for | accepting H0 with the specified Type I error = alpha (= 0.05 for | |||
example). | example). | |||
We also have an alternative Hypothesis to evaluate: if performance is | We also have an alternative Hypothesis to evaluate: if performance is | |||
significantly lower than the target_defect_probability. Based on | significantly lower than the target_mark_probability. Based on | |||
analysis of typical values and practical limits on measurement | analysis of typical values and practical limits on measurement | |||
duration, we choose four times the H0 probability: | duration, we choose four times the H0 probability: | |||
H1: one or more defects in (target_run_length/4) packets | H1: one or more marks in (target_run_length/4) packets | |||
and we can stop sending packets if measurements support rejecting H0 | and we can stop sending packets if measurements support rejecting H0 | |||
with the specified Type II error = beta (= 0.05 for example), thus | with the specified Type II error = beta (= 0.05 for example), thus | |||
preferring the alternate hypothesis H1. | preferring the alternate hypothesis H1. | |||
H0 and H1 constitute the Success and Failure outcomes described | H0 and H1 constitute the Success and Failure outcomes described | |||
elsewhere in the memo, and while the ongoing measurements do not | elsewhere in the memo, and while the ongoing measurements do not | |||
support either hypothesis the current status of measurements is | support either hypothesis the current status of measurements is | |||
inconclusive. | inconclusive. | |||
The problem above is formulated to match the Sequential Probability | The problem above is formulated to match the Sequential Probability | |||
Ratio Test (SPRT) [StatQC], which also starts with a pair of | Ratio Test (SPRT) [StatQC]. Note that as originally framed the | |||
events under consideration were all manufacturing defects. In | ||||
networking, ECN marks and lost packets are not defects but signals, | ||||
indicating that the transport protocol should slow down. | ||||
The Sequential Probability Ratio Test also starts with a pair of | ||||
hypotheses specified as above: | hypotheses specified as above: | |||
H0: p0 = one defect in target_run_length | H0: p0 = one defect in target_run_length | |||
H1: p1 = one defect in target_run_length/4 | H1: p1 = one defect in target_run_length/4 | |||
As packets are sent and measurements collected, the tester evaluates | As packets are sent and measurements collected, the tester evaluates | |||
the cumulative defect count against two boundaries representing H0 | the cumulative defect count against two boundaries representing H0 | |||
Acceptance or Rejection (and acceptance of H1): | Acceptance or Rejection (and acceptance of H1): | |||
Acceptance line: Xa = -h1 + sn | Acceptance line: Xa = -h1 + sn | |||
Rejection line: Xr = h2 + sn | Rejection line: Xr = h2 + sn | |||
skipping to change at page 22, line 45 | skipping to change at page 23, line 40 | |||
for p0 and p1 as defined in the null and alternative Hypotheses | for p0 and p1 as defined in the null and alternative Hypotheses | |||
statements above, and alpha and beta as the Type I and Type II error. | statements above, and alpha and beta as the Type I and Type II error. | |||
The SPRT specifies simple stopping rules: | The SPRT specifies simple stopping rules: | |||
o Xa < defect_count(n) < Xr: continue testing | o Xa < defect_count(n) < Xr: continue testing | |||
o defect_count(n) <= Xa: Accept H0 | o defect_count(n) <= Xa: Accept H0 | |||
o defect_count(n) >= Xr: Accept H1 | o defect_count(n) >= Xr: Accept H1 | |||
The calculations above are implemented in the R-tool for Statistical | The calculations above are implemented in the R-tool for Statistical | |||
Analysis, in the add-on package for Cross-Validation via Sequential | Analysis [Rtool], in the add-on package for Cross-Validation via | |||
Testing (CVST) [http://www.r-project.org/] [Rtool] [CVST] . | Sequential Testing (CVST) [CVST]. | |||
Using the equations above, we can calculate the minimum number of | Using the equations above, we can calculate the minimum number of | |||
packets (n) needed to accept H0 when x defects are observed. For | packets (n) needed to accept H0 when x defects are observed. For | |||
example, when x = 0: | example, when x = 0: | |||
Xa = 0 = -h1 + sn | Xa = 0 = -h1 + sn | |||
and n = h1 / s | and n = h1 / s | |||
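The sketch below implements the stopping rules using the standard Wald
SPRT boundary formulas for a Bernoulli process; treating those
formulas as equivalent to the h1, h2 and s referenced above is an
assumption of the sketch, and the alpha, beta and run length values
are examples.

   # Illustrative sketch of the SPRT stopping rules, using the textbook
   # Wald boundary parameters for a Bernoulli process (an assumption).
   import math

   target_run_length = 22188              # packets, assumed example
   p0 = 1.0 / target_run_length           # H0: one mark per target_run_length
   p1 = 4.0 / target_run_length           # H1: one mark per target_run_length/4
   alpha = 0.05                           # Type I error
   beta = 0.05                            # Type II error

   denom = math.log(p1 / p0) + math.log((1 - p0) / (1 - p1))
   s = math.log((1 - p0) / (1 - p1)) / denom
   h1 = math.log((1 - alpha) / beta) / denom
   h2 = math.log((1 - beta) / alpha) / denom

   def sprt_decision(n, defect_count):
       xa = -h1 + s * n        # acceptance line
       xr = h2 + s * n         # rejection line
       if defect_count <= xa:
           return "accept H0 (pass)"
       if defect_count >= xr:
           return "accept H1 (fail)"
       return "continue testing"

   print("s = %.3e, h1 = %.2f, h2 = %.2f" % (s, h1, h2))
   print("minimum n to accept H0 with zero marks: %d" % math.ceil(h1 / s))
   print(sprt_decision(5000, 0), sprt_decision(5000, 3), sep="; ")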
6.2.3. Reordering Tolerance | 6.2.2.1. Alternate criteria for measuring run_length | |||
All tests must be instrumented for reordering [RFC4737]. | An alternate calculation, contributed by Alex Gilgur (Google). | |||
NB: there is no global consensus for how much reordering tolerance is | The probability of failure within an interval whose length is | |||
appropriate or reasonable. ("None" is absolutely unreasonable.) | target_run_length is given by an exponential distribution with rate = | |||
1 / target_run_length (a memoryless process). The implication of | ||||
this is that the failure probability will differ depending on the total count of | ||||
packets that have been through the pipe, the formula being: | ||||
P(t1 < T < t2) = R(t1) - R(t2), | ||||
where | ||||
T = number of packets at which a failure will occur with probability P; | ||||
t = number of packets: | ||||
t1 = number of packets (e.g., when failure last occurred) | ||||
t2 = t1 + target_run_length | ||||
R = survival function (probability of no failure within t packets): | ||||
R(t1) = exp (-t1/target_run_length) | ||||
R(t2) = exp (-t2/target_run_length) | ||||
The algorithm: | ||||
initialize the packet.counter = 0 | ||||
initialize the failed.packet.counter = 0 | ||||
start the loop | ||||
    if packet_response = ACK: | ||||
        increment the packet.counter | ||||
    else: | ||||
        ### The packet failed | ||||
        increment the packet.counter | ||||
        increment the failed.packet.counter | ||||
        P_fail_observed = failed.packet.counter / packet.counter | ||||
        upper_bound = packet.counter + target.run.length / 2 | ||||
        lower_bound = packet.counter - target.run.length / 2 | ||||
        R1 = exp( -upper_bound / target.run.length ) | ||||
        R0 = exp( -max(0, lower_bound) / target.run.length ) | ||||
        P_fail_predicted = R0 - R1 | ||||
        compare P_fail_observed vs. P_fail_predicted | ||||
    end-if | ||||
continue the loop | ||||
This algorithm allows accurate comparison of the observed failure | ||||
probability with the corresponding values predicted based on a fixed | ||||
target_failure_rate, which is equal to 1.0 / target_run_length. | ||||
6.2.3. Reordering Tolerance | ||||
All tests must be instrumented for packet level reordering [RFC4737]. | ||||
However, there is no consensus for how much reordering should be | ||||
acceptable. Over the last two decades the general trend has been to | ||||
make protocols and applications more tolerant to reordering, in | ||||
response to the gradual increase in reordering in the network. This | ||||
increase has been due to the gradual deployment of parallelism in the | ||||
network, as a consequence of such technologies as multithreaded route | ||||
lookups and Equal Cost Multipath (ECMP) routing. These techniques to | ||||
increase network parallelism are critical to enabling overall | ||||
Internet growth to exceed Moore's Law. | ||||
Section 5 of [RFC4737] proposed a metric that may be sufficient to | Section 5 of [RFC4737] proposed a metric that may be sufficient to | |||
designate isolated reordered packets as effectively lost, because | designate isolated reordered packets as effectively lost, because | |||
TCP's retransmission response would be the same. | TCP's retransmission response would be the same. | |||
[As a strawman, we propose the following:] TCP should be able to | TCP should be able to adapt to reordering as long as the reordering | |||
adapt to reordering as long as the reordering extent is no more than | extent is no more than the maximum of one half window or 1 ms, | |||
the maximum of one half window or 1 ms, whichever is larger. Note | whichever is larger. Note that there is a fundamental tradeoff | |||
that there is a fundamental tradeoff between tolerance to reordering | between tolerance to reordering and how quickly algorithms such as | |||
and how quickly algorithms such as fast retransmit can repair losses. | fast retransmit can repair losses. Within this limit on reorder | |||
Within this limit on reorder extent, there should be no bound on | extent, there should be no bound on reordering density. | |||
reordering density. | ||||
NB: Traditional TCP implementations were not compatible with this | NB: Traditional TCP implementations were not compatible with this | |||
metric; however, newer implementations still need to be evaluated. | metric; however, newer implementations still need to be evaluated. | |||
Parameters: | Parameters: | |||
Reordering displacement: the maximum of one half of target_pipe_size | Reordering displacement: the maximum of one half of target_pipe_size | |||
or 1 ms. | or 1 ms. | |||
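As an illustration, the sketch below expresses the displacement limit
above in packets, converting the 1 ms term to packets at the target
data rate; that conversion and the example values are assumptions, not
definitions from this document.

   # Illustrative sketch: reordering displacement limit ("half of
   # target_pipe_size or 1 ms, whichever is larger") expressed in
   # packets, under an assumed interpretation of the 1 ms term.
   target_pipe_size = 86          # packets, assumed example value
   target_rate = 10e6             # bits/s, assumed
   packet_bits = 1500 * 8

   half_window_pkts = target_pipe_size / 2.0
   one_ms_pkts = target_rate * 0.001 / packet_bits

   displacement_limit = max(half_window_pkts, one_ms_pkts)
   print("reordering displacement limit ~ %.1f packets" % displacement_limit)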
6.3. Test Qualifications | 6.3. Test Qualifications | |||
This entire section might be summarized as "needs to be specified in | This entire section needs to be completely overhauled. @@@@ It might | |||
a FSTDS" | be summarized as "needs to be specified in a FSTDS". | |||
Things to monitor before, during and after a test. | ||||
6.3.1. Verify the Traffic Generation Accuracy | ||||
[Excess detail for this doc. To be summarized] | Send pre-load traffic as needed to activate radios with a sleep mode, | |||
or other "reactive network" elements (term defined in | ||||
[draft-morton-ippm-2330-update-01]). | ||||
In general, failing to accurately generate the test traffic has to be
treated as an inconclusive test, since it must be presumed that the
error in traffic generation might have affected the test outcome. To
the extent that the network itself had an effect on the traffic
generation (e.g. in the standing queue tests), the possibility exists
that allowing too large an error margin in the traffic generation
might introduce feedback loops that compromise the vantage
independence properties of these tests.
Parameters:

Maximum Data Rate Error  The permitted amount by which the test
   traffic rate may differ from the rate specified for the current
   test. This is a symmetrical bound.

Maximum Data Rate Overage  The permitted amount by which the test
   traffic rate may exceed the rate specified for the current test.

Maximum Data Rate Underage  The permitted amount by which the test
   traffic rate may fall below the rate specified for the current
   test.
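A minimal sketch of this qualification, assuming the asymmetric form
of the bounds; the function name and the verdict strings are
illustrative, not part of the specification.

   def generation_verdict(measured_rate, specified_rate,
                          max_overage, max_underage):
       """Traffic generation accuracy check (illustrative only).

       Rates in bits per second; max_overage and max_underage are the
       permitted absolute deviations above and below the specified
       rate.  Traffic outside these bounds makes the test inconclusive.
       """
       if measured_rate > specified_rate + max_overage:
           return "inconclusive"
       if measured_rate < specified_rate - max_underage:
           return "inconclusive"
       return "ok"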
6.3.2. Verify the absence of cross traffic

The proper treatment of cross traffic is different for different
subpaths. In general, when testing infrastructure which is associated
with only one subscriber, the test should be treated as inconclusive
if that subscriber is active on the network. However, for shared
infrastructure managed by an ISP, the question at hand is likely to
be whether the ISP has sufficient total capacity. In such cases the
presence of cross traffic due to other subscribers is explicitly part
of the network conditions and its effects are explicitly part of the
test.

These two cases do not cover all subpaths. For example WiFi, which
shares unmanaged channel space with other devices, is unlikely to be
suitable for any prescriptive measurement.
Note that canceling tests due to load on subscriber lines may
introduce sampling bias for testing other parts of the
infrastructure. For this reason tests that are scheduled but not run
due to load should be treated as a special case of "inconclusive".
Use passive packet or SNMP monitoring to verify that the traffic
volume on the subpath agrees with the traffic generated by a test.
Ideally this should be performed before, during and after each test.
The goal is to provide quality assurance on the overall measurement
process, and specifically to detect the following measurement
failure: a user observes unexpectedly poor application performance
and the ISP observes that the access link is running at the rated
capacity, but both fail to observe that the user's computer has been
infected by a virus which is spewing traffic as fast as it can.

Parameters:

Maximum Cross Traffic Data Rate  The amount of cross traffic
   permitted. Note that this will be different for different tests.

One possible method is an adaptation of the approach of D. Agarwal et
al., "An Infrastructure for Passive Network Monitoring of Application
Data Streams" (www-didc.lbl.gov/papers/SCNM-PAM03.pdf): use the same
technique as that paper to trigger the capture of SNMP statistics for
the link.
6.3.3. Additional test preconditions

Send pre-load traffic as needed to activate radios with a sleep mode,
or other "reactive network" elements (term defined in
[draft-morton-ippm-2330-update-01]).

Use the procedure above to confirm that the pre-test background
traffic is low enough.
7. Diagnostic Tests

The diagnostic tests below are organized by traffic pattern: basic
data rate and run length, standing queues, slowstart bursts, and
sender rate bursts. We also introduce some combined tests which are
more efficient at the expense of conflating the signatures of
different failures.
7.1. Basic Data Rate and Run Length Tests

We propose several versions of the basic data rate and run length
test. All measure the number of packets delivered between losses or
ECN marks, using a data stream that is rate controlled at or below
the target_data_rate.

The tests below differ in how the data rate is controlled. The data
can be paced on a timer, or window controlled at full target data
rate.

skipping to change at page 28, line 8
7.1.1. Run Length at Paced Full Data Rate

Confirm that the observed run length is at least the
target_run_length while relying on a timer to send data at the
target_rate using the procedure described in Section 6.1.1 with a
burst size of 1 (single packets).

The test is considered to be inconclusive if the packet transmission
can not be accurately controlled for any reason.
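A minimal sketch of timer-paced single-packet sending; send_packet is
an assumed callback that transmits one test packet, and ignoring
header overhead in the interval computation is a simplification.

   import time

   def paced_single_packet_stream(send_packet, target_data_rate,
                                  target_mtu, duration):
       """Timer-paced stream for Section 7.1.1 (hypothetical API).

       Sends one target_MTU sized packet per interval so the average
       rate equals target_data_rate (bits per second).
       """
       interval = 8.0 * target_mtu / target_data_rate  # seconds/packet
       deadline = time.monotonic() + duration
       next_send = time.monotonic()
       while time.monotonic() < deadline:
           send_packet()
           next_send += interval
           delay = next_send - time.monotonic()
           if delay > 0:
               time.sleep(delay)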
7.1.2. Run Length at Full Data Windowed Rate

Confirm that the observed run length is at least the
target_run_length while sending at an average rate equal to the
target_data_rate, by controlling (or clamping) the window size of a
conventional transport protocol to a fixed value computed from the
properties of the test path, typically
test_window=target_data_rate*test_RTT/target_MTU.
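The draft states the test_window formula without units.  The sketch
below assumes target_data_rate in bits per second, test_RTT in
seconds and target_MTU in bytes, so the factor of 8 is an assumption
about the intended unit conversion.

   def test_window(target_data_rate, test_rtt, target_mtu):
       """test_window from Section 7.1.2, in packets (illustrative)."""
       return max(1, int(round(target_data_rate * test_rtt /
                               (8.0 * target_mtu))))

   # Example: 5 Mb/s over a 50 ms test path with 1500 byte packets
   # gives a clamp of about 21 packets.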
Since losses and ECN marks generally cause transport protocols to at
least temporarily reduce their data rates, this test is expected to
skipping to change at page 28, line 36
7.1.3. Background Run Length Tests

The background run length test is a low rate version of the target
rate test above, designed for ongoing lightweight monitoring for
changes in the observed subpath run length without disrupting users.
It should be used in conjunction with one of the above full rate
tests because it does not confirm that the subpath can support the
raw data rate.

Existing loss metrics such as [RFC6673] might be appropriate for
measuring background run length.
7.2. Standing Queue tests

These tests confirm that the bottleneck is well behaved across the
onset of packet loss, which typically follows after the onset of
queueing. Well behaved generally means lossless for transient queues,
but once the queue has been sustained for a sufficient period of time
(or reaches a sufficient queue depth) there should be a small number
of losses to signal to the transport protocol that it should reduce
its window. Losses that are too early can prevent the transport from
averaging at the target_data_rate. Losses that are too late indicate
that the queue might be subject to bufferbloat [Bufferbloat] and
inflict excess queuing delays on all flows sharing the bottleneck
queue. Excess losses make loss recovery problematic for the transport
protocol. Non-linear or erratic RTT fluctuations suggest poor
interactions between the channel acquisition systems and the
transport self clock. All of the tests in this section use the same
basic scanning algorithm but score the link on the basis of how well
it avoids each of these problems.

For some technologies the data might not be subject to increasing
delays, in which case the data rate will vary with the window size
all the way up to the onset of losses or ECN marks. For these
technologies, the discussion of queueing does not apply, but it is
still required that the onset of losses (or ECN marks) be at an
appropriate point and progressive.
Use the procedure in Section 6.1.3 to sweep the window across the
onset of queueing and the onset of loss. The tests below all assume
skipping to change at page 30, line 9
A link passes the congestion avoidance standing queue test if more
than target_run_length packets are delivered between the power point
(or test_window) and the first loss or ECN mark. If this test is
implemented using a standards compliant congestion control algorithm
with a clamp, it can be used in situ in the production Internet as a
capacity test. For an example of such a test see [NPAD].
7.2.2. Bufferbloat

This test confirms that there is some mechanism to limit buffer
occupancy (e.g. one that prevents bufferbloat). Note that this is not
strictly a requirement for single stream bulk performance, however if
there is no mechanism to limit buffer occupancy then a single stream
with sufficient data to deliver is likely to cause the problems
described in [RFC2309] and [Bufferbloat]. This may cause only minor
symptoms for the dominant flow, but has the potential to make the
link unusable for other flows and applications.

Pass if the onset of loss occurs before a standing queue has
introduced more delay than twice the target_RTT, or another well
defined limit. Note that there is not yet a model for how much
standing queue is acceptable. The factor of two chosen here reflects
a rule of thumb. Note that in conjunction with the previous test,
this test implies that the first loss should occur at a queueing
delay which is between one and two times the target_RTT.
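A minimal sketch of this rule-of-thumb criterion; the function name
and verdict strings are illustrative.

   def bufferbloat_verdict(queue_delay_at_first_loss, target_rtt):
       """Section 7.2.2 criterion (illustrative only).

       Both arguments are in seconds.  Passing requires that loss
       begins before the standing queue adds more than twice
       target_RTT of delay; together with the previous test this
       places the first loss between 1x and 2x target_RTT of
       queueing delay.
       """
       if queue_delay_at_first_loss <= 2.0 * target_rtt:
           return "pass"
       return "fail"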
7.2.3. Non excessive loss
skipping to change at page 30, line 41

are expected from a simple drop tail queue. Although this test could
be made more precise it is really included here for pedantic
completeness.
7.2.4. Duplex Self Interference

This engineering test confirms a bound on the interactions between
the forward data path and the ACK return path. Fail if the RTT rises
by more than some fixed bound above the expected queueing time
computed from the excess window divided by the link data rate. This
criterion needs further testing.
7.3. Slowstart tests

These tests mimic slowstart: data is sent at twice the effective
bottleneck rate to exercise the queue at the dominant bottleneck.
They are deemed inconclusive if the elapsed time to send the data
burst is not less than half of the time to receive the ACKs (i.e.
sending data too fast is ok, but sending it slower than twice the
actual bottleneck rate as indicated by the ACKs is deemed
inconclusive). Space the bursts such that the average data rate is
equal to the target_data_rate.
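The sketch below computes the burst timing implied by this paragraph;
rates in bits per second and sizes in bytes are assumptions, since
the draft states the rule without units.

   def slowstart_burst_timing(burst_packets, target_mtu,
                              effective_bottleneck_rate,
                              target_data_rate):
       """Burst timing for the Section 7.3 tests (illustrative).

       Each burst is sent at twice the effective bottleneck rate;
       bursts are spaced so the long-term average equals
       target_data_rate.  Returns (send_duration, burst_interval)
       in seconds.
       """
       burst_bits = 8.0 * burst_packets * target_mtu
       send_duration = burst_bits / (2.0 * effective_bottleneck_rate)
       burst_interval = burst_bits / target_data_rate
       return send_duration, burst_interval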
7.3.1. Full Window slowstart test

This is a capacity test to confirm that slowstart is not likely to
exit prematurely. Send slowstart bursts that are target_pipe_size
total packets.

Accumulate packet delivery statistics as described in Section 6.2.2
to score the outcome. Pass if it is statistically significant that
the observed run length is larger than the target_run_length. Fail
if it is statistically significant that the observed run length is
smaller than the target_run_length.

Note that these are the same parameters as the Sender Full Window
burst test, except the burst rate is at slowstart rate, rather than
sender interface rate.
7.3.2. Slowstart AQM test

Do a continuous slowstart (send data continuously at slowstart_rate)
until the first loss, stop, allow the network to drain and repeat,
gathering statistics on the last packet delivered before the loss,
the loss pattern, maximum observed RTT and window size. Justify the
results. There is not currently sufficient theory justifying
requiring any particular result, however design decisions that affect
the outcome of this test also affect how the network balances between
long and short flows (the "mice and elephants" problem).

This is an engineering test: it would be best performed on a
quiescent network or testbed, since cross traffic has the potential
to change the results.
7.4. Sender Rate Burst tests

These tests determine how well the network can deliver bursts sent at
the sender's interface rate. Note that this test most heavily
exercises the front path, and is likely to include infrastructure
that may be out of scope for a subscriber ISP.

Also, there are several details that are not precisely defined. For
starters there is not a standard server interface rate. 1 Gb/s and
10 Gb/s are very common today, but higher rates will become cost
effective and can be expected to be dominant some time in the future.

Current standards permit TCP to send full window bursts following an
application pause. Congestion Window Validation [RFC2861] is not
required, but even if it were, it does not take effect until an
application pause is longer than an RTO. Since this is standard
behavior, it is desirable that the network be able to deliver such
bursts, otherwise application pauses will cause unwarranted losses.

It is also understood in the application and serving community that
interface rate bursts have a cost to the network that has to be
balanced against other costs in the servers themselves. For example
TCP Segmentation Offload [TSO] reduces server CPU in exchange for
larger network bursts, which increase the stress on network buffer
memory.

There is not yet theory to unify these costs or to provide a
framework for trying to optimize global efficiency. We do not yet
have a model for how much the network should tolerate server rate
bursts. Some bursts must be tolerated by the network, but it is
probably unreasonable to expect the network to be able to efficiently
deliver all data as a series of bursts.

For this reason, this is the only test for which we explicitly
encourage derating. A TDS should include a table of pairs of derating
parameters: what burst size to use as a fraction of the
target_pipe_size, and how much each burst size is permitted to reduce
the run length, relative to the target_run_length.
7.5. Combined Tests

These tests are more efficient from a deployment/operational
perspective, but it may not be possible to diagnose the cause if they
fail.

7.5.1. Sustained burst test

Send target_pipe_size*derate sender interface rate bursts every
target_RTT*derate, for derate between 0 and 1 (a sketch of this
traffic pattern appears after the list below). Verify that the
observed run length meets target_run_length. Key observations:
o  This test is subpath RTT invariant, as long as the tester can
   generate the required pattern.

o  The subpath under test is expected to go idle for some fraction of
   the time: (subpath_data_rate-target_rate)/subpath_data_rate.
   Failing to do so suggests a problem with the procedure and an
   inconclusive test result.

o  This test is more strenuous than the slowstart tests: they are not
   needed if the link passes this test with derate=1.

o  A link that passes this test is likely to be able to sustain
   higher rates (close to subpath_data_rate) for paths with RTTs
   smaller than the target_RTT. Offsetting this performance
   underestimation is part of the rationale behind permitting
   derating in general.

o  This test can be implemented with standard instrumented TCP
   [RFC4898], using a specialized measurement application at one end
   and a minimal service at the other end [RFC 863, RFC 864]. It may
   require tweaks to the TCP implementation [MBMSource].

o  This test is efficient to implement, since it does not require
   per-packet timers, and can make use of TSO in modern NIC hardware.

o  This test is not totally sufficient: the standing window
   engineering tests are also needed to be sure that the link is well
   behaved at and beyond the onset of congestion.

o  This one test can be proven to be the one capacity test to
   supplant them all.
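The following sketch generates the sustained burst traffic pattern
described above.  send_burst is an assumed helper that transmits the
given number of packets back to back at the sender interface rate;
the num_bursts parameter is added only to make the example terminate.

   import time

   def sustained_burst_pattern(target_pipe_size, target_rtt, derate,
                               num_bursts, send_burst):
       """Traffic pattern for the sustained burst test (illustrative).

       Bursts of target_pipe_size*derate packets are sent every
       target_RTT*derate seconds, so the average rate equals the
       target_data_rate for any derate between 0 and 1.
       """
       assert 0 < derate <= 1
       burst_size = max(1, int(round(target_pipe_size * derate)))
       interval = target_rtt * derate          # seconds between bursts
       next_burst = time.monotonic()
       for _ in range(num_bursts):
           send_burst(burst_size)
           next_burst += interval
           delay = next_burst - time.monotonic()
           if delay > 0:
               time.sleep(delay)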
7.5.2. Live Streaming Media

Model Based Metrics can be implemented as a side effect of serving
any non-throughput maximizing traffic*, such as streaming media, with
some additional controls and instrumentation in the servers. The
essential requirement is that the traffic be constrained such that
even with arbitrary application pauses, bursts and data rate
fluctuations, the traffic stays within the envelope defined by the
individual tests described above, for a specific TDS.
If the serving_data_rate is less than or equal to the
target_data_rate and the serving_RTT (the RTT between the sender and
client) is less than the target_RTT, this constraint is most easily
implemented by clamping the transport window size to:

serving_window_clamp=target_data_rate*serving_RTT/
(target_MTU-header_overhead)

The serving_window_clamp will limit both the serving data rate and
burst sizes to be no larger than called for by the procedures in
Section 7.1.2 and Section 7.4 or Section 7.5.1. Since the serving RTT
is smaller than the target_RTT, the worst case bursts that might be
generated under these conditions will be smaller than called for by
Section 7.4 and the sender rate burst sizes are implicitly derated by
the serving_window_clamp divided by the target_pipe_size at the very
least. (The traffic might be smoother than specified by the sender
interface rate bursts test.)
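The clamp formula above is unit-free; the sketch below assumes the
data rate is in bits per second and sizes are in bytes, so the factor
of 8 is an assumption about the intended unit conversion.

   def serving_window_clamp(target_data_rate, serving_rtt, target_mtu,
                            header_overhead):
       """Window clamp in packets for Section 7.5.2 (illustrative)."""
       payload = target_mtu - header_overhead
       return max(1, int(target_data_rate * serving_rtt /
                         (8.0 * payload)))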
Note that if the application tolerates fluctuations in its actual
data rate (say by use of a playout buffer) it is important that the
target_data_rate be above the actual average rate needed by the
application so it can recover after transient pauses caused by
congestion or the application itself.

Alternatively the sender data rate and bursts might be explicitly
controlled by a host shaper or pacing at the sender. This would
provide better control and would work for serving_RTTs that are
larger than the target_RTT, but it is substantially more complicated
to implement. With this technique, any traffic might be used for
measurement.

*  Note that this technique might be applied to any content, if users
   are willing to tolerate reduced data rates to inhibit TCP
   equilibrium behavior.
8. Examples

In this section we present TDSes for a few performance
specifications.

Tentatively: 5 Mb/s*50 ms, 1 Mb/s*50 ms, 250 kb/s*100 ms.

8.1. Near serving HD streaming video

Today the best quality HD video requires slightly less than 5 Mb/s
[HDvideo]. Since it is desirable to serve such content locally, we
assume that the content will be within 50 ms, which is enough to
cover continental Europe or either US coast from a single site.
5 Mb/s over a 50 ms path

+----------------------+-------+---------+
| End to End Parameter | Value | units   |
+----------------------+-------+---------+
| target_rate          | 5     | Mb/s    |
| target_RTT           | 50    | ms      |
| target_MTU           | 1500  | bytes   |
| target_pipe_size     | 22    | packets |
skipping to change at page 35, line 5

Table 1

This example uses the most conservative TCP model and no derating.
8.2. Far serving SD streaming video

Standard Quality video typically fits in 1 Mb/s [SDvideo]. This can
reasonably be delivered via longer paths with larger RTTs; we assume
100 ms.
1 Mb/s over a 100 ms path

+----------------------+-------+---------+
| End to End Parameter | Value | units   |
+----------------------+-------+---------+
| target_rate          | 1     | Mb/s    |
| target_RTT           | 100   | ms      |
| target_MTU           | 1500  | bytes   |
| target_pipe_size     | 9     | packets |
| target_run_length    | 243   | packets |
+----------------------+-------+---------+
skipping to change at page 35, line 43
| target_RTT           | 200     | ms      |
| target_MTU           | 1500    | bytes   |
| target_pipe_size     | 1741    | packets |
| target_run_length    | 9093243 | packets |
+----------------------+---------+---------+

Table 3
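The sketch below illustrates how table entries of this form may be
derived.  It assumes target_pipe_size is the window needed to fill
the path and that the reference model gives target_run_length =
3*target_pipe_size^2; the exact rounding and header accounting used
to produce the tables above may differ.

   import math

   def tds_parameters(target_rate, target_rtt, target_mtu,
                      header_overhead=0):
       """Derive (target_pipe_size, target_run_length), illustrative.

       target_rate in bits per second, target_rtt in seconds, sizes in
       bytes.
       """
       payload = target_mtu - header_overhead
       pipe = math.ceil(target_rate * target_rtt / (8 * payload))
       run_length = 3 * pipe * pipe
       return pipe, run_length

   # 1 Mb/s over a 100 ms path yields (9, 243), matching Table 2.
   print(tds_parameters(1e6, 0.1, 1500))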
9. Validation

Since some aspects of the models are likely to be too conservative,
Section 5.2 and Section 5.3 permit alternate protocol models and test
parameter derating. In exchange for this latitude in the modelling
process, we require demonstrations that such a TDS can robustly
detect links that will prevent authentic applications using
state-of-the-art protocol implementations from meeting the specified
performance targets. This correctness criterion is potentially
difficult to prove, because it implicitly requires validating a TDS
against all possible links and subpaths.
We suggest two strategies, both of which should be applied: first,
publish a fully open description of the TDS, including what
assumptions were used and how it was derived, such that the research
community can evaluate these decisions, test them and comment on
their applicability; and second, demonstrate that applications
running over an infinitesimally passing testbed do meet the
performance targets.

An infinitesimally passing testbed resembles an epsilon-delta proof
in calculus. Construct a test network such that all of the individual
tests of the TDS only pass by small (infinitesimal) margins, and
demonstrate that a variety of authentic applications running over
real TCP implementations (or other protocols as appropriate) meet the
end-to-end target parameters over such a network. The workloads
should include multiple types of streaming media and transaction
oriented short flows (e.g. synthetic web traffic).
For example, taking the HD streaming video TDS described in
Section 8.1, the bottleneck data rate should be 5 Mb/s, the per
packet random background loss probability should be 1/1453, for a run
length of 1452 packets, the bottleneck queue should be 22 packets and
the front path should have just enough buffering to withstand 22
packet line rate bursts. We want every one of the TDS tests to fail
if we slightly increase the relevant test parameter, so for example
sending a 23 packet slowstart burst should cause excess (possibly
deterministic) packet drops at the dominant queue at the bottleneck.
On this infinitesimally passing network it should be possible for a
real application using a stock TCP implementation in the vendor's
default configuration to attain 5 Mb/s over a 50 ms path.
The most difficult part of setting up such a testbed is arranging for
each subpath to infinitesimally pass the individual tests. We suggest
two approaches: constraining the network devices not to use all
available resources (limiting available buffer space or data rate);
and preloading subpaths with cross traffic. Note that it is important
that a single environment be constructed which infinitesimally passes
all tests at the same time, otherwise there is a chance that TCP can
exploit extra latitude in some parameters (such as data rate) to
partially compensate for constraints in other parameters (such as
queue space), or vice versa.
To the extent that a TDS is used to inform public dialog, it should
be fully publicly documented, including the details of the tests,
what assumptions were used and how it was derived. All of the details
of the validation experiment should also be public with sufficient
detail for the experiments to be replicated by other researchers. All
components should either be open source or be fully described
proprietary implementations that are available to the research
community.

The work here is inspired by open tools running on an open platform,
using open techniques to collect open data. See Measurement Lab
[http://www.measurementlab.net/].
10. Acknowledgements

Ganga Maguluri suggested the statistical test for measuring loss
probability in the target run length. Alex Gilgur helped with the
statistics and contributed an alternate model. Meredith Whittaker
improved the clarity of the communications.
11. Informative References

[RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
           S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
           Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
           S., Wroclawski, J., and L. Zhang, "Recommendations on
           Queue Management and Congestion Avoidance in the
           Internet", RFC 2309, April 1998.

[RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
           "Framework for IP Performance Metrics", RFC 2330,
           May 1998.

[RFC2861]  Handley, M., Padhye, J., and S. Floyd, "TCP Congestion
           Window Validation", RFC 2861, June 2000.

[RFC3148]  Mathis, M. and M. Allman, "A Framework for Defining
           Empirical Bulk Transfer Capacity Metrics", RFC 3148,
           July 2001.

[RFC3465]  Allman, M., "TCP Congestion Control with Appropriate Byte
           Counting (ABC)", RFC 3465, February 2003.

[RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
           S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
           November 2006.

[RFC4898]  Mathis, M., Heffner, J., and R. Raghunarayan, "TCP
           Extended Statistics MIB", RFC 4898, May 2007.

[RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
           Control", RFC 5681, September 2009.

[RFC5835]  Morton, A. and S. Van den Berghe, "Framework for Metric
           Composition", RFC 5835, April 2010.

[RFC6049]  Morton, A. and E. Stephan, "Spatial Composition of
           Metrics", RFC 6049, January 2011.

[RFC6673]  Morton, A., "Round-Trip Packet Loss Metrics", RFC 6673,
           August 2012.

[I-D.morton-ippm-lmap-path]
           Bagnulo, M., Burbridge, T., Crawford, S., Eardley, P., and
           A. Morton, "A Reference Path and Measurement Points for
           LMAP", draft-morton-ippm-lmap-path-00 (work in progress),
           January 2013.

[MSMO97]   Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The
           Macroscopic Behavior of the TCP Congestion Avoidance
           Algorithm", Computer Communications Review volume 27,
           number 3, July 1997.

skipping to change at page 38, line 43

[MBMSource]
           Hamon, D., "Git Repository for Model Based Metrics",
           Sept 2013, <https://github.com/m-lab/MBM>.

[Pathdiag] Mathis, M., Heffner, J., O'Neil, P., and P. Siemsen,
           "Pathdiag: Automated TCP Diagnosis", Passive and Active
           Measurement, June 2008.

[BScope]   Browserscope, "Browserscope Network tests", Sept 2012,
           <http://www.browserscope.org/?category=network>.

[StatQC]   Montgomery, D., "Introduction to Statistical Quality
           Control - 2nd ed.", ISBN 0-471-51988-X, 1990.

[Rtool]    R Development Core Team, "R: A language and environment
           for statistical computing. R Foundation for Statistical
           Computing, Vienna, Austria. ISBN 3-900051-07-0, URL
           http://www.R-project.org/", 2011.

[CVST]     Krueger, T. and M. Braun, "R package: Fast Cross-
           Validation via Sequential Testing", version 0.1, 11 2012.

[LMCUBIC]  Ledesma Goyzueta, R. and Y. Chen, "A Deterministic Loss
           Model Based Analysis of CUBIC, IEEE International
           Conference on Computing, Networking and Communications
           (ICNC), E-ISBN : 978-1-4673-5286-4", January 2013.
Appendix A. Model Derivations

The reference target_run_length described in Section 5.2 is based on
very conservative assumptions: that all excess window above
target_pipe_size contributes to a standing queue that raises the RTT,
and that classic Reno congestion control with delayed ACKs is in
effect. In this section we provide two alternative calculations using
different assumptions.

It may seem out of place to allow such latitude in a measurement
standard, but the validation procedures in Section 9 provide
offsetting requirements.
The estimates provided by these models make the most sense if network
performance is viewed logarithmically. In the operational Internet,
data rates span more than 8 orders of magnitude, RTT spans more than
3 orders of magnitude, and loss probability spans at least 8 orders
of magnitude. When viewed logarithmically (as in decibels), these
correspond to 80 dB of dynamic range. On an 80 dB scale, a 3 dB error
is less than 4% of the scale, even though it might represent a factor
of 2 in the untransformed parameter.
This document gives a lot of latitude for calculating
target_run_length, however people designing a TDS should consider the
effect of their choices on the ongoing tussle about the relevance of
"TCP friendliness" as an appropriate model for Internet capacity
allocation. Choosing a target_run_length that is substantially
smaller than the reference target_run_length specified in Section 5.2
strengthens the argument that it may be appropriate to abandon "TCP
friendliness" as the Internet fairness model. This gives developers
incentive and permission to develop even more aggressive applications
and protocols, for example by increasing the number of connections
that they open concurrently.
A.1. Queueless Reno

In Section 5.2 it is assumed that the target rate is the same as the
link rate, and any excess window causes a standing queue at the
bottleneck. This might be representative of a non-shared access link.
An alternative situation would be a heavily aggregated subpath where
individual flows do not significantly contribute to the queueing
delay, and losses are determined by monitoring the average data rate,
for example by the use of a virtual queue as in [AFD]. In such a
scheme the RTT is constant and TCP's AIMD congestion control causes
the data rate to fluctuate in a sawtooth. If the traffic is being
controlled in a manner that is consistent with the metrics here, the
goal would be to make the actual average rate equal to the
target_data_rate.
We can derive a model for Reno TCP and delayed ACK under the above
set of assumptions: for some value of Wmin, the window will sweep
from Wmin to 2*Wmin in 2*Wmin RTT. Unlike the queueing case, where
Wmin = target_pipe_size, we want the average of Wmin and 2*Wmin to be
the target_pipe_size, so the average rate is the target rate. Thus we
want Wmin = (2/3)*target_pipe_size.

Between losses each sawtooth delivers (1/2)(Wmin+2*Wmin)(2*Wmin) =
3*Wmin^2 packets in 2*Wmin round trip times.

Substituting these together we get:

target_run_length = (4/3)(target_pipe_size^2)

Note that this is 44% of the reference run length. This makes sense
because under the assumptions in Section 5.2 the AIMD sawtooth caused
a queue at the bottleneck, which raised the effective RTT by 50%.
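The short check below verifies the algebra above; it is purely
illustrative and uses exact fractions so the (4/3) factor falls out
without rounding.

   from fractions import Fraction

   def queueless_reno_run_length(target_pipe_size):
       """Check the A.1 derivation (illustrative).

       Wmin = (2/3)*target_pipe_size and each sawtooth delivers
       (1/2)*(Wmin + 2*Wmin)*(2*Wmin) = 3*Wmin^2 packets per loss,
       which works out to (4/3)*target_pipe_size^2.
       """
       wmin = Fraction(2, 3) * target_pipe_size
       per_loss = Fraction(1, 2) * (wmin + 2 * wmin) * (2 * wmin)
       assert per_loss == Fraction(4, 3) * target_pipe_size ** 2
       return per_loss

   # For target_pipe_size = 22 this gives about 645 packets, which is
   # 44% of the reference target_run_length of 3 * 22**2 = 1452.
   print(float(queueless_reno_run_length(22)))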
A.2. CUBIC

CUBIC has three operating regions. The model for the expected value
of window size derived in [LMCUBIC] assumes operation in the
"concave" region only, which is a non-TCP friendly region for long-
lived flows. The authors make the following assumptions: packet loss
probability, p, is independent and periodic, losses occur one at a
time, and they are true losses due to tail drop or corruption. This
definition of p aligns very well with our definition of
skipping to change at page 41, line 4
authors transform the time to reach the maximum Window size in terms | authors transform the time to reach the maximum Window size in terms | |||
of RTT and a parameter for the multiplicative rate decrease on | of RTT and a parameter for the multiplicative rate decrease on | |||
observing loss, beta (whose default value is 0.2 in CUBIC). The | observing loss, beta (whose default value is 0.2 in CUBIC). The | |||
expected value of Window size, E[W], is also dependent on C, a | expected value of Window size, E[W], is also dependent on C, a | |||
parameter of CUBIC that determines its window-growth aggressiveness | parameter of CUBIC that determines its window-growth aggressiveness | |||
(values from 0.01 to 4). | (values from 0.01 to 4). | |||
E[W] = ( C*(RTT/p)^3 * ((4-beta)/beta) )^(1/4)
and, further assuming Poisson arrival, the mean throughput, x, is

x = E[W]/RTT
We note that under these conditions (deterministic single losses),
the value of E[W] is always greater than 0.8 of the maximum window
size ~= reference_run_length (though this observation has not been
verified rigorously).
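As an illustration only, the sketch below plugs placeholder numbers
into the E[W] and x formulas quoted above, reading the outer exponent
as a fourth root.  The values of C, beta, RTT and p are not taken
from [LMCUBIC]; they merely show the shape of the calculation.

   # Sketch only: evaluate the CUBIC model quoted above with
   # illustrative parameter values, not taken from [LMCUBIC].

   def cubic_expected_window(C, rtt, p, beta=0.2):
       # Outer exponent read as a fourth root.
       return (C * (rtt / p) ** 3 * ((4.0 - beta) / beta)) ** 0.25

   rtt = 0.100     # seconds
   p = 1.0e-6      # loss probability
   C = 0.4         # CUBIC window-growth aggressiveness
   e_w = cubic_expected_window(C, rtt, p)
   x = e_w / rtt   # mean throughput implied by the model
   print(e_w, x)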
Appendix B. Complex Queueing

For many network technologies simple queueing models do not apply:
the network schedules, thins or otherwise alters the timing of ACKs
and data, generally to raise the efficiency of the channel allocation
process when confronted with relatively widely spaced small ACKs.
These efficiency strategies are ubiquitous for half duplex, wireless
and broadcast media.
Altering the ACK stream generally has two consequences: it raises the
effective bottleneck data rate, causing slowstart to burst at higher
rates (possibly as high as the sender's interface rate), and it
effectively raises the RTT by the average time that the ACKs were
delayed.  The first effect can be partially mitigated by reclocking
ACKs once they are beyond the bottleneck on the return path to the
sender; however, this further raises the effective RTT.
The most extreme example of this sort of behavior would be a half
duplex channel that is not released as long as the end point
currently holding the channel has pending traffic.  Such environments
cause self clocked protocols under full load to revert to extremely
inefficient stop and wait behavior, where they send an entire window
of data as a single burst, followed by the entire window of ACKs on
the return path.
If a particular end-to-end path contains a link or device that alters
the ACK stream, then the entire path from the sender up to the
bottleneck must be tested at the burst parameters implied by the ACK
scheduling algorithm.  The most important parameter is the Effective
Bottleneck Data Rate, which is the average rate at which the ACKs
advance snd.una.  Note that thinning the ACKs (relying on the
cumulative nature of seg.ack to permit discarding some ACKs) implies
an effectively infinite bottleneck data rate.  It is important to
note that due to the self clock, ill-conceived channel allocation
mechanisms can increase the stress on upstream links in a long path.
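For illustration, the Effective Bottleneck Data Rate can be estimated
from a sender-side trace of when snd.una advances.  The sketch below
is only a sketch: the (time, snd.una) trace format is hypothetical,
and any capture tool that records cumulative ACK arrivals could
supply equivalent input.

   # Sketch only: estimate the Effective Bottleneck Data Rate as the
   # average rate at which ACKs advance snd.una.  The (time_seconds,
   # snd_una_bytes) trace format is hypothetical.

   def effective_bottleneck_data_rate(ack_trace):
       (t0, una0) = ack_trace[0]
       (t1, una1) = ack_trace[-1]
       if t1 <= t0:
           raise ValueError("trace spans no time")
       return (una1 - una0) * 8.0 / (t1 - t0)   # bits per second

   trace = [(0.000, 0), (0.010, 14600), (0.020, 14600), (0.030, 43800)]
   print(effective_bottleneck_data_rate(trace))  # ~11.7 Mb/s here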
Holding data or ACKs for channel allocation or other reasons (such as
error correction) always raises the effective RTT relative to the
minimum delay for the path.  Therefore it may be necessary to replace
target_RTT in the calculation in Section 5.2 by an effective_RTT,
which includes the target_RTT for the fixed part of the path plus a
term to account for the extra delays introduced by these mechanisms.
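A minimal sketch of that substitution follows.  It assumes the
Section 5.2 pipe size calculation is the usual bandwidth-delay
product expressed in packets (an assumption, since the formula is not
restated here), and the hold-time terms are placeholders for values
measured on the subpath under test.

   # Sketch only: substitute an effective_RTT into a bandwidth-delay
   # product style pipe size calculation.  The formula and hold-time
   # values are assumptions, not quoted from Section 5.2.

   def effective_rtt(target_rtt, ack_hold_time, data_hold_time):
       return target_rtt + ack_hold_time + data_hold_time

   def pipe_size_packets(target_data_rate_bps, rtt_seconds,
                         mtu_bytes=1500):
       return target_data_rate_bps * rtt_seconds / (8.0 * mtu_bytes)

   rtt_eff = effective_rtt(0.030, 0.004, 0.002)   # seconds
   print(pipe_size_packets(10.0e6, rtt_eff))      # ~30 packets at 10 Mb/s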
Appendix C. Version Control
Authors' Addresses

Matt Mathis
Google, Inc
1600 Amphitheatre Parkway
Mountain View, California 94043
USA

Email: mattmathis@google.com

Al Morton
AT&T Labs
200 Laurel Avenue South
Middletown, NJ 07748
USA