draft-ietf-opsawg-large-flow-load-balancing-14.txt | draft-ietf-opsawg-large-flow-load-balancing-15.txt | |||
---|---|---|---|---|
OPSAWG R. Krishnan | OPSAWG R. Krishnan | |||
Internet Draft Brocade Communications | Internet Draft Brocade Communications | |||
Intended status: Informational L. Yong | Intended status: Informational L. Yong | |||
Expires: March 13, 2015 Huawei USA | Expires: April 6, 2015 Huawei USA | |||
A. Ghanwani | A. Ghanwani | |||
Dell | Dell | |||
Ning So | Ning So | |||
Tata Communications | Tata Communications | |||
B. Khasnabish | B. Khasnabish | |||
ZTE Corporation | ZTE Corporation | |||
September 26, 2014 | October 7, 2014 | |||
Mechanisms for Optimizing LAG/ECMP Component Link Utilization in | Mechanisms for Optimizing LAG/ECMP Component Link Utilization in | |||
Networks | Networks | |||
draft-ietf-opsawg-large-flow-load-balancing-14.txt | draft-ietf-opsawg-large-flow-load-balancing-15.txt | |||
Status of this Memo | Status of this Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. This document may not be modified, | provisions of BCP 78 and BCP 79. This document may not be modified, | |||
and derivative works of it may not be created, except to publish it | and derivative works of it may not be created, except to publish it | |||
as an RFC and to translate it into languages other than English. | as an RFC and to translate it into languages other than English. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 1, line 42 | skipping to change at page 1, line 42 | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
This Internet-Draft will expire on March 26, 2015. | This Internet-Draft will expire on April 6, 2015. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2014 IETF Trust and the persons identified as the | Copyright (c) 2014 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
Internet-Draft Optimizing Load Distribution over LAG/ECMP September | ||||
2014 | ||||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
skipping to change at page 2, line 35 | skipping to change at page 2, line 32 | |||
bandwidth scaling. This draft explores some of the mechanisms useful | bandwidth scaling. This draft explores some of the mechanisms useful | |||
for achieving this. | for achieving this. | |||
Table of Contents | Table of Contents | |||
1. Introduction...................................................3 | 1. Introduction...................................................3 | |||
1.1. Acronyms..................................................4 | 1.1. Acronyms..................................................4 | |||
1.2. Terminology...............................................4 | 1.2. Terminology...............................................4 | |||
2. Flow Categorization............................................5 | 2. Flow Categorization............................................5 | |||
3. Hash-based Load Distribution in LAG/ECMP.......................6 | 3. Hash-based Load Distribution in LAG/ECMP.......................6 | |||
4. Mechanisms for Optimizing LAG/ECMP Component Link Utilization..8 | 4. Mechanisms for Optimizing LAG/ECMP Component Link Utilization..7 | |||
4.1. Differences in LAG vs ECMP................................8 | 4.1. Differences in LAG vs ECMP................................8 | |||
4.2. Operational Overview.....................................10 | 4.2. Operational Overview......................................9 | |||
4.3. Large Flow Recognition...................................11 | 4.3. Large Flow Recognition...................................10 | |||
4.3.1. Flow Identification.................................11 | 4.3.1. Flow Identification.................................10 | |||
4.3.2. Criteria and Techniques for Large Flow Recognition..11 | 4.3.2. Criteria and Techniques for Large Flow Recognition..11 | |||
4.3.3. Sampling Techniques.................................12 | 4.3.3. Sampling Techniques.................................11 | |||
4.3.4. Inline Data Path Measurement........................13 | 4.3.4. Inline Data Path Measurement........................13 | |||
4.3.5. Use of Multiple Methods for Large Flow Recognition..14 | 4.3.5. Use of Multiple Methods for Large Flow Recognition..14 | |||
4.4. Load Rebalancing Options.................................15 | 4.4. Load Rebalancing Options.................................14 | |||
4.4.1. Alternative Placement of Large Flows................15 | 4.4.1. Alternative Placement of Large Flows................14 | |||
4.4.2. Redistributing Small Flows..........................15 | 4.4.2. Redistributing Small Flows..........................15 | |||
4.4.3. Component Link Protection Considerations............16 | 4.4.3. Component Link Protection Considerations............15 | |||
4.4.4. Load Rebalancing Algorithms.........................16 | 4.4.4. Load Rebalancing Algorithms.........................15 | |||
4.4.5. Load Rebalancing Example............................16 | 4.4.5. Load Rebalancing Example............................16 | |||
5. Information Model for Flow Rebalancing........................17 | 5. Information Model for Flow Rebalancing........................17 | |||
5.1. Configuration Parameters for Flow Rebalancing............18 | 5.1. Configuration Parameters for Flow Rebalancing............17 | |||
5.2. System Configuration and Identification Parameters.......19 | 5.2. System Configuration and Identification Parameters.......18 | |||
5.3. Information for Alternative Placement of Large Flows.....19 | 5.3. Information for Alternative Placement of Large Flows.....19 | |||
5.4. Information for Redistribution of Small Flows............20 | 5.4. Information for Redistribution of Small Flows............19 | |||
5.5. Export of Flow Information...............................20 | 5.5. Export of Flow Information...............................20 | |||
5.6. Monitoring information...................................21 | 5.6. Monitoring information...................................20 | |||
5.6.1. Interface (link) utilization........................21 | 5.6.1. Interface (link) utilization........................20 | |||
5.6.2. Other monitoring information........................21 | 5.6.2. Other monitoring information........................21 | |||
6. Operational Considerations....................................22 | 6. Operational Considerations....................................21 | |||
6.1. Rebalancing Frequency....................................22 | 6.1. Rebalancing Frequency....................................21 | |||
6.2. Handling Route Changes...................................22 | 6.2. Handling Route Changes...................................21 | |||
6.3. Forwarding Resources.....................................22 | 6.3. Forwarding Resources.....................................22 | |||
7. IANA Considerations...........................................23 | 7. IANA Considerations...........................................22 | |||
8. Security Considerations.......................................23 | 8. Security Considerations.......................................22 | |||
9. Contributing Authors..........................................23 | 9. Contributing Authors..........................................22 | |||
10. Acknowledgements.............................................23 | 10. Acknowledgements.............................................23 | |||
11. References...................................................24 | 11. References...................................................23 | |||
11.1. Normative References....................................24 | 11.1. Normative References....................................23 | |||
11.2. Informative References..................................24 | 11.2. Informative References..................................23 | |||
1. Introduction | 1. Introduction | |||
Networks extensively use link aggregation groups (LAG) [802.1AX] and | Networks extensively use link aggregation groups (LAG) [802.1AX] and | |||
equal cost multi-paths (ECMP) [RFC 2991] as techniques for capacity | equal cost multi-paths (ECMP) [RFC 2991] as techniques for capacity | |||
scaling. For the problems addressed by this document, network traffic | scaling. For the problems addressed by this document, network traffic | |||
can be predominantly categorized into two traffic types: long-lived | can be predominantly categorized into two traffic types: long-lived | |||
large flows and other flows. These other flows, which include long- | large flows and other flows. These other flows, which include long- | |||
lived small flows, short-lived small flows, and short-lived large | lived small flows, short-lived small flows, and short-lived large | |||
flows, are referred to as "small flows" in this document. Long-lived | flows, are referred to as "small flows" in this document. Long-lived | |||
skipping to change at page 4, line 4 | skipping to change at page 3, line 49 | |||
This draft describes mechanisms for optimizing LAG/ECMP component | This draft describes mechanisms for optimizing LAG/ECMP component | |||
link utilization while using hash-based techniques. The mechanisms | link utilization while using hash-based techniques. The mechanisms | |||
comprise the following steps -- recognizing large flows in a router; | comprise the following steps -- recognizing large flows in a router; | |||
and assigning the large flows to specific LAG/ECMP component links or | and assigning the large flows to specific LAG/ECMP component links or | |||
redistributing the small flows when a component link on the router is | redistributing the small flows when a component link on the router is | |||
congested. | congested. | |||
It is useful to keep in mind that in typical use cases for this | It is useful to keep in mind that in typical use cases for this | |||
mechanism the large flows are those that consume a significant amount | mechanism the large flows are those that consume a significant amount | |||
of bandwidth on a link, e.g. greater than 5% of link bandwidth. The | of bandwidth on a link, e.g. greater than 5% of link bandwidth. The | |||
number of such flows would necessarily be fairly small, e.g. on the | number of such flows would necessarily be fairly small, e.g. on the | |||
order of tens or hundreds per LAG/ECMP. In other words, the number of | order of tens or hundreds per LAG/ECMP. In other words, the number of | |||
large flows is NOT expected to be on the order of millions of flows. | large flows is NOT expected to be on the order of millions of flows. | |||
Examples of such large flows would be IPsec tunnels in service | Examples of such large flows would be IPsec tunnels in service | |||
provider backbone networks or storage backup traffic in data center | provider backbone networks or storage backup traffic in data center | |||
networks. | networks. | |||
1.1. Acronyms | 1.1. Acronyms | |||
skipping to change at page 4, line 46 | skipping to change at page 4, line 40 | |||
TCAM: Ternary Content Addressable Memory | TCAM: Ternary Content Addressable Memory | |||
VXLAN: Virtual Extensible LAN | VXLAN: Virtual Extensible LAN | |||
1.2. Terminology | 1.2. Terminology | |||
Central management entity: Refers to an entity that is capable of | Central management entity: Refers to an entity that is capable of | |||
monitoring information about link utilization and flows in routers | monitoring information about link utilization and flows in routers | |||
across the network and may be capable of making traffic engineering | across the network and may be capable of making traffic engineering | |||
decisions for placement of large flows. It may include the functions | decisions for placement of large flows. It may include the functions | |||
of a collector if the routers employ a sampling technique [RFC 7011]. | of a collector [RFC 7011]. | |||
ECMP component link: An individual nexthop within an ECMP group. An | ECMP component link: An individual nexthop within an ECMP group. An | |||
ECMP component link may itself comprise a LAG. | ECMP component link may itself comprise a LAG. | |||
ECMP table: A table that is used as the nexthop of an ECMP route that | ECMP table: A table that is used as the nexthop of an ECMP route that | |||
comprises the set of ECMP component links and the weights associated | comprises the set of ECMP component links and the weights associated | |||
with each of those ECMP component links. The input for looking up | with each of those ECMP component links. The input for looking up | |||
the table is the hash value for the packet, and the weights are used | the table is the hash value for the packet, and the weights are used | |||
to determine which values of the hash function map to a given ECMP | to determine which values of the hash function map to a given ECMP | |||
component link. | component link. | |||
LAG component link: An individual link within a LAG. A LAG component | LAG component link: An individual link within a LAG. A LAG component | |||
link is typically a physical link. | link is typically a physical link. | |||
LAG table: A table that is used as the output port which is a LAG | LAG table: A table that is used as the output port which is a LAG | |||
that comprises the set of LAG component links and the weights | that comprises the set of LAG component links and the weights | |||
associated with each of those component links. The input for looking | associated with each of those component links. The input for looking | |||
skipping to change at page 6, line 5 | skipping to change at page 5, line 31 | |||
2. Flow Categorization | 2. Flow Categorization | |||
In general, based on the size and duration, a flow can be categorized | In general, based on the size and duration, a flow can be categorized | |||
into any one of the following four types, as shown in Figure 1: | into any one of the following four types, as shown in Figure 1: | |||
(a) Short-lived Large Flow (SLLF), | (a) Short-lived Large Flow (SLLF), | |||
(b) Short-lived Small Flow (SLSF), | (b) Short-lived Small Flow (SLSF), | |||
(c) Long-lived Large Flow (LLLF), and | (c) Long-lived Large Flow (LLLF), and | |||
(d) Long-lived Small Flow (LLSF). | (d) Long-lived Small Flow (LLSF). | |||
Flow Bandwidth | Flow Bandwidth | |||
^ | ^ | |||
|--------------------|--------------------| | |--------------------|--------------------| | |||
| | | | | | | | |||
Large | SLLF | LLLF | | Large | SLLF | LLLF | | |||
Flow | | | | Flow | | | | |||
|--------------------|--------------------| | |--------------------|--------------------| | |||
| | | | | | | | |||
Small | SLSF | LLSF | | Small | SLSF | LLSF | | |||
Flow | | | | Flow | | | | |||
skipping to change at page 7, line 4 | skipping to change at page 6, line 30 | |||
number of flows is mapped to each component link, if the individual | number of flows is mapped to each component link, if the individual | |||
flow rates are much smaller than the link capacity, and if | flow rates are much smaller than the link capacity, and if | |||
the rate differences are not dramatic, hash-based techniques produce | the rate differences are not dramatic, hash-based techniques produce | |||
good results with respect to utilization of the individual component | good results with respect to utilization of the individual component | |||
links. However, if one or more of these conditions are not met, hash- | links. However, if one or more of these conditions are not met, hash- | |||
based techniques may result in imbalance in the loads on individual | based techniques may result in imbalance in the loads on individual | |||
component links. | component links. | |||
One example is illustrated in Figure 2. In Figure 2, there are two | One example is illustrated in Figure 2. In Figure 2, there are two | |||
routers, R1 and R2, and there is a LAG between them which has 3 | routers, R1 and R2, and there is a LAG between them which has 3 | |||
component links (1), (2), (3). There are a total of 10 flows that | component links (1), (2), (3). There are a total of 10 flows that | |||
need to be distributed across the links in this LAG. The result of | need to be distributed across the links in this LAG. The result of | |||
applying the hash-based technique is as follows: | applying the hash-based technique is as follows: | |||
. Component link (1) has 3 flows -- 2 small flows and 1 large | . Component link (1) has 3 flows -- 2 small flows and 1 large | |||
flow -- and the link utilization is normal. | flow -- and the link utilization is normal. | |||
. Component link (2) has 3 flows -- 3 small flows and no large | . Component link (2) has 3 flows -- 3 small flows and no large | |||
flow -- and the link utilization is light. | flow -- and the link utilization is light. | |||
skipping to change at page 7, line 34 | skipping to change at page 7, line 11 | |||
o The presence of 2 large flows causes congestion on this | o The presence of 2 large flows causes congestion on this | |||
component link. | component link. | |||
+-----------+ -> +-----------+ | +-----------+ -> +-----------+ | |||
| | -> | | | | | -> | | | |||
| | ===> | | | | | ===> | | | |||
| (1)|--------|(1) | | | (1)|--------|(1) | | |||
| | -> | | | | | -> | | | |||
| | -> | | | | | -> | | | |||
| (R1) | -> | (R2) | | | (R1) | -> | (R2) | | |||
| (2)|--------|(2) | | | (2)|--------|(2) | | |||
| | -> | | | | | -> | | | |||
| | -> | | | | | -> | | | |||
| | ===> | | | | | ===> | | | |||
| | ===> | | | | | ===> | | | |||
| (3)|--------|(3) | | | (3)|--------|(3) | | |||
| | | | | | | | | | |||
+-----------+ +-----------+ | +-----------+ +-----------+ | |||
Where: -> small flow | Where: -> small flow | |||
===> large flow | ===> large flow | |||
Figure 2: Unevenly Utilized Component Links | Figure 2: Unevenly Utilized Component Links | |||
This document presents mechanisms for addressing the imbalance in | This document presents mechanisms for addressing the imbalance in | |||
load distribution resulting from commonly used hash-based techniques | load distribution resulting from commonly used hash-based techniques | |||
for LAG/ECMP that were shown in the above example. The mechanisms use | for LAG/ECMP that were shown in the above example. The mechanisms use | |||
large flow awareness to compensate for the imbalance in load | large flow awareness to compensate for the imbalance in load | |||
distribution. | distribution. | |||
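The imbalance in Figure 2 can be reproduced with a short, non-normative Python sketch. The link capacity and the per-flow rates are assumed values chosen only for illustration (the draft gives none), the helper names pick_link and link_loads are hypothetical, and CRC-32 merely stands in for whatever hash function a router actually uses. The point is that hash-based selection ignores flow rate, so the per-link load depends on where the large flows happen to land.

```python
import zlib
from collections import defaultdict

# Assumed values for illustration only; the draft does not specify them.
LINK_CAPACITY_MBPS = 10_000  # hypothetical 10 Gbps component links
SMALL_MBPS = 500             # hypothetical small-flow rate
LARGE_MBPS = 6_000           # hypothetical large-flow rate


def pick_link(flow_key: str, num_links: int) -> int:
    """Hash-based selection: the flow's rate plays no part in the choice."""
    return zlib.crc32(flow_key.encode()) % num_links


def link_loads(placement):
    """Sum the offered load per link from (link, rate_mbps) pairs."""
    loads = defaultdict(int)
    for link, rate in placement:
        loads[link] += rate
    return dict(loads)


# Flow placement as drawn in Figure 2:
#   link (1): 2 small + 1 large, link (2): 3 small, link (3): 2 small + 2 large
FIGURE_2 = (
    [(1, SMALL_MBPS)] * 2 + [(1, LARGE_MBPS)]
    + [(2, SMALL_MBPS)] * 3
    + [(3, SMALL_MBPS)] * 2 + [(3, LARGE_MBPS)] * 2
)

loads = link_loads(FIGURE_2)
congested = [link for link, mbps in loads.items() if mbps > LINK_CAPACITY_MBPS]
print(loads, congested)
```

With these assumed rates, link (1) carries 7000 Mbps, link (2) carries 1500 Mbps, and link (3) is offered 13000 Mbps, so only link (3) exceeds its capacity.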
4. Mechanisms for Optimizing LAG/ECMP Component Link Utilization | 4. Mechanisms for Optimizing LAG/ECMP Component Link Utilization | |||
The mechanisms suggested in this draft provide a local optimization | The mechanisms suggested in this draft provide a local optimization | |||
solution; they are local in the sense that both the identification of | solution; they are local in the sense that both the identification of | |||
large flows and re-balancing of the load can be accomplished | large flows and re-balancing of the load can be accomplished | |||
completely within individual nodes in the network without the need | completely within individual nodes in the network without the need | |||
for interaction with other nodes. | for interaction with other nodes. | |||
This approach may not yield a global optimization of the placement of | This approach may not yield a global optimization of the placement of | |||
large flows across multiple nodes in a network, which may be | large flows across multiple nodes in a network, which may be | |||
skipping to change at page 9, line 4 | skipping to change at page 8, line 30 | |||
parts of the network. The scheme works equally well for unicast and | parts of the network. The scheme works equally well for unicast and | |||
multicast flows. | multicast flows. | |||
On the other hand, with ECMP, redistributing the load across | On the other hand, with ECMP, redistributing the load across | |||
component links that are part of the ECMP group may impact traffic | component links that are part of the ECMP group may impact traffic | |||
patterns at all of the nodes that are downstream of the given router | patterns at all of the nodes that are downstream of the given router | |||
between itself and the destination. The local optimization may | between itself and the destination. The local optimization may | |||
result in congestion at a downstream node. (In its simplest form, an | result in congestion at a downstream node. (In its simplest form, an | |||
ECMP group may be used to distribute traffic on component links that | ECMP group may be used to distribute traffic on component links that | |||
are between two adjacent routers, and in that case, the ECMP group is | are between two adjacent routers, and in that case, the ECMP group is | |||
no different than a LAG for the purpose of this discussion. It | no different than a LAG for the purpose of this discussion. It | |||
should be noted that an ECMP component link may itself comprise a | should be noted that an ECMP component link may itself comprise a | |||
LAG, in which case the scheme may be further applied to the component | LAG, in which case the scheme may be further applied to the component | |||
links within the LAG.) | links within the LAG.) | |||
+-----+ +-----+ | +-----+ +-----+ | |||
| S1 | | S2 | | | S1 | | S2 | | |||
+-----+ +-----+ | +-----+ +-----+ | |||
/ \ \ / /\ | / \ \ / /\ | |||
/ +---------+ / \ | / +---------+ / \ | |||
skipping to change at page 10, line 5 | skipping to change at page 9, line 26 | |||
either S1 or S2. | either S1 or S2. | |||
The other issue with applying this scheme to ECMP groups is that it | The other issue with applying this scheme to ECMP groups is that it | |||
may not apply equally to unicast and multicast traffic because of the | may not apply equally to unicast and multicast traffic because of the | |||
way multicast trees are constructed. | way multicast trees are constructed. | |||
Finally, it is possible for a single physical link to participate as | Finally, it is possible for a single physical link to participate as | |||
a component link in multiple ECMP groups, whereas with LAGs, a link | a component link in multiple ECMP groups, whereas with LAGs, a link | |||
can participate as a component link of only one LAG. | can participate as a component link of only one LAG. | |||
4.2. Operational Overview | 4.2. Operational Overview | |||
The various steps in optimizing LAG/ECMP component link utilization | The various steps in optimizing LAG/ECMP component link utilization | |||
in networks are detailed below: | in networks are detailed below: | |||
Step 1) This involves large flow recognition in routers and | Step 1) This involves large flow recognition in routers and | |||
maintaining the mapping of the large flow to the component link that | maintaining the mapping of the large flow to the component link that | |||
it uses. The recognition of large flows is explained in Section 4.3. | it uses. The recognition of large flows is explained in Section 4.3. | |||
Step 2) The egress component links are periodically scanned for link | Step 2) The egress component links are periodically scanned for link | |||
skipping to change at page 11, line 5 | skipping to change at page 10, line 24 | |||
while paths P2 and P3 may be under-utilized. This is something that | while paths P2 and P3 may be under-utilized. This is something that | |||
the local router does not have visibility into. With the help of a | the local router does not have visibility into. With the help of a | |||
central management entity, the operator could redistribute some of | central management entity, the operator could redistribute some of | |||
the flows from P1 to P2 and/or P3 resulting in a more optimized flow | the flows from P1 to P2 and/or P3 resulting in a more optimized flow | |||
of traffic. | of traffic. | |||
The mechanisms described above are especially useful when bundling | The mechanisms described above are especially useful when bundling | |||
links of different bandwidths, e.g., 10 Gbps and 100 Gbps, as | links of different bandwidths, e.g., 10 Gbps and 100 Gbps, as | |||
described in [ID.ietf-rtgwg-cl-requirement]. | described in [ID.ietf-rtgwg-cl-requirement]. | |||
4.3. Large Flow Recognition | 4.3. Large Flow Recognition | |||
4.3.1. Flow Identification | 4.3.1. Flow Identification | |||
A flow (large flow or small flow) can be defined as a sequence of | A flow (large flow or small flow) can be defined as a sequence of | |||
packets for which ordered delivery should be maintained. Flows are | packets for which ordered delivery should be maintained. Flows are | |||
typically identified using one or more fields from the packet header, | typically identified using one or more fields from the packet header, | |||
for example: | for example: | |||
. Layer 2: Source MAC address, destination MAC address, VLAN ID. | . Layer 2: Source MAC address, destination MAC address, VLAN ID. | |||
skipping to change at page 12, line 4 | skipping to change at page 11, line 23 | |||
From a bandwidth and time duration perspective, in order to recognize | From a bandwidth and time duration perspective, in order to recognize | |||
large flows we define an observation interval and observe the | large flows we define an observation interval and observe the | |||
bandwidth of the flow over that interval. A flow that exceeds a | bandwidth of the flow over that interval. A flow that exceeds a | |||
certain minimum bandwidth threshold over that observation interval | certain minimum bandwidth threshold over that observation interval | |||
would be considered a large flow. | would be considered a large flow. | |||
The two parameters -- the observation interval, and the minimum | The two parameters -- the observation interval, and the minimum | |||
bandwidth threshold over that observation interval -- should be | bandwidth threshold over that observation interval -- should be | |||
programmable to facilitate handling of different use cases and | programmable to facilitate handling of different use cases and | |||
traffic characteristics. For example, a flow which is at or above 10% | traffic characteristics. For example, a flow which is at or above 10% | |||
of link bandwidth for a time period of at least 1 second could be | of link bandwidth for a time period of at least 1 second could be | |||
declared a large flow [DevoFlow]. | declared a large flow [DevoFlow]. | |||
In order to avoid excessive churn in the rebalancing, once a flow has | In order to avoid excessive churn in the rebalancing, once a flow has | |||
been recognized as a large flow, it should continue to be recognized | been recognized as a large flow, it should continue to be recognized | |||
as a large flow for as long as the traffic received during an | as a large flow for as long as the traffic received during an | |||
observation interval exceeds some fraction of the bandwidth | observation interval exceeds some fraction of the bandwidth | |||
threshold, for example 80% of the bandwidth threshold. | threshold, for example 80% of the bandwidth threshold. | |||
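The recognition criteria above, including the lower "hold" level that avoids churn, can be sketched in Python as follows. This is a minimal illustration and not part of the draft: the class name LargeFlowRecognizer, its field names, and the 10 Gbps link speed are assumptions, while the defaults (1 second interval, 10% threshold, 80% hold fraction) are the example values quoted in the text.

```python
from dataclasses import dataclass, field


@dataclass
class LargeFlowRecognizer:
    """Tracks which flows currently count as large on one link (hypothetical)."""

    link_bandwidth_bps: float
    observation_interval_s: float = 1.0  # example interval from the text
    threshold_fraction: float = 0.10     # example: 10% of link bandwidth
    hold_fraction: float = 0.80          # example: 80% of the threshold
    large_flows: set = field(default_factory=set)

    def observe(self, flow_id: str, bytes_in_interval: int) -> bool:
        """Update state from one interval's byte count; return True if large."""
        rate_bps = bytes_in_interval * 8 / self.observation_interval_s
        threshold_bps = self.threshold_fraction * self.link_bandwidth_bps
        if flow_id in self.large_flows:
            # Already large: stay large while above the lower hold level.
            is_large = rate_bps > self.hold_fraction * threshold_bps
        else:
            # Not yet large: the flow must reach the full threshold.
            is_large = rate_bps >= threshold_bps
        if is_large:
            self.large_flows.add(flow_id)
        else:
            self.large_flows.discard(flow_id)
        return is_large


recognizer = LargeFlowRecognizer(link_bandwidth_bps=10e9)  # assumed 10 Gbps link
history = [
    recognizer.observe("f1", 150_000_000),  # 1.2 Gbps: crosses the threshold
    recognizer.observe("f1", 110_000_000),  # 0.88 Gbps: held by the 80% rule
    recognizer.observe("f1", 90_000_000),   # 0.72 Gbps: drops below the hold level
    recognizer.observe("f2", 110_000_000),  # 0.88 Gbps: new flow, below threshold
]
print(history)
```

In this sketch, the same 0.88 Gbps rate keeps an already recognized flow classified as large but does not promote a new flow, which is the hysteresis the paragraph describes.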
Various techniques to recognize a large flow are described below. | Various techniques to recognize a large flow are described below. | |||
skipping to change at page 13, line 4 | skipping to change at page 12, line 20 | |||
sampling. | sampling. | |||
Alternatively, since sampling techniques require that the sample be | Alternatively, since sampling techniques require that the sample be | |||
annotated with the packet's egress port information, ingress sampling | annotated with the packet's egress port information, ingress sampling | |||
may suffice. However, this means that sampling would have to be | may suffice. However, this means that sampling would have to be | |||
enabled on all ports, rather than only on those ports where such | enabled on all ports, rather than only on those ports where such | |||
monitoring is desired. There is one situation in which this approach | monitoring is desired. There is one situation in which this approach | |||
may not work. If there are tunnels that originate from the given | may not work. If there are tunnels that originate from the given | |||
router, and if the resulting tunnel comprises the large flow, then | router, and if the resulting tunnel comprises the large flow, then | |||
this cannot be deduced from ingress sampling at the given router. | this cannot be deduced from ingress sampling at the given router. | |||
Instead, if egress sampling is unavailable, then ingress sampling | Instead, if egress sampling is unavailable, then ingress sampling | |||
from the downstream router must be used. | from the downstream router must be used. | |||
To illustrate the use of ingress versus egress sampling, we refer to | To illustrate the use of ingress versus egress sampling, we refer to | |||
Figure 2. Since we are looking at rebalancing flows at R1, we would | Figure 2. Since we are looking at rebalancing flows at R1, we would | |||
need to enable egress sampling on ports (1), (2), and (3) on R1. If | need to enable egress sampling on ports (1), (2), and (3) on R1. If | |||
egress sampling is not available, and if R2 is also under the control | egress sampling is not available, and if R2 is also under the control | |||
of the same administrator, enabling ingress sampling on R2's ports | of the same administrator, enabling ingress sampling on R2's ports | |||
(1), (2), and (3) would also work, but it would necessitate the | (1), (2), and (3) would also work, but it would necessitate the | |||
involvement of a central management entity in order for R1 to obtain | involvement of a central management entity in order for R1 to obtain | |||
skipping to change at page 14, line 5 | skipping to change at page 13, line 19 | |||
4.3.4. Inline Data Path Measurement | 4.3.4. Inline Data Path Measurement | |||
Implementations may perform recognition of large flows by performing | Implementations may perform recognition of large flows by performing | |||
measurements on traffic in the data path of a router. Such an | measurements on traffic in the data path of a router. Such an | |||
approach would be expected to operate at the interface speed on every | approach would be expected to operate at the interface speed on every | |||
interface, accounting for all packets processed by the data path of | interface, accounting for all packets processed by the data path of | |||
the router. An example of such an approach is described in IPFIX | the router. An example of such an approach is described in IPFIX | |||
[RFC 5470]. | [RFC 5470]. | |||
Using inline data path measurement, a faster and more accurate | Using inline data path measurement, a faster and more accurate | |||
indication of large flows mapped to each of the component links in a | indication of large flows mapped to each of the component links in a | |||
LAG/ECMP group may be possible (as compared to the sampling-based | LAG/ECMP group may be possible (as compared to the sampling-based | |||
approach). | approach). | |||
The advantages and disadvantages of inline data path measurement are: | The advantages and disadvantages of inline data path measurement are: | |||
Advantages: | Advantages: | |||
. As link speeds get higher, sampling rates are typically reduced | . As link speeds get higher, sampling rates are typically reduced | |||
skipping to change at page 15, line 5 | skipping to change at page 14, line 17 | |||
It is possible that a router may have line cards that support a | It is possible that a router may have line cards that support a | |||
sampling technique while other line cards support inline data path | sampling technique while other line cards support inline data path | |||
measurement of large flows. As long as there is a way for the router | measurement of large flows. As long as there is a way for the router | |||
to reliably determine the mapping of large flows to component links | to reliably determine the mapping of large flows to component links | |||
of a LAG/ECMP group, it is acceptable for the router to use more than | of a LAG/ECMP group, it is acceptable for the router to use more than | |||
one method for large flow recognition. | one method for large flow recognition. | |||
If both methods are supported, inline data path measurement may be | If both methods are supported, inline data path measurement may be | |||
preferable because of its speed of detection [FLOW-ACC]. | preferable because of its speed of detection [FLOW-ACC]. | |||
4.4. Load Rebalancing Options | 4.4. Load Rebalancing Options | |||
Below are suggested techniques for load balancing. Equipment vendors | Below are suggested techniques for load balancing. Equipment vendors | |||
may implement more than one technique, including those not described | may implement more than one technique, including those not described | |||
in this document, and allow the operator to choose between them. | in this document, and allow the operator to choose between them. | |||
Note that regardless of the method used, perfect rebalancing of large | Note that regardless of the method used, perfect rebalancing of large | |||
flows may not be possible since flows arrive and depart at different | flows may not be possible since flows arrive and depart at different | |||
times. Also, any flows that are moved from one component link to | times. Also, any flows that are moved from one component link to | |||
another may experience momentary packet reordering. | another may experience momentary packet reordering. | |||
skipping to change at page 16, line 5 | skipping to change at page 15, line 15 | |||
would still result in some imbalance in the utilization across the | would still result in some imbalance in the utilization across the | |||
component links. | component links. | |||
4.4.2. Redistributing Small Flows | 4.4.2. Redistributing Small Flows | |||
Some large flows may consume the entire bandwidth of the component | Some large flows may consume the entire bandwidth of the component | |||
link(s). In this case, it would be desirable for the small flows to | link(s). In this case, it would be desirable for the small flows to | |||
not use the congested component link(s). This can be accomplished in | not use the congested component link(s). This can be accomplished in | |||
one of the following ways. | one of the following ways. | |||
This method works on some existing router hardware. The idea is to | This method works on some existing router hardware. The idea is to | |||
prevent, or reduce the probability of, the small flow hashing into | prevent, or reduce the probability of, the small flow hashing into | |||
the congested component link(s). | the congested component link(s). | |||
. The LAG/ECMP table is modified to include only non-congested | . The LAG/ECMP table is modified to include only non-congested | |||
component link(s). Small flows hash into this table to be mapped | component link(s). Small flows hash into this table to be mapped | |||
to a destination component link. Alternatively, if certain | to a destination component link. Alternatively, if certain | |||
component links are heavily loaded, but not congested, the | component links are heavily loaded, but not congested, the | |||
output of the hash function can be adjusted to account for large | output of the hash function can be adjusted to account for large | |||
flow loading on each of the component links. | flow loading on each of the component links. | |||
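The first option in the bullet above (hashing small flows into a table that holds only non-congested component links) can be sketched as follows. This is an illustration under stated assumptions, not the draft's mechanism: `select_link`, the string flow key, the CRC32 hash, and the fallback to the full table when every link is congested are all hypothetical choices.

```python
import zlib


def select_link(flow_key, links, congested):
    """Pick a component link for a small flow.

    links     -- ordered list of component link IDs in the LAG/ECMP group
    congested -- set of link IDs currently considered congested
    Falls back to the full table if every link is congested (assumption).
    """
    eligible = [link for link in links if link not in congested] or list(links)
    # CRC32 stands in for whatever hash the forwarding hardware uses.
    index = zlib.crc32(flow_key.encode("utf-8")) % len(eligible)
    return eligible[index]
```

The same flow key always maps to the same link while the set of congested links is unchanged, so packets of a flow stay on one link between table updates.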
skipping to change at page 17, line 5 | skipping to change at page 16, line 14 | |||
4.4.5. Load Rebalancing Example | 4.4.5. Load Rebalancing Example | |||
Optimizing LAG/ECMP component utilization for the use case in Figure | Optimizing LAG/ECMP component utilization for the use case in Figure | |||
2 is depicted below in Figure 4. The large flow rebalancing explained | 2 is depicted below in Figure 4. The large flow rebalancing explained | |||
in Section 4.4 is used. The improved link utilization is as follows: | in Section 4.4 is used. The improved link utilization is as follows: | |||
. Component link (1) has 3 flows -- 2 small flows and 1 large | . Component link (1) has 3 flows -- 2 small flows and 1 large | |||
flow -- and the link utilization is normal. | flow -- and the link utilization is normal. | |||
. Component link (2) has 4 flows -- 3 small flows and 1 large | . Component link (2) has 4 flows -- 3 small flows and 1 large | |||
flow -- and the link utilization is normal now. | flow -- and the link utilization is normal now. | |||
. Component link (3) has 3 flows -- 2 small flows and 1 large | . Component link (3) has 3 flows -- 2 small flows and 1 large | |||
flow -- and the link utilization is normal now. | flow -- and the link utilization is normal now. | |||
+-----------+ -> +-----------+ | +-----------+ -> +-----------+ | |||
| | -> | | | | | -> | | | |||
| | ===> | | | | | ===> | | | |||
| (1)|--------|(1) | | | (1)|--------|(1) | | |||
| | | | | | | | | | |||
| | ===> | | | | | ===> | | | |||
| | -> | | | | | -> | | | |||
| | -> | | | | | -> | | | |||
| (R1) | -> | (R2) | | | (R1) | -> | (R2) | | |||
| (2)|--------|(2) | | | (2)|--------|(2) | | |||
| | | | | | | | | | |||
| | -> | | | | | -> | | | |||
| | -> | | | | | -> | | | |||
| | ===> | | | | | ===> | | | |||
| (3)|--------|(3) | | | (3)|--------|(3) | | |||
| | | | | | | | | | |||
+-----------+ +-----------+ | +-----------+ +-----------+ | |||
Where: -> small flow | Where: -> small flow | |||
skipping to change at page 18, line 5 | skipping to change at page 17, line 14 | |||
5. Information Model for Flow Rebalancing | 5. Information Model for Flow Rebalancing | |||
In order to support flow rebalancing in a router from an external | In order to support flow rebalancing in a router from an external | |||
system, the exchange of some information is necessary between the | system, the exchange of some information is necessary between the | |||
router and the external system. This section provides an exemplary | router and the external system. This section provides an exemplary | |||
information model covering the various components needed for the | information model covering the various components needed for the | |||
purpose. The model is intended to be informational and may be used | purpose. The model is intended to be informational and may be used | |||
as input for development of a data model. | as input for development of a data model. | |||
5.1. Configuration Parameters for Flow Rebalancing | 5.1. Configuration Parameters for Flow Rebalancing | |||
The following parameters are required for the configuration of this | The following parameters are required for the configuration of this | |||
feature: | feature: | |||
. Large flow recognition parameters: | . Large flow recognition parameters: | |||
o Observation interval: The observation interval is the time | o Observation interval: The observation interval is the time | |||
period in seconds over which the packet arrivals are | period in seconds over which the packet arrivals are | |||
observed for the purpose of large flow recognition. | observed for the purpose of large flow recognition. | |||
skipping to change at page 19, line 5 | skipping to change at page 18, line 14 | |||
. Rebalancing interval: The minimum amount of time between | . Rebalancing interval: The minimum amount of time between | |||
rebalancing events. This parameter ensures that rebalancing is | rebalancing events. This parameter ensures that rebalancing is | |||
not invoked too frequently as it impacts packet ordering. | not invoked too frequently as it impacts packet ordering. | |||
These parameters may be configured on a system-wide basis or may | These parameters may be configured on a system-wide basis or may | |||
apply to an individual LAG. They may be applied to an ECMP group | apply to an individual LAG. They may be applied to an ECMP group | |||
provided the component links are not shared with any other ECMP | provided the component links are not shared with any other ECMP | |||
group. | group. | |||
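As an illustration of how the rebalancing interval parameter might be enforced, the sketch below gates rebalancing events on a minimum spacing in time. The class name, method name, and the use of seconds are assumptions for the example; the draft does not prescribe an implementation.

```python
class RebalanceGate:
    """Allow a rebalancing event only if the minimum interval has elapsed."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self.last_rebalance_s = None  # no rebalancing has happened yet

    def allow(self, now_s):
        """Return True and record the event if rebalancing may run now."""
        if (self.last_rebalance_s is not None
                and now_s - self.last_rebalance_s < self.min_interval_s):
            return False
        self.last_rebalance_s = now_s
        return True
```

A caller would pass a monotonic timestamp (for example from `time.monotonic()`) so that wall-clock adjustments do not shorten or lengthen the interval.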
5.2. System Configuration and Identification Parameters | 5.2. System Configuration and Identification Parameters | |||
The following parameters are useful for router configuration and | The following parameters are useful for router configuration and | |||
operation when using the mechanisms in this document. | operation when using the mechanisms in this document. | |||
. IP address: The IP address of a specific router that the | . IP address: The IP address of a specific router that the | |||
feature is being configured on, or that the large flow placement | feature is being configured on, or that the large flow placement | |||
is being applied to. | is being applied to. | |||
. LAG ID: Identifies the LAG on a given router. The LAG ID may be | . LAG ID: Identifies the LAG on a given router. The LAG ID may be | |||
skipping to change at page 20, line 5 | skipping to change at page 19, line 16 | |||
In cases where large flow recognition is handled by an external | In cases where large flow recognition is handled by an external | |||
management station (see Section 4.3.3), an information model for | management station (see Section 4.3.3), an information model for | |||
flows is required to allow the import of large flow information to | flows is required to allow the import of large flow information to | |||
the router. | the router. | |||
Typical fields used for identifying large flows were discussed in | Typical fields used for identifying large flows were discussed in | |||
Section 4.3.1. The IPFIX information model [RFC 7012] can be | Section 4.3.1. The IPFIX information model [RFC 7012] can be | |||
leveraged for large flow identification. | leveraged for large flow identification. | |||
Large Flow placement is achieved by specifying the relevant flow | Large Flow placement is achieved by specifying the relevant flow | |||
information along with the following: | information along with the following: | |||
. For LAG: Router's IP address, LAG ID, LAG component link ID. | . For LAG: Router's IP address, LAG ID, LAG component link ID. | |||
. For ECMP: Router's IP address, ECMP group, ECMP component link | . For ECMP: Router's IP address, ECMP group, ECMP component link | |||
ID. | ID. | |||
In the case where the ECMP component link itself comprises a LAG, we | In the case where the ECMP component link itself comprises a LAG, we | |||
would have to specify the parameters for both the ECMP group as well | would have to specify the parameters for both the ECMP group as well | |||
skipping to change at page 21, line 5 | skipping to change at page 20, line 17 | |||
Exporting large flow information is required when large flow | Exporting large flow information is required when large flow | |||
recognition is being done on a router, but the decision to rebalance | recognition is being done on a router, but the decision to rebalance | |||
is being made in an external management station. Large flow | is being made in an external management station. Large flow | |||
information includes flow identification and the component link ID | information includes flow identification and the component link ID | |||
that the flow currently is assigned to. Other information such as | that the flow currently is assigned to. Other information such as | |||
flow QoS and bandwidth may be exported too. | flow QoS and bandwidth may be exported too. | |||
The IPFIX information model [RFC 7012] can be leveraged for large | The IPFIX information model [RFC 7012] can be leveraged for large | |||
flow identification. | flow identification. | |||
5.6. Monitoring information | 5.6. Monitoring information | |||
5.6.1. Interface (link) utilization | 5.6.1. Interface (link) utilization | |||
The incoming bytes (ifInOctets), outgoing bytes (ifOutOctets) and | The incoming bytes (ifInOctets), outgoing bytes (ifOutOctets) and | |||
interface speed (ifSpeed) can be obtained, for example, from the | interface speed (ifSpeed) can be obtained, for example, from the | |||
Interface table (ifTable) MIB [RFC 1213]. | Interface table (ifTable) MIB [RFC 1213]. | |||
The link utilization can then be computed as follows: | The link utilization can then be computed as follows: | |||
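A rough sketch of one such computation from two counter readings is given here; it is not the draft's formula, which this diff does not show. It assumes that ifInOctets/ifOutOctets are 32-bit counters that wrap at 2^32, that ifSpeed is in bits per second, and that the counter wraps at most once between readings. The function name and arguments are hypothetical.

```python
COUNTER32_MOD = 2 ** 32  # MIB-II octet counters are 32-bit and wrap around


def link_utilization(octets_start, octets_end, interval_s, if_speed_bps):
    """Fraction of link capacity used between two counter readings.

    octets_start, octets_end -- ifInOctets (or ifOutOctets) at the start
                                and end of the measurement interval
    interval_s               -- time between the two readings, in seconds
    if_speed_bps             -- ifSpeed, in bits per second
    """
    delta_octets = (octets_end - octets_start) % COUNTER32_MOD
    return (delta_octets * 8) / (interval_s * if_speed_bps)
```

For example, 125,000,000 octets received over 10 seconds on a 1 Gb/s interface corresponds to a utilization of about 0.1. On fast links a 32-bit counter can wrap more than once within a polling interval, in which case the result is an underestimate.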
skipping to change at page 22, line 5 | skipping to change at page 21, line 17 | |||
Additional monitoring information that is useful includes: | Additional monitoring information that is useful includes: | |||
. Number of times rebalancing was done. | . Number of times rebalancing was done. | |||
. Time since the last rebalancing event. | . Time since the last rebalancing event. | |||
. The number of large flows currently rebalanced by the scheme. | . The number of large flows currently rebalanced by the scheme. | |||
. A list of the large flows that have been rebalanced including | . A list of the large flows that have been rebalanced including | |||
o the rate of each large flow at the time of the last | o the rate of each large flow at the time of the last | |||
rebalancing for that flow, | rebalancing for that flow, | |||
o the time that rebalancing was last performed for the given | o the time that rebalancing was last performed for the given | |||
large flow, and | large flow, and | |||
o the interfaces that the large flow was (re)directed to. | o the interfaces that the large flow was (re)directed to. | |||
. The settings for the weights of the interfaces within a | . The settings for the weights of the interfaces within a | |||
LAG/ECMP used by the small flows which depend on hashing. | LAG/ECMP used by the small flows which depend on hashing. | |||
skipping to change at page 23, line 4 | skipping to change at page 22, line 15 | |||
and 4.4.2 must be withdrawn in order to avoid the creation of | and 4.4.2 must be withdrawn in order to avoid the creation of | |||
forwarding loops. | forwarding loops. | |||
6.3. Forwarding Resources | 6.3. Forwarding Resources | |||
Hash-based techniques used for load balancing with LAG/ECMP are | Hash-based techniques used for load balancing with LAG/ECMP are | |||
usually stateless. The mechanisms described in this document require | usually stateless. The mechanisms described in this document require | |||
additional resources in the forwarding plane of routers for creating | additional resources in the forwarding plane of routers for creating | |||
PBR rules that are capable of overriding the forwarding decision from | PBR rules that are capable of overriding the forwarding decision from | |||
the hash-based approach. These resources may limit the number of | the hash-based approach. These resources may limit the number of | |||
flows that can be rebalanced and may also impact the latency | flows that can be rebalanced and may also impact the latency | |||
experienced by packets due to the additional lookups that are | experienced by packets due to the additional lookups that are | |||
required. | required. | |||
7. IANA Considerations | 7. IANA Considerations | |||
This memo includes no request to IANA. | This memo includes no request to IANA. | |||
8. Security Considerations | 8. Security Considerations | |||
skipping to change at page 24, line 5 | skipping to change at page 23, line 17 | |||
The authors would like to thank the following individuals for their | The authors would like to thank the following individuals for their | |||
review and valuable feedback on earlier versions of this document: | review and valuable feedback on earlier versions of this document: | |||
Shane Amante, Fred Baker, Michael Bugenhagen, Zhen Cao, Brian | Shane Amante, Fred Baker, Michael Bugenhagen, Zhen Cao, Brian | |||
Carpenter, Benoit Claise, Michael Fargano, Wes George, Sriganesh | Carpenter, Benoit Claise, Michael Fargano, Wes George, Sriganesh | |||
Kini, Roman Krzanowski, Andrew Malis, Dave McDysan, Pete Moyer, | Kini, Roman Krzanowski, Andrew Malis, Dave McDysan, Pete Moyer, | |||
Peter Phaal, Dan Romascanu, Curtis Villamizar, Jianrong Wong, George | Peter Phaal, Dan Romascanu, Curtis Villamizar, Jianrong Wong, George | |||
Yum, and Weifeng Zhang. As a part of the IETF Last Call process, | Yum, and Weifeng Zhang. As a part of the IETF Last Call process, | |||
valuable comments were received from Martin Thomson and Carlos | valuable comments were received from Martin Thomson and Carlos | |||
Pignataro. | Pignataro. | |||
11. References | 11. References | |||
11.1. Normative References | 11.1. Normative References | |||
[802.1AX] IEEE Standards Association, "IEEE Std 802.1AX-2008 IEEE | [802.1AX] IEEE Standards Association, "IEEE Std 802.1AX-2008 IEEE | |||
Standard for Local and Metropolitan Area Networks - Link | Standard for Local and Metropolitan Area Networks - Link | |||
Aggregation", 2008. | Aggregation", 2008. | |||
[RFC 2991] Thaler, D. and C. Hopps, "Multipath Issues in Unicast and | [RFC 2991] Thaler, D. and C. Hopps, "Multipath Issues in Unicast and | |||
Multicast Next-Hop Selection," November 2000. | Multicast Next-Hop Selection," November 2000. | |||
skipping to change at page 25, line 5 | skipping to change at page 24, line 15 | |||
[FLOW-ACC] Zseby, T., et al., "Packet sampling for flow accounting: | [FLOW-ACC] Zseby, T., et al., "Packet sampling for flow accounting: | |||
challenges and limitations," Proceedings of the 9th international | challenges and limitations," Proceedings of the 9th international | |||
conference on Passive and active network measurement, 2008. | conference on Passive and active network measurement, 2008. | |||
[ID.ietf-rtgwg-cl-requirement] Villamizar, C. et al., "Requirements | [ID.ietf-rtgwg-cl-requirement] Villamizar, C. et al., "Requirements | |||
for MPLS over a Composite Link," September 2013. | for MPLS over a Composite Link," September 2013. | |||
[ITCOM] Jo, J., et al., "Internet traffic load balancing using | [ITCOM] Jo, J., et al., "Internet traffic load balancing using | |||
dynamic hashing with flow volume," SPIE ITCOM, 2002. | dynamic hashing with flow volume," SPIE ITCOM, 2002. | |||
[NDTM] Estan, C. and G. Varghese, "New directions in traffic | [NDTM] Estan, C. and G. Varghese, "New directions in traffic | |||
measurement and accounting," Proceedings of ACM SIGCOMM, August 2002. | measurement and accounting," Proceedings of ACM SIGCOMM, August 2002. | |||
[NVGRE] Sridharan, M. et al., "NVGRE: Network Virtualization using | [NVGRE] Sridharan, M. et al., "NVGRE: Network Virtualization using | |||
Generic Routing Encapsulation," draft-sridharan-virtualization- | Generic Routing Encapsulation," draft-sridharan-virtualization- | |||
nvgre-05, January 2015. | nvgre-06, January 2015. | |||
[RFC 2784] Farinacci, D. et al., "Generic Routing Encapsulation | [RFC 2784] Farinacci, D. et al., "Generic Routing Encapsulation | |||
(GRE)," March 2000. | (GRE)," March 2000. | |||
[RFC 6790] Kompella, K. et al., "The Use of Entropy Labels in MPLS | [RFC 6790] Kompella, K. et al., "The Use of Entropy Labels in MPLS | |||
Forwarding," November 2012. | Forwarding," November 2012. | |||
[RFC 1213] McCloghrie, K., "Management Information Base for Network | [RFC 1213] McCloghrie, K., "Management Information Base for Network | |||
Management of TCP/IP-based internets: MIB-II," March 1991. | Management of TCP/IP-based internets: MIB-II," March 1991. | |||
skipping to change at page 26, line 5 | skipping to change at page 25, line 14 | |||
[RFC 5640] Filsfils, C., P. Mohapatra, and C. Pignataro, "Load | [RFC 5640] Filsfils, C., P. Mohapatra, and C. Pignataro, "Load | |||
Balancing for Mesh Softwires," August 2009. | Balancing for Mesh Softwires," August 2009. | |||
[RFC 5681] Allman, M. et al., "TCP Congestion Control," September | [RFC 5681] Allman, M. et al., "TCP Congestion Control," September | |||
2009. | 2009. | |||
[RFC 7223] Bjorklund, M., "A YANG Data Model for Interface | [RFC 7223] Bjorklund, M., "A YANG Data Model for Interface | |||
Management," May 2014. | Management," May 2014. | |||
[SAMP-BASIC] Phaal, P. and S. Panchen, "Packet Sampling Basics," | [SAMP-BASIC] Phaal, P. and S. Panchen, "Packet Sampling Basics," | |||
http://www.sflow.org/packetSamplingBasics/. | http://www.sflow.org/packetSamplingBasics/. | |||
[sFlow-v5] Phaal, P. and M. Lavine, "sFlow version 5," | [sFlow-v5] Phaal, P. and M. Lavine, "sFlow version 5," | |||
http://www.sflow.org/sflow_version_5.txt, July 2004. | http://www.sflow.org/sflow_version_5.txt, July 2004. | |||
[sFlow-LAG] Phaal, P. and A. Ghanwani, "sFlow LAG counters | [sFlow-LAG] Phaal, P. and A. Ghanwani, "sFlow LAG counters | |||
structure," http://www.sflow.org/sflow_lag.txt, September 2012. | structure," http://www.sflow.org/sflow_lag.txt, September 2012. | |||
[STT] Davie, B. (Ed.) and J. Gross, "A Stateless Transport Tunneling | [STT] Davie, B. (Ed.) and J. Gross, "A Stateless Transport Tunneling | |||
skipping to change at page 27, line 5 | skipping to change at page 26, line 13 | |||
congested while other paths are underutilized [YONG]. | congested while other paths are underutilized [YONG]. | |||
The simulation also shows substantial improvement by using the large | The simulation also shows substantial improvement by using the large | |||
flow-aware hash-based distribution technique described in this | flow-aware hash-based distribution technique described in this | |||
document. In using the same simulated traffic, the improved | document. In using the same simulated traffic, the improved | |||
rebalancing can achieve < 10% load differences among the paths. It | rebalancing can achieve < 10% load differences among the paths. It | |||
shows how large flow-aware hash-based distribution can effectively | shows how large flow-aware hash-based distribution can effectively | |||
compensate for the uneven load balancing caused by hashing and the | compensate for the uneven load balancing caused by hashing and the | |||
traffic characteristics [YONG]. | traffic characteristics [YONG]. | |||
Authors' Addresses | Authors' Addresses | |||
Ram Krishnan | Ram Krishnan | |||
Brocade Communications | Brocade Communications | |||
San Jose, 95134, USA | San Jose, 95134, USA | |||
Phone: +1-408-406-7890 | Phone: +1-408-406-7890 | |||
Email: ramkri123@gmail.com | Email: ramkri123@gmail.com | |||
Lucy Yong | Lucy Yong | |||
Huawei USA | Huawei USA | |||
End of changes. 44 change blocks. | ||||
117 lines changed or deleted | 31 lines changed or added | |||
This html diff was produced by rfcdiff 1.41. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |