draft-ietf-intarea-frag-fragile-14.txt | draft-ietf-intarea-frag-fragile-15.txt | |||
---|---|---|---|---|
Internet Area WG R. Bonica | Internet Area WG R. Bonica | |||
Internet-Draft Juniper Networks | Internet-Draft Juniper Networks | |||
Intended status: Best Current Practice F. Baker | Intended status: Best Current Practice F. Baker | |||
Expires: January 6, 2020 Unaffiliated | Expires: January 7, 2020 Unaffiliated | |||
G. Huston | G. Huston | |||
APNIC | APNIC | |||
R. Hinden | R. Hinden | |||
Check Point Software | Check Point Software | |||
O. Troan | O. Troan | |||
Cisco | Cisco | |||
F. Gont | F. Gont | |||
SI6 Networks | SI6 Networks | |||
July 5, 2019 | July 6, 2019 | |||
IP Fragmentation Considered Fragile | IP Fragmentation Considered Fragile | |||
draft-ietf-intarea-frag-fragile-14 | draft-ietf-intarea-frag-fragile-15 | |||
Abstract | Abstract | |||
This document describes IP fragmentation and explains how it | This document describes IP fragmentation and explains how it | |||
introduces fragility to Internet communication. | introduces fragility to Internet communication. | |||
This document also proposes alternatives to IP fragmentation and | This document also proposes alternatives to IP fragmentation and | |||
provides recommendations for developers and network operators. | provides recommendations for developers and network operators. | |||
Status of This Memo | Status of This Memo | |||
skipping to change at page 1, line 43 ¶ | skipping to change at page 1, line 43 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on January 6, 2020. | This Internet-Draft will expire on January 7, 2020. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2019 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 23 ¶ | skipping to change at page 2, line 23 ¶ | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
1.1. IP-in-IP Tunnels . . . . . . . . . . . . . . . . . . . . 3 | 1.1. IP-in-IP Tunnels . . . . . . . . . . . . . . . . . . . . 3 | |||
1.2. Requirements Language . . . . . . . . . . . . . . . . . . 3 | 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 3 | |||
2. IP Fragmentation . . . . . . . . . . . . . . . . . . . . . . 4 | 2. IP Fragmentation . . . . . . . . . . . . . . . . . . . . . . 4 | |||
2.1. Links, Paths, MTU and PMTU . . . . . . . . . . . . . . . 4 | 2.1. Links, Paths, MTU and PMTU . . . . . . . . . . . . . . . 4 | |||
2.2. Fragmentation Procedures . . . . . . . . . . . . . . . . 6 | 2.2. Fragmentation Procedures . . . . . . . . . . . . . . . . 6 | |||
2.3. Upper-Layer Reliance on IP Fragmentation . . . . . . . . 6 | 2.3. Upper-Layer Reliance on IP Fragmentation . . . . . . . . 7 | |||
3. Increased Fragility . . . . . . . . . . . . . . . . . . . . . 7 | 3. Increased Fragility . . . . . . . . . . . . . . . . . . . . . 7 | |||
3.1. Virtual Reassembly . . . . . . . . . . . . . . . . . . . 7 | 3.1. Virtual Reassembly . . . . . . . . . . . . . . . . . . . 7 | |||
3.2. Policy-Based Routing . . . . . . . . . . . . . . . . . . 8 | 3.2. Policy-Based Routing . . . . . . . . . . . . . . . . . . 8 | |||
3.3. Network Address Translation (NAT) . . . . . . . . . . . . 9 | 3.3. Network Address Translation (NAT) . . . . . . . . . . . . 9 | |||
3.4. Stateless Firewalls . . . . . . . . . . . . . . . . . . . 9 | 3.4. Stateless Firewalls . . . . . . . . . . . . . . . . . . . 9 | |||
3.5. Equal Cost Multipath, Link Aggregate Groups and Stateless | 3.5. Equal Cost Multipath, Link Aggregate Groups and Stateless | |||
Load-Balancers . . . . . . . . . . . . . . . . . . . . . 9 | Load-Balancers . . . . . . . . . . . . . . . . . . . . . 10 | |||
3.6. IPv4 Reassembly Errors at High Data Rates . . . . . . . . 11 | 3.6. IPv4 Reassembly Errors at High Data Rates . . . . . . . . 11 | |||
3.7. Security Vulnerabilities . . . . . . . . . . . . . . . . 11 | 3.7. Security Vulnerabilities . . . . . . . . . . . . . . . . 11 | |||
3.8. PMTU Blackholing Due to ICMP Loss . . . . . . . . . . . . 12 | 3.8. PMTU Blackholing Due to ICMP Loss . . . . . . . . . . . . 12 | |||
3.8.1. Transient Loss . . . . . . . . . . . . . . . . . . . 13 | 3.8.1. Transient Loss . . . . . . . . . . . . . . . . . . . 13 | |||
3.8.2. Incorrect Implementation of Security Policy . . . . . 13 | 3.8.2. Incorrect Implementation of Security Policy . . . . . 13 | |||
3.8.3. Persistent Loss Caused By Anycast . . . . . . . . . . 14 | 3.8.3. Persistent Loss Caused By Anycast . . . . . . . . . . 14 | |||
3.8.4. Persistent Loss Caused By Unidirectional Routing . . 14 | 3.8.4. Persistent Loss Caused By Unidirectional Routing . . 14 | |||
3.9. Blackholing Due To Filtering or Loss . . . . . . . . . . 14 | 3.9. Blackholing Due To Filtering or Loss . . . . . . . . . . 14 | |||
4. Alternatives to IP Fragmentation . . . . . . . . . . . . . . 15 | 4. Alternatives to IP Fragmentation . . . . . . . . . . . . . . 15 | |||
4.1. Transport Layer Solutions . . . . . . . . . . . . . . . . 15 | 4.1. Transport Layer Solutions . . . . . . . . . . . . . . . . 15 | |||
4.2. Application Layer Solutions . . . . . . . . . . . . . . . 16 | 4.2. Application Layer Solutions . . . . . . . . . . . . . . . 17 | |||
5. Applications That Rely on IPv6 Fragmentation . . . . . . . . 17 | 5. Applications That Rely on IPv6 Fragmentation . . . . . . . . 17 | |||
5.1. Domain Name Service (DNS) . . . . . . . . . . . . . . . . 18 | 5.1. Domain Name Service (DNS) . . . . . . . . . . . . . . . . 18 | |||
5.2. Open Shortest Path First (OSPF) . . . . . . . . . . . . . 18 | 5.2. Open Shortest Path First (OSPF) . . . . . . . . . . . . . 18 | |||
5.3. Packet-in-Packet Encapsulations . . . . . . . . . . . . . 18 | 5.3. Packet-in-Packet Encapsulations . . . . . . . . . . . . . 18 | |||
5.4. UDP Applications Enhancing Performance . . . . . . . . . 19 | 5.4. UDP Applications Enhancing Performance . . . . . . . . . 19 | |||
6. Recommendations . . . . . . . . . . . . . . . . . . . . . . . 19 | 6. Recommendations . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
6.1. For Application and Protocol Developers . . . . . . . . . 19 | 6.1. For Application and Protocol Developers . . . . . . . . . 19 | |||
6.2. For System Developers . . . . . . . . . . . . . . . . . . 20 | 6.2. For System Developers . . . . . . . . . . . . . . . . . . 20 | |||
6.3. For Middle Box Developers . . . . . . . . . . . . . . . . 20 | 6.3. For Middle Box Developers . . . . . . . . . . . . . . . . 20 | |||
6.4. For ECMP, LAG and Load-Balancer Developers And Operators 20 | 6.4. For ECMP, LAG and Load-Balancer Developers And Operators 20 | |||
6.5. For Network Operators . . . . . . . . . . . . . . . . . . 20 | 6.5. For Network Operators . . . . . . . . . . . . . . . . . . 21 | |||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 | |||
8. Security Considerations . . . . . . . . . . . . . . . . . . . 21 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 21 | |||
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21 | 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 21 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . 21 | 10.1. Normative References . . . . . . . . . . . . . . . . . . 21 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . 23 | 10.2. Informative References . . . . . . . . . . . . . . . . . 23 | |||
Appendix A. Contributors' Address . . . . . . . . . . . . . . . 26 | Appendix A. Contributors' Address . . . . . . . . . . . . . . . 26 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 26 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 27 | |||
1. Introduction | 1. Introduction | |||
Operational experience [Kent] [Huston] [RFC7872] reveals that IP | Operational experience [Kent] [Huston] [RFC7872] reveals that IP | |||
fragmentation introduces fragility to Internet communication. This | fragmentation introduces fragility to Internet communication. This | |||
document describes IP fragmentation and explains the fragility it | document describes IP fragmentation and explains the fragility it | |||
introduces. It also proposes alternatives to IP fragmentation and | introduces. It also proposes alternatives to IP fragmentation and | |||
provides recommendations for developers and network operators. | provides recommendations for developers and network operators. | |||
While this document identifies issues associated with IP | While this document identifies issues associated with IP | |||
skipping to change at page 4, line 10 ¶ | skipping to change at page 4, line 10 ¶ | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown here. | capitals, as shown here. | |||
2. IP Fragmentation | 2. IP Fragmentation | |||
2.1. Links, Paths, MTU and PMTU | 2.1. Links, Paths, MTU and PMTU | |||
An Internet path connects a source node to a destination node. A | An Internet path connects a source node to a destination node. A | |||
path can contain links and routers. If a path contains more than one | path may contain links and routers. If a path contains more than one | |||
link, the links are connected in series and a router connects each | link, the links are connected in series and a router connects each | |||
link to the next. | link to the next. | |||
Internet paths are dynamic. Assume that the path from one node to | Internet paths are dynamic. Assume that the path from one node to | |||
another contains a set of links and routers. If a link fails, the | another contains a set of links and routers. If a link fails, the | |||
path can also change so that it includes a different set of links and | path can also change so that it includes a different set of links and | |||
routers. | routers. | |||
Each link is constrained by the number of bytes that it can convey in | Each link is constrained by the number of bytes that it can convey in | |||
a single IP packet. This constraint is called the link Maximum | a single IP packet. This constraint is called the link Maximum | |||
Transmission Unit (MTU). IPv4 [RFC0791] requires every link to | Transmission Unit (MTU). While the end-to-end Path MTU is the size | |||
support a specified MTU (see NOTE 1). IPv6 [RFC8200] requires every | of a single IPv4 header, IPv4 [RFC0791] requires every link to | |||
link to support an MTU of 1280 bytes or greater. These are called | support at least a specified MTU (see NOTE 1). IPv6 [RFC8200] | |||
the IPv4 and IPv6 minimum link MTU's. | similarly requires every link to support an MTU of 1280 bytes or | |||
greater. These are called the IPv4 and IPv6 minimum link MTU's. | ||||
Likewise, each Internet path is constrained by the number of bytes | Likewise, each Internet path is constrained by the number of bytes | |||
that it can convey in a single IP packet. This constraint is called | that it can convey in a single IP packet. This constraint is called | |||
the Path MTU (PMTU). For any given path, the PMTU is equal to the | the Path MTU (PMTU). For any given path, the PMTU is equal to the | |||
smallest of its link MTU's. Because Internet paths are dynamic, PMTU | smallest of its link MTU's. Because Internet paths are dynamic, PMTU | |||
is also dynamic. | is also dynamic. | |||
For reasons described below, source nodes estimate the PMTU between | For reasons described below, source nodes estimate the PMTU between | |||
themselves and destination nodes. A source node can produce | themselves and destination nodes. A source node can produce | |||
extremely conservative PMTU estimates in which: | extremely conservative PMTU estimates in which: | |||
skipping to change at page 5, line 49 ¶ | skipping to change at page 6, line 5 ¶ | |||
to minimize the requirement for fragmentation en route. So, for the | to minimize the requirement for fragmentation en route. So, for the | |||
purposes of this document, we assume that the IPv4 minimum path MTU | purposes of this document, we assume that the IPv4 minimum path MTU | |||
is 576 bytes. | is 576 bytes. | |||
NOTE 2: A non-fragmentable packet can be fragmented at its source. | NOTE 2: A non-fragmentable packet can be fragmented at its source. | |||
However, it cannot be fragmented by a downstream node. An IPv4 | However, it cannot be fragmented by a downstream node. An IPv4 | |||
packet whose DF-bit is set to 0 is fragmentable. An IPv4 packet | packet whose DF-bit is set to 0 is fragmentable. An IPv4 packet | |||
whose DF-bit is set to 1 is non-fragmentable. All IPv6 packets are | whose DF-bit is set to 1 is non-fragmentable. All IPv6 packets are | |||
also non-fragmentable. | also non-fragmentable. | |||
NOTE 3:: The ICMP PTB message has two instantiations. In ICMPv4 | NOTE 3: The ICMP PTB message has two instantiations. In ICMPv4 | |||
[RFC0792], the ICMP PTB message is a Destination Unreachable message | [RFC0792], the ICMP PTB message is a Destination Unreachable message | |||
with Code equal to 4 (fragmentation needed and DF set). This message | with Code equal to 4 (fragmentation needed and DF set). This message | |||
was augmented by [RFC1191] to indicate the MTU of the link through | was augmented by [RFC1191] to indicate the MTU of the link through | |||
which the packet could not be forwarded. In ICMPv6 [RFC4443], the | which the packet could not be forwarded. In ICMPv6 [RFC4443], the | |||
ICMP PTB message is a Packet Too Big Message with Code equal to 0. | ICMP PTB message is a Packet Too Big Message with Code equal to 0. | |||
This message also indicates the MTU of the link through which the | This message also indicates the MTU of the link through which the | |||
packet could not be forwarded. | packet could not be forwarded. | |||
2.2. Fragmentation Procedures | 2.2. Fragmentation Procedures | |||
When an upper-layer protocol submits data to the underlying IP | When an upper-layer protocol submits data to the underlying IP | |||
module, and the resulting IP packet's length is greater than the | module, and the resulting IP packet's length is greater than the | |||
PMTU, the packet is divided into fragments. Each fragment includes | PMTU, the packet is divided into fragments. Each fragment includes | |||
an IP header and a portion of the original packet. | an IP header and a portion of the original packet. | |||
[RFC0791] describes IPv4 fragmentation procedures. An IPv4 packet | [RFC0791] describes IPv4 fragmentation procedures. An IPv4 packet | |||
whose DF-bit is set to 1 can be fragmented by the source node, but | whose DF-bit is set to 1 may be fragmented by the source node, but | |||
cannot be fragmented by a downstream router. An IPv4 packet whose | may not be fragmented by a downstream router. An IPv4 packet whose | |||
DF-bit is set to 0 can be fragmented by the source node or by a | DF-bit is set to 0 may be fragmented by the source node or by a | |||
downstream router. When an IPv4 packet is fragmented, all IP options | downstream router. When an IPv4 packet is fragmented, all IP options | |||
appear in the first fragment, but only options whose "copy" bit is | (which are within the IPv4 header) appear in the first fragment, but | |||
set to 1 appear in subsequent fragments. | only options whose "copy" bit is set to 1 appear in subsequent | |||
fragments. | ||||
[RFC8200] describes IPv6 fragmentation procedures. An IPv6 packet | [RFC8200], notably in section 4.5, describes IPv6 fragmentation | |||
can be fragmented at the source node only. When an IPv6 packet is | procedures. An IPv6 packet may be fragmented only at the source | |||
fragmented, all extension headers appear in the first fragment, but | node. When an IPv6 packet is fragmented, all extension headers | |||
only per-fragment headers appear in subsequent fragments. Per- | appear in the first fragment, but only per-fragment headers appear in | |||
fragment headers include the following: | subsequent fragments. Per-fragment headers include the following: | |||
o The IPv6 header. | o The IPv6 header. | |||
o The Hop-by-hop Options header (if present) | o The Hop-by-hop Options header (if present) | |||
o The Destination Options header (if present and if it precedes a | o The Destination Options header (if present and if it precedes a | |||
Routing header) | Routing header) | |||
o The Routing Header (if present) | o The Routing Header (if present) | |||
o The Fragment Header | o The Fragment Header | |||
In both IPv4 and IPv6, the upper-layer header appears in the first | In IPv4, the upper-layer header usually appears in the first | |||
fragment only. It does not appear in subsequent fragments. | fragment, due to the sizes of the headers involved; in IPv6, it is | |||
required to. | ||||
2.3. Upper-Layer Reliance on IP Fragmentation | 2.3. Upper-Layer Reliance on IP Fragmentation | |||
Upper-layer protocols can operate in the following modes: | Upper-layer protocols can operate in the following modes: | |||
o Do not rely on IP fragmentation. | o Do not rely on IP fragmentation. | |||
o Rely on IP fragmentation by the source node only. | o Rely on IP fragmentation by the source node only. | |||
o Rely on IP fragmentation by any node. | o Rely on IP fragmentation by any node. | |||
Upper-layer protocols running over IPv4 can operate in all of the | Upper-layer protocols running over IPv4 can operate in all of the | |||
above-mentioned modes. Upper-layer protocols running over IPv6 can | above-mentioned modes. Upper-layer protocols running over IPv6 can | |||
operate in the first and second modes only. | operate in the first and second modes only. | |||
Upper-layer protocols that operate in the first two modes (above) | Upper-layer protocols that operate in the first two modes (above) | |||
require access to the PMTU estimate. In order to fulfil this | require access to the PMTU estimate. In order to fulfill this | |||
requirement, they can: | requirement, they can: | |||
o Estimate the PMTU to be equal to the IPv4 or IPv6 minimum link | o Estimate the PMTU to be equal to the IPv4 or IPv6 minimum link | |||
MTU. | MTU. | |||
o Access the estimate that PMTUD produced. | o Access the estimate that PMTUD produced. | |||
o Execute PMTUD procedures themselves. | o Execute PMTUD procedures themselves. | |||
o Execute Packetization Layer PMTUD (PLPMTUD) [RFC4821] | o Execute Packetization Layer PMTUD (PLPMTUD) [RFC4821] | |||
skipping to change at page 7, line 40 ¶ | skipping to change at page 7, line 48 ¶ | |||
dropped messages. Therefore, PLPMTUD does not rely on the network's | dropped messages. Therefore, PLPMTUD does not rely on the network's | |||
ability to deliver ICMP PTB messages to the source. | ability to deliver ICMP PTB messages to the source. | |||
3. Increased Fragility | 3. Increased Fragility | |||
This section explains how IP fragmentation introduces fragility to | This section explains how IP fragmentation introduces fragility to | |||
Internet communication. | Internet communication. | |||
3.1. Virtual Reassembly | 3.1. Virtual Reassembly | |||
Virtual reassembly is a procedure in which a device reassembles a | Virtual reassembly is a procedure in which a device conceptually | |||
packet, forwards its fragments, and discards the reassembled copy. | reassembles a packet, forwards its fragments, and discards the | |||
In A+P and CGN, virtual reassembly is required in order to correctly | reassembled copy. In A+P and CGN, virtual reassembly is required in | |||
translate fragment addresses. It can be useful in Section 3.2, | order to correctly translate fragment addresses. It could be useful | |||
Section 3.3, Section 3.4, and Section 3.5. | to address the problems in Section 3.2, Section 3.3, Section 3.4, and | |||
Section 3.5. | ||||
Virtual reassembly in the network is problematic, however, because it | Virtual reassembly in the network is problematic, however, because it | |||
is computationally expensive and because it holds state for | is computationally expensive and because it holds state for | |||
indeterminate periods of time, is prone to errors, and is prone to | indeterminate periods of time, is prone to errors, and is prone to | |||
attacks (Section 3.7). | attacks (Section 3.7). | |||
One of the benefits of fragmenting at the source, as IPv6 does, is | ||||
that there is no question of temporary state or involved processes as | ||||
required in virtual fragmentation. The sender has the entire | ||||
message, and is fragmenting it as needed - and can apply that | ||||
knowledge consistently across the fragments it produces. It is | ||||
better than virtual fragmentation in that sense. | ||||
3.2. Policy-Based Routing | 3.2. Policy-Based Routing | |||
IP Fragmentation causes problems for routers that implement policy- | IP Fragmentation causes problems for routers that implement policy- | |||
based routing. | based routing. | |||
When a router receives a packet, it identifies the next-hop on route | When a router receives a packet, it identifies the next-hop on route | |||
to the packet's destination and forwards the packet to that next-hop. | to the packet's destination and forwards the packet to that next-hop. | |||
In order to identify the next-hop, the router interrogates a local | In order to identify the next-hop, the router interrogates a local | |||
data structure called the Forwarding Information Base (FIB). | data structure called the Forwarding Information Base (FIB). | |||
skipping to change at page 9, line 46 ¶ | skipping to change at page 10, line 14 ¶ | |||
o Block all trailing fragments, possibly blocking legitimate | o Block all trailing fragments, possibly blocking legitimate | |||
traffic. | traffic. | |||
Neither option is attractive. | Neither option is attractive. | |||
3.5. Equal Cost Multipath, Link Aggregate Groups and Stateless Load- | 3.5. Equal Cost Multipath, Link Aggregate Groups and Stateless Load- | |||
Balancers | Balancers | |||
IP fragmentation causes problems for Equal Cost Multipath (ECMP), | IP fragmentation causes problems for Equal Cost Multipath (ECMP), | |||
Link Aggregate Groups (LAG) and other stateless load-balancing | Link Aggregate Groups (LAG) and other stateless load-distribution | |||
technologies. In order to assign a packet or packet fragment to a | technologies. In order to assign a packet or packet fragment to a | |||
link, an intermediate node executes a hash (i.e., load-distributing) | link, an intermediate node executes a hash (i.e., load-distributing) | |||
algorithm. The following paragraphs describe a commonly deployed | algorithm. The following paragraphs describe a commonly deployed | |||
hash algorithm. | hash algorithm. | |||
If the packet or packet fragment contains a transport-layer header, | If the packet or packet fragment contains a transport-layer header, | |||
the algorithm accepts the following 5-tuple as input: | the algorithm accepts the following 5-tuple as input: | |||
o IP Source Address. | o IP Source Address. | |||
skipping to change at page 10, line 30 ¶ | skipping to change at page 10, line 45 ¶ | |||
o IP Source Address. | o IP Source Address. | |||
o IP Destination Address. | o IP Destination Address. | |||
o IPv4 Protocol or IPv6 Next Header. | o IPv4 Protocol or IPv6 Next Header. | |||
Therefore, non-fragmented packets belonging to a flow can be assigned | Therefore, non-fragmented packets belonging to a flow can be assigned | |||
to one link while fragmented packets belonging to the same flow can | to one link while fragmented packets belonging to the same flow can | |||
be divided between that link and another. This can cause suboptimal | be divided between that link and another. This can cause suboptimal | |||
load-balancing. | load-distribution. | |||
[RFC6438] offers a partial solution to this problem for IPv6 devices | [RFC6438] offers a partial solution to this problem for IPv6 devices | |||
only. According to [RFC6438]: | only. According to [RFC6438]: | |||
"At intermediate routers that perform load distribution, the hash | "At intermediate routers that perform load balancing, the hash | |||
algorithm used to determine the outgoing component-link in an ECMP | algorithm used to determine the outgoing component-link in an ECMP | |||
and/or LAG toward the next hop MUST minimally include the 3-tuple | and/or LAG toward the next hop MUST minimally include the 3-tuple | |||
{dest addr, source addr, flow label} and MAY also include the | {dest addr, source addr, flow label} and MAY also include the | |||
remaining components of the 5-tuple." | remaining components of the 5-tuple." | |||
If the algorithm includes only the 3-tuple {dest addr, source addr, | If the algorithm includes only the 3-tuple {dest addr, source addr, | |||
flow label}, it will assign all fragments belonging to a packet to | flow label}, it will assign all fragments belonging to a packet to | |||
the same link. (See [RFC6437] and [RFC7098]). | the same link. (See [RFC6437] and [RFC7098]). | |||
In order to avoid the problem described above, implementations SHOULD | In order to avoid the problem described above, implementations SHOULD | |||
skipping to change at page 20, line 29 ¶ | skipping to change at page 20, line 34 ¶ | |||
that is not compliant with RFC 791 or RFC 8200, or even discard IP | that is not compliant with RFC 791 or RFC 8200, or even discard IP | |||
fragments completely. Such behaviors are NOT RECOMMENDED. If a | fragments completely. Such behaviors are NOT RECOMMENDED. If a | |||
middlebox implements non-standard behavior with respect to IP | middlebox implements non-standard behavior with respect to IP | |||
fragmentation, then that behavior MUST be clearly documented. | fragmentation, then that behavior MUST be clearly documented. | |||
6.4. For ECMP, LAG and Load-Balancer Developers And Operators | 6.4. For ECMP, LAG and Load-Balancer Developers And Operators | |||
In their default configuration, when the IPv6 Flow Label is not equal | In their default configuration, when the IPv6 Flow Label is not equal | |||
to zero, IPv6 devices that implement Equal-Cost Multipath (ECMP) | to zero, IPv6 devices that implement Equal-Cost Multipath (ECMP) | |||
Routing as described in OSPF [RFC2328] and other routing protocols, | Routing as described in OSPF [RFC2328] and other routing protocols, | |||
Link Aggregation Grouping (LAG) [RFC7424], or other load-balancing | Link Aggregation Grouping (LAG) [RFC7424], or other load-distribution | |||
technologies SHOULD accept only the following fields as input to | technologies SHOULD accept only the following fields as input to | |||
their hash algorithm: | their hash algorithm: | |||
o IP Source Address. | o IP Source Address. | |||
o IP Destination Address. | o IP Destination Address. | |||
o Flow Label. | o Flow Label. | |||
Operators SHOULD deploy these devices in their default configuration. | Operators SHOULD deploy these devices in their default configuration. | |||
End of changes. 23 change blocks. | ||||
37 lines changed or deleted | 48 lines changed or added | |||
This html diff was produced by rfcdiff 1.47. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |
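
Illustrative sketch (not part of either draft): the load-distribution behavior described in Section 3.5 and the recommendation in Section 6.4 can be approximated as follows. This is a minimal sketch under stated assumptions. The Packet type, the function names, and the use of SHA-256 as the hash are hypothetical choices made for illustration; real devices use their own hash functions. The sketch also assumes that the two 5-tuple fields elided from the diff are the transport-layer source and destination ports, and it reuses the same protocol number for a trailing fragment for simplicity.

```python
import hashlib
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    """Hypothetical view of the header fields a load distributor can see."""
    src: str
    dst: str
    proto: int                      # IPv4 Protocol or IPv6 Next Header
    src_port: Optional[int] = None  # None when no transport header is visible,
    dst_port: Optional[int] = None  # as in a trailing fragment
    flow_label: int = 0             # IPv6 Flow Label (0 if unset)


def _hash(key: tuple) -> int:
    # Stand-in hash for illustration only; not what any particular device uses.
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_link_5tuple(pkt: Packet, n_links: int) -> int:
    """Commonly deployed behavior: 5-tuple when ports are visible, else 3-tuple."""
    if pkt.src_port is not None and pkt.dst_port is not None:
        key = (pkt.src, pkt.dst, pkt.proto, pkt.src_port, pkt.dst_port)
    else:
        key = (pkt.src, pkt.dst, pkt.proto)
    return _hash(key) % n_links


def pick_link_flow_label(pkt: Packet, n_links: int) -> int:
    """Section 6.4 style: only {dest addr, source addr, flow label} are hashed."""
    key = (pkt.dst, pkt.src, pkt.flow_label)
    return _hash(key) % n_links


# A first fragment (ports visible) and a trailing fragment (ports not visible)
# of the same packet.
first = Packet("2001:db8::1", "2001:db8::2", 17, 5000, 53, flow_label=0x12345)
trailing = Packet("2001:db8::1", "2001:db8::2", 17, flow_label=0x12345)

# The flow-label hash ignores ports, so both fragments take the same link.
same_link = pick_link_flow_label(first, 8) == pick_link_flow_label(trailing, 8)
```

With the 5-tuple variant, the first and trailing fragments are hashed over different inputs, so nothing guarantees that they share a link. With the flow-label variant, the hash input is identical for every fragment of a packet, which is why all fragments are assigned to the same link.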