draft-ietf-tsvwg-datagram-plpmtud-22.txt | rfc8899.txt | |||
---|---|---|---|---|
Internet Engineering Task Force G. Fairhurst | Internet Engineering Task Force (IETF) G. Fairhurst | |||
Internet-Draft T. Jones | Request for Comments: 8899 T. Jones | |||
Updates: 4821, 4960, 6951, 8085, 8261 (if University of Aberdeen | Updates: 4821, 4960, 6951, 8085, 8261 University of Aberdeen | |||
approved) M. Tuexen | Category: Standards Track M. Tüxen | |||
Intended status: Standards Track I. Ruengeler | ISSN: 2070-1721 I. Rüngeler | |||
Expires: 12 December 2020 T. Voelker | T. Völker | |||
Muenster University of Applied Sciences | Münster University of Applied Sciences | |||
10 June 2020 | September 2020 | |||
Packetization Layer Path MTU Discovery for Datagram Transports | Packetization Layer Path MTU Discovery for Datagram Transports | |||
draft-ietf-tsvwg-datagram-plpmtud-22 | ||||
Abstract | Abstract | |||
This document describes a robust method for Path MTU Discovery | This document specifies Datagram Packetization Layer Path MTU | |||
(PMTUD) for datagram Packetization Layers (PLs). It describes an | Discovery (DPLPMTUD). This is a robust method for Path MTU Discovery | |||
extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | (PMTUD) for datagram Packetization Layers (PLs). It allows a PL, or | |||
MTU Discovery for IPv4 and IPv6. The method allows a PL, or a | a datagram application that uses a PL, to discover whether a network | |||
datagram application that uses a PL, to discover whether a network | ||||
path can support the current size of datagram. This can be used to | path can support the current size of datagram. This can be used to | |||
detect and reduce the message size when a sender encounters a packet | detect and reduce the message size when a sender encounters a packet | |||
black hole (where packets are discarded). The method can probe a | black hole. It can also probe a network path to discover whether the | |||
network path with progressively larger packets to discover whether | maximum packet size can be increased. This provides functionality | |||
the maximum packet size can be increased. This allows a sender to | for datagram transports that is equivalent to the PLPMTUD | |||
determine an appropriate packet size, providing functionality for | specification for TCP, specified in RFC 4821, which it updates. It | |||
datagram transports that is equivalent to the Packetization Layer | also updates the UDP Usage Guidelines to refer to this method for use | |||
PMTUD specification for TCP, specified in RFC 4821. | with UDP datagrams and updates SCTP. | |||
This document updates RFC 4821 to specify the PLPMTUD method for | ||||
datagram PLs. It also updates RFC 8085 to refer to the method | ||||
specified in this document instead of the method in RFC 4821 for use | ||||
with UDP datagrams. Section 7.3 of RFC 4960 recommends an endpoint | ||||
apply the techniques in RFC 4821 on a per-destination-address basis. | ||||
RFC 4960, RFC 6951, and RFC 8261 are updated to recommend that SCTP, | ||||
SCTP encapsulated in UDP and SCTP encapsulated in DTLS use the method | ||||
specified in this document instead of the method in RFC 4821. | ||||
The document also provides implementation notes for incorporating | The document provides implementation notes for incorporating Datagram | |||
Datagram PMTUD into IETF datagram transports or applications that use | PMTUD into IETF datagram transports or applications that use datagram | |||
datagram transports. | transports. | |||
When published, this specification updates RFC 4960, RFC 4821, RFC | This specification updates RFC 4960, RFC 4821, RFC 6951, RFC 8085, | |||
8085 and RFC 8261. | and RFC 8261. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on 12 December 2020. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc8899. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Simplified BSD License text | to this document. Code Components extracted from this document must | |||
as described in Section 4.e of the Trust Legal Provisions and are | include Simplified BSD License text as described in Section 4.e of | |||
provided without warranty as described in the Simplified BSD License. | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
1.1. Classical Path MTU Discovery | 1.1. Classical Path MTU Discovery | |||
1.2. Packetization Layer Path MTU Discovery | 1.2. Packetization Layer Path MTU Discovery | |||
1.3. Path MTU Discovery for Datagram Services | 1.3. Path MTU Discovery for Datagram Services | |||
2. Terminology | 2. Terminology | |||
3. Features Required to Provide Datagram PLPMTUD | 3. Features Required to Provide Datagram PLPMTUD | |||
4. DPLPMTUD Mechanisms | 4. DPLPMTUD Mechanisms | |||
4.1. PLPMTU Probe Packets | 4.1. PLPMTU Probe Packets | |||
4.2. Confirmation of Probed Packet Size | 4.2. Confirmation of Probed Packet Size | |||
4.3. Black Hole Detection and Reducing the PLPMTU | 4.3. Black Hole Detection and Reducing the PLPMTU | |||
4.4. The Maximum Packet Size (MPS) | 4.4. The Maximum Packet Size (MPS) | |||
4.5. Disabling the Effect of PMTUD | 4.5. Disabling the Effect of PMTUD | |||
4.6. Response to PTB Messages | 4.6. Response to PTB Messages | |||
4.6.1. Validation of PTB Messages | 4.6.1. Validation of PTB Messages | |||
4.6.2. Use of PTB Messages | 4.6.2. Use of PTB Messages | |||
5. Datagram Packetization Layer PMTUD | ||||
5. Datagram Packetization Layer PMTUD | 5.1. DPLPMTUD Components | |||
5.1. DPLPMTUD Components | 5.1.1. Timers | |||
5.1.1. Timers | 5.1.2. Constants | |||
5.1.2. Constants | 5.1.3. Variables | |||
5.1.3. Variables | 5.1.4. Overview of DPLPMTUD Phases | |||
5.1.4. Overview of DPLPMTUD Phases | 5.2. State Machine | |||
5.2. State Machine | 5.3. Search to Increase the PLPMTU | |||
5.3. Search to Increase the PLPMTU | 5.3.1. Probing for a Larger PLPMTU | |||
5.3.1. Probing for a larger PLPMTU | 5.3.2. Selection of Probe Sizes | |||
5.3.2. Selection of Probe Sizes | 5.3.3. Resilience to Inconsistent Path Information | |||
5.3.3. Resilience to Inconsistent Path Information | 5.4. Robustness to Inconsistent Paths | |||
5.4. Robustness to Inconsistent Paths | 6. Specification of Protocol-Specific Methods | |||
6. Specification of Protocol-Specific Methods | 6.1. Application Support for DPLPMTUD with UDP or UDP-Lite | |||
6.1. Application support for DPLPMTUD with UDP or UDP-Lite | 6.1.1. Application Request | |||
6.1.1. Application Request | 6.1.2. Application Response | |||
6.1.2. Application Response | 6.1.3. Sending Application Probe Packets | |||
6.1.3. Sending Application Probe Packets | 6.1.4. Initial Connectivity | |||
6.1.4. Initial Connectivity | 6.1.5. Validating the Path | |||
6.1.5. Validating the Path | 6.1.6. Handling of PTB Messages | |||
6.1.6. Handling of PTB Messages | 6.2. DPLPMTUD for SCTP | |||
6.2. DPLPMTUD for SCTP | 6.2.1. SCTP/IPv4 and SCTP/IPv6 | |||
6.2.1. SCTP/IPv4 and SCTP/IPv6 | 6.2.1.1. Initial Connectivity | |||
6.2.1.1. Initial Connectivity | 6.2.1.2. Sending SCTP Probe Packets | |||
6.2.1.2. Sending SCTP Probe Packets | 6.2.1.3. Validating the Path with SCTP | |||
6.2.1.3. Validating the Path with SCTP | 6.2.1.4. PTB Message Handling by SCTP | |||
6.2.1.4. PTB Message Handling by SCTP | 6.2.2. DPLPMTUD for SCTP/UDP | |||
6.2.2. DPLPMTUD for SCTP/UDP | 6.2.2.1. Initial Connectivity | |||
6.2.2.1. Initial Connectivity | 6.2.2.2. Sending SCTP/UDP Probe Packets | |||
6.2.2.2. Sending SCTP/UDP Probe Packets | 6.2.2.3. Validating the Path with SCTP/UDP | |||
6.2.2.3. Validating the Path with SCTP/UDP | 6.2.2.4. Handling of PTB Messages by SCTP/UDP | |||
6.2.2.4. Handling of PTB Messages by SCTP/UDP | 6.2.3. DPLPMTUD for SCTP/DTLS | |||
6.2.3. DPLPMTUD for SCTP/DTLS | 6.2.3.1. Initial Connectivity | |||
6.2.3.1. Initial Connectivity | 6.2.3.2. Sending SCTP/DTLS Probe Packets | |||
6.2.3.2. Sending SCTP/DTLS Probe Packets | 6.2.3.3. Validating the Path with SCTP/DTLS | |||
6.2.3.3. Validating the Path with SCTP/DTLS | 6.2.3.4. Handling of PTB Messages by SCTP/DTLS | |||
6.2.3.4. Handling of PTB Messages by SCTP/DTLS | 6.3. DPLPMTUD for QUIC | |||
6.3. DPLPMTUD for QUIC | 7. IANA Considerations | |||
7. Acknowledgments | 8. Security Considerations | |||
8. IANA Considerations | 9. References | |||
9. Security Considerations | 9.1. Normative References | |||
10. References | 9.2. Informative References | |||
10.1. Normative References | Acknowledgments | |||
10.2. Informative References | Authors' Addresses | |||
Appendix A. Revision Notes | ||||
Authors' Addresses | ||||
1. Introduction | 1. Introduction | |||
The IETF has specified datagram transport using UDP, SCTP, and DCCP, | The IETF has specified datagram transport using UDP, Stream Control | |||
as well as protocols layered on top of these transports (e.g., SCTP/ | Transmission Protocol (SCTP), and Datagram Congestion Control | |||
UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP | Protocol (DCCP), as well as protocols layered on top of these | |||
network layer. This document describes a robust method for Path MTU | transports (e.g., SCTP/UDP, DCCP/UDP, QUIC/UDP) and direct datagram | |||
Discovery (PMTUD) that can be used with these transport protocols (or | transport over the IP network layer. This document describes a | |||
the applications that use their transport service) to discover an | robust method for Path MTU Discovery (PMTUD) that can be used with | |||
appropriate size of packet to use across an Internet path. | these transport protocols (or the applications that use their | |||
transport service) to discover an appropriate size of packet to use | ||||
across an Internet path. | ||||
1.1. Classical Path MTU Discovery | 1.1. Classical Path MTU Discovery | |||
Classical Path Maximum Transmission Unit Discovery (PMTUD) can be | Classical Path Maximum Transmission Unit Discovery (PMTUD) can be | |||
used with any transport that is able to process ICMP Packet Too Big | used with any transport that is able to process ICMP Packet Too Big | |||
(PTB) messages (e.g., [RFC1191] and [RFC8201]). In this document, | (PTB) messages (e.g., [RFC1191] and [RFC8201]). In this document, | |||
the term PTB message is applied to both IPv4 ICMP Unreachable | the term PTB message is applied to both IPv4 ICMP Unreachable | |||
messages (type 3) that carry the error Fragmentation Needed (Type 3, | messages (Type 3) that carry the error Fragmentation Needed (Type 3, | |||
Code 4) [RFC0792] and ICMPv6 Packet Too Big messages (Type 2) | Code 4) [RFC0792] and ICMPv6 Packet Too Big messages (Type 2) | |||
[RFC4443]. When a sender receives a PTB message, it reduces the | [RFC4443]. When a sender receives a PTB message, it reduces the | |||
effective MTU to the value reported as the Link MTU in the PTB | effective MTU to the value reported as the link MTU in the PTB | |||
message. A method from time-to-time increases the packet size in | message. Classical PMTUD specifies a method of periodically | |||
attempt to discover an increase in the supported PMTU. The packets | increasing the packet size in an attempt to discover an increase in | |||
sent with a size larger than the current effective PMTU are known as | the supported PMTU. The packets sent with a size larger than the | |||
probe packets. | current effective PMTU are known as probe packets. | |||
Packets not intended as probe packets are either fragmented to the | Packets not intended as probe packets are either fragmented to the | |||
current effective PMTU, or the attempt to send fails with an error | current effective PMTU, or the attempt to send fails with an error | |||
code. Applications can be provided with a primitive to let them read | code. Applications can be provided with a primitive to let them read | |||
the Maximum Packet Size (MPS), derived from the current effective | the Maximum Packet Size (MPS), which is derived from the current | |||
PMTU. | effective PMTU. | |||
Classical PMTUD is subject to protocol failures. One failure arises | Classical PMTUD is subject to protocol failures. One failure arises | |||
when traffic using a packet size larger than the actual PMTU is | when traffic using a packet size larger than the actual PMTU is | |||
black-holed (all datagrams larger than the actual PMTU, are | black-holed (all datagrams larger than the actual PMTU are | |||
discarded). This could arise when the PTB messages are not delivered | discarded). This could arise when the PTB messages are not sent back | |||
back to the sender for some reason (see for example [RFC2923]). | to the sender for some reason (for example, see [RFC2923]). | |||
Examples where PTB messages are not delivered include: | Examples of where PTB messages are not delivered include the | |||
following: | ||||
* The generation of ICMP messages is usually rate limited. This | * The generation of ICMP messages is usually rate limited. This | |||
could result in no PTB messages being generated to the sender (see | could result in no PTB messages being generated to the sender (see | |||
section 2.4 of [RFC4443]) | Section 2.4 of [RFC4443]). | |||
* ICMP messages can be filtered by middleboxes (including firewalls) | * ICMP messages can be filtered by middleboxes, including firewalls | |||
[RFC4890]. A firewall could be configured with a policy to block | [RFC4890]. A firewall could be configured with a policy to block | |||
incoming ICMP messages, which would prevent reception of PTB | incoming ICMP messages, which would prevent reception of PTB | |||
messages to a sending endpoint behind this firewall. | messages by a sending endpoint behind this firewall. | |||
* When the router issuing the ICMP message drops a tunneled packet, | * When the router issuing the ICMP message drops a tunneled packet, | |||
the resulting ICMP message will be directed to the tunnel ingress. | the resulting ICMP message is directed to the tunnel ingress. | |||
This tunnel endpoint is responsible for forwarding the ICMP | This tunnel endpoint is responsible for forwarding the ICMP | |||
message and also processing the quoted packet within the payload | message, processing the quoted packet within the payload field to | |||
field to remove the effect of the tunnel, and return a correctly | remove the effect of the tunnel and returning a correctly | |||
formatted ICMP message to the sender [I-D.ietf-intarea-tunnels]. | formatted ICMP message to the sender [TUNNELS]. Failure to do | |||
Failure to do this prevents the PTB message reaching the original | this prevents the PTB message from reaching the original sender. | |||
sender. | ||||
* Asymmetry in forwarding can result in there being no return route | * Asymmetry in forwarding can result in there being no return route | |||
to the original sender, which would prevent an ICMP message being | to the original sender, which would prevent an ICMP message from | |||
delivered to the sender. This issue can also arise when policy- | being delivered to the sender. This issue can also arise when | |||
based routing is used, Equal Cost Multipath (ECMP) routing is | either policy-based or Equal-Cost Multipath (ECMP) routing is used | |||
used, or a middlebox acts as an application load balancer. An | or when a middlebox acts as an application load balancer. An | |||
example is where the path towards the server is chosen by ECMP | example of which is an ECMP router choosing a path toward the | |||
routing depending on bytes in the IP payload. In this case, when | server based on the bytes in the IP payload. In this case, if a | |||
a packet sent by the server encounters a problem after the ECMP | packet sent by the server encounters a problem after the ECMP | |||
router, then any resulting ICMP message also needs to be directed | router, then the ECMP router needs to direct any resulting ICMP | |||
by the ECMP router towards the original sender. | message toward the original sender. | |||
* There are additional cases where the next hop destination fails to | * There are additional cases where the next-hop destination fails to | |||
receive a packet because of its size. This could be due to | receive a packet because of its size. This could be due to | |||
misconfiguration of the layer 2 path between nodes, for instance | misconfiguration of the layer 2 path between nodes, for instance | |||
the MTU configured in a layer 2 switch, or misconfiguration of the | the MTU configured in a layer 2 switch, or misconfiguration of the | |||
Maximum Receive Unit (MRU). If a packet is dropped by the link, | Maximum Receive Unit (MRU). If a packet is dropped by the link, | |||
this will not cause a PTB message to be sent to the original | this will not cause a PTB message to be sent to the original | |||
sender. | sender. | |||
Another failure could result if a node that is not on the network | Another failure could result if a node that is not on the network | |||
path sends a PTB message that attempts to force a sender to change | path sends a PTB message that attempts to force a sender to change | |||
the effective PMTU [RFC8201]. A sender can protect itself from | the effective PMTU [RFC8201]. A sender can protect itself from | |||
reacting to such messages by utilizing the quoted packet within a PTB | reacting to such messages by utilizing the quoted packet within a PTB | |||
message payload to validate that the received PTB message was | message payload to validate that the received PTB message was | |||
generated in response to a packet that had actually originated from | generated in response to a packet that had actually originated from | |||
the sender. However, there are situations where a sender would be | the sender. However, there are situations where a sender would be | |||
unable to provide this validation. Examples where validation of the | unable to provide this validation. Examples where the validation of | |||
PTB message is not possible include: | the PTB message is not possible include the following: | |||
* When a router issuing the ICMP message implements RFC792 | * When a router issuing the ICMP message implements RFC 792 | |||
[RFC0792], it is only required to include the first 64 bits of the | [RFC0792], it is only required to include the first 64 bits of the | |||
IP payload of the packet within the quoted payload. There could | IP payload of the packet within the quoted payload. There could | |||
be insufficient bytes remaining for the sender to interpret the | be insufficient bytes remaining for the sender to interpret the | |||
quoted transport information. | quoted transport information. | |||
Note: The recommendation in RFC1812 [RFC1812] is that IPv4 routers | Note: The recommendation in RFC 1812 [RFC1812] is that IPv4 | |||
return a quoted packet with as much of the original datagram as | routers return a quoted packet with as much of the original | |||
possible without the length of the ICMP datagram exceeding 576 | datagram as possible without the length of the ICMP datagram | |||
bytes. IPv6 routers include as much of the invoking packet as | exceeding 576 bytes. IPv6 routers include as much of the invoking | |||
possible without the ICMPv6 packet exceeding 1280 bytes [RFC4443]. | packet as possible without the ICMPv6 packet exceeding 1280 bytes | |||
[RFC4443]. | ||||
* The use of tunnels/encryption can reduce the size of the quoted | * The use of tunnels and/or encryption can reduce the size of the | |||
packet returned to the original source address, increasing the | quoted packet returned to the original source address, increasing | |||
risk that there could be insufficient bytes remaining for the | the risk that there could be insufficient bytes remaining for the | |||
sender to interpret the quoted transport information. | sender to interpret the quoted transport information. | |||
* Even when the PTB message includes sufficient bytes of the quoted | * Even when the PTB message includes sufficient bytes of the quoted | |||
packet, the network layer could lack sufficient context to | packet, the network layer could lack sufficient context to | |||
validate the message, because validation depends on information | validate the message because validation depends on information | |||
about the active transport flows at an endpoint node (e.g., the | about the active transport flows at an endpoint node (e.g., the | |||
socket/address pairs being used, and other protocol header | socket/address pairs being used and other protocol header | |||
information). | information). | |||
* When a packet is encapsulated/tunneled over an encrypted | * When a packet is encapsulated/tunneled over an encrypted | |||
transport, the tunnel/encapsulation ingress might have | transport, the tunnel/encapsulation ingress might have | |||
insufficient context, or computational power, to reconstruct the | insufficient context, or computational power, to reconstruct the | |||
transport header that would be needed to perform validation. | transport header that would be needed to perform validation. | |||
* When an ICMP message is generated by a router in a network segment | * When an ICMP message is generated by a router in a network segment | |||
that has inserted a header into a packet, the quoted packet could | that has inserted a header into a packet, the quoted packet could | |||
contain additional protocol header information that was not | contain additional protocol header information that was not | |||
included in the original sent packet, and which the PL sender does | included in the original sent packet and that the PL sender does | |||
not process or may not know how to process. This could disrupt | not process or may not know how to process. This could disrupt | |||
the ability of the sender to validate this PTB message. | the ability of the sender to validate this PTB message. | |||
* A Network Address Translation (NAT) device that translates a | * A Network Address Translation (NAT) device that translates a | |||
packet header, ought to also translate ICMP messages and update | packet header ought to also translate ICMP messages and update the | |||
the ICMP quoted packet [RFC5508] in that message. If this is not | ICMP-quoted packet [RFC5508] in that message. If this is not | |||
correctly translated then the sender would not be able to | correctly translated, then the sender would not be able to | |||
associate the message with the PL that originated the packet, and | associate the message with the PL that originated the packet, and | |||
hence this ICMP message cannot be validated. | hence this ICMP message cannot be validated. | |||
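The validation idea described in Section 1.1 (checking the quoted packet inside a PTB message against packets the sender actually originated) can be illustrated with a minimal Python sketch. It is not taken from either document: the function names `validate_ptb` and `build_example_ptb`, the `active_flows` set of (source address, source port, destination address, destination port) tuples, and the restriction to IPv4 with UDP are all assumptions made for illustration. It relies only on the ICMP layout of RFC 792 and RFC 1191, where the Next-Hop MTU occupies bytes 6-7 and the quoted IP header plus at least 64 bits of payload follow the 8-byte ICMP header.

```python
import socket
import struct
from typing import Optional, Set, Tuple

Flow = Tuple[str, int, str, int]  # (src addr, src port, dst addr, dst port)

ICMP_DEST_UNREACHABLE = 3   # IPv4 ICMP Type 3
ICMP_FRAG_NEEDED = 4        # Code 4, Fragmentation Needed
IPPROTO_UDP = 17


def validate_ptb(icmp: bytes, active_flows: Set[Flow]) -> Optional[int]:
    """Return the reported MTU if the PTB quotes a known UDP flow, else None.

    Hypothetical helper: the flow table is assumed to be kept by the PL.
    """
    if len(icmp) < 8:
        return None
    icmp_type, code, _cksum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if (icmp_type, code) != (ICMP_DEST_UNREACHABLE, ICMP_FRAG_NEEDED):
        return None

    quoted = icmp[8:]
    if len(quoted) < 20 or quoted[0] >> 4 != 4:
        return None                      # too short, or not a quoted IPv4 header
    ihl = (quoted[0] & 0x0F) * 4         # quoted IP header length in bytes
    if ihl < 20 or len(quoted) < ihl + 4:
        return None                      # not enough bytes to read the ports
    if quoted[9] != IPPROTO_UDP:
        return None

    src = socket.inet_ntoa(quoted[12:16])
    dst = socket.inet_ntoa(quoted[16:20])
    sport, dport = struct.unpack("!HH", quoted[ihl:ihl + 4])
    if (src, sport, dst, dport) not in active_flows:
        return None                      # not a packet this endpoint sent
    return mtu


def build_example_ptb(mtu: int = 1400) -> bytes:
    """Build a synthetic PTB message quoting a UDP packet (illustration only)."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 1500, 0, 0, 64, IPPROTO_UDP, 0,
                     socket.inet_aton("192.0.2.1"),
                     socket.inet_aton("198.51.100.7"))
    udp = struct.pack("!HHHH", 5000, 443, 1480, 0)
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACHABLE, ICMP_FRAG_NEEDED,
                         0, 0, mtu)
    return header + ip + udp
```

A message that quotes too few bytes, or a flow the endpoint does not know, yields `None`. This mirrors the cases listed above in which validation is not possible.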
1.2. Packetization Layer Path MTU Discovery | 1.2. Packetization Layer Path MTU Discovery | |||
The term Packetization Layer (PL) has been introduced to describe the | The term Packetization Layer (PL) has been introduced to describe the | |||
layer that is responsible for placing data blocks into the payload of | layer that is responsible for placing data blocks into the payload of | |||
IP packets and selecting an appropriate MPS. This function is often | IP packets and selecting an appropriate MPS. This function is often | |||
performed by a transport protocol (e.g., DCCP, RTP, SCTP, QUIC), but | performed by a transport protocol (e.g., DCCP, RTP, SCTP, QUIC) but | |||
can also be performed by other encapsulation methods working above | can also be performed by other encapsulation methods working above | |||
the transport layer. | the transport layer. | |||
In contrast to PMTUD, Packetization Layer Path MTU Discovery | In contrast to PMTUD, Packetization Layer Path MTU Discovery | |||
(PLPMTUD) [RFC4821] introduced a method that does not rely upon | (PLPMTUD) [RFC4821] introduces a method that does not rely upon | |||
reception and validation of PTB messages. It is therefore more | reception and validation of PTB messages. It is therefore more | |||
robust than Classical PMTUD. This has become the recommended | robust than Classical PMTUD. This has become the recommended | |||
approach for implementing discovery of the PMTU [BCP145]. | approach for implementing discovery of the PMTU [BCP145]. | |||
It uses a general strategy where the PL sends probe packets to search | This document updates [RFC4821] to specify the PLPMTUD method for | |||
for the largest size of unfragmented datagram that can be sent over a | datagram PLs and also updates [BCP145] to refer to the method | |||
network path. Probe packets are sent to explore using a larger | specified in this document for use with UDP datagrams instead of the | |||
packet size. If a probe packet is successfully delivered (as | method in [RFC4821]. | |||
It uses a general strategy in which the PL sends probe packets to | ||||
search for the largest size of unfragmented datagram that can be sent | ||||
over a network path. Probe packets are sent to explore using a | ||||
larger packet size. If a probe packet is successfully delivered (as | ||||
determined by the PL), then the PLPMTU is raised to the size of the | determined by the PL), then the PLPMTU is raised to the size of the | |||
successful probe. If a black hole is detected (e.g., where packets | successful probe. If a black hole is detected (e.g., where packets | |||
of size PLPMTU are consistently not received), the method reduces the | of size PLPMTU are consistently not received), the method reduces the | |||
PLPMTU. | PLPMTU. | |||
Datagram PLPMTUD introduces flexibility in implementation. At one | Datagram PLPMTUD introduces flexibility in implementation. At one | |||
extreme, it can be configured to only perform Black Hole Detection | extreme, it can be configured to only perform black hole detection | |||
and recovery with increased robustness compared to Classical PMTUD. | and recovery with increased robustness compared to Classical PMTUD. | |||
At the other extreme, all PTB processing can be disabled, and PLPMTUD | At the other extreme, all PTB processing can be disabled, and PLPMTUD | |||
replaces Classical PMTUD. | replaces Classical PMTUD. | |||
PLPMTUD can also include additional consistency checks without | PLPMTUD can also include additional consistency checks without | |||
increasing the risk that data is lost when probing to discover the | increasing the risk that data is lost when probing to discover the | |||
Path MTU. For example, information available at the PL, or higher | Path MTU. For example, information available at the PL, or higher | |||
layers, enables received PTB messages to be validated before being | layers, enables received PTB messages to be validated before being | |||
utilized. | utilized. | |||
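The general strategy in Section 1.2 (probe with a larger size, raise the PLPMTU when a probe is confirmed, reduce it when a black hole is detected) can be sketched as follows. This is a simplified illustration, not the search algorithm or state machine that the specification defines in Section 5. The names `search_plpmtu`, `reduce_on_black_hole`, and `send_probe`, the binary-search step, the retry count, and the choice to fall back to the base size are all assumptions.

```python
from typing import Callable


def search_plpmtu(send_probe: Callable[[int], bool],
                  base: int, max_size: int, max_probes: int = 3) -> int:
    """Search for the largest confirmed packet size between base and max_size.

    send_probe(size) is a hypothetical callback that sends one probe packet
    of the given size and returns True only if the PL confirmed delivery.
    """
    plpmtu = base
    low, high = base, max_size
    while low < high:
        size = (low + high + 1) // 2
        # A single lost probe does not prove the size is too large,
        # so each size is tried up to max_probes times before giving up.
        delivered = any(send_probe(size) for _ in range(max_probes))
        if delivered:
            plpmtu = low = size          # raise PLPMTU to the confirmed size
        else:
            high = size - 1              # treat this size as unsupported
    return plpmtu


def reduce_on_black_hole(plpmtu: int, base: int) -> int:
    """Fall back to the base size when packets of size plpmtu stop arriving.

    Falling back to base is an assumption of this sketch.
    """
    return min(plpmtu, base)
```

For a path that supports packets up to 1400 bytes, `search_plpmtu(lambda s: s <= 1400, 1200, 1500)` converges on 1400. A path that confirms no probe leaves the PLPMTU at the base size.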
1.3. Path MTU Discovery for Datagram Services | 1.3. Path MTU Discovery for Datagram Services | |||
Section 5 of this document presents a set of algorithms for datagram | Section 5 of this document presents a set of algorithms for datagram | |||
protocols to discover the largest size of unfragmented datagram that | protocols to discover the largest size of unfragmented datagram that | |||
can be sent over a network path. The method relies upon features of | can be sent over a network path. The method relies upon features of | |||
the PL described in Section 3 and applies to transport protocols | the PL described in Section 3 and applies to transport protocols | |||
operating over IPv4 and IPv6. It does not require cooperation from | operating over IPv4 and IPv6. It does not require cooperation from | |||
the lower layers, although it can utilize PTB messages when these | the lower layers, although it can utilize PTB messages when these | |||
received messages are made available to the PL. | received messages are made available to the PL. | |||
The message size guidelines in section 3.2 of the UDP Usage | The message size guidelines in Section 3.2 of the UDP Usage | |||
Guidelines [BCP145] state "an application SHOULD either use the Path | Guidelines [BCP145] state that "an application SHOULD either use the | |||
MTU information provided by the IP layer or implement Path MTU | Path MTU information provided by the IP layer or implement Path MTU | |||
Discovery (PMTUD)", but does not provide a mechanism for discovering | Discovery (PMTUD)" but do not provide a mechanism for discovering the | |||
the largest size of unfragmented datagram that can be used on a | largest size of unfragmented datagram that can be used on a network | |||
network path. The present document updates RFC 8085 to specify this | path. The present document updates RFC 8085 to specify this method | |||
method in place of PLPMTUD [RFC4821] and provides a mechanism for | in place of PLPMTUD [RFC4821] and provides a mechanism for sharing | |||
sharing the discovered largest size as the MPS (see Section 4.4). | the discovered largest size as the MPS (see Section 4.4). | |||
Section 10.2 of [RFC4821] recommended a PLPMTUD probing method for | Section 10.2 of [RFC4821] recommended a PLPMTUD probing method for | |||
the Stream Control Transport Protocol (SCTP). SCTP utilizes probe | the Stream Control Transport Protocol (SCTP). SCTP utilizes probe | |||
packets consisting of a minimal sized HEARTBEAT chunk bundled with a | packets consisting of a minimal-sized HEARTBEAT chunk bundled with a | |||
PAD chunk as defined in [RFC4820]. However, RFC 4821 did not provide | PAD chunk as defined in [RFC4820]. However, RFC 4821 did not provide | |||
a complete specification. The present document replaces that | a complete specification. The present document replaces that | |||
description by providing a complete specification. | description by providing a complete specification. | |||
The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | |||
implementations to support Classical PMTUD and states that a DCCP | implementations to support Classical PMTUD and states that a DCCP | |||
sender "MUST maintain the MPS allowed for each active DCCP session". | sender "MUST maintain the MPS allowed for each active DCCP session". | |||
It also defines the current congestion control MPS (CCMPS) supported | It also defines the current congestion control MPS (CCMPS) supported | |||
by a network path. This recommends use of PMTUD, and suggests use of | by a network path. This recommends use of PMTUD and suggests use of | |||
control packets (DCCP-Sync) as path probe packets, because they do | control packets (DCCP-Sync) as path probe packets because they do not | |||
not risk application data loss. The method defined in this | risk application data loss. The method defined in this specification | |||
specification can be used with DCCP. | can be used with DCCP. | |||
Section 4 and Section 5 define the protocol mechanisms and | Section 4 and Section 5 define the protocol mechanisms and | |||
specification for Datagram Packetization Layer Path MTU Discovery | specification for Datagram Packetization Layer Path MTU Discovery | |||
(DPLPMTUD). | (DPLPMTUD). | |||
Section 6 specifies the method for datagram transports and provides | Section 6 specifies the method for datagram transports and provides | |||
information to enable the implementation of PLPMTUD with other | information to enable the implementation of PLPMTUD with other | |||
datagram transports and applications that use datagram transports. | datagram transports and applications that use datagram transports. | |||
Section 6 also provides updated recommendations for [RFC6951] and | Section 6 also provides recommendations for SCTP endpoints, updating | |||
[RFC8261]. | [RFC4960], [RFC6951], and [RFC8261] to use the method specified in | |||
this document instead of the method in [RFC4821]. | ||||
2. Terminology | 2. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown here. | capitals, as shown here. | |||
The following terminology is defined. Relevant terms are directly | The following terminology is defined. Relevant terms are directly | |||
copied from [RFC4821], and the definitions in [RFC1122]. | copied from [RFC4821], and the definitions in [RFC1122] apply. | |||
Acknowledged PL: A PL that includes a mechanism that can confirm | Acknowledged PL: A PL that includes a mechanism that can confirm | |||
successful delivery of datagrams to the remote PL endpoint (e.g., | successful delivery of datagrams to the remote PL endpoint (e.g., | |||
SCTP). Typically, the PL receiver returns acknowledgments | SCTP). Typically, the PL receiver returns acknowledgments | |||
corresponding to the received datagrams, which can be utilised to | corresponding to the received datagrams, which can be utilized to | |||
detect black-holing of packets (cf. Unacknowledged PL). | detect black-holing of packets (cf. Unacknowledged PL). | |||
Actual PMTU: The Actual PMTU is the PMTU of a network path between a | Actual PMTU: The actual PMTU is the PMTU of a network path between a | |||
sender PL and a destination PL, which the DPLPMTUD algorithm seeks | sender PL and a destination PL, which the DPLPMTUD algorithm seeks | |||
to determine. | to determine. | |||
Black Hole: A Black Hole is encountered when a sender is unaware | Black Hole: A black hole is encountered when a sender is unaware | |||
that packets are not being delivered to the destination end point. | that packets are not being delivered to the destination endpoint. | |||
Two types of Black Hole are relevant to DPLPMTUD: | Two types of black hole are relevant to DPLPMTUD: | |||
* Packets encounter a packet Black Hole when packets are not | * Packets encounter a packet black hole when packets are not | |||
delivered to the destination endpoint (e.g., when the sender | delivered to the destination endpoint (e.g., when the sender | |||
transmits packets of a particular size with a previously known | transmits packets of a particular size with a previously known | |||
effective PMTU and they are discarded by the network). | effective PMTU, and they are discarded by the network). | |||
* An ICMP Black Hole is encountered when the sender is unaware | * An ICMP black hole is encountered when the sender is unaware | |||
that packets are not delivered to the destination endpoint | that packets are not delivered to the destination endpoint | |||
because PTB messages are not received by the originating PL | because PTB messages are not received by the originating PL | |||
sender. | sender. | |||
Classical Path MTU Discovery: Classical PMTUD is a process described | Classical Path MTU Discovery: Classical PMTUD is a process described | |||
in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to | in [RFC1191] and [RFC8201] in which nodes rely on PTB messages to | |||
learn the largest size of unfragmented packet that can be used | learn the largest size of unfragmented packet that can be used | |||
across a network path. | across a network path. | |||
Datagram: A datagram is a transport-layer protocol data unit, | Datagram: A datagram is a transport-layer protocol data unit, | |||
transmitted in the payload of an IP packet. | transmitted in the payload of an IP packet. | |||
Effective PMTU: The Effective PMTU is the current estimated value | DPLPMTUD: Datagram Packetization Layer Path MTU Discovery | |||
(DPLPMTUD), PLPMTUD performed using a datagram transport protocol. | ||||
Effective PMTU: The effective PMTU is the current estimated value | ||||
for PMTU that is used by a PMTUD. This is equivalent to the | for PMTU that is used by a PMTUD. This is equivalent to the | |||
PLPMTU derived by PLPMTUD plus the size of any headers added below | PLPMTU derived by PLPMTUD plus the size of any headers added below | |||
the PL, including the IP layer headers. | the PL, including the IP layer headers. | |||
EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | EMTU_S: The effective MTU for sending (EMTU_S) is defined in | |||
[RFC1122] as "the maximum IP datagram size that may be sent, for a | [RFC1122] as "the maximum IP datagram size that may be sent, for a | |||
particular combination of IP source and destination addresses...". | particular combination of IP source and destination addresses...". | |||
EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in | EMTU_R: The effective MTU for receiving (EMTU_R) is designated in | |||
[RFC1122] as "the largest datagram size that can be reassembled". | [RFC1122] as "the largest datagram size that can be reassembled". | |||
Link: A Link is a communication facility or medium over which nodes | Link: A link is a communication facility or medium over which nodes | |||
can communicate at the link layer, i.e., a layer below the IP | can communicate at the link layer, i.e., a layer below the IP | |||
layer. Examples are Ethernet LANs and Internet (or higher) layer | layer. Examples are Ethernet LANs and Internet (or higher) layer | |||
tunnels. | tunnels. | |||
Link MTU: The Link Maximum Transmission Unit (MTU) is the size in | Link MTU: The link Maximum Transmission Unit (MTU) is the size in | |||
bytes of the largest IP packet, including the IP header and | bytes of the largest IP packet, including the IP header and | |||
payload, that can be transmitted over a link. Note that this | payload, that can be transmitted over a link. Note that this | |||
could more properly be called the IP MTU, to be consistent with | could more properly be called the IP MTU, to be consistent with | |||
how other standards organizations use the acronym. This includes | how other standards organizations use the acronym. This includes | |||
the IP header, but excludes link layer headers and other framing | the IP header but excludes link layer headers and other framing | |||
that is not part of IP or the IP payload. Other standards | that is not part of IP or the IP payload. Other standards | |||
organizations generally define the link MTU to include the link | organizations generally define the link MTU to include the link | |||
layer headers. This specification continues the requirement in | layer headers. This specification continues the requirement in | |||
[RFC4821], that states "All links MUST enforce their MTU: links | [RFC4821] that states, "All links MUST enforce their MTU: links | |||
that might non- deterministically deliver packets that are larger | that might non-deterministically deliver packets that are larger | |||
than their rated MTU MUST consistently discard such packets." | than their rated MTU MUST consistently discard such packets." | |||
MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU that | MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU that | |||
DPLPMTUD will attempt to use (see the constants defined in | DPLPMTUD will attempt to use (see the constants defined in | |||
Section 5.1.2). | Section 5.1.2). | |||
MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | |||
DPLPMTUD will attempt to use (see the constants defined in | DPLPMTUD will attempt to use (see the constants defined in | |||
Section 5.1.2). | Section 5.1.2). | |||
MPS: The Maximum Packet Size (MPS) is the largest size of | MPS: The Maximum Packet Size (MPS) is the largest size of | |||
application data block that can be sent across a network path by a | application data block that can be sent across a network path by a | |||
PL using a single Datagram (see Section 4.4). | PL using a single datagram (see Section 4.4). | |||
MSL: Maximum Segment Lifetime (MSL) The maximum delay a packet is | MSL: The Maximum Segment Lifetime (MSL) is the maximum delay a | |||
expected to experience across a path, taken as 2 minutes [BCP145]. | packet is expected to experience across a path, taken as 2 minutes | |||
[BCP145]. | ||||
Packet: A Packet is the IP header(s) and any extension headers/ | Packet: A packet is the IP header(s) and any extension headers/ | |||
options plus the IP payload. | options plus the IP payload. | |||
Packetization Layer (PL): The PL is a layer of the network stack | Packetization Layer (PL): The PL is a layer of the network stack | |||
that places data into packets and performs transport protocol | that places data into packets and performs transport protocol | |||
functions. Examples of a PL include: TCP, SCTP, SCTP over UDP, | functions. Examples of a PL include TCP, SCTP, SCTP over UDP, | |||
SCTP over DTLS, or QUIC. | SCTP over DTLS, or QUIC. | |||
Path: The Path is the set of links and routers traversed by a packet | Path: The path is the set of links and routers traversed by a packet | |||
between a source node and a destination node by a particular flow. | between a source node and a destination node by a particular flow. | |||
Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU | Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the link MTU | |||
of all the links forming a network path between a source node and | of all the links forming a network path between a source node and | |||
a destination node, as used by PMTUD. | a destination node, as used by PMTUD. | |||
PTB: In this document, the term PTB message is applied to both IPv4 | PTB: In this document, the term PTB message is applied to both IPv4 | |||
ICMP Unreachable messages (type 3) that carry the error | ICMP Unreachable messages (Type 3) that carry the error | |||
Fragmentation Needed (Type 3, Code 4) [RFC0792] and ICMPv6 Packet | Fragmentation Needed (Type 3, Code 4) [RFC0792] and ICMPv6 Packet | |||
Too Big messages (Type 2) [RFC4443]. | Too Big messages (Type 2) [RFC4443]. | |||
PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB | PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB | |||
message that indicates the next hop link MTU of a router along the | message that indicates the next-hop link MTU of a router along the | |||
path. | path. | |||
PL_PTB_SIZE: The size reported in a validated PTB message, reduced | PL_PTB_SIZE: The size reported in a validated PTB message, reduced | |||
by the size of all headers added by layers below the PL. | by the size of all headers added by layers below the PL. | |||
PLPMTU: The Packetization Layer PMTU is an estimate of the largest | PLPMTU: The Packetization Layer PMTU is an estimate of the largest | |||
size of PL datagram that can be sent by a path, controled by | size of PL datagram that can be sent by a path, controlled by | |||
PLPMTUD. | PLPMTUD. | |||
PLPMTUD: Packetization Layer Path MTU Discovery (PLPMTUD), the | PLPMTUD: Packetization Layer Path MTU Discovery (PLPMTUD), the | |||
method described in this document for datagram PLs, which is an | method described in this document for datagram PLs, which is an | |||
extension to Classical PMTU Discovery. | extension to Classical PMTU Discovery. | |||
Probe packet: A probe packet is a datagram sent with a purposely | Probe packet: A probe packet is a datagram sent with a purposely | |||
chosen size (typically the current PLPMTU or larger) to detect if | chosen size (typically the current PLPMTU or larger) to detect if | |||
packets of this size can be successfully sent end-to-end across | packets of this size can be successfully sent end-to-end across | |||
the network path. | the network path. | |||
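The size-related terms above are linked by simple arithmetic. The following Python sketch is illustrative only and is not part of either document version: the function names are hypothetical, and the 48-byte overhead assumes a PL that runs over UDP on IPv6 with no extension headers.

```python
# Illustrative helpers for the size terminology; all names are hypothetical.

IPV6_HEADER = 40   # fixed IPv6 header, assuming no extension headers
UDP_HEADER = 8     # UDP header, assuming the PL runs over UDP


def path_mtu(link_mtus):
    """PMTU: the minimum link MTU of all links forming the path."""
    return min(link_mtus)


def effective_pmtu(plpmtu, below_pl_overhead):
    """Effective PMTU: PLPMTU plus headers added below the PL (IP included)."""
    return plpmtu + below_pl_overhead


def pl_ptb_size(ptb_size, below_pl_overhead):
    """PL_PTB_SIZE: validated PTB_SIZE reduced by headers added below the PL."""
    return ptb_size - below_pl_overhead


overhead = IPV6_HEADER + UDP_HEADER
print(path_mtu([1500, 1400, 9000]))
print(effective_pmtu(1452, overhead))
print(pl_ptb_size(1280, overhead))
```

With these assumed header sizes, a PLPMTU of 1452 bytes corresponds to an effective PMTU of 1500 bytes, and a PTB_SIZE of 1280 bytes corresponds to a PL_PTB_SIZE of 1232 bytes.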
skipping to change at page 11, line 26 | skipping to change at line 497 | |||
TCP protocol mechanisms. Unlike TCP, a datagram PL requires | TCP protocol mechanisms. Unlike TCP, a datagram PL requires | |||
additional mechanisms and considerations to implement PLPMTUD. | additional mechanisms and considerations to implement PLPMTUD. | |||
The requirements for datagram PLPMTUD are: | The requirements for datagram PLPMTUD are: | |||
1. Managing the PLPMTU: For datagram PLs, the PLPMTU is managed by | 1. Managing the PLPMTU: For datagram PLs, the PLPMTU is managed by | |||
DPLPMTUD. A PL MUST NOT send a datagram (other than a probe | DPLPMTUD. A PL MUST NOT send a datagram (other than a probe | |||
packet) with a size at the PL that is larger than the current | packet) with a size at the PL that is larger than the current | |||
PLPMTU. | PLPMTU. | |||
2. Probe packets: The network interface below PL is REQUIRED to | 2. Probe packets: The network interface below the PL is REQUIRED to | |||
provide a way to transmit a probe packet that is larger than the | provide a way to transmit a probe packet that is larger than the | |||
PLPMTU. In IPv4, a probe packet MUST be sent with the Don't | PLPMTU. In IPv4, a probe packet MUST be sent with the Don't | |||
Fragment (DF) bit set in the IP header, and without network layer | Fragment (DF) bit set in the IP header and without network layer | |||
endpoint fragmentation. In IPv6, a probe packet is always sent | endpoint fragmentation. In IPv6, a probe packet is always sent | |||
without source fragmentation (as specified in section 5.4 of | without source fragmentation (as specified in Section 5.4 of | |||
[RFC8201]). | [RFC8201]). | |||
3. Reception feedback: The destination PL endpoint is REQUIRED to | 3. Reception feedback: The destination PL endpoint is REQUIRED to | |||
provide a feedback method that indicates to the DPLPMTUD sender | provide a feedback method that indicates to the DPLPMTUD sender | |||
when a probe packet has been received by the destination PL | when a probe packet has been received by the destination PL | |||
endpoint. Section 6 provides examples of how a PL can provide | endpoint. Section 6 provides examples of how a PL can provide | |||
this acknowledgment of received probe packets. | this acknowledgment of received probe packets. | |||
4. Probe loss recovery: It is RECOMMENDED to use probe packets that | 4. Probe loss recovery: It is RECOMMENDED to use probe packets that | |||
do not carry any user data that would require retransmission if | do not carry any user data that would require retransmission if | |||
lost. Most datagram transports permit this. If a probe packet | lost. Most datagram transports permit this. If a probe packet | |||
contains user data requiring retransmission in case of loss, the | contains user data requiring retransmission in case of loss, the | |||
PL (or layers above) are REQUIRED to arrange any retransmission/ | PL (or layers above) is REQUIRED to arrange any retransmission | |||
repair of any resulting loss. The PL is REQUIRED to be robust in | and/or repair of any resulting loss. The PL is REQUIRED to be | |||
the case where probe packets are lost due to other reasons | robust in the case where probe packets are lost due to other | |||
(including link transmission error, congestion). | reasons (including link transmission error, congestion). | |||
5. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to utilize | 5. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to utilize | |||
information about the maximum size of packet that can be | information about the maximum size of packet that can be | |||
transmitted by the sender on the local link (e.g., the local Link | transmitted by the sender on the local link (e.g., the local link | |||
MTU). A PL sender MAY utilize similar information about the | MTU). A PL sender MAY utilize similar information about the | |||
maximum size of network layer packet that a receiver can accept | maximum size of network-layer packet that a receiver can accept | |||
when this is supplied (note this could be less than EMTU_R). | when this is supplied (note this could be less than EMTU_R). | |||
This avoids implementations trying to send probe packets that can | This avoids implementations trying to send probe packets that | |||
not be transferred by the local link. Too high of a value could | cannot be transferred by the local link. Too high of a value | |||
reduce the efficiency of the search algorithm. Some applications | could reduce the efficiency of the search algorithm. Some | |||
also have a maximum transport protocol data unit (PDU) size, in | applications also have a maximum transport protocol data unit | |||
which case there is no benefit from probing for a size larger | (PDU) size, in which case there is no benefit from probing for a | |||
than this (unless a transport allows multiplexing multiple | size larger than this (unless a transport allows multiplexing | |||
applications PDUs into the same datagram). | multiple applications' PDUs into the same datagram). | |||
6. Processing PTB messages: A DPLPMTUD sender MAY optionally utilize | 6. Processing PTB messages: A DPLPMTUD sender MAY optionally utilize | |||
PTB messages received from the network layer to help identify | PTB messages received from the network layer to help identify | |||
when a network path does not support the current size of probe | when a network path does not support the current size of probe | |||
packet. Any received PTB message MUST be validated before it is | packet. Any received PTB message MUST be validated before it is | |||
used to update the PLPMTU discovery information [RFC8201]. This | used to update the PLPMTU discovery information [RFC8201]. This | |||
validation confirms that the PTB message was sent in response to | validation confirms that the PTB message was sent in response to | |||
a packet originating by the sender, and needs to be performed | a packet originated by the sender and needs to be performed | |||
before the PLPMTU discovery method reacts to the PTB message. A | before the PLPMTU discovery method reacts to the PTB message. A | |||
PTB message MUST NOT be used to increase the PLPMTU [RFC8201], | PTB message MUST NOT be used to increase the PLPMTU [RFC8201] but | |||
but could trigger a probe to test for a larger PLPMTU. A valid | could trigger a probe to test for a larger PLPMTU. A valid | |||
PTB_SIZE is converted to a PL_PTB_SIZE before it is to be used in | PTB_SIZE is converted to a PL_PTB_SIZE before it is to be used in | |||
the DPLPMTUD state machine. A PL_PTB_SIZE that is greater than | the DPLPMTUD state machine. A PL_PTB_SIZE that is greater than | |||
that currently probed SHOULD be ignored. (This PTB message ought | that currently probed SHOULD be ignored. (This PTB message ought | |||
to be discarded without further processing, but could be utilized | to be discarded without further processing but could be utilized | |||
as an input that enables a resilience mode). | as an input that enables a resilience mode). | |||
7. Probing and congestion control: A PL MAY use a congestion | 7. Probing and congestion control: A PL MAY use a congestion | |||
controller to decide when to send a probe packet. If | controller to decide when to send a probe packet. If | |||
transmission of probe packets is limited by the congestion | transmission of probe packets is limited by the congestion | |||
controller, this could result in transmission of probe packets | controller, this could result in transmission of probe packets | |||
being delayed or suspended during congestion. When the | being delayed or suspended during congestion. When the | |||
transmission of probe packets is not controlled by the congestion | transmission of probe packets is not controlled by the congestion | |||
controller, the interval between probe packets MUST be at least | controller, the interval between probe packets MUST be at least | |||
one RTT. Loss of a probe packet SHOULD NOT be treated as an | one RTT. Loss of a probe packet SHOULD NOT be treated as an | |||
indication of congestion and SHOULD NOT trigger a congestion | indication of congestion and SHOULD NOT trigger a congestion | |||
control reaction [RFC4821], because this could result in | control reaction [RFC4821] because this could result in | |||
unnecessary reduction of the sending rate. An update to the | unnecessary reduction of the sending rate. An update to the | |||
PLPMTU (or MPS) MUST NOT increase the congestion window measured | PLPMTU (or MPS) MUST NOT increase the congestion window measured | |||
in bytes [RFC4821]. Therefore, an increase in the packet size | in bytes [RFC4821]. Therefore, an increase in the packet size | |||
does not cause an increase in the data rate in bytes per second. | does not cause an increase in the data rate in bytes per second. | |||
A PL that maintains the congestion window in terms of a limit to | A PL that maintains the congestion window in terms of a limit to | |||
the number of outstanding fixed size packets SHOULD adapt this | the number of outstanding fixed-size packets SHOULD adapt this | |||
limit to compensate for the size of the actual packets. The | limit to compensate for the size of the actual packets. The | |||
transmission of probe packets can interact with the operation of | transmission of probe packets can interact with the operation of | |||
a PL that performs burst mitigation or pacing and could need | a PL that performs burst mitigation or pacing, and the PL could | |||
transmission of probe packets to be regulated by these methods. | need transmission of probe packets to be regulated by these | |||
methods. | ||||
8. Probing and flow control: Flow control at the PL concerns the | 8. Probing and flow control: Flow control at the PL concerns the | |||
end-to-end flow of data using the PL service. Flow control | end-to-end flow of data using the PL service. Flow control | |||
SHOULD NOT apply to DPLPMTUD when probe packets use a design that | SHOULD NOT apply to DPLPMTUD when probe packets use a design that | |||
does not carry user data to the remote application. | does not carry user data to the remote application. | |||
9. Shared PLPMTU state: The PMTU value calculated from the PLPMTU | 9. Shared PLPMTU state: The PMTU value calculated from the PLPMTU | |||
MAY also be stored with the corresponding entry associated with | MAY also be stored with the corresponding entry associated with | |||
the destination in the IP layer cache, and used by other PL | the destination in the IP layer cache and used by other PL | |||
instances. The specification of PLPMTUD [RFC4821] states: "If | instances. The specification of PLPMTUD [RFC4821] states, "If | |||
PLPMTUD updates the MTU for a particular path, all Packetization | PLPMTUD updates the MTU for a particular path, all Packetization | |||
Layer sessions that share the path representation (as described | Layer sessions that share the path representation (as described | |||
in Section 5.2 of [RFC4821]) SHOULD be notified to make use of | in Section 5.2) SHOULD be notified to make use of the new MTU". | |||
the new MTU". Such methods MUST be robust to the wide variety of | Such methods MUST be robust to the wide variety of underlying | |||
underlying network forwarding behaviors. Section 5.2 of | network forwarding behaviors. Section 5.2 of [RFC8201] provides | |||
[RFC8201] provides guidance on the caching of PMTU information | guidance on the caching of PMTU information and also the relation | |||
and also the relation to IPv6 flow labels. | to IPv6 flow labels. | |||
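Two of the requirements above can be sketched in a few lines of Python. This is a hedged illustration, not the state machine specified in Section 5: the names `PtbAction`, `screen_ptb`, and `rescale_packet_window` are hypothetical, PTB validation is assumed to have been done elsewhere and is passed in as a flag, and what the caller does with a `PATH_LIMIT_HINT` result is left open. The first function follows item 6 (validate, convert to PL_PTB_SIZE, never use a PTB to raise the PLPMTU, ignore sizes greater than the probed size). The second follows item 7 for a PL that counts its window in fixed-size packets.

```python
from enum import Enum


class PtbAction(Enum):
    IGNORE = "ignore"            # discard without further processing
    PROBE_CANDIDATE = "probe"    # could trigger a probe; never raises PLPMTU directly
    PATH_LIMIT_HINT = "hint"     # reported size does not exceed the current PLPMTU


def screen_ptb(validated, ptb_size, below_pl_overhead, plpmtu, probed_size):
    """Classify a PTB message; returns (action, PL_PTB_SIZE or None)."""
    if not validated:
        # An unvalidated PTB message must not update discovery state.
        return PtbAction.IGNORE, None
    # Convert PTB_SIZE to PL_PTB_SIZE before using it.
    pl_ptb = ptb_size - below_pl_overhead
    if pl_ptb > probed_size:
        # Greater than the size currently probed: ignored.
        return PtbAction.IGNORE, None
    if pl_ptb > plpmtu:
        # Larger than the PLPMTU: only a hint for a later probe.
        return PtbAction.PROBE_CANDIDATE, pl_ptb
    return PtbAction.PATH_LIMIT_HINT, pl_ptb


def rescale_packet_window(limit_packets, old_size, new_size):
    """Adapt a packet-count window so its byte budget does not grow."""
    return (limit_packets * old_size) // new_size


print(screen_ptb(True, 1400, 48, 1200, 1400))
print(rescale_packet_window(10, 1200, 1400))
```

Rounding the rescaled limit down keeps the byte budget at or below its previous value, which matches the rule that a larger packet size does not increase the congestion window measured in bytes.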
In addition, the following principles are stated for design of a | In addition, the following principles are stated for design of a | |||
DPLPMTUD method: | DPLPMTUD method: | |||
* A PL MAY be designed to segment data blocks larger than the MPS | * A PL MAY be designed to segment data blocks larger than the MPS | |||
into multiple datagrams. However, not all datagram PLs support | into multiple datagrams. However, not all datagram PLs support | |||
segmentation of data blocks. It is RECOMMENDED that methods avoid | segmentation of data blocks. It is RECOMMENDED that methods avoid | |||
forcing an application to use an arbitrary small MPS for | forcing an application to use an arbitrary small MPS for | |||
transmission while the method is searching for the currently | transmission while the method is searching for the currently | |||
supported PLPMTU. A reduced MPS can adversely impact the | supported PLPMTU. A reduced MPS can adversely impact the | |||
performance of an application. | performance of an application. | |||
* To assist applications in choosing a suitable data block size, the | * To assist applications in choosing a suitable data block size, the | |||
PL is RECOMMENDED to provide a primitive that returns the MPS | PL is RECOMMENDED to provide a primitive that returns the MPS | |||
derived from the PLPMTU to the higher layer using the PL. The | derived from the PLPMTU to the higher layer using the PL. The | |||
value of the MPS can change following a change in the path, or | value of the MPS can change following a change in the path or loss | |||
loss of probe packets. | of probe packets. | |||
* Path validation: It is RECOMMENDED that methods are robust to path | * Path validation: It is RECOMMENDED that methods are robust to path | |||
changes that could have occurred since the path characteristics | changes that could have occurred since the path characteristics | |||
were last confirmed, and to the possibility of inconsistent path | were last confirmed and to the possibility of inconsistent path | |||
information being received. | information being received. | |||
* Datagram reordering: A method is REQUIRED to be robust to the | * Datagram reordering: A method is REQUIRED to be robust to the | |||
possibility that a flow encounters reordering, or the traffic | possibility that a flow encounters reordering or that the traffic | |||
(including probe packets) is divided over more than one network | (including probe packets) is divided over more than one network | |||
path. | path. | |||
* Datagram delay and duplication: The feedback mechanism is REQUIRED | * Datagram delay and duplication: The feedback mechanism is REQUIRED | |||
to be robust to the possibility that packets could be | to be robust to the possibility that packets could be | |||
significantly delayed or duplicated along a network path. | significantly delayed or duplicated along a network path. | |||
* When to probe: It is RECOMMENDED that methods determine whether | * When to probe: It is RECOMMENDED that methods determine whether | |||
the path has changed since it last measured the path. This can | the path has changed since it last measured the path. This can | |||
help determine when to probe the path again. | help determine when to probe the path again. | |||
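The first two principles above (segmentation and an MPS primitive) can be illustrated with a short Python sketch. It is an assumption-laden example rather than a specified interface: `current_mps` and `segment` are hypothetical names, and deriving the MPS by subtracting a fixed PL header overhead from the PLPMTU is a simplification chosen for illustration.

```python
from typing import List


def current_mps(plpmtu: int, pl_overhead: int) -> int:
    """Hypothetical primitive: MPS derived from the PLPMTU.

    Assumes the only difference is a fixed PL header overhead.
    """
    return max(plpmtu - pl_overhead, 0)


def segment(data: bytes, mps: int) -> List[bytes]:
    """Split a data block larger than the MPS into MPS-sized pieces."""
    if mps <= 0:
        raise ValueError("MPS must be positive")
    return [data[i:i + mps] for i in range(0, len(data), mps)]


mps = current_mps(plpmtu=1232, pl_overhead=12)
chunks = segment(b"x" * 2500, mps)
print(mps, [len(c) for c in chunks])
```

Because the MPS can change after a path change or probe loss, a caller would query the primitive before each send rather than cache the value.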
skipping to change at page 14, line 29 | skipping to change at line 644 | |||
[RFC4821]. In contrast, a datagram PL that constructs a probe packet | [RFC4821]. In contrast, a datagram PL that constructs a probe packet | |||
has to either request an application to send a data block that is | has to either request an application to send a data block that is | |||
larger than that generated by an application, or to utilize padding | larger than that generated by an application, or to utilize padding | |||
functions to extend a datagram beyond the size of the application | functions to extend a datagram beyond the size of the application | |||
data block. Protocols that permit exchange of control messages | data block. Protocols that permit exchange of control messages | |||
(without an application data block) can generate a probe packet by | (without an application data block) can generate a probe packet by | |||
extending a control message with padding data. The total size of a | extending a control message with padding data. The total size of a | |||
probe packet includes all headers and padding added to the payload | probe packet includes all headers and padding added to the payload | |||
data being sent (e.g., including protocol option fields, security- | data being sent (e.g., including protocol option fields, security- | |||
related fields such as an Authenticated Encryption with Associated | related fields such as an Authenticated Encryption with Associated | |||
Data (AEAD) tag and TLS record layer padding). | Data (AEAD) tag, and TLS record layer padding). | |||
A receiver is REQUIRED to be able to distinguish an in-band data | A receiver is REQUIRED to be able to distinguish an in-band data | |||
block from any added padding. This is needed to ensure that any | block from any added padding. This is needed to ensure that any | |||
added padding is not passed on to an application at the receiver. | added padding is not passed on to an application at the receiver. | |||
This results in three possible ways that a sender can create a probe | This results in three possible ways that a sender can create a probe | |||
packet: | packet: | |||
Probing using padding data: A probe packet that contains only | Probing using padding data: A probe packet that contains only | |||
control information together with any padding, which is needed to | control information together with any padding, which is needed to | |||
be inflated to the size of the probe packet. Since these probe | inflate to the size of the probe packet. Since these probe | |||
packets do not carry an application-supplied data block, they do | packets do not carry an application-supplied data block, they do | |||
not typically require retransmission, although they do still | not typically require retransmission, although they do still | |||
consume network capacity and incur endpoint processing. | consume network capacity and incur endpoint processing. | |||
Probing using application data and padding data: A probe packet that | Probing using application data and padding data: A probe packet that | |||
contains a data block supplied by an application that is combined | contains a data block supplied by an application that is combined | |||
with padding to inflate the length of the datagram to the size of | with padding to inflate the length of the datagram to the size of | |||
the probe packet. | the probe packet. | |||
Probing using application data: A probe packet that contains a data | Probing using application data: A probe packet that contains a data | |||
block supplied by an application that matches the size of the | block supplied by an application that matches the size of the | |||
probe packet. This method requests the application to issue a | probe packet. This method requests the application to issue a | |||
data block of the desired probe size. | data block of the desired probe size. | |||
A PL that uses a probe packet carrying application data and needs | A PL that uses a probe packet carrying application data and that | |||
protection from the loss of this probe packet could perform | needs protection from the loss of this probe packet could perform | |||
transport-layer retransmission/repair of the data block (e.g., by | transport-layer retransmission/repair of the data block (e.g., by | |||
retransmission after loss is detected or by duplicating the data | retransmitting after loss is detected or by duplicating the data | |||
block in a datagram without the padding data). This retransmitted | block in a datagram without the padding data). This retransmitted | |||
data block might possibly need to be sent using a smaller PLPMTU, | data block might possibly need to be sent using a smaller PLPMTU, | |||
which could force the PL to to use a smaller packet size to traverse | which could force the PL to use a smaller packet size to traverse the | |||
the end-to-end path. (This could utilize endpoint network-layer | end-to-end path. (This could utilize endpoint network-layer | |||
fragmentation or a PL that can re-segment the data block into | fragmentation or a PL that can resegment the data block into multiple | |||
multiple datagrams). | datagrams). | |||
DPLPMTUD MAY choose to use only one of these methods to simplify the | DPLPMTUD MAY choose to use only one of these methods to simplify the | |||
implementation. | implementation. | |||
Probe messages sent by a PL MUST contain enough information to | Probe messages sent by a PL MUST contain enough information to | |||
uniquely identify the probe within Maximum Segment Lifetime (e.g., | uniquely identify the probe within the Maximum Segment Lifetime | |||
including a unique identifier from the PL or the DPLPMTUD | (e.g., including a unique identifier from the PL or the DPLPMTUD | |||
implementation), while being robust to reordering and replay of probe | implementation), while being robust to reordering and replay of probe | |||
response and PTB messages. | response and PTB messages. | |||
4.2. Confirmation of Probed Packet Size | 4.2. Confirmation of Probed Packet Size | |||
The PL needs a method to determine (confirm) when probe packets have | The PL needs a method to determine (confirm) when probe packets have | |||
been successfully received end-to-end across a network path. | been successfully received end-to-end across a network path. | |||
Transport protocols can include end-to-end methods that detect and | Transport protocols can include end-to-end methods that detect and | |||
report reception of specific datagrams that they send (e.g., DCCP, | report reception of specific datagrams that they send (e.g., DCCP, | |||
skipping to change at page 16, line 5 ¶ | skipping to change at line 714 ¶ | |||
PLs need to rely on an application protocol to detect this loss. | PLs need to rely on an application protocol to detect this loss. | |||
Section 6 specifies this function for a set of IETF-specified | Section 6 specifies this function for a set of IETF-specified | |||
protocols. | protocols. | |||
4.3. Black Hole Detection and Reducing the PLPMTU | 4.3. Black Hole Detection and Reducing the PLPMTU | |||
The description that follows uses the set of constants defined in | The description that follows uses the set of constants defined in | |||
Section 5.1.2 and variables defined in Section 5.1.3. | Section 5.1.2 and variables defined in Section 5.1.3. | |||
Black Hole Detection is triggered by an indication that the network | Black hole detection is triggered by an indication that the network | |||
path could be unable to support the current PLPMTU size. | path could be unable to support the current PLPMTU size. | |||
There are three indicators that can detect black holes: | There are three indicators that can be used to detect black holes: | |||
* A validated PTB message can be received that indicates a | * A validated PTB message can be received that indicates a | |||
PL_PTB_SIZE less than the current PLPMTU. A DPLPMTUD method MUST | PL_PTB_SIZE less than the current PLPMTU. A DPLPMTUD method MUST | |||
NOT rely solely on this method. | NOT rely solely on this method. | |||
* A PL can use the DPLPMTUD probing mechanism to periodically | * A PL can use the DPLPMTUD probing mechanism to periodically | |||
generate probe packets of the size of the current PLPMTU (e.g., | generate probe packets of the size of the current PLPMTU (e.g., | |||
using the confirmation timer Section 5.1.1). A timer tracks | using the CONFIRMATION_TIMER, Section 5.1.1). A timer tracks | |||
whether acknowledgments are received. Successive loss of probes | whether acknowledgments are received. Successive loss of probes | |||
is an indication that the current path no longer supports the | is an indication that the current path no longer supports the | |||
PLPMTU (e.g., when the number of probe packets sent without | PLPMTU (e.g., when the number of probe packets sent without | |||
receiving an acknowledgment, PROBE_COUNT, becomes greater than | receiving an acknowledgment, PROBE_COUNT, becomes greater than | |||
MAX_PROBES). | MAX_PROBES). | |||
* A PL can utilize an event that indicates the network path no | * A PL can utilize an event that indicates the network path no | |||
longer sustains the sender's PLPMTU size. This could use a | longer sustains the sender's PLPMTU size. This could use a | |||
mechanism implemented within the PL to detect excessive loss of | mechanism implemented within the PL to detect excessive loss of | |||
data sent with a specific packet size and then conclude that this | data sent with a specific packet size and then conclude that this | |||
skipping to change at page 16, line 40 ¶ | skipping to change at line 749 ¶ | |||
The three methods can result in different transmission patterns for | The three methods can result in different transmission patterns for | |||
packet probes and are expected to result in different responsiveness | packet probes and are expected to result in different responsiveness | |||
following a change in the actual PMTU. | following a change in the actual PMTU. | |||
A PL MAY inhibit sending probe packets when no application data has | A PL MAY inhibit sending probe packets when no application data has | |||
been sent since the previous probe packet. A PL that resumes sending | been sent since the previous probe packet. A PL that resumes sending | |||
user data MAY continue PLPMTU discovery for each path. This allows | user data MAY continue PLPMTU discovery for each path. This allows | |||
it to use an up-to-date PLPMTU. However, this could result in | it to use an up-to-date PLPMTU. However, this could result in | |||
additional packets being sent. | additional packets being sent. | |||
When the method detects the current PLPMTU is not supported, DPLPMTUD | When the method detects that the current PLPMTU is not supported, | |||
sets a lower PLPMTU, and sets a lower MPS. The PL then confirms that | DPLPMTUD sets a lower PLPMTU and a lower MPS. The PL then confirms | |||
the new PLPMTU can be successfully used across the path. A probe | that the new PLPMTU can be successfully used across the path. A | |||
packet could need to have a size less than the size of the data block | probe packet could need to be smaller than the size of the data block | |||
generated by the application. | generated by the application. | |||
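The probe-counting indicator described above can be sketched as
follows. This is a minimal illustration, not the specified state
machine: the class and method names are hypothetical, and the trigger
condition follows the wording "PROBE_COUNT becomes greater than
MAX_PROBES" used in the text.

```python
from dataclasses import dataclass

MAX_PROBES = 3  # default value stated in Section 5.1.2


@dataclass
class BlackHoleDetector:
    """Counts consecutive unacknowledged probes sent at the current PLPMTU.

    Hypothetical helper for illustration; names are not from the document.
    """

    max_probes: int = MAX_PROBES
    probe_count: int = 0  # plays the role of PROBE_COUNT

    def on_probe_timeout(self) -> bool:
        """Record a probe whose acknowledgment timer expired.

        Returns True once PROBE_COUNT is greater than MAX_PROBES, which
        this sketch treats as the black hole indication.
        """
        self.probe_count += 1
        return self.probe_count > self.max_probes

    def on_probe_acked(self) -> None:
        """An acknowledged probe confirms the PLPMTU; reset the counter."""
        self.probe_count = 0


detector = BlackHoleDetector()
timeouts = [detector.on_probe_timeout() for _ in range(4)]
```

With the default of 3, the first three timeouts are tolerated as
isolated loss and only the fourth consecutive timeout is reported. What
the PL does next (lowering the PLPMTU and MPS, then confirming the new
size) is left to the caller.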
4.4. The Maximum Packet Size (MPS) | 4.4. The Maximum Packet Size (MPS) | |||
The result of probing determines a usable PLPMTU, which is used to | The result of probing determines a usable PLPMTU, which is used to | |||
set the MPS used by the application. The MPS is smaller than the | set the MPS used by the application. The MPS is smaller than the | |||
PLPMTU because it is reduced by the size of PL headers (including the | PLPMTU because it is reduced by the size of PL headers (including the | |||
overhead of security-related fields such as an AEAD tag and TLS | overhead of security-related fields such as an AEAD tag and TLS | |||
record layer padding). The relationship between the MPS and the | record layer padding). The relationship between the MPS and the | |||
PLPMTUD is illustrated in Figure 1. | PLPMTUD is illustrated in Figure 1. | |||
any additional | Any additional | |||
headers .--- MPS -----. | headers .--- MPS -----. | |||
| | | | | | | | |||
v v v | v v v | |||
+------------------------------+ | +------------------------------+ | |||
| IP | ** | PL | protocol data | | | IP | ** | PL | protocol data | | |||
+------------------------------+ | +------------------------------+ | |||
<----- PLPMTU -----> | <----- PLPMTU -----> | |||
<---------- PMTU --------------> | <---------- PMTU --------------> | |||
skipping to change at page 17, line 42 ¶ | skipping to change at line 792 ¶ | |||
DPLPMTUD seeks to avoid IP fragmentation. An attempt to send a data | DPLPMTUD seeks to avoid IP fragmentation. An attempt to send a data | |||
block larger than the MPS will therefore fail if a PL is unable to | block larger than the MPS will therefore fail if a PL is unable to | |||
segment data. To determine the largest data block that can be sent, | segment data. To determine the largest data block that can be sent, | |||
a PL SHOULD provide applications with a primitive that returns the | a PL SHOULD provide applications with a primitive that returns the | |||
MPS, derived from the current PLPMTU. | MPS, derived from the current PLPMTU. | |||
If DPLPMTUD results in a change to the MPS, the application needs to | If DPLPMTUD results in a change to the MPS, the application needs to | |||
adapt to the new MPS. A particular case can arise when packets have | adapt to the new MPS. A particular case can arise when packets have | |||
been sent with a size less than the MPS and the PLPMTU was | been sent with a size less than the MPS and the PLPMTU was | |||
subsequently reduced. If these packets are lost, the PL MAY segment | subsequently reduced. If these packets are lost, the PL MAY segment | |||
the data using the new MPS. If a PL is unable to re-segment a | the data using the new MPS. If a PL is unable to resegment a | |||
previously sent datagram (e.g., [RFC4960]), then the sender either | previously sent datagram (e.g., [RFC4960]), then the sender either | |||
discards the datagram or could perform retransmission using network- | discards the datagram or could perform retransmission using network- | |||
layer fragmentation to form multiple IP packets not larger than the | layer fragmentation to form multiple IP packets not larger than the | |||
PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is | PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is | |||
preferred over clearing the DF bit in the IPv4 header. Operational | preferred over clearing the DF bit in the IPv4 header. Operational | |||
experience reveals that IP fragmentation can reduce the reliability | experience reveals that IP fragmentation can reduce the reliability | |||
of Internet communication [I-D.ietf-intarea-frag-fragile], which may | of Internet communication [RFC8900], which may reduce the probability | |||
reduce the probability of successful retransmission. | of successful retransmission. | |||
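The relationship between the PLPMTU and the MPS described in this
section amounts to a subtraction. The sketch below assumes a
hypothetical primitive, mps_from_plpmtu(), and illustrative byte counts
(an 8-byte PL header and a 16-byte AEAD tag) that are not taken from
the document.

```python
def mps_from_plpmtu(plpmtu: int, pl_overhead: int) -> int:
    """Return the MPS: the PLPMTU reduced by the PL header overhead.

    pl_overhead covers everything the PL adds around the application
    data block (PL headers and security-related fields such as an AEAD
    tag or record-layer padding).
    """
    if pl_overhead < 0 or pl_overhead > plpmtu:
        raise ValueError("PL overhead must be between 0 and the PLPMTU")
    return plpmtu - pl_overhead


def fits_in_one_datagram(data_len: int, mps: int) -> bool:
    """True if a data block can be sent without segmentation."""
    return data_len <= mps


# Illustrative numbers only: 8-byte PL header plus a 16-byte AEAD tag.
mps = mps_from_plpmtu(plpmtu=1200, pl_overhead=8 + 16)
```

An application would call such a primitive again whenever DPLPMTUD
changes the PLPMTU, since a data block that fit under the old MPS may
no longer fit under the new one.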
4.5. Disabling the Effect of PMTUD | 4.5. Disabling the Effect of PMTUD | |||
A PL implementing this specification MUST suspend network layer | A PL implementing this specification MUST suspend network layer | |||
processing of outgoing packets that enforces a PMTU | processing of outgoing packets that enforces a PMTU | |||
[RFC1191][RFC8201] for each flow utilizing DPLPMTUD, and instead use | [RFC1191][RFC8201] for each flow utilizing DPLPMTUD and instead use | |||
DPLPMTUD to control the size of packets that are sent by a flow. | DPLPMTUD to control the size of packets that are sent by a flow. | |||
This removes the need for the network layer to drop or fragment sent | This removes the need for the network layer to drop or to fragment | |||
packets that have a size greater than the PMTU. | sent packets that have a size greater than the PMTU. | |||
4.6. Response to PTB Messages | 4.6. Response to PTB Messages | |||
This method requires the DPLPMTUD sender to validate any received PTB | This method requires the DPLPMTUD sender to validate any received PTB | |||
message before using the PTB information. The response to a PTB | message before using the PTB information. The response to a PTB | |||
message depends on the PL_PTB_SIZE calculated from the PTB_SIZE in | message depends on the PL_PTB_SIZE calculated from the PTB_SIZE in | |||
the PTB message, the state of the PLPMTUD state machine, and the IP | the PTB message, the state of the PLPMTUD state machine, and the IP | |||
protocol being used. | protocol being used. | |||
Section 4.6.1 first describes validation for both IPv4 ICMP | Section 4.6.1 describes validation for both IPv4 ICMP Unreachable | |||
Unreachable messages (type 3) and ICMPv6 Packet Too Big messages, | messages (Type 3) and ICMPv6 Packet Too Big messages, both of which | |||
both of which are referred to as PTB messages in this document. | are referred to as PTB messages in this document. | |||
4.6.1. Validation of PTB Messages | 4.6.1. Validation of PTB Messages | |||
This section specifies utilization and validation of PTB messages. | This section specifies utilization and validation of PTB messages. | |||
* A simple implementation MAY ignore received PTB messages and in | * A simple implementation MAY ignore received PTB messages, and in | |||
this case the PLPMTU is not updated when a PTB message is | this case, the PLPMTU is not updated when a PTB message is | |||
received. | received. | |||
* A PL that supports PTB messages MUST validate these messages | * A PL that supports PTB messages MUST validate these messages | |||
before they are further processed. | before they are further processed. | |||
A PL that receives a PTB message from a router or middlebox performs | A PL that receives a PTB message from a router or middlebox performs | |||
ICMP validation (see Section 4 of [RFC8201] and Section 5.2 of | ICMP validation (see Section 4 of [RFC8201] and Section 5.2 of | |||
[BCP145]). Because DPLPMTUD operates at the PL, the PL needs to | [BCP145]). Because DPLPMTUD operates at the PL, the PL needs to | |||
check that each received PTB message is received in response to a | check that each received PTB message is received in response to a | |||
packet transmitted by the endpoint PL performing DPLPMTUD. | packet transmitted by the endpoint PL performing DPLPMTUD. | |||
The PL MUST check the protocol information in the quoted packet | The PL MUST check the protocol information in the quoted packet | |||
carried in an ICMP PTB message payload to validate the message | carried in an ICMP PTB message payload to validate the message | |||
originated from the sending node. This validation includes | originated from the sending node. This validation includes | |||
determining that the combination of the IP addresses, the protocol, | determining that the combination of the IP addresses, the protocol, | |||
the source port and destination port match those returned in the | the source port, and destination port match those returned in the | |||
quoted packet - this is also necessary for the PTB message to be | quoted packet -- this is also necessary for the PTB message to be | |||
passed to the corresponding PL. | passed to the corresponding PL. | |||
The validation SHOULD utilize information that it is not simple for | The validation SHOULD utilize information that is not simple for an | |||
an off-path attacker to determine [BCP145]. For example, it could | off-path attacker to determine [BCP145]. For example, it could check | |||
check the value of a protocol header field known only to the two PL | the value of a protocol header field known only to the two PL | |||
endpoints. A datagram application that uses well-known source and | endpoints. A datagram application that uses well-known source and | |||
destination ports ought to also rely on other information to complete | destination ports ought to also rely on other information to complete | |||
this validation. | this validation. | |||
These checks are intended to provide protection from packets that | These checks are intended to provide protection from packets that | |||
originate from a node that is not on the network path. A PTB message | originate from a node that is not on the network path. A PTB message | |||
that does not complete the validation MUST NOT be further utilized by | that does not complete the validation MUST NOT be further utilized by | |||
the DPLPMTUD method, as discussed in the Security Considerations | the DPLPMTUD method, as discussed in the Security Considerations | |||
section. | section (Section 8). | |||
Section 4.6.2 describes this processing of PTB messages. | Section 4.6.2 describes this processing of PTB messages. | |||
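The quoted-packet checks above can be sketched as a comparison against
the state the PL holds for the flow. The FlowTuple type, the
validate_ptb() function, and the optional token parameters are
hypothetical names chosen for illustration; the token stands in for "a
protocol header field known only to the two PL endpoints", and parsing
the ICMP payload is assumed to have been done elsewhere.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowTuple:
    """Addressing information for one flow (illustrative type)."""

    src_addr: str
    dst_addr: str
    protocol: int
    src_port: int
    dst_port: int


def validate_ptb(
    quoted: FlowTuple,
    sent: FlowTuple,
    quoted_token: Optional[bytes] = None,
    expected_token: Optional[bytes] = None,
) -> bool:
    """Return True if the packet quoted in a PTB message matches the flow.

    The addresses, protocol, and ports must all match. If the PL has a
    value that an off-path attacker cannot easily guess
    (expected_token), the quoted packet must carry the same value.
    """
    if quoted != sent:
        return False
    if expected_token is not None:
        if quoted_token is None:
            return False
        # Constant-time comparison of the two byte strings.
        return hmac.compare_digest(quoted_token, expected_token)
    return True


flow = FlowTuple("192.0.2.1", "198.51.100.7", 17, 49152, 4433)
spoofed = FlowTuple("192.0.2.1", "198.51.100.7", 17, 49152, 4434)
```

A PTB message for which this check fails would not be passed to the
DPLPMTUD method at all.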
4.6.2. Use of PTB Messages | 4.6.2. Use of PTB Messages | |||
PTB messages that have been validated MAY be utilized by the DPLPMTUD | PTB messages that have been validated MAY be utilized by the DPLPMTUD | |||
algorithm, but MUST NOT be used directly to set the PLPMTU. | algorithm but MUST NOT be used directly to set the PLPMTU. | |||
Before using the size reported in the PTB message it must first be | Before using the size reported in the PTB message, it must first be | |||
converted to a PL_PTB_SIZE. The PL_PTB_SIZE is smaller than the | converted to a PL_PTB_SIZE. The PL_PTB_SIZE is smaller than the | |||
PTB_SIZE because it is reduced by headers below the PL including any | PTB_SIZE because it is reduced by headers below the PL, including any | |||
IP options or extensions added to the PL packet. | IP options or extensions added to the PL packet. | |||
A method that utilizes these PTB messages can improve the speed at | A method that utilizes these PTB messages can improve the speed at | |||
which the algorithm detects an appropriate PLPMTU by triggering an | which the algorithm detects an appropriate PLPMTU by triggering an | |||
immediate probe for the PL_PTB_SIZE (resulting in a network-layer | immediate probe for the PL_PTB_SIZE (resulting in a network-layer | |||
packet of size PTB_SIZE), compared to one that relies solely on | packet of size PTB_SIZE), compared to one that relies solely on | |||
probing using a timer-based search algorithm. | probing using a timer-based search algorithm. | |||
A set of checks are intended to provide protection from a router that | A set of checks are intended to provide protection from a router that | |||
reports an unexpected PTB_SIZE. The PL also needs to check that the | reports an unexpected PTB_SIZE. The PL also needs to check that the | |||
indicated PL_PTB_SIZE is less than the size used by probe packets and | indicated PL_PTB_SIZE is less than the size used by probe packets and | |||
at least the minimum size accepted. | at least the minimum size accepted. | |||
This section provides a summary of how PTB messages can be utilized. | This section provides a summary of how PTB messages can be utilized, | |||
(This uses the set of constants defined in Section 5.1.2). This | using the set of constants defined in Section 5.1.2. This processing | |||
processing depends on the PL_PTB_SIZE and the current value of a set | depends on the PL_PTB_SIZE and the current value of a set of | |||
of variables: | variables: | |||
PL_PTB_SIZE < MIN_PLPMTU | PL_PTB_SIZE < MIN_PLPMTU | |||
* Invalid PL_PTB_SIZE see Section 4.6.1. | * Invalid PL_PTB_SIZE, see Section 4.6.1. | |||
* PTB message ought to be discarded without further processing | * PTB message ought to be discarded without further processing | |||
(i.e., PLPMTU is not modified). | (i.e., PLPMTU is not modified). | |||
* The information could be utilized as an input that triggers | * The information could be utilized as an input that triggers the | |||
enabling a resilience mode (see Section 5.3.3). | enabling of a resilience mode (see Section 5.3.3). | |||
MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU | MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU | |||
* A robust PL MAY enter an error state (see Section 5.2) for an | * A robust PL MAY enter an error state (see Section 5.2) for an | |||
IPv4 path when the PL_PTB_SIZE reported in the PTB message is | IPv4 path when the PL_PTB_SIZE reported in the PTB message is | |||
larger than or equal to 68 bytes [RFC0791] and when this is | larger than or equal to 68 bytes [RFC0791] and when this is | |||
less than the BASE_PLPMTU. | less than the BASE_PLPMTU. | |||
* A robust PL MAY enter an error state (see Section 5.2) for an | * A robust PL MAY enter an error state (see Section 5.2) for an | |||
IPv6 path when the PL_PTB_SIZE reported in the PTB message is | IPv6 path when the PL_PTB_SIZE reported in the PTB message is | |||
larger than or equal to 1280 bytes [RFC8200] and when this is | larger than or equal to 1280 bytes [RFC8200] and when this is | |||
skipping to change at page 20, line 41 ¶ | skipping to change at line 934 ¶ | |||
* The PL can use the reported PL_PTB_SIZE from the PTB message as | * The PL can use the reported PL_PTB_SIZE from the PTB message as | |||
the next search point when it resumes the search algorithm. | the next search point when it resumes the search algorithm. | |||
PL_PTB_SIZE >= PROBED_SIZE | PL_PTB_SIZE >= PROBED_SIZE | |||
* Inconsistent network signal. | * Inconsistent network signal. | |||
* PTB message ought to be discarded without further processing | * PTB message ought to be discarded without further processing | |||
(i.e., PLPMTU is not modified). | (i.e., PLPMTU is not modified). | |||
* The information could be utilized as an input to trigger | * The information could be utilized as an input to trigger the | |||
enabling a resilience mode. | enabling of a resilience mode. | |||
5. Datagram Packetization Layer PMTUD | 5. Datagram Packetization Layer PMTUD | |||
This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | |||
be introduced at various points (as indicated with * in the figure | be introduced at various points (as indicated with * in Figure 2) in | |||
below) in the IP protocol stack to discover the PLPMTU so that an | the IP protocol stack to discover the PLPMTU so that an application | |||
application can utilize an appropriate MPS for the current network | can utilize an appropriate MPS for the current network path. | |||
path. | ||||
DPLPMTUD SHOULD only be performed at one layer between a pair of | DPLPMTUD SHOULD only be performed at one layer between a pair of | |||
endpoints. Therefore, an upper PL or application should avoid using | endpoints. Therefore, an upper PL or application should avoid using | |||
DPLPMTUD when this is already enabled in a lower layer. A PL MUST | DPLPMTUD when this is already enabled in a lower layer. A PL MUST | |||
adjust the MPS indicated by DPLPMTUD to account for any additional | adjust the MPS indicated by DPLPMTUD to account for any additional | |||
overhead introduced by the PL. | overhead introduced by the PL. | |||
+----------------------+ | +----------------------+ | |||
| Application* | | | Application* | | |||
+-----+------------+---+ | +-----+------------+---+ | |||
skipping to change at page 21, line 29 ¶ | skipping to change at line 968 ¶ | |||
+---+ +----+ | | +---+ +----+ | | |||
| | | | | | | | |||
+-+--+-+ | | +-+--+-+ | | |||
| UDP | | | | UDP | | | |||
+---+--+ | | +---+--+ | | |||
| | | | | | |||
+-----------+-------+--+ | +-----------+-------+--+ | |||
| Network Interface | | | Network Interface | | |||
+----------------------+ | +----------------------+ | |||
Figure 2: Examples where DPLPMTUD can be implemented | Figure 2: Examples Where DPLPMTUD Can Be Implemented | |||
The central idea of DPLPMTUD is probing by a sender. Probe packets | The central idea of DPLPMTUD is probing by a sender. Probe packets | |||
are sent to find the maximum size of user message that can be | are sent to find the maximum size of user message that can be | |||
completely transferred across the network path from the sender to the | completely transferred across the network path from the sender to the | |||
destination. | destination. | |||
The following sections identify the components needed for | The following sections identify the components needed for | |||
implementation, provides an overview of the phases of operation, and | implementation, provide an overview of the phases of operation, and | |||
specifies the state machine and search algorithm. | specify the state machine and search algorithm. | |||
5.1. DPLPMTUD Components | 5.1. DPLPMTUD Components | |||
This section describes the timers, constants, and variables of | This section describes the timers, constants, and variables of | |||
DPLPMTUD. | DPLPMTUD. | |||
5.1.1. Timers | 5.1.1. Timers | |||
The method utilizes up to three timers: | The method utilizes up to three timers: | |||
PROBE_TIMER: The PROBE_TIMER is configured to expire after a period | PROBE_TIMER: The PROBE_TIMER is configured to expire after a period | |||
longer than the maximum time to receive an acknowledgment to a | longer than the maximum time to receive an acknowledgment to a | |||
probe packet. This value MUST NOT be smaller than 1 second, and | probe packet. This value MUST NOT be smaller than 1 second and | |||
SHOULD be larger than 15 seconds. Guidance on selection of the | SHOULD be larger than 15 seconds. Guidance on the selection of | |||
timer value are provided in Section 3.1.1 of the UDP Usage | the timer value is provided in Section 3.1.1 of the UDP Usage | |||
Guidelines [BCP145]. | Guidelines [BCP145]. | |||
PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a | PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a | |||
sender will continue to use the current PLPMTU, after which it re- | sender will continue to use the current PLPMTU, after which it | |||
enters the Search phase. This timer has a period of 600 seconds, | reenters the Search Phase. This timer has a period of 600 | |||
as recommended by PLPMTUD [RFC4821]. | seconds, as recommended by PLPMTUD [RFC4821]. | |||
DPLPMTUD MAY inhibit sending probe packets when no application | DPLPMTUD MAY inhibit sending probe packets when no application | |||
data has been sent since the previous probe packet. A PL | data has been sent since the previous probe packet. A PL | |||
preferring to use an up-to-date PMTU once user data is sent again, | preferring to use an up-to-date PMTU once user data is sent again | |||
can choose to continue PMTU discovery for each path. However, | can choose to continue PMTU discovery for each path. However, | |||
this will result in sending additional packets. | this will result in sending additional packets. | |||
CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST | CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST | |||
NOT be used. For other PLs, the CONFIRMATION_TIMER is configured | NOT be used. For other PLs, the CONFIRMATION_TIMER is configured | |||
to the period a PL sender waits before confirming the current | to the period a PL sender waits before confirming the current | |||
PLPMTU is still supported. This is less than the PMTU_RAISE_TIMER | PLPMTU is still supported. This is less than the PMTU_RAISE_TIMER | |||
and used to decrease the PLPMTU (e.g., when a black hole is | and used to decrease the PLPMTU (e.g., when a black hole is | |||
encountered). Confirmation needs to be frequent enough when data | encountered). Confirmation needs to be frequent enough when data | |||
is flowing that the sending PL does not black hole extensive | is flowing that the sending PL does not black hole extensive | |||
amounts of traffic. Guidance on selection of the timer value is | amounts of traffic. Guidance on selection of the timer value is | |||
provided in Section 3.1.1 of the UDP Usage Guidelines [BCP145]. | provided in Section 3.1.1 of the UDP Usage Guidelines [BCP145]. | |||
DPLPMTUD MAY inhibit sending probe packets when no application | DPLPMTUD MAY inhibit sending probe packets when no application | |||
data has been sent since the previous probe packet. A PL | data has been sent since the previous probe packet. A PL | |||
preferring to use an up-to-date PMTU once user data is sent again, | preferring to use an up-to-date PMTU once user data is sent again, | |||
can choose to continue PMTU discovery for each path. However, | can choose to continue PMTU discovery for each path. However, | |||
this could result in sending additional packets. | this could result in sending additional packets. | |||
DPLPMTD specifies various timers, however an implementation could | DPLPMTUD specifies various timers; however, an implementation could | |||
choose to realise these timer functions using a single timer. | choose to realize these timer functions using a single timer. | |||
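The constraints on the three timers can be collected into a small
configuration check. This is a sketch under stated assumptions: the
TimerConfig type and its check() method are hypothetical, the 30-second
probe timer default is an illustrative choice, and treating "less than
the PMTU_RAISE_TIMER" as a hard error is a decision of this sketch
rather than a stated requirement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TimerConfig:
    """Timer periods in seconds (illustrative container)."""

    probe_timer: float = 30.0        # hypothetical default
    pmtu_raise_timer: float = 600.0  # period given in the text
    confirmation_timer: Optional[float] = None  # None means not used

    def check(self, acknowledged_pl: bool) -> List[str]:
        """Raise ValueError on a MUST violation; return SHOULD warnings."""
        if self.probe_timer < 1.0:
            raise ValueError("PROBE_TIMER MUST NOT be smaller than 1 second")
        if acknowledged_pl and self.confirmation_timer is not None:
            raise ValueError(
                "CONFIRMATION_TIMER MUST NOT be used with an acknowledged PL"
            )
        if (
            self.confirmation_timer is not None
            and self.confirmation_timer >= self.pmtu_raise_timer
        ):
            # Assumption of this sketch: enforce the ordering strictly.
            raise ValueError("CONFIRMATION_TIMER should be below PMTU_RAISE_TIMER")
        warnings: List[str] = []
        if self.probe_timer <= 15.0:
            warnings.append("PROBE_TIMER SHOULD be larger than 15 seconds")
        return warnings


short_probe = TimerConfig(probe_timer=5.0, confirmation_timer=60.0)
```

Because an implementation may realize all three functions with a single
timer, a check like this constrains only the configured periods and
says nothing about how the timers are scheduled.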
5.1.2. Constants | 5.1.2. Constants | |||
The following constants are defined: | The following constants are defined: | |||
MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | |||
counter (see Section 5.1.3). MAX_PROBES represents the limit for | counter (see Section 5.1.3). MAX_PROBES represents the limit for | |||
the number of consecutive probe attempts of any size. Search | the number of consecutive probe attempts of any size. Search | |||
algorithms benefit from a MAX_PROBES value greater than 1 because | algorithms benefit from a MAX_PROBES value greater than 1 because | |||
this can provide robustness to isolated packet loss. The default | this can provide robustness to isolated packet loss. The default | |||
value of MAX_PROBES is 3. | value of MAX_PROBES is 3. | |||
MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | |||
DPLPMTUD will attempt to use. An endpoint could need to be | DPLPMTUD will attempt to use. An endpoint could need to configure | |||
configure the MIN_PLPMTU to provide space for extension headers | the MIN_PLPMTU to provide space for extension headers and other | |||
and other encapsulations at layers below the PL. This value can | encapsulations at layers below the PL. This value can be | |||
be interface and path dependent. For IPv6, this size is greater | interface and path dependent. For IPv6, this size is greater than | |||
than or equal to the size at the PL that results in a 1280 byte | or equal to the size at the PL that results in a 1280-byte IPv6 | |||
IPv6 packet, as specified in [RFC8200]. For IPv4, this size is | packet, as specified in [RFC8200]. For IPv4, this size is greater | |||
greater than or equal to the size at the PL that results in a 68 | than or equal to the size at the PL that results in a 68-byte | |||
byte IPv4 packet. Note: An IPv4 router is required to be able to | IPv4 packet. Note: An IPv4 router is required to be able to | |||
forward a datagram of 68 bytes without further fragmentation. | forward a datagram of 68 bytes without further fragmentation. | |||
This is the combined size of an IPv4 header and the minimum | This is the combined size of an IPv4 header and the minimum | |||
fragment size of 8 bytes. In addition, receivers are required to | fragment size of 8 bytes. In addition, receivers are required to | |||
be able to reassemble fragmented datagrams at least up to 576 | be able to reassemble fragmented datagrams at least up to 576 | |||
bytes, as stated in section 3.3.3 of [RFC1122]. | bytes, as stated in Section 3.3.3 of [RFC1122]. | |||
MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU. This has | MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU. This has | |||
to be less than or equal to the maximum size of the PL packet that | to be less than or equal to the maximum size of the PL packet that | |||
can be sent on the outgoing interface (constrained by the local | can be sent on the outgoing interface (constrained by the local | |||
interface MTU). When known, this also ought to be less than the | interface MTU). When known, this also ought to be less than the | |||
maximum size of PL packet that can be received by the remote | maximum size of PL packet that can be received by the remote | |||
endpoint (constrained by EMTU_R). It can be limited by the design | endpoint (constrained by EMTU_R). It can be limited by the design | |||
or configuration of the PL being used. An application, or PL, MAY | or configuration of the PL being used. An application, or PL, MAY | |||
choose a smaller MAX_PLPMTU when there is no need to send packets | choose a smaller MAX_PLPMTU when there is no need to send packets | |||
larger than a specific size. | larger than a specific size. | |||
BASE_PLPMTU: The BASE_PLPMTU is a configured size expected to work | BASE_PLPMTU: The BASE_PLPMTU is a configured size expected to work | |||
for most paths. The size is equal to or larger than the | for most paths. The size is equal to or larger than the | |||
MIN_PLPMTU and smaller than the MAX_PLPMTU. For most PLs a | MIN_PLPMTU and smaller than the MAX_PLPMTU. For most PLs, a | |||
suitable BASE_PLPMTU will be larger than 1200 bytes. When using | suitable BASE_PLPMTU will be larger than 1200 bytes. When using | |||
IPv4, there is no currently equivalent size specified and a | IPv4, there is no currently equivalent size specified, and a | |||
default BASE_PLPMTU of 1200 bytes is RECOMMENDED. | default BASE_PLPMTU of 1200 bytes is RECOMMENDED. | |||
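The ordering constraints among the constants above can be checked mechanically. The following is a minimal sketch, not part of either document: the class name, the helper `ipv4_udp_constants`, and the assumed 28 bytes of overhead (a 20-byte IPv4 header without options plus an 8-byte UDP header) are illustrative assumptions. Only the MAX_PROBES default of 3, the 1200-byte BASE_PLPMTU, and the 68-byte IPv4 minimum come from the text.

```python
from dataclasses import dataclass

# Assumed overhead below the PL: IPv4 header without options plus UDP header.
IPV4_HEADER = 20
UDP_HEADER = 8


@dataclass(frozen=True)
class DplpmtudConstants:
    """Hypothetical container for the Section 5.1.2 constants (PL sizes in bytes)."""
    min_plpmtu: int
    base_plpmtu: int
    max_plpmtu: int
    max_probes: int = 3  # default value given in the text

    def __post_init__(self):
        if self.max_probes < 1:
            raise ValueError("MAX_PROBES must be at least 1")
        # BASE_PLPMTU is equal to or larger than MIN_PLPMTU and smaller
        # than MAX_PLPMTU.
        if not (self.min_plpmtu <= self.base_plpmtu < self.max_plpmtu):
            raise ValueError("need MIN_PLPMTU <= BASE_PLPMTU < MAX_PLPMTU")


def ipv4_udp_constants(interface_mtu):
    """Derive PL-level sizes for UDP over IPv4 from a local interface MTU."""
    overhead = IPV4_HEADER + UDP_HEADER
    return DplpmtudConstants(
        min_plpmtu=68 - overhead,         # PL size that yields a 68-byte IPv4 packet
        base_plpmtu=1200,                 # RECOMMENDED default for IPv4
        max_plpmtu=interface_mtu - overhead,
    )
```

For a 1500-byte interface MTU this yields a MIN_PLPMTU of 40 bytes and a MAX_PLPMTU of 1472 bytes at the PL. EMTU_R and application limits, which can further reduce MAX_PLPMTU, are not modeled.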
5.1.3. Variables | 5.1.3. Variables | |||
This method utilizes a set of variables: | This method utilizes a set of variables: | |||
PROBED_SIZE: The PROBED_SIZE is the size of the current probe packet | PROBED_SIZE: The PROBED_SIZE is the size of the current probe packet | |||
as determined at the PL. This is a tentative value for the | as determined at the PL. This is a tentative value for the | |||
PLPMTU, which is awaiting confirmation by an acknowledgment. | PLPMTU, which is awaiting confirmation by an acknowledgment. | |||
PROBE_COUNT: The PROBE_COUNT is a count of the number of successive | PROBE_COUNT: The PROBE_COUNT is a count of the number of successive | |||
unsuccessful probe packets that have been sent. Each time a probe | unsuccessful probe packets that have been sent. Each time a probe | |||
packet is acknowledged, the value is set to zero. (Some probe | packet is acknowledged, the value is set to zero. (Some probe | |||
loss is expected while searching; therefore, loss of a single probe | loss is expected while searching; therefore, loss of a single probe | |||
is not an indication of a PMTU problem.) | is not an indication of a PMTU problem.) | |||
The figure below illustrates the relationship between the packet size | Figure 3 illustrates the relationship between the packet size | |||
constants and variables at a point of time when the DPLPMTUD | constants and variables at a point of time when the DPLPMTUD | |||
algorithm performs path probing to increase the size of the PLPMTU. | algorithm performs path probing to increase the size of the PLPMTU. | |||
A probe packet has been sent of size PROBED_SIZE. Once this is | A probe packet has been sent of size PROBED_SIZE. Once this is | |||
acknowledged, the PLPMTU will raise to PROBED_SIZE allowing the | acknowledged, the PLPMTU will raise to PROBED_SIZE, allowing the | |||
DPLPMTUD algorithm to further increase PROBED_SIZE toward sending a | DPLPMTUD algorithm to further increase PROBED_SIZE toward sending a | |||
probe with the size of the actual PMTU. | probe with the size of the actual PMTU. | |||
MIN_PLPMTU MAX_PLPMTU | MIN_PLPMTU MAX_PLPMTU | |||
<-------------------------------------------> | <-------------------------------------------> | |||
| | | | | | | | |||
v | | | v | | | |||
BASE_PLPMTU | v | BASE_PLPMTU | v | |||
| PROBED_SIZE | | PROBED_SIZE | |||
v | v | |||
PLPMTU | PLPMTU | |||
Figure 3: Relationships between packet size constants and variables | Figure 3: Relationships between Packet Size Constants and Variables | |||
5.1.4. Overview of DPLPMTUD Phases | 5.1.4. Overview of DPLPMTUD Phases | |||
This section provides a high-level informative view of the DPLPMTUD | This section provides a high-level, informative view of the DPLPMTUD | |||
method, by describing the movement of the method through several | method, by describing the movement of the method through several | |||
phases of operation. More detail is available in the state machine | phases of operation. More detail is available in the state machine, | |||
Section 5.2. | Section 5.2. | |||
+------+ | +------+ | |||
+------->| Base |-----------------+ Connectivity | +------->| Base |-----------------+ Connectivity | |||
| +------+ | or BASE_PLPMTU | | +------+ | or BASE_PLPMTU | |||
| | | confirmation failed | | | | confirmation failed | |||
| | v | | | v | |||
| | Connectivity +-------+ | | | Connectivity +-------+ | |||
| | and BASE_PLPMTU | Error | | | | and BASE_PLPMTU | Error | | |||
| | confirmed +-------+ | | | confirmed +-------+ | |||
skipping to change at page 25, line 13 ¶ | skipping to change at line 1142 ¶ | |||
Figure 4: DPLPMTUD Phases | Figure 4: DPLPMTUD Phases | |||
Base: The Base Phase confirms connectivity to the remote peer using | Base: The Base Phase confirms connectivity to the remote peer using | |||
packets of the BASE_PLPMTU. The confirmation of connectivity is | packets of the BASE_PLPMTU. The confirmation of connectivity is | |||
implicit for a connection-oriented PL (where it can be performed | implicit for a connection-oriented PL (where it can be performed | |||
in a PL connection handshake). A connectionless PL sends a probe | in a PL connection handshake). A connectionless PL sends a probe | |||
packet and uses acknowledgment of this probe packet to confirm | packet and uses acknowledgment of this probe packet to confirm | |||
that the remote peer is reachable. | that the remote peer is reachable. | |||
The sender also confirms that BASE_PLPMTU is supported across the | The sender also confirms that BASE_PLPMTU is supported across the | |||
network path. This may be achieved using a PL mechanism (e.g., | network path. This may be achieved by using a PL mechanism (e.g., | |||
using a handshake packet of size BASE_PLPMTU), or by sending a | using a handshake packet of size BASE_PLPMTU) or by sending a | |||
probe packet of size BASE_PLPMTU and confirming that this is | probe packet of size BASE_PLPMTU and confirming that this is | |||
received. | received. | |||
A probe packet of size BASE_PLPMTU can be sent immediately on the | A probe packet of size BASE_PLPMTU can be sent immediately on the | |||
initial entry to the Base Phase (following a connectivity check). | initial entry to the Base Phase (following a connectivity check). | |||
A PL that does not wish to support a path with a PLPMTU less than | A PL that does not wish to support a path with a PLPMTU less than | |||
BASE_PLPMTU can simplify the phase into a single step by | BASE_PLPMTU can simplify the phase into a single step by | |||
performing the connectivity checks with a probe of the BASE_PLPMTU | performing the connectivity checks with a probe of the BASE_PLPMTU | |||
size. | size. | |||
Once confirmed, DPLPMTUD enters the Search Phase. If the Base | Once confirmed, DPLPMTUD enters the Search Phase. If the Base | |||
Phase fails to confirm the BASE_PLPMTU, DPLPMTUD enters the Error | Phase fails to confirm the BASE_PLPMTU, DPLPMTUD enters the Error | |||
Phase. | Phase. | |||
Search: The Search Phase utilizes a search algorithm to send probe | Search: The Search Phase utilizes a search algorithm to send probe | |||
packets to seek to increase the PLPMTU. The algorithm concludes | packets to seek to increase the PLPMTU. The algorithm concludes | |||
when it has found a suitable PLPMTU, by entering the Search | when it has found a suitable PLPMTU by entering the Search | |||
Complete Phase. | Complete Phase. | |||
A PL could respond to PTB messages using the PTB to advance or | A PL could respond to PTB messages using the PTB to advance or | |||
terminate the search, see Section 4.6. | terminate the search, see Section 4.6. | |||
Search Complete: The Search Complete Phase is entered when the | Search Complete: The Search Complete Phase is entered when the | |||
PLPMTU is supported across the network path. A PL can use a | PLPMTU is supported across the network path. A PL can use a | |||
CONFIRMATION_TIMER to periodically repeat a probe packet for the | CONFIRMATION_TIMER to periodically repeat a probe packet for the | |||
current PLPMTU size. If the sender is unable to confirm | current PLPMTU size. If the sender is unable to confirm | |||
reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL | reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL | |||
signals a lack of reachability, a black hole has been detected and | signals a lack of reachability, a black hole has been detected and | |||
DPLPMTUD enters the Base phase. | DPLPMTUD enters the Base Phase. | |||
The PMTU_RAISE_TIMER is used to periodically resume the search | The PMTU_RAISE_TIMER is used to periodically resume the Search | |||
phase to discover if the PLPMTU can be raised. Black Hole | Phase to discover if the PLPMTU can be raised. Black hole | |||
Detection causes the sender to enter the Base Phase. | detection causes the sender to enter the Base Phase. | |||
Error: The Error Phase is entered when there is conflicting or | Error: The Error Phase is entered when there is conflicting or | |||
invalid PLPMTU information for the path (e.g., a failure to | invalid PLPMTU information for the path (e.g., a failure to | |||
support the BASE_PLPMTU) that cause DPLPMTUD to be unable to | support the BASE_PLPMTU) that causes DPLPMTUD to be unable to | |||
progress and the PLPMTU is lowered. | progress, and the PLPMTU is lowered. | |||
DPLPMTUD remains in the Error Phase until a consistent view of the | DPLPMTUD remains in the Error Phase until a consistent view of the | |||
path can be discovered and it has also been confirmed that the | path can be discovered and it has also been confirmed that the | |||
path supports the BASE_PLPMTU (or DPLPMTUD is suspended). | path supports the BASE_PLPMTU (or DPLPMTUD is suspended). | |||
A method that only reduces the PLPMTU to a suitable size would be | A method that only reduces the PLPMTU to a suitable size would be | |||
sufficient to ensure reliable operation, but can be very inefficient | sufficient to ensure reliable operation but can be very inefficient | |||
when the actual PMTU changes or when the method (for whatever reason) | when the actual PMTU changes or when the method (for whatever reason) | |||
makes a suboptimal choice for the PLPMTU. | makes a suboptimal choice for the PLPMTU. | |||
A full implementation of DPLPMTUD provides an algorithm enabling the | A full implementation of DPLPMTUD provides an algorithm enabling the | |||
DPLPMTUD sender to increase the PLPMTU following a change in the | DPLPMTUD sender to increase the PLPMTU following a change in the | |||
characteristics of the path, such as when a link is reconfigured with | characteristics of the path, such as when a link is reconfigured with | |||
a larger MTU, or when there is a change in the set of links traversed | a larger MTU, or when there is a change in the set of links traversed | |||
by an end-to-end flow (e.g., after a routing or path fail-over | by an end-to-end flow (e.g., after a routing or path failover | |||
decision). | decision). | |||
5.2. State Machine | 5.2. State Machine | |||
A state machine for DPLPMTUD is depicted in Figure 5. If multipath | A state machine for DPLPMTUD is depicted in Figure 5. If multipath | |||
or multihoming is supported, a state machine is needed for each path. | or multihoming is supported, a state machine is needed for each path. | |||
Note: To simplify the diagram, not all changes are shown. | Note: To simplify the diagram, not all changes are shown. | |||
| | | | | | |||
| Start | PL indicates loss | | Start | PL indicates loss | |||
| | of connectivity | | | of connectivity | |||
v v | v v | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
| DISABLED | | ERROR | | | DISABLED | | ERROR | | |||
+---------------+ PROBE_TIMER expiry: +---------------+ | +---------------+ PROBE_TIMER expiry: +---------------+ | |||
| PL indicates PROBE_COUNT = MAX_PROBES or ^ | | | PL indicates PROBE_COUNT = MAX_PROBES or ^ | | |||
| connectivity PTB: PL_PTB_SIZE < BASE_PLPMTU | | | | connectivity PTB: PL_PTB_SIZE < BASE_PLPMTU | | | |||
+--------------------+ +---------------+ | | +--------------------+ +------------------+ | | |||
| | | | | | | | |||
v | BASE_PLPMTU Probe | | v | BASE_PLPMTU Probe | | |||
+---------------+ acked | | +---------------+ acked | | |||
| BASE |--------------------->+ | | BASE |--------------------->+ | |||
+---------------+ | | +---------------+ | | |||
^ | ^ ^ | | ^ | ^ ^ | | |||
Black hole detected | | | | Black hole detected | | Black hole detected | | | | Black hole detected | | |||
+--------------------+ | | +--------------------+ | | +--------------------+ | | +--------------------+ | | |||
| +----+ | | | | +----+ | | | |||
| PROBE_TIMER expiry: | | | | PROBE_TIMER expiry: | | | |||
skipping to change at page 27, line 45 ¶ | skipping to change at line 1246 ¶ | |||
| | | | | | | | | | | | | | |||
| | +-----------------------------------------+ | | | | | +-----------------------------------------+ | | | |||
| | MAX_PLPMTU Probe acked or | | | | | MAX_PLPMTU Probe acked or | | | |||
| | PROBE_TIMER expiry: PROBE_COUNT = MAX_PROBES or | | | | | PROBE_TIMER expiry: PROBE_COUNT = MAX_PROBES or | | | |||
+----+ PTB: PL_PTB_SIZE = PLPMTU +----+ | +----+ PTB: PL_PTB_SIZE = PLPMTU +----+ | |||
CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: | CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: | |||
PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or | PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or | |||
PLPMTU Probe acked Probe acked or PTB: | PLPMTU Probe acked Probe acked or PTB: | |||
PLPMTU < PL_PTB_SIZE < PROBED_SIZE | PLPMTU < PL_PTB_SIZE < PROBED_SIZE | |||
Figure 5: State machine for Datagram PLPMTUD | Figure 5: State Machine for Datagram PLPMTUD | |||
The following states are defined: | The following states are defined: | |||
DISABLED: The DISABLED state is the initial state before probing has | DISABLED: The DISABLED state is the initial state before probing has | |||
started. It is also entered from any other state, when the PL | started. It is also entered from any other state, when the PL | |||
indicates loss of connectivity. This state is left once the PL | indicates loss of connectivity. This state is left once the PL | |||
indicates connectivity to the remote PL. When transitioning to | indicates connectivity to the remote PL. When transitioning to | |||
the BASE state, a probe packet of size BASE_PLPMTU can be sent | the BASE state, a probe packet of size BASE_PLPMTU can be sent | |||
immediately. | immediately. | |||
BASE: The BASE state is used to confirm that the BASE_PLPMTU size is | BASE: The BASE state is used to confirm that the BASE_PLPMTU size is | |||
supported by the network path and is designed to allow an | supported by the network path and is designed to allow an | |||
application to continue working when there are transient | application to continue working when there are transient | |||
reductions in the actual PMTU. It also seeks to avoid long | reductions in the actual PMTU. It also seeks to avoid long | |||
periods when a sender searching for a larger PLPMTU is unaware | periods when a sender searching for a larger PLPMTU is unaware | |||
that packets are not being delivered due to a packet or ICMP Black | that packets are not being delivered due to a packet or ICMP black | |||
Hole. | hole. | |||
On entry, the PROBED_SIZE is set to the BASE_PLPMTU size and the | On entry, the PROBED_SIZE is set to the BASE_PLPMTU size, and the | |||
PROBE_COUNT is set to zero. | PROBE_COUNT is set to zero. | |||
Each time a probe packet is sent, the PROBE_TIMER is started. The | Each time a probe packet is sent, the PROBE_TIMER is started. The | |||
state is exited when the probe packet is acknowledged, and the PL | state is exited when the probe packet is acknowledged, and the PL | |||
sender enters the SEARCHING state. | sender enters the SEARCHING state. | |||
The state is also left when the PROBE_COUNT reaches MAX_PROBES or | The state is also left when the PROBE_COUNT reaches MAX_PROBES or | |||
a received PTB message is validated. This causes the PL sender to | a received PTB message is validated. This causes the PL sender to | |||
enter the ERROR state. | enter the ERROR state. | |||
SEARCHING: The SEARCHING state is the main probing state. This | SEARCHING: The SEARCHING state is the main probing state. This | |||
state is entered when probing for the BASE_PLPMTU completes. | state is entered when probing for the BASE_PLPMTU completes. | |||
Each time a probe packet is acknowledged, the PROBE_COUNT is set | Each time a probe packet is acknowledged, the PROBE_COUNT is set | |||
to zero, the PLPMTU is set to the PROBED_SIZE and then the | to zero, the PLPMTU is set to the PROBED_SIZE, and then the | |||
PROBED_SIZE is increased using the search algorithm (as described | PROBED_SIZE is increased using the search algorithm (as described | |||
in Section 5.3. | in Section 5.3). | |||
When a probe packet is sent and not acknowledged within the period | When a probe packet is sent and not acknowledged within the period | |||
of the PROBE_TIMER, the PROBE_COUNT is incremented and a new probe | of the PROBE_TIMER, the PROBE_COUNT is incremented, and a new | |||
packet is transmitted. | probe packet is transmitted. | |||
The state is exited to enter SEARCH_COMPLETE when the PROBE_COUNT | The state is exited to enter SEARCH_COMPLETE when the PROBE_COUNT | |||
reaches MAX_PROBES, a validated PTB is received that corresponds | reaches MAX_PROBES, a validated PTB is received that corresponds | |||
to the last successfully probed size (PL_PTB_SIZE = PLPMTU), or a | to the last successfully probed size (PL_PTB_SIZE = PLPMTU), or a | |||
probe of size MAX_PLPMTU is acknowledged (PLPMTU = MAX_PLPMTU). | probe of size MAX_PLPMTU is acknowledged (PLPMTU = MAX_PLPMTU). | |||
When a black hole is detected in the SEARCHING state, this causes | When a black hole is detected in the SEARCHING state, this causes | |||
the PL sender to enter the BASE state. | the PL sender to enter the BASE state. | |||
SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates that a search | SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates that a search | |||
has completed. This is the normal maintenance state, where the PL | has completed. This is the normal maintenance state, where the PL | |||
is not probing to update the PLPMTU. DPLPMTUD remains in this | is not probing to update the PLPMTU. DPLPMTUD remains in this | |||
state until either the PMTU_RAISE_TIMER expires or a black hole is | state until either the PMTU_RAISE_TIMER expires or a black hole is | |||
detected. | detected. | |||
When DPLPMTUD uses an unacknowledged PL and is in the | When DPLPMTUD uses an unacknowledged PL and is in the | |||
SEARCH_COMPLETE state, a CONFIRMATION_TIMER periodically resets | SEARCH_COMPLETE state, a CONFIRMATION_TIMER periodically resets | |||
the PROBE_COUNT and schedules a probe packet with the size of the | the PROBE_COUNT and schedules a probe packet with the size of the | |||
PLPMTU. If MAX_PROBES successive PLPMTU sized probes fail to be | PLPMTU. If MAX_PROBES successive PLPMTU-sized probes fail to be | |||
acknowledged the method enters the BASE state. When used with an | acknowledged, the method enters the BASE state. When used with an | |||
acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to | acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to | |||
generate PLPMTU probes in this state. | generate PLPMTU probes in this state. | |||
ERROR: The ERROR state represents the case where either the network | ERROR: The ERROR state represents the case where either the network | |||
path is not known to support a PLPMTU of at least the BASE_PLPMTU | path is not known to support a PLPMTU of at least the BASE_PLPMTU | |||
size or when there is contradictory information about the network | size or when there is contradictory information about the network | |||
path that would otherwise result in excessive variation in the MPS | path that would otherwise result in excessive variation in the MPS | |||
signaled to the higher layer. The state implements a method to | signaled to the higher layer. The state implements a method to | |||
mitigate oscillation in the state-event engine. It signals a | mitigate oscillation in the state-event engine. It signals a | |||
conservative value of the MPS to the higher layer by the PL. The | conservative value of the MPS to the higher layer by the PL. The | |||
state is exited when packet probes no longer detect the error. | state is exited when packet probes no longer detect the error. | |||
The PL sender then enters the SEARCHING state. | The PL sender then enters the SEARCHING state. | |||
Implementations are permitted to enable endpoint fragmentation if | Implementations are permitted to enable endpoint fragmentation if | |||
the DPLPMTUD is unable to validate MIN_PLPMTU within PROBE_COUNT | the DPLPMTUD is unable to validate MIN_PLPMTU within PROBE_COUNT | |||
probes. If DPLPMTUD is unable to validate MIN_PLPMTU the | probes. If DPLPMTUD is unable to validate MIN_PLPMTU, the | |||
implementation will transition to the DISABLED state. | implementation will transition to the DISABLED state. | |||
Note: MIN_PLPMTU could be identical to BASE_PLPMTU, simplifying | Note: MIN_PLPMTU could be identical to BASE_PLPMTU, simplifying | |||
the actions in this state. | the actions in this state. | |||
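The state descriptions above can be sketched as a small event-driven class. This is an illustrative simplification, not the normative state machine. PTB handling, the timers themselves, and endpoint fragmentation are omitted. The method names, the fixed linear `step`, the choice to signal MIN_PLPMTU while in ERROR, and the choice to leave ERROR when a BASE_PLPMTU probe is acknowledged are all assumptions made for the example.

```python
from enum import Enum, auto


class State(Enum):
    DISABLED = auto()
    BASE = auto()
    SEARCHING = auto()
    SEARCH_COMPLETE = auto()
    ERROR = auto()


class Dplpmtud:
    """Simplified sketch of the DPLPMTUD states; the caller sends the probes."""

    def __init__(self, min_plpmtu, base_plpmtu, max_plpmtu, max_probes=3, step=32):
        self.min_plpmtu = min_plpmtu
        self.base_plpmtu = base_plpmtu
        self.max_plpmtu = max_plpmtu
        self.max_probes = max_probes
        self.step = step                  # hypothetical linear search step
        self.state = State.DISABLED
        self.plpmtu = min_plpmtu          # assumption: nothing larger confirmed yet
        self.probed_size = base_plpmtu
        self.probe_count = 0

    def _enter_base(self):
        # On entry, PROBED_SIZE is BASE_PLPMTU and PROBE_COUNT is zero.
        self.state = State.BASE
        self.probed_size = self.base_plpmtu
        self.probe_count = 0

    def _enter_searching(self):
        self.state = State.SEARCHING
        self.probe_count = 0
        self.probed_size = min(self.plpmtu + self.step, self.max_plpmtu)

    def _enter_search_complete(self):
        self.state = State.SEARCH_COMPLETE
        self.probe_count = 0
        self.probed_size = self.plpmtu    # confirmation probes use the PLPMTU

    def on_connectivity(self):
        if self.state is State.DISABLED:
            self._enter_base()

    def on_loss_of_connectivity(self):
        self.state = State.DISABLED       # entered from any other state

    def on_probe_acked(self):
        self.probe_count = 0
        if self.state in (State.BASE, State.ERROR):
            # Assumes the acknowledged probe was of size BASE_PLPMTU.
            self.plpmtu = self.base_plpmtu
            self._enter_searching()
        elif self.state is State.SEARCHING:
            self.plpmtu = self.probed_size
            if self.plpmtu >= self.max_plpmtu:
                self._enter_search_complete()
            else:
                self._enter_searching()

    def on_probe_timer_expiry(self):
        self.probe_count += 1
        if self.probe_count < self.max_probes:
            return                        # caller sends another probe
        if self.state is State.BASE:
            self.state = State.ERROR
            self.plpmtu = self.min_plpmtu # assumption: conservative value
        elif self.state is State.SEARCHING:
            self._enter_search_complete()
        elif self.state is State.SEARCH_COMPLETE:
            self._enter_base()            # treated as a black hole

    def on_black_hole(self):
        if self.state in (State.SEARCHING, State.SEARCH_COMPLETE):
            self._enter_base()

    def on_raise_timer_expiry(self):
        if self.state is State.SEARCH_COMPLETE:
            self._enter_searching()
```

A caller would create one instance per path, send a probe of `probed_size` after each event, and feed acknowledgments and timer expiries back through the `on_*` methods.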
5.3. Search to Increase the PLPMTU | 5.3. Search to Increase the PLPMTU | |||
This section describes the algorithms used by DPLPMTUD to search for | This section describes the algorithms used by DPLPMTUD to search for | |||
a larger PLPMTU. | a larger PLPMTU. | |||
5.3.1. Probing for a larger PLPMTU | 5.3.1. Probing for a Larger PLPMTU | |||
Implementations use a search algorithm across the search range to | Implementations use a search algorithm across the search range to | |||
determine whether a larger PLPMTU can be supported across a network | determine whether a larger PLPMTU can be supported across a network | |||
path. | path. | |||
The method discovers the search range by confirming the minimum | The method discovers the search range by confirming the minimum | |||
PLPMTU and then using the probe method to select a PROBED_SIZE less | PLPMTU and then using the probe method to select a PROBED_SIZE less | |||
than or equal to MAX_PLPMTU. MAX_PLPMTU is the minimum of the local | than or equal to MAX_PLPMTU. MAX_PLPMTU is the minimum of the local | |||
MTU and EMTU_R (when this is learned from the remote endpoint). The | MTU and EMTU_R (when this is learned from the remote endpoint). The | |||
MAX_PLPMTU MAY be reduced by an application that sets a maximum to | MAX_PLPMTU MAY be reduced by an application that sets a maximum to | |||
the size of datagrams it will send. | the size of datagrams it will send. | |||
The PROBE_COUNT is initialized to zero when the first probe with a | The PROBE_COUNT is initialized to zero when the first probe with a | |||
size greater than or equal to PLPMTUD is sent. Each probe packet | size greater than or equal to PLPMTU is sent. Each probe packet | |||
successfully sent to the remote peer is confirmed by acknowledgment | successfully sent to the remote peer is confirmed by acknowledgment | |||
at the PL, see Section 4.1. | at the PL (see Section 4.1). | |||
Each time a probe packet is sent to the destination, the PROBE_TIMER | Each time a probe packet is sent to the destination, the PROBE_TIMER | |||
is started. The timer is canceled when the PL receives | is started. The timer is canceled when the PL receives | |||
acknowledgment that the probe packet has been successfully sent | acknowledgment that the probe packet has been successfully sent | |||
across the path Section 4.1. This confirms that the PROBED_SIZE is | across the path (Section 4.1). This confirms that the PROBED_SIZE is | |||
supported, and the PROBED_SIZE value is then assigned to the PLPMTU. | supported, and the PROBED_SIZE value is then assigned to the PLPMTU. | |||
The search algorithm can continue to send subsequent probe packets of | The search algorithm can continue to send subsequent probe packets of | |||
an increasing size. | an increasing size. | |||
If the timer expires before a probe packet is acknowledged, the probe | If the timer expires before a probe packet is acknowledged, the probe | |||
has failed to confirm the PROBED_SIZE. Each time the PROBE_TIMER | has failed to confirm the PROBED_SIZE. Each time the PROBE_TIMER | |||
expires, the PROBE_COUNT is incremented, the PROBE_TIMER is | expires, the PROBE_COUNT is incremented, the PROBE_TIMER is | |||
reinitialized, and a new probe of the same size or any other size | reinitialized, and a new probe of the same size or any other size | |||
(determined by the search algorithm) can be sent. The maximum number | (determined by the search algorithm) can be sent. The maximum number | |||
of consecutive failed probes is configured (MAX_PROBES). If the | of consecutive failed probes is configured (MAX_PROBES). If the | |||
skipping to change at page 30, line 34 ¶ | skipping to change at line 1380 ¶ | |||
The search algorithm determines a minimum useful gain in PLPMTU. It | The search algorithm determines a minimum useful gain in PLPMTU. It | |||
would not be constructive for a PL sender to attempt to probe for all | would not be constructive for a PL sender to attempt to probe for all | |||
sizes. This would incur unnecessary load on the path. | sizes. This would incur unnecessary load on the path. | |||
Implementations SHOULD select the set of probe packet sizes to | Implementations SHOULD select the set of probe packet sizes to | |||
maximize the gain in PLPMTU from each search step. | maximize the gain in PLPMTU from each search step. | |||
Implementations could optimize the search procedure by selecting step | Implementations could optimize the search procedure by selecting step | |||
sizes from a table of common PMTU sizes. When selecting the | sizes from a table of common PMTU sizes. When selecting the | |||
appropriate next size to search, an implementer ought to also | appropriate next size to search, an implementer ought to also | |||
consider that there can be common sizes of MPS that applications seek | consider that there can be common sizes of MPS that applications seek | |||
to use, and their could be common sizes of MTU used within the | to use, and there could be common sizes of MTU used within the | |||
network. | network. | |||
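One way to select step sizes from a table, as suggested above, is sketched below. The table contents, the 28-byte overhead (IPv4 plus UDP headers), the `min_gain` threshold, and the function name are illustrative assumptions. Neither document defines a specific table, and this simple ascending walk is only one possible strategy.

```python
# Illustrative table of IP-layer MTU values sometimes seen in practice
# (IPv6 minimum, PPPoE, Ethernet, jumbo frames). Not taken from the text.
COMMON_IP_MTUS = (1280, 1492, 1500, 9000)


def next_probe_size(plpmtu, upper_bound, overhead=28, min_gain=8):
    """Return the next PL size to probe, or None if no useful gain remains.

    plpmtu      -- largest size confirmed so far
    upper_bound -- MAX_PLPMTU, or a smaller limit learned from failed probes
    overhead    -- assumed bytes of headers below the PL
    min_gain    -- smallest increase considered worth a probe (assumption)
    """
    for mtu in sorted(COMMON_IP_MTUS):
        candidate = mtu - overhead
        if candidate > upper_bound:
            break                      # all later table entries are larger still
        if candidate >= plpmtu + min_gain:
            return candidate
    return None
```

Starting from a PLPMTU of 1200 with an upper bound of 1472, this sketch probes 1252, then 1464, then 1472, and then stops. A search that must converge on sizes between table entries would need an additional strategy, such as a binary search between the last acknowledged and the last failed size.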
5.3.3. Resilience to Inconsistent Path Information | 5.3.3. Resilience to Inconsistent Path Information | |||
A decision to increase the PLPMTU needs to be resilient to the | A decision to increase the PLPMTU needs to be resilient to the | |||
possibility that information learned about the network path is | possibility that information learned about the network path is | |||
inconsistent. A path is inconsistent when, for example, probe | inconsistent. A path is inconsistent when, for example, probe | |||
packets are lost due to other reasons (i.e., not packet size) or due | packets are lost due to other reasons (i.e., not packet size) or due | |||
to frequent path changes. Frequent path changes could occur by | to frequent path changes. Frequent path changes could occur by | |||
unexpected "flapping" - where some packets from a flow pass along one | unexpected "flapping" -- where some packets from a flow pass along | |||
path, but other packets follow a different path with different | one path, but other packets follow a different path with different | |||
properties. | properties. | |||
A PL sender is able to detect inconsistency from the sequence of | A PL sender is able to detect inconsistency either from the sequence | |||
PLPMTU probes that are acknowledged or the sequence of PTB messages | of PLPMTU probes that are acknowledged or from the sequence of PTB | |||
that it receives. When inconsistent path information is detected, a | messages that it receives. When inconsistent path information is | |||
PL sender could use an alternate search mode that clamps the offered | detected, a PL sender could use an alternate search mode that clamps | |||
MPS to a smaller value for a period of time. This avoids unnecessary | the offered MPS to a smaller value for a period of time. This avoids | |||
loss of packets. | unnecessary loss of packets. | |||
5.4. Robustness to Inconsistent Paths | 5.4. Robustness to Inconsistent Paths | |||
Some paths could be unable to sustain packets of the BASE_PLPMTU | Some paths could be unable to sustain packets of the BASE_PLPMTU | |||
size. The Error State could be implemented to provide rubustness to | size. The Error State could be implemented to provide robustness to | |||
such paths. This allows fallback to a smaller than desired PLPMTU, | such paths. This allows fallback to a smaller than desired PLPMTU | |||
rather than suffer connectivity failure. This could utilize methods | rather than suffer connectivity failure. This could utilize methods | |||
such as endpoint IP fragmentation to enable the PL sender to | such as endpoint IP fragmentation to enable the PL sender to | |||
communicate using packets smaller than the BASE_PLPMTU. | communicate using packets smaller than the BASE_PLPMTU. | |||
6. Specification of Protocol-Specific Methods | 6. Specification of Protocol-Specific Methods | |||
DPLPMTUD requires protocol-specific details to be specified for each | DPLPMTUD requires protocol-specific details to be specified for each | |||
PL that is used. | PL that is used. | |||
The first subsection provides guidance on how to implement the | The first subsection provides guidance on how to implement the | |||
DPLPMTUD method as a part of an application using UDP or UDP-Lite. | DPLPMTUD method as a part of an application using UDP or UDP-Lite. | |||
The guidance also applies to other datagram services that do not | The guidance also applies to other datagram services that do not | |||
include a specific transport protocol (such as a tunnel | include a specific transport protocol (such as a tunnel | |||
encapsulation). The following subsections describe how DPLPMTUD can | encapsulation). The following subsections describe how DPLPMTUD can | |||
be implemented as a part of the transport service, allowing | be implemented as a part of the transport service, allowing | |||
applications using the service to benefit from discovery of the | applications using the service to benefit from discovery of the | |||
PLPMTU without themselves needing to implement this method when using | PLPMTU without themselves needing to implement this method when using | |||
SCTP and QUIC. | SCTP and QUIC. | |||
6.1. Application support for DPLPMTUD with UDP or UDP-Lite | 6.1. Application Support for DPLPMTUD with UDP or UDP-Lite | |||
The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do | The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do | |||
not define a method in the RFC-series that supports PLPMTUD. In | not define a method in the RFC series that supports PLPMTUD. In | |||
particular, the UDP transport does not provide the transport features | particular, the UDP transport does not provide the transport features | |||
needed to implement datagram PLPMTUD. | needed to implement datagram PLPMTUD. | |||
The DPLPMTUD method can be implemented as a part of an application | The DPLPMTUD method can be implemented as a part of an application | |||
built directly or indirectly on UDP or UDP-Lite, but relies on | built directly or indirectly on UDP or UDP-Lite but relies on higher- | |||
higher-layer protocol features to implement the method [BCP145]. | layer protocol features to implement the method [BCP145]. | |||
Some primitives used by DPLPMTUD might not be available via the | Some primitives used by DPLPMTUD might not be available via the | |||
Datagram API (e.g., the ability to access the PLPMTU from the IP | Datagram API (e.g., the ability to access the PLPMTU from the IP- | |||
layer cache, or interpret received PTB messages). | layer cache or to interpret received PTB messages). | |||
In addition, it is recommended that PMTU discovery is not performed | In addition, it is recommended that PMTU discovery is not performed | |||
by multiple protocol layers. An application SHOULD avoid using | by multiple protocol layers. An application SHOULD avoid using | |||
DPLPMTUD when the underlying transport system provides this | DPLPMTUD when the underlying transport system provides this | |||
capability. A common method for managing the PLPMTU has benefits, | capability. A common method for managing the PLPMTU has benefits, | |||
both in the ability to share state between different processes and | both in the ability to share state between different processes and in | |||
opportunities to coordinate probing for different PL instances. | opportunities to coordinate probing for different PL instances. | |||
6.1.1. Application Request | 6.1.1. Application Request | |||
An application needs an application-layer protocol mechanism (such as | An application needs an application-layer protocol mechanism (such as | |||
a message acknowledgment method) that solicits a response from a | a message acknowledgment method) that solicits a response from a | |||
destination endpoint. The method SHOULD allow the sender to check | destination endpoint. The method SHOULD allow the sender to check | |||
the value returned in the response to provide additional protection | the value returned in the response to provide additional protection | |||
from off-path insertion of data [BCP145]. Suitable methods include a | from off-path insertion of data [BCP145]. Suitable methods include a | |||
parameter known only to the two endpoints, such as a session ID or | parameter known only to the two endpoints, such as a session ID or | |||
initialized sequence number. | initialized sequence number. | |||
6.1.2. Application Response | 6.1.2. Application Response | |||
An application needs an application-layer protocol mechanism to | An application needs an application-layer protocol mechanism to | |||
communicate the response from the destination endpoint. This | communicate the response from the destination endpoint. This | |||
response could indicate successful reception of the probe across the | response could indicate successful reception of the probe across the | |||
path, but could also indicate that some (or all packets) have failed | path but could also indicate that some (or all packets) have failed | |||
to reach the destination. | to reach the destination. | |||
6.1.3. Sending Application Probe Packets | 6.1.3. Sending Application Probe Packets | |||
A probe packet can carry an application data block, but the | A probe packet can carry an application data block, but the | |||
successful transmission of this data is at risk when used for | successful transmission of this data is at risk when used for | |||
probing. Some applications might prefer to use a probe packet that | probing. Some applications might prefer to use a probe packet that | |||
does not carry an application data block to avoid disruption to data | does not carry an application data block to avoid disruption of data | |||
transfer. | transfer. | |||
6.1.4. Initial Connectivity | 6.1.4. Initial Connectivity | |||
An application that does not have other higher-layer information | An application that does not have other higher-layer information | |||
confirming connectivity with the remote peer SHOULD implement a | confirming connectivity with the remote peer SHOULD implement a | |||
connectivity mechanism using acknowledged probe packets before | connectivity mechanism using acknowledged probe packets before | |||
entering the BASE state. | entering the BASE state. | |||
6.1.5. Validating the Path | 6.1.5. Validating the Path | |||
skipping to change at page 33, line 4 ¶ | skipping to change at line 1495 ¶ | |||
SEARCH_COMPLETE state. | SEARCH_COMPLETE state. | |||
6.1.6. Handling of PTB Messages | 6.1.6. Handling of PTB Messages | |||
An application that is able and wishes to receive PTB messages MUST | An application that is able and wishes to receive PTB messages MUST | |||
perform ICMP validation as specified in Section 5.2 of [BCP145]. | perform ICMP validation as specified in Section 5.2 of [BCP145]. | |||
This requires that the application checks each received PTB message | This requires that the application checks each received PTB message | |||
to validate that it was received in response to transmitted | to validate that it was received in response to transmitted | |||
traffic and that the reported PL_PTB_SIZE is less than the current | traffic and that the reported PL_PTB_SIZE is less than the current | |||
probed size (see Section 4.6.2). A validated PTB message MAY be used | probed size (see Section 4.6.2). A validated PTB message MAY be used | |||
as input to the DPLPMTUD algorithm, but MUST NOT be used directly to | as input to the DPLPMTUD algorithm but MUST NOT be used directly to | |||
set the PLPMTU. | set the PLPMTU. | |||
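
The two checks named in this paragraph can be sketched as a small predicate. This is an illustrative sketch only: the function and parameter names are hypothetical, and the step that decides whether the quoted packet matches transmitted traffic is assumed to have been done elsewhere and passed in as a boolean.

```python
def ptb_usable_as_input(pl_ptb_size, probed_size, quotes_sent_packet):
    """Return True if a PTB message may be fed to the DPLPMTUD algorithm.

    quotes_sent_packet: result of an (assumed) earlier check that the PTB
    payload quotes a packet this sender actually transmitted.
    """
    if not quotes_sent_packet:
        # Not in response to transmitted traffic: ignore the message.
        return False
    # The reported size must be less than the current probed size.
    return pl_ptb_size < probed_size
```

Even when this returns True, the reported size is only a hint to the search algorithm; the PLPMTU itself is not set from it.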
6.2. DPLPMTUD for SCTP | 6.2. DPLPMTUD for SCTP | |||
Section 10.2 of [RFC4821] specified a recommended PLPMTUD probing | Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing | |||
method for SCTP and Section 7.3 of [RFC4960] recommended an endpoint | method for SCTP, and Section 7.3 of [RFC4960] recommends an endpoint | |||
apply the techniques in RFC4821 on a per-destination-address basis. | apply the techniques in RFC 4821 on a per-destination-address basis. | |||
The specification for DPLPMTUD continues the practice of using the PL | The specification for DPLPMTUD continues the practice of using the PL | |||
to discover the PMTU, but updates, RFC4960 with a recommendation to | to discover the PMTU but updates RFC4960 with a recommendation to use | |||
use the method specified in this document: The RECOMMENDED method for | the method specified in this document: The RECOMMENDED method for | |||
generating probes is to add a chunk consisting only of padding to an | generating probes is to add a chunk consisting only of padding to an | |||
SCTP message. The PAD chunk defined in [RFC4820] SHOULD be attached | SCTP message. The PAD chunk defined in [RFC4820] SHOULD be attached | |||
to a minimum length HEARTBEAT (HB) chunk to build a probe packet. | to a minimum-length HEARTBEAT (HB) chunk to build a probe packet. | |||
This enables probing without affecting the transfer of user messages | This enables probing without affecting the transfer of user messages | |||
and without being limited by congestion control or flow control. | and without being limited by congestion control or flow control. | |||
This is preferred to using DATA chunks (with padding as required) as | This is preferred to using DATA chunks (with padding as required) as | |||
path probes. | path probes. | |||
Section 6.9 of [RFC4960] describes dividing the user messages into | Section 6.9 of [RFC4960] describes dividing the user messages into | |||
data chunks sent by the PL when using SCTP. This notes that once an | DATA chunks sent by the PL when using SCTP. This notes that once an | |||
SCTP message has been sent, it cannot be re-segmented. [RFC4960] | SCTP message has been sent, it cannot be resegmented. [RFC4960] | |||
describes the method to retransmit data chunks when the MPS has | describes the method for retransmitting DATA chunks when the MPS has | |||
reduced, and the use of IP fragmentation for this case. This is | been reduced, and Section 6.9 of [RFC4960] describes use of IP | |||
unchanged by this document. | fragmentation for this case. This is unchanged by this document. | |||
6.2.1. SCTP/IPv4 and SCTP/IPv6 | 6.2.1. SCTP/IPv4 and SCTP/IPv6 | |||
6.2.1.1. Initial Connectivity | 6.2.1.1. Initial Connectivity | |||
The base protocol is specified in [RFC4960]. This provides an | The base protocol is specified in [RFC4960]. This provides an | |||
acknowledged PL. A sender can therefore enter the BASE state as soon | acknowledged PL. A sender can therefore enter the BASE state as soon | |||
as connectivity has been confirmed. | as connectivity has been confirmed. | |||
6.2.1.2. Sending SCTP Probe Packets | 6.2.1.2. Sending SCTP Probe Packets | |||
skipping to change at page 34, line 5 ¶ | skipping to change at line 1544 ¶ | |||
trigger the sending of a HEARTBEAT ACK chunk. The reception of the | trigger the sending of a HEARTBEAT ACK chunk. The reception of the | |||
HEARTBEAT ACK chunk acknowledges reception of a successful probe. A | HEARTBEAT ACK chunk acknowledges reception of a successful probe. A | |||
successful probe updates the association and path counters, but an | successful probe updates the association and path counters, but an | |||
unsuccessful probe is discounted (assumed to be a result of choosing | unsuccessful probe is discounted (assumed to be a result of choosing | |||
too large a PLPMTU). | too large a PLPMTU). | |||
The SCTP sender needs to be able to determine the total size of a | The SCTP sender needs to be able to determine the total size of a | |||
probe packet. The HEARTBEAT chunk could carry a Heartbeat | probe packet. The HEARTBEAT chunk could carry a Heartbeat | |||
Information parameter that includes, besides the information | Information parameter that includes, besides the information | |||
suggested in [RFC4960], the probe size to help an implementation | suggested in [RFC4960], the probe size to help an implementation | |||
associate a HEARTBEAT-ACK with the size of probe that was sent. The | associate a HEARTBEAT ACK with the size of probe that was sent. The | |||
sender could also use other methods, such as sending a nonce and | sender could also use other methods, such as sending a nonce and | |||
verifying the information returned also contains the corresponding | verifying the information returned also contains the corresponding | |||
nonce. The length of the PAD chunk is computed by reducing the | nonce. The length of the PAD chunk is computed by reducing the | |||
probing size by the size of the SCTP common header and the HEARTBEAT | probing size by the size of the SCTP common header and the HEARTBEAT | |||
chunk. The payload of the PAD chunk contains arbitrary data. When | chunk. The payload of the PAD chunk contains arbitrary data. When | |||
transmitted at the IP layer, the PMTU size also includes the IPv4 or | transmitted at the IP layer, the PMTU size also includes the IPv4 or | |||
IPv6 header(s). | IPv6 header(s). | |||
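
The PAD chunk length computation described above can be sketched as follows. This is an illustration under stated assumptions: the function name and the encaps_overhead parameter are hypothetical, probe_size is taken to be the size at the PL (excluding IP headers), and the HEARTBEAT chunk length is supplied by the caller because it depends on the Heartbeat Information carried. The 12-byte SCTP common header and 4-byte chunk header sizes follow the SCTP packet format.

```python
SCTP_COMMON_HEADER_LEN = 12  # source port, destination port, verification tag, checksum
PAD_CHUNK_HEADER_LEN = 4     # chunk type, chunk flags, chunk length


def pad_chunk_length(probe_size, heartbeat_chunk_len, encaps_overhead=0):
    """Return the total PAD chunk length (header plus padding) for a probe.

    probe_size:          desired probe size at the PL, excluding IP headers
    heartbeat_chunk_len: length of the HEARTBEAT chunk sent in the probe
    encaps_overhead:     extra bytes for an encapsulation (hypothetical
                         parameter; e.g., 8 for a UDP header)
    """
    pad_len = (probe_size - encaps_overhead
               - SCTP_COMMON_HEADER_LEN - heartbeat_chunk_len)
    if pad_len < PAD_CHUNK_HEADER_LEN:
        raise ValueError("probe size too small to carry a PAD chunk")
    return pad_len
```

For example, a 1400-byte probe with a 40-byte HEARTBEAT chunk (an arbitrary value chosen for illustration) leaves 1348 bytes for the PAD chunk.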
Probing can start directly after the PL handshake, this can be done | Probing can start directly after the PL handshake; this can be done | |||
before data is sent. Assuming this behavior (i.e., the PMTU is | before data is sent. Assuming this behavior (i.e., the PMTU is | |||
smaller than or equal to the interface MTU), this process will take | smaller than or equal to the interface MTU), this process will take | |||
several round trip time periods, dependent on the number of DPLPMTUD | several round-trip time periods, dependent on the number of DPLPMTUD | |||
probes sent. The Heartbeat timer can be used to implement the | probes sent. The Heartbeat timer can be used to implement the | |||
PROBE_TIMER. | PROBE_TIMER. | |||
6.2.1.3. Validating the Path with SCTP | 6.2.1.3. Validating the Path with SCTP | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT implement | Since SCTP provides an acknowledged PL, a sender MUST NOT implement | |||
the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.1.4. PTB Message Handling by SCTP | 6.2.1.4. PTB Message Handling by SCTP | |||
Normal ICMP validation MUST be performed as specified in Appendix C | Normal ICMP validation MUST be performed as specified in Appendix C | |||
of [RFC4960]. This requires that the first 8 bytes of the SCTP | of [RFC4960]. This requires that the first 8 bytes of the SCTP | |||
common header are quoted in the payload of the PTB message, which can | common header are quoted in the payload of the PTB message, which can | |||
be the case for ICMPv4 and is normally the case for ICMPv6. | be the case for ICMPv4 and is normally the case for ICMPv6. | |||
When a PTB message has been validated, the PL_PTB_SIZE calculated | When a PTB message has been validated, the PL_PTB_SIZE calculated | |||
from the PTB_SIZE reported in the PTB message SHOULD be used with the | from the PTB_SIZE reported in the PTB message SHOULD be used with the | |||
DPLPMTUD algorithm, providing that the reported PL_PTB_SIZE is less | DPLPMTUD algorithm, provided that the reported PL_PTB_SIZE is less | |||
than the current probe size (see Section 4.6). | than the current probe size (see Section 4.6). | |||
6.2.2. DPLPMTUD for SCTP/UDP | 6.2.2. DPLPMTUD for SCTP/UDP | |||
The UDP encapsulation of SCTP is specified in [RFC6951]. | The UDP encapsulation of SCTP is specified in [RFC6951]. | |||
This specification updates the reference to RFC 4821 in section 5.6 | This specification updates the reference to RFC 4821 in Section 5.6 | |||
of RFC 6951 to refer to XXXTHISRFCXXX. RFC 6951 is updated by | of RFC 6951 to refer to this document (RFC 8899). RFC 6951 is | |||
addition of the following sentence at the end of section 5.6: "The | updated by the addition of the following sentence at the end of | |||
RECOMMENDED method for determining the MTU of the path is specified | Section 5.6: | |||
in XXXTHISRFCXXX". | ||||
XXX RFC EDITOR - please replace XXXTHISRFCXXX when published XXX | | The RECOMMENDED method for determining the MTU of the path is | |||
| specified in RFC 8899. | ||||
6.2.2.1. Initial Connectivity | 6.2.2.1. Initial Connectivity | |||
A sender can enter the BASE state as soon as SCTP connectivity has | A sender can enter the BASE state as soon as SCTP connectivity has | |||
been confirmed. | been confirmed. | |||
6.2.2.2. Sending SCTP/UDP Probe Packets | 6.2.2.2. Sending SCTP/UDP Probe Packets | |||
Packet probing can be performed as specified in Section 6.2.1.2. The | Packet probing can be performed as specified in Section 6.2.1.2. The | |||
size of the probe packet includes the 8 bytes of UDP Header. This | size of the probe packet includes the 8 bytes of UDP header. This | |||
has to be considered when filling the probe packet with the PAD | has to be considered when filling the probe packet with the PAD | |||
chunk. | chunk. | |||
6.2.2.3. Validating the Path with SCTP/UDP | 6.2.2.3. Validating the Path with SCTP/UDP | |||
SCTP provides an acknowledged PL, therefore a sender does not | SCTP provides an acknowledged PL; therefore, a sender does not | |||
implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.2.4. Handling of PTB Messages by SCTP/UDP | 6.2.2.4. Handling of PTB Messages by SCTP/UDP | |||
ICMP validation MUST be performed for PTB messages as specified in | ICMP validation MUST be performed for PTB messages as specified in | |||
Appendix C of [RFC4960]. This requires that the first 8 bytes of the | Appendix C of [RFC4960]. This requires that the first 8 bytes of the | |||
SCTP common header are contained in the PTB message, which can be the | SCTP common header are contained in the PTB message, which can be the | |||
case for ICMPv4 (but note the UDP header also consumes a part of the | case for ICMPv4 (but note the UDP header also consumes a part of the | |||
quoted packet header) and is normally the case for ICMPv6. When the | quoted packet header) and is normally the case for ICMPv6. When the | |||
validation is completed, the PL_PTB_SIZE calculated from the PTB_SIZE | validation is completed, the PL_PTB_SIZE calculated from the PTB_SIZE | |||
in the PTB message SHOULD be used with the DPLPMTUD providing that | in the PTB message SHOULD be used with the DPLPMTUD providing that | |||
the reported PL_PTB_SIZE is less than the current probe size. | the reported PL_PTB_SIZE is less than the current probe size. | |||
6.2.3. DPLPMTUD for SCTP/DTLS | 6.2.3. DPLPMTUD for SCTP/DTLS | |||
The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | |||
specified in [RFC8261]. This is used for data channels in WebRTC | specified in [RFC8261]. This is used for data channels in WebRTC | |||
implementations. This specification updates the reference to RFC | implementations. This specification updates the reference to RFC | |||
4821 in section 5 of RFC 8261 to refer to XXXTHISRFCXXX. | 4821 in Section 5 of RFC 8261 to refer to this document (RFC 8899). | |||
XXX RFC EDITOR - please replace XXXTHISRFCXXX when published XXX | ||||
6.2.3.1. Initial Connectivity | 6.2.3.1. Initial Connectivity | |||
A sender can enter the BASE state as soon as SCTP connectivity has | A sender can enter the BASE state as soon as SCTP connectivity has | |||
been confirmed. | been confirmed. | |||
6.2.3.2. Sending SCTP/DTLS Probe Packets | 6.2.3.2. Sending SCTP/DTLS Probe Packets | |||
Packet probing can be done, as specified in Section 6.2.1.2. The | Packet probing can be done as specified in Section 6.2.1.2. The | |||
maximum payload is reduced by the size of the DTLS headers, which has | maximum payload is reduced by the size of the DTLS headers, which has | |||
to be considered when filling the PAD chunk. The size of the probe | to be considered when filling the PAD chunk. The size of the probe | |||
packet includes the DTLS PL headers. This has to be considered when | packet includes the DTLS PL headers. This has to be considered when | |||
filling the probe packet with the PAD chunk. | filling the probe packet with the PAD chunk. | |||
6.2.3.3. Validating the Path with SCTP/DTLS | 6.2.3.3. Validating the Path with SCTP/DTLS | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT implement | Since SCTP provides an acknowledged PL, a sender MUST NOT implement | |||
the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.3.4. Handling of PTB Messages by SCTP/DTLS | 6.2.3.4. Handling of PTB Messages by SCTP/DTLS | |||
[RFC4960] does not specify a way to validate SCTP/DTLS ICMP message | [RFC4960] does not specify a way to validate SCTP/DTLS ICMP message | |||
payload and neither does this document. This can prevent processing | payload and neither does this document. This can prevent processing | |||
of PTB messages at the PL. | of PTB messages at the PL. | |||
6.3. DPLPMTUD for QUIC | 6.3. DPLPMTUD for QUIC | |||
QUIC [I-D.ietf-quic-transport] is a UDP-based PL that provides | QUIC [QUIC] is a UDP-based PL that provides reception feedback. The | |||
reception feedback. The UDP payload includes a QUIC packet header, a | UDP payload includes a QUIC packet header, a protected payload, and | |||
protected payload, and any authentication fields. It supports | any authentication fields. It supports padding and packet | |||
padding and packet coalescence that can be used to construct probe | coalescence that can be used to construct probe packets. From the | |||
packets. From the perspective of DPLPMTUD, QUIC can function as an | perspective of DPLPMTUD, QUIC can function as an acknowledged PL. | |||
acknowledged PL. [I-D.ietf-quic-transport] describes the method for | [QUIC] describes the method for using DPLPMTUD with QUIC packets. | |||
using DPLPMTUD with QUIC packets. | ||||
7. Acknowledgments | ||||
This work was partially funded by the European Union's Horizon 2020 | ||||
research and innovation programme under grant agreement No. 644334 | ||||
(NEAT). The views expressed are solely those of the author(s). | ||||
Thanks to all that have commented or contributed, the TSVWG and QUIC | ||||
working groups, and Mathew Calder and Julius Flohr for providing | ||||
early implementations. | ||||
8. IANA Considerations | ||||
This memo includes no request to IANA. | 7. IANA Considerations | |||
If there are no requirements for IANA, the section will be removed | This document has no IANA actions. | |||
during conversion into an RFC by the RFC Editor. | ||||
9. Security Considerations | 8. Security Considerations | |||
The security considerations for the use of UDP and SCTP are provided | The security considerations for the use of UDP and SCTP are provided | |||
in the referenced RFCs. | in the referenced RFCs. | |||
To avoid excessive load, the interval between individual probe | To avoid excessive load, the interval between individual probe | |||
packets MUST be at least one RTT, and the interval between rounds of | packets MUST be at least one RTT, and the interval between rounds of | |||
probing is determined by the PMTU_RAISE_TIMER. | probing is determined by the PMTU_RAISE_TIMER. | |||
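
The one-RTT spacing requirement can be sketched as a simple gate on probe transmission. This is a minimal sketch with hypothetical names; times are plain numbers in a single unit chosen by the caller (milliseconds in the example), and the PMTU_RAISE_TIMER that separates rounds of probing is not modeled.

```python
def may_send_probe(now, last_probe_time, rtt):
    """Return True if at least one RTT has elapsed since the last probe.

    last_probe_time is None when no probe has been sent yet.
    """
    if last_probe_time is None:
        return True
    return now - last_probe_time >= rtt
```

With an RTT estimate of 100 ms and a probe sent at t = 1000 ms, a second probe is held back at t = 1050 ms and permitted from t = 1100 ms.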
A PL sender needs to ensure that the method used to confirm reception | A PL sender needs to ensure that the method used to confirm reception | |||
of probe packets protects from off-path attackers injecting packets | of probe packets protects from off-path attackers injecting packets | |||
into the path. This protection is provided in IETF-defined protocols | into the path. This protection is provided in IETF-defined protocols | |||
(e.g., TCP, SCTP) using a randomly-initialized sequence number. A | (e.g., TCP, SCTP) using a randomly initialized sequence number. A | |||
description of one way to do this when using UDP is provided in | description of one way to do this when using UDP is provided in | |||
section 5.1 of [BCP145]. | Section 5.1 of [BCP145]. | |||
There are cases where ICMP Packet Too Big (PTB) messages are not | There are cases where ICMP Packet Too Big (PTB) messages are not | |||
delivered due to policy, configuration or equipment design (see | delivered due to policy, configuration, or equipment design (see | |||
Section 1.1). This method therefore does not rely upon PTB messages | Section 1.1). This method therefore does not rely upon PTB messages | |||
being received, but is able to utilize these when they are received | being received but is able to utilize these when they are received by | |||
by the sender. PTB messages could potentially be used to cause a | the sender. PTB messages could potentially be used to cause a node | |||
node to inappropriately reduce the PLPMTU. A node supporting | to inappropriately reduce the PLPMTU. A node supporting DPLPMTUD | |||
DPLPMTUD MUST therefore appropriately validate the payload of PTB | MUST therefore appropriately validate the payload of PTB messages to | |||
messages to ensure these are received in response to transmitted | ensure these are received in response to transmitted traffic (i.e., a | |||
traffic (i.e., a reported error condition that corresponds to a | reported error condition that corresponds to a datagram actually sent | |||
datagram actually sent by the path layer, see Section 4.6.1). | by the path layer, see Section 4.6.1). | |||
An on-path attacker able to create a PTB message could forge PTB | An on-path attacker able to create a PTB message could forge PTB | |||
messages that include a valid quoted IP packet. Such an attack could | messages that include a valid quoted IP packet. Such an attack could | |||
be used to drive down the PLPMTU. An on-path device could similarly | be used to drive down the PLPMTU. An on-path device could similarly | |||
force a reduction of the PLPMTU by implementing a policy that drops | force a reduction of the PLPMTU by implementing a policy that drops | |||
packets larger than a configured size. There are two ways this | packets larger than a configured size. There are two ways this | |||
method can be mitigated against such attacks: First, by ensuring that | method can be mitigated against such attacks: first, by ensuring that | |||
a PL sender never reduces the PLPMTU below the base size, solely in | a PL sender never reduces the PLPMTU below the base size solely in | |||
response to receiving a PTB message. This is achieved by first | response to receiving a PTB message. This is achieved by first | |||
entering the BASE state when such a message is received. Second, the | entering the BASE state when such a message is received. Second, the | |||
design does not require processing of PTB messages, a PL sender could | design does not require processing of PTB messages; a PL sender could | |||
therefore suspend processing of PTB messages (e.g., in a robustness | therefore suspend processing of PTB messages (e.g., in a robustness | |||
mode after detecting that subsequent probes actually confirm that a | mode after detecting that subsequent probes actually confirm that a | |||
size larger than the PTB_SIZE is supported by a path). | size larger than the PTB_SIZE is supported by a path). | |||
Parsing the quoted packet inside a PTB message can introduce addional | Parsing the quoted packet inside a PTB message can introduce | |||
per-packet processing at the PL sender. This processing SHOULD be | additional per-packet processing at the PL sender. This processing | |||
limited to avoid a denial of service attack when arbitrary headers | SHOULD be limited to avoid a denial-of-service attack when arbitrary | |||
are included. Rate-limiting the processing could result in PTB | headers are included. Rate-limiting the processing could result in | |||
messages not being received by a PL, however the DPLPMTUD method is | PTB messages not being received by a PL; however, the DPLPMTUD method | |||
robust to such loss. | is robust to such loss. | |||
The successful processing of an ICMP message can trigger a probe when | The successful processing of an ICMP message can trigger a probe when | |||
the reported PTB size is valid, but this does not directly update the | the reported PTB size is valid, but this does not directly update the | |||
PLPMTU for the path. This prevents a message attempting to black | PLPMTU for the path. This prevents a message attempting to black | |||
hole data by indicating a size larger than supported by the path. | hole data by indicating a size larger than supported by the path. | |||
It is possible that the information about a path is not stable. This | It is possible that the information about a path is not stable. This | |||
could be a result of forwarding across more than one path that has a | could be a result of forwarding across more than one path that has a | |||
different actual PMTU or a single path presents a varying PMTU. The | different actual PMTU or a single path presents a varying PMTU. The | |||
design of a PLPMTUD implementation SHOULD consider how to mitigate | design of a PLPMTUD implementation SHOULD consider how to mitigate | |||
skipping to change at page 38, line 31 ¶ | skipping to change at line 1735 ¶ | |||
to the payload data being sent (e.g., including security-related | to the payload data being sent (e.g., including security-related | |||
fields such as an AEAD tag and TLS record layer padding). The value | fields such as an AEAD tag and TLS record layer padding). The value | |||
of the padding data does not influence the DPLPMTUD search algorithm, | of the padding data does not influence the DPLPMTUD search algorithm, | |||
and therefore needs to be set consistent with the policy of the PL. | and therefore needs to be set consistent with the policy of the PL. | |||
If a PL can make use of cryptographic confidentiality or data- | If a PL can make use of cryptographic confidentiality or data- | |||
integrity mechanisms, then the design ought to avoid adding anything | integrity mechanisms, then the design ought to avoid adding anything | |||
(e.g., padding) to DPLPMTUD probe packets that is not also protected | (e.g., padding) to DPLPMTUD probe packets that is not also protected | |||
by those cryptographic mechanisms. | by those cryptographic mechanisms. | |||
10. References | 9. References | |||
10.1. Normative References | 9.1. Normative References | |||
[BCP145] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage | [BCP145] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage | |||
Guidelines", BCP 145, RFC 8085, March 2017. | Guidelines", BCP 145, RFC 8085, March 2017, | |||
<https://www.rfc-editor.org/info/bcp145>. | ||||
<https://www.rfc-editor.org/info/bcp145> | ||||
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, | [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, | |||
DOI 10.17487/RFC0768, August 1980, | DOI 10.17487/RFC0768, August 1980, | |||
<https://www.rfc-editor.org/info/rfc768>. | <https://www.rfc-editor.org/info/rfc768>. | |||
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, | [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, | |||
DOI 10.17487/RFC0791, September 1981, | DOI 10.17487/RFC0791, September 1981, | |||
<https://www.rfc-editor.org/info/rfc791>. | <https://www.rfc-editor.org/info/rfc791>. | |||
[RFC1191] Mogul, J.C. and S.E. Deering, "Path MTU discovery", | [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, | |||
RFC 1191, DOI 10.17487/RFC1191, November 1990, | DOI 10.17487/RFC1191, November 1990, | |||
<https://www.rfc-editor.org/info/rfc1191>. | <https://www.rfc-editor.org/info/rfc1191>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., | [RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., | |||
and G. Fairhurst, Ed., "The Lightweight User Datagram | and G. Fairhurst, Ed., "The Lightweight User Datagram | |||
Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July | Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July | |||
skipping to change at page 39, line 49 ¶ | skipping to change at line 1799 ¶ | |||
[RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | |||
"Path MTU Discovery for IP version 6", STD 87, RFC 8201, | "Path MTU Discovery for IP version 6", STD 87, RFC 8201, | |||
DOI 10.17487/RFC8201, July 2017, | DOI 10.17487/RFC8201, July 2017, | |||
<https://www.rfc-editor.org/info/rfc8201>. | <https://www.rfc-editor.org/info/rfc8201>. | |||
[RFC8261] Tuexen, M., Stewart, R., Jesup, R., and S. Loreto, | [RFC8261] Tuexen, M., Stewart, R., Jesup, R., and S. Loreto, | |||
"Datagram Transport Layer Security (DTLS) Encapsulation of | "Datagram Transport Layer Security (DTLS) Encapsulation of | |||
SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November | SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November | |||
2017, <https://www.rfc-editor.org/info/rfc8261>. | 2017, <https://www.rfc-editor.org/info/rfc8261>. | |||
10.2. Informative References | 9.2. Informative References | |||
[I-D.ietf-intarea-frag-fragile] | ||||
Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O., | ||||
and F. Gont, "IP Fragmentation Considered Fragile", Work | ||||
in Progress, Internet-Draft, draft-ietf-intarea-frag- | ||||
fragile-17, 30 September 2019, <http://www.ietf.org/ | ||||
internet-drafts/draft-ietf-intarea-frag-fragile-17.txt>. | ||||
[I-D.ietf-intarea-tunnels] | ||||
Touch, J. and M. Townsley, "IP Tunnels in the Internet | ||||
Architecture", Work in Progress, Internet-Draft, draft- | ||||
ietf-intarea-tunnels-10, 12 September 2019, | ||||
<http://www.ietf.org/internet-drafts/draft-ietf-intarea- | ||||
tunnels-10.txt>. | ||||
[I-D.ietf-quic-transport] | [QUIC] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | |||
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | Multiplexed and Secure Transport", Work in Progress, | |||
and Secure Transport", Work in Progress, Internet-Draft, | Internet-Draft, draft-ietf-quic-transport-29, 10 June | |||
draft-ietf-quic-transport-27, 21 February 2020, | 2020, <https://tools.ietf.org/html/draft-ietf-quic- | |||
<http://www.ietf.org/internet-drafts/draft-ietf-quic- | transport-29>. | |||
transport-27.txt>. | ||||
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, | [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, | |||
RFC 792, DOI 10.17487/RFC0792, September 1981, | RFC 792, DOI 10.17487/RFC0792, September 1981, | |||
<https://www.rfc-editor.org/info/rfc792>. | <https://www.rfc-editor.org/info/rfc792>. | |||
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | |||
Communication Layers", STD 3, RFC 1122, | Communication Layers", STD 3, RFC 1122, | |||
DOI 10.17487/RFC1122, October 1989, | DOI 10.17487/RFC1122, October 1989, | |||
<https://www.rfc-editor.org/info/rfc1122>. | <https://www.rfc-editor.org/info/rfc1122>. | |||
skipping to change at page 41, line 19 ¶ | skipping to change at line 1849 ¶ | |||
[RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering | [RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering | |||
ICMPv6 Messages in Firewalls", RFC 4890, | ICMPv6 Messages in Firewalls", RFC 4890, | |||
DOI 10.17487/RFC4890, May 2007, | DOI 10.17487/RFC4890, May 2007, | |||
<https://www.rfc-editor.org/info/rfc4890>. | <https://www.rfc-editor.org/info/rfc4890>. | |||
[RFC5508] Srisuresh, P., Ford, B., Sivakumar, S., and S. Guha, "NAT | [RFC5508] Srisuresh, P., Ford, B., Sivakumar, S., and S. Guha, "NAT | |||
Behavioral Requirements for ICMP", BCP 148, RFC 5508, | Behavioral Requirements for ICMP", BCP 148, RFC 5508, | |||
DOI 10.17487/RFC5508, April 2009, | DOI 10.17487/RFC5508, April 2009, | |||
<https://www.rfc-editor.org/info/rfc5508>. | <https://www.rfc-editor.org/info/rfc5508>. | |||
Appendix A. Revision Notes | [RFC8900] Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O., | |||
and F. Gont, "IP Fragmentation Considered Fragile", | ||||
Note to RFC-Editor: please remove this entire section prior to | RFC 8900, BCP 230, September 2020, | |||
publication. | <https://www.rfc-editor.org/info/rfc8900>. | |||
Individual draft -00: | ||||
* Comments and corrections are welcome directly to the authors or | ||||
via the IETF TSVWG working group mailing list. | ||||
* This update is proposed for WG comments. | ||||
Individual draft -01: | ||||
* Contains the first representation of the algorithm, showing the | ||||
states and timers | ||||
* This update is proposed for WG comments. | ||||
Individual draft -02: | ||||
* Contains updated representation of the algorithm, and textual | ||||
corrections. | ||||
* The text describing when to set the effective PMTU has not yet | ||||
been validated by the authors | ||||
* To determine security to off-path-attacks: We need to decide | ||||
whether a received PTB message SHOULD/MUST be validated? The text | ||||
on how to handle a PTB message indicating a link MTU larger than | ||||
the probe has yet not been validated by the authors | ||||
* No text currently describes how to handle inconsistent results | ||||
from arbitrary re-routing along different parallel paths | ||||
* This update is proposed for WG comments. | ||||
Working Group draft -00: | ||||
* This draft follows a successful adoption call for TSVWG | ||||
* There is still work to complete, please comment on this draft. | ||||
Working Group draft -01: | ||||
* This draft includes improved introduction. | ||||
* The draft is updated to require ICMP validation prior to accepting | ||||
PTB messages - this to be confirmed by WG | ||||
* Section added to discuss Selection of Probe Size - methods to be | ||||
evaluated and recommendations to be considered | ||||
* Section added to align with work proposed in the QUIC WG. | ||||
Working Group draft -02: | ||||
* The draft was updated based on feedback from the WG, and a | ||||
detailed review by Magnus Westerlund. | ||||
* The document updates RFC 4821. | ||||
* Requirements list updated. | ||||
* Added more explicit discussion of a simpler black-hole detection | ||||
mode. | ||||
* This draft includes reorganisation of the section on IETF | ||||
protocols. | ||||
* Added more discussion of implementation within an application. | ||||
* Added text on flapping paths. | ||||
* Replaced 'effective MTU' with new term PLPMTU. | ||||
Working Group draft -03: | ||||
* Updated figures | ||||
* Added more discussion on blackhole detection | ||||
* Added figure describing just blackhole detection | ||||
* Added figure relating MPS sizes | ||||
Working Group draft -04: | ||||
* Described phases and named these consistently. | ||||
* Corrected transition from confirmation directly to the search | ||||
phase (Base has been checked). | ||||
* Redrawn state diagrams. | ||||
* Renamed BASE_MTU to BASE_PMTU (because it is a base for the PMTU). | ||||
* Clarified Error state. | ||||
* Clarified suspending DPLPMTUD. | ||||
* Verified normative text in requirements section. | ||||
* Removed duplicate text. | ||||
* Changed all text to refer to /packet probe/probe packet/ | ||||
/validation/verification/ added term /Probe Confirmation/ and | ||||
clarified BlackHole detection. | ||||
Working Group draft -05: | ||||
* Updated security considerations. | ||||
* Feedback after speaking with Joe Touch helped improve UDP-Options | ||||
description. | ||||
Working Group draft -06: | ||||
* Updated description of ICMP issues in section 1.1 | ||||
* Update to description of QUIC. | ||||
Working group draft -07: | ||||
* Moved description of the PTB processing method from the PTB | ||||
requirements section. | ||||
* Clarified what is performed in the PTB validation check. | ||||
* Updated security consideration to explain PTB security without | ||||
needing to read the rest of the document. | ||||
* Reformatted state machine diagram | ||||
Working group draft -08: | ||||
* Moved to rfcxml v3+ | ||||
* Rendered diagrams to svg in html version. | ||||
* Removed Appendix A. Event-driven state changes. | ||||
* Removed section on DPLPMTUD with UDP Options. | ||||
* Shortened the description of phases. | ||||
Working group draft -09: | ||||
* Remove final mention of UDP Options | ||||
* Add Initial Connectivity sections to each PL | ||||
* Add to disable outgoing pmtu enforcement of packets | ||||
Working group draft -10: | ||||
* Address comments from Lars Eggert | ||||
* Reinforce that PROBE_COUNT is successive attempts to probe for any | ||||
size | ||||
* Redefine MAX_PROBES to 3 | ||||
* Address PTB_SIZE of 0 or less than MIN_PLPMTU | ||||
Working group draft -11: | ||||
* Restore a sentence removed in previous rev | ||||
* De-acronymise QUIC | ||||
* Address some nits | ||||
Working group draft -12: | ||||
* Add TSVWG, QUIC and implementers to acknowledgments | ||||
* Shorten a diagram line. | ||||
* Address nits from Julius and Wes. | ||||
* Be clearer when talking about IP layer caches | ||||
Working group draft -13, -14: | ||||
* Updated after WGLC. | ||||
Working group draft -15: | ||||
* Updated after AD evaluation and prepared for IETF-LC. | ||||
Working group draft -16: | ||||
* Updated text after SECDIR review. | ||||
Working group draft -17: | ||||
* Updated text after GENART and IETF-LC. | ||||
* Renamed BASE_MTU to BASE_PLPMTU, and MIN and MAX PMTU to PLPMTU | ||||
(because these are about a base for the PLPMTU), and ensured | ||||
consistent separation of PMTU and PLPMTU. | ||||
* Adopted US-style English throughout. | ||||
Working group draft -18: | ||||
* Updated text and address nits from OPSDIR, ART and IESG reviews. | ||||
* Order PTB processing based on PL_PTB_SIZE | ||||
Working group draft -19: | ||||
* Updated text and address nits based on comments from Tim Chown and | ||||
Murray S. Kucherawy. | ||||
Working group draft -20: | ||||
* Address nits and comments from IESG | ||||
* Refer to BCP 145 rather than RFC 8085 in most places. | ||||
* Update probing method text for SCTP and QUIC. | ||||
Working group draft -21: | ||||
* Update QUIC text for skipping into BASE state. | ||||
Working group draft -22: | ||||
* Add a section reference to MPS | [TUNNELS] Touch, J. and M. Townsley, "IP Tunnels in the Internet | |||
Architecture", Work in Progress, Internet-Draft, draft- | ||||
ietf-intarea-tunnels-10, 12 September 2019, | ||||
<https://tools.ietf.org/html/draft-ietf-intarea-tunnels- | ||||
10>. | ||||
* Clarify MIN_PLPMTU text | Acknowledgments | |||
* Remove most QUIC text | This work was partially funded by the European Union Horizon 2020 | |||
Research and Innovation Programme under grant agreement No. 644334, | ||||
"A New, Evolutive API and Transport-Layer Architecture for the | ||||
Internet" (NEAT). The views expressed are solely those of the | ||||
author(s). | ||||
* Make QUIC reference informative. | Thanks to all who have commented or contributed, the TSVWG and QUIC | |||
working groups, and Mathew Calder and Julius Flohr for providing | ||||
early implementations. | ||||
Authors' Addresses | Authors' Addresses | |||
Godred Fairhurst | Godred Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering | School of Engineering | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen | Aberdeen | |||
AB24 3UE | AB24 3UE | |||
United Kingdom | United Kingdom | |||
skipping to change at page 46, line 35 ¶ | skipping to change at line 1894 ¶ | |||
Tom Jones | Tom Jones | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering | School of Engineering | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen | Aberdeen | |||
AB24 3UE | AB24 3UE | |||
United Kingdom | United Kingdom | |||
Email: tom@erg.abdn.ac.uk | Email: tom@erg.abdn.ac.uk | |||
Michael Tuexen | Michael Tüxen | |||
Muenster University of Applied Sciences | Münster University of Applied Sciences | |||
Stegerwaldstrasse 39 | Stegerwaldstrasse 39 | |||
48565 Steinfurt | 48565 Steinfurt | |||
Germany | Germany | |||
Email: tuexen@fh-muenster.de | Email: tuexen@fh-muenster.de | |||
Irene Ruengeler | Irene Rüngeler | |||
Muenster University of Applied Sciences | Münster University of Applied Sciences | |||
Stegerwaldstrasse 39 | Stegerwaldstrasse 39 | |||
48565 Steinfurt | 48565 Steinfurt | |||
Germany | Germany | |||
Email: i.ruengeler@fh-muenster.de | Email: i.ruengeler@fh-muenster.de | |||
Timo Voelker | ||||
Muenster University of Applied Sciences | Timo Völker | |||
Münster University of Applied Sciences | ||||
Stegerwaldstrasse 39 | Stegerwaldstrasse 39 | |||
48565 Steinfurt | 48565 Steinfurt | |||
Germany | Germany | |||
Email: timo.voelker@fh-muenster.de | Email: timo.voelker@fh-muenster.de | |||
End of changes. 197 change blocks. | ||||
721 lines changed or deleted | 474 lines changed or added | |||
This html diff was produced by rfcdiff 1.48.