draft-ietf-tsvwg-datagram-plpmtud-12.txt | draft-ietf-tsvwg-datagram-plpmtud-13.txt | |||
---|---|---|---|---|
Internet Engineering Task Force G. Fairhurst | Internet Engineering Task Force G. Fairhurst | |||
Internet-Draft T. Jones | Internet-Draft T. Jones | |||
Updates: 4821 (if approved) University of Aberdeen | Updates: 4821, 4960, 8085 (if approved) University of Aberdeen | |||
Intended status: Standards Track M. Tuexen | Intended status: Standards Track M. Tuexen | |||
Expires: 7 June 2020 I. Ruengeler | Expires: 23 July 2020 I. Ruengeler | |||
T. Voelker | T. Voelker | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
5 December 2019 | 20 January 2020 | |||
Packetization Layer Path MTU Discovery for Datagram Transports | Packetization Layer Path MTU Discovery for Datagram Transports | |||
draft-ietf-tsvwg-datagram-plpmtud-12 | draft-ietf-tsvwg-datagram-plpmtud-13 | |||
Abstract | Abstract | |||
This document describes a robust method for Path MTU Discovery | This document describes a robust method for Path MTU Discovery | |||
(PMTUD) for datagram Packetization Layers (PLs). It describes an | (PMTUD) for datagram Packetization Layers (PLs). It describes an | |||
extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | |||
MTU Discovery for IPv4 and IPv6. The method allows a PL, or a | MTU Discovery for IPv4 and IPv6. The method allows a PL, or a | |||
datagram application that uses a PL, to discover whether a network | datagram application that uses a PL, to discover whether a network | |||
path can support the current size of datagram. This can be used to | path can support the current size of datagram. This can be used to | |||
detect and reduce the message size when a sender encounters a network | detect and reduce the message size when a sender encounters a packet | |||
black hole (where packets are discarded). The method can probe a | black hole (where packets are discarded). The method can probe a | |||
network path with progressively larger packets to discover whether | network path with progressively larger packets to discover whether | |||
the maximum packet size can be increased. This allows a sender to | the maximum packet size can be increased. This allows a sender to | |||
determine an appropriate packet size, providing functionally for | determine an appropriate packet size, providing functionality for | |||
datagram transports that is equivalent to the Packetization Layer | datagram transports that is equivalent to the Packetization Layer | |||
PMTUD specification for TCP, specified in RFC 4821. | PMTUD specification for TCP, specified in RFC 4821. | |||
The document updates RFC 4821 to specify the method for datagram PLs, | ||||
and updates RFC 8085 as the method to use in place of RFC 4821 with | ||||
UDP datagrams. Section 7.3 of RFC4960 recommends an endpoint apply | ||||
the techniques in RFC4821 on a per-destination-address basis. | ||||
RFC4960 is updated to recommend that SCTP uses the method specified | ||||
in this document instead of the method in RFC4821. | ||||
The document also provides implementation notes for incorporating | The document also provides implementation notes for incorporating | |||
Datagram PMTUD into IETF datagram transports or applications that use | Datagram PMTUD into IETF datagram transports or applications that use | |||
datagram transports. | datagram transports. | |||
When published, this specification updates RFC 4821. | When published, this specification updates RFC 4821 and RFC 8085. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on 7 June 2020. | This Internet-Draft will expire on 23 July 2020. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2019 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
and restrictions with respect to this document. Code Components | and restrictions with respect to this document. Code Components | |||
extracted from this document must include Simplified BSD License text | extracted from this document must include Simplified BSD License text | |||
as described in Section 4.e of the Trust Legal Provisions and are | as described in Section 4.e of the Trust Legal Provisions and are | |||
provided without warranty as described in the Simplified BSD License. | provided without warranty as described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 3 | 1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 4 | |||
1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 6 | 1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 6 | |||
1.3. Path MTU Discovery for Datagram Services . . . . . . . . 6 | 1.3. Path MTU Discovery for Datagram Services . . . . . . . . 7 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 9 | 3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 10 | |||
4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 12 | 4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 13 | |||
4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 12 | 4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 13 | |||
4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 13 | 4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 14 | |||
4.3. Detection of Unsupported PLPMTU Size, aka Black Hole | 4.3. Black Hole Detection . . . . . . . . . . . . . . . . . . 14 | |||
Detection . . . . . . . . . . . . . . . . . . . . . . . . 14 | 4.4. The Maximum Packet Size (MPS) . . . . . . . . . . . . . . 15 | |||
4.4. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 15 | 4.5. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 16 | |||
4.5. Response to PTB Messages . . . . . . . . . . . . . . . . 15 | 4.6. Response to PTB Messages . . . . . . . . . . . . . . . . 17 | |||
4.5.1. Validation of PTB Messages . . . . . . . . . . . . . 15 | 4.6.1. Validation of PTB Messages . . . . . . . . . . . . . 17 | |||
4.5.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 16 | 4.6.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 18 | |||
5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 17 | 5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 19 | |||
5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 18 | 5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 20 | |||
5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 18 | 5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 19 | 5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 21 | |||
5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 20 | 5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 22 | |||
5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 21 | 5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 23 | |||
5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 23 | 5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 25 | |||
5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 26 | 5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 28 | |||
5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 26 | 5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 28 | |||
5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 27 | 5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 29 | |||
5.3.3. Resilience to Inconsistent Path Information . . . . . 27 | 5.3.3. Resilience to Inconsistent Path Information . . . . . 30 | |||
5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 28 | 5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 30 | |||
6. Specification of Protocol-Specific Methods . . . . . . . . . 30 | ||||
6. Specification of Protocol-Specific Methods . . . . . . . . . 28 | ||||
6.1. Application support for DPLPMTUD with UDP or | 6.1. Application support for DPLPMTUD with UDP or | |||
UDP-Lite . . . . . . . . . . . . . . . . . . . . . . . . 28 | UDP-Lite . . . . . . . . . . . . . . . . . . . . . . . . 30 | |||
6.1.1. Application Request . . . . . . . . . . . . . . . . . 29 | 6.1.1. Application Request . . . . . . . . . . . . . . . . . 31 | |||
6.1.2. Application Response . . . . . . . . . . . . . . . . 29 | 6.1.2. Application Response . . . . . . . . . . . . . . . . 31 | |||
6.1.3. Sending Application Probe Packets . . . . . . . . . . 29 | 6.1.3. Sending Application Probe Packets . . . . . . . . . . 31 | |||
6.1.4. Initial Connectivity . . . . . . . . . . . . . . . . 29 | 6.1.4. Initial Connectivity . . . . . . . . . . . . . . . . 31 | |||
6.1.5. Validating the Path . . . . . . . . . . . . . . . . . 29 | 6.1.5. Validating the Path . . . . . . . . . . . . . . . . . 32 | |||
6.1.6. Handling of PTB Messages . . . . . . . . . . . . . . 30 | 6.1.6. Handling of PTB Messages . . . . . . . . . . . . . . 32 | |||
6.2. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 30 | 6.2. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 32 | |||
6.2.1. SCTP/IPv4 and SCTP/IPv6 . . . . . . . . . . . . . . . 30 | 6.2.1. SCTP/IPv4 and SCTP/IPv6 . . . . . . . . . . . . . . . 32 | |||
6.2.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 31 | 6.2.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 33 | |||
6.2.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 32 | 6.2.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 34 | |||
6.3. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 32 | 6.3. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 35 | |||
6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 33 | 6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 35 | |||
6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 33 | 6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 35 | |||
6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 33 | 6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 36 | |||
6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 33 | 6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 36 | |||
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 34 | 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 36 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 34 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 34 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 36 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 35 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 37 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . 35 | 10.1. Normative References . . . . . . . . . . . . . . . . . . 37 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . 36 | 10.2. Informative References . . . . . . . . . . . . . . . . . 39 | |||
Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 37 | Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 40 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 41 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 44 | |||
1. Introduction | 1. Introduction | |||
The IETF has specified datagram transport using UDP, SCTP, and DCCP, | The IETF has specified datagram transport using UDP, SCTP, and DCCP, | |||
as well as protocols layered on top of these transports (e.g., SCTP/ | as well as protocols layered on top of these transports (e.g., SCTP/ | |||
UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP | UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP | |||
network layer. This document describes a robust method for Path MTU | network layer. This document describes a robust method for Path MTU | |||
Discovery (PMTUD) that may be used with these transport protocols (or | Discovery (PMTUD) that can be used with these transport protocols (or | |||
the applications that use their transport service) to discover an | the applications that use their transport service) to discover an | |||
appropriate size of packet to use across an Internet path. | appropriate size of packet to use across an Internet path. | |||
1.1. Classical Path MTU Discovery | 1.1. Classical Path MTU Discovery | |||
Classical Path Maximum Transmission Unit Discovery (PMTUD) can be | Classical Path Maximum Transmission Unit Discovery (PMTUD) can be | |||
used with any transport that is able to process ICMP Packet Too Big | used with any transport that is able to process ICMP Packet Too Big | |||
(PTB) messages (e.g., [RFC1191] and [RFC8201]). In this document, | (PTB) messages (e.g., [RFC1191] and [RFC8201]). In this document, | |||
the term PTB message is applied to both IPv4 ICMP Unreachable | the term PTB message is applied to both IPv4 ICMP Unreachable | |||
messages (type 3) that carry the error Fragmentation Needed (Type 3, | messages (type 3) that carry the error Fragmentation Needed (Type 3, | |||
Code 4) [RFC0792] and ICMPv6 Packet Too Big messages (Type 2) | Code 4) [RFC0792] and ICMPv6 Packet Too Big messages (Type 2) | |||
[RFC4443]. When a sender receives a PTB message, it reduces the | [RFC4443]. When a sender receives a PTB message, it reduces the | |||
effective MTU to the value reported as the Link MTU in the PTB | effective MTU to the value reported as the Link MTU in the PTB | |||
message, and a method that from time-to-time increases the packet | message. A method from time-to-time increases the packet size in an | |||
size in an attempt to discover an increase in the supported PMTU. The | attempt to discover an increase in the supported PMTU. The packets | |||
packets sent with a size larger than the current effective PMTU are | sent with a size larger than the current effective PMTU are known as | |||
known as probe packets. | probe packets. | |||
Packets not intended as probe packets are either fragmented to the | Packets not intended as probe packets are either fragmented to the | |||
current effective PMTU, or the attempt to send fails with an error | current effective PMTU, or the attempt to send fails with an error | |||
code. Applications are sometimes provided with a primitive to let | code. Applications can be provided with a primitive to let them read | |||
them read the Maximum Packet Size (MPS), derived from the current | the Maximum Packet Size (MPS), derived from the current effective | |||
effective PMTU. | PMTU. | |||
Classical PMTUD is subject to protocol failures. One failure arises | Classical PMTUD is subject to protocol failures. One failure arises | |||
when traffic using a packet size larger than the actual PMTU is | when traffic using a packet size larger than the actual PMTU is | |||
black-holed (all datagrams sent with this size, or larger, are | black-holed (all datagrams sent with this size, or larger, are | |||
discarded). This could arise when the PTB messages are not delivered | discarded). This could arise when the PTB messages are not delivered | |||
back to the sender for some reason (see for example [RFC2923]). | back to the sender for some reason (see for example [RFC2923]). | |||
Examples where PTB messages are not delivered include: | Examples where PTB messages are not delivered include: | |||
* The generation of ICMP messages is usually rate limited. This | * The generation of ICMP messages is usually rate limited. This | |||
skipping to change at page 4, line 48 ¶ | skipping to change at page 5, line 13 ¶ | |||
sender. | sender. | |||
* Asymmetry in forwarding can result in there being no return route | * Asymmetry in forwarding can result in there being no return route | |||
to the original sender, which would prevent an ICMP message being | to the original sender, which would prevent an ICMP message being | |||
delivered to the sender. This issue can also arise when policy- | delivered to the sender. This issue can also arise when policy- | |||
based routing is used, Equal Cost Multipath (ECMP) routing is | based routing is used, Equal Cost Multipath (ECMP) routing is | |||
used, or a middlebox acts as an application load balancer. An | used, or a middlebox acts as an application load balancer. An | |||
example is where the path towards the server is chosen by ECMP | example is where the path towards the server is chosen by ECMP | |||
routing depending on bytes in the IP payload. In this case, when | routing depending on bytes in the IP payload. In this case, when | |||
a packet sent by the server encounters a problem after the ECMP | a packet sent by the server encounters a problem after the ECMP | |||
router, then any resulting ICMP message needs to also be directed | router, then any resulting ICMP message also needs to be directed | |||
by the ECMP router towards the original sender. | by the ECMP router towards the original sender. | |||
* There are additional cases where the next hop destination fails to | * There are additional cases where the next hop destination fails to | |||
receive a packet because of its size. This could be due to | receive a packet because of its size. This could be due to | |||
misconfiguration of the layer 2 path between nodes, for instance | misconfiguration of the layer 2 path between nodes, for instance | |||
the MTU configured in a layer 2 switch, or misconfiguration of the | the MTU configured in a layer 2 switch, or misconfiguration of the | |||
Maximum Receive Unit (MRU). If the packet is dropped by the link, | Maximum Receive Unit (MRU). If a packet is dropped by the link, | |||
this will not cause a PTB message to be sent to the original | this will not cause a PTB message to be sent to the original | |||
sender. | sender. | |||
Another failure could result if a node that is not on the network | Another failure could result if a node that is not on the network | |||
path sends a PTB message that attempts to force a sender to change | path sends a PTB message that attempts to force a sender to change | |||
the effective PMTU [RFC8201]. A sender can protect itself from | the effective PMTU [RFC8201]. A sender can protect itself from | |||
reacting to such messages by utilising the quoted packet within a PTB | reacting to such messages by utilising the quoted packet within a PTB | |||
message payload to validate that the received PTB message was | message payload to validate that the received PTB message was | |||
generated in response to a packet that had actually originated from | generated in response to a packet that had actually originated from | |||
the sender. However, there are situations where a sender would be | the sender. However, there are situations where a sender would be | |||
skipping to change at page 6, line 5 ¶ | skipping to change at page 6, line 14 ¶ | |||
validate the message, because validation depends on information | validate the message, because validation depends on information | |||
about the active transport flows at an endpoint node (e.g., the | about the active transport flows at an endpoint node (e.g., the | |||
socket/address pairs being used, and other protocol header | socket/address pairs being used, and other protocol header | |||
information). | information). | |||
* When a packet is encapsulated/tunneled over an encrypted | * When a packet is encapsulated/tunneled over an encrypted | |||
transport, the tunnel/encapsulation ingress might have | transport, the tunnel/encapsulation ingress might have | |||
insufficient context, or computational power, to reconstruct the | insufficient context, or computational power, to reconstruct the | |||
transport header that would be needed to perform validation. | transport header that would be needed to perform validation. | |||
* A Network Address Translation (NAT) device that translates a packet | ||||
header, ought to also translate ICMP messages and update the ICMP | ||||
quoted packet [RFC5508] in that message. If this is not correctly | ||||
translated then the sender would not be able to associate the | ||||
message with the PL that originated the packet, and hence this | ||||
ICMP message cannot be validated. | ||||
1.2. Packetization Layer Path MTU Discovery | 1.2. Packetization Layer Path MTU Discovery | |||
The term Packetization Layer (PL) has been introduced to describe the | The term Packetization Layer (PL) has been introduced to describe the | |||
layer that is responsible for placing data blocks into the payload of | layer that is responsible for placing data blocks into the payload of | |||
IP packets and selecting an appropriate MPS. This function is often | IP packets and selecting an appropriate MPS. This function is often | |||
performed by a transport protocol, but can also be performed by other | performed by a transport protocol (e.g., DCCP, RTP, SCTP, QUIC), but | |||
encapsulation methods working above the transport layer. | can also be performed by other encapsulation methods working above | |||
the transport layer. | ||||
In contrast to PMTUD, Packetization Layer Path MTU Discovery | In contrast to PMTUD, Packetization Layer Path MTU Discovery | |||
(PLPMTUD) [RFC4821] does not rely upon reception and validation of | (PLPMTUD) [RFC4821] introduced a method that does not rely upon | |||
PTB messages. It is therefore more robust than Classical PMTUD. | reception and validation of PTB messages. It is therefore more | |||
This has become the recommended approach for implementing PMTU | robust than Classical PMTUD. This has become the recommended | |||
discovery. | approach for implementing discovery of the PMTU [RFC8085]. | |||
It uses a general strategy where the PL sends probe packets to search | It uses a general strategy where the PL sends probe packets to search | |||
for the largest size of unfragmented datagram that can be sent over a | for the largest size of unfragmented datagram that can be sent over a | |||
network path. Probe packets are sent with a progressively larger | network path. Probe packets are sent to explore using a larger | |||
packet size. If a probe packet is successfully delivered (as | packet size. If a probe packet is successfully delivered (as | |||
determined by the PL), then the PLPMTU is raised to the size of the | determined by the PL), then the PLPMTU is raised to the size of the | |||
successful probe. If no response is received to a probe packet, the | successful probe. If no response is received to a probe packet, the | |||
method reduces the probe size. The result of probing with the PLPMTU | method then reduces the PLPMTU. | |||
is used to set the application MPS. | ||||
PLPMTUD introduces flexibility in the implementation of PMTU | Datagram PLPMTUD introduces flexibility in implementation. At one | |||
discovery. At one extreme, it can be configured to only perform ICMP | extreme, it can be configured to only perform Black Hole Detection | |||
Black Hole Detection and recovery to increase the robustness of | and recovery with increased robustness compared to Classical PMTUD. | |||
Classical PMTUD, or at the other extreme, all PTB processing can be | At the other extreme, all PTB processing can be disabled, and PLPMTUD | |||
disabled and PLPMTUD can completely replace Classical PMTUD (see | replaces Classical PMTUD. | |||
Section 4.5). | ||||
PLPMTUD can also include additional consistency checks without | PLPMTUD can also include additional consistency checks without | |||
increasing the risk that data is lost when probing to discover the | increasing the risk that data is lost when probing to discover the | |||
path MTU. For example, information available at the PL, or higher | Path MTU. For example, information available at the PL, or higher | |||
layers, enables received PTB messages to be validated before being | layers, enables received PTB messages to be validated before being | |||
utilized. | utilized. | |||
1.3. Path MTU Discovery for Datagram Services | 1.3. Path MTU Discovery for Datagram Services | |||
Section 5 of this document presents a set of algorithms for datagram | Section 5 of this document presents a set of algorithms for datagram | |||
protocols to discover the largest size of unfragmented datagram that | protocols to discover the largest size of unfragmented datagram that | |||
can be sent over a network path. The method described relies on | can be sent over a network path. The method relies upon features of | |||
features of the PL described in Section 3 and applies to transport | the PL described in Section 3 and applies to transport protocols | |||
protocols operating over IPv4 and IPv6. It does not require | operating over IPv4 and IPv6. It does not require cooperation from | |||
cooperation from the lower layers, although it can utilize PTB | the lower layers, although it can utilize PTB messages when these | |||
messages when these received messages are made available to the PL. | received messages are made available to the PL. | |||
The UDP Usage Guidelines [RFC8085] state "an application SHOULD | The message size guidelines in section 3.2 of the UDP Usage | |||
either use the Path MTU information provided by the IP layer or | Guidelines [RFC8085] state "an application SHOULD either use the Path | |||
implement Path MTU Discovery (PMTUD)", but does not provide a | MTU information provided by the IP layer or implement Path MTU | |||
mechanism for discovering the largest size of unfragmented datagram | Discovery (PMTUD)", but does not provide a mechanism for discovering | |||
that can be used on a network path. Prior to this document, PLPMTUD | the largest size of unfragmented datagram that can be used on a | |||
had not been specified for UDP. | network path. The present document updates RFC 8085 to specify this | |||
method in place of PLPMTUD [RFC4821] and provides a mechanism for | ||||
sharing the discovered largest size as the Maximum Packet Size (MPS) | ||||
(see Section 4.4). | ||||
Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the | Section 10.2 of [RFC4821] recommended a PLPMTUD probing method for | |||
Stream Control Transmission Protocol (SCTP). SCTP utilizes probe | the Stream Control Transmission Protocol (SCTP). SCTP utilizes probe | |||
packets consisting of a minimal sized HEARTBEAT chunk bundled with a | packets consisting of a minimal sized HEARTBEAT chunk bundled with a | |||
PAD chunk as defined in [RFC4820], but RFC4821 does not provide a | PAD chunk as defined in [RFC4820]. However, RFC 4821 did not provide | |||
complete specification. The present document provides the details to | a complete specification. The present document replaces this by | |||
complete that specification. | providing a complete specification. | |||
The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | |||
implementations to support Classical PMTUD and states that a DCCP | implementations to support Classical PMTUD and states that a DCCP | |||
sender "MUST maintain the MPS allowed for each active DCCP session". | sender "MUST maintain the MPS allowed for each active DCCP session". | |||
It also defines the current congestion control MPS (CCMPS) supported | It also defines the current congestion control MPS (CCMPS) supported | |||
by a network path. This recommends use of PMTUD, and suggests use of | by a network path. This recommends use of PMTUD, and suggests use of | |||
control packets (DCCP-Sync) as path probe packets, because they do | control packets (DCCP-Sync) as path probe packets, because they do | |||
not risk application data loss. The method defined in this | not risk application data loss. The method defined in this | |||
specification could be used with DCCP. | specification can be used with DCCP. | |||
Section 6 specifies the method for a set of transports, and provides | Section 6 specifies the method for datagram transports and provides | |||
information to enable the implementation of PLPMTUD with other | information to enable the implementation of PLPMTUD with other | |||
datagram transports and applications that use datagram transports. | datagram transports and applications that use datagram transports. | |||
2. Terminology | 2. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown here. | capitals, as shown here. | |||
skipping to change at page 7, line 48 ¶ | skipping to change at page 8, line 24 ¶ | |||
definitions in [RFC1122]. | definitions in [RFC1122]. | |||
Actual PMTU: The Actual PMTU is the PMTU of a network path between a | Actual PMTU: The Actual PMTU is the PMTU of a network path between a | |||
sender PL and a destination PL, which the DPLPMTUD algorithm seeks | sender PL and a destination PL, which the DPLPMTUD algorithm seeks | |||
to determine. | to determine. | |||
Black Hole: A Black Hole is encountered when a sender is unaware | Black Hole: A Black Hole is encountered when a sender is unaware | |||
that packets are not being delivered to the destination end point. | that packets are not being delivered to the destination end point. | |||
Two types of Black Hole are relevant to DPLPMTUD: | Two types of Black Hole are relevant to DPLPMTUD: | |||
Packet Black Hole: Packets encounter a Packet Black Hole when | * Packets encounter a packet Black Hole when packets are not | |||
packets are not delivered to the destination | delivered to the destination endpoint (e.g., when the sender | |||
endpoint (e.g., when the sender transmits | transmits packets of a particular size with a previously known | |||
packets of a particular size with a previously | effective PMTU and they are discarded by the network). | |||
known effective PMTU and they are discarded by | ||||
the network). | ||||
ICMP Black Hole An ICMP Black Hole is encountered when the | ||||
sender is unaware that packets are not | ||||
delivered to the destination endpoint because | ||||
PTB messages are not received by the | ||||
originating PL sender. | ||||
Black holed : Traffic is black-holed when the sender is unaware that | * An ICMP Black Hole is encountered when the sender is unaware | |||
packets are not being delivered. This could be due to a Packet | that packets are not delivered to the destination endpoint | |||
Black Hole or an ICMP Black Hole. | because PTB messages are not received by the originating PL | |||
sender. | ||||
Classical Path MTU Discovery: Classical PMTUD is a process described | Classical Path MTU Discovery: Classical PMTUD is a process described | |||
in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to | in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to | |||
learn the largest size of unfragmented datagram that can be used | learn the largest size of unfragmented packet that can be used | |||
across a network path. | across a network path. | |||
Datagram: A datagram is a transport-layer protocol data unit, | Datagram: A datagram is a transport-layer protocol data unit, | |||
transmitted in the payload of an IP packet. | transmitted in the payload of an IP packet. | |||
Effective PMTU: The Effective PMTU is the current estimated value | Effective PMTU: The Effective PMTU is the current estimated value | |||
for PMTU that is used by a PMTUD. This is equivalent to the | for PMTU that is used by a PMTUD. This is equivalent to the | |||
PLPMTU derived by PLPMTUD. | PLPMTU derived by PLPMTUD. | |||
EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | |||
skipping to change at page 8, line 50 ¶ | skipping to change at page 9, line 18 ¶ | |||
and tunnels. | and tunnels. | |||
Link MTU: The Link Maximum Transmission Unit (MTU) is the size in | Link MTU: The Link Maximum Transmission Unit (MTU) is the size in | |||
bytes of the largest IP packet, including the IP header and | bytes of the largest IP packet, including the IP header and | |||
payload, that can be transmitted over a link. Note that this | payload, that can be transmitted over a link. Note that this | |||
could more properly be called the IP MTU, to be consistent with | could more properly be called the IP MTU, to be consistent with | |||
how other standards organizations use the acronym. This includes | how other standards organizations use the acronym. This includes | |||
the IP header, but excludes link layer headers and other framing | the IP header, but excludes link layer headers and other framing | |||
that is not part of IP or the IP payload. Other standards | that is not part of IP or the IP payload. Other standards | |||
organizations generally define the link MTU to include the link | organizations generally define the link MTU to include the link | |||
layer headers. | layer headers. This specification continues the requirement in | |||
[RFC4821], that states "All links MUST enforce their MTU: links | ||||
that might non-deterministically deliver packets that are larger | ||||
than their rated MTU MUST consistently discard such packets." | ||||
MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU that DPLPMTUD | MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU that DPLPMTUD | |||
will attempt to use. | will attempt to use. | |||
MPS: The Maximum Packet Size (MPS) is the largest size of | MPS: The Maximum Packet Size (MPS) is the largest size of | |||
application data block that can be sent across a network path by a | application data block that can be sent across a network path by a | |||
PL. In DPLPMTUD this quantity is derived from the PLPMTU by | PL. In DPLPMTUD this quantity is derived from the PLPMTU by | |||
taking into consideration the size of the lower protocol layer | taking into consideration the size of the lower protocol layer | |||
headers. Probe packets generated by DPLPMTUD can have a size | headers. Probe packets generated by DPLPMTUD can have a size | |||
larger than the MPS. | larger than the MPS. | |||
MIN_PMTU: The MIN_PMTU is the smallest size of PLPMTU that DPLPMTUD | MIN_PMTU: The MIN_PMTU is the smallest size of PLPMTU that DPLPMTUD | |||
will attempt to use. | will attempt to use. | |||
Packet: A Packet is the IP header plus the IP payload. | Packet: A Packet is the IP header plus the IP payload. | |||
Packetization Layer (PL): The Packetization Layer (PL) is the layer | Packetization Layer (PL): The Packetization Layer (PL) is a layer of | |||
of the network stack that places data into packets and performs | the network stack that places data into packets and performs | |||
transport protocol functions. | transport protocol functions. Examples of a PL include: TCP, | |||
SCTP, SCTP over DTLS or QUIC. | ||||
Path: The Path is the set of links and routers traversed by a packet | Path: The Path is the set of links and routers traversed by a packet | |||
between a source node and a destination node by a particular flow. | between a source node and a destination node by a particular flow. | |||
Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU | Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU | |||
of all the links forming a network path between a source node and | of all the links forming a network path between a source node and | |||
a destination node. | a destination node. | |||
PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB | PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB | |||
message that indicates next hop link MTU of a router along the | message that indicates next hop link MTU of a router along the | |||
skipping to change at page 9, line 49 ¶ | skipping to change at page 10, line 19 ¶ | |||
method described in this document for datagram PLs, which is an | method described in this document for datagram PLs, which is an | |||
extension to Classical PMTU Discovery. | extension to Classical PMTU Discovery. | |||
Probe packet: A probe packet is a datagram sent with a purposely | Probe packet: A probe packet is a datagram sent with a purposely | |||
chosen size (typically the current PLPMTU or larger) to detect if | chosen size (typically the current PLPMTU or larger) to detect if | |||
packets of this size can be successfully sent end-to-end across | packets of this size can be successfully sent end-to-end across | |||
the network path. | the network path. | |||
3. Features Required to Provide Datagram PLPMTUD | 3. Features Required to Provide Datagram PLPMTUD | |||
TCP PLPMTUD has been defined using standard TCP protocol mechanisms. | The principles expressed in [RFC4821] apply to the use of the | |||
All of the requirements in [RFC4821] also apply to the use of the | technique with any PL. TCP PLPMTUD has been defined using standard | |||
technique with a datagram PL. Unlike TCP, some datagram PLs require | TCP protocol mechanisms. Unlike TCP, datagram PLs require additional | |||
additional mechanisms to implement PLPMTUD. | mechanisms and considerations to implement PLPMTUD. | |||
There are eight requirements for performing the datagram PLPMTUD | The requirements for datagram PLPMTUD are: | |||
method described in this specification: | ||||
1. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide | 1. PLPMTU: The PLPMTU (specified as the effective PMTU in Section 1 | |||
information about the maximum size of packet that can be | of [RFC1191]) is equivalent to the EMTU_S (specified in | |||
transmitted by the sender on the local link (the local Link MTU). | [RFC1122]). For datagram PLs, the PLPMTU is managed by | |||
It MAY utilize similar information about the receiver when this | DPLPMTUD. A PL MUST NOT send a packet (other than a probe | |||
is supplied (note this could be less than EMTU_R). This avoids | packet) with a size larger than the current PLPMTU at the | |||
implementations trying to send probe packets that can not be | network layer. | |||
transmitted by the local link. Too high of a value could reduce | ||||
the efficiency of the search algorithm. Some applications also | ||||
have a maximum transport protocol data unit (PDU) size, in which | ||||
case there is no benefit from probing for a size larger than this | ||||
(unless a transport allows multiplexing multiple applications | ||||
PDUs into the same datagram). | ||||
2. PLPMTU: A datagram application using a PL not supporting | 2. Probe packets: On request, a DPLPMTUD sender is REQUIRED to be | |||
fragmentation is REQUIRED to be able to choose the size of | able to transmit a packet larger than the PLPMTU. This is used | |||
datagrams sent to the network, up to the PLPMTU, or a smaller | to send a probe packet. In IPv4, a probe packet MUST be sent | |||
value (such as the MPS) derived from this. This value is managed | with the Don't Fragment (DF) bit set in the IP header, and | |||
by the DPLPMTUD method. The PLPMTU (specified as the effective | without network layer endpoint fragmentation. In IPv6, a probe | |||
PMTU in Section 1 of [RFC1191]) is equivalent to the EMTU_S | packet is always sent without source fragmentation (as specified | |||
(specified in [RFC1122]). | in section 5.4 of [RFC8201]). | |||
3. Probe packets: On request, a DPLPMTUD sender is REQUIRED to be | 3. Reception feedback: The destination PL endpoint is REQUIRED to | |||
able to transmit a packet larger than the PLPMTU. This is used | provide a feedback method that indicates to the DPLPMTUD sender | |||
to send a probe packet. In IPv4, a probe packet MUST be sent | when a probe packet has been received by the destination PL | |||
with the Don't Fragment (DF) bit set in the IP header, and | endpoint. | |||
without network layer endpoint fragmentation. In IPv6, a probe | ||||
packet is always sent without source fragmentation (as specified | ||||
in section 5.4 of [RFC8201]). | ||||
4. Processing PTB messages: A DPLPMTUD sender MAY optionally utilize | 4. Probe loss recovery: It is RECOMMENDED to use probe packets that | |||
PTB messages received from the network layer to help identify | do not carry any user data that would require retransmission if | |||
when a network path does not support the current size of probe | lost. Most datagram transports permit this. If a probe packet | |||
packet. Any received PTB message MUST be validated before it is | contains user data requiring retransmission in case of loss, the | |||
used to update the PLPMTU discovery information [RFC8201]. This | PL (or layers above) are REQUIRED to arrange any retransmission/ | |||
validation confirms that the PTB message was sent in response to | repair of any resulting loss. The PL is REQUIRED to be robust | |||
a packet originating from the sender, and needs to be performed | in the case where probe packets are lost due to other reasons | |||
before the PLPMTU discovery method reacts to the PTB message. A | (including link transmission error, congestion). | |||
PTB message MUST NOT be used to increase the PLPMTU [RFC8201]. | ||||
5. Reception feedback: The destination PL endpoint is REQUIRED to | 5. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to utilise | |||
provide a feedback method that indicates to the DPLPMTUD sender | information about the maximum size of packet that can be | |||
when a probe packet has been received by the destination PL | transmitted by the sender on the local link (e.g., the local | |||
endpoint. The mechanism needs to be robust to the possibility | Link MTU). It MAY utilize similar information about the | |||
that packets could be significantly delayed along a network path. | receiver when this is supplied (note this could be less than | |||
EMTU_R). This avoids implementations trying to send probe | ||||
packets that can not be transmitted by the local link. Too high | ||||
of a value could reduce the efficiency of the search algorithm. | ||||
Some applications also have a maximum transport protocol data | ||||
unit (PDU) size, in which case there is no benefit from probing | ||||
for a size larger than this (unless a transport allows | ||||
multiplexing multiple applications PDUs into the same datagram). | ||||
The local PL endpoint at the sending node is REQUIRED to pass | 6. Processing PTB messages: A DPLPMTUD sender MAY optionally | |||
this feedback to the sender DPLPMTUD method. | utilize PTB messages received from the network layer to help | |||
identify when a network path does not support the current size | ||||
of probe packet. Any received PTB message MUST be validated | ||||
before it is used to update the PLPMTU discovery information | ||||
[RFC8201]. This validation confirms that the PTB message was | ||||
sent in response to a packet originating from the sender, and | ||||
needs to be performed before the PLPMTU discovery method reacts | ||||
to the PTB message. A PTB message MUST NOT be used to increase | ||||
the PLPMTU [RFC8201], but could trigger a probe to test for a | ||||
larger PLPMTU. A PTB_SIZE greater than the currently probed | ||||
size MUST be ignored. | ||||
6. Probe loss recovery: It is RECOMMENDED to use probe packets that | 7. Probing and congestion control: The decision about when to send | |||
do not carry any user data that would require retransmission if | a probe packet does not need to be limited by the congestion | |||
lost. Most datagram transports permit this. If a probe packet | controller. When not controlled by the congestion controller, | |||
contains user data requiring retransmission in case of loss, the | the interval between probe packets MUST be at least one RTT. If | |||
PL (or layers above) are REQUIRED to arrange any retransmission/ | transmission of probe packets is limited by the congestion | |||
repair of any resulting loss. DPLPMTUD is REQUIRED to be robust | controller, this could result in transmission of probe packets | |||
in the case where probe packets are lost due to other reasons | being delayed. | |||
(including link transmission error, congestion). | ||||
7. Probing and congestion control: The DPLPMTUD sender treats | 8. Loss of a probe packet SHOULD NOT be treated as an indication of | |||
isolated loss of a probe packet (with or without a corresponding | congestion and SHOULD NOT trigger a congestion control reaction | |||
PTB message) as a potential indication of a PMTU limit for the | [RFC4821], because this could result in unnecessary reduction of | |||
path. Loss of a probe packet SHOULD NOT be treated as an | the sending rate. | |||
indication of congestion. The loss of a probe packet SHOULD NOT | ||||
directly trigger a congestion control reaction [RFC4821] because | ||||
this could result in unnecessary reduction of the sending rate. | ||||
The interval between probe packets MUST be at least one RTT. | ||||
8. Shared PLPMTU state: The PLPMTU value MAY also be stored with the | 9. An update to the PLPMTU (or MPS) MUST NOT modify the congestion | |||
corresponding entry associated with the destination in the IP | window measured in bytes [RFC4821]. Therefore, an increase in | |||
layer cache, and used by other PL instances. The specification | the packet size does not cause an increase in the data rate in | |||
of PLPMTUD [RFC4821] states: "If PLPMTUD updates the MTU for a | bytes per second. | |||
particular path, all Packetization Layer sessions that share the | ||||
path representation (as described in Section 5.2 of [RFC4821]) | 10. Probing and flow control: Flow control at the PL concerns the | |||
SHOULD be notified to make use of the new MTU". Such methods | end-to-end flow of data using the PL service. This does not | |||
MUST be robust to the wide variety of underlying network | apply to DPLPMTUD when probe packets use a design that does not | |||
forwarding behaviors. Section 5.2 of [RFC8201] provides guidance | carry user data to the remote application. | |||
on the caching of PMTU information and also the relation to IPv6 | ||||
flow labels. | 11. Shared PLPMTU state: The PLPMTU value MAY also be stored with | |||
the corresponding entry associated with the destination in the | ||||
IP layer cache, and used by other PL instances. The | ||||
specification of PLPMTUD [RFC4821] states: "If PLPMTUD updates | ||||
the MTU for a particular path, all Packetization Layer sessions | ||||
that share the path representation (as described in Section 5.2 | ||||
of [RFC4821]) SHOULD be notified to make use of the new MTU". | ||||
Such methods MUST be robust to the wide variety of underlying | ||||
network forwarding behaviors. Section 5.2 of [RFC8201] provides | ||||
guidance on the caching of PMTU information and also the | ||||
relation to IPv6 flow labels. | ||||
In addition, the following principles are stated for design of a | In addition, the following principles are stated for design of a | |||
DPLPMTUD method: | DPLPMTUD method: | |||
* MPS: A method is REQUIRED to signal an appropriate MPS to the | * Maximum Packet Size (MPS): A PL MAY be designed to segment data | |||
higher layer using the PL. The value of the MPS can change | blocks larger than the MPS into multiple datagrams. However, not | |||
following a change to the path. It is RECOMMENDED that methods | all datagram PLs support segmentation of data blocks. It is | |||
avoid forcing an application to use an arbitrarily small MPS | RECOMMENDED that methods avoid forcing an application to use an | |||
(PLPMTU) for transmission while the method is searching for the | arbitrarily small MPS for transmission while the method is searching | |||
currently supported PLPMTU. Datagram PLs do not necessarily | for the currently supported PLPMTU. A reduced MPS can adversely | |||
support fragmentation of PDUs larger than the PLPMTU. A reduced | impact the performance of an application. | |||
MPS can adversely impact the performance of a datagram | ||||
application. | * To assist applications in choosing a suitable data block size, the | |||
PL is RECOMMENDED to provide a primitive that returns the MPS | ||||
derived from the PLPMTU to the higher layer using the PL. The | ||||
value of the MPS can change following a change in the path, or | ||||
loss of probe packets. | ||||
* Path validation: It is RECOMMENDED that methods are robust to path | * Path validation: It is RECOMMENDED that methods are robust to path | |||
changes that could have occurred since the path characteristics | changes that could have occurred since the path characteristics | |||
were last confirmed, and to the possibility of inconsistent path | were last confirmed, and to the possibility of inconsistent path | |||
information being received. | information being received. | |||
* Datagram reordering: A method is REQUIRED to be robust to the | * Datagram reordering: A method is REQUIRED to be robust to the | |||
possibility that a flow encounters reordering, or the traffic | possibility that a flow encounters reordering, or the traffic | |||
(including probe packets) is divided over more than one network | (including probe packets) is divided over more than one network | |||
path. | path. | |||
* Datagram delay and duplication: The feedback mechanism is REQUIRED | ||||
to be robust to the possibility that packets could be | ||||
significantly delayed or duplicated along a network path. | ||||
* When to probe: It is RECOMMENDED that methods determine whether | * When to probe: It is RECOMMENDED that methods determine whether | |||
the path has changed since it last measured the path. This can | the path has changed since it last measured the path. This can | |||
help determine when to probe the path again. | help determine when to probe the path again. | |||
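The PTB-processing requirement in the list above can be illustrated with a minimal sketch. It encodes only the rules stated in the text: an unvalidated PTB message is never used, a PTB message never increases the PLPMTU, and a PTB_SIZE greater than the currently probed size is ignored. The names `PtbAction` and `classify_ptb` are hypothetical, validation is assumed to happen elsewhere, and treating a PTB_SIZE between the PLPMTU and the probed size as a hint for later probing is an assumption made for illustration.

```python
from enum import Enum


class PtbAction(Enum):
    """Hypothetical outcomes of processing a PTB message."""
    IGNORE = "ignore"   # do not use the PTB message at all
    REDUCE = "reduce"   # path may no longer support the current PLPMTU
    HINT = "hint"       # PLPMTU unchanged; value may guide a later probe


def classify_ptb(ptb_size: int, validated: bool,
                 plpmtu: int, probed_size: int) -> PtbAction:
    """Classify a PTB message without ever increasing the PLPMTU."""
    if not validated:
        # A PTB message MUST be validated before it is used.
        return PtbAction.IGNORE
    if ptb_size > probed_size:
        # A PTB_SIZE greater than the currently probed size is ignored.
        return PtbAction.IGNORE
    if ptb_size < plpmtu:
        # Indicates the path does not support the current PLPMTU.
        return PtbAction.REDUCE
    # Assumption: the PLPMTU is left as-is; the caller may choose
    # to probe again, but the PTB message itself raises nothing.
    return PtbAction.HINT
```

None of the three outcomes raises the PLPMTU. Any increase would still have to come from an acknowledged probe packet.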
4. DPLPMTUD Mechanisms | 4. DPLPMTUD Mechanisms | |||
This section lists the protocol mechanisms used in this | This section lists the protocol mechanisms used in this | |||
specification. | specification. | |||
4.1. PLPMTU Probe Packets | 4.1. PLPMTU Probe Packets | |||
The DPLPMTUD method relies upon the PL sender being able to generate | The DPLPMTUD method relies upon the PL sender being able to generate | |||
probe packets with a specific size. TCP is able to generate these | probe packets with a specific size. TCP is able to generate these | |||
probe packets by choosing to appropriately segment data being sent | probe packets by choosing to appropriately segment data being sent | |||
[RFC4821]. In contrast, a datagram PL that needs to construct a | [RFC4821]. In contrast, a datagram PL that constructs a probe packet | |||
probe packet has to either request an application to send a data | has to either request an application to send a data block that is | |||
block that is larger than that generated by an application, or to | larger than that generated by an application, or to utilize padding | |||
utilize padding functions to extend a datagram beyond the size of the | functions to extend a datagram beyond the size of the application | |||
application data block. Protocols that permit exchange of control | data block. Protocols that permit exchange of control messages | |||
messages (without an application data block) MAY prefer to generate a | (without an application data block) can generate a probe packet by | |||
probe packet by extending a control message with padding data. | extending a control message with padding data. | |||
A receiver is REQUIRED to be able to distinguish an in-band data | A receiver is REQUIRED to be able to distinguish an in-band data | |||
block from any added padding. This is needed to ensure that any | block from any added padding. This is needed to ensure that any | |||
added padding is not passed on to an application at the receiver. | added padding is not passed on to an application at the receiver. | |||
This results in three possible ways that a sender can create a probe | This results in three possible ways that a sender can create a probe | |||
packet: | packet: | |||
Probing using padding data: A probe packet that contains only | Probing using padding data: A probe packet that contains only | |||
control information together with any padding, which is needed to | control information together with any padding, which is needed to | |||
be inflated to the size required for the probe packet. Since | be inflated to the size of the probe packet. Since these probe | |||
these probe packets do not carry an application-supplied data | packets do not carry an application-supplied data block, they do | |||
block, they do not typically require retransmission, although they | not typically require retransmission, although they do still | |||
do still consume network capacity and incur endpoint processing. | consume network capacity and incur endpoint processing. | |||
Probing using application data and padding | Probing using application data and padding | |||
data: A probe packet that | data: A probe packet that | |||
contains a data block supplied by an application that is combined | contains a data block supplied by an application that is combined | |||
with padding to inflate the length of the datagram to the size | with padding to inflate the length of the datagram to the size of | |||
required for the probe packet. If the application/transport needs | the probe packet. If the application/transport needs protection | |||
protection from the loss of this probe packet, the application/ | from the loss of this probe packet, the application/transport | |||
transport could perform transport-layer retransmission/repair of | could perform transport-layer retransmission/repair of the data | |||
the data block (e.g., by retransmission after loss is detected or | block (e.g., by retransmission after loss is detected or by | |||
by duplicating the data block in a datagram without the padding | duplicating the data block in a datagram without the padding | |||
data). | data). | |||
Probing using application data: A probe packet that contains a data | Probing using application data: A probe packet that contains a data | |||
block supplied by an application that matches the size required | block supplied by an application that matches the size of the | |||
for the probe packet. This method requests the application to | probe packet. This method requests the application to issue a | |||
issue a data block of the desired probe size. If the application/ | data block of the desired probe size. If the application/ | |||
transport needs protection from the loss of an unsuccessful probe | transport needs protection from the loss of an unsuccessful probe | |||
packet, the application/transport needs then to perform transport- | packet, the application/transport needs then to perform transport- | |||
layer retransmission/repair of the data block (e.g., by | layer retransmission/repair of the data block (e.g., by | |||
retransmission after loss is detected). | retransmission after loss is detected). | |||
A PL that uses a probe packet carrying an application data block, | A PL that uses a probe packet carrying an application data block, | |||
could need to retransmit this application data block if the probe | could need to retransmit this application data block if the probe | |||
fails. This could need the PL to re-fragment the data block to a | fails, possibly using a smaller PLPMTU. This could need the PL to | |||
smaller packet size that is expected to traverse the end-to-end path | use a smaller packet size to traverse the end-to-end path (which | |||
(which could utilize endpoint network-layer or PL fragmentation when | could utilize endpoint network-layer or a PL that can re-segment the | |||
these are available). | data block into multiple datagrams). | |||
DPLPMTUD MAY choose to use only one of these methods to simplify the | DPLPMTUD MAY choose to use only one of these methods to simplify the | |||
implementation. | implementation. | |||
Probe messages sent by a PL MUST contain enough information to | Probe messages sent by a PL MUST contain enough information to | |||
uniquely identify the probe within Maximum Segment Lifetime, while | uniquely identify the probe within Maximum Segment Lifetime, while | |||
being robust to reordering and replay of probe response and PTB | being robust to reordering and replay of probe response and PTB | |||
messages. | messages. | |||
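As an illustration of probing using padding (optionally with an application data block), the sketch below builds a probe payload of a chosen size and lets the receiver separate the data block from the padding. The wire layout (a 32-bit probe token followed by a 16-bit data length) and the names `build_probe` and `parse_probe` are assumptions invented for this example and are not defined by the specification. Here `payload_size` is the datagram payload size, so IP and PL header sizes would need to be subtracted from the target probe size by the caller.

```python
import struct

# Hypothetical probe header: 32-bit token identifying the probe,
# then a 16-bit length of the in-band data block that follows.
_HEADER = struct.Struct("!IH")


def build_probe(token: int, payload_size: int, data: bytes = b"") -> bytes:
    """Return a payload of exactly payload_size bytes, padded with zeros."""
    needed = _HEADER.size + len(data)
    if payload_size < needed:
        raise ValueError("payload_size too small for header and data")
    padding = bytes(payload_size - needed)
    return _HEADER.pack(token, len(data)) + data + padding


def parse_probe(payload: bytes) -> tuple[int, bytes]:
    """Return (token, data); padding is discarded, never passed upward."""
    token, length = _HEADER.unpack_from(payload)
    end = _HEADER.size + length
    if end > len(payload):
        raise ValueError("declared data length exceeds payload")
    return token, payload[_HEADER.size:end]
```

The token lets a sender match an acknowledgment to a specific probe. The explicit length field is one simple way to keep padding from reaching the receiving application.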
4.2. Confirmation of Probed Packet Size | 4.2. Confirmation of Probed Packet Size | |||
The PL needs a method to determine (confirm) when probe packets have | The PL needs a method to determine (confirm) when probe packets have | |||
been successfully received end-to-end across a network path. | been successfully received end-to-end across a network path. | |||
Transport protocols can include end-to-end methods that detect and | Transport protocols can include end-to-end methods that detect and | |||
report reception of specific datagrams that they send (e.g., DCCP and | report reception of specific datagrams that they send (e.g., DCCP and | |||
SCTP provide keep-alive/heartbeat features). When supported, this | SCTP provide keep-alive/heartbeat features). When supported, this | |||
mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of | mechanism MAY also be used by DPLPMTUD to acknowledge reception of a | |||
a probe packet. | probe packet. | |||
A PL that does not acknowledge data reception (e.g., UDP and UDP- | A PL that does not acknowledge data reception (e.g., UDP and UDP- | |||
Lite) is unable itself to detect when the packets that it sends are | Lite) is unable itself to detect when the packets that it sends are | |||
discarded because their size is greater than the actual PMTU. These | discarded because their size is greater than the actual PMTU. These | |||
PLs need to either rely on an application protocol to detect this | PLs need to rely on an application protocol to detect this loss. | |||
loss. | ||||
Section 6 specifies this function for a set of IETF-specified | Section 6 specifies this function for a set of IETF-specified | |||
protocols. | protocols. | |||
4.3. Detection of Unsupported PLPMTU Size, aka Black Hole Detection | 4.3. Black Hole Detection | |||
A PL sender needs to reduce the PLPMTU when it discovers the actual | Black Hole Detection is triggered by an indication that the network | |||
PMTU supported by a network path is less than the PLPMTU. This can | path could be unable to support the current PLPMTU size. | |||
be triggered when a validated PTB message is received, or by another | ||||
event that indicates the network path no longer sustains the current | ||||
packet size, such as a loss report from the PL, or repeated lack of | ||||
response to probe packets sent to confirm the PLPMTU. Detection is | ||||
followed by a reduction of the PLPMTU. | ||||
This is performed by sending packet probes of size PLPMTU to verify | There are three ways to detect black holes: | |||
that a network path still supports the last acknowledged PLPMTU size. | ||||
There are two alternative mechanism: | ||||
* A PL can rely upon a mechanism implemented within the PL to detect | * A validated PTB message can be received that indicates a PTB_SIZE | |||
excessive loss of data sent with a specific packet size and then | less than the current PLPMTU. A DPLPMTUD method MUST NOT rely | |||
conclude that this excessive loss could be a result of an invalid | solely on this method. | |||
PMTU (as in PLPMTUD for TCP [RFC4821]). | ||||
* A PL can use the DPLPMTUD probing mechanism to periodically | * A PL can use the DPLPMTUD probing mechanism to periodically | |||
generate probe packets of the size of the current PLPMTU (e.g., | generate probe packets of the size of the current PLPMTU (e.g., | |||
using the confirmation timer Section 5.1.1). A timer tracks | using the confirmation timer Section 5.1.1). A timer tracks | |||
whether acknowledgments are received. Successive loss of probes | whether acknowledgments are received. Successive loss of probes | |||
is an indication that the current path no longer supports the | is an indication that the current path no longer supports the | |||
PLPMTU (e.g., when the number of probe packets sent without | PLPMTU (e.g., when the number of probe packets sent without | |||
receiving an acknowledgement, PROBE_COUNT, becomes greater than | receiving an acknowledgement, PROBE_COUNT, becomes greater than | |||
MAX_PROBES). | MAX_PROBES). | |||
* A PL can utilise an event that indicates the network path no | ||||
longer sustains the sender's PLPMTU size. This could use a | ||||
mechanism implemented within the PL to detect excessive loss of | ||||
data sent with a specific packet size and then conclude that this | ||||
excessive loss could be a result of an invalid PLPMTU (as in | ||||
PLPMTUD for TCP [RFC4821]). | ||||
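The probe-based detection described in the list above can be sketched as a small counter. This is a minimal illustration, assuming the caller runs the confirmation timer and reports each expiry and each acknowledgment. The class name `ConfirmationTracker` and its methods are hypothetical, and the value of MAX_PROBES is left as a parameter because it is not given in this section.

```python
class ConfirmationTracker:
    """Counts probes of size PLPMTU sent without an acknowledgment."""

    def __init__(self, max_probes: int) -> None:
        self.max_probes = max_probes  # MAX_PROBES (value chosen by caller)
        self.probe_count = 0          # PROBE_COUNT

    def on_probe_acked(self) -> None:
        """An acknowledged probe confirms the path still supports PLPMTU."""
        self.probe_count = 0

    def on_probe_timeout(self) -> bool:
        """Record an unacknowledged probe.

        Returns True when PROBE_COUNT becomes greater than MAX_PROBES,
        which the text treats as an indication of a black hole.
        """
        self.probe_count += 1
        return self.probe_count > self.max_probes
```

For example, with `max_probes=3` the fourth consecutive timeout is the first to report a black hole, and any acknowledgment in between resets the count.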
A PL MAY inhibit sending probe packets when no application data has | A PL MAY inhibit sending probe packets when no application data has | |||
been sent since the previous probe packet. A PL preferring to use an | been sent since the previous probe packet. A PL preferring to use an | |||
up-to-date PLPMTU once user data is sent again, MAY choose to | up-to-date PLPMTU once user data is sent again, MAY choose to | |||
continue PLPMTU discovery for each path. However, this may result in | continue PLPMTU discovery for each path. However, this could result | |||
additional packets being sent. | in additional packets being sent. | |||
When the method detects the current PLPMTU is not supported, DPLPMTUD | When the method detects the current PLPMTU is not supported, DPLPMTUD | |||
sets a lower MPS. The PL then confirms that the updated PLPMTU can | sets a lower PLPMTU, and sets a lower MPS. The PL then confirms that | |||
be successfully used across the path. The PL could need to send a | the new PLPMTU can be successfully used across the path. A probe | |||
probe packet with a size less than the size of the data block | packet could need to have a size less than the size of the data block | |||
generated by an application. In this case, the PL could provide a | generated by the application. | |||
way to fragment a datagram at the PL, or use a control packet as the | ||||
packet probe. | ||||
4.4. Disabling the Effect of PMTUD | 4.4. The Maximum Packet Size (MPS) | |||
The result of probing determines a usable PLPMTU, which is used to | ||||
set the MPS used by the application. The MPS is smaller than the | ||||
PLPMTU because of the presence of PL headers and any IP options or | ||||
extensions added to the PL packet. The relationship between the MPS | ||||
and the PLPMTU is illustrated in Figure 1. | ||||
any additional | ||||
headers .--- MPS -----. | ||||
| | | | ||||
v v v | ||||
+------------------------------+ | ||||
| IP | ** | PL | protocol data | | ||||
+------------------------------+ | ||||
<---------- PLPMTU ------------> | ||||
Figure 1: Relationship between MPS and PLPMTU | ||||
A PL is unable to send a packet (other than a probe packet) with a | ||||
size larger than the current PLPMTU at the network layer. To avoid | ||||
this, a PL MAY be designed to segment data blocks larger than the MPS | ||||
into multiple datagrams. | ||||
DPLPMTUD seeks to avoid IP fragmentation. An attempt to send a data | ||||
block larger than the MPS will therefore fail if a PL is unable to | ||||
segment data. To determine the largest data block that can be sent, | ||||
a PL SHOULD provide applications with a primitive that returns the | ||||
Maximum Packet Size (MPS), derived from the current PLPMTU. | ||||
If DPLPMTUD results in a change to the MPS, the application needs to | ||||
adapt to the new MPS. A particular case can arise when packets have | ||||
been sent with a size less than the MPS and the PLPMTU was | ||||
subsequently reduced. If these packets are lost, the PL MAY segment | ||||
the data using the new MPS. If a PL is unable to re-segment a | ||||
previously sent datagram (e.g., [RFC4960]), then the sender either | ||||
discards the datagram or could perform retransmission using network- | ||||
layer fragmentation to form multiple IP packets not larger than the | ||||
PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is | ||||
preferred over clearing the DF-bit in the IPv4 header. Operational | ||||
experience reveals that IP fragmentation can reduce the reliability | ||||
of Internet communication [I-D.ietf-intarea-frag-fragile], which may | ||||
reduce the success of retransmission. | ||||
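The relationship in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not part of the specification: the function names `mps_from_plpmtu` and `segment` and the header sizes used in the usage note are assumptions chosen for the example.

```python
from typing import List


def mps_from_plpmtu(plpmtu: int, ip_header: int,
                    extra_headers: int, pl_header: int) -> int:
    """Return the MPS: the PLPMTU minus the IP header, any additional
    headers (IP options or extensions), and the PL header (Figure 1)."""
    overhead = ip_header + extra_headers + pl_header
    if overhead >= plpmtu:
        raise ValueError("headers leave no room for protocol data")
    return plpmtu - overhead


def segment(data: bytes, mps: int) -> List[bytes]:
    """Split a data block into pieces no larger than the MPS, as a PL
    that supports segmentation might do instead of relying on IP
    fragmentation."""
    if mps <= 0:
        raise ValueError("MPS must be positive")
    return [data[i:i + mps] for i in range(0, len(data), mps)]
```

For instance, assuming a PLPMTU of 1280 bytes, a 40-byte IPv6 header, no extension headers, and an 8-byte PL header, `mps_from_plpmtu(1280, 40, 0, 8)` gives an MPS of 1232 bytes, and a 3000-byte data block is segmented into three datagrams.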
4.5. Disabling the Effect of PMTUD | ||||
A PL implementing this specification MUST suspend network layer | A PL implementing this specification MUST suspend network layer | |||
processing of outgoing packets that enforces a PMTU | processing of outgoing packets that enforces a PMTU | |||
[RFC1191][RFC8201] for each flow utilising DPLPMTUD, and instead use | [RFC1191][RFC8201] for each flow utilising DPLPMTUD, and instead use | |||
DPLPMTUD to control the size of packets that are sent by a flow. | DPLPMTUD to control the size of packets that are sent by a flow. | |||
This removes the need for the network layer to drop or fragment sent | This removes the need for the network layer to drop or fragment sent | |||
packets that have a size greater than the PMTU. | packets that have a size greater than the PMTU. | |||
4.5. Response to PTB Messages | 4.6. Response to PTB Messages | |||
This method requires the DPLPMTUD sender to validate any received PTB | This method requires the DPLPMTUD sender to validate any received PTB | |||
message before using the PTB information. The response to a PTB | message before using the PTB information. The response to a PTB | |||
message depends on the PTB_SIZE indicated in the PTB message, the | message depends on the PTB_SIZE indicated in the PTB message, the | |||
state of the PLPMTUD state machine, and the IP protocol being used. | state of the PLPMTUD state machine, and the IP protocol being used. | |||
Section 4.5.1 first describes validation for both IPv4 ICMP | Section 4.6.1 first describes validation for both IPv4 ICMP | |||
Unreachable messages (type 3) and ICMPv6 Packet Too Big messages, | Unreachable messages (type 3) and ICMPv6 Packet Too Big messages, | |||
both of which are referred to as PTB messages in this document. | both of which are referred to as PTB messages in this document. | |||
4.5.1. Validation of PTB Messages | 4.6.1. Validation of PTB Messages | |||
This section specifies utilization of PTB messages. | This section specifies utilization of PTB messages. | |||
* A simple implementation MAY ignore received PTB messages and in | * A simple implementation MAY ignore received PTB messages and in | |||
this case the PLPMTU is not updated when a PTB message is | this case the PLPMTU is not updated when a PTB message is | |||
received. | received. | |||
* An implementation that supports PTB messages MUST validate | * An implementation that supports PTB messages MUST validate | |||
messages before they are further processed. | messages before they are further processed. | |||
skipping to change at page 16, line 16 ¶ | skipping to change at page 18, line 8 ¶ | |||
this validation. | this validation. | |||
These checks are intended to provide protection from packets that | These checks are intended to provide protection from packets that | |||
originate from a node that is not on the network path. A PTB message | originate from a node that is not on the network path. A PTB message | |||
that does not complete the validation MUST NOT be further utilized by | that does not complete the validation MUST NOT be further utilized by | |||
the DPLPMTUD method. | the DPLPMTUD method. | |||
PTB messages that have been validated MAY be utilized by the DPLPMTUD | PTB messages that have been validated MAY be utilized by the DPLPMTUD | |||
algorithm, but MUST NOT be used directly to set the PLPMTU. A method | algorithm, but MUST NOT be used directly to set the PLPMTU. A method | |||
that utilizes these PTB messages can improve the speed at which | that utilizes these PTB messages can improve the speed at which | |||
the algorithm detects an appropriate PLPMTU, compared to one that | the algorithm detects an appropriate PLPMTU by triggering an | |||
relies solely on probing. Section 4.5.2 describes this processing. | immediate probe for the PTB_SIZE, compared to one that relies solely | |||
on probing using a timer-based search algorithm. Section 4.6.2 | ||||
describes this processing. | ||||
4.5.2. Use of PTB Messages | 4.6.2. Use of PTB Messages | |||
A set of checks are intended to provide protection from a router that | A set of checks are intended to provide protection from a router that | |||
reports an unexpected PTB_SIZE. The PL also needs to check that the | reports an unexpected PTB_SIZE. The PL also needs to check that the | |||
indicated PTB_SIZE is less than the size used by probe packets and | indicated PTB_SIZE is less than the size used by probe packets and at | |||
larger than minimum size accepted. | least the minimum size accepted. | |||
This section provides a summary of how PTB messages can be utilized. | This section provides a summary of how PTB messages can be utilized. | |||
This processing depends on the PTB_SIZE and the current value of a | This processing depends on the PTB_SIZE and the current value of a | |||
set of variables: | set of variables: | |||
PTB_SIZE < MIN_PMTU | PTB_SIZE < MIN_PMTU | |||
* Invalid PTB_SIZE, see Section 4.5.1. | * Invalid PTB_SIZE, see Section 4.6.1. | |||
* PTB message ought to be discarded without further processing | * PTB message ought to be discarded without further processing | |||
(e.g., PLPMTU not modified). | (e.g., PLPMTU not modified). | |||
* The information could be utilized as an input to trigger | * The information could be utilized as an input to trigger | |||
enabling a resilience mode. | enabling a resilience mode. | |||
MIN_PMTU < PTB_SIZE < BASE_PMTU | MIN_PMTU < PTB_SIZE < BASE_PMTU | |||
* A robust PL MAY enter an error state (see Section 5.2) for an | * A robust PL MAY enter an error state (see Section 5.2) for an | |||
IPv4 path when the PTB_SIZE reported in the PTB message is | IPv4 path when the PTB_SIZE reported in the PTB message is | |||
skipping to change at page 17, line 15 ¶ | skipping to change at page 19, line 9 ¶ | |||
PTB_SIZE > PROBED_SIZE | PTB_SIZE > PROBED_SIZE | |||
* Inconsistent network signal. | * Inconsistent network signal. | |||
* PTB message ought to be discarded without further processing | * PTB message ought to be discarded without further processing | |||
(e.g., PLPMTU not modified). | (e.g., PLPMTU not modified). | |||
* The information could be utilized as an input to trigger | * The information could be utilized as an input to trigger | |||
enabling a resilience mode. | enabling a resilience mode. | |||
BASE_PMTU <= PTB_SIZE < PLPMTU | BASE_PMTU <= PTB_SIZE < PLPMTU | |||
* Black Hole Detection is triggered and the PLPMTU ought to be | * This could be an indication of a black hole. The PLPMTU SHOULD | |||
set to BASE_PMTU. | be set to BASE_PMTU (the PLPMTU is reduced to the BASE_PMTU to | |||
avoid unnecessary packet loss when a black hole is | ||||
encountered). | ||||
* The PL could use the PTB_SIZE reported in the PTB message to | * The PL ought to start a search to quickly discover the new | |||
initialize a search algorithm. | PLPMTU. The PTB_SIZE reported in the PTB message can be used | |||
to initialize a search algorithm. | ||||
PLPMTU < PTB_SIZE < PROBED_SIZE | PLPMTU < PTB_SIZE < PROBED_SIZE | |||
* The PLPMTU continues to be valid, but the last PROBED_SIZE | * The PLPMTU continues to be valid, but the last PROBED_SIZE | |||
searched was larger than the actual PMTU. | searched was larger than the actual PMTU. | |||
* The PLPMTU is not updated. | * The PLPMTU is not updated. | |||
* The PL can use the reported PTB_SIZE from the PTB message as | * The PL can use the reported PTB_SIZE from the PTB message as | |||
the next search point when it resumes the search algorithm. | the next search point when it resumes the search algorithm. | |||
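The PTB_SIZE cases listed above can be summarised as a small classification function. The following Python sketch assumes the PTB message has already passed validation; the names `PtbAction` and `classify_ptb` are hypothetical, and boundary values that the cases shown here do not list (for example PTB_SIZE equal to PLPMTU or to PROBED_SIZE) are returned as `NO_RULE` rather than guessed.

```python
from enum import Enum


class PtbAction(Enum):
    DISCARD = "discard, PLPMTU not modified"
    MAY_ENTER_ERROR = "a robust PL may enter an error state"
    FALL_BACK_TO_BASE = "set PLPMTU to BASE_PMTU and start a search"
    NEXT_SEARCH_POINT = "keep PLPMTU, use PTB_SIZE as next search point"
    NO_RULE = "not covered by the cases listed in this section"


def classify_ptb(ptb_size: int, *, min_pmtu: int, base_pmtu: int,
                 plpmtu: int, probed_size: int) -> PtbAction:
    """Map a validated PTB_SIZE onto the cases of this section."""
    if ptb_size < min_pmtu:
        return PtbAction.DISCARD            # invalid PTB_SIZE
    if ptb_size > probed_size:
        return PtbAction.DISCARD            # inconsistent network signal
    if min_pmtu < ptb_size < base_pmtu:
        return PtbAction.MAY_ENTER_ERROR
    if base_pmtu <= ptb_size < plpmtu:
        return PtbAction.FALL_BACK_TO_BASE  # possible black hole
    if plpmtu < ptb_size < probed_size:
        return PtbAction.NEXT_SEARCH_POINT
    return PtbAction.NO_RULE
```

In both discard cases, the text notes that the information could still be used as an input to trigger a resilience mode; that is left to the caller in this sketch.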
5. Datagram Packetization Layer PMTUD | 5. Datagram Packetization Layer PMTUD | |||
This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | |||
be introduced at various points (as indicated with * in the figure | be introduced at various points (as indicated with * in the figure | |||
below) in the IP protocol stack to discover the PLPMTU so that an | below) in the IP protocol stack to discover the PLPMTU so that an | |||
application can utilize an appropriate MPS for the current network | application can utilize an appropriate MPS for the current network | |||
path. DPLPMTUD SHOULD NOT be used by an application if it is already | path. | |||
used in a lower layer. | ||||
DPLPMTUD SHOULD NOT be used by an upper PL or application if it is | ||||
already used in a lower layer; DPLPMTUD SHOULD only be performed once | ||||
between a pair of endpoints. A PL MUST adjust the MPS indicated by | ||||
DPLPMTUD to account for any additional overhead introduced by the PL. | ||||
+----------------------+ | +----------------------+ | |||
| Application* | | | Application* | | |||
+-+-------+----+----+--+ | +-+-------+----+----+--+ | |||
| | | | | | | | | | |||
+---+--+ +--+--+ | +-+---+ | +---+--+ +--+--+ | +-+---+ | |||
| QUIC*| |UDPO*| | |SCTP*| | | QUIC*| |UDPO*| | |SCTP*| | |||
+---+--+ +--+--+ | +--+--+ | +---+--+ +--+--+ | +--+--+ | |||
| | | | | | | | | | | | |||
+-------+--+ | | | | +-------+--+ | | | | |||
| | | | | | | | | | |||
+-+-+--+ | | +-+-+--+ | | |||
| UDP | | | | UDP | | | |||
+---+--+ | | +---+--+ | | |||
| | | | | | |||
+--------------+-----+-+ | +--------------+-----+-+ | |||
| Network Interface | | | Network Interface | | |||
+----------------------+ | +----------------------+ | |||
Figure 1: Examples where DPLPMTUD can be implemented | Figure 2: Examples where DPLPMTUD can be implemented | |||
The central idea of DPLPMTUD is probing by a sender. Probe packets | The central idea of DPLPMTUD is probing by a sender. Probe packets | |||
are sent to find the maximum size of a user message that can be | are sent to find the maximum size of user message that can be | |||
completely transferred across the network path from the sender to the | completely transferred across the network path from the sender to the | |||
destination. | destination. | |||
The following sections identify the components needed for | The following sections identify the components needed for | |||
implementation, provide an overview of the phases of operation, and | implementation, provide an overview of the phases of operation, and | |||
specify the state machine and search algorithm. | specify the state machine and search algorithm. | |||
5.1. DPLPMTUD Components | 5.1. DPLPMTUD Components | |||
This section describes the timers, constants, and variables of | This section describes the timers, constants, and variables of | |||
skipping to change at page 18, line 51 ¶ | skipping to change at page 20, line 51 ¶ | |||
The method utilizes up to three timers: | The method utilizes up to three timers: | |||
PROBE_TIMER: The PROBE_TIMER is configured to expire after a | PROBE_TIMER: The PROBE_TIMER is configured to expire after a | |||
period longer than the maximum time to receive | period longer than the maximum time to receive | |||
an acknowledgment to a probe packet. This value | an acknowledgment to a probe packet. This value | |||
MUST NOT be smaller than 1 second, and SHOULD be | MUST NOT be smaller than 1 second, and SHOULD be | |||
larger than 15 seconds. Guidance on selection | larger than 15 seconds. Guidance on selection | |||
of the timer value is provided in section 3.1.1 | of the timer value is provided in section 3.1.1 | |||
of the UDP Usage Guidelines [RFC8085]. | of the UDP Usage Guidelines [RFC8085]. | |||
If the PL has a path Round Trip Time (RTT) | ||||
estimate and timely acknowledgements, the | ||||
PROBE_TIMER can be derived from the PL RTT | ||||
estimate. | ||||
PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period | PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period | |||
a sender will continue to use the current | a sender will continue to use the current | |||
PLPMTU, after which it re-enters the Search | PLPMTU, after which it re-enters the Search | |||
phase. This timer has a period of 600 seconds, | phase. This timer has a period of 600 seconds, | |||
as recommended by PLPMTUD [RFC4821]. | as recommended by PLPMTUD [RFC4821]. | |||
DPLPMTUD MAY inhibit sending probe packets when | DPLPMTUD MAY inhibit sending probe packets when | |||
no application data has been sent since the | no application data has been sent since the | |||
previous probe packet. A PL preferring to use | previous probe packet. A PL preferring to use | |||
an up-to-date PMTU once user data is sent again | an up-to-date PMTU once user data is sent again | |||
can choose to continue PMTU discovery for each | can choose to continue PMTU discovery for each | |||
path. However, this may result in sending | path. However, this could result in sending | |||
additional packets. | additional packets. | |||
CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST | CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST | |||
NOT be used. For other PLs, the | NOT be used. For other PLs, the | |||
CONFIRMATION_TIMER is configured to the period a | CONFIRMATION_TIMER is configured to the period a | |||
PL sender waits before confirming the current | PL sender waits before confirming the current | |||
PLPMTU is still supported. This is less than | PLPMTU is still supported. This is less than | |||
the PMTU_RAISE_TIMER and used to decrease the | the PMTU_RAISE_TIMER and used to decrease the | |||
PLPMTU (e.g., when a black hole is encountered). | PLPMTU (e.g., when a black hole is encountered). | |||
Confirmation needs to be frequent enough when | Confirmation needs to be frequent enough when | |||
skipping to change at page 19, line 40 ¶ | skipping to change at page 21, line 35 ¶ | |||
black hole extensive amounts of traffic. | black hole extensive amounts of traffic. | |||
Guidance on selection of the timer value is | Guidance on selection of the timer value is | |||
provided in section 3.1.1 of the UDP Usage | provided in section 3.1.1 of the UDP Usage | |||
Guidelines [RFC8085]. | Guidelines [RFC8085]. | |||
DPLPMTUD MAY inhibit sending probe packets when | DPLPMTUD MAY inhibit sending probe packets when | |||
no application data has been sent since the | no application data has been sent since the | |||
previous probe packet. A PL preferring to use | previous probe packet. A PL preferring to use | |||
an up-to-date PMTU once user data is sent again | an up-to-date PMTU once user data is sent again | |||
can choose to continue PMTU discovery for each | can choose to continue PMTU discovery for each | |||
path. However, this may result in sending | path. However, this could result in sending | |||
additional packets. | additional packets. | |||
An implementation could implement the various timers using a single | An implementation could implement the various timers using a single | |||
timer. | timer. | |||
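One way the three timers could share a single underlying timer is to keep a deadline per timer name and arm the one real timer for the earliest deadline. The Python sketch below is only an illustration of that idea: the class `SingleTimer`, its methods, and the helper `check_periods` are hypothetical, and the periods are passed in by the caller rather than fixed here.

```python
from typing import Dict, List, Optional


def check_periods(probe: float, confirmation: float, raise_: float) -> None:
    """Check the constraints stated above: PROBE_TIMER is not smaller
    than 1 second, and CONFIRMATION_TIMER is less than PMTU_RAISE_TIMER."""
    if probe < 1.0:
        raise ValueError("PROBE_TIMER MUST NOT be smaller than 1 second")
    if confirmation >= raise_:
        raise ValueError("CONFIRMATION_TIMER must be below PMTU_RAISE_TIMER")


class SingleTimer:
    """Multiplex several logical timers onto one deadline."""

    def __init__(self) -> None:
        self.deadlines: Dict[str, float] = {}

    def arm(self, name: str, now: float, period: float) -> None:
        self.deadlines[name] = now + period   # (re)start a logical timer

    def cancel(self, name: str) -> None:
        self.deadlines.pop(name, None)

    def next_deadline(self) -> Optional[float]:
        # The one real timer would be armed for this instant.
        return min(self.deadlines.values(), default=None)

    def expired(self, now: float) -> List[str]:
        # Return and clear the logical timers due at 'now', earliest first.
        due = sorted((d, n) for n, d in self.deadlines.items() if d <= now)
        for _, name in due:
            del self.deadlines[name]
        return [name for _, name in due]
```

A caller would arm "PROBE_TIMER" each time a probe is sent, re-arm "PMTU_RAISE_TIMER" with its 600 second period on entering the Search Complete phase, and dispatch on the names returned by `expired()`.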
5.1.2. Constants | 5.1.2. Constants | |||
The following constants are defined: | The following constants are defined: | |||
MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | |||
skipping to change at page 20, line 45 ¶ | skipping to change at page 22, line 41 ¶ | |||
This method utilizes a set of variables: | This method utilizes a set of variables: | |||
PROBED_SIZE: The PROBED_SIZE is the size of the current probe | PROBED_SIZE: The PROBED_SIZE is the size of the current probe | |||
packet. This is a tentative value for the PLPMTU, | packet. This is a tentative value for the PLPMTU, | |||
which is awaiting confirmation by an acknowledgment. | which is awaiting confirmation by an acknowledgment. | |||
PROBE_COUNT: The PROBE_COUNT is a count of the number of successive | PROBE_COUNT: The PROBE_COUNT is a count of the number of successive | |||
unsuccessful probe packets that have been sent. Each | unsuccessful probe packets that have been sent. Each | |||
time a probe packet is acknowledged, the value is set | time a probe packet is acknowledged, the value is set | |||
to zero. | to zero. (Some probe loss is expected while searching, | |||
therefore loss of a single probe is not an indication | ||||
of a PMTU problem.) | ||||
The figure below illustrates the relationship between the packet size | The figure below illustrates the relationship between the packet size | |||
constants and variables at a point of time when the DPLPMTUD | constants and variables at a point of time when the DPLPMTUD | |||
algorithm performs path probing to increase the size of the PLPMTU. | algorithm performs path probing to increase the size of the PLPMTU. | |||
A probe packet has been sent of size PROBED_SIZE. Once this is | A probe packet has been sent of size PROBED_SIZE. Once this is | |||
acknowledged, the PLPMTU will be raised to PROBED_SIZE, allowing the | acknowledged, the PLPMTU will be raised to PROBED_SIZE, allowing the | |||
DPLPMTUD algorithm to further increase PROBED_SIZE towards the actual | DPLPMTUD algorithm to further increase PROBED_SIZE towards the actual | |||
PMTU. | PMTU. | |||
MIN_PMTU MAX_PMTU | MIN_PMTU MAX_PMTU | |||
<--------------------------------------------------> | <--------------------------------------------------> | |||
| | | | | | | | | | |||
v | | v | v | | v | |||
BASE_PMTU | v Actual PMTU | BASE_PMTU | v Actual PMTU | |||
| PROBED_SIZE | | PROBED_SIZE | |||
v | v | |||
PLPMTU | PLPMTU | |||
Figure 2: Relationships between packet size constants and variables | Figure 3: Relationships between packet size constants and variables | |||
5.1.4. Overview of DPLPMTUD Phases | 5.1.4. Overview of DPLPMTUD Phases | |||
This section provides a high-level informative view of the DPLPMTUD | This section provides a high-level informative view of the DPLPMTUD | |||
method, by describing the movement of the method through several | method, by describing the movement of the method through several | |||
phases of operation. More detail is available in the state machine | phases of operation. More detail is available in the state machine | |||
Section 5.2. | Section 5.2. | |||
+------+ | +------+ | |||
+------->| Base |----------------+ Connectivity | +------->| Base |----------------+ Connectivity | |||
skipping to change at page 21, line 49 ¶ | skipping to change at page 23, line 47 ¶ | |||
| | | | | | | | |||
| Raise | | Search | | Raise | | Search | |||
| timer | | algorithm | | timer | | algorithm | |||
| expired | | completed | | expired | | completed | |||
| | | | | | | | |||
| | v | | | v | |||
| +-----------------+ | | +-----------------+ | |||
+---| Search Complete | | +---| Search Complete | | |||
+-----------------+ | +-----------------+ | |||
Figure 3: DPLPMTUD Phases | Figure 4: DPLPMTUD Phases | |||
Base: The Base Phase confirms connectivity to the remote | Base: The Base Phase confirms connectivity to the remote | |||
peer. This phase is implicit for a connection- | peer using packets of the BASE_PMTU. This phase is | |||
oriented PL (where it can be performed in a PL | implicit for a connection-oriented PL (where it can | |||
connection handshake). A connectionless PL needs | be performed in a PL connection handshake). A | |||
to send an acknowledged probe packet to confirm | connectionless PL sends an acknowledged probe | |||
that the remote peer is reachable. The sender also | packet to confirm that the remote peer is | |||
confirms that BASE_PMTU is supported across the | reachable. The sender also confirms that BASE_PMTU | |||
network path. | is supported across the network path. | |||
A PL that does not wish to support a path with a | A PL that does not wish to support a path with a | |||
PLPMTU less than BASE_PMTU can simplify the phase | PLPMTU less than BASE_PMTU can simplify the phase | |||
into a single step by performing the connectivity | into a single step by performing the connectivity | |||
checks with a probe of the BASE_PMTU size. | checks with a probe of the BASE_PMTU size. | |||
Once confirmed, DPLPMTUD enters the Search Phase. | Once confirmed, DPLPMTUD enters the Search Phase. | |||
If this phase fails to confirm, DPLPMTUD enters the | If this phase fails to confirm, DPLPMTUD enters the | |||
Error Phase. | Error Phase. | |||
Search: The Search Phase utilizes a search algorithm to | Search: The Search Phase utilizes a search algorithm to | |||
send probe packets to seek to increase the PLPMTU. | send probe packets to seek to increase the PLPMTU. | |||
The algorithm concludes when it has found a | The algorithm concludes when it has found a | |||
suitable PLPMTU, by entering the Search Complete | suitable PLPMTU, by entering the Search Complete | |||
Phase. | Phase. | |||
A PL could respond to PTB messages using the PTB to | A PL could respond to PTB messages using the PTB to | |||
advance or terminate the search, see Section 4.5. | advance or terminate the search, see Section 4.6. | |||
Search Complete: The Search Complete Phase is entered when the | Search Complete: The Search Complete Phase is entered when the | |||
PLPMTU is supported across the network path. A PL | PLPMTU is supported across the network path. A PL | |||
can use a CONFIRMATION_TIMER to periodically repeat | can use a CONFIRMATION_TIMER to periodically repeat | |||
a probe packet for the current PLPMTU size. If the | a probe packet for the current PLPMTU size. If the | |||
sender is unable to confirm reachability (e.g., if | sender is unable to confirm reachability (e.g., if | |||
the CONFIRMATION_TIMER expires) or the PL signals a | the CONFIRMATION_TIMER expires) or the PL signals a | |||
lack of reachability, DPLPMTUD enters the Base | lack of reachability, DPLPMTUD enters the Base | |||
phase. | phase. | |||
The PMTU_RAISE_TIMER is used to periodically resume | The PMTU_RAISE_TIMER is used to periodically resume | |||
the search phase to discover if the PLPMTU can be | the search phase to discover if the PLPMTU can be | |||
raised. Black Hole Detection or receipt of a | raised. Black Hole Detection causes the sender to | |||
validated PTB message (see Section 4.5.1) can cause | enter the Base Phase. | |||
the sender to enter the Base Phase. | ||||
Error: The Error Phase is entered when there is | Error: The Error Phase is entered when there is | |||
conflicting or invalid PLPMTU information for the | conflicting or invalid PLPMTU information for the | |||
path (e.g. a failure to support the BASE_PMTU) that | path (e.g. a failure to support the BASE_PMTU) that | |||
causes DPLPMTUD to be unable to progress and the | causes DPLPMTUD to be unable to progress and the | |||
PLPMTU is lowered. | PLPMTU is lowered. | |||
DPLPMTUD remains in the Error Phase until a | DPLPMTUD remains in the Error Phase until a | |||
consistent view of the path can be discovered and | consistent view of the path can be discovered and | |||
it has also been confirmed that the path supports | it has also been confirmed that the path supports | |||
skipping to change at page 23, line 24 ¶ | skipping to change at page 25, line 19 ¶ | |||
A full implementation of DPLPMTUD provides an algorithm enabling the | A full implementation of DPLPMTUD provides an algorithm enabling the | |||
DPLPMTUD sender to increase the PLPMTU following a change in the | DPLPMTUD sender to increase the PLPMTU following a change in the | |||
characteristics of the path, such as when a link is reconfigured with | characteristics of the path, such as when a link is reconfigured with | |||
a larger MTU, or when there is a change in the set of links traversed | a larger MTU, or when there is a change in the set of links traversed | |||
by an end-to-end flow (e.g., after a routing or path fail-over | by an end-to-end flow (e.g., after a routing or path fail-over | |||
decision). | decision). | |||
5.2. State Machine | 5.2. State Machine | |||
A state machine for DPLPMTUD is depicted in Figure 4. If multipath | A state machine for DPLPMTUD is depicted in Figure 5. If multipath | |||
or multihoming is supported, a state machine is needed for each path. | or multihoming is supported, a state machine is needed for each path. | |||
Note: Not all changes are not shown to simplify the diagram. | Note: Not all changes are shown to simplify the diagram. | |||
| | | | | | |||
| Start | PL indicates loss | | Start | PL indicates loss | |||
| | of connectivity | | | of connectivity | |||
v v | v v | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
| DISABLED | | ERROR | | | DISABLED | | ERROR | | |||
+---------------+ PROBE_TIMER expiry: +---------------+ | +---------------+ PROBE_TIMER expiry: +---------------+ | |||
| PL indicates PROBE_COUNT = MAX_PROBES or ^ | | | PL indicates PROBE_COUNT = MAX_PROBES or ^ | | |||
| connectivity PTB: PTB_SIZE < BASE_PMTU | | | | connectivity PTB: PTB_SIZE < BASE_PMTU | | | |||
+--------------------+ +---------------+ | | +--------------------+ +---------------+ | | |||
| | | | | | | | |||
v | BASE_PMTU Probe | | v | BASE_PMTU Probe | | |||
+---------------+ acked | | +---------------+ acked | | |||
| BASE |----------------------+ | | BASE |----------------------+ | |||
+---------------+ | | +---------------+ | | |||
Black hole detected or ^ | ^ ^ Black hole detected or | | ^ | ^ ^ | | |||
PTB: PTB_SIZE < PLPMTU | | | | PTB: PTB_SIZE < PLPMTU | | Black hole detected | | | | Black hole detected | | |||
+--------------------+ | | +--------------------+ | | +--------------------+ | | +--------------------+ | | |||
| +----+ | | | | +----+ | | | |||
| PROBE_TIMER expiry: | | | | PROBE_TIMER expiry: | | | |||
| PROBE_COUNT < MAX_PROBES | | | | PROBE_COUNT < MAX_PROBES | | | |||
| | | | | | | | |||
| PMTU_RAISE_TIMER expiry | | | | PMTU_RAISE_TIMER expiry | | | |||
| +-----------------------------------------+ | | | | +-----------------------------------------+ | | | |||
| | | | | | | | | | | | |||
| | v | v | | | v | v | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
|SEARCH_COMPLETE| | SEARCHING | | |SEARCH_COMPLETE| | SEARCHING | | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
| ^ ^ | | ^ | | ^ ^ | | ^ | |||
| | | | | | | | | | | | | | |||
| | +-----------------------------------------+ | | | | | +-----------------------------------------+ | | | |||
| | MAX_PMTU Probe acked or PROBE_TIMER | | | | | MAX_PMTU Probe acked or | | | |||
| | expiry: PROBE_COUNT = MAX_PROBES or | | | | | PROBE_TIMER expiry: PROBE_COUNT = MAX_PROBES or | | | |||
+----+ PTB: PTB_SIZE = PLPMTU +----+ | +----+ PTB: PTB_SIZE = PLPMTU +----+ | |||
CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: | CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: | |||
PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or | PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or | |||
PLPMTU Probe acked Probe acked or PTB: | PLPMTU Probe acked Probe acked or PTB: | |||
PLPMTU < PTB_SIZE < PROBED_SIZE | PLPMTU < PTB_SIZE < PROBED_SIZE | |||
Figure 4: State machine for Datagram PLPMTUD | Figure 5: State machine for Datagram PLPMTUD | |||
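The main transitions of the state machine figure (as drawn in the newer revision) can be sketched as an event-driven class. This is a simplified, non-normative illustration: the class and method names are hypothetical, the default `max_probes=3` is only an example value, the linear `step` stands in for a search algorithm that is not shown here, resetting the PLPMTU to BASE_PMTU on entry to BASE is an assumption, and PTB handling and the transitions out of ERROR are omitted.

```python
from enum import Enum, auto


class State(Enum):
    DISABLED = auto()
    BASE = auto()
    SEARCHING = auto()
    SEARCH_COMPLETE = auto()
    ERROR = auto()


class Dplpmtud:
    """Illustrative per-path state machine (one instance per path)."""

    def __init__(self, base_pmtu, max_pmtu, max_probes=3, step=32):
        self.base_pmtu, self.max_pmtu = base_pmtu, max_pmtu
        self.max_probes = max_probes   # example value only
        self.step = step               # hypothetical linear search step
        self.state = State.DISABLED
        self.plpmtu = self.probed_size = base_pmtu
        self.probe_count = 0

    def _enter_base(self):
        # On entry to BASE: PROBED_SIZE = BASE_PMTU, PROBE_COUNT = 0.
        self.state = State.BASE
        self.plpmtu = self.base_pmtu   # assumption: fall back to BASE_PMTU
        self.probed_size = self.base_pmtu
        self.probe_count = 0

    def _next_size(self):
        return min(self.plpmtu + self.step, self.max_pmtu)

    def on_connectivity(self):
        if self.state is State.DISABLED:
            self._enter_base()

    def on_loss_of_connectivity(self):
        self.state = State.DISABLED    # entered from any other state

    def on_probe_acked(self):
        self.probe_count = 0
        if self.state is State.BASE:
            self.state = State.SEARCHING
            self.probed_size = self._next_size()
        elif self.state is State.SEARCHING:
            self.plpmtu = self.probed_size
            if self.plpmtu >= self.max_pmtu:
                self.state = State.SEARCH_COMPLETE
            else:
                self.probed_size = self._next_size()

    def on_probe_timer_expiry(self):
        self.probe_count += 1
        if self.probe_count < self.max_probes:
            return                     # caller resends a probe of PROBED_SIZE
        if self.state is State.BASE:
            self.state = State.ERROR
        elif self.state is State.SEARCHING:
            self.state = State.SEARCH_COMPLETE

    def on_raise_timer_expiry(self):
        if self.state is State.SEARCH_COMPLETE:
            self.state = State.SEARCHING
            self.probe_count = 0
            self.probed_size = self._next_size()

    def on_black_hole(self):
        if self.state in (State.SEARCHING, State.SEARCH_COMPLETE):
            self._enter_base()
```

The caller is expected to send the probe packets and run the timers; the class only tracks the state, PLPMTU, PROBED_SIZE, and PROBE_COUNT in response to the events it is told about.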
The following states are defined: | The following states are defined: | |||
DISABLED: The DISABLED state is the initial state before | DISABLED: The DISABLED state is the initial state before | |||
probing has started. It is also entered from any | probing has started. It is also entered from any | |||
other state, when the PL indicates loss of | other state, when the PL indicates loss of | |||
connectivity. This state is left, once the PL | connectivity. This state is left, once the PL | |||
indicates connectivity to the remote PL. | indicates connectivity to the remote PL. | |||
BASE: The BASE state is used to confirm that the | BASE: The BASE state is used to confirm that the | |||
BASE_PMTU size is supported by the network path and | BASE_PMTU size is supported by the network path and | |||
is designed to allow an application to continue | is designed to allow an application to continue | |||
working when there are transient reductions in the | working when there are transient reductions in the | |||
actual PMTU. It also seeks to avoid long periods | actual PMTU. It also seeks to avoid long periods | |||
where traffic is black holed while searching for a | when a sender searching for a larger PLPMTU is | |||
larger PLPMTU. | unaware that packets are not being delivered due to | |||
a packet or ICMP Black Hole. | ||||
On entry, the PROBED_SIZE is set to the BASE_PMTU | On entry, the PROBED_SIZE is set to the BASE_PMTU | |||
size and the PROBE_COUNT is set to zero. | size and the PROBE_COUNT is set to zero. | |||
Each time a probe packet is sent, the PROBE_TIMER | Each time a probe packet is sent, the PROBE_TIMER | |||
is started. The state is exited when the probe | is started. The state is exited when the probe | |||
packet is acknowledged, and the PL sender enters | packet is acknowledged, and the PL sender enters | |||
the SEARCHING state. | the SEARCHING state. | |||
The state is also left when the PROBE_COUNT reaches | The state is also left when the PROBE_COUNT reaches | |||
skipping to change at page 25, line 39 ¶ | skipping to change at page 27, line 40 ¶ | |||
BASE_PMTU was successful. | BASE_PMTU was successful. | |||
Each time a probe packet is acknowledged, the | Each time a probe packet is acknowledged, the | |||
PROBE_COUNT is set to zero, the PLPMTU is set to | PROBE_COUNT is set to zero, the PLPMTU is set to | |||
the PROBED_SIZE and then the PROBED_SIZE is | the PROBED_SIZE and then the PROBED_SIZE is | |||
increased using the search algorithm. | increased using the search algorithm. | |||
When a probe packet is sent and not acknowledged | When a probe packet is sent and not acknowledged | |||
within the period of the PROBE_TIMER, the | within the period of the PROBE_TIMER, the | |||
PROBE_COUNT is incremented and a new probe packet | PROBE_COUNT is incremented and a new probe packet | |||
is transmitted. The state is exited when the | is transmitted. | |||
PROBE_COUNT reaches MAX_PROBES, a received PTB | ||||
message is validated, a probe of size MAX_PMTU is | The state is exited to enter SEARCH_COMPLETE when | |||
acknowledged, or a black hole is detected. | the PROBE_COUNT reaches MAX_PROBES, a validated PTB | |||
is received that corresponds to the last | ||||
successfully probed size (PTB_SIZE = PLPMTU), or a | ||||
probe of size MAX_PMTU is acknowledged (PLPMTU = | ||||
MAX_PMTU). | ||||
When a black hole is detected in the SEARCHING | ||||
state, this causes the PL sender to enter the BASE | ||||
state. | ||||
SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates a successful | SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates a successful | |||
end to the SEARCHING state. DPLPMTUD remains in | end to the SEARCHING state. DPLPMTUD remains in | |||
this state until either the PMTU_RAISE_TIMER | this state until either the PMTU_RAISE_TIMER | |||
expires, a received PTB message is validated, or a | expires or a black hole is detected. | |||
black hole is detected. | ||||
When DPLPMTUD uses an unacknowledged PL and is in | When DPLPMTUD uses an unacknowledged PL and is in | |||
the SEARCH_COMPLETE state, a CONFIRMATION_TIMER | the SEARCH_COMPLETE state, a CONFIRMATION_TIMER | |||
periodically resets the PROBE_COUNT and schedules a | periodically resets the PROBE_COUNT and schedules a | |||
probe packet with the size of the PLPMTU. If | probe packet with the size of the PLPMTU. If | |||
MAX_PROBES successive PLPMTU-sized probes fail to | MAX_PROBES successive PLPMTU-sized probes fail to | |||
be acknowledged, the method enters the BASE state. | be acknowledged, the method enters the BASE state. | |||
When used with an acknowledged PL (e.g., SCTP), | When used with an acknowledged PL (e.g., SCTP), | |||
DPLPMTUD SHOULD NOT continue to generate PLPMTU | DPLPMTUD SHOULD NOT continue to generate PLPMTU | |||
probes in this state. | probes in this state. | |||
skipping to change at page 26, line 23 ¶ | skipping to change at page 28, line 31 ¶ | |||
the network path is not known to support a PLPMTU | the network path is not known to support a PLPMTU | |||
of at least the BASE_PMTU size or when there is | of at least the BASE_PMTU size or when there is | |||
contradictory information about the network path | contradictory information about the network path | |||
that would otherwise result in excessive variation | that would otherwise result in excessive variation | |||
in the MPS signalled to the higher layer. The | in the MPS signalled to the higher layer. The | |||
state implements a method to mitigate oscillation | state implements a method to mitigate oscillation | |||
in the state-event engine. It signals a | in the state-event engine. It signals a | |||
conservative value of the MPS to the higher layer | conservative value of the MPS to the higher layer | |||
by the PL. The state is exited when packet probes | by the PL. The state is exited when packet probes | |||
no longer detect the error or when the PL indicates | no longer detect the error or when the PL indicates | |||
that connectivity has been lost. | that connectivity has been lost. The PL sender | |||
then enters the SEARCHING state. | ||||
Implementations are permitted to enable endpoint | Implementations are permitted to enable endpoint | |||
fragmentation if the DPLPMTUD is unable to validate | fragmentation if the DPLPMTUD is unable to validate | |||
MIN_PMTU within PROBE_COUNT probes. If DPLPMTUD is | MIN_PMTU within PROBE_COUNT probes. If DPLPMTUD is | |||
unable to validate MIN_PMTU the implementation | unable to validate MIN_PMTU the implementation will | |||
should transition to the DISABLED state. | transition to the DISABLED state. | |||
Note: MIN_PMTU may be identical to BASE_PMTU, | Note: MIN_PMTU could be identical to BASE_PMTU, | |||
simplifying the actions in this state. | simplifying the actions in this state. | |||
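The transitions described above can be summarised as a lookup table. The following Python fragment is a minimal sketch, not part of the specification. It follows the -13 (right-hand) column, covers only the transitions visible in this excerpt, and uses hypothetical event names. The transition from SEARCH_COMPLETE back to SEARCHING on expiry of the PMTU_RAISE_TIMER is an assumption, since the excerpt states only that the state is left at that point.

```python
from enum import Enum, auto


class State(Enum):
    DISABLED = auto()
    BASE = auto()
    SEARCHING = auto()
    SEARCH_COMPLETE = auto()


class Event(Enum):
    # Event names are hypothetical labels for the conditions in the text.
    CONNECTIVITY = auto()
    CONNECTIVITY_LOST = auto()
    BASE_PROBE_ACKED = auto()
    MAX_PROBES_REACHED = auto()
    MAX_PMTU_ACKED = auto()
    BLACK_HOLE = auto()
    RAISE_TIMER_EXPIRED = auto()


TRANSITIONS = {
    (State.DISABLED, Event.CONNECTIVITY): State.BASE,
    (State.BASE, Event.BASE_PROBE_ACKED): State.SEARCHING,
    (State.SEARCHING, Event.MAX_PROBES_REACHED): State.SEARCH_COMPLETE,
    (State.SEARCHING, Event.MAX_PMTU_ACKED): State.SEARCH_COMPLETE,
    (State.SEARCHING, Event.BLACK_HOLE): State.BASE,
    (State.SEARCH_COMPLETE, Event.BLACK_HOLE): State.BASE,
    # Assumption: the raise timer restarts the search for a larger PLPMTU.
    (State.SEARCH_COMPLETE, Event.RAISE_TIMER_EXPIRED): State.SEARCHING,
}


def next_state(state, event):
    """Return the state entered after `event`; unknown pairs keep the state."""
    # Loss of connectivity leads to DISABLED from any state.
    if event is Event.CONNECTIVITY_LOST:
        return State.DISABLED
    return TRANSITIONS.get((state, event), state)
```

A table keeps the sketch easy to compare against the state descriptions. A real implementation would also carry the PROBE_COUNT, PROBED_SIZE and timers alongside the state.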
5.3. Search to Increase the PLPMTU | 5.3. Search to Increase the PLPMTU | |||
This section describes the algorithms used by DPLPMTUD to search for | This section describes the algorithms used by DPLPMTUD to search for | |||
a larger PLPMTU. | a larger PLPMTU. | |||
5.3.1. Probing for a larger PLPMTU | 5.3.1. Probing for a larger PLPMTU | |||
Implementations use a search algorithm across the search range to | Implementations use a search algorithm across the search range to | |||
skipping to change at page 27, line 6 ¶ | skipping to change at page 29, line 13 ¶ | |||
path. | path. | |||
The method discovers the search range by confirming the minimum | The method discovers the search range by confirming the minimum | |||
PLPMTU and then using the probe method to select a PROBED_SIZE less | PLPMTU and then using the probe method to select a PROBED_SIZE less | |||
than or equal to MAX_PMTU. MAX_PMTU is the minimum of the local MTU | than or equal to MAX_PMTU. MAX_PMTU is the minimum of the local MTU | |||
and EMTU_R (learned from the remote endpoint). The MAX_PMTU MAY be | and EMTU_R (learned from the remote endpoint). The MAX_PMTU MAY be | |||
reduced by an application that sets a maximum to the size of | reduced by an application that sets a maximum to the size of | |||
datagrams it will send. | datagrams it will send. | |||
The PROBE_COUNT is initialized to zero when the first probe with a | The PROBE_COUNT is initialized to zero when the first probe with a | |||
size greater than or equal to PLPMTU is sent. A timer is used by | size greater than or equal to PLPMTU is sent. A timer is used to | |||
the search algorithm to trigger the sending of probe packets of size | trigger the sending of probe packets of size PROBED_SIZE, larger than | |||
PROBED_SIZE, larger than the PLPMTU. Each probe packet successfully | the PLPMTU. Each probe packet successfully sent to the remote peer | |||
sent to the remote peer is confirmed by acknowledgement at the PL, | is confirmed by acknowledgement at the PL, see Section 4.1. | |||
see Section 4.1. | ||||
Each time a probe packet is sent to the destination, the PROBE_TIMER | Each time a probe packet is sent to the destination, the PROBE_TIMER | |||
is started. The timer is canceled when the PL receives | is started. The timer is canceled when the PL receives | |||
acknowledgment that the probe packet has been successfully sent | acknowledgment that the probe packet has been successfully sent | |||
across the path (see Section 4.1). This confirms that the PROBED_SIZE is | across the path (see Section 4.1). This confirms that the PROBED_SIZE is | |||
supported, and the PROBED_SIZE value is then assigned to the PLPMTU. | supported, and the PROBED_SIZE value is then assigned to the PLPMTU. | |||
The search algorithm can continue to send subsequent probe packets of | The search algorithm can continue to send subsequent probe packets of | |||
an increasing size. | an increasing size. | |||
If the timer expires before a probe packet is acknowledged, the probe | If the timer expires before a probe packet is acknowledged, the probe | |||
has failed to confirm the PROBED_SIZE. Each time the PROBE_TIMER | has failed to confirm the PROBED_SIZE. Each time the PROBE_TIMER | |||
expires, the PROBE_COUNT is incremented, the PROBE_TIMER is | expires, the PROBE_COUNT is incremented, the PROBE_TIMER is | |||
reinitialized, and a new probe of the same size or any other size | reinitialized, and a new probe of the same size or any other size | |||
(determined by the search algorithm) can be sent. The maximum number | (determined by the search algorithm) can be sent. The maximum number | |||
of consecutive failed probes is configured (MAX_PROBES). If the | of consecutive failed probes is configured (MAX_PROBES). If the | |||
value of the PROBE_COUNT reaches MAX_PROBES, probing will stop, and | value of the PROBE_COUNT reaches MAX_PROBES, probing will stop, and | |||
the PL sender enters the SEARCH_COMPLETE state. | the PL sender enters the SEARCH_COMPLETE state. | |||
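The handling of acknowledgments and PROBE_TIMER expiry described above might be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the class name, the fixed `step` increment and the default of 3 for MAX_PROBES are hypothetical choices, and timer scheduling is left to the caller.

```python
class ProbeSearch:
    """Tracks PLPMTU, PROBED_SIZE and PROBE_COUNT while searching."""

    def __init__(self, base_pmtu, max_pmtu, max_probes=3, step=32):
        self.plpmtu = base_pmtu          # largest size confirmed so far
        self.max_pmtu = max_pmtu
        self.max_probes = max_probes     # illustrative default
        self.step = step                 # hypothetical fixed increment
        self.probe_count = 0
        self.probed_size = min(base_pmtu + step, max_pmtu)
        self.complete = False

    def on_probe_acked(self):
        # An acknowledged probe confirms PROBED_SIZE, which becomes the PLPMTU.
        self.plpmtu = self.probed_size
        self.probe_count = 0
        if self.plpmtu >= self.max_pmtu:
            self.complete = True         # a probe of size MAX_PMTU was acknowledged
        else:
            self.probed_size = min(self.plpmtu + self.step, self.max_pmtu)

    def on_probe_timeout(self):
        # PROBE_TIMER expired: the probe failed to confirm PROBED_SIZE.
        self.probe_count += 1
        if self.probe_count >= self.max_probes:
            self.complete = True         # enter SEARCH_COMPLETE
```

For example, with `ProbeSearch(1200, 1500, step=100)`, two acknowledged probes followed by three timeouts leave the PLPMTU at 1400 with the search marked complete.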
5.3.2. Selection of Probe Sizes | 5.3.2. Selection of Probe Sizes | |||
The search algorithm needs to determine a minimum useful gain in | The search algorithm determines a minimum useful gain in PLPMTU. It | |||
PLPMTU. It would not be constructive for a PL sender to attempt to | would not be constructive for a PL sender to attempt to probe for all | |||
probe for all sizes. This would incur unnecessary load on the path | sizes. This would incur unnecessary load on the path. | |||
and has the undesirable effect of slowing the time to reach a more | Implementations SHOULD select the set of probe packet sizes to | |||
optimal MPS. Implementations SHOULD select the set of probe packet | maximize the gain in PLPMTU from each search step. | |||
sizes to maximize the gain in PLPMTU from each search step. | ||||
Implementations could optimize the search procedure by selecting step | Implementations could optimize the search procedure by selecting step | |||
sizes from a table of common PMTU sizes. When selecting the | sizes from a table of common PMTU sizes. When selecting the | |||
appropriate next size to search, an implementer ought to also | appropriate next size to search, an implementer ought to also | |||
consider that there can be common sizes of MPS that applications seek | consider that there can be common sizes of MPS that applications seek | |||
to use, and there could be common sizes of MTU used within the | to use, and there could be common sizes of MTU used within the | |||
network. | network. | |||
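A table-driven choice of the next probe size could look like the sketch below. The table contents are illustrative assumptions only (the text does not list any sizes), and the function name is hypothetical.

```python
# Illustrative values only; a deployment would choose its own table.
COMMON_SIZES = (1280, 1400, 1492, 1500, 4352, 9000)


def next_probe_size(plpmtu, max_pmtu, table=COMMON_SIZES):
    """Return the next PROBED_SIZE above `plpmtu`, or None if none remains."""
    for size in sorted(table):
        if plpmtu < size <= max_pmtu:
            return size
    # No table entry fits: probe MAX_PMTU itself if it is still larger.
    if max_pmtu > plpmtu:
        return max_pmtu
    return None
```

Stepping through a short table keeps the number of probes small, at the cost of missing a PMTU that lies between two entries.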
5.3.3. Resilience to Inconsistent Path Information | 5.3.3. Resilience to Inconsistent Path Information | |||
A decision to increase the PLPMTU needs to be resilient to the | A decision to increase the PLPMTU needs to be resilient to the | |||
possibility that information learned about the network path is | possibility that information learned about the network path is | |||
inconsistent. A path is inconsistent, when, for example, probe | inconsistent. A path is inconsistent, when, for example, probe | |||
packets are lost due to other reasons (i.e. not packet size) or due | packets are lost due to other reasons (i.e., not packet size) or due | |||
to frequent path changes. Frequent path changes could occur by | to frequent path changes. Frequent path changes could occur by | |||
unexpected "flapping" - where some packets from a flow pass along one | unexpected "flapping" - where some packets from a flow pass along one | |||
path, but other packets follow a different path with different | path, but other packets follow a different path with different | |||
properties. | properties. | |||
A PL sender is able to detect inconsistency from the sequence of | A PL sender is able to detect inconsistency from the sequence of | |||
PLPMTU probes that it sends or the sequence of PTB messages that it | PLPMTU probes that are acknowledged or the sequence of PTB messages | |||
receives. When inconsistent path information is detected, a PL | that it receives. When inconsistent path information is detected, a | |||
sender could use an alternate search mode that clamps the offered MPS | PL sender could use an alternate search mode that clamps the offered | |||
to a smaller value for a period of time. This avoids unnecessary | MPS to a smaller value for a period of time. This avoids unnecessary | |||
loss of packets due to MTU limitation. | loss of packets. | |||
5.4. Robustness to Inconsistent Paths | 5.4. Robustness to Inconsistent Paths | |||
Some paths could be unable to sustain packets of the BASE_PMTU size. | Some paths could be unable to sustain packets of the BASE_PMTU size. | |||
To be robust to these paths an implementation could implement the | To be robust to these paths an implementation could implement the | |||
Error State. This allows fallback to a smaller than desired PLPMTU, | Error State. This allows fallback to a smaller than desired PLPMTU, | |||
rather than suffer connectivity failure. This could utilize methods | rather than suffer connectivity failure. This could utilize methods | |||
such as endpoint IP fragmentation to enable the PL sender to | such as endpoint IP fragmentation to enable the PL sender to | |||
communicate using packets smaller than the BASE_PMTU. | communicate using packets smaller than the BASE_PMTU. | |||
skipping to change at page 28, line 35 ¶ | skipping to change at page 30, line 44 ¶ | |||
DPLPMTUD requires protocol-specific details to be specified for each | DPLPMTUD requires protocol-specific details to be specified for each | |||
PL that is used. | PL that is used. | |||
The first subsection provides guidance on how to implement the | The first subsection provides guidance on how to implement the | |||
DPLPMTUD method as a part of an application using UDP or UDP-Lite. | DPLPMTUD method as a part of an application using UDP or UDP-Lite. | |||
The guidance also applies to other datagram services that do not | The guidance also applies to other datagram services that do not | |||
include a specific transport protocol (such as a tunnel | include a specific transport protocol (such as a tunnel | |||
encapsulation). The following subsections describe how DPLPMTUD can | encapsulation). The following subsections describe how DPLPMTUD can | |||
be implemented as a part of the transport service, allowing | be implemented as a part of the transport service, allowing | |||
applications using the service to benefit from discovery of the | applications using the service to benefit from discovery of the | |||
PLPMTU without themselves needing to implement this method. | PLPMTU without themselves needing to implement this method when using | |||
SCTP and QUIC. | ||||
6.1. Application support for DPLPMTUD with UDP or UDP-Lite | 6.1. Application support for DPLPMTUD with UDP or UDP-Lite | |||
The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do | The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do | |||
not define a method in the RFC-series that supports PLPMTUD. In | not define a method in the RFC-series that supports PLPMTUD. In | |||
particular, the UDP transport does not provide the transport layer | particular, the UDP transport does not provide the transport features | |||
features needed to implement datagram PLPMTUD. | needed to implement datagram PLPMTUD. | |||
The DPLPMTUD method can be implemented as a part of an application | The DPLPMTUD method can be implemented as a part of an application | |||
built directly or indirectly on UDP or UDP-Lite, but relies on | built directly or indirectly on UDP or UDP-Lite, but relies on | |||
higher-layer protocol features to implement the method [RFC8085]. | higher-layer protocol features to implement the method [RFC8085]. | |||
Some primitives used by DPLPMTUD might not be available via the | Some primitives used by DPLPMTUD might not be available via the | |||
Datagram API (e.g., the ability to access the PLPMTU from the IP | Datagram API (e.g., the ability to access the PLPMTU from the IP | |||
layer cache, or interpret received PTB messages). | layer cache, or interpret received PTB messages). | |||
In addition, it is desirable that PMTU discovery is not performed by | In addition, it is desirable that PMTU discovery is not performed by | |||
skipping to change at page 29, line 26 ¶ | skipping to change at page 31, line 34 ¶ | |||
destination endpoint. The method SHOULD allow the sender to check | destination endpoint. The method SHOULD allow the sender to check | |||
the value returned in the response to provide additional protection | the value returned in the response to provide additional protection | |||
from off-path insertion of data [RFC8085]. Suitable methods include a | from off-path insertion of data [RFC8085]. Suitable methods include a | |||
parameter known only to the two endpoints, such as a session ID or | parameter known only to the two endpoints, such as a session ID or | |||
initialized sequence number. | initialized sequence number. | |||
6.1.2. Application Response | 6.1.2. Application Response | |||
An application needs an application-layer protocol mechanism to | An application needs an application-layer protocol mechanism to | |||
communicate the response from the destination endpoint. This | communicate the response from the destination endpoint. This | |||
response may indicate successful reception of the probe across the | response could indicate successful reception of the probe across the | |||
path, but could also indicate that some (or all packets) have failed | path, but could also indicate that some (or all packets) have failed | |||
to reach the destination. | to reach the destination. | |||
6.1.3. Sending Application Probe Packets | 6.1.3. Sending Application Probe Packets | |||
A probe packet may carry an application data block, but the | A probe packet could carry an application data block, but the | |||
successful transmission of this data is at risk when used for | successful transmission of this data is at risk when used for | |||
probing. Some applications may prefer to use a probe packet that | probing. Some applications might prefer to use a probe packet that | |||
does not carry an application data block to avoid disruption to data | does not carry an application data block to avoid disruption to data | |||
transfer. | transfer. | |||
6.1.4. Initial Connectivity | 6.1.4. Initial Connectivity | |||
An application that does not have other higher-layer information | An application that does not have other higher-layer information | |||
confirming connectivity with the remote peer SHOULD implement a | confirming connectivity with the remote peer SHOULD implement a | |||
connectivity mechanism using acknowledged probe packets before | connectivity mechanism using acknowledged probe packets before | |||
entering the BASE state. | entering the BASE state. | |||
skipping to change at page 30, line 12 ¶ | skipping to change at page 32, line 19 ¶ | |||
CONFIRMATION_TIMER to periodically send probe packets while in the | CONFIRMATION_TIMER to periodically send probe packets while in the | |||
SEARCH_COMPLETE state. | SEARCH_COMPLETE state. | |||
6.1.6. Handling of PTB Messages | 6.1.6. Handling of PTB Messages | |||
An application that is able and wishes to receive PTB messages MUST | An application that is able and wishes to receive PTB messages MUST | |||
perform ICMP validation as specified in Section 5.2 of [RFC8085]. | perform ICMP validation as specified in Section 5.2 of [RFC8085]. | |||
This requires the application to check each received PTB | This requires the application to check each received PTB | |||
message to validate that it is received in response to transmitted | message to validate that it is received in response to transmitted | |||
traffic and that the reported PTB_SIZE is less than the current | traffic and that the reported PTB_SIZE is less than the current | |||
probed size (see Section 4.5.2). A validated PTB message MAY be used | probed size (see Section 4.6.2). A validated PTB message MAY be used | |||
as input to the DPLPMTUD algorithm, but MUST NOT be used directly to | as input to the DPLPMTUD algorithm, but MUST NOT be used directly to | |||
set the PLPMTU. | set the PLPMTU. | |||
6.2. DPLPMTUD for SCTP | 6.2. DPLPMTUD for SCTP | |||
Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing | Section 10.2 of [RFC4821] specified a recommended PLPMTUD probing | |||
method for SCTP. It recommends the use of the PAD chunk, defined in | method for SCTP, and Section 7.3 of [RFC4960] recommended that an | |||
[RFC4820] to be attached to a minimum length HEARTBEAT chunk to build | endpoint apply the techniques in RFC4821 on a per-destination-address | |||
a probe packet. This enables probing without affecting the transfer | basis. The specification for DPLPMTUD continues the practice of | |||
of user messages and without interfering with congestion control. | using the PL to discover the PMTU, but updates RFC4960 with a | |||
This is preferred to using DATA chunks (with padding as required) as | recommendation to use the method specified in this document: The | |||
path probes. | RECOMMENDED method for generating probes is to add a chunk consisting | |||
only of padding to an SCTP message. The PAD chunk defined in | ||||
[RFC4820] SHOULD be attached to a minimum length HEARTBEAT (HB) chunk | ||||
to build a probe packet. This enables probing without affecting the | ||||
transfer of user messages and without being limited by congestion | ||||
control or flow control. This is preferred to using DATA chunks | ||||
(with padding as required) as path probes. | ||||
Section 6.9 of [RFC4960] describes dividing the user messages into | ||||
data chunks sent by the PL when using SCTP. This notes that once an | ||||
SCTP message has been sent, it cannot be re-segmented. [RFC4960] | ||||
describes the method to retransmit data chunks when the MPS has | ||||
reduced, and the use of IP fragmentation for this case. | ||||
6.2.1. SCTP/IPv4 and SCTP/IPv6 | 6.2.1. SCTP/IPv4 and SCTP/IPv6 | |||
6.2.1.1. Initial Connectivity | 6.2.1.1. Initial Connectivity | |||
The base protocol is specified in [RFC4960]. This provides an | The base protocol is specified in [RFC4960]. This provides an | |||
acknowledged PL. A sender can therefore enter the BASE state as soon | acknowledged PL. A sender can therefore enter the BASE state as soon | |||
as connectivity has been confirmed. | as connectivity has been confirmed. | |||
6.2.1.2. Sending SCTP Probe Packets | 6.2.1.2. Sending SCTP Probe Packets | |||
Probe packets consist of an SCTP common header followed by a | Probe packets consist of an SCTP common header followed by a | |||
HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control | HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control | |||
the length of the probe packet. The HEARTBEAT chunk is used to | the length of the probe packet. The HEARTBEAT chunk is used to | |||
trigger the sending of a HEARTBEAT ACK chunk. The reception of the | trigger the sending of a HEARTBEAT ACK chunk. The reception of the | |||
HEARTBEAT ACK chunk acknowledges reception of a successful probe. | HEARTBEAT ACK chunk acknowledges reception of a successful probe. A | |||
successful probe updates the association and path counters, but an | ||||
unsuccessful probe is discounted (assumed to be a result of choosing | ||||
too large a PLPMTU). | ||||
The HEARTBEAT chunk carries a Heartbeat Information parameter which | The HEARTBEAT chunk carries a Heartbeat Information parameter which | |||
should include, besides the information suggested in [RFC4960], the | includes, besides the information suggested in [RFC4960], the probe | |||
probe size, which is the size of the complete datagram. The size of | size, which is the size of the complete datagram. The size of the | |||
the PAD chunk is therefore computed by reducing the probing size by | PAD chunk is therefore computed by reducing the probing size by the | |||
the IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT | IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT | |||
request and the PAD chunk header. The payload of the PAD chunk | request and the PAD chunk header. The payload of the PAD chunk | |||
contains arbitrary data. | contains arbitrary data. | |||
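The PAD chunk computation above can be worked through as arithmetic. The sketch below is an illustration, not a normative procedure. It assumes an IP header without options or extension headers (20 bytes for IPv4, 40 bytes for IPv6), the 12-byte SCTP common header, a 4-byte chunk header for each chunk, a 4-byte Heartbeat Information parameter header, and a probe size that is a multiple of 4. The function and parameter names are hypothetical.

```python
SCTP_COMMON_HEADER = 12   # bytes
CHUNK_HEADER = 4          # type, flags, length
PARAM_HEADER = 4          # Heartbeat Information parameter type and length


def pad_payload_len(probe_size, ip_header_len, hb_info_len):
    """Return the PAD chunk payload length for a probe of `probe_size` bytes."""
    hb_chunk = CHUNK_HEADER + PARAM_HEADER + hb_info_len
    hb_chunk += (-hb_chunk) % 4          # chunks are padded to a 4-byte boundary
    payload = (probe_size - ip_header_len - SCTP_COMMON_HEADER
               - hb_chunk - CHUNK_HEADER)
    if payload < 0:
        raise ValueError("probe size too small for HEARTBEAT and PAD chunks")
    return payload
```

For a 1400-byte IPv4 probe carrying 40 bytes of heartbeat information, this gives 1400 - 20 - 12 - 48 - 4 = 1316 bytes of PAD payload.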
To avoid fragmentation of retransmitted data, probing starts right | Probing starts directly after the PL handshake, before data is sent. | |||
after the PL handshake, before data is sent. Assuming this behavior | Assuming this behavior (i.e., the PMTU is smaller than or equal to | |||
(i.e., the PMTU is smaller than or equal to the interface MTU), this | the interface MTU), this process will take a few round trip time | |||
process will take a few round trip time periods depending on the | periods, dependent on the number of PMTU probes sent. The Heartbeat | |||
number of PMTU sizes probed. The Heartbeat timer can be used to | timer can be used to implement the PROBE_TIMER. | |||
implement the PROBE_TIMER. | ||||
6.2.1.3. Validating the Path with SCTP | 6.2.1.3. Validating the Path with SCTP | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT implement | Since SCTP provides an acknowledged PL, a sender MUST NOT implement | |||
the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.1.4. PTB Message Handling by SCTP | 6.2.1.4. PTB Message Handling by SCTP | |||
Normal ICMP validation MUST be performed as specified in Appendix C | Normal ICMP validation MUST be performed as specified in Appendix C | |||
of [RFC4960]. This requires that the first 8 bytes of the SCTP | of [RFC4960]. This requires that the first 8 bytes of the SCTP | |||
common header are quoted in the payload of the PTB message, which can | common header are quoted in the payload of the PTB message, which can | |||
be the case for ICMPv4 and is normally the case for ICMPv6. | be the case for ICMPv4 and is normally the case for ICMPv6. | |||
When a PTB message has been validated, the PTB_SIZE reported in the | When a PTB message has been validated, the PTB_SIZE reported in the | |||
PTB message SHOULD be used with the DPLPMTUD algorithm, providing | PTB message SHOULD be used with the DPLPMTUD algorithm, providing | |||
that the reported PTB_SIZE is less than the current probe size (see | that the reported PTB_SIZE is less than the current probe size (see | |||
Section 4.5). | Section 4.6). | |||
6.2.2. DPLPMTUD for SCTP/UDP | 6.2.2. DPLPMTUD for SCTP/UDP | |||
The UDP encapsulation of SCTP is specified in [RFC6951]. | The UDP encapsulation of SCTP is specified in [RFC6951]. | |||
6.2.2.1. Initial Connectivity | 6.2.2.1. Initial Connectivity | |||
A sender can enter the BASE state as soon as SCTP connectivity has | A sender can enter the BASE state as soon as SCTP connectivity has | |||
been confirmed. | been confirmed. | |||
skipping to change at page 32, line 11 ¶ | skipping to change at page 34, line 35 ¶ | |||
SCTP common header are contained in the PTB message, which can be the | SCTP common header are contained in the PTB message, which can be the | |||
case for ICMPv4 (but note the UDP header also consumes a part of the | case for ICMPv4 (but note the UDP header also consumes a part of the | |||
quoted packet header) and is normally the case for ICMPv6. When the | quoted packet header) and is normally the case for ICMPv6. When the | |||
validation is completed, the PTB_SIZE indicated in the PTB message | validation is completed, the PTB_SIZE indicated in the PTB message | |||
SHOULD be used with the DPLPMTUD providing that the reported PTB_SIZE | SHOULD be used with the DPLPMTUD providing that the reported PTB_SIZE | |||
is less than the current probe size. | is less than the current probe size. | |||
6.2.3. DPLPMTUD for SCTP/DTLS | 6.2.3. DPLPMTUD for SCTP/DTLS | |||
The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | |||
specified in [RFC8261]. It is used for data channels in WebRTC | specified in [RFC8261]. This is used for data channels in WebRTC | |||
implementations. | implementations. | |||
6.2.3.1. Initial Connectivity | 6.2.3.1. Initial Connectivity | |||
A sender can enter the BASE state as soon as SCTP connectivity has | A sender can enter the BASE state as soon as SCTP connectivity has | |||
been confirmed. | been confirmed. | |||
6.2.3.2. Sending SCTP/DTLS Probe Packets | 6.2.3.2. Sending SCTP/DTLS Probe Packets | |||
Packet probing can be done as specified in Section 6.2.1.2. | Packet probing can be done, as specified in Section 6.2.1.2. | |||
6.2.3.3. Validating the Path with SCTP/DTLS | 6.2.3.3. Validating the Path with SCTP/DTLS | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT implement | Since SCTP provides an acknowledged PL, a sender MUST NOT implement | |||
the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.3.4. Handling of PTB Messages by SCTP/DTLS | 6.2.3.4. Handling of PTB Messages by SCTP/DTLS | |||
It is not possible to perform ICMP validation as specified in | [RFC4960] does not specify a way to validate SCTP/DTLS ICMP message | |||
[RFC4960], since even if the ICMP message payload contains sufficient | payload. This can prevent processing of PTB messages at the PL. | |||
information, the reflected SCTP common header would be encrypted. | ||||
Therefore it is not possible to process PTB messages at the PL. | ||||
6.3. DPLPMTUD for QUIC | 6.3. DPLPMTUD for QUIC | |||
QUIC [I-D.ietf-quic-transport] is a UDP-based transport that provides | QUIC [I-D.ietf-quic-transport] is a UDP-based transport that provides | |||
reception feedback. The UDP payload includes the QUIC packet header, | reception feedback. The UDP payload includes the QUIC packet header, | |||
protected payload, and any authentication fields. QUIC depends on a | protected payload, and any authentication fields. QUIC depends on a | |||
PMTU of at least 1280 bytes. | PMTU of at least 1280 bytes. | |||
Section 14.1 of [I-D.ietf-quic-transport] describes the path | Section 14 of [I-D.ietf-quic-transport] describes the path | |||
considerations when sending QUIC packets. It recommends the use of | considerations when sending QUIC packets. It recommends the use of | |||
PADDING frames to build the probe packet. Pure probe-only packets | PADDING frames to build the probe packet. Pure probe-only packets | |||
are constructed with PADDING frames and PING frames to create a | are constructed with PADDING frames and PING frames to create a | |||
padding-only packet that will elicit an acknowledgement. Such | padding-only packet that will elicit an acknowledgement. Such | |||
padding-only packets enable probing without affecting the transfer of | padding-only packets enable probing without affecting the transfer of | |||
other QUIC frames. | other QUIC frames. | |||
The recommendation for QUIC endpoints implementing DPLPMTUD is that an | The recommendation for QUIC endpoints implementing DPLPMTUD is that an | |||
MPS is maintained for each combination of local and remote IP | MPS is maintained for each combination of local and remote IP | |||
addresses [I-D.ietf-quic-transport]. If a QUIC endpoint determines | addresses [I-D.ietf-quic-transport]. If a QUIC endpoint determines | |||
that the PMTU between any pair of local and remote IP addresses has | that the PMTU between any pair of local and remote IP addresses has | |||
fallen below an acceptable MPS, it needs to immediately cease sending | fallen below an acceptable MPS, it immediately ceases to send QUIC | |||
QUIC packets on the affected path. This could result in termination | packets on the affected path. This could result in termination of | |||
of the connection if an alternative path cannot be found | the connection if an alternative path cannot be found | |||
[I-D.ietf-quic-transport]. | [I-D.ietf-quic-transport]. | |||
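The per-path bookkeeping described above can be sketched as follows. This is a minimal illustration, not part of either draft revision: the class name PathTable, its method names, and the min_size parameter are hypothetical, and the threshold is passed in because the two revisions compared here state different values for it.

```python
class PathTable:
    """Tracks one packet-size value per (local IP, remote IP) pair."""

    def __init__(self, min_size):
        # Smallest acceptable size; supplied by the caller (assumption).
        self.min_size = min_size
        self._size = {}  # (local_ip, remote_ip) -> current size in bytes

    def update(self, local_ip, remote_ip, size):
        """Record the size most recently confirmed for this address pair."""
        self._size[(local_ip, remote_ip)] = size

    def may_send(self, local_ip, remote_ip):
        """False when the path is unknown or has fallen below min_size."""
        size = self._size.get((local_ip, remote_ip))
        return size is not None and size >= self.min_size

    def usable_paths(self):
        """Address pairs that can still carry packets of min_size."""
        return [k for k, v in self._size.items() if v >= self.min_size]
```

If usable_paths() returns an empty list, no alternative path is available, which corresponds to the case where the connection could be terminated.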
6.3.1. Initial Connectivity | 6.3.1. Initial Connectivity | |||
The base protocol is specified in [I-D.ietf-quic-transport]. This | The base protocol is specified in [I-D.ietf-quic-transport]. This | |||
provides an acknowledged PL. A sender can therefore enter the BASE | provides an acknowledged PL. A sender can therefore enter the BASE | |||
state as soon as connectivity has been confirmed. | state as soon as connectivity has been confirmed. | |||
6.3.2. Sending QUIC Probe Packets | 6.3.2. Sending QUIC Probe Packets | |||
A probe packet consists of a QUIC Header and a payload containing | A probe packet consists of a QUIC Header and a payload containing | |||
PADDING Frames and a PING Frame. PADDING Frames are a single octet | PADDING Frames and a PING Frame. PADDING Frames are a single octet | |||
(0x00) and several of these can be used to create a probe packet of | (0x00) and several of these can be used to create a probe packet of | |||
size PROBED_SIZE. QUIC provides an acknowledged PL; a sender can | size PROBED_SIZE. QUIC provides an acknowledged PL; a sender can | |||
therefore enter the BASE state as soon as connectivity has been | therefore enter the BASE state as soon as connectivity has been | |||
confirmed. | confirmed. | |||
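As a rough sketch of the probe construction described above, the unprotected frame payload could be assembled as below. The function name and its parameters are hypothetical. The sketch assumes PROBED_SIZE counts the whole IP packet, so the caller has to supply the IP/UDP overhead, the QUIC header length and the authentication tag length for its own configuration. The PING frame type 0x01 is taken from the QUIC transport specification; header construction and packet protection are out of scope here.

```python
PADDING_FRAME = b"\x00"  # single-octet PADDING frame, as stated in the text
PING_FRAME = b"\x01"     # PING frame type (from the QUIC transport spec)


def build_probe_payload(probed_size, ip_udp_overhead, quic_header_len,
                        auth_tag_len):
    """Return PING + PADDING frames sized so the full packet is probed_size.

    All three overhead arguments are assumptions supplied by the caller.
    """
    payload_len = (probed_size - ip_udp_overhead
                   - quic_header_len - auth_tag_len)
    if payload_len < len(PING_FRAME):
        raise ValueError("probed_size leaves no room for a PING frame")
    # One PING makes the packet ack-eliciting; PADDING fills the remainder.
    return PING_FRAME + PADDING_FRAME * (payload_len - len(PING_FRAME))
```

Because the payload carries only PING and PADDING frames, losing the probe does not cause loss of application data.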
The current specification of QUIC sets the following: | The current specification of QUIC sets the following: | |||
* BASE_PMTU: 1200. A QUIC sender needs to pad initial packets to | * BASE_PMTU: 1280. A QUIC sender pads initial packets to confirm | |||
1200 bytes to confirm the path can support packets of a useful | the path can support packets of the required size. | |||
size. | ||||
* MIN_PMTU: 1200 bytes. A QUIC sender that determines the PMTU has | * MIN_PMTU: 1280 bytes. A QUIC sender that determines the PLPMTU | |||
fallen below 1200 bytes MUST immediately stop sending on the | has fallen below 1280 bytes MUST immediately stop sending on the | |||
affected path. | affected path. | |||
6.3.3. Validating the Path with QUIC | 6.3.3. Validating the Path with QUIC | |||
QUIC provides an acknowledged PL. A sender therefore MUST NOT | QUIC provides an acknowledged PL. A sender therefore MUST NOT | |||
implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.3.4. Handling of PTB Messages by QUIC | 6.3.4. Handling of PTB Messages by QUIC | |||
QUIC operates over the UDP transport, and the guidelines on ICMP | QUIC validates ICMP PTB messages. In addition to UDP Port | |||
validation as specified in Section 5.2 of [RFC8085] therefore apply. | validation, QUIC can validate an ICMP message by using other PL | |||
In addition to UDP Port validation QUIC can validate an ICMP message | information (e.g., validation of connection IDs in the quoted packet | |||
by looking for valid Connection IDs in the quoted packet. | of any received ICMP message). | |||
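The two checks mentioned above (UDP ports, then connection IDs in the quoted packet) might look like the following. This is a simplified sketch under stated assumptions: the function name and arguments are hypothetical, the input is assumed to start at the quoted UDP header, and only short-header QUIC packets are examined. Long-header packets are rejected rather than parsed, because their layout differs between draft versions.

```python
import struct

UDP_HEADER_LEN = 8


def ptb_quote_matches(quoted_udp, local_port, remote_port, sent_dcids):
    """Check a PTB quote against what this endpoint could have sent.

    quoted_udp: bytes of the quoted packet, starting at the UDP header.
    sent_dcids: destination connection IDs used in packets sent on the path.
    """
    if len(quoted_udp) < UDP_HEADER_LEN + 1:
        return False  # too short to hold a UDP header and a QUIC first byte
    src_port, dst_port = struct.unpack("!HH", quoted_udp[:4])
    # The quote is of a packet we sent, so source is local, destination remote.
    if (src_port, dst_port) != (local_port, remote_port):
        return False
    quic = quoted_udp[UDP_HEADER_LEN:]
    if quic[0] & 0x80:
        return False  # long header: not handled in this sketch
    # Short header: the destination connection ID follows the first byte.
    return any(quic[1:1 + len(cid)] == cid for cid in sent_dcids if cid)
```

A quote that passes only shows that the message is consistent with transmitted traffic; it does not show that the reported size is correct.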
7. Acknowledgements | 7. Acknowledgements | |||
This work was partially funded by the European Union's Horizon 2020 | This work was partially funded by the European Union's Horizon 2020 | |||
research and innovation programme under grant agreement No. 644334 | research and innovation programme under grant agreement No. 644334 | |||
(NEAT). The views expressed are solely those of the author(s). | (NEAT). The views expressed are solely those of the author(s). | |||
Thanks to all that have commented or contributed, the TSVWG and QUIC | Thanks to all that have commented or contributed, the TSVWG and QUIC | |||
working groups, and Mathew Calder and Julius Flohr for providing | working groups, and Mathew Calder and Julius Flohr for providing | |||
implementations. | early implementations. | |||
8. IANA Considerations | 8. IANA Considerations | |||
This memo includes no request to IANA. | This memo includes no request to IANA. | |||
If there are no requirements for IANA, the section will be removed | If there are no requirements for IANA, the section will be removed | |||
during conversion into an RFC by the RFC Editor. | during conversion into an RFC by the RFC Editor. | |||
9. Security Considerations | 9. Security Considerations | |||
The security considerations for the use of UDP and SCTP are provided | The security considerations for the use of UDP and SCTP are provided | |||
in the references RFCs. The interval between individual probe | in the referenced RFCs. | |||
To avoid excessive load, the interval between individual probe | ||||
packets MUST be at least one RTT, and the interval between rounds of | packets MUST be at least one RTT, and the interval between rounds of | |||
probing is determined by the PMTU_RAISE_TIMER. | probing is determined by the PMTU_RAISE_TIMER. | |||
A PL sender needs to ensure that the method used to confirm reception | A PL sender needs to ensure that the method used to confirm reception | |||
of probe packets offers protection from off-path attackers injecting | of probe packets protects from off-path attackers injecting packets | |||
packets into the path. This protection is provided in IETF-defined | into the path. This protection is provided in IETF-defined protocols | |||
protocols (e.g., TCP, SCTP) using a randomly-initialized sequence | (e.g., TCP, SCTP) using a randomly-initialized sequence number. A | |||
number. A description of one way to do this when using UDP is | description of one way to do this when using UDP is provided in | |||
provided in section 5.1 of [RFC8085]. | section 5.1 of [RFC8085]. | |||
There are cases where ICMP Packet Too Big (PTB) messages are not | There are cases where ICMP Packet Too Big (PTB) messages are not | |||
delivered due to policy, configuration or equipment design (see | delivered due to policy, configuration or equipment design (see | |||
Section 1.1), this method therefore does not rely upon PTB messages | Section 1.1), this method therefore does not rely upon PTB messages | |||
being received, but is able to utilize these when they are received | being received, but is able to utilize these when they are received | |||
by the sender. PTB messages could potentially be used to cause a | by the sender. PTB messages could potentially be used to cause a | |||
node to inappropriately reduce the PLPMTU. A node supporting | node to inappropriately reduce the PLPMTU. A node supporting | |||
DPLPMTUD MUST therefore appropriately validate the payload of PTB | DPLPMTUD MUST therefore appropriately validate the payload of PTB | |||
messages to ensure these are received in response to transmitted | messages to ensure these are received in response to transmitted | |||
traffic (i.e., a reported error condition that corresponds to a | traffic (i.e., a reported error condition that corresponds to a | |||
datagram actually sent by the path layer, see Section 4.5.1). | datagram actually sent by the path layer, see Section 4.6.1). | |||
An on-path attacker, able to create a PTB message could forge PTB | An on-path attacker, able to create a PTB message could forge PTB | |||
messages that include a valid quoted IP packet. Such an attack could | messages that include a valid quoted IP packet. Such an attack could | |||
be used to drive down the PLPMTU. There are two ways this method can | be used to drive down the PLPMTU. There are two ways this method can | |||
be mitigated against such attacks: First, by ensuring that a PL | be mitigated against such attacks: First, by ensuring that a PL | |||
sender never reduces the PLPMTU below the base size, solely in | sender never reduces the PLPMTU below the base size, solely in | |||
response to receiving a PTB message. This is achieved by first | response to receiving a PTB message. This is achieved by first | |||
entering the BASE state when such a message is received. Second, the | entering the BASE state when such a message is received. Second, the | |||
design does not require processing of PTB messages, a PL sender could | design does not require processing of PTB messages, a PL sender could | |||
therefore suspend processing of PTB messages (e.g., in a robustness | therefore suspend processing of PTB messages (e.g., in a robustness | |||
mode after detecting that subsequent probes actually confirm that a | mode after detecting that subsequent probes actually confirm that a | |||
size larger than the PTB_SIZE is supported by a path). | size larger than the PTB_SIZE is supported by a path). | |||
The successful processing of an ICMP message can trigger a probe when | ||||
the reported PTB size is valid, but this does not directly update the | ||||
PLPMTU for the path. This prevents a message attempting to black | ||||
hole data by indicating a size larger than supported by the path. | ||||
Parallel forwarding paths SHOULD be considered. Section 5.4 | Parallel forwarding paths SHOULD be considered. Section 5.4 | |||
identifies the need for robustness in the method when the path | identifies the need for robustness in the method because the path | |||
information may be inconsistent. | information might be inconsistent. | |||
A node performing DPLPMTUD could experience conflicting information | A node performing DPLPMTUD could experience conflicting information | |||
about the size of supported probe packets. This could occur when | about the size of supported probe packets. This could occur when | |||
multiple paths are concurrently in use and these exhibit a | multiple paths are concurrently in use and these exhibit a | |||
different PMTU. If not considered, this could result in data being | different PMTU. If not considered, this could result in packets not | |||
black holed when the PLPMTU is larger than the smallest PMTU across | being delivered (black holed) when the PLPMTU is larger than the | |||
the current paths. | smallest actual PMTU. | |||
10. References | 10. References | |||
10.1. Normative References | 10.1. Normative References | |||
[I-D.ietf-quic-transport] | [I-D.ietf-quic-transport] | |||
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | |||
and Secure Transport", draft-ietf-quic-transport-20 (work | and Secure Transport", draft-ietf-quic-transport-20 (work | |||
in progress), 23 April 2019, | in progress), 23 April 2019, | |||
<http://www.ietf.org/internet-drafts/draft-ietf-quic- | <http://www.ietf.org/internet-drafts/draft-ietf-quic- | |||
skipping to change at page 36, line 47 ¶ | skipping to change at page 39, line 22 ¶ | |||
DOI 10.17487/RFC8201, July 2017, | DOI 10.17487/RFC8201, July 2017, | |||
<https://www.rfc-editor.org/info/rfc8201>. | <https://www.rfc-editor.org/info/rfc8201>. | |||
[RFC8261] Tuexen, M., Stewart, R., Jesup, R., and S. Loreto, | [RFC8261] Tuexen, M., Stewart, R., Jesup, R., and S. Loreto, | |||
"Datagram Transport Layer Security (DTLS) Encapsulation of | "Datagram Transport Layer Security (DTLS) Encapsulation of | |||
SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November | SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November | |||
2017, <https://www.rfc-editor.org/info/rfc8261>. | 2017, <https://www.rfc-editor.org/info/rfc8261>. | |||
10.2. Informative References | 10.2. Informative References | |||
[I-D.ietf-intarea-frag-fragile] | ||||
Bonica, R., Baker, F., Huston, G., Hinden, R., Troan, O., | ||||
and F. Gont, "IP Fragmentation Considered Fragile", draft- | ||||
ietf-intarea-frag-fragile-17 (work in progress), 30 | ||||
September 2019, | ||||
<http://www.ietf.org/internet-drafts/draft-ietf-intarea- | ||||
frag-fragile-17.txt>. | ||||
[I-D.ietf-intarea-tunnels] | [I-D.ietf-intarea-tunnels] | |||
Touch, J. and M. Townsley, "IP Tunnels in the Internet | Touch, J. and M. Townsley, "IP Tunnels in the Internet | |||
Architecture", draft-ietf-intarea-tunnels-10 (work in | Architecture", draft-ietf-intarea-tunnels-10 (work in | |||
progress), 12 September 2019, | progress), 12 September 2019, | |||
<http://www.ietf.org/internet-drafts/draft-ietf-intarea- | <http://www.ietf.org/internet-drafts/draft-ietf-intarea- | |||
tunnels-10.txt>. | tunnels-10.txt>. | |||
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, | [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, | |||
RFC 792, DOI 10.17487/RFC0792, September 1981, | RFC 792, DOI 10.17487/RFC0792, September 1981, | |||
<https://www.rfc-editor.org/info/rfc792>. | <https://www.rfc-editor.org/info/rfc792>. | |||
skipping to change at page 37, line 44 ¶ | skipping to change at page 40, line 25 ¶ | |||
[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU | [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU | |||
Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, | Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, | |||
<https://www.rfc-editor.org/info/rfc4821>. | <https://www.rfc-editor.org/info/rfc4821>. | |||
[RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering | [RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering | |||
ICMPv6 Messages in Firewalls", RFC 4890, | ICMPv6 Messages in Firewalls", RFC 4890, | |||
DOI 10.17487/RFC4890, May 2007, | DOI 10.17487/RFC4890, May 2007, | |||
<https://www.rfc-editor.org/info/rfc4890>. | <https://www.rfc-editor.org/info/rfc4890>. | |||
[RFC5508] Srisuresh, P., Ford, B., Sivakumar, S., and S. Guha, "NAT | ||||
Behavioral Requirements for ICMP", BCP 148, RFC 5508, | ||||
DOI 10.17487/RFC5508, April 2009, | ||||
<https://www.rfc-editor.org/info/rfc5508>. | ||||
Appendix A. Revision Notes | Appendix A. Revision Notes | |||
Note to RFC-Editor: please remove this entire section prior to | Note to RFC-Editor: please remove this entire section prior to | |||
publication. | publication. | |||
Individual draft -00: | Individual draft -00: | |||
* Comments and corrections are welcome directly to the authors or | * Comments and corrections are welcome directly to the authors or | |||
via the IETF TSVWG working group mailing list. | via the IETF TSVWG working group mailing list. | |||
skipping to change at page 41, line 20 ¶ | skipping to change at page 44, line 4 ¶ | |||
* Reinforce that PROBE_COUNT is successive attempts to probe for any | * Reinforce that PROBE_COUNT is successive attempts to probe for any | |||
size | size | |||
* Redefine MAX_PROBES to 3 | * Redefine MAX_PROBES to 3 | |||
* Address PTB_SIZE of 0 or less than MIN_PMTU | * Address PTB_SIZE of 0 or less than MIN_PMTU | |||
Working group draft -11: | Working group draft -11: | |||
* Restore a sentence removed in previous rev | * Restore a sentence removed in previous rev | |||
* De-acronymise QUIC | * De-acronymise QUIC | |||
* Address some nits | * Address some nits | |||
Working group draft -12: | Working group draft -12: | |||
* Add TSVWG, QUIC and implementers to acknowledgements | * Add TSVWG, QUIC and implementers to acknowledgements | |||
* Shorten a diagram line | * Shorten a diagram line. | |||
* Address nits from Julius and Wes | * Address nits from Julius and Wes. | |||
* Be clearer when talking about IP layer caches | * Be clearer when talking about IP layer caches | |||
Authors' Addresses | Authors' Addresses | |||
Godred Fairhurst | Godred Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering, Fraser Noble Building | School of Engineering, Fraser Noble Building | |||
Aberdeen | Aberdeen | |||
AB24 3UE | AB24 3UE | |||
End of changes. 131 change blocks. | ||||
395 lines changed or deleted | 510 lines changed or added | |||