draft-ietf-tsvwg-datagram-plpmtud-19.txt | draft-ietf-tsvwg-datagram-plpmtud-20.txt | |||
---|---|---|---|---|
Internet Engineering Task Force G. Fairhurst | Internet Engineering Task Force G. Fairhurst | |||
Internet-Draft T. Jones | Internet-Draft T. Jones | |||
Updates: 4821, 4960, 6951, 8085, 8261 (if University of Aberdeen | Updates: 4821, 4960, 6951, 8085, 8261 (if University of Aberdeen | |||
approved) M. Tuexen | approved) M. Tuexen | |||
Intended status: Standards Track I. Ruengeler | Intended status: Standards Track I. Ruengeler | |||
Expires: 5 October 2020 T. Voelker | Expires: 8 November 2020 T. Voelker | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
3 April 2020 | 7 May 2020 | |||
Packetization Layer Path MTU Discovery for Datagram Transports | Packetization Layer Path MTU Discovery for Datagram Transports | |||
draft-ietf-tsvwg-datagram-plpmtud-19 | draft-ietf-tsvwg-datagram-plpmtud-20 | |||
Abstract | Abstract | |||
This document describes a robust method for Path MTU Discovery | This document describes a robust method for Path MTU Discovery | |||
(PMTUD) for datagram Packetization Layers (PLs). It describes an | (PMTUD) for datagram Packetization Layers (PLs). It describes an | |||
extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | |||
MTU Discovery for IPv4 and IPv6. The method allows a PL, or a | MTU Discovery for IPv4 and IPv6. The method allows a PL, or a | |||
datagram application that uses a PL, to discover whether a network | datagram application that uses a PL, to discover whether a network | |||
path can support the current size of datagram. This can be used to | path can support the current size of datagram. This can be used to | |||
detect and reduce the message size when a sender encounters a packet | detect and reduce the message size when a sender encounters a packet | |||
black hole (where packets are discarded). The method can probe a | black hole (where packets are discarded). The method can probe a | |||
network path with progressively larger packets to discover whether | network path with progressively larger packets to discover whether | |||
the maximum packet size can be increased. This allows a sender to | the maximum packet size can be increased. This allows a sender to | |||
determine an appropriate packet size, providing functionality for | determine an appropriate packet size, providing functionality for | |||
datagram transports that is equivalent to the Packetization Layer | datagram transports that is equivalent to the Packetization Layer | |||
PMTUD specification for TCP, specified in RFC 4821. | PMTUD specification for TCP, specified in RFC 4821. | |||
This document updates RFC 4821 to specify the method for datagram | This document updates RFC 4821 to specify the PLPMTUD method for | |||
PLs, and updates RFC 8085 as the method to use in place of RFC 4821 | datagram PLs. It also updates RFC 8085 to refer to the method | |||
with UDP datagrams. Section 7.3 of RFC4960 recommends an endpoint | specified in this document instead of the method in RFC 4821 for use | |||
| | with UDP datagrams. Section 7.3 of RFC 4960 recommends an endpoint | |||
apply the techniques in RFC 4821 on a per-destination-address basis. | apply the techniques in RFC 4821 on a per-destination-address basis. | |||
RFC 4960, RFC 6951 and RFC 8261 are updated to recommend that SCTP, | RFC 4960, RFC 6951, and RFC 8261 are updated to recommend that SCTP, | |||
SCTP encapsulated in UDP and SCTP encapsulated in DTLS use the method | SCTP encapsulated in UDP and SCTP encapsulated in DTLS use the method | |||
specified in this document instead of the method in RFC 4821. | specified in this document instead of the method in RFC 4821. | |||
The document also provides implementation notes for incorporating | The document also provides implementation notes for incorporating | |||
Datagram PMTUD into IETF datagram transports or applications that use | Datagram PMTUD into IETF datagram transports or applications that use | |||
datagram transports. | datagram transports. | |||
When published, this specification updates RFC 4960, RFC 4821, RFC | When published, this specification updates RFC 4960, RFC 4821, RFC | |||
8085 and RFC 8261. | 8085 and RFC 8261. | |||
skipping to change at page 2, line 15 ¶ | skipping to change at page 2, line 20 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on 5 October 2020. | This Internet-Draft will expire on 8 November 2020. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
skipping to change at page 2, line 42 ¶ | skipping to change at page 2, line 47 ¶ | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 4 | 1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 4 | |||
1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 6 | 1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 6 | |||
1.3. Path MTU Discovery for Datagram Services . . . . . . . . 7 | 1.3. Path MTU Discovery for Datagram Services . . . . . . . . 7 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 11 | 3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 11 | |||
4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 14 | 4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 14 | |||
4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 14 | 4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 14 | |||
4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 15 | 4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 15 | |||
4.3. Black Hole Detection and Reducing the PLPMTU . . . . . . 15 | 4.3. Black Hole Detection and Reducing the PLPMTU . . . . . . 16 | |||
4.4. The Maximum Packet Size (MPS) . . . . . . . . . . . . . . 16 | 4.4. The Maximum Packet Size (MPS) . . . . . . . . . . . . . . 17 | |||
4.5. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 17 | 4.5. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 18 | |||
4.6. Response to PTB Messages . . . . . . . . . . . . . . . . 18 | 4.6. Response to PTB Messages . . . . . . . . . . . . . . . . 18 | |||
4.6.1. Validation of PTB Messages . . . . . . . . . . . . . 18 | 4.6.1. Validation of PTB Messages . . . . . . . . . . . . . 18 | |||
4.6.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 19 | 4.6.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 19 | |||
5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 20 | 5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 20 | |||
5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 21 | 5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 21 | |||
5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 21 | 5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 22 | 5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 22 | |||
5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 23 | 5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 23 | |||
5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 24 | 5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 24 | |||
5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 26 | 5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 26 | |||
5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 29 | 5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 29 | |||
5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 29 | 5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 29 | |||
5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 30 | 5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 30 | |||
skipping to change at page 3, line 18 ¶ | skipping to change at page 3, line 24 ¶ | |||
5.3.3. Resilience to Inconsistent Path Information . . . . . 30 | 5.3.3. Resilience to Inconsistent Path Information . . . . . 30 | |||
5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 31 | 5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 31 | |||
6. Specification of Protocol-Specific Methods . . . . . . . . . 31 | 6. Specification of Protocol-Specific Methods . . . . . . . . . 31 | |||
6.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 31 | 6.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 31 | |||
6.1.1. Application Request . . . . . . . . . . . . . . . . . 32 | 6.1.1. Application Request . . . . . . . . . . . . . . . . . 32 | |||
6.1.2. Application Response . . . . . . . . . . . . . . . . 32 | 6.1.2. Application Response . . . . . . . . . . . . . . . . 32 | |||
6.1.3. Sending Application Probe Packets . . . . . . . . . . 32 | 6.1.3. Sending Application Probe Packets . . . . . . . . . . 32 | |||
6.1.4. Initial Connectivity . . . . . . . . . . . . . . . . 32 | 6.1.4. Initial Connectivity . . . . . . . . . . . . . . . . 32 | |||
6.1.5. Validating the Path . . . . . . . . . . . . . . . . . 32 | 6.1.5. Validating the Path . . . . . . . . . . . . . . . . . 32 | |||
6.1.6. Handling of PTB Messages . . . . . . . . . . . . . . 33 | 6.1.6. Handling of PTB Messages . . . . . . . . . . . . . . 32 | |||
6.2. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 33 | 6.2. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 33 | |||
6.2.1. SCTP/IPv4 and SCTP/IPv6 . . . . . . . . . . . . . . . 33 | 6.2.1. SCTP/IPv4 and SCTP/IPv6 . . . . . . . . . . . . . . . 33 | |||
6.2.1.1. Initial Connectivity . . . . . . . . . . . . . . 33 | 6.2.1.1. Initial Connectivity . . . . . . . . . . . . . . 33 | |||
6.2.1.2. Sending SCTP Probe Packets . . . . . . . . . . . 33 | 6.2.1.2. Sending SCTP Probe Packets . . . . . . . . . . . 33 | |||
6.2.1.3. Validating the Path with SCTP . . . . . . . . . . 34 | 6.2.1.3. Validating the Path with SCTP . . . . . . . . . . 34 | |||
6.2.1.4. PTB Message Handling by SCTP . . . . . . . . . . 34 | 6.2.1.4. PTB Message Handling by SCTP . . . . . . . . . . 34 | |||
6.2.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 34 | 6.2.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 34 | |||
6.2.2.1. Initial Connectivity . . . . . . . . . . . . . . 35 | 6.2.2.1. Initial Connectivity . . . . . . . . . . . . . . 35 | |||
6.2.2.2. Sending SCTP/UDP Probe Packets . . . . . . . . . 35 | 6.2.2.2. Sending SCTP/UDP Probe Packets . . . . . . . . . 35 | |||
6.2.2.3. Validating the Path with SCTP/UDP . . . . . . . . 35 | 6.2.2.3. Validating the Path with SCTP/UDP . . . . . . . . 35 | |||
6.2.2.4. Handling of PTB Messages by SCTP/UDP . . . . . . 35 | 6.2.2.4. Handling of PTB Messages by SCTP/UDP . . . . . . 35 | |||
6.2.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 35 | 6.2.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 35 | |||
6.2.3.1. Initial Connectivity . . . . . . . . . . . . . . 35 | 6.2.3.1. Initial Connectivity . . . . . . . . . . . . . . 35 | |||
6.2.3.2. Sending SCTP/DTLS Probe Packets . . . . . . . . . 35 | 6.2.3.2. Sending SCTP/DTLS Probe Packets . . . . . . . . . 36 | |||
6.2.3.3. Validating the Path with SCTP/DTLS . . . . . . . 36 | 6.2.3.3. Validating the Path with SCTP/DTLS . . . . . . . 36 | |||
6.2.3.4. Handling of PTB Messages by SCTP/DTLS . . . . . . 36 | 6.2.3.4. Handling of PTB Messages by SCTP/DTLS . . . . . . 36 | |||
6.3. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 36 | 6.3. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 36 | |||
6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 36 | 6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 36 | |||
6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 36 | 6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 37 | |||
6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 37 | 6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 37 | |||
6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 37 | 6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 37 | |||
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 37 | 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 37 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 37 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 38 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 39 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 39 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . 39 | 10.1. Normative References . . . . . . . . . . . . . . . . . . 39 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . 40 | 10.2. Informative References . . . . . . . . . . . . . . . . . 41 | |||
Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 42 | Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 42 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 46 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
1. Introduction | 1. Introduction | |||
The IETF has specified datagram transport using UDP, SCTP, and DCCP, | The IETF has specified datagram transport using UDP, SCTP, and DCCP, | |||
as well as protocols layered on top of these transports (e.g., SCTP/ | as well as protocols layered on top of these transports (e.g., SCTP/ | |||
UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP | UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP | |||
network layer. This document describes a robust method for Path MTU | network layer. This document describes a robust method for Path MTU | |||
Discovery (PMTUD) that can be used with these transport protocols (or | Discovery (PMTUD) that can be used with these transport protocols (or | |||
skipping to change at page 4, line 38 ¶ | skipping to change at page 4, line 39 ¶ | |||
probe packets. | probe packets. | |||
Packets not intended as probe packets are either fragmented to the | Packets not intended as probe packets are either fragmented to the | |||
current effective PMTU, or the attempt to send fails with an error | current effective PMTU, or the attempt to send fails with an error | |||
code. Applications can be provided with a primitive to let them read | code. Applications can be provided with a primitive to let them read | |||
the Maximum Packet Size (MPS), derived from the current effective | the Maximum Packet Size (MPS), derived from the current effective | |||
PMTU. | PMTU. | |||
Classical PMTUD is subject to protocol failures. One failure arises | Classical PMTUD is subject to protocol failures. One failure arises | |||
when traffic using a packet size larger than the actual PMTU is | when traffic using a packet size larger than the actual PMTU is | |||
black-holed (all datagrams sent with this size, or larger, are | black-holed (all datagrams larger than the actual PMTU are | |||
discarded). This could arise when the PTB messages are not delivered | discarded). This could arise when the PTB messages are not delivered | |||
back to the sender for some reason (see for example [RFC2923]). | back to the sender for some reason (see for example [RFC2923]). | |||
Examples where PTB messages are not delivered include: | Examples where PTB messages are not delivered include: | |||
* The generation of ICMP messages is usually rate limited. This | * The generation of ICMP messages is usually rate limited. This | |||
could result in no PTB messages being generated to the sender (see | could result in no PTB messages being generated to the sender (see | |||
section 2.4 of [RFC4443]) | section 2.4 of [RFC4443]) | |||
* ICMP messages can be filtered by middleboxes (including firewalls) | * ICMP messages can be filtered by middleboxes (including firewalls) | |||
[RFC4890]. A stateful firewall could be configured with a policy | [RFC4890]. A firewall could be configured with a policy to block | |||
to block incoming ICMP messages, which would prevent reception of | incoming ICMP messages, which would prevent reception of PTB | |||
PTB messages to a sending endpoint behind this firewall. | messages to a sending endpoint behind this firewall. | |||
* When the router issuing the ICMP message drops a tunneled packet, | * When the router issuing the ICMP message drops a tunneled packet, | |||
the resulting ICMP message will be directed to the tunnel ingress. | the resulting ICMP message will be directed to the tunnel ingress. | |||
This tunnel endpoint is responsible for forwarding the ICMP | This tunnel endpoint is responsible for forwarding the ICMP | |||
message and also processing the quoted packet within the payload | message and also processing the quoted packet within the payload | |||
field to remove the effect of the tunnel, and return a correctly | field to remove the effect of the tunnel, and return a correctly | |||
formatted ICMP message to the sender [I-D.ietf-intarea-tunnels]. | formatted ICMP message to the sender [I-D.ietf-intarea-tunnels]. | |||
Failure to do this prevents the PTB message reaching the original | Failure to do this prevents the PTB message reaching the original | |||
sender. | sender. | |||
skipping to change at page 6, line 51 ¶ | skipping to change at page 7, line 6 ¶ | |||
layer that is responsible for placing data blocks into the payload of | layer that is responsible for placing data blocks into the payload of | |||
IP packets and selecting an appropriate MPS. This function is often | IP packets and selecting an appropriate MPS. This function is often | |||
performed by a transport protocol (e.g., DCCP, RTP, SCTP, QUIC), but | performed by a transport protocol (e.g., DCCP, RTP, SCTP, QUIC), but | |||
can also be performed by other encapsulation methods working above | can also be performed by other encapsulation methods working above | |||
the transport layer. | the transport layer. | |||
In contrast to PMTUD, Packetization Layer Path MTU Discovery | In contrast to PMTUD, Packetization Layer Path MTU Discovery | |||
(PLPMTUD) [RFC4821] introduced a method that does not rely upon | (PLPMTUD) [RFC4821] introduced a method that does not rely upon | |||
reception and validation of PTB messages. It is therefore more | reception and validation of PTB messages. It is therefore more | |||
robust than Classical PMTUD. This has become the recommended | robust than Classical PMTUD. This has become the recommended | |||
approach for implementing discovery of the PMTU [RFC8085]. | approach for implementing discovery of the PMTU [BCP145]. | |||
It uses a general strategy where the PL sends probe packets to search | It uses a general strategy where the PL sends probe packets to search | |||
for the largest size of unfragmented datagram that can be sent over a | for the largest size of unfragmented datagram that can be sent over a | |||
network path. Probe packets are sent to explore using a larger | network path. Probe packets are sent to explore using a larger | |||
packet size. If a probe packet is successfully delivered (as | packet size. If a probe packet is successfully delivered (as | |||
determined by the PL), then the PLPMTU is raised to the size of the | determined by the PL), then the PLPMTU is raised to the size of the | |||
successful probe. If a black hole is detected (e.g., where packets | successful probe. If a black hole is detected (e.g., where packets | |||
of size PLPMTU are consistently not received), the method reduces the | of size PLPMTU are consistently not received), the method reduces the | |||
PLPMTU. | PLPMTU. | |||
skipping to change at page 7, line 37 ¶ | skipping to change at page 7, line 40 ¶ | |||
Section 5 of this document presents a set of algorithms for datagram | Section 5 of this document presents a set of algorithms for datagram | |||
protocols to discover the largest size of unfragmented datagram that | protocols to discover the largest size of unfragmented datagram that | |||
can be sent over a network path. The method relies upon features of | can be sent over a network path. The method relies upon features of | |||
the PL described in Section 3 and applies to transport protocols | the PL described in Section 3 and applies to transport protocols | |||
operating over IPv4 and IPv6. It does not require cooperation from | operating over IPv4 and IPv6. It does not require cooperation from | |||
the lower layers, although it can utilize PTB messages when these | the lower layers, although it can utilize PTB messages when these | |||
received messages are made available to the PL. | received messages are made available to the PL. | |||
The message size guidelines in section 3.2 of the UDP Usage | The message size guidelines in section 3.2 of the UDP Usage | |||
Guidelines [RFC8085] state "an application SHOULD either use the Path | Guidelines [BCP145] state "an application SHOULD either use the Path | |||
MTU information provided by the IP layer or implement Path MTU | MTU information provided by the IP layer or implement Path MTU | |||
Discovery (PMTUD)", but do not provide a mechanism for discovering | Discovery (PMTUD)", but do not provide a mechanism for discovering | |||
the largest size of unfragmented datagram that can be used on a | the largest size of unfragmented datagram that can be used on a | |||
network path. The present document updates RFC 8085 to specify this | network path. The present document updates RFC 8085 to specify this | |||
method in place of PLPMTUD [RFC4821] and provides a mechanism for | method in place of PLPMTUD [RFC4821] and provides a mechanism for | |||
sharing the discovered largest size as the MPS (see Section 4.4). | sharing the discovered largest size as the MPS (see Section 4.4). | |||
Section 10.2 of [RFC4821] recommended a PLPMTUD probing method for | Section 10.2 of [RFC4821] recommended a PLPMTUD probing method for | |||
the Stream Control Transmission Protocol (SCTP). SCTP utilizes probe | the Stream Control Transmission Protocol (SCTP). SCTP utilizes probe | |||
packets consisting of a minimal sized HEARTBEAT chunk bundled with a | packets consisting of a minimal sized HEARTBEAT chunk bundled with a | |||
PAD chunk as defined in [RFC4820]. However, RFC 4821 did not provide | PAD chunk as defined in [RFC4820]. However, RFC 4821 did not provide | |||
a complete specification. The present document replaces this by | a complete specification. The present document replaces that | |||
providing a complete specification. | description by providing a complete specification. | |||
The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | |||
implementations to support Classical PMTUD and states that a DCCP | implementations to support Classical PMTUD and states that a DCCP | |||
sender "MUST maintain the MPS allowed for each active DCCP session". | sender "MUST maintain the MPS allowed for each active DCCP session". | |||
It also defines the current congestion control MPS (CCMPS) supported | It also defines the current congestion control MPS (CCMPS) supported | |||
by a network path. This recommends use of PMTUD, and suggests use of | by a network path. This recommends use of PMTUD, and suggests use of | |||
control packets (DCCP-Sync) as path probe packets, because they do | control packets (DCCP-Sync) as path probe packets, because they do | |||
not risk application data loss. The method defined in this | not risk application data loss. The method defined in this | |||
specification can be used with DCCP. | specification can be used with DCCP. | |||
skipping to change at page 9, line 50 ¶ | skipping to change at page 10, line 6 ¶ | |||
how other standards organizations use the acronym. This includes | how other standards organizations use the acronym. This includes | |||
the IP header, but excludes link layer headers and other framing | the IP header, but excludes link layer headers and other framing | |||
that is not part of IP or the IP payload. Other standards | that is not part of IP or the IP payload. Other standards | |||
organizations generally define the link MTU to include the link | organizations generally define the link MTU to include the link | |||
layer headers. This specification continues the requirement in | layer headers. This specification continues the requirement in | |||
[RFC4821], that states "All links MUST enforce their MTU: links | [RFC4821], that states "All links MUST enforce their MTU: links | |||
that might non-deterministically deliver packets that are larger | that might non-deterministically deliver packets that are larger | |||
than their rated MTU MUST consistently discard such packets." | than their rated MTU MUST consistently discard such packets." | |||
MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU that | MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU that | |||
DPLPMTUD will attempt to use. | DPLPMTUD will attempt to use (see the constants defined in | |||
| | Section 5.1.2). | |||
MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | |||
DPLPMTUD will attempt to use. | DPLPMTUD will attempt to use (see the constants defined in | |||
| | Section 5.1.2). | |||
MPS: The Maximum Packet Size (MPS) is the largest size of | MPS: The Maximum Packet Size (MPS) is the largest size of | |||
application data block that can be sent across a network path by a | application data block that can be sent across a network path by a | |||
PL using a single Datagram. | PL using a single Datagram. | |||
MSL: Maximum Segment Lifetime (MSL). The maximum delay a packet is | MSL: Maximum Segment Lifetime (MSL). The maximum delay a packet is | |||
expected to experience across a path, taken as 2 minutes | expected to experience across a path, taken as 2 minutes [BCP145]. | |||
[RFC8085]. | ||||
Packet: A Packet is the IP header plus the IP payload. | Packet: A Packet is the IP header(s) and any extension headers/ | |||
| | options plus the IP payload. | |||
Packetization Layer (PL): The PL is a layer of the network stack | Packetization Layer (PL): The PL is a layer of the network stack | |||
that places data into packets and performs transport protocol | that places data into packets and performs transport protocol | |||
functions. Examples of a PL include: TCP, SCTP, SCTP over DTLS or | functions. Examples of a PL include: TCP, SCTP, SCTP over UDP, | |||
QUIC. | SCTP over DTLS, or QUIC. | |||
Path: The Path is the set of links and routers traversed by a packet | Path: The Path is the set of links and routers traversed by a packet | |||
between a source node and a destination node by a particular flow. | between a source node and a destination node by a particular flow. | |||
Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU | Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU | |||
of all the links forming a network path between a source node and | of all the links forming a network path between a source node and | |||
a destination node, as used by PMTUD. | a destination node, as used by PMTUD. | |||
| | PTB: In this document, the term PTB message is applied to both IPv4 | |||
| | ICMP Unreachable messages (type 3) that carry the error | |||
| | Fragmentation Needed (Type 3, Code 4) [RFC0792] and ICMPv6 Packet | |||
| | Too Big messages (Type 2) [RFC4443]. | |||
PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB | PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB | |||
message that indicates the next hop link MTU of a router along the | message that indicates the next hop link MTU of a router along the | |||
path. | path. | |||
PL_PTB_SIZE: The size reported in a validated PTB message, reduced | PL_PTB_SIZE: The size reported in a validated PTB message, reduced | |||
by the size of all headers added by layers below the PL. | by the size of all headers added by layers below the PL. | |||
PLPMTU: The Packetization Layer PMTU is an estimate of the largest | PLPMTU: The Packetization Layer PMTU is an estimate of the largest | |||
size of PL datagram that can be sent by a path, controlled by | size of PL datagram that can be sent by a path, controlled by | |||
PLPMTUD. | PLPMTUD. | |||
skipping to change at page 11, line 16 ¶ | skipping to change at page 11, line 26 ¶ | |||
3. Features Required to Provide Datagram PLPMTUD | 3. Features Required to Provide Datagram PLPMTUD | |||
The principles expressed in [RFC4821] apply to the use of the | The principles expressed in [RFC4821] apply to the use of the | |||
technique with any PL. TCP PLPMTUD has been defined using standard | technique with any PL. TCP PLPMTUD has been defined using standard | |||
TCP protocol mechanisms. Unlike TCP, a datagram PL requires | TCP protocol mechanisms. Unlike TCP, a datagram PL requires | |||
additional mechanisms and considerations to implement PLPMTUD. | additional mechanisms and considerations to implement PLPMTUD. | |||
The requirements for datagram PLPMTUD are: | The requirements for datagram PLPMTUD are: | |||
1. Managing the PLPMTU: For datagram PLs, the PLPMTU is managed by | 1. Managing the PLPMTU: For datagram PLs, the PLPMTU is managed by | |||
DPLPMTUD. A PL MUST NOT send a datagram (other than a probe | DPLPMTUD. A PL MUST NOT send a datagram (other than a probe | |||
packet) with a size at the PL that is larger than the current | packet) with a size at the PL that is larger than the current | |||
PLPMTU. | PLPMTU. | |||
2. Probe packets: The network interface below PL is REQUIRED to | ||||
provide a way to transmit a probe packet that is larger than the | ||||
PLMPMTU. In IPv4, a probe packet MUST be sent with the Don't | ||||
Fragment (DF) bit set in the IP header, and without network | ||||
layer endpoint fragmentation. In IPv6, a probe packet is always | ||||
sent without source fragmentation (as specified in section 5.4 | ||||
of [RFC8201]). | ||||
3. Reception feedback: The destination PL endpoint is REQUIRED to | ||||
provide a feedback method that indicates to the DPLPMTUD sender | ||||
when a probe packet has been received by the destination PL | ||||
endpoint. Section 6 provides examples of how a PL can provide | ||||
this acknowledgment of received probe packets. | ||||
4. Probe loss recovery: It is RECOMMENDED to use probe packets that | ||||
do not carry any user data that would require retransmission if | ||||
lost. Most datagram transports permit this. If a probe packet | ||||
contains user data requiring retransmission in case of loss, the | ||||
PL (or layers above) are REQUIRED to arrange any retransmission/ | ||||
repair of any resulting loss. The PL is REQUIRED to be robust | ||||
in the case where probe packets are lost due to other reasons | ||||
(including link transmission error, congestion). | ||||
5. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to utilize | 2. Probe packets: The network interface below PL is REQUIRED to | |||
information about the maximum size of packet that can be | provide a way to transmit a probe packet that is larger than the | |||
transmitted by the sender on the local link (e.g., the local | PLPMTU. In IPv4, a probe packet MUST be sent with the Don't | |||
Link MTU). It MAY utilize similar information about the maximum | Fragment (DF) bit set in the IP header, and without network layer | |||
size a receiver can accept when this is supplied (note this | endpoint fragmentation. In IPv6, a probe packet is always sent | |||
could be less than EMTU_R). This avoids implementations trying | without source fragmentation (as specified in section 5.4 of | |||
to send probe packets that can not be transferred by the local | [RFC8201]). | |||
link. Too high of a value could reduce the efficiency of the | ||||
search algorithm. Some applications also have a maximum | ||||
transport protocol data unit (PDU) size, in which case there is | ||||
no benefit from probing for a size larger than this (unless a | ||||
transport allows multiplexing multiple applications PDUs into | ||||
the same datagram). | ||||
6. Processing PTB messages: A DPLPMTUD sender MAY optionally | 3. Reception feedback: The destination PL endpoint is REQUIRED to | |||
utilize PTB messages received from the network layer to help | provide a feedback method that indicates to the DPLPMTUD sender | |||
identify when a network path does not support the current size | when a probe packet has been received by the destination PL | |||
of probe packet. Any received PTB message MUST be validated | endpoint. Section 6 provides examples of how a PL can provide | |||
before it is used to update the PLPMTU discovery information | this acknowledgment of received probe packets. | |||
[RFC8201]. This validation confirms that the PTB message was | ||||
sent in response to a packet originating by the sender, and | ||||
needs to be performed before the PLPMTU discovery method reacts | ||||
to the PTB message. A PTB message MUST NOT be used to increase | ||||
the PLPMTU [RFC8201], but could trigger a probe to test for a | ||||
larger PLPMTU. A PL_PTB_SIZE that is greater than that | ||||
currently probed MUST be ignored. A valid PTB_SIZE is converted | ||||
to a PL_PTB_SIZE before it is to be used in the DPLPMTUD state | ||||
machine. | ||||
7. Probing and congestion control: The decision about when to send | 4. Probe loss recovery: It is RECOMMENDED to use probe packets that | |||
a probe packet does not need to be limited by the congestion | do not carry any user data that would require retransmission if | |||
controller. When not controlled by the congestion controller, | lost. Most datagram transports permit this. If a probe packet | |||
the interval between probe packets MUST be at least one RTT. If | contains user data requiring retransmission in case of loss, the | |||
transmission of probe packets is limited by the congestion | PL (or layers above) are REQUIRED to arrange any retransmission/ | |||
controller, this could result in transmission of probe packets | repair of any resulting loss. The PL is REQUIRED to be robust in | |||
being delayed or suspended during congestion. | the case where probe packets are lost due to other reasons | |||
(including link transmission error, congestion). | ||||
8. Loss of a probe packet SHOULD NOT be treated as an indication of | 5. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to utilize | |||
congestion and SHOULD NOT trigger a congestion control reaction | information about the maximum size of packet that can be | |||
[RFC4821], because this could result in unnecessary reduction of | transmitted by the sender on the local link (e.g., the local Link | |||
the sending rate. | MTU). A PL sender MAY utilize similar information about the | |||
maximum size of network layer packet that a receiver can accept | ||||
when this is supplied (note this could be less than EMTU_R). | ||||
This avoids implementations trying to send probe packets that can | ||||
not be transferred by the local link. Too high of a value could | ||||
reduce the efficiency of the search algorithm. Some applications | ||||
also have a maximum transport protocol data unit (PDU) size, in | ||||
which case there is no benefit from probing for a size larger | ||||
than this (unless a transport allows multiplexing multiple | ||||
applications PDUs into the same datagram). | ||||
9. An update to the PLPMTU (or MPS) MUST NOT increase the | 6. Processing PTB messages: A DPLPMTUD sender MAY optionally utilize | |||
congestion window measured in bytes [RFC4821]. Therefore, an | PTB messages received from the network layer to help identify | |||
increase in the packet size does not cause an increase in the | when a network path does not support the current size of probe | |||
data rate in bytes per second. | packet. Any received PTB message MUST be validated before it is | |||
used to update the PLPMTU discovery information [RFC8201]. This | ||||
validation confirms that the PTB message was sent in response to | ||||
a packet originating by the sender, and needs to be performed | ||||
before the PLPMTU discovery method reacts to the PTB message. A | ||||
PTB message MUST NOT be used to increase the PLPMTU [RFC8201], | ||||
but could trigger a probe to test for a larger PLPMTU. A valid | ||||
PTB_SIZE is converted to a PL_PTB_SIZE before it is to be used in | ||||
the DPLPMTUD state machine. A PL_PTB_SIZE that is greater than | ||||
that currently probed SHOULD be ignored. (This PTB message ought | ||||
to be discarded without further processing, but could be utilized | ||||
as an input that enables a resilience mode). | ||||
10. A PL that maintains the congestion window in terms of a limit to | 7. Probing and congestion control: A PL MAY use a congestion | |||
the number of outstanding fixed size packets SHOULD adapt this | controller to decide when to send a probe packet. If | |||
limit to compensate for the size of the actual packets. | transmission of probe packets is limited by the congestion | |||
controller, this could result in transmission of probe packets | ||||
being delayed or suspended during congestion. When the | ||||
transmission of probe packets is not controlled by the congestion | ||||
controller, the interval between probe packets MUST be at least | ||||
one RTT. Loss of a probe packet SHOULD NOT be treated as an | ||||
indication of congestion and SHOULD NOT trigger a congestion | ||||
control reaction [RFC4821], because this could result in | ||||
unnecessary reduction of the sending rate. An update to the | ||||
PLPMTU (or MPS) MUST NOT increase the congestion window measured | ||||
in bytes [RFC4821]. Therefore, an increase in the packet size | ||||
does not cause an increase in the data rate in bytes per second. | ||||
A PL that maintains the congestion window in terms of a limit to | ||||
the number of outstanding fixed size packets SHOULD adapt this | ||||
limit to compensate for the size of the actual packets. The | ||||
transmission of probe packets can interact with the operation of | ||||
a PL that performs burst mitigation or pacing and could need | ||||
transmission of probe packets to be regulated by these methods. | ||||
11. Probing and flow control: Flow control at the PL concerns the | 8. Probing and flow control: Flow control at the PL concerns the | |||
end-to-end flow of data using the PL service. This does not | end-to-end flow of data using the PL service. Flow control | |||
apply to DPLPMTU when probe packets use a design that does not | SHOULD NOT apply to DPLPMTU when probe packets use a design that | |||
carry user data to the remote application. | does not carry user data to the remote application. | |||
12. Shared PLPMTU state: The PMTU value calculated from the PLPMTU | 9. Shared PLPMTU state: The PMTU value calculated from the PLPMTU | |||
MAY also be stored with the corresponding entry associated with | MAY also be stored with the corresponding entry associated with | |||
the destination in the IP layer cache, and used by other PL | the destination in the IP layer cache, and used by other PL | |||
instances. The specification of PLPMTUD [RFC4821] states: "If | instances. The specification of PLPMTUD [RFC4821] states: "If | |||
PLPMTUD updates the MTU for a particular path, all Packetization | PLPMTUD updates the MTU for a particular path, all Packetization | |||
Layer sessions that share the path representation (as described | Layer sessions that share the path representation (as described | |||
in Section 5.2 of [RFC4821]) SHOULD be notified to make use of | in Section 5.2 of [RFC4821]) SHOULD be notified to make use of | |||
the new MTU". Such methods MUST be robust to the wide variety | the new MTU". Such methods MUST be robust to the wide variety of | |||
of underlying network forwarding behaviors. Section 5.2 of | underlying network forwarding behaviors. Section 5.2 of | |||
[RFC8201] provides guidance on the caching of PMTU information | [RFC8201] provides guidance on the caching of PMTU information | |||
and also the relation to IPv6 flow labels. | and also the relation to IPv6 flow labels. | |||
In addition, the following principles are stated for design of a | In addition, the following principles are stated for design of a | |||
DPLPMTUD method: | DPLPMTUD method: | |||
* A PL MAY be designed to segment data blocks larger than the MPS | * A PL MAY be designed to segment data blocks larger than the MPS | |||
into multiple datagrams. However, not all datagram PLs support | into multiple datagrams. However, not all datagram PLs support | |||
segmentation of data blocks. It is RECOMMENDED that methods avoid | segmentation of data blocks. It is RECOMMENDED that methods avoid | |||
forcing an application to use an arbitrarily small MPS for | forcing an application to use an arbitrarily small MPS for | |||
transmission while the method is searching for the currently | transmission while the method is searching for the currently | |||
supported PLPMTU. A reduced MPS can adversely impact the | supported PLPMTU. A reduced MPS can adversely impact the | |||
skipping to change at page 15, line 8 ¶ | skipping to change at page 15, line 21 ¶ | |||
block supplied by an application that matches the size of the | block supplied by an application that matches the size of the | |||
probe packet. This method requests the application to issue a | probe packet. This method requests the application to issue a | |||
data block of the desired probe size. | data block of the desired probe size. | |||
A PL that uses a probe packet carrying application data and needs | A PL that uses a probe packet carrying application data and needs | |||
protection from the loss of this probe packet could perform | protection from the loss of this probe packet could perform | |||
transport-layer retransmission/repair of the data block (e.g., by | transport-layer retransmission/repair of the data block (e.g., by | |||
retransmission after loss is detected or by duplicating the data | retransmission after loss is detected or by duplicating the data | |||
block in a datagram without the padding data). This retransmitted | block in a datagram without the padding data). This retransmitted | |||
data block might possibly need to be sent using a smaller PLPMTU, | data block might possibly need to be sent using a smaller PLPMTU, | |||
which could need the PL to use a smaller packet size to traverse | which could force the PL to use a smaller packet size to traverse | |||
the end-to-end path. (This could utilize endpoint network-layer or a | the end-to-end path. (This could utilize endpoint network-layer | |||
PL that can re-segment the data block into multiple datagrams). | fragmentation or a PL that can re-segment the data block into | |||
multiple datagrams). | ||||
DPLPMTUD MAY choose to use only one of these methods to simplify the | DPLPMTUD MAY choose to use only one of these methods to simplify the | |||
implementation. | implementation. | |||
Probe messages sent by a PL MUST contain enough information to | Probe messages sent by a PL MUST contain enough information to | |||
uniquely identify the probe within Maximum Segment Lifetime (e.g., | uniquely identify the probe within Maximum Segment Lifetime (e.g., | |||
including a unique identifier from the PL or the DPLPMTUD | including a unique identifier from the PL or the DPLPMTUD | |||
implementation), while being robust to reordering and replay of probe | implementation), while being robust to reordering and replay of probe | |||
response and PTB messages. | response and PTB messages. | |||
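The requirement that each probe be uniquely identifiable, and that the sender tolerate reordered or replayed responses, can be sketched as follows. This is a minimal illustration, not a method taken from the draft: the class and method names are hypothetical, and the monotonically increasing counter stands in for whatever identifier a real PL provides. A real implementation would also expire old entries and would prefer identifiers that an off-path attacker cannot predict.

```python
import itertools


class ProbeTracker:
    """Tracks outstanding probes by a unique identifier (illustrative)."""

    def __init__(self):
        # Counter values never repeat, so identifiers stay unique.
        self._next_id = itertools.count(1)
        self._outstanding = {}  # probe id -> probed size in bytes

    def send_probe(self, size: int) -> int:
        """Record a probe of the given size and return its identifier."""
        probe_id = next(self._next_id)
        self._outstanding[probe_id] = size
        return probe_id

    def confirm(self, probe_id: int):
        """Return the confirmed size, or None for an unknown or
        already-confirmed (replayed) identifier."""
        return self._outstanding.pop(probe_id, None)
```

Because a confirmation removes the entry, a replayed response for the same identifier is ignored, and responses can arrive in any order.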
4.2. Confirmation of Probed Packet Size | 4.2. Confirmation of Probed Packet Size | |||
The PL needs a method to determine (confirm) when probe packets have | The PL needs a method to determine (confirm) when probe packets have | |||
been successfully received end-to-end across a network path. | been successfully received end-to-end across a network path. | |||
Transport protocols can include end-to-end methods that detect and | Transport protocols can include end-to-end methods that detect and | |||
report reception of specific datagrams that they send (e.g., DCCP and | report reception of specific datagrams that they send (e.g., DCCP, | |||
SCTP provide keep-alive/heartbeat features). When supported, this | SCTP, and QUIC provide keep-alive/heartbeat features). When | |||
mechanism MAY also be used by DPLPMTUD to acknowledge reception of a | supported, this mechanism MAY also be used by DPLPMTUD to acknowledge | |||
probe packet. | reception of a probe packet. | |||
A PL that does not acknowledge data reception (e.g., UDP and UDP- | A PL that does not acknowledge data reception (e.g., UDP and UDP- | |||
Lite) is unable itself to detect when the packets that it sends are | Lite) is unable itself to detect when the packets that it sends are | |||
discarded because their size is greater than the actual PMTU. These | discarded because their size is greater than the actual PMTU. These | |||
PLs need to rely on an application protocol to detect this loss. | PLs need to rely on an application protocol to detect this loss. | |||
Section 6 specifies this function for a set of IETF-specified | Section 6 specifies this function for a set of IETF-specified | |||
protocols. | protocols. | |||
4.3. Black Hole Detection and Reducing the PLPMTU | 4.3. Black Hole Detection and Reducing the PLPMTU | |||
skipping to change at page 17, line 41 ¶ | skipping to change at page 17, line 50 ¶ | |||
been sent with a size less than the MPS and the PLPMTU was | been sent with a size less than the MPS and the PLPMTU was | |||
subsequently reduced. If these packets are lost, the PL MAY segment | subsequently reduced. If these packets are lost, the PL MAY segment | |||
the data using the new MPS. If a PL is unable to re-segment a | the data using the new MPS. If a PL is unable to re-segment a | |||
previously sent datagram (e.g., [RFC4960]), then the sender either | previously sent datagram (e.g., [RFC4960]), then the sender either | |||
discards the datagram or could perform retransmission using network- | discards the datagram or could perform retransmission using network- | |||
layer fragmentation to form multiple IP packets not larger than the | layer fragmentation to form multiple IP packets not larger than the | |||
PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is | PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is | |||
preferred over clearing the DF bit in the IPv4 header. Operational | preferred over clearing the DF bit in the IPv4 header. Operational | |||
experience reveals that IP fragmentation can reduce the reliability | experience reveals that IP fragmentation can reduce the reliability | |||
of Internet communication [I-D.ietf-intarea-frag-fragile], which may | of Internet communication [I-D.ietf-intarea-frag-fragile], which may | |||
reduce the success of retransmission. | reduce the probability of successful retransmission. | |||
4.5. Disabling the Effect of PMTUD | 4.5. Disabling the Effect of PMTUD | |||
A PL implementing this specification MUST suspend network layer | A PL implementing this specification MUST suspend network layer | |||
processing of outgoing packets that enforces a PMTU | processing of outgoing packets that enforces a PMTU | |||
[RFC1191][RFC8201] for each flow utilizing DPLPMTUD, and instead use | [RFC1191][RFC8201] for each flow utilizing DPLPMTUD, and instead use | |||
DPLPMTUD to control the size of packets that are sent by a flow. | DPLPMTUD to control the size of packets that are sent by a flow. | |||
This removes the need for the network layer to drop or fragment sent | This removes the need for the network layer to drop or fragment sent | |||
packets that have a size greater than the PMTU. | packets that have a size greater than the PMTU. | |||
skipping to change at page 18, line 29 ¶ | skipping to change at page 18, line 38 ¶ | |||
This section specifies utilization and validation of PTB messages. | This section specifies utilization and validation of PTB messages. | |||
* A simple implementation MAY ignore received PTB messages and in | * A simple implementation MAY ignore received PTB messages and in | |||
this case the PLPMTU is not updated when a PTB message is | this case the PLPMTU is not updated when a PTB message is | |||
received. | received. | |||
* A PL that supports PTB messages MUST validate these messages | * A PL that supports PTB messages MUST validate these messages | |||
before they are further processed. | before they are further processed. | |||
A PL that receives a PTB message from a router or middlebox performs | A PL that receives a PTB message from a router or middlebox performs | |||
ICMP validation as specified in Section 5.2 of [RFC8085][RFC8201]. | ICMP validation (see Section 4 of [RFC8201] and Section 5.2 of | |||
Because DPLPMTUD operates at the PL, the PL needs to check that each | [BCP145]). Because DPLPMTUD operates at the PL, the PL needs to | |||
received PTB message is received in response to a packet transmitted | check that each received PTB message is received in response to a | |||
by the endpoint PL performing DPLPMTUD. | packet transmitted by the endpoint PL performing DPLPMTUD. | |||
The PL MUST check the protocol information in the quoted packet | The PL MUST check the protocol information in the quoted packet | |||
carried in an ICMP PTB message payload to validate the message | carried in an ICMP PTB message payload to validate the message | |||
originated from the sending node. This validation includes | originated from the sending node. This validation includes | |||
determining that the combination of the IP addresses, the protocol, | determining that the combination of the IP addresses, the protocol, | |||
the source port and destination port match those returned in the | the source port and destination port match those returned in the | |||
quoted packet - this is also necessary for the PTB message to be | quoted packet - this is also necessary for the PTB message to be | |||
passed to the corresponding PL. | passed to the corresponding PL. | |||
The validation SHOULD utilize information that is not simple for | The validation SHOULD utilize information that is not simple for | |||
an off-path attacker to determine [RFC8085]. For example, it could | an off-path attacker to determine [BCP145]. For example, it could | |||
check the value of a protocol header field known only to the two PL | check the value of a protocol header field known only to the two PL | |||
endpoints. A datagram application that uses well-known source and | endpoints. A datagram application that uses well-known source and | |||
destination ports ought to also rely on other information to complete | destination ports ought to also rely on other information to complete | |||
this validation. | this validation. | |||
These checks are intended to provide protection from packets that | These checks are intended to provide protection from packets that | |||
originate from a node that is not on the network path. A PTB message | originate from a node that is not on the network path. A PTB message | |||
that does not complete the validation MUST NOT be further utilized by | that does not complete the validation MUST NOT be further utilized by | |||
the DPLPMTUD method, as discussed in the Security Considerations | the DPLPMTUD method, as discussed in the Security Considerations | |||
section. | section. | |||
PTB messages that have been validated MAY be utilized by the DPLPMTUD | Section 4.6.2 describes this processing of PTB messages. | |||
algorithm, but MUST NOT be used directly to set the PLPMTU. The | ||||
PL_PTB_SIZE is smaller than the PTB_SIZE because it is reduced by | ||||
headers below the PL including any IP options or extensions added to | ||||
the PL packet. A method that utilizes these PTB messages can improve | ||||
the speed at which the algorithm detects an appropriate PLPMTU by | ||||
triggering an immediate probe for the PL_PTB_SIZE (resulting in a | ||||
network-layer packet of size PTB_SIZE), compared to one that relies | ||||
solely on probing using a timer-based search algorithm. | ||||
Section 4.6.2 describes this processing. | ||||
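The validation checks described in this section can be sketched as a comparison between the packet quoted in the PTB message and the flow state held by the PL. This is a hedged illustration only: the `FlowId` type, the function name, and the optional token (standing in for "a protocol header field known only to the two PL endpoints") are assumptions, and parsing of the ICMP payload is out of scope.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowId:
    """Addresses, protocol and ports identifying one PL flow."""
    src_addr: str
    dst_addr: str
    protocol: int
    src_port: int
    dst_port: int


def ptb_matches_flow(quoted: FlowId, flow: FlowId,
                     quoted_token: Optional[bytes] = None,
                     expected_token: Optional[bytes] = None) -> bool:
    """Return True only if the quoted packet belongs to this flow.

    If the PL holds a value that is hard for an off-path attacker to
    guess (expected_token), the quoted packet must carry it as well.
    """
    if quoted != flow:
        return False
    if expected_token is not None and quoted_token != expected_token:
        return False
    return True
```

A PTB message for which this check fails would not be passed on to the DPLPMTUD method.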
4.6.2. Use of PTB Messages | 4.6.2. Use of PTB Messages | |||
PTB messages that have been validated MAY be utilized by the DPLPMTUD | ||||
algorithm, but MUST NOT be used directly to set the PLPMTU. | ||||
Before using the size reported in the PTB message it must first be | Before using the size reported in the PTB message it must first be | |||
converted to a PL_PTB_SIZE. A set of checks are intended to provide | converted to a PL_PTB_SIZE. The PL_PTB_SIZE is smaller than the | |||
protection from a router that reports an unexpected PTB_SIZE. The PL | PTB_SIZE because it is reduced by headers below the PL including any | |||
also needs to check that the indicated PL_PTB_SIZE is less than the | IP options or extensions added to the PL packet. | |||
size used by probe packets and at least the minimum size accepted. | ||||
A method that utilizes these PTB messages can improve the speed at | ||||
which the algorithm detects an appropriate PLPMTU by triggering an | ||||
immediate probe for the PL_PTB_SIZE (resulting in a network-layer | ||||
packet of size PTB_SIZE), compared to one that relies solely on | ||||
probing using a timer-based search algorithm. | ||||
A set of checks are intended to provide protection from a router that | ||||
reports an unexpected PTB_SIZE. The PL also needs to check that the | ||||
indicated PL_PTB_SIZE is less than the size used by probe packets and | ||||
at least the minimum size accepted. | ||||
This section provides a summary of how PTB messages can be utilized. | This section provides a summary of how PTB messages can be utilized. | |||
(This uses the set of constants defined in section 5.1.2). This | (This uses the set of constants defined in Section 5.1.2). This | |||
processing depends on the PL_PTB_SIZE and the current value of a set | processing depends on the PL_PTB_SIZE and the current value of a set | |||
of variables: | of variables: | |||
PL_PTB_SIZE < MIN_PLPMTU | PL_PTB_SIZE < MIN_PLPMTU | |||
* Invalid PL_PTB_SIZE, see Section 4.6.1. | * Invalid PL_PTB_SIZE, see Section 4.6.1. | |||
* PTB message ought to be discarded without further processing | * PTB message ought to be discarded without further processing | |||
(i.e., PLPMTU is not modified). | (i.e., PLPMTU is not modified). | |||
* The information could be utilized as an input that triggers | * The information could be utilized as an input that triggers | |||
skipping to change at page 20, line 15 ¶ | skipping to change at page 20, line 26 ¶ | |||
BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU | BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU | |||
* This could be an indication of a black hole. The PLPMTU SHOULD | * This could be an indication of a black hole. The PLPMTU SHOULD | |||
be set to BASE_PLPMTU (the PLPMTU is reduced to the BASE_PLPMTU | be set to BASE_PLPMTU (the PLPMTU is reduced to the BASE_PLPMTU | |||
to avoid unnecessary packet loss when a black hole is | to avoid unnecessary packet loss when a black hole is | |||
encountered). | encountered). | |||
* The PL ought to start a search to quickly discover the new | * The PL ought to start a search to quickly discover the new | |||
PLPMTU. The PL_PTB_SIZE reported in the PTB message can be | PLPMTU. The PL_PTB_SIZE reported in the PTB message can be | |||
used to initialize a search algorithm. | used to initialize a search algorithm. | |||
PL_PTB_SIZE = PLPMTU | ||||
* Completes the search for a larger PLPMTU. | ||||
PLPMTU < PL_PTB_SIZE < PROBED_SIZE | PLPMTU < PL_PTB_SIZE < PROBED_SIZE | |||
* The PLPMTU continues to be valid, but the size of a packet used | * The PLPMTU continues to be valid, but the size of a packet used | |||
to search (PROBED_SIZE) was larger than the actual PMTU. | to search (PROBED_SIZE) was larger than the actual PMTU. | |||
* The PLPMTU is not updated. | * The PLPMTU is not updated. | |||
* The PL can use the reported PL_PTB_SIZE from the PTB message as | * The PL can use the reported PL_PTB_SIZE from the PTB message as | |||
the next search point when it resumes the search algorithm. | the next search point when it resumes the search algorithm. | |||
PL_PTB_SIZE > PROBED_SIZE | PL_PTB_SIZE >= PROBED_SIZE | |||
* Inconsistent network signal. | * Inconsistent network signal. | |||
* PTB message ought to be discarded without further processing | * PTB message ought to be discarded without further processing | |||
(i.e., PLPMTU is not modified). | (i.e., PLPMTU is not modified). | |||
* The information could be utilized as an input to trigger | * The information could be utilized as an input to trigger | |||
enabling a resilience mode. | enabling a resilience mode. | |||
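The size ranges listed above can be summarized as a small classification function. This sketch follows the right-hand (-20) column and rests on stated assumptions: the enum and function names are hypothetical, the caller is assumed to guarantee MIN_PLPMTU <= BASE_PLPMTU <= PLPMTU < PROBED_SIZE, and the range between MIN_PLPMTU and BASE_PLPMTU is reported as its own case because its handling is not shown in this excerpt. The -20 text does not list the case PL_PTB_SIZE = PLPMTU, so the sketch reports it separately rather than guessing.

```python
from enum import Enum


class PtbCase(Enum):
    INVALID = "below MIN_PLPMTU: discard"
    BELOW_BASE = "below BASE_PLPMTU: handling not shown in this excerpt"
    BLACK_HOLE = "possible black hole: fall back to BASE_PLPMTU, search"
    EQUAL_PLPMTU = "equal to PLPMTU: not listed in the -20 text"
    NEXT_SEARCH_POINT = "PLPMTU unchanged: use size as next search point"
    INCONSISTENT = "at or above PROBED_SIZE: discard"


def classify_ptb(pl_ptb_size, min_plpmtu, base_plpmtu, plpmtu, probed_size):
    """Map a validated PL_PTB_SIZE onto one of the cases above."""
    if pl_ptb_size < min_plpmtu:
        return PtbCase.INVALID
    if pl_ptb_size < base_plpmtu:
        return PtbCase.BELOW_BASE
    if pl_ptb_size < plpmtu:
        return PtbCase.BLACK_HOLE
    if pl_ptb_size == plpmtu:
        return PtbCase.EQUAL_PLPMTU
    if pl_ptb_size < probed_size:
        return PtbCase.NEXT_SEARCH_POINT
    return PtbCase.INCONSISTENT
```

For example, with arbitrary illustrative values MIN_PLPMTU = 500, BASE_PLPMTU = 1200, PLPMTU = 1400 and PROBED_SIZE = 1450, a PL_PTB_SIZE of 1420 is classified as NEXT_SEARCH_POINT and a PL_PTB_SIZE of 1450 as INCONSISTENT.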
5. Datagram Packetization Layer PMTUD | 5. Datagram Packetization Layer PMTUD | |||
This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | |||
be introduced at various points (as indicated with * in the figure | be introduced at various points (as indicated with * in the figure | |||
below) in the IP protocol stack to discover the PLPMTU so that an | below) in the IP protocol stack to discover the PLPMTU so that an | |||
application can utilize an appropriate MPS for the current network | application can utilize an appropriate MPS for the current network | |||
path. | path. | |||
DPLPMTUD SHOULD NOT be used by an upper PL or application if it is | DPLPMTUD SHOULD only be performed at one layer between a pair of | |||
already used in a lower layer DPLPMTUD SHOULD only be performed once | endpoints. Therefore, an upper PL or application should avoid using | |||
between a pair of endpoints. A PL MUST adjust the MPS indicated by | DPLPMTUD when this is already enabled in a lower layer. A PL MUST | |||
DPLPMTUD to account for any additional overhead introduced by the PL. | adjust the MPS indicated by DPLPMTUD to account for any additional | |||
overhead introduced by the PL. | ||||
+----------------------+ | +----------------------+ | |||
| Application* | | | Application* | | |||
+-----+------------+---+ | +-----+------------+---+ | |||
| | | | | | |||
+---+--+ +--+--+ | +---+--+ +--+--+ | |||
| QUIC*| |SCTP*| | | QUIC*| |SCTP*| | |||
+---+--+ +-+-+-+ | +---+--+ +-+-+-+ | |||
| | | | | | | | |||
+---+ +----+ | | +---+ +----+ | | |||
skipping to change at page 21, line 47 ¶ | skipping to change at page 22, line 4 ¶ | |||
DPLPMTUD. | DPLPMTUD. | |||
5.1.1. Timers | 5.1.1. Timers | |||
The method utilizes up to three timers: | The method utilizes up to three timers: | |||
PROBE_TIMER: The PROBE_TIMER is configured to expire after a period | PROBE_TIMER: The PROBE_TIMER is configured to expire after a period | |||
longer than the maximum time to receive an acknowledgment to a | longer than the maximum time to receive an acknowledgment to a | |||
probe packet. This value MUST NOT be smaller than 1 second, and | probe packet. This value MUST NOT be smaller than 1 second, and | |||
SHOULD be larger than 15 seconds. Guidance on selection of the | SHOULD be larger than 15 seconds. Guidance on selection of the | |||
timer value is provided in section 3.1.1 of the UDP Usage | timer value is provided in Section 3.1.1 of the UDP Usage | |||
Guidelines [RFC8085]. | Guidelines [BCP145]. | |||
PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a | PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a | |||
sender will continue to use the current PLPMTU, after which it re- | sender will continue to use the current PLPMTU, after which it re- | |||
enters the Search phase. This timer has a period of 600 seconds, | enters the Search phase. This timer has a period of 600 seconds, | |||
as recommended by PLPMTUD [RFC4821]. | as recommended by PLPMTUD [RFC4821]. | |||
DPLPMTUD MAY inhibit sending probe packets when no application | DPLPMTUD MAY inhibit sending probe packets when no application | |||
data has been sent since the previous probe packet. A PL | data has been sent since the previous probe packet. A PL | |||
preferring to use an up-to-date PMTU once user data is sent again, | preferring to use an up-to-date PMTU once user data is sent again, | |||
can choose to continue PMTU discovery for each path. However, | can choose to continue PMTU discovery for each path. However, | |||
this could result in sending additional packets. | this will result in sending additional packets. | |||
CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST | CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST | |||
NOT be used. For other PLs, the CONFIRMATION_TIMER is configured | NOT be used. For other PLs, the CONFIRMATION_TIMER is configured | |||
to the period a PL sender waits before confirming the current | to the period a PL sender waits before confirming the current | |||
PLPMTU is still supported. This is less than the PMTU_RAISE_TIMER | PLPMTU is still supported. This is less than the PMTU_RAISE_TIMER | |||
and used to decrease the PLPMTU (e.g., when a black hole is | and used to decrease the PLPMTU (e.g., when a black hole is | |||
encountered). Confirmation needs to be frequent enough when data | encountered). Confirmation needs to be frequent enough when data | |||
is flowing that the sending PL does not black hole extensive | is flowing that the sending PL does not black hole extensive | |||
amounts of traffic. Guidance on selection of the timer value is | amounts of traffic. Guidance on selection of the timer value is | |||
provided in section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | provided in Section 3.1.1 of the UDP Usage Guidelines [BCP145]. | |||
DPLPMTUD MAY inhibit sending probe packets when no application | DPLPMTUD MAY inhibit sending probe packets when no application | |||
data has been sent since the previous probe packet. A PL | data has been sent since the previous probe packet. A PL | |||
preferring to use an up-to-date PMTU once user data is sent again, | preferring to use an up-to-date PMTU once user data is sent again, | |||
can choose to continue PMTU discovery for each path. However, | can choose to continue PMTU discovery for each path. However, | |||
this could result in sending additional packets. | this could result in sending additional packets. | |||
The various timers could be implemented using a single timer | DPLPMTUD specifies various timers; however, an implementation could | |||
choose to realise these timer functions using a single timer. | ||||
5.1.2. Constants | 5.1.2. Constants | |||
The following constants are defined: | The following constants are defined: | |||
MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | |||
counter (see Section 5.1.3). MAX_PROBES represents the limit for | counter (see Section 5.1.3). MAX_PROBES represents the limit for | |||
the number of consecutive probe attempts of any size. Search | the number of consecutive probe attempts of any size. Search | |||
algorithms benefit from a MAX_PROBES value greater than 1 because | algorithms benefit from a MAX_PROBES value greater than 1 because | |||
this can provide robustness to isolated packet loss. The default | this can provide robustness to isolated packet loss. The default | |||
value of MAX_PROBES is 3. | value of MAX_PROBES is 3. | |||
MIN_PLPMTU: The MIN_PLPMTU is the smallest allowed probe packet | MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | |||
size. For IPv6, this value is 1280 bytes, as specified in | DPLPMTUD will attempt to use. For IPv6, this size is greater than | |||
[RFC8200]. For IPv4, the minimum value is 68 bytes. | or equal to the size at the PL that results in a 1280 byte IPv6 | |||
packet, as specified in [RFC8200]. For IPv4, this size is greater | ||||
Note: An IPv4 router is required to be able to forward a datagram | than or equal to the size at the PL that results in a 68 byte | |||
of 68 bytes without further fragmentation. This is the combined | IPv4 packet. Note: An IPv4 router is required to be able to | |||
size of an IPv4 header and the minimum fragment size of 8 bytes. | forward a datagram of 68 bytes without further fragmentation. | |||
In addition, receivers are required to be able to reassemble | This is the combined size of an IPv4 header and the minimum | |||
fragmented datagrams at least up to 576 bytes, as stated in | fragment size of 8 bytes. In addition, receivers are required to | |||
section 3.3.3 of [RFC1122]. | be able to reassemble fragmented datagrams at least up to 576 | |||
bytes, as stated in section 3.3.3 of [RFC1122]. | ||||
MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU. This has | MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU. This has | |||
to be less than or equal to the maximum size of the PL packet that | to be less than or equal to the maximum size of the PL packet that | |||
can be sent on the outgoing interface (constrained by the local | can be sent on the outgoing interface (constrained by the local | |||
interface MTU). When known, this also ought to be less than the | interface MTU). When known, this also ought to be less than the | |||
maximum size of PL packet that can be received by the remote | maximum size of PL packet that can be received by the remote | |||
endpoint (constrained by EMTU_R). It can be limited by the design | endpoint (constrained by EMTU_R). It can be limited by the design | |||
or configuration of the PL being used. An application, or PL, MAY | or configuration of the PL being used. An application, or PL, MAY | |||
choose a smaller MAX_PLPMTU when there is no need to send packets | choose a smaller MAX_PLPMTU when there is no need to send packets | |||
larger than a specific size. | larger than a specific size. | |||
BASE_PLPMTU: The BASE_PLPMTU is a configured size expected to work | BASE_PLPMTU: The BASE_PLPMTU is a configured size expected to work | |||
for most paths. The size is equal to or larger than the | for most paths. The size is equal to or larger than the | |||
MIN_PLPMTU and smaller than the MAX_PLPMTU. In the case of IPv6, | MIN_PLPMTU and smaller than the MAX_PLPMTU. For most PLs a | |||
this value is derived from the IPv6 minimum link MTU of 1280 bytes | suitable BASE_PLPMTU will be larger than 1200 bytes. When using | |||
[RFC8200]. When using IPv4, there is no currently equivalent size | IPv4, there is no currently equivalent size specified and a | |||
specified and a default BASE_PLPMTU of 1200 bytes is RECOMMENDED. | default BASE_PLPMTU of 1200 bytes is RECOMMENDED. | |||
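
As an illustration only, the constants above could be gathered into a small configuration object that checks the stated ordering (MIN_PLPMTU <= BASE_PLPMTU < MAX_PLPMTU). The names `DplpmtudConfig` and `max_plpmtu`, and the example sizes, are assumptions made for this sketch and are not defined by this document. All sizes are taken to be sizes at the PL, in bytes.

```python
from dataclasses import dataclass
from typing import Optional


def max_plpmtu(local_limit: int,
               emtu_r: Optional[int] = None,
               app_limit: Optional[int] = None) -> int:
    """Bound MAX_PLPMTU by the local interface limit, by EMTU_R when it
    is known, and by any smaller limit chosen by the application."""
    candidates = [local_limit]
    if emtu_r is not None:
        candidates.append(emtu_r)
    if app_limit is not None:
        candidates.append(app_limit)
    return min(candidates)


@dataclass(frozen=True)
class DplpmtudConfig:
    """Hypothetical container for the constants of Section 5.1.2."""
    min_plpmtu: int
    max_plpmtu: int
    base_plpmtu: int = 1200   # RECOMMENDED default when using IPv4
    max_probes: int = 3       # default value of MAX_PROBES

    def __post_init__(self) -> None:
        if self.max_probes < 1:
            raise ValueError("MAX_PROBES must be at least 1")
        # BASE_PLPMTU is equal to or larger than MIN_PLPMTU and
        # smaller than MAX_PLPMTU.
        if not (self.min_plpmtu <= self.base_plpmtu < self.max_plpmtu):
            raise ValueError("need MIN_PLPMTU <= BASE_PLPMTU < MAX_PLPMTU")
```

For example, a UDP-based PL over IPv4 on a 1500 byte interface (20 byte IPv4 header without options, 8 byte UDP header) might use `DplpmtudConfig(min_plpmtu=40, max_plpmtu=max_plpmtu(1472))`; these figures are illustrative rather than normative.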
5.1.3. Variables | 5.1.3. Variables | |||
This method utilizes a set of variables: | This method utilizes a set of variables: | |||
PROBED_SIZE: The PROBED_SIZE is the size of the current probe | PROBED_SIZE: The PROBED_SIZE is the size of the current probe packet | |||
packet. This is a tentative value for the PLPMTU, which is | as determined at the PL. This is a tentative value for the | |||
awaiting confirmation by an acknowledgment. | PLPMTU, which is awaiting confirmation by an acknowledgment. | |||
PROBE_COUNT: The PROBE_COUNT is a count of the number of successive | PROBE_COUNT: The PROBE_COUNT is a count of the number of successive | |||
unsuccessful probe packets that have been sent. Each time a probe | unsuccessful probe packets that have been sent. Each time a probe | |||
packet is acknowledged, the value is set to zero. (Some probe | packet is acknowledged, the value is set to zero. (Some probe | |||
loss is expected while searching, therefore loss of a single probe | loss is expected while searching, therefore loss of a single probe | |||
is not an indication of a PMTU problem.) | is not an indication of a PMTU problem.) | |||
The figure below illustrates the relationship between the packet size | The figure below illustrates the relationship between the packet size | |||
constants and variables at a point of time when the DPLPMTUD | constants and variables at a point of time when the DPLPMTUD | |||
algorithm performs path probing to increase the size of the PLPMTU. | algorithm performs path probing to increase the size of the PLPMTU. | |||
skipping to change at page 24, line 41 ¶ | skipping to change at page 24, line 50 ¶ | |||
| expired | | completed | | expired | | completed | |||
| | | | | | | | |||
| | v | | | v | |||
| +-----------------+ | | +-----------------+ | |||
+---| Search Complete | | +---| Search Complete | | |||
+-----------------+ | +-----------------+ | |||
Figure 4: DPLPMTUD Phases | Figure 4: DPLPMTUD Phases | |||
Base: The Base Phase confirms connectivity to the remote peer using | Base: The Base Phase confirms connectivity to the remote peer using | |||
packets of the BASE_PLPMTU. This phase is implicit for a | packets of the BASE_PLPMTU. The confirmation of connectivity is | |||
connection-oriented PL (where it can be performed in a PL | implicit for a connection-oriented PL (where it can be performed | |||
connection handshake). A connectionless PL sends a probe packet | in a PL connection handshake). A connectionless PL sends a probe | |||
and uses acknowledgment of this probe packet to confirm that the | packet and uses acknowledgment of this probe packet to confirm | |||
remote peer is reachable. | that the remote peer is reachable. | |||
The sender also confirms that BASE_PLPMTU is supported across the | The sender also confirms that BASE_PLPMTU is supported across the | |||
network path. This may be achieved using a PL mechanism (e.g., | network path. This may be achieved using a PL mechanism (e.g., | |||
using a handshake packet of size BASE_PLPMTU), or by sending a | using a handshake packet of size BASE_PLPMTU), or by sending a | |||
probe packet of size BASE_PLPMTU and confirming that this is | probe packet of size BASE_PLPMTU and confirming that this is | |||
received. | received. | |||
A probe packet of size BASE_PLPMTU can be sent immediately on the | A probe packet of size BASE_PLPMTU can be sent immediately on the | |||
initial entry to the Base Phase (following a connectivity check). | initial entry to the Base Phase (following a connectivity check). | |||
A PL that does not wish to support a path with a PLPMTU less than | A PL that does not wish to support a path with a PLPMTU less than | |||
skipping to change at page 27, line 13 ¶ | skipping to change at page 27, line 13 ¶ | |||
Note: Not all changes are shown to simplify the diagram. | Note: Not all changes are shown to simplify the diagram. | |||
| | | | | | |||
| Start | PL indicates loss | | Start | PL indicates loss | |||
| | of connectivity | | | of connectivity | |||
v v | v v | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
| DISABLED | | ERROR | | | DISABLED | | ERROR | | |||
+---------------+ PROBE_TIMER expiry: +---------------+ | +---------------+ PROBE_TIMER expiry: +---------------+ | |||
| PL indicates PROBE_COUNT = MAX_PROBES or ^ | | | PL indicates PROBE_COUNT = MAX_PROBES or ^ | | |||
| connectivity PTB: PLPTB_SIZE < BASE_PLPMTU | | | | connectivity PTB: PL_PTB_SIZE < BASE_PLPMTU | | | |||
+--------------------+ +---------------+ | | +--------------------+ +---------------+ | | |||
| | | | | | | | |||
v | BASE_PLPMTU Probe | | v | BASE_PLPMTU Probe | | |||
+---------------+ acked | | +---------------+ acked | | |||
| BASE |----------------------+ | | BASE |--------------------->+ | |||
+---------------+ | | +---------------+ | | |||
^ | ^ ^ | | ^ | ^ ^ | | |||
Black hole detected | | | | Black hole detected | | Black hole detected | | | | Black hole detected | | |||
+--------------------+ | | +--------------------+ | | +--------------------+ | | +--------------------+ | | |||
| +----+ | | | | +----+ | | | |||
| PROBE_TIMER expiry: | | | | PROBE_TIMER expiry: | | | |||
| PROBE_COUNT < MAX_PROBES | | | | PROBE_COUNT < MAX_PROBES | | | |||
| | | | | | | | |||
| PMTU_RAISE_TIMER expiry | | | | PMTU_RAISE_TIMER expiry | | | |||
| +-----------------------------------------+ | | | | +-----------------------------------------+ | | | |||
| | | | | | | | | | | | |||
| | v | v | | | v | v | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
|SEARCH_COMPLETE| | SEARCHING | | |SEARCH_COMPLETE| | SEARCHING | | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
| ^ ^ | | ^ | | ^ ^ | | ^ | |||
| | | | | | | | | | | | | | |||
| | +-----------------------------------------+ | | | | | +-----------------------------------------+ | | | |||
| | MAX_PLPMTU Probe acked or | | | | | MAX_PLPMTU Probe acked or | | | |||
| | PROBE_TIMER expiry: PROBE_COUNT = MAX_PROBES or | | | | | PROBE_TIMER expiry: PROBE_COUNT = MAX_PROBES or | | | |||
+----+ PTB: PLPTB_SIZE = PLPMTU +----+ | +----+ PTB: PL_PTB_SIZE = PLPMTU +----+ | |||
CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: | CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: | |||
PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or | PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or | |||
PLPMTU Probe acked Probe acked or PTB: | PLPMTU Probe acked Probe acked or PTB: | |||
PLPMTU < PLPTB_SIZE < PROBED_SIZE | PLPMTU < PL_PTB_SIZE < PROBED_SIZE | |||
Figure 5: State machine for Datagram PLPMTUD | Figure 5: State machine for Datagram PLPMTUD | |||
The following states are defined: | The following states are defined: | |||
DISABLED: The DISABLED state is the initial state before probing has | DISABLED: The DISABLED state is the initial state before probing has | |||
started. It is also entered from any other state, when the PL | started. It is also entered from any other state, when the PL | |||
indicates loss of connectivity. This state is left once the PL | indicates loss of connectivity. This state is left once the PL | |||
indicates connectivity to the remote PL. When transitioning to | indicates connectivity to the remote PL. When transitioning to | |||
the BASE state, a probe packet of size BASE_PLPMTU can be sent | the BASE state, a probe packet of size BASE_PLPMTU can be sent | |||
skipping to change at page 29, line 50 ¶ | skipping to change at page 29, line 50 ¶ | |||
path. | path. | |||
The method discovers the search range by confirming the minimum | The method discovers the search range by confirming the minimum | |||
PLPMTU and then using the probe method to select a PROBED_SIZE less | PLPMTU and then using the probe method to select a PROBED_SIZE less | |||
than or equal to MAX_PLPMTU. MAX_PLPMTU is the minimum of the local | than or equal to MAX_PLPMTU. MAX_PLPMTU is the minimum of the local | |||
MTU and EMTU_R (when this is learned from the remote endpoint). The | MTU and EMTU_R (when this is learned from the remote endpoint). The | |||
MAX_PLPMTU MAY be reduced by an application that sets a maximum to | MAX_PLPMTU MAY be reduced by an application that sets a maximum to | |||
the size of datagrams it will send. | the size of datagrams it will send. | |||
The PROBE_COUNT is initialized to zero when the first probe with a | The PROBE_COUNT is initialized to zero when the first probe with a | |||
size greater than or equal to PLPMTU is sent. A timer is used to | size greater than or equal to PLPMTU is sent. Each probe packet | |||
trigger the sending of probe packets of size PROBED_SIZE, larger than | successfully sent to the remote peer is confirmed by acknowledgment | |||
the PLPMTU. Each probe packet successfully sent to the remote peer | at the PL, see Section 4.1. | |||
is confirmed by acknowledgment at the PL, see Section 4.1. | ||||
Each time a probe packet is sent to the destination, the PROBE_TIMER | Each time a probe packet is sent to the destination, the PROBE_TIMER | |||
is started. The timer is canceled when the PL receives | is started. The timer is canceled when the PL receives | |||
acknowledgment that the probe packet has been successfully sent | acknowledgment that the probe packet has been successfully sent | |||
across the path (see Section 4.1). This confirms that the PROBED_SIZE is | across the path (see Section 4.1). This confirms that the PROBED_SIZE is | |||
supported, and the PROBED_SIZE value is then assigned to the PLPMTU. | supported, and the PROBED_SIZE value is then assigned to the PLPMTU. | |||
The search algorithm can continue to send subsequent probe packets of | The search algorithm can continue to send subsequent probe packets of | |||
an increasing size. | an increasing size. | |||
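
The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the specified implementation: the class `ProbeState` and its method names are hypothetical, the PROBE_TIMER is modelled as a simple flag, and the expiry handler is inferred from the PROBE_COUNT definition in Section 5.1.3 and the PROBE_TIMER transitions in Figure 5.

```python
from dataclasses import dataclass

MAX_PROBES = 3  # default value from Section 5.1.2


@dataclass
class ProbeState:
    """Hypothetical holder for PLPMTU, PROBED_SIZE and PROBE_COUNT."""
    plpmtu: int
    probed_size: int = 0
    probe_count: int = 0
    probe_timer_running: bool = False

    def send_probe(self, size: int) -> None:
        # Each time a probe packet is sent, the PROBE_TIMER is started.
        self.probed_size = size
        self.probe_timer_running = True

    def on_probe_ack(self) -> None:
        # An acknowledgment cancels the timer and confirms PROBED_SIZE,
        # which is then assigned to the PLPMTU.
        self.probe_timer_running = False
        self.plpmtu = self.probed_size
        self.probe_count = 0

    def on_probe_timer_expiry(self) -> bool:
        # Count the unacknowledged probe. Returns True while another
        # probe attempt is still allowed (PROBE_COUNT < MAX_PROBES).
        self.probe_timer_running = False
        self.probe_count += 1
        return self.probe_count < MAX_PROBES
```

Note that a failed probe leaves the PLPMTU unchanged in this sketch; only an acknowledged probe raises it.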
If the timer expires before a probe packet is acknowledged, the probe | If the timer expires before a probe packet is acknowledged, the probe | |||
skipping to change at page 31, line 42 ¶ | skipping to change at page 31, line 40 ¶ | |||
6.1. Application support for DPLPMTUD with UDP or UDP-Lite | 6.1. Application support for DPLPMTUD with UDP or UDP-Lite | |||
The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do | The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do | |||
not define a method in the RFC-series that supports PLPMTUD. In | not define a method in the RFC-series that supports PLPMTUD. In | |||
particular, the UDP transport does not provide the transport features | particular, the UDP transport does not provide the transport features | |||
needed to implement datagram PLPMTUD. | needed to implement datagram PLPMTUD. | |||
The DPLPMTUD method can be implemented as a part of an application | The DPLPMTUD method can be implemented as a part of an application | |||
built directly or indirectly on UDP or UDP-Lite, but relies on | built directly or indirectly on UDP or UDP-Lite, but relies on | |||
higher-layer protocol features to implement the method [RFC8085]. | higher-layer protocol features to implement the method [BCP145]. | |||
Some primitives used by DPLPMTUD might not be available via the | Some primitives used by DPLPMTUD might not be available via the | |||
Datagram API (e.g., the ability to access the PLPMTU from the IP | Datagram API (e.g., the ability to access the PLPMTU from the IP | |||
layer cache, or interpret received PTB messages). | layer cache, or interpret received PTB messages). | |||
In addition, it is recommended that PMTU discovery is not performed | In addition, it is recommended that PMTU discovery is not performed | |||
by multiple protocol layers. An application SHOULD avoid using | by multiple protocol layers. An application SHOULD avoid using | |||
DPLPMTUD when the underlying transport system provides this | DPLPMTUD when the underlying transport system provides this | |||
capability. To use common method for managing the PLPMTU has | capability. A common method for managing the PLPMTU has benefits, | |||
benefits, both in the ability to share state between different | both in the ability to share state between different processes and | |||
processes and opportunities to coordinate probing. | opportunities to coordinate probing for different PL instances. | |||
6.1.1. Application Request | 6.1.1. Application Request | |||
An application needs an application-layer protocol mechanism (such as | An application needs an application-layer protocol mechanism (such as | |||
a message acknowledgment method) that solicits a response from a | a message acknowledgment method) that solicits a response from a | |||
destination endpoint. The method SHOULD allow the sender to check | destination endpoint. The method SHOULD allow the sender to check | |||
the value returned in the response to provide additional protection | the value returned in the response to provide additional protection | |||
from off-path insertion of data [RFC8085]. Suitable methods include | from off-path insertion of data [BCP145]. Suitable methods include a | |||
a parameter known only to the two endpoints, such as a session ID or | parameter known only to the two endpoints, such as a session ID or | |||
initialized sequence number. | initialized sequence number. | |||
6.1.2. Application Response | 6.1.2. Application Response | |||
An application needs an application-layer protocol mechanism to | An application needs an application-layer protocol mechanism to | |||
communicate the response from the destination endpoint. This | communicate the response from the destination endpoint. This | |||
response could indicate successful reception of the probe across the | response could indicate successful reception of the probe across the | |||
path, but could also indicate that some (or all packets) have failed | path, but could also indicate that some (or all packets) have failed | |||
to reach the destination. | to reach the destination. | |||
skipping to change at page 33, line 8 ¶ | skipping to change at page 32, line 48 ¶ | |||
6.1.5. Validating the Path | 6.1.5. Validating the Path | |||
An application that does not have other higher-layer information | An application that does not have other higher-layer information | |||
confirming correct delivery of datagrams SHOULD implement the | confirming correct delivery of datagrams SHOULD implement the | |||
CONFIRMATION_TIMER to periodically send probe packets while in the | CONFIRMATION_TIMER to periodically send probe packets while in the | |||
SEARCH_COMPLETE state. | SEARCH_COMPLETE state. | |||
6.1.6. Handling of PTB Messages | 6.1.6. Handling of PTB Messages | |||
An application that is able and wishes to receive PTB messages MUST | An application that is able and wishes to receive PTB messages MUST | |||
perform ICMP validation as specified in Section 5.2 of [RFC8085]. | perform ICMP validation as specified in Section 5.2 of [BCP145]. | |||
This requires that the application checks each received PTB message | This requires that the application checks each received PTB message | |||
to validate that it was received in response to transmitted | to validate that it was received in response to transmitted | |||
traffic and that the reported PL_PTB_SIZE is less than the current | traffic and that the reported PL_PTB_SIZE is less than the current | |||
probed size (see Section 4.6.2). A validated PTB message MAY be used | probed size (see Section 4.6.2). A validated PTB message MAY be used | |||
as input to the DPLPMTUD algorithm, but MUST NOT be used directly to | as input to the DPLPMTUD algorithm, but MUST NOT be used directly to | |||
set the PLPMTU. | set the PLPMTU. | |||
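
A minimal sketch of the checks described above is shown below. The function name `ptb_search_hint` and the boolean `matches_sent_traffic` (standing in for the ICMP validation against transmitted traffic) are assumptions made for illustration. The function only returns a hint for the search algorithm; it never sets the PLPMTU itself.

```python
from typing import Optional


def ptb_search_hint(pl_ptb_size: int,
                    probed_size: int,
                    matches_sent_traffic: bool) -> Optional[int]:
    """Return PL_PTB_SIZE as an input to the search algorithm when the
    PTB message passes both checks, otherwise None (message ignored)."""
    if not matches_sent_traffic:
        # The PTB was not received in response to transmitted traffic.
        return None
    if pl_ptb_size >= probed_size:
        # The reported size is not less than the current probed size.
        return None
    return pl_ptb_size
```

A caller could, for instance, use the returned value to choose the size of the next probe packet, and would still wait for that probe to be acknowledged before changing the PLPMTU.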
6.2. DPLPMTUD for SCTP | 6.2. DPLPMTUD for SCTP | |||
Section 10.2 of [RFC4821] specified a recommended PLPMTUD probing | Section 10.2 of [RFC4821] specified a recommended PLPMTUD probing | |||
method for SCTP and Section 7.3 of [RFC4960] and recommended an | method for SCTP and Section 7.3 of [RFC4960] recommended an endpoint | |||
endpoint apply the techniques in RFC4821 on a per-destination-address | apply the techniques in RFC4821 on a per-destination-address basis. | |||
basis. The specification for DPLPMTUD continues the practice of | The specification for DPLPMTUD continues the practice of using the PL | |||
using the PL to discover the PMTU, but updates RFC4960 with a | to discover the PMTU, but updates RFC4960 with a recommendation to | |||
recommendation to use the method specified in this document: The | use the method specified in this document: The RECOMMENDED method for | |||
RECOMMENDED method for generating probes is to add a chunk consisting | generating probes is to add a chunk consisting only of padding to an | |||
only of padding to an SCTP message. The PAD chunk defined in | SCTP message. The PAD chunk defined in [RFC4820] SHOULD be attached | |||
[RFC4820] SHOULD be attached to a minimum length HEARTBEAT (HB) chunk | to a minimum length HEARTBEAT (HB) chunk to build a probe packet. | |||
to build a probe packet. This enables probing without affecting the | This enables probing without affecting the transfer of user messages | |||
transfer of user messages and without being limited by congestion | and without being limited by congestion control or flow control. | |||
control or flow control. This is preferred to using DATA chunks | This is preferred to using DATA chunks (with padding as required) as | |||
(with padding as required) as path probes. | path probes. | |||
Section 6.9 of [RFC4960] describes dividing the user messages into | Section 6.9 of [RFC4960] describes dividing the user messages into | |||
data chunks sent by the PL when using SCTP. This notes that once an | data chunks sent by the PL when using SCTP. This notes that once an | |||
SCTP message has been sent, it cannot be re-segmented. [RFC4960] | SCTP message has been sent, it cannot be re-segmented. [RFC4960] | |||
describes the method to retransmit data chunks when the MPS has | describes the method to retransmit data chunks when the MPS has | |||
reduced, and the use of IP fragmentation for this case. | reduced, and the use of IP fragmentation for this case. This is | |||
unchanged by this document. | ||||
6.2.1. SCTP/IPv4 and SCTP/IPv6 | 6.2.1. SCTP/IPv4 and SCTP/IPv6 | |||
6.2.1.1. Initial Connectivity | 6.2.1.1. Initial Connectivity | |||
The base protocol is specified in [RFC4960]. This provides an | The base protocol is specified in [RFC4960]. This provides an | |||
acknowledged PL. A sender can therefore enter the BASE state as soon | acknowledged PL. A sender can therefore enter the BASE state as soon | |||
as connectivity has been confirmed. | as connectivity has been confirmed. | |||
6.2.1.2. Sending SCTP Probe Packets | 6.2.1.2. Sending SCTP Probe Packets | |||
Probe packets consist of an SCTP common header followed by a | Probe packets consist of an SCTP common header followed by a | |||
HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control | HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control | |||
the length of the probe packet. The HEARTBEAT chunk is used to | the length of the probe packet. The HEARTBEAT chunk is used to | |||
trigger the sending of a HEARTBEAT ACK chunk. The reception of the | trigger the sending of a HEARTBEAT ACK chunk. The reception of the | |||
HEARTBEAT ACK chunk acknowledges reception of a successful probe. A | HEARTBEAT ACK chunk acknowledges reception of a successful probe. A | |||
successful probe updates the association and path counters, but an | successful probe updates the association and path counters, but an | |||
unsuccessful probe is discounted (assumed to be a result of choosing | unsuccessful probe is discounted (assumed to be a result of choosing | |||
too large a PLPMTU). | too large a PLPMTU). | |||
The HEARTBEAT chunk carries a Heartbeat Information parameter which | The SCTP sender needs to be able to determine the total size of a | |||
includes, besides the information suggested in [RFC4960], the probe | probe packet. The HEARTBEAT chunk could carry a Heartbeat | |||
size, which is the size of the complete datagram. The size of the | Information parameter that includes, besides the information | |||
PAD chunk is therefore computed by reducing the probing size by the | suggested in [RFC4960], the probe size to help an implementation | |||
IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT | associate a HEARTBEAT-ACK with the size of probe that was sent. The | |||
request and the PAD chunk header. The payload of the PAD chunk | sender could also use other methods, such as sending a nonce and | |||
contains arbitrary data. | verifying the information returned also contains the corresponding | |||
nonce. The length of the PAD chunk is computed by reducing the | ||||
probing size by the size of the SCTP common header and the HEARTBEAT | ||||
chunk. The payload of the PAD chunk contains arbitrary data. When | ||||
transmitted at the IP layer, the PMTU size also includes the IPv4 or | ||||
IPv6 header(s). | ||||
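
The PAD chunk arithmetic in the revised text can be sketched as follows, assuming that the probing size is measured at the PL (the SCTP packet) and that the IP header carries no options or extension headers. The function names and the 48 byte HEARTBEAT chunk length used in the usage note are hypothetical; the 12 byte SCTP common header and 4 byte chunk header are the fixed sizes from the SCTP packet format.

```python
SCTP_COMMON_HEADER = 12  # source port, destination port, tag, checksum
CHUNK_HEADER = 4         # chunk type, flags, length


def pad_chunk_length(probe_size: int, heartbeat_chunk_len: int) -> int:
    """Total PAD chunk length (chunk header plus padding data) that
    brings the SCTP packet up to probe_size bytes."""
    pad_len = probe_size - SCTP_COMMON_HEADER - heartbeat_chunk_len
    if pad_len < CHUNK_HEADER:
        raise ValueError("probe size too small to carry a PAD chunk")
    return pad_len


def pad_payload_length(probe_size: int, heartbeat_chunk_len: int) -> int:
    """Number of arbitrary padding bytes carried inside the PAD chunk."""
    return pad_chunk_length(probe_size, heartbeat_chunk_len) - CHUNK_HEADER


def ip_packet_size(probe_size: int, ip_header_len: int) -> int:
    """Size at the IP layer, which also includes the IP header."""
    return probe_size + ip_header_len
```

With a probing size of 1460 bytes and a 48 byte HEARTBEAT chunk, the PAD chunk would be 1400 bytes long (1396 bytes of padding data), and the resulting IPv6 packet would be 1500 bytes with a 40 byte IPv6 header.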
Probing starts directly after the PL handshake, before data is sent. | Probing can start directly after the PL handshake, this can be done | |||
Assuming this behavior (i.e., the PMTU is smaller than or equal to | before data is sent. Assuming this behavior (i.e., the PMTU is | |||
the interface MTU), this process will take several round trip time | smaller than or equal to the interface MTU), this process will take | |||
periods, dependent on the number of DPLPMTUD probes sent. The | several round trip time periods, dependent on the number of DPLPMTUD | |||
Heartbeat timer can be used to implement the PROBE_TIMER. | probes sent. The Heartbeat timer can be used to implement the | |||
PROBE_TIMER. | ||||
6.2.1.3. Validating the Path with SCTP | 6.2.1.3. Validating the Path with SCTP | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT implement | Since SCTP provides an acknowledged PL, a sender MUST NOT implement | |||
the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.1.4. PTB Message Handling by SCTP | 6.2.1.4. PTB Message Handling by SCTP | |||
Normal ICMP validation MUST be performed as specified in Appendix C | Normal ICMP validation MUST be performed as specified in Appendix C | |||
of [RFC4960]. This requires that the first 8 bytes of the SCTP | of [RFC4960]. This requires that the first 8 bytes of the SCTP | |||
skipping to change at page 34, line 45 ¶ | skipping to change at page 34, line 44 ¶ | |||
from the PTB_SIZE reported in the PTB message SHOULD be used with the | from the PTB_SIZE reported in the PTB message SHOULD be used with the | |||
DPLPMTUD algorithm, providing that the reported PL_PTB_SIZE is less | DPLPMTUD algorithm, providing that the reported PL_PTB_SIZE is less | |||
than the current probe size (see Section 4.6). | than the current probe size (see Section 4.6). | |||
6.2.2. DPLPMTUD for SCTP/UDP | 6.2.2. DPLPMTUD for SCTP/UDP | |||
The UDP encapsulation of SCTP is specified in [RFC6951]. | The UDP encapsulation of SCTP is specified in [RFC6951]. | |||
This specification updates the reference to RFC 4821 in section 5.6 | This specification updates the reference to RFC 4821 in section 5.6 | |||
of RFC 6951 to refer to XXXTHISRFCXXX. RFC 6951 is updated by | of RFC 6951 to refer to XXXTHISRFCXXX. RFC 6951 is updated by | |||
addition of the following sentence is to be added at the end of | addition of the following sentence at the end of section 5.6: "The | |||
section 5.6: "The RECOMMENDED method for determining the MTU of the | RECOMMENDED method for determining the MTU of the path is specified | |||
path is specified in XXXTHISRFCXXX". | in XXXTHISRFCXXX". | |||
XXX RFC EDITOR - please replace XXXTHISRFCXXX when published XXX | XXX RFC EDITOR - please replace XXXTHISRFCXXX when published XXX | |||
6.2.2.1. Initial Connectivity | 6.2.2.1. Initial Connectivity | |||
A sender can enter the BASE state as soon as SCTP connectivity has | A sender can enter the BASE state as soon as SCTP connectivity has | |||
been confirmed. | been confirmed. | |||
6.2.2.2. Sending SCTP/UDP Probe Packets | 6.2.2.2. Sending SCTP/UDP Probe Packets | |||
Packet probing can be performed as specified in Section 6.2.1.2. The | Packet probing can be performed as specified in Section 6.2.1.2. The | |||
maximum payload is reduced by 8 bytes, which has to be considered | size of the probe packet includes the 8 bytes of UDP Header. This | |||
when filling the PAD chunk. | has to be considered when filling the probe packet with the PAD | |||
chunk. | ||||
6.2.2.3. Validating the Path with SCTP/UDP | 6.2.2.3. Validating the Path with SCTP/UDP | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT implement | SCTP provides an acknowledged PL, therefore a sender does not | |||
the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.2.4. Handling of PTB Messages by SCTP/UDP | 6.2.2.4. Handling of PTB Messages by SCTP/UDP | |||
ICMP validation MUST be performed for PTB messages as specified in | ICMP validation MUST be performed for PTB messages as specified in | |||
Appendix C of [RFC4960]. This requires that the first 8 bytes of the | Appendix C of [RFC4960]. This requires that the first 8 bytes of the | |||
SCTP common header are contained in the PTB message, which can be the | SCTP common header are contained in the PTB message, which can be the | |||
case for ICMPv4 (but note the UDP header also consumes a part of the | case for ICMPv4 (but note the UDP header also consumes a part of the | |||
quoted packet header) and is normally the case for ICMPv6. When the | quoted packet header) and is normally the case for ICMPv6. When the | |||
validation is completed, the PL_PTB_SIZE calculated from the PTB_SIZE | validation is completed, the PL_PTB_SIZE calculated from the PTB_SIZE | |||
in the PTB message SHOULD be used with the DPLPMTUD providing that | in the PTB message SHOULD be used with the DPLPMTUD providing that | |||
skipping to change at page 35, line 48 ¶ | skipping to change at page 36, line 7 ¶ | |||
XXX RFC EDITOR - please replace XXXTHISRFCXXX when published XXX | XXX RFC EDITOR - please replace XXXTHISRFCXXX when published XXX | |||
6.2.3.1. Initial Connectivity | 6.2.3.1. Initial Connectivity | |||
A sender can enter the BASE state as soon as SCTP connectivity has | A sender can enter the BASE state as soon as SCTP connectivity has | |||
been confirmed. | been confirmed. | |||
6.2.3.2. Sending SCTP/DTLS Probe Packets | 6.2.3.2. Sending SCTP/DTLS Probe Packets | |||
Packet probing can be done, as specified in Section 6.2.1.2. | Packet probing can be done, as specified in Section 6.2.1.2. The | |||
maximum payload is reduced by the size of the DTLS headers, which has | ||||
to be considered when filling the PAD chunk. The size of the probe | ||||
packet includes the DTLS PL headers. This has to be considered when | ||||
filling the probe packet with the PAD chunk. | ||||
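The sizing rule above can be sketched in Python as follows. This is an illustration only: the function name, the parameter names and the overhead values passed in are assumptions, and the actual DTLS overhead depends on the negotiated cipher suite. The sketch also ignores the SCTP rule that chunks are padded to a 4-byte boundary.

```python
SCTP_COMMON_HEADER = 12  # bytes; fixed size of the SCTP common header


def pad_chunk_length(probed_size, lower_overhead, dtls_overhead, other_chunks):
    """Return the bytes left for the PAD chunk (chunk header included).

    probed_size    -- total size of the probe packet to be sent
    lower_overhead -- headers below DTLS (e.g., IP and UDP), assumed known
    dtls_overhead  -- DTLS record header plus any MAC/tag, assumed known
    other_chunks   -- bytes used by other chunks in the probe packet
    """
    remaining = (probed_size - lower_overhead - dtls_overhead
                 - SCTP_COMMON_HEADER - other_chunks)
    if remaining < 4:  # a chunk header alone needs 4 bytes
        raise ValueError("probe size too small to carry a PAD chunk")
    return remaining
```

The point of the sketch is that the DTLS overhead is subtracted before the PAD chunk is sized, so that the packet sent on the wire matches the intended probe size.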
6.2.3.3. Validating the Path with SCTP/DTLS | 6.2.3.3. Validating the Path with SCTP/DTLS | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT implement | Since SCTP provides an acknowledged PL, a sender MUST NOT implement | |||
the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.3.4. Handling of PTB Messages by SCTP/DTLS | 6.2.3.4. Handling of PTB Messages by SCTP/DTLS | |||
[RFC4960] does not specify a way to validate SCTP/DTLS ICMP message | [RFC4960] does not specify a way to validate SCTP/DTLS ICMP message | |||
payload. This can prevent processing of PTB messages at the PL. | payload and neither does this document. This can prevent processing | |||
of PTB messages at the PL. | ||||
6.3. DPLPMTUD for QUIC | 6.3. DPLPMTUD for QUIC | |||
QUIC [I-D.ietf-quic-transport] is a UDP-based transport that provides | QUIC [I-D.ietf-quic-transport] is a UDP-based transport that provides | |||
reception feedback. The UDP payload includes the QUIC packet header, | reception feedback. The UDP payload includes the QUIC packet header, | |||
protected payload, and any authentication fields. QUIC depends on a | protected payload, and any authentication fields. QUIC depends on a | |||
PMTU of at least 1280 bytes. | PMTU of at least 1280 bytes. | |||
Section 14 of [I-D.ietf-quic-transport] describes the path | Section 14 of [I-D.ietf-quic-transport] describes the path | |||
considerations when sending QUIC packets. It recommends the use of | considerations when sending QUIC packets. It recommends the use of | |||
skipping to change at page 36, line 45 ¶ | skipping to change at page 37, line 5 ¶ | |||
ceases to send QUIC packets on the affected path. This could result | ceases to send QUIC packets on the affected path. This could result | |||
in termination of the connection if an alternative path cannot be | in termination of the connection if an alternative path cannot be | |||
found [I-D.ietf-quic-transport]. | found [I-D.ietf-quic-transport]. | |||
6.3.1. Initial Connectivity | 6.3.1. Initial Connectivity | |||
The base protocol is specified in [I-D.ietf-quic-transport]. This | The base protocol is specified in [I-D.ietf-quic-transport]. This | |||
provides an acknowledged PL. A sender can therefore enter the BASE | provides an acknowledged PL. A sender can therefore enter the BASE | |||
state as soon as connectivity has been confirmed. | state as soon as connectivity has been confirmed. | |||
QUIC provides an acknowledged PL, a sender can therefore enter the | ||||
BASE state as soon as connectivity has been confirmed. | ||||
6.3.2. Sending QUIC Probe Packets | 6.3.2. Sending QUIC Probe Packets | |||
A probe packet consists of a QUIC Header and a payload containing | Probe packets consist of a QUIC Header and a payload containing a | |||
PADDING Frames and a PING Frame. PADDING Frames are a single octet | PING Frame and multiple PADDING Frames. A PADDING Frame is | |||
(0x00) and several of these can be used to create a probe packet of | represented by a single octet (0x00). Several PADDING Frames are | |||
size PROBED_SIZE. QUIC provides an acknowledged PL, a sender can | used together to control the length of the probe packet. The PING | |||
therefore enter the BASE state as soon as connectivity has been | Frame is used to trigger generation of an acknowledgement. | |||
confirmed. | ||||
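A minimal Python sketch of the frame payload described above is shown below. The function name and the `overhead` parameter are assumptions for illustration; `overhead` stands for every byte of the probe packet that is not a frame (lower-layer headers, QUIC Header, authentication fields). The PING Frame type value 0x01 is taken from the QUIC transport specification rather than from this document, and the order of the frames is arbitrary here.

```python
PADDING_FRAME = 0x00  # single-octet PADDING Frame, as stated in the text
PING_FRAME = 0x01     # PING Frame type in the QUIC transport specification


def build_probe_frames(probed_size, overhead):
    """Return the frame bytes for a probe packet of probed_size bytes.

    overhead is the number of bytes in the packet that are not frames;
    it is an assumed input, not something this sketch computes.
    """
    frame_len = probed_size - overhead
    if frame_len < 1:
        raise ValueError("no room for the PING Frame")
    # One PING Frame to elicit an acknowledgement, then PADDING Frames
    # to bring the packet up to the probed size.
    return bytes([PING_FRAME]) + bytes([PADDING_FRAME]) * (frame_len - 1)
```

For example, `build_probe_frames(1400, 65)` yields one PING Frame followed by 1334 PADDING Frames.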
The current specification of QUIC sets the following: | The current specification of QUIC sets the following: | |||
* BASE_PLPMTU: A QUIC sender pads initial packets to confirm the | * BASE_PLPMTU: A QUIC sender pads initial packets to confirm the | |||
path can support packets of the required size, which sets the | path can support packets of the required size, which sets the | |||
BASE_PLPMTU and MIN_PLPMTU. | BASE_PLPMTU and MIN_PLPMTU. | |||
* MIN_PLPMTU: A QUIC sender that determines the MIN_PLPMTU has | * MIN_PLPMTU: A QUIC sender that determines the MIN_PLPMTU has | |||
fallen MUST immediately stop sending on the affected path. | fallen MUST immediately stop sending on the affected path. | |||
6.3.3. Validating the Path with QUIC | 6.3.3. Validating the Path with QUIC | |||
QUIC provides an acknowledged PL. A sender therefore MUST NOT | QUIC provides an acknowledged PL, therefore a sender does not | |||
implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.3.4. Handling of PTB Messages by QUIC | 6.3.4. Handling of PTB Messages by QUIC | |||
QUIC validates ICMP PTB messages. In addition to UDP Port | QUIC validates ICMP PTB messages. In addition to UDP Port | |||
validation, QUIC can validate an ICMP message by using other PL | validation, QUIC can validate an ICMP message by using other PL | |||
information (e.g., validation of connection identifiers (CIDs) in the | information (e.g., validation of connection identifiers (CIDs) in the | |||
quoted packet of any received ICMP message). | quoted packet of any received ICMP message). | |||
7. Acknowledgments | 7. Acknowledgments | |||
skipping to change at page 38, line 10 ¶ | skipping to change at page 38, line 19 ¶ | |||
To avoid excessive load, the interval between individual probe | To avoid excessive load, the interval between individual probe | |||
packets MUST be at least one RTT, and the interval between rounds of | packets MUST be at least one RTT, and the interval between rounds of | |||
probing is determined by the PMTU_RAISE_TIMER. | probing is determined by the PMTU_RAISE_TIMER. | |||
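The per-probe spacing rule above could be checked with a helper such as the following Python sketch. The function and argument names are hypothetical, and all times are assumed to be in the same unit. The interval between rounds of probing (PMTU_RAISE_TIMER) is not modelled.

```python
def may_send_probe(now, last_probe_time, rtt):
    """Return True if at least one RTT has passed since the last probe.

    last_probe_time is None before the first probe has been sent.
    """
    if last_probe_time is None:
        return True
    return (now - last_probe_time) >= rtt
```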
A PL sender needs to ensure that the method used to confirm reception | A PL sender needs to ensure that the method used to confirm reception | |||
of probe packets protects from off-path attackers injecting packets | of probe packets protects from off-path attackers injecting packets | |||
into the path. This protection is provided in IETF-defined protocols | into the path. This protection is provided in IETF-defined protocols | |||
(e.g., TCP, SCTP) using a randomly-initialized sequence number. A | (e.g., TCP, SCTP) using a randomly-initialized sequence number. A | |||
description of one way to do this when using UDP is provided in | description of one way to do this when using UDP is provided in | |||
Section 5.1 of [RFC8085]. | Section 5.1 of [BCP145]. | |||
There are cases where ICMP Packet Too Big (PTB) messages are not | There are cases where ICMP Packet Too Big (PTB) messages are not | |||
delivered due to policy, configuration or equipment design (see | delivered due to policy, configuration or equipment design (see | |||
Section 1.1). This method therefore does not rely upon PTB messages | Section 1.1). This method therefore does not rely upon PTB messages | |||
being received, but is able to utilize these when they are received | being received, but is able to utilize these when they are received | |||
by the sender. PTB messages could potentially be used to cause a | by the sender. PTB messages could potentially be used to cause a | |||
node to inappropriately reduce the PLPMTU. A node supporting | node to inappropriately reduce the PLPMTU. A node supporting | |||
DPLPMTUD MUST therefore appropriately validate the payload of PTB | DPLPMTUD MUST therefore appropriately validate the payload of PTB | |||
messages to ensure these are received in response to transmitted | messages to ensure these are received in response to transmitted | |||
traffic (i.e., a reported error condition that corresponds to a | traffic (i.e., a reported error condition that corresponds to a | |||
datagram actually sent by the path layer, see Section 4.6.1). | datagram actually sent by the path layer, see Section 4.6.1). | |||
An on-path attacker able to create a PTB message could forge PTB | An on-path attacker able to create a PTB message could forge PTB | |||
messages that include a valid quoted IP packet. Such an attack could | messages that include a valid quoted IP packet. Such an attack could | |||
be used to drive down the PLPMTU. There are two ways this method can | be used to drive down the PLPMTU. An on-path device could similarly | |||
be mitigated against such attacks: First, by ensuring that a PL | force a reduction of the PLPMTU by implementing a policy that drops | |||
sender never reduces the PLPMTU below the base size, solely in | packets larger than a configured size. There are two ways this | |||
method can be mitigated against such attacks: First, by ensuring that | ||||
a PL sender never reduces the PLPMTU below the base size, solely in | ||||
response to receiving a PTB message. This is achieved by first | response to receiving a PTB message. This is achieved by first | |||
entering the BASE state when such a message is received. Second, the | entering the BASE state when such a message is received. Second, the | |||
design does not require processing of PTB messages; a PL sender could | design does not require processing of PTB messages; a PL sender could | |||
therefore suspend processing of PTB messages (e.g., in a robustness | therefore suspend processing of PTB messages (e.g., in a robustness | |||
mode after detecting that subsequent probes actually confirm that a | mode after detecting that subsequent probes actually confirm that a | |||
size larger than the PTB_SIZE is supported by a path). | size larger than the PTB_SIZE is supported by a path). | |||
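The two mitigations above can be sketched in Python as follows. This is a simplified illustration, not the full PTB processing rules of the method: the function name, the `ptb_suspended` flag and the return convention are assumptions. It also assumes that the PTB message has already been validated and that the PLPMTU in the BASE state equals the base size.

```python
def on_validated_ptb(plpmtu, base_plpmtu, pl_ptb_size, ptb_suspended=False):
    """Return (next_state, new_plpmtu) after a validated PTB message.

    next_state is None when the PTB message causes no change.
    """
    if ptb_suspended:
        # Second mitigation: PTB messages are not processed at all.
        return None, plpmtu
    if pl_ptb_size >= plpmtu:
        # The reported size does not call for a reduction.
        return None, plpmtu
    # First mitigation: do not adopt pl_ptb_size directly. Enter BASE,
    # so the PLPMTU never drops below the base size because of a PTB
    # message alone; probing then confirms what the path supports.
    return "BASE", base_plpmtu
```

With a PLPMTU of 1400 and a base size of 1200, a forged PTB message reporting 600 bytes moves the sender to the BASE state at 1200 bytes rather than to 600 bytes.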
Parsing the quoted packet inside a PTB message can introduce additional | Parsing the quoted packet inside a PTB message can introduce additional | |||
per-packet processing at the PL sender. This processing SHOULD be | per-packet processing at the PL sender. This processing SHOULD be | |||
limited to avoid a denial of service attack when arbitrary headers | limited to avoid a denial of service attack when arbitrary headers | |||
skipping to change at page 39, line 7 ¶ | skipping to change at page 39, line 18 ¶ | |||
hole data by indicating a size larger than supported by the path. | hole data by indicating a size larger than supported by the path. | |||
It is possible that the information about a path is not stable. This | It is possible that the information about a path is not stable. This | |||
could be a result of forwarding across more than one path that has a | could be a result of forwarding across more than one path that has a | |||
different actual PMTU or of a single path that presents a varying PMTU. The | different actual PMTU or of a single path that presents a varying PMTU. The | |||
design of a PLPMTUD implementation SHOULD consider how to mitigate | design of a PLPMTUD implementation SHOULD consider how to mitigate | |||
the effects of varying path information. One possible mitigation is | the effects of varying path information. One possible mitigation is | |||
to provide robustness (see Section 5.4) in the method that avoids | to provide robustness (see Section 5.4) in the method that avoids | |||
oscillation in the MPS. | oscillation in the MPS. | |||
A node performing DPLPMTUD could experience conflicting information | ||||
about the size of supported probe packets. This could occur when | ||||
multiple paths are concurrently in use and these exhibit a different | ||||
PMTU. If not considered, this could result in packets not being | ||||
delivered (black holed) when the PLPMTU results in a packet larger | ||||
than the smallest actual PMTU. | ||||
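One conservative way to avoid the black-holing described above is to size packets against the smallest PLPMTU of the paths in use. The Python sketch below illustrates that idea only. It is an assumption made for illustration and is not a behaviour specified by this document; the function name and the path identifiers are hypothetical.

```python
def effective_plpmtu(per_path_plpmtu):
    """Return the smallest PLPMTU across the paths currently in use.

    per_path_plpmtu maps a hypothetical path identifier to the PLPMTU
    discovered for that path.
    """
    if not per_path_plpmtu:
        raise ValueError("no paths in use")
    return min(per_path_plpmtu.values())
```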
DPLPMTUD methods can introduce padding data to inflate the length of | DPLPMTUD methods can introduce padding data to inflate the length of | |||
the datagram to the total size required for a probe packet. The | the datagram to the total size required for a probe packet. The | |||
total size of a probe packet includes all headers and padding added | total size of a probe packet includes all headers and padding added | |||
to the payload data being sent (e.g., including security-related | to the payload data being sent (e.g., including security-related | |||
fields such as an AEAD tag and TLS record layer padding). The value | fields such as an AEAD tag and TLS record layer padding). The value | |||
of the padding data does not influence the DPLPMTUD search algorithm, | of the padding data does not influence the DPLPMTUD search algorithm, | |||
and therefore needs to be set consistent with the policy of the PL. | and therefore needs to be set consistent with the policy of the PL. | |||
If a PL can make use of cryptographic confidentiality or data- | If a PL can make use of cryptographic confidentiality or data- | |||
integrity mechanisms, then the design ought to avoid adding anything | integrity mechanisms, then the design ought to avoid adding anything | |||
(e.g., padding) to DPLPMTUD probe packets that is not also protected | (e.g., padding) to DPLPMTUD probe packets that is not also protected | |||
by those cryptographic mechanisms. | by those cryptographic mechanisms. | |||
10. References | 10. References | |||
10.1. Normative References | 10.1. Normative References | |||
[BCP145] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage | ||||
Guidelines", BCP 145, RFC 8085, March 2017. | ||||
<https://www.rfc-editor.org/info/bcp145> | ||||
[I-D.ietf-quic-transport] | [I-D.ietf-quic-transport] | |||
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | |||
and Secure Transport", Work in Progress, Internet-Draft, | and Secure Transport", Work in Progress, Internet-Draft, | |||
draft-ietf-quic-transport-27, 21 February 2020, | draft-ietf-quic-transport-27, 21 February 2020, | |||
<http://www.ietf.org/internet-drafts/draft-ietf-quic- | <http://www.ietf.org/internet-drafts/draft-ietf-quic- | |||
transport-27.txt>. | transport-27.txt>. | |||
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, | [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, | |||
DOI 10.17487/RFC0768, August 1980, | DOI 10.17487/RFC0768, August 1980, | |||
<https://www.rfc-editor.org/info/rfc768>. | <https://www.rfc-editor.org/info/rfc768>. | |||
skipping to change at page 40, line 27 ¶ | skipping to change at page 40, line 36 ¶ | |||
[RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", | [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", | |||
RFC 4960, DOI 10.17487/RFC4960, September 2007, | RFC 4960, DOI 10.17487/RFC4960, September 2007, | |||
<https://www.rfc-editor.org/info/rfc4960>. | <https://www.rfc-editor.org/info/rfc4960>. | |||
[RFC6951] Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream | [RFC6951] Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream | |||
Control Transmission Protocol (SCTP) Packets for End-Host | Control Transmission Protocol (SCTP) Packets for End-Host | |||
to End-Host Communication", RFC 6951, | to End-Host Communication", RFC 6951, | |||
DOI 10.17487/RFC6951, May 2013, | DOI 10.17487/RFC6951, May 2013, | |||
<https://www.rfc-editor.org/info/rfc6951>. | <https://www.rfc-editor.org/info/rfc6951>. | |||
[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage | ||||
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, | ||||
March 2017, <https://www.rfc-editor.org/info/rfc8085>. | ||||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
[RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | |||
(IPv6) Specification", STD 86, RFC 8200, | (IPv6) Specification", STD 86, RFC 8200, | |||
DOI 10.17487/RFC8200, July 2017, | DOI 10.17487/RFC8200, July 2017, | |||
<https://www.rfc-editor.org/info/rfc8200>. | <https://www.rfc-editor.org/info/rfc8200>. | |||
[RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | |||
skipping to change at page 46, line 34 ¶ | skipping to change at page 46, line 40 ¶ | |||
* Updated text and address nits from OPSDIR, ART and IESG reviews. | * Updated text and address nits from OPSDIR, ART and IESG reviews. | |||
* Order PTB processing based on PL_PTB_SIZE | * Order PTB processing based on PL_PTB_SIZE | |||
Working group draft -19: | Working group draft -19: | |||
* Updated text and address nits based on comments from Tim Chown and | * Updated text and address nits based on comments from Tim Chown and | |||
Murray S. Kucherawy. | Murray S. Kucherawy. | |||
Working group draft -20: | ||||
* Address nits and comments from IESG | ||||
* Refer to BCP 145 rather than RFC 8085 in most places. | ||||
* Update probing method text for SCTP and QUIC. | ||||
Authors' Addresses | Authors' Addresses | |||
Godred Fairhurst | Godred Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering | School of Engineering | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen | Aberdeen | |||
AB24 3UE | AB24 3UE | |||
United Kingdom | United Kingdom | |||
End of changes. 80 change blocks. | ||||
252 lines changed or deleted | 284 lines changed or added | |||