draft-ietf-tsvwg-datagram-plpmtud-16.txt | draft-ietf-tsvwg-datagram-plpmtud-17.txt | |||
---|---|---|---|---|
Internet Engineering Task Force G. Fairhurst | Internet Engineering Task Force G. Fairhurst | |||
Internet-Draft T. Jones | Internet-Draft T. Jones | |||
Updates: 4821, 4960, 6951, 8085, 8261 (if University of Aberdeen | Updates: 4821, 4960, 6951, 8085, 8261 (if University of Aberdeen | |||
approved) M. Tuexen | approved) M. Tuexen | |||
Intended status: Standards Track I. Ruengeler | Intended status: Standards Track I. Ruengeler | |||
Expires: 10 September 2020 T. Voelker | Expires: 24 September 2020 T. Voelker | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
9 March 2020 | 23 March 2020 | |||
Packetization Layer Path MTU Discovery for Datagram Transports | Packetization Layer Path MTU Discovery for Datagram Transports | |||
draft-ietf-tsvwg-datagram-plpmtud-16 | draft-ietf-tsvwg-datagram-plpmtud-17 | |||
Abstract | Abstract | |||
This document describes a robust method for Path MTU Discovery | This document describes a robust method for Path MTU Discovery | |||
(PMTUD) for datagram Packetization Layers (PLs). It describes an | (PMTUD) for datagram Packetization Layers (PLs). It describes an | |||
extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | |||
MTU Discovery for IPv4 and IPv6. The method allows a PL, or a | MTU Discovery for IPv4 and IPv6. The method allows a PL, or a | |||
datagram application that uses a PL, to discover whether a network | datagram application that uses a PL, to discover whether a network | |||
path can support the current size of datagram. This can be used to | path can support the current size of datagram. This can be used to | |||
detect and reduce the message size when a sender encounters a packet | detect and reduce the message size when a sender encounters a packet | |||
skipping to change at page 2, line 15 | skipping to change at page 2, line 15 | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on 10 September 2020. | This Internet-Draft will expire on 24 September 2020. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
skipping to change at page 2, line 48 | skipping to change at page 2, line 48 | |||
3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 10 | 3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 10 | |||
4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 13 | 4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 13 | |||
4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 13 | 4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 13 | |||
4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 14 | 4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 14 | |||
4.3. Black Hole Detection . . . . . . . . . . . . . . . . . . 15 | 4.3. Black Hole Detection . . . . . . . . . . . . . . . . . . 15 | |||
4.4. The Maximum Packet Size (MPS) . . . . . . . . . . . . . . 16 | 4.4. The Maximum Packet Size (MPS) . . . . . . . . . . . . . . 16 | |||
4.5. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 17 | 4.5. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 17 | |||
4.6. Response to PTB Messages . . . . . . . . . . . . . . . . 17 | 4.6. Response to PTB Messages . . . . . . . . . . . . . . . . 17 | |||
4.6.1. Validation of PTB Messages . . . . . . . . . . . . . 17 | 4.6.1. Validation of PTB Messages . . . . . . . . . . . . . 17 | |||
4.6.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 18 | 4.6.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 18 | |||
5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 19 | 5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 20 | |||
5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 20 | 5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 20 | |||
5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 20 | 5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 21 | 5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 21 | |||
5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 22 | 5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 22 | |||
5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 23 | 5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 23 | |||
5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 24 | 5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 25 | |||
5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 27 | 5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 28 | |||
5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 27 | 5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 28 | |||
5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 28 | 5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 29 | |||
5.3.3. Resilience to Inconsistent Path Information . . . . . 28 | 5.3.3. Resilience to Inconsistent Path Information . . . . . 29 | |||
5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 29 | 5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 30 | |||
6. Specification of Protocol-Specific Methods . . . . . . . . . 29 | 6. Specification of Protocol-Specific Methods . . . . . . . . . 30 | |||
6.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 29 | 6.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 30 | |||
6.1.1. Application Request . . . . . . . . . . . . . . . . . 30 | 6.1.1. Application Request . . . . . . . . . . . . . . . . . 31 | |||
6.1.2. Application Response . . . . . . . . . . . . . . . . 30 | 6.1.2. Application Response . . . . . . . . . . . . . . . . 31 | |||
6.1.3. Sending Application Probe Packets . . . . . . . . . . 30 | 6.1.3. Sending Application Probe Packets . . . . . . . . . . 31 | |||
6.1.4. Initial Connectivity . . . . . . . . . . . . . . . . 30 | 6.1.4. Initial Connectivity . . . . . . . . . . . . . . . . 31 | |||
6.1.5. Validating the Path . . . . . . . . . . . . . . . . . 30 | 6.1.5. Validating the Path . . . . . . . . . . . . . . . . . 31 | |||
6.1.6. Handling of PTB Messages . . . . . . . . . . . . . . 30 | 6.1.6. Handling of PTB Messages . . . . . . . . . . . . . . 31 | |||
6.2. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 31 | 6.2. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 32 | |||
6.2.1. SCTP/IPv4 and SCTP/IPv6 . . . . . . . . . . . . . . . 31 | 6.2.1. SCTP/IPv4 and SCTP/IPv6 . . . . . . . . . . . . . . . 32 | |||
6.2.1.1. Initial Connectivity . . . . . . . . . . . . . . 31 | 6.2.1.1. Initial Connectivity . . . . . . . . . . . . . . 32 | |||
6.2.1.2. Sending SCTP Probe Packets . . . . . . . . . . . 31 | 6.2.1.2. Sending SCTP Probe Packets . . . . . . . . . . . 32 | |||
6.2.1.3. Validating the Path with SCTP . . . . . . . . . . 32 | 6.2.1.3. Validating the Path with SCTP . . . . . . . . . . 33 | |||
6.2.1.4. PTB Message Handling by SCTP . . . . . . . . . . 32 | 6.2.1.4. PTB Message Handling by SCTP . . . . . . . . . . 33 | |||
6.2.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 32 | 6.2.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 33 | |||
6.2.2.1. Initial Connectivity . . . . . . . . . . . . . . 32 | 6.2.2.1. Initial Connectivity . . . . . . . . . . . . . . 33 | |||
6.2.2.2. Sending SCTP/UDP Probe Packets . . . . . . . . . 33 | 6.2.2.2. Sending SCTP/UDP Probe Packets . . . . . . . . . 34 | |||
6.2.2.3. Validating the Path with SCTP/UDP . . . . . . . . 33 | 6.2.2.3. Validating the Path with SCTP/UDP . . . . . . . . 34 | |||
6.2.2.4. Handling of PTB Messages by SCTP/UDP . . . . . . 33 | 6.2.2.4. Handling of PTB Messages by SCTP/UDP . . . . . . 34 | |||
6.2.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 33 | 6.2.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 34 | |||
6.2.3.1. Initial Connectivity . . . . . . . . . . . . . . 33 | 6.2.3.1. Initial Connectivity . . . . . . . . . . . . . . 34 | |||
6.2.3.2. Sending SCTP/DTLS Probe Packets . . . . . . . . . 33 | 6.2.3.2. Sending SCTP/DTLS Probe Packets . . . . . . . . . 34 | |||
6.2.3.3. Validating the Path with SCTP/DTLS . . . . . . . 33 | 6.2.3.3. Validating the Path with SCTP/DTLS . . . . . . . 34 | |||
6.2.3.4. Handling of PTB Messages by SCTP/DTLS . . . . . . 34 | 6.2.3.4. Handling of PTB Messages by SCTP/DTLS . . . . . . 35 | |||
6.3. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 34 | 6.3. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 35 | |||
6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 34 | 6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 35 | |||
6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 34 | 6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 35 | |||
6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 35 | 6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 36 | |||
6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 35 | 6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 36 | |||
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 35 | 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 36 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 35 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 35 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 36 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 37 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 38 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . 37 | 10.1. Normative References . . . . . . . . . . . . . . . . . . 38 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . 38 | 10.2. Informative References . . . . . . . . . . . . . . . . . 39 | |||
Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 39 | Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 40 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 43 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45 | |||
1. Introduction | 1. Introduction | |||
The IETF has specified datagram transport using UDP, SCTP, and DCCP, | The IETF has specified datagram transport using UDP, SCTP, and DCCP, | |||
as well as protocols layered on top of these transports (e.g., SCTP/ | as well as protocols layered on top of these transports (e.g., SCTP/ | |||
UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP | UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP | |||
network layer. This document describes a robust method for Path MTU | network layer. This document describes a robust method for Path MTU | |||
Discovery (PMTUD) that can be used with these transport protocols (or | Discovery (PMTUD) that can be used with these transport protocols (or | |||
the applications that use their transport service) to discover an | the applications that use their transport service) to discover an | |||
appropriate size of packet to use across an Internet path. | appropriate size of packet to use across an Internet path. | |||
skipping to change at page 5, line 36 | skipping to change at page 5, line 36 | |||
receive a packet because of its size. This could be due to | receive a packet because of its size. This could be due to | |||
misconfiguration of the layer 2 path between nodes, for instance | misconfiguration of the layer 2 path between nodes, for instance | |||
the MTU configured in a layer 2 switch, or misconfiguration of the | the MTU configured in a layer 2 switch, or misconfiguration of the | |||
Maximum Receive Unit (MRU). If a packet is dropped by the link, | Maximum Receive Unit (MRU). If a packet is dropped by the link, | |||
this will not cause a PTB message to be sent to the original | this will not cause a PTB message to be sent to the original | |||
sender. | sender. | |||
Another failure could result if a node that is not on the network | Another failure could result if a node that is not on the network | |||
path sends a PTB message that attempts to force a sender to change | path sends a PTB message that attempts to force a sender to change | |||
the effective PMTU [RFC8201]. A sender can protect itself from | the effective PMTU [RFC8201]. A sender can protect itself from | |||
reacting to such messages by utilising the quoted packet within a PTB | reacting to such messages by utilizing the quoted packet within a PTB | |||
message payload to validate that the received PTB message was | message payload to validate that the received PTB message was | |||
generated in response to a packet that had actually originated from | generated in response to a packet that had actually originated from | |||
the sender. However, there are situations where a sender would be | the sender. However, there are situations where a sender would be | |||
unable to provide this validation. Examples where validation of the | unable to provide this validation. Examples where validation of the | |||
PTB message is not possible include: | PTB message is not possible include: | |||
* When a router issuing the ICMP message implements RFC792 | * When a router issuing the ICMP message implements RFC792 | |||
[RFC0792], it is only required to include the first 64 bits of the | [RFC0792], it is only required to include the first 64 bits of the | |||
IP payload of the packet within the quoted payload. There could | IP payload of the packet within the quoted payload. There could | |||
be insufficient bytes remaining for the sender to interpret the | be insufficient bytes remaining for the sender to interpret the | |||
skipping to change at page 6, line 24 | skipping to change at page 6, line 24 | |||
validate the message, because validation depends on information | validate the message, because validation depends on information | |||
about the active transport flows at an endpoint node (e.g., the | about the active transport flows at an endpoint node (e.g., the | |||
socket/address pairs being used, and other protocol header | socket/address pairs being used, and other protocol header | |||
information). | information). | |||
* When a packet is encapsulated/tunneled over an encrypted | * When a packet is encapsulated/tunneled over an encrypted | |||
transport, the tunnel/encapsulation ingress might have | transport, the tunnel/encapsulation ingress might have | |||
insufficient context, or computational power, to reconstruct the | insufficient context, or computational power, to reconstruct the | |||
transport header that would be needed to perform validation. | transport header that would be needed to perform validation. | |||
* A Network Addres Translation (NAT) device that translates a packet | * A Network Address Translation (NAT) device that translates a | |||
header, ought to also translate ICMP messages and update the ICMP | packet header, ought to also translate ICMP messages and update | |||
quoted packet [RFC5508] in that message. If this is not correctly | the ICMP quoted packet [RFC5508] in that message. If this is not | |||
translated then the sender would not be able to associate the | correctly translated then the sender would not be able to | |||
message with the PL that originated the packet, and hence this | associate the message with the PL that originated the packet, and | |||
ICMP message cannot be validated. | hence this ICMP message cannot be validated. | |||
1.2. Packetization Layer Path MTU Discovery | 1.2. Packetization Layer Path MTU Discovery | |||
The term Packetization Layer (PL) has been introduced to describe the | The term Packetization Layer (PL) has been introduced to describe the | |||
layer that is responsible for placing data blocks into the payload of | layer that is responsible for placing data blocks into the payload of | |||
IP packets and selecting an appropriate MPS. This function is often | IP packets and selecting an appropriate MPS. This function is often | |||
performed by a transport protocol (e.g., DCCP, RTP, SCTP, QUIC), but | performed by a transport protocol (e.g., DCCP, RTP, SCTP, QUIC), but | |||
can also be performed by other encapsulation methods working above | can also be performed by other encapsulation methods working above | |||
the transport layer. | the transport layer. | |||
skipping to change at page 7, line 36 | skipping to change at page 7, line 36 | |||
the lower layers, although it can utilize PTB messages when these | the lower layers, although it can utilize PTB messages when these | |||
received messages are made available to the PL. | received messages are made available to the PL. | |||
The message size guidelines in section 3.2 of the UDP Usage | The message size guidelines in section 3.2 of the UDP Usage | |||
Guidelines [RFC8085] state "an application SHOULD either use the Path | Guidelines [RFC8085] state "an application SHOULD either use the Path | |||
MTU information provided by the IP layer or implement Path MTU | MTU information provided by the IP layer or implement Path MTU | |||
Discovery (PMTUD)", but does not provide a mechanism for discovering | Discovery (PMTUD)", but does not provide a mechanism for discovering | |||
the largest size of unfragmented datagram that can be used on a | the largest size of unfragmented datagram that can be used on a | |||
network path. The present document updates RFC 8085 to specify this | network path. The present document updates RFC 8085 to specify this | |||
method in place of PLPMTUD [RFC4821] and provides a mechanism for | method in place of PLPMTUD [RFC4821] and provides a mechanism for | |||
sharing the discovered largest size as the Maximum Packet Size (MPS) | sharing the discovered largest size as the MPS (see Section 4.4). | |||
(see Section 4.4). | ||||
Section 10.2 of [RFC4821] recommended a PLPMTUD probing method for | Section 10.2 of [RFC4821] recommended a PLPMTUD probing method for | |||
the Stream Control Transport Protocol (SCTP). SCTP utilizes probe | the Stream Control Transport Protocol (SCTP). SCTP utilizes probe | |||
packets consisting of a minimal sized HEARTBEAT chunk bundled with a | packets consisting of a minimal sized HEARTBEAT chunk bundled with a | |||
PAD chunk as defined in [RFC4820]. However, RFC 4821 did not provide | PAD chunk as defined in [RFC4820]. However, RFC 4821 did not provide | |||
a complete specification. The present document replaces this by | a complete specification. The present document replaces this by | |||
providing a complete specification. | providing a complete specification. | |||
The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | |||
implementations to support Classical PMTUD and states that a DCCP | implementations to support Classical PMTUD and states that a DCCP | |||
skipping to change at page 8, line 22 | skipping to change at page 8, line 22 | |||
[RFC8261]. | [RFC8261]. | |||
2. Terminology | 2. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown here. | capitals, as shown here. | |||
Other terminology is directly copied from [RFC4821], and the | The following terminology is defined. Relevant terms are directly | |||
definitions in [RFC1122]. | copied from [RFC4821], and the definitions in [RFC1122]. | |||
Acknowledged PL: A PL that includes a mechanism that can confirm | ||||
successful delivery of datagrams to the remote PL endpoint (e.g., | ||||
SCTP). Typically, the PL receiver returns acknowledgments | ||||
corresponding to the received datagrams, which can be utilized to | ||||
detect black-holing of packets (c.f., Unacknowledged PL). | ||||
Actual PMTU: The Actual PMTU is the PMTU of a network path between a | Actual PMTU: The Actual PMTU is the PMTU of a network path between a | |||
sender PL and a destination PL, which the DPLPMTUD algorithm seeks | sender PL and a destination PL, which the DPLPMTUD algorithm seeks | |||
to determine. | to determine. | |||
Black Hole: A Black Hole is encountered when a sender is unaware | Black Hole: A Black Hole is encountered when a sender is unaware | |||
that packets are not being delivered to the destination end point. | that packets are not being delivered to the destination end point. | |||
Two types of Black Hole are relevant to DPLPMTUD: | Two types of Black Hole are relevant to DPLPMTUD: | |||
* Packets encounter a packet Black Hole when packets are not | * Packets encounter a packet Black Hole when packets are not | |||
skipping to change at page 9, line 7 | skipping to change at page 9, line 12 | |||
Classical Path MTU Discovery: Classical PMTUD is a process described | Classical Path MTU Discovery: Classical PMTUD is a process described | |||
in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to | in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to | |||
learn the largest size of unfragmented packet that can be used | learn the largest size of unfragmented packet that can be used | |||
across a network path. | across a network path. | |||
Datagram: A datagram is a transport-layer protocol data unit, | Datagram: A datagram is a transport-layer protocol data unit, | |||
transmitted in the payload of an IP packet. | transmitted in the payload of an IP packet. | |||
Effective PMTU: The Effective PMTU is the current estimated value | Effective PMTU: The Effective PMTU is the current estimated value | |||
for PMTU that is used by a PMTUD. This is equivalent to the | for PMTU that is used by a PMTUD. This is equivalent to the | |||
PLPMTU derived by PLPMTUD. | PLPMTU derived by PLPMTUD plus the size of any headers added below | |||
the PL, including the IP layer headers. | ||||
EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | |||
[RFC1122] as "the maximum IP datagram size that may be sent, for a | [RFC1122] as "the maximum IP datagram size that may be sent, for a | |||
particular combination of IP source and destination addresses...". | particular combination of IP source and destination addresses...". | |||
EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in | EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in | |||
[RFC1122] as the largest datagram size that can be reassembled by | [RFC1122] as the largest datagram size that can be reassembled by | |||
EMTU_R (Effective MTU to receive). | EMTU_R (Effective MTU to receive). | |||
Link: A Link is a communication facility or medium over which nodes | Link: A Link is a communication facility or medium over which nodes | |||
skipping to change at page 9, line 35 | skipping to change at page 9, line 41 | |||
could more properly be called the IP MTU, to be consistent with | could more properly be called the IP MTU, to be consistent with | |||
how other standards organizations use the acronym. This includes | how other standards organizations use the acronym. This includes | |||
the IP header, but excludes link layer headers and other framing | the IP header, but excludes link layer headers and other framing | |||
that is not part of IP or the IP payload. Other standards | that is not part of IP or the IP payload. Other standards | |||
organizations generally define the link MTU to include the link | organizations generally define the link MTU to include the link | |||
layer headers. This specification continues the requirement in | layer headers. This specification continues the requirement in | |||
[RFC4821], that states "All links MUST enforce their MTU: links | [RFC4821], that states "All links MUST enforce their MTU: links | |||
that might non-deterministically deliver packets that are larger | that might non-deterministically deliver packets that are larger | |||
than their rated MTU MUST consistently discard such packets." | than their rated MTU MUST consistently discard such packets." | |||
MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU that DPLPMTUD | MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU that | |||
will attempt to use. | DPLPMTUD will attempt to use. | |||
MPS: The Maximum Packet Size (MPS) is the largest size of | MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | |||
application data block that can be sent across a network path by a | DPLPMTUD will attempt to use. | |||
PL. In DPLPMTUD this quantity is derived from the PLPMTU by | ||||
taking into consideration the size of the lower protocol layer | ||||
headers. Probe packets generated by DPLPMTUD can have a size | ||||
larger than the MPS. | ||||
MIN_PMTU: The MIN_PMTU is the smallest size of PLPMTU that DPLPMTUD | MPS: The Maximum Packet Size (MPS) is the largest size of | |||
will attempt to use. | application data block that can be sent across a network path by a | |||
PL using a single Datagram. | ||||
Packet: A Packet is the IP header plus the IP payload. | Packet: A Packet is the IP header plus the IP payload. | |||
Packetization Layer (PL): The Packetization Layer (PL) is a layer of | Packetization Layer (PL): The PL is a layer of the network stack | |||
the network stack that places data into packets and performs | that places data into packets and performs transport protocol | |||
transport protocol functions. Examples of a PL include: TCP, | functions. Examples of a PL include: TCP, SCTP, SCTP over DTLS or | |||
SCTP, SCTP over DTLS or QUIC. | QUIC. | |||
Path: The Path is the set of links and routers traversed by a packet | Path: The Path is the set of links and routers traversed by a packet | |||
between a source node and a destination node by a particular flow. | between a source node and a destination node by a particular flow. | |||
Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU | Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU | |||
of all the links forming a network path between a source node and | of all the links forming a network path between a source node and | |||
a destination node. | a destination node, as used by PMTUD. | |||
PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB | PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB | |||
message that indicates next hop link MTU of a router along the | message that indicates next hop link MTU of a router along the | |||
path. | path. | |||
PLPMTU: The Packetization Layer PMTU is an estimate of the actual | PL_PTB_SIZE: The size reported in a validated PTB message, reduced | |||
PMTU provided by the DPLPMTUD algorithm. | by the size of all headers added by layers below the PL. | |||
PLPMTU: The Packetization Layer PMTU is an estimate of the largest | ||||
size of PL datagram that can be sent by a path, controlled by | ||||
PLPMTUD. | ||||
PLPMTUD: Packetization Layer Path MTU Discovery (PLPMTUD), the | PLPMTUD: Packetization Layer Path MTU Discovery (PLPMTUD), the | |||
method described in this document for datagram PLs, which is an | method described in this document for datagram PLs, which is an | |||
extension to Classical PMTU Discovery. | extension to Classical PMTU Discovery. | |||
Probe packet: A probe packet is a datagram sent with a purposely | Probe packet: A probe packet is a datagram sent with a purposely | |||
chosen size (typically the current PLPMTU or larger) to detect if | chosen size (typically the current PLPMTU or larger) to detect if | |||
packets of this size can be successfully sent end-to-end across | packets of this size can be successfully sent end-to-end across | |||
the network path. | the network path. | |||
Unacknowledged PL: A PL that does not itself provide a mechanism to | ||||
confirm delivery of datagrams to the remote PL endpoint (e.g., | ||||
UDP), and therefore requires DPLPMTUD to provide a mechanism to | ||||
detect black-holing of packets (c.f., Acknowledged PL). | ||||
3. Features Required to Provide Datagram PLPMTUD | 3. Features Required to Provide Datagram PLPMTUD | |||
The principles expressed in [RFC4821] apply to the use of the | The principles expressed in [RFC4821] apply to the use of the | |||
technique with any PL. TCP PLPMTUD has been defined using standard | technique with any PL. TCP PLPMTUD has been defined using standard | |||
TCP protocol mechanisms. Unlike TCP, datagram PLs require additional | TCP protocol mechanisms. Unlike TCP, a datagram PL requires | |||
mechanisms and considerations to implement PLPMTUD. | additional mechanisms and considerations to implement PLPMTUD. | |||
The requirements for datagram PLPMTUD are: | The requirements for datagram PLPMTUD are: | |||
1. PLPMTU: The PLPMTU (specified as the effective PMTU in Section 1 | 1. Managing the PLPMTU: For datagram PLs, the PLPMTU is managed by | |||
of [RFC1191]) is equivalent to the EMTU_S (specified in | DPLPMTUD. A PL MUST NOT send a datagram (other than a probe | |||
[RFC1122]). For datagram PLs,] the PLPMTU is managed by | packet) with a size at the PL layer that is larger than the | |||
DPLPMTUD. A PL MUST NOT send a packet (other than a probe | current PLPMTU. | |||
packet) with a size larger than the current PLPMTU at the | ||||
network layer. | ||||
2. Probe packets: On request, a DPLPMTUD sender is REQUIRED to be | 2. Probe packets: On request, a DPLPMTUD sender is REQUIRED to be | |||
able to transmit a packet larger than the PLPMTU. This is used | able to transmit a packet larger than the PLPMTU. This is used | |||
to send a probe packet. In IPv4, a probe packet MUST be sent | to send a probe packet. In IPv4, a probe packet MUST be sent | |||
with the Don't Fragment (DF) bit set in the IP header, and | with the Don't Fragment (DF) bit set in the IP header, and | |||
without network layer endpoint fragmentation. In IPv6, a probe | without network layer endpoint fragmentation. In IPv6, a probe | |||
packet is always sent without source fragmentation (as specified | packet is always sent without source fragmentation (as specified | |||
in section 5.4 of [RFC8201]). | in section 5.4 of [RFC8201]). | |||
3. Reception feedback: The destination PL endpoint is REQUIRED to | 3. Reception feedback: The destination PL endpoint is REQUIRED to | |||
skipping to change at page 11, line 21 ¶ | skipping to change at page 11, line 29 ¶ | |||
4. Probe loss recovery: It is RECOMMENDED to use probe packets that | 4. Probe loss recovery: It is RECOMMENDED to use probe packets that | |||
do not carry any user data that would require retransmission if | do not carry any user data that would require retransmission if | |||
lost. Most datagram transports permit this. If a probe packet | lost. Most datagram transports permit this. If a probe packet | |||
contains user data requiring retransmission in case of loss, the | contains user data requiring retransmission in case of loss, the | |||
PL (or layers above) are REQUIRED to arrange any retransmission/ | PL (or layers above) are REQUIRED to arrange any retransmission/ | |||
repair of any resulting loss. The PL is REQUIRED to be robust | repair of any resulting loss. The PL is REQUIRED to be robust | |||
in the case where probe packets are lost due to other reasons | in the case where probe packets are lost due to other reasons | |||
(including link transmission error, congestion). | (including link transmission error, congestion). | |||
5. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to utilise | 5. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to utilize | |||
information about the maximum size of packet that can be | information about the maximum size of packet that can be | |||
transmitted by the sender on the local link (e.g., the local | transmitted by the sender on the local link (e.g., the local | |||
Link MTU). It MAY utilize similar information about the | Link MTU). It MAY utilize similar information about the maximum | |||
receiver when this is supplied (note this could be less than | size a receiver can accept when this is supplied (note this | |||
EMTU_R). This avoids implementations trying to send probe | could be less than EMTU_R). This avoids implementations trying | |||
packets that can not be transmitted by the local link. Too high | to send probe packets that can not be transferred by the local | |||
of a value could reduce the efficiency of the search algorithm. | link. Too high of a value could reduce the efficiency of the | |||
Some applications also have a maximum transport protocol data | search algorithm. Some applications also have a maximum | |||
unit (PDU) size, in which case there is no benefit from probing | transport protocol data unit (PDU) size, in which case there is | |||
for a size larger than this (unless a transport allows | no benefit from probing for a size larger than this (unless a | |||
multiplexing multiple applications PDUs into the same datagram). | transport allows multiplexing multiple applications PDUs into | |||
the same datagram). | ||||
6. Processing PTB messages: A DPLPMTUD sender MAY optionally | 6. Processing PTB messages: A DPLPMTUD sender MAY optionally | |||
utilize PTB messages received from the network layer to help | utilize PTB messages received from the network layer to help | |||
identify when a network path does not support the current size | identify when a network path does not support the current size | |||
of probe packet. Any received PTB message MUST be validated | of probe packet. Any received PTB message MUST be validated | |||
before it is used to update the PLPMTU discovery information | before it is used to update the PLPMTU discovery information | |||
[RFC8201]. This validation confirms that the PTB message was | [RFC8201]. This validation confirms that the PTB message was | |||
sent in response to a packet originated by the sender, and | sent in response to a packet originated by the sender, and | |||
needs to be performed before the PLPMTU discovery method reacts | needs to be performed before the PLPMTU discovery method reacts | |||
to the PTB message. A PTB message MUST NOT be used to increase | to the PTB message. A PTB message MUST NOT be used to increase | |||
the PLPMTU [RFC8201], but could trigger a probe to test for a | the PLPMTU [RFC8201], but could trigger a probe to test for a | |||
larger PLPMTU. A PTB_SIZE greater than the currently probed | larger PLPMTU. A PL_PTB_SIZE that is greater than that | |||
MUST be ignored. | currently probed MUST be ignored. A valid PTB_SIZE is converted | |||
to a PL_PTB_SIZE before it is to be used in the DPLPMTUD state | ||||
machine. | ||||
7. Probing and congestion control: The decision about when to send | 7. Probing and congestion control: The decision about when to send | |||
a probe packet does not need to be limited by the congestion | a probe packet does not need to be limited by the congestion | |||
controller. When not controlled by the congestion controller, | controller. When not controlled by the congestion controller, | |||
the interval between probe packets MUST be at least one RTT. If | the interval between probe packets MUST be at least one RTT. If | |||
transmission of probe packets is limited by the congestion | transmission of probe packets is limited by the congestion | |||
controller, this could result in transmission of probe packets | controller, this could result in transmission of probe packets | |||
being delayed. | being delayed or suspended during congestion. | |||
8. Loss of a probe packet SHOULD NOT be treated as an indication of | 8. Loss of a probe packet SHOULD NOT be treated as an indication of | |||
congestion and SHOULD NOT trigger a congestion control reaction | congestion and SHOULD NOT trigger a congestion control reaction | |||
[RFC4821], because this could result in unnecessary reduction of | [RFC4821], because this could result in unnecessary reduction of | |||
the sending rate. | the sending rate. | |||
9. An update to the PLPMTU (or MPS) MUST NOT modify the congestion | 9. An update to the PLPMTU (or MPS) MUST NOT modify the congestion | |||
window measured in bytes [RFC4821]. Therefore, an increase in | window measured in bytes [RFC4821]. Therefore, an increase in | |||
the packet size does not cause an increase in the data rate in | the packet size does not cause an increase in the data rate in | |||
bytes per second. | bytes per second. | |||
10. Probing and flow control: Flow control at the PL concerns the | 10. Probing and flow control: Flow control at the PL concerns the | |||
end-to-end flow of data using the PL service. This does not | end-to-end flow of data using the PL service. This does not | |||
apply to DPLPMTUD when probe packets use a design that does not | apply to DPLPMTUD when probe packets use a design that does not | |||
carry user data to the remote application. | carry user data to the remote application. | |||
11. Shared PLPMTU state: The PLPMTU value MAY also be stored with | 11. Shared PLPMTU state: The PMTU value calculated from the PLPMTU | |||
the corresponding entry associated with the destination in the | MAY also be stored with the corresponding entry associated with | |||
IP layer cache, and used by other PL instances. The | the destination in the IP layer cache, and used by other PL | |||
specification of PLPMTUD [RFC4821] states: "If PLPMTUD updates | instances. The specification of PLPMTUD [RFC4821] states: "If | |||
the MTU for a particular path, all Packetization Layer sessions | PLPMTUD updates the MTU for a particular path, all Packetization | |||
that share the path representation (as described in Section 5.2 | Layer sessions that share the path representation (as described | |||
of [RFC4821]) SHOULD be notified to make use of the new MTU". | in Section 5.2 of [RFC4821]) SHOULD be notified to make use of | |||
Such methods MUST be robust to the wide variety of underlying | the new MTU". Such methods MUST be robust to the wide variety | |||
network forwarding behaviors. Section 5.2 of [RFC8201] provides | of underlying network forwarding behaviors. Section 5.2 of | |||
guidance on the caching of PMTU information and also the | [RFC8201] provides guidance on the caching of PMTU information | |||
relation to IPv6 flow labels. | and also the relation to IPv6 flow labels. | |||
In addition, the following principles are stated for design of a | In addition, the following principles are stated for design of a | |||
DPLPMTUD method: | DPLPMTUD method: | |||
* Maximum Packet Size (MPS): A PL MAY be designed to segment data | * A PL MAY be designed to segment data blocks larger than the MPS | |||
blocks larger than the MPS into multiple datagrams. However, not | into multiple datagrams. However, not all datagram PLs support | |||
all datagram PLs support segmentation of data blocks. It is | segmentation of data blocks. It is RECOMMENDED that methods avoid | |||
RECOMMENDED that methods avoid forcing an application to use an | forcing an application to use an arbitrary small MPS for | |||
arbitrary small MPS for transmission while the method is searching | transmission while the method is searching for the currently | |||
for the currently supported PLPMTU. A reduced MPS can adversely | supported PLPMTU. A reduced MPS can adversely impact the | |||
impact the performance of an application. | performance of an application. | |||
* To assist applications in choosing a suitable data block size, the | * To assist applications in choosing a suitable data block size, the | |||
PL is RECOMMENDED to provide a primitive that returns the MPS | PL is RECOMMENDED to provide a primitive that returns the MPS | |||
derived from the PLPMTU to the higher layer using the PL. The | derived from the PLPMTU to the higher layer using the PL. The | |||
value of the MPS can change following a change in the path, or | value of the MPS can change following a change in the path, or | |||
loss of probe packets. | loss of probe packets. | |||
* Path validation: It is RECOMMENDED that methods are robust to path | * Path validation: It is RECOMMENDED that methods are robust to path | |||
changes that could have occurred since the path characteristics | changes that could have occurred since the path characteristics | |||
were last confirmed, and to the possibility of inconsistent path | were last confirmed, and to the possibility of inconsistent path | |||
skipping to change at page 14, line 33 ¶ | skipping to change at page 14, line 43 ¶ | |||
block in a datagram without the padding data). This retransmitted | block in a datagram without the padding data). This retransmitted | |||
data block might possibly need to be sent using a smaller PLPMTU, | data block might possibly need to be sent using a smaller PLPMTU, | |||
which could need the PL to use a smaller packet size to traverse | which could need the PL to use a smaller packet size to traverse | |||
the end-to-end path. (This could utilize endpoint network-layer or a | the end-to-end path. (This could utilize endpoint network-layer or a | |||
PL that can re-segment the data block into multiple datagrams). | PL that can re-segment the data block into multiple datagrams). | |||
DPLPMTUD MAY choose to use only one of these methods to simplify the | DPLPMTUD MAY choose to use only one of these methods to simplify the | |||
implementation. | implementation. | |||
Probe messages sent by a PL MUST contain enough information to | Probe messages sent by a PL MUST contain enough information to | |||
uniquely identify the probe within Maximum Segment Lifetime, while | uniquely identify the probe within Maximum Segment Lifetime (e.g., | |||
being robust to reordering and replay of probe response and PTB | including a unique identifier from the PL or the DPLPMTUD | |||
messages. | implementation), while being robust to reordering and replay of probe | |||
response and PTB messages. | ||||
4.2. Confirmation of Probed Packet Size | 4.2. Confirmation of Probed Packet Size | |||
The PL needs a method to determine (confirm) when probe packets have | The PL needs a method to determine (confirm) when probe packets have | |||
been successfully received end-to-end across a network path. | been successfully received end-to-end across a network path. | |||
Transport protocols can include end-to-end methods that detect and | Transport protocols can include end-to-end methods that detect and | |||
report reception of specific datagrams that they send (e.g., DCCP and | report reception of specific datagrams that they send (e.g., DCCP and | |||
SCTP provide keep-alive/heartbeat features). When supported, this | SCTP provide keep-alive/heartbeat features). When supported, this | |||
mechanism MAY also be used by DPLPMTUD to acknowledge reception of a | mechanism MAY also be used by DPLPMTUD to acknowledge reception of a | |||
skipping to change at page 15, line 15 ¶ | skipping to change at page 15, line 26 ¶ | |||
Section 6 specifies this function for a set of IETF-specified | Section 6 specifies this function for a set of IETF-specified | |||
protocols. | protocols. | |||
4.3. Black Hole Detection | 4.3. Black Hole Detection | |||
Black Hole Detection is triggered by an indication that the network | Black Hole Detection is triggered by an indication that the network | |||
path could be unable to support the current PLPMTU size. | path could be unable to support the current PLPMTU size. | |||
There are three ways to detect black holes: | There are three ways to detect black holes: | |||
* A validated PTB message can be received that indicates a PTB_SIZE | * A validated PTB message can be received that indicates a | |||
less than the current PLPMTU. A DPLPMTUD method MUST NOT rely | PL_PTB_SIZE less than the current PLPMTU. A DPLPMTUD method MUST | |||
soley on this method. | NOT rely solely on this method. | |||
* A PL can use the DPLPMTUD probing mechanism to periodically | * A PL can use the DPLPMTUD probing mechanism to periodically | |||
generate probe packets of the size of the current PLPMTU (e.g., | generate probe packets of the size of the current PLPMTU (e.g., | |||
using the confirmation timer Section 5.1.1). A timer tracks | using the confirmation timer Section 5.1.1). A timer tracks | |||
whether acknowledgments are received. Successive loss of probes | whether acknowledgments are received. Successive loss of probes | |||
is an indication that the current path no longer supports the | is an indication that the current path no longer supports the | |||
PLPMTU (e.g., when the number of probe packets sent without | PLPMTU (e.g., when the number of probe packets sent without | |||
receiving an acknowledgement, PROBE_COUNT, becomes greater than | receiving an acknowledgment, PROBE_COUNT, becomes greater than | |||
MAX_PROBES). | MAX_PROBES). | |||
* A PL can utilise an event that indicates the network path no | * A PL can utilize an event that indicates the network path no | |||
longer sustains the sender's PLPMTU size. This could use a | longer sustains the sender's PLPMTU size. This could use a | |||
mechanism implemented within the PL to detect excessive loss of | mechanism implemented within the PL to detect excessive loss of | |||
data sent with a specific packet size and then conclude that this | data sent with a specific packet size and then conclude that this | |||
excessive loss could be a result of an invalid PLPMTU (as in | excessive loss could be a result of an invalid PLPMTU (as in | |||
PLPMTUD for TCP [RFC4821]). | PLPMTUD for TCP [RFC4821]). | |||
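
The probe-based detection method in the list above can be sketched as a small counter. This is only an illustrative sketch, not part of either draft: the class name, method names, and the default limit of 3 are assumptions; only PROBE_COUNT and MAX_PROBES come from the text.

```python
from dataclasses import dataclass


@dataclass
class ProbeLossTracker:
    """Counts probes of the current PLPMTU that were not acknowledged."""

    max_probes: int = 3   # hypothetical value standing in for MAX_PROBES
    probe_count: int = 0  # corresponds to PROBE_COUNT in the text

    def on_probe_timeout(self) -> bool:
        """Record an unacknowledged probe.

        Returns True once PROBE_COUNT becomes greater than MAX_PROBES,
        which the text treats as an indication of a possible black hole.
        """
        self.probe_count += 1
        return self.probe_count > self.max_probes

    def on_probe_acked(self) -> None:
        """An acknowledged probe confirms the PLPMTU, so counting restarts."""
        self.probe_count = 0
```

With max_probes set to 3, the fourth consecutive timeout is the first one reported as a suspected black hole. A real PL would combine this signal with the other two methods rather than rely on one alone.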
A PL MAY inhibit sending probe packets when no application data has | A PL MAY inhibit sending probe packets when no application data has | |||
been sent since the previous probe packet. A PL preferring to use an | been sent since the previous probe packet. A PL preferring to use an | |||
up-to-date PLPMTU once user data is sent again, MAY choose to | up-to-date PLPMTU once user data is sent again, MAY choose to | |||
continue PLPMTU discovery for each path. However, this could result | continue PLPMTU discovery for each path. However, this could result | |||
skipping to change at page 16, line 11 ¶ | skipping to change at page 16, line 17 ¶ | |||
the new PLPMTU can be successfully used across the path. A probe | the new PLPMTU can be successfully used across the path. A probe | |||
packet could need to have a size less than the size of the data block | packet could need to have a size less than the size of the data block | |||
generated by the application. | generated by the application. | |||
4.4. The Maximum Packet Size (MPS) | 4.4. The Maximum Packet Size (MPS) | |||
The result of probing determines a usable PLPMTU, which is used to | The result of probing determines a usable PLPMTU, which is used to | |||
set the MPS used by the application. The MPS is smaller than the | set the MPS used by the application. The MPS is smaller than the | |||
PLPMTU because it is reduced by the size of PL headers (including the | PLPMTU because it is reduced by the size of PL headers (including the | |||
overhead of security-related fields such as an AEAD tag and TLS | overhead of security-related fields such as an AEAD tag and TLS | |||
record layer padding) and any IP options or extensions added to the | record layer padding). The relationship between the MPS and the | |||
PL packet. The relationship between the MPS and the PLPMTU is | PLPMTU is illustrated in Figure 1. | |||
illustrated in Figure 1. | ||||
any additional | any additional | |||
headers .--- MPS -----. | headers .--- MPS -----. | |||
| | | | | | | | |||
v v v | v v v | |||
+------------------------------+ | +------------------------------+ | |||
| IP | ** | PL | protocol data | | | IP | ** | PL | protocol data | | |||
+------------------------------+ | +------------------------------+ | |||
<---------- PLPMTU ------------> | <----- PLPMTU -----> | |||
<---------- PMTU --------------> | ||||
Figure 1: Relationship between MPS and PLPMTU | Figure 1: Relationship between MPS and PLPMTU | |||
A PL is unable to send a packet (other than a probe packet) with a | A PL is unable to send a packet (other than a probe packet) with a | |||
size larger than the current PLPMTU at the network layer. To avoid | size larger than the current PLPMTU at the network layer. To avoid | |||
this, a PL MAY be designed to segment data blocks larger than the MPS | this, a PL MAY be designed to segment data blocks larger than the MPS | |||
into multiple datagrams. | into multiple datagrams. | |||
DPLPMTUD seeks to avoid IP fragmentation. An attempt to send a data | DPLPMTUD seeks to avoid IP fragmentation. An attempt to send a data | |||
block larger than the MPS will therefore fail if a PL is unable to | block larger than the MPS will therefore fail if a PL is unable to | |||
segment data. To determine the largest data block that can be sent, | segment data. To determine the largest data block that can be sent, | |||
a PL SHOULD provide applications with a primitive that returns the | a PL SHOULD provide applications with a primitive that returns the | |||
Maximum Packet Size (MPS), derived from the current PLPMTU. | MPS, derived from the current PLPMTU. | |||
If DPLPMTUD results in a change to the MPS, the application needs to | If DPLPMTUD results in a change to the MPS, the application needs to | |||
adapt to the new MPS. A particular case can arise when packets have | adapt to the new MPS. A particular case can arise when packets have | |||
been sent with a size less than the MPS and the PLPMTU was | been sent with a size less than the MPS and the PLPMTU was | |||
subsequently reduced. If these packets are lost, the PL MAY segment | subsequently reduced. If these packets are lost, the PL MAY segment | |||
the data using the new MPS. If a PL is unable to re-segment a | the data using the new MPS. If a PL is unable to re-segment a | |||
previously sent datagram (e.g., [RFC4960]), then the sender either | previously sent datagram (e.g., [RFC4960]), then the sender either | |||
discards the datagram or could perform retransmission using network- | discards the datagram or could perform retransmission using network- | |||
layer fragmentation to form multiple IP packets not larger than the | layer fragmentation to form multiple IP packets not larger than the | |||
PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is | PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is | |||
preferred over clearing the DF-bit in the IPv4 header. Operational | preferred over clearing the DF-bit in the IPv4 header. Operational | |||
experience reveals that IP fragmentation can reduce the reliability | experience reveals that IP fragmentation can reduce the reliability | |||
of Internet communication [I-D.ietf-intarea-frag-fragile], which may | of Internet communication [I-D.ietf-intarea-frag-fragile], which may | |||
reduce the success of retransmission. | reduce the success of retransmission. | |||
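
As a rough illustration of Section 4.4, the MPS can be derived by subtracting the PL overhead from the PLPMTU, and a PL that supports segmentation can split a data block into datagram payloads no larger than the MPS. The sketch follows the right-hand (-17) wording, where only PL headers and security-related fields are subtracted. The function names, parameter names, and byte counts are assumptions made for this example.

```python
from typing import List


def max_packet_size(plpmtu: int, pl_header_bytes: int,
                    security_overhead_bytes: int = 0) -> int:
    """Return the MPS: the PLPMTU reduced by PL header and security overhead."""
    mps = plpmtu - pl_header_bytes - security_overhead_bytes
    if mps <= 0:
        raise ValueError("PLPMTU is too small for the PL overhead")
    return mps


def segment(data: bytes, mps: int) -> List[bytes]:
    """Split a data block into pieces of at most mps bytes each."""
    if mps <= 0:
        raise ValueError("MPS must be positive")
    return [data[i:i + mps] for i in range(0, len(data), mps)]


# Hypothetical numbers: a PLPMTU of 1200 bytes, an 8-byte PL header,
# and a 16-byte AEAD tag give an MPS of 1176 bytes.
mps = max_packet_size(1200, pl_header_bytes=8, security_overhead_bytes=16)
pieces = segment(b"x" * 2500, mps)
```

A PL that cannot segment would instead return the MPS to the application through a primitive and reject any larger data block.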
4.5. Disabling the Effect of PMTUD | 4.5. Disabling the Effect of PMTUD | |||
A PL implementing this specification MUST suspend network layer | A PL implementing this specification MUST suspend network layer | |||
processing of outgoing packets that enforces a PMTU | processing of outgoing packets that enforces a PMTU | |||
[RFC1191][RFC8201] for each flow utilising DPLPMTUD, and instead use | [RFC1191][RFC8201] for each flow utilizing DPLPMTUD, and instead use | |||
DPLPMTUD to control the size of packets that are sent by a flow. | DPLPMTUD to control the size of packets that are sent by a flow. | |||
This removes the need for the network layer to drop or fragment sent | This removes the need for the network layer to drop or fragment sent | |||
packets that have a size greater than the PMTU. | packets that have a size greater than the PMTU. | |||
4.6. Response to PTB Messages | 4.6. Response to PTB Messages | |||
This method requires the DPLPMTUD sender to validate any received PTB | This method requires the DPLPMTUD sender to validate any received PTB | |||
message before using the PTB information. The response to a PTB | message before using the PTB information. The response to a PTB | |||
message depends on the PTB_SIZE indicated in the PTB message, the | message depends on the PL_PTB_SIZE calculated from the PTB_SIZE in | |||
state of the PLPMTUD state machine, and the IP protocol being used. | the PTB message, the state of the PLPMTUD state machine, and the IP | |||
protocol being used. | ||||
Section 4.6.1 first describes validation for both IPv4 ICMP | Section 4.6.1 first describes validation for both IPv4 ICMP | |||
Unreachable messages (type 3) and ICMPv6 Packet Too Big messages, | Unreachable messages (type 3) and ICMPv6 Packet Too Big messages, | |||
both of which are referred to as PTB messages in this document. | both of which are referred to as PTB messages in this document. | |||
4.6.1. Validation of PTB Messages | 4.6.1. Validation of PTB Messages | |||
This section specifies utilization of PTB messages. | This section specifies utilization of PTB messages. | |||
* A simple implementation MAY ignore received PTB messages and in | * A simple implementation MAY ignore received PTB messages and in | |||
skipping to change at page 18, line 14 ¶ | skipping to change at page 18, line 20 ¶ | |||
PL endpoints. A datagram application that uses well-known source and | PL endpoints. A datagram application that uses well-known source and | |||
destination ports ought to also rely on other information to complete | destination ports ought to also rely on other information to complete | |||
this validation. | this validation. | |||
These checks are intended to provide protection from packets that | These checks are intended to provide protection from packets that | |||
originate from a node that is not on the network path. A PTB message | originate from a node that is not on the network path. A PTB message | |||
that does not complete the validation MUST NOT be further utilized by | that does not complete the validation MUST NOT be further utilized by | |||
the DPLPMTUD method. | the DPLPMTUD method. | |||
PTB messages that have been validated MAY be utilized by the DPLPMTUD | PTB messages that have been validated MAY be utilized by the DPLPMTUD | |||
algorithm, but MUST NOT be used directly to set the PLPMTU. A method | algorithm, but MUST NOT be used directly to set the PLPMTU. The | |||
that utilizes these PTB messages can improve the speed at the which | PL_PTB_SIZE is smaller than the PTB_SIZE because it is reduced by | |||
the algorithm detects an appropriate PLPMTU by triggering an | headers below the PL including any IP options or extensions added to | |||
immediate probe for the PTB_SIZE, compared to one that relies solely | the PL packet. A method that utilizes these PTB messages can improve | |||
on probing using a timer-based search algorithm. Section 4.6.2 | the speed at which the algorithm detects an appropriate PLPMTU by | |||
describes this processing. | triggering an immediate probe for the PL_PTB_SIZE (resulting in a | |||
network-layer packet of size PTB_SIZE), compared to one that relies | ||||
solely on probing using a timer-based search algorithm. | ||||
Section 4.6.2 describes this processing. | ||||
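
The conversion introduced in the -17 text can be sketched as a subtraction of the headers below the PL. This is an assumption-labelled illustration: the function name and the `lower_layer_overhead` parameter are hypothetical, and the caller is assumed to know how many bytes of IP header, options, or extension headers are added below the PL.

```python
def pl_ptb_size(ptb_size: int, lower_layer_overhead: int) -> int:
    """Convert a validated PTB_SIZE into a PL_PTB_SIZE.

    lower_layer_overhead is the number of bytes added below the PL,
    for example the IP header plus any IP options or extension headers.
    """
    if lower_layer_overhead < 0:
        raise ValueError("overhead cannot be negative")
    return ptb_size - lower_layer_overhead


# Hypothetical example: an ICMPv6 PTB message reporting 1400 bytes on a
# path where the only header below the PL is the 40-byte IPv6 base header.
probe_target = pl_ptb_size(1400, lower_layer_overhead=40)
```

A probe sent with a PL size of `probe_target` would then produce a network-layer packet of the reported PTB_SIZE.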
4.6.2. Use of PTB Messages | 4.6.2. Use of PTB Messages | |||
A set of checks are intended to provide protection from a router that | Before using the size reported in the PTB message it must first be | |||
reports an unexpected PTB_SIZE. The PL also needs to check that the | converted to a PL_PTB_SIZE. A set of checks are intended to provide | |||
indicated PTB_SIZE is less than the size used by probe packets and at | protection from a router that reports an unexpected PTB_SIZE. The PL | |||
least the minimum size accepted. | also needs to check that the indicated PL_PTB_SIZE is less than the | |||
size used by probe packets and at least the minimum size accepted. | ||||
This section provides a summary of how PTB messages can be utilized. | This section provides a summary of how PTB messages can be utilized. | |||
This processing depends on the PTB_SIZE and the current value of a | This processing depends on the PL_PTB_SIZE and the current value of a | |||
set of variables: | set of variables: | |||
PTB_SIZE < MIN_PMTU | PL_PTB_SIZE < MIN_PLPMTU | |||
* Invalid PTB_SIZE see Section 4.6.1. | * Invalid PL_PTB_SIZE see Section 4.6.1. | |||
* PTB message ought to be discarded without further processing | * PTB message ought to be discarded without further processing | |||
(e. g. PLPMTU not modified). | (i.e., PLPMTU is not modified). | |||
* The information could be utilized as an input to trigger | * The information could be utilized as an input to a trigger that | |||
enabling a resilience mode. | would enable a resilience mode. | |||
MIN_PMTU < PTB_SIZE < BASE_PMTU | MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU | |||
* A robust PL MAY enter an error state (see Section 5.2) for an | * A robust PL MAY enter an error state (see Section 5.2) for an | |||
IPv4 path when the PTB_SIZE reported in the PTB message is | IPv4 path when the PL_PTB_SIZE reported in the PTB message is | |||
larger than or equal to 68 bytes [RFC0791] and when this is | larger than or equal to 68 bytes [RFC0791] and when this is | |||
less than the BASE_PMTU. | less than the BASE_PLPMTU. | |||
* A robust PL MAY enter an error state (see Section 5.2) for an | * A robust PL MAY enter an error state (see Section 5.2) for an | |||
IPv6 path when the PTB_SIZE reported in the PTB message is | IPv6 path when the PL_PTB_SIZE reported in the PTB message is | |||
larger than or equal to 1280 bytes [RFC8200] and when this is | larger than or equal to 1280 bytes [RFC8200] and when this is | |||
less than the BASE_PMTU. | less than the BASE_PLPMTU. | |||
PTB_SIZE = PLPMTU | PL_PTB_SIZE = PLPMTU | |||
* Completes the search for a larger PLPMTU. | * Completes the search for a larger PLPMTU. | |||
PTB_SIZE > PROBED_SIZE | PL_PTB_SIZE > PROBED_SIZE | |||
* Inconsistent network signal. | * Inconsistent network signal. | |||
* PTB message ought to be discarded without further processing | * PTB message ought to be discarded without further processing | |||
(e. g. PLPMTU not modified). | (i.e., PLPMTU is not modified). | |||
* The information could be utilized as an input to trigger | * The information could be utilized as an input to trigger | |||
enabling a resilience mode. | enabling a resilience mode. | |||
BASE_PMTU <= PTB_SIZE < PLPMTU | BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU | |||
* This could be an indication of a black hole. The PLPMTU SHOULD | * This could be an indication of a black hole. The PLPMTU SHOULD | |||
be set to BASE_PMTU (the PLPMTU is reduced to the BASE_PMTU to | be set to BASE_PLPMTU (the PLPMTU is reduced to the BASE_PLPMTU | |||
avoid unnecessary packet loss when a black hole is | to avoid unnecessary packet loss when a black hole is | |||
encountered). | encountered). | |||
* The PL ought to start a search to quickly discover the new | * The PL ought to start a search to quickly discover the new | |||
PLPMTU. The PTB_SIZE reported in the PTB message can be used | PLPMTU. The PL_PTB_SIZE reported in the PTB message can be | |||
to initialize a search algorithm. | used to initialize a search algorithm. | |||
PLPMTU < PTB_SIZE < PROBED_SIZE | PLPMTU < PL_PTB_SIZE < PROBED_SIZE | |||
* The PLPMTU continues to be valid, but the last PROBED_SIZE | * The PLPMTU continues to be valid, but the size of a packet used | |||
searched was larger than the actual PMTU. | to search (PROBED_SIZE) was larger than the actual PMTU. | |||
* The PLPMTU is not updated. | * The PLPMTU is not updated. | |||
* The PL can use the reported PTB_SIZE from the PTB message as | * The PL can use the reported PL_PTB_SIZE from the PTB message as | |||
the next search point when it resumes the search algorithm. | the next search point when it resumes the search algorithm. | |||
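
The case analysis above can be summarized as a classification function. This is a sketch rather than the specified algorithm: the enum and function names are invented for the example, the cases are tested in the order the text lists them, and boundary values the text does not list (such as a PL_PTB_SIZE exactly equal to MIN_PLPMTU or to PROBED_SIZE) are reported as UNSPECIFIED instead of being assigned a behavior.

```python
from enum import Enum, auto


class PtbAction(Enum):
    DISCARD_INVALID = auto()       # PL_PTB_SIZE < MIN_PLPMTU
    MAY_ENTER_ERROR = auto()       # MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU
    SEARCH_COMPLETE = auto()       # PL_PTB_SIZE = PLPMTU
    DISCARD_INCONSISTENT = auto()  # PL_PTB_SIZE > PROBED_SIZE
    BLACK_HOLE = auto()            # BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU
    NEXT_SEARCH_POINT = auto()     # PLPMTU < PL_PTB_SIZE < PROBED_SIZE
    UNSPECIFIED = auto()           # value not covered by the listed cases


def classify_ptb(pl_ptb_size: int, min_plpmtu: int, base_plpmtu: int,
                 plpmtu: int, probed_size: int) -> PtbAction:
    """Map a validated PL_PTB_SIZE onto the cases listed in Section 4.6.2."""
    if pl_ptb_size < min_plpmtu:
        return PtbAction.DISCARD_INVALID
    if min_plpmtu < pl_ptb_size < base_plpmtu:
        return PtbAction.MAY_ENTER_ERROR
    if pl_ptb_size == plpmtu:
        return PtbAction.SEARCH_COMPLETE
    if pl_ptb_size > probed_size:
        return PtbAction.DISCARD_INCONSISTENT
    if base_plpmtu <= pl_ptb_size < plpmtu:
        return PtbAction.BLACK_HOLE
    if plpmtu < pl_ptb_size < probed_size:
        return PtbAction.NEXT_SEARCH_POINT
    return PtbAction.UNSPECIFIED
```

For the BLACK_HOLE outcome the caller would set the PLPMTU to BASE_PLPMTU and could seed the search with the reported size. For NEXT_SEARCH_POINT the PLPMTU stays unchanged.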
5. Datagram Packetization Layer PMTUD | 5. Datagram Packetization Layer PMTUD | |||
This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | |||
be introduced at various points (as indicated with * in the figure | be introduced at various points (as indicated with * in the figure | |||
below) in the IP protocol stack to discover the PLPMTU so that an | below) in the IP protocol stack to discover the PLPMTU so that an | |||
application can utilize an appropriate MPS for the current network | application can utilize an appropriate MPS for the current network | |||
path. | path. | |||
DPLPMTUD SHOULD NOT be used by an upper PL or application if it is | DPLPMTUD SHOULD NOT be used by an upper PL or application if it is | |||
already used in a lower layer, DPLPMTUD SHOULD only be performed once | already used in a lower layer, DPLPMTUD SHOULD only be performed once | |||
between a pair of endpoints. A PL MUST adjust the MPS indicated by | between a pair of endpoints. A PL MUST adjust the MPS indicated by | |||
DPLPMTUD to account for any additional overhead introduced by the PL. | DPLPMTUD to account for any additional overhead introduced by the PL. | |||
+----------------------+ | +----------------------+ | |||
| Application* | | | Application* | | |||
+-+-------+----+----+--+ | +-----+------------+---+ | |||
| | | | | | | | |||
+---+--+ +--+--+ | +-+---+ | +---+--+ +--+--+ | |||
| QUIC*| |UDPO*| | |SCTP*| | | QUIC*| |SCTP*| | |||
+---+--+ +--+--+ | +--+--+ | +---+--+ +-+-+-+ | |||
| | | | | | | | | | |||
+-------+--+ | | | | +---+ +----+ | | |||
| | | | | | | | | |||
+-+-+--+ | | +-+--+-+ | | |||
| UDP | | | | UDP | | | |||
+---+--+ | | +---+--+ | | |||
| | | | | | |||
+--------------+-----+-+ | +-----------+-------+--+ | |||
| Network Interface | | | Network Interface | | |||
+----------------------+ | +----------------------+ | |||
Figure 2: Examples where DPLPMTUD can be implemented | Figure 2: Examples where DPLPMTUD can be implemented | |||
The central idea of DPLPMTUD is probing by a sender. Probe packets | The central idea of DPLPMTUD is probing by a sender. Probe packets | |||
are sent to find the maximum size of user message that can be | are sent to find the maximum size of user message that can be | |||
completely transferred across the network path from the sender to the | completely transferred across the network path from the sender to the | |||
destination. | destination. | |||
The following sections identify the components needed for | The following sections identify the components needed for | |||
implementation, provide an overview of the phases of operation, and | implementation, provide an overview of the phases of operation, and | |||
skipping to change at page 21, line 38 ¶ | skipping to change at page 22, line 4 ¶ | |||
An implementation could implement the various timers using a single | An implementation could implement the various timers using a single | |||
timer. | timer. | |||
5.1.2. Constants | 5.1.2. Constants | |||
The following constants are defined: | The following constants are defined: | |||
MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | |||
counter (see Section 5.1.3). MAX_PROBES represents the limit for | counter (see Section 5.1.3). MAX_PROBES represents the limit for | |||
the number of consecutive probe attempts of any size. Search | the number of consecutive probe attempts of any size. Search | |||
algorithms benefit from a MAX_PROBES value greater than 1 because | algorithms benefit from a MAX_PROBES value greater than 1 because | |||
this can provide robustness to isolated packet loss. The default | this can provide robustness to isolated packet loss. The default | |||
value of MAX_PROBES is 3. | value of MAX_PROBES is 3. | |||
MIN_PMTU: The MIN_PMTU is the smallest allowed probe packet size. | MIN_PLPMTU: The MIN_PLPMTU is the smallest allowed probe packet | |||
For IPv6, this value is 1280 bytes, as specified in [RFC8200]. | size. For IPv6, this value is 1280 bytes, as specified in | |||
For IPv4, the minimum value is 68 bytes. | [RFC8200]. For IPv4, the minimum value is 68 bytes. | |||
Note: An IPv4 router is required to be able to forward a datagram | Note: An IPv4 router is required to be able to forward a datagram | |||
of 68 bytes without further fragmentation. This is the combined | of 68 bytes without further fragmentation. This is the combined | |||
size of an IPv4 header and the minimum fragment size of 8 bytes. | size of an IPv4 header and the minimum fragment size of 8 bytes. | |||
In addition, receivers are required to be able to reassemble | In addition, receivers are required to be able to reassemble | |||
fragmented datagrams at least up to 576 bytes, as stated in | fragmented datagrams at least up to 576 bytes, as stated in | |||
section 3.3.3 of [RFC1122]. | section 3.3.3 of [RFC1122]. | |||
MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU. This has to | MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU. This has | |||
be less than or equal to the minimum of the local MTU of the | to be less than or equal to the maximum size of the PL packet that | |||
outgoing interface and the destination PMTU for receiving. An | can be sent on the outgoing interface (constrained by the local | |||
application, or PL, MAY choose a smaller MAX_PMTU when there is no | interface MTU). When known, this also ought to be less than the | |||
need to send packets larger than a specific size. | maximum size of PL packet that can be received by the remote | |||
endpoint (constrained by EMTU_R). It can be limited by the design | ||||
or configuration of the PL being used. An application, or PL, MAY | ||||
choose a smaller MAX_PLPMTU when there is no need to send packets | ||||
larger than a specific size. | ||||
BASE_PMTU: The BASE_PMTU is a configured size expected to work for | BASE_PLPMTU: The BASE_PLPMTU is a configured size expected to work | |||
most paths. The size is equal to or larger than the MIN_PMTU and | for most paths. The size is equal to or larger than the | |||
smaller than the MAX_PMTU. In the case of IPv6, this value is | MIN_PLPMTU and smaller than the MAX_PLPMTU. In the case of IPv6, | |||
1280 bytes [RFC8200]. When using IPv4, a size of 1200 bytes is | this value is 1280 bytes [RFC8200]. When using IPv4, a size of | |||
RECOMMENDED. | 1200 bytes is RECOMMENDED. | |||
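Editorial illustration (not part of either draft revision): the relationship between the size constants above can be sketched in Python as follows. The names `PlpmtuLimits` and `derive_limits`, the `app_limit` parameter, and the choice to clamp BASE_PLPMTU when the interface is small are assumptions made for this sketch only. Sizes are taken to be in bytes, using the -17 constant names.

```python
from dataclasses import dataclass
from typing import Optional

MAX_PROBES = 3  # default value given in the text above


@dataclass(frozen=True)
class PlpmtuLimits:
    min_plpmtu: int
    base_plpmtu: int
    max_plpmtu: int


def derive_limits(ipv6: bool, local_mtu: int,
                  emtu_r: Optional[int] = None,
                  app_limit: Optional[int] = None) -> PlpmtuLimits:
    """Hypothetical helper: choose MIN/BASE/MAX_PLPMTU for one path."""
    # MIN_PLPMTU: 1280 bytes for IPv6, 68 bytes for IPv4.
    min_plpmtu = 1280 if ipv6 else 68
    # BASE_PLPMTU: 1280 bytes for IPv6, 1200 bytes recommended for IPv4.
    base_plpmtu = 1280 if ipv6 else 1200
    # MAX_PLPMTU: bounded by the local interface, by EMTU_R when it is
    # known, and optionally by a smaller application-chosen limit.
    max_plpmtu = local_mtu
    if emtu_r is not None:
        max_plpmtu = min(max_plpmtu, emtu_r)
    if app_limit is not None:
        max_plpmtu = min(max_plpmtu, app_limit)
    if max_plpmtu < min_plpmtu:
        raise ValueError("MAX_PLPMTU would be below MIN_PLPMTU")
    # Assumption of this sketch: clamp BASE_PLPMTU so it never exceeds
    # MAX_PLPMTU on a small interface.
    base_plpmtu = min(base_plpmtu, max_plpmtu)
    return PlpmtuLimits(min_plpmtu, base_plpmtu, max_plpmtu)
```

For example, `derive_limits(False, 1500, emtu_r=1400)` describes an IPv4 path whose search range runs from a BASE_PLPMTU of 1200 up to a MAX_PLPMTU of 1400.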
5.1.3. Variables | 5.1.3. Variables | |||
This method utilizes a set of variables: | This method utilizes a set of variables: | |||
PROBED_SIZE: The PROBED_SIZE is the size of the current probe | PROBED_SIZE: The PROBED_SIZE is the size of the current probe | |||
packet. This is a tentative value for the PLPMTU, which is | packet. This is a tentative value for the PLPMTU, which is | |||
awaiting confirmation by an acknowledgment. | awaiting confirmation by an acknowledgment. | |||
skipping to change at page 22, line 34 ¶ | skipping to change at page 23, line 4 ¶ | |||
PROBE_COUNT: The PROBE_COUNT is a count of the number of successive | PROBE_COUNT: The PROBE_COUNT is a count of the number of successive | |||
unsuccessful probe packets that have been sent. Each time a probe | unsuccessful probe packets that have been sent. Each time a probe | |||
packet is acknowledged, the value is set to zero. (Some probe | packet is acknowledged, the value is set to zero. (Some probe | |||
loss is expected while searching, therefore loss of a single probe | loss is expected while searching, therefore loss of a single probe | |||
is not an indication of a PMTU problem.) | is not an indication of a PMTU problem.) | |||
The figure below illustrates the relationship between the packet size | The figure below illustrates the relationship between the packet size | |||
constants and variables at a point of time when the DPLPMTUD | constants and variables at a point of time when the DPLPMTUD | |||
algorithm performs path probing to increase the size of the PLPMTU. | algorithm performs path probing to increase the size of the PLPMTU. | |||
A probe packet has been sent of size PROBED_SIZE. Once this is | A probe packet has been sent of size PROBED_SIZE. Once this is | |||
acknowledged, the PLPMTU will raise to PROBED_SIZE allowing the | acknowledged, the PLPMTU will raise to PROBED_SIZE allowing the | |||
DPLPMTUD algorithm to further increase PROBED_SIZE towards the actual | DPLPMTUD algorithm to further increase PROBED_SIZE toward sending a | |||
PMTU. | probe with the size of the actual PMTU. | |||
MIN_PMTU MAX_PMTU | MIN_PLPMTU MAX_PLPMTU | |||
<--------------------------------------------------> | <-------------------------------------------> | |||
| | | | | | | | | |||
v | | v | v | | | |||
BASE_PMTU | v Actual PMTU | BASE_PLPMTU | v | |||
| PROBED_SIZE | | PROBED_SIZE | |||
v | v | |||
PLPMTU | PLPMTU | |||
Figure 3: Relationships between packet size constants and variables | Figure 3: Relationships between packet size constants and variables | |||
5.1.4. Overview of DPLPMTUD Phases | 5.1.4. Overview of DPLPMTUD Phases | |||
This section provides a high-level informative view of the DPLPMTUD | This section provides a high-level informative view of the DPLPMTUD | |||
method, by describing the movement of the method through several | method, by describing the movement of the method through several | |||
phases of operation. More detail is available in the state machine | phases of operation. More detail is available in the state machine | |||
Section 5.2. | Section 5.2. | |||
+------+ | +------+ | |||
+------->| Base |----------------+ Connectivity | +------->| Base |-----------------+ Connectivity | |||
| +------+ | or BASE_PMTU | | +------+ | or BASE_PLPMTU | |||
| | | confirmation failed | | | | confirmation failed | |||
| | v | | | v | |||
| | Connectivity +-------+ | | | Connectivity +-------+ | |||
| | and BASE_PMTU | Error | | | | and BASE_PLPMTU | Error | | |||
| | confirmed +-------+ | | | confirmed +-------+ | |||
| | | Consistent | | | | Consistent | |||
| v | connectivity | | v | connectivity | |||
PLPMTU | +--------+ | and BASE_PMTU | PLPMTU | +--------+ | and BASE_PLPMTU | |||
confirmation | | Search |<--------------+ confirmed | confirmation | | Search |<---------------+ confirmed | |||
failed | +--------+ | failed | +--------+ | |||
| ^ | | | ^ | | |||
| | | | | | | | |||
| Raise | | Search | | Raise | | Search | |||
| timer | | algorithm | | timer | | algorithm | |||
| expired | | completed | | expired | | completed | |||
| | | | | | | | |||
| | v | | | v | |||
| +-----------------+ | | +-----------------+ | |||
+---| Search Complete | | +---| Search Complete | | |||
+-----------------+ | +-----------------+ | |||
Figure 4: DPLPMTUD Phases | Figure 4: DPLPMTUD Phases | |||
Base: The Base Phase confirms connectivity to the remote peer using | Base: The Base Phase confirms connectivity to the remote peer using | |||
packets of the BASE_PMTU. This phase is implicit for a | packets of the BASE_PLPMTU. This phase is implicit for a | |||
connection-oriented PL (where it can be performed in a PL | connection-oriented PL (where it can be performed in a PL | |||
connection handshake). A connectionless PL sends an acknowledged | connection handshake). A connectionless PL sends a probe packet | |||
probe packet to confirm that the remote peer is reachable. The | and uses acknowledgment of this probe packet to confirm that the | |||
sender also confirms that BASE_PMTU is supported across the | remote peer is reachable. | |||
network path. | ||||
The sender also confirms that BASE_PLPMTU is supported across the | ||||
network path. This may be achieved using a PL mechanism (e.g., | ||||
using a handshake packet of size BASE_PLPMTU), or by sending a | ||||
probe packet of size BASE_PLPMTU and confirming that this is | ||||
received. | ||||
A probe packet of size BASE_PLPMTU can be sent immediately on the | ||||
initial entry to the Base Phase (following a connectivity check). | ||||
A PL that does not wish to support a path with a PLPMTU less than | A PL that does not wish to support a path with a PLPMTU less than | |||
BASE_PMTU can simplify the phase into a single step by performing | BASE_PLPMTU can simplify the phase into a single step by | |||
the connectivity checks with a probe of the BASE_PMTU size. | performing the connectivity checks with a probe of the BASE_PLPMTU | |||
size. | ||||
Once confirmed, DPLPMTUD enters the Search Phase. If this phase | Once confirmed, DPLPMTUD enters the Search Phase. If this phase | |||
fails to confirm, DPLPMTUD enters the Error Phase. | fails to confirm, DPLPMTUD enters the Error Phase. | |||
Search: The Search Phase utilizes a search algorithm to send probe | Search: The Search Phase utilizes a search algorithm to send probe | |||
packets to seek to increase the PLPMTU. The algorithm concludes | packets to seek to increase the PLPMTU. The algorithm concludes | |||
when it has found a suitable PLPMTU, by entering the Search | when it has found a suitable PLPMTU, by entering the Search | |||
Complete Phase. | Complete Phase. | |||
A PL could respond to PTB messages using the PTB to advance or | A PL could respond to PTB messages using the PTB to advance or | |||
skipping to change at page 24, line 25 ¶ | skipping to change at page 24, line 48 ¶ | |||
CONFIRMATION_TIMER to periodically repeat a probe packet for the | CONFIRMATION_TIMER to periodically repeat a probe packet for the | |||
current PLPMTU size. If the sender is unable to confirm | current PLPMTU size. If the sender is unable to confirm | |||
reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL | reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL | |||
signals a lack of reachability, DPLPMTUD enters the Base phase. | signals a lack of reachability, DPLPMTUD enters the Base phase. | |||
The PMTU_RAISE_TIMER is used to periodically resume the search | The PMTU_RAISE_TIMER is used to periodically resume the search | |||
phase to discover if the PLPMTU can be raised. Black Hole | phase to discover if the PLPMTU can be raised. Black Hole | |||
Detection causes the sender to enter the Base Phase. | Detection causes the sender to enter the Base Phase. | |||
Error: The Error Phase is entered when there is conflicting or | Error: The Error Phase is entered when there is conflicting or | |||
invalid PLPMTU information for the path (e.g. a failure to support | invalid PLPMTU information for the path (e.g., a failure to | |||
the BASE_PMTU) that cause DPLPMTUD to be unable to progress and | support the BASE_PLPMTU) that cause DPLPMTUD to be unable to | |||
the PLPMTU is lowered. | progress and the PLPMTU is lowered. | |||
DPLPMTUD remains in the Error Phase until a consistent view of the | DPLPMTUD remains in the Error Phase until a consistent view of the | |||
path can be discovered and it has also been confirmed that the | path can be discovered and it has also been confirmed that the | |||
path supports the BASE_PMTU (or DPLPMTUD is suspended). | path supports the BASE_PLPMTU (or DPLPMTUD is suspended). | |||
An implementation that only reduces the PLPMTU to a suitable size | An implementation that only reduces the PLPMTU to a suitable size | |||
would be sufficient to ensure reliable operation, but can be very | would be sufficient to ensure reliable operation, but can be very | |||
inefficient when the actual PMTU changes or when the method (for | inefficient when the actual PMTU changes or when the method (for | |||
whatever reason) makes a suboptimal choice for the PLPMTU. | whatever reason) makes a suboptimal choice for the PLPMTU. | |||
A full implementation of DPLPMTUD provides an algorithm enabling the | A full implementation of DPLPMTUD provides an algorithm enabling the | |||
DPLPMTUD sender to increase the PLPMTU following a change in the | DPLPMTUD sender to increase the PLPMTU following a change in the | |||
characteristics of the path, such as when a link is reconfigured with | characteristics of the path, such as when a link is reconfigured with | |||
a larger MTU, or when there is a change in the set of links traversed | a larger MTU, or when there is a change in the set of links traversed | |||
skipping to change at page 25, line 13 ¶ | skipping to change at page 26, line 13 ¶ | |||
Note: Not all changes are shown to simplify the diagram. | Note: Not all changes are shown to simplify the diagram. | |||
| | | | | | |||
| Start | PL indicates loss | | Start | PL indicates loss | |||
| | of connectivity | | | of connectivity | |||
v v | v v | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
| DISABLED | | ERROR | | | DISABLED | | ERROR | | |||
+---------------+ PROBE_TIMER expiry: +---------------+ | +---------------+ PROBE_TIMER expiry: +---------------+ | |||
| PL indicates PROBE_COUNT = MAX_PROBES or ^ | | | PL indicates PROBE_COUNT = MAX_PROBES or ^ | | |||
| connectivity PTB: PTB_SIZE < BASE_PMTU | | | | connectivity PTB: PLPTB_SIZE < BASE_PLPMTU | | | |||
+--------------------+ +---------------+ | | +--------------------+ +---------------+ | | |||
| | | | | | | | |||
v | BASE_PMTU Probe | | v | BASE_PLPMTU Probe | | |||
+---------------+ acked | | +---------------+ acked | | |||
| BASE |----------------------+ | | BASE |----------------------+ | |||
+---------------+ | | +---------------+ | | |||
^ | ^ ^ | | ^ | ^ ^ | | |||
Black hole detected | | | | Black hole detected | | Black hole detected | | | | Black hole detected | | |||
+--------------------+ | | +--------------------+ | | +--------------------+ | | +--------------------+ | | |||
| +----+ | | | | +----+ | | | |||
| PROBE_TIMER expiry: | | | | PROBE_TIMER expiry: | | | |||
| PROBE_COUNT < MAX_PROBES | | | | PROBE_COUNT < MAX_PROBES | | | |||
| | | | | | | | |||
| PMTU_RAISE_TIMER expiry | | | | PMTU_RAISE_TIMER expiry | | | |||
| +-----------------------------------------+ | | | | +-----------------------------------------+ | | | |||
| | | | | | | | | | | | |||
| | v | v | | | v | v | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
|SEARCH_COMPLETE| | SEARCHING | | |SEARCH_COMPLETE| | SEARCHING | | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
| ^ ^ | | ^ | | ^ ^ | | ^ | |||
| | | | | | | | | | | | | | |||
| | +-----------------------------------------+ | | | | | +-----------------------------------------+ | | | |||
| | MAX_PMTU Probe acked or | | | | | MAX_PLPMTU Probe acked or | | | |||
| | PROBE_TIMER expiry: PROBE_COUNT = MAX_PROBES or | | | | | PROBE_TIMER expiry: PROBE_COUNT = MAX_PROBES or | | | |||
+----+ PTB: PTB_SIZE = PLPMTU +----+ | +----+ PTB: PLPTB_SIZE = PLPMTU +----+ | |||
CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: | CONFIRMATION_TIMER expiry: PROBE_TIMER expiry: | |||
PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or | PROBE_COUNT < MAX_PROBES or PROBE_COUNT < MAX_PROBES or | |||
PLPMTU Probe acked Probe acked or PTB: | PLPMTU Probe acked Probe acked or PTB: | |||
PLPMTU < PTB_SIZE < PROBED_SIZE | PLPMTU < PLPTB_SIZE < PROBED_SIZE | |||
Figure 5: State machine for Datagram PLPMTUD | Figure 5: State machine for Datagram PLPMTUD | |||
The following states are defined: | The following states are defined: | |||
DISABLED: The DISABLED state is the initial state before probing has | DISABLED: The DISABLED state is the initial state before probing has | |||
started. It is also entered from any other state, when the PL | started. It is also entered from any other state, when the PL | |||
indicates loss of connectivity. This state is left, once the PL | indicates loss of connectivity. This state is left once the PL | |||
indicates connectivity to the remote PL. | indicates connectivity to the remote PL. When transitioning to | |||
the BASE state, a probe packet of size BASE_PLPMTU can be sent | ||||
immediately. | ||||
BASE: The BASE state is used to confirm that the BASE_PMTU size is | BASE: The BASE state is used to confirm that the BASE_PLPMTU size is | |||
supported by the network path and is designed to allow an | supported by the network path and is designed to allow an | |||
application to continue working when there are transient | application to continue working when there are transient | |||
reductions in the actual PMTU. It also seeks to avoid long | reductions in the actual PMTU. It also seeks to avoid long | |||
periods when a sender searching for a larger PLPMTU is unaware | periods when a sender searching for a larger PLPMTU is unaware | |||
that packets are not being delivered due to a packet or ICMP Black | that packets are not being delivered due to a packet or ICMP Black | |||
Hole. | Hole. | |||
On entry, the PROBED_SIZE is set to the BASE_PMTU size and the | On entry, the PROBED_SIZE is set to the BASE_PLPMTU size and the | |||
PROBE_COUNT is set to zero. | PROBE_COUNT is set to zero. | |||
Each time a probe packet is sent, the PROBE_TIMER is started. The | Each time a probe packet is sent, the PROBE_TIMER is started. The | |||
state is exited when the probe packet is acknowledged, and the PL | state is exited when the probe packet is acknowledged, and the PL | |||
sender enters the SEARCHING state. | sender enters the SEARCHING state. | |||
The state is also left when the PROBE_COUNT reaches MAX_PROBES or | The state is also left when the PROBE_COUNT reaches MAX_PROBES or | |||
a received PTB message is validated. This causes the PL sender to | a received PTB message is validated. This causes the PL sender to | |||
enter the ERROR state. | enter the ERROR state. | |||
SEARCHING: The SEARCHING state is the main probing state. This | SEARCHING: The SEARCHING state is the main probing state. This | |||
state is entered when probing for the BASE_PMTU was successful. | state is entered when probing for the BASE_PLPMTU was successful. | |||
Each time a probe packet is acknowledged, the PROBE_COUNT is set | Each time a probe packet is acknowledged, the PROBE_COUNT is set | |||
to zero, the PLPMTU is set to the PROBED_SIZE and then the | to zero, the PLPMTU is set to the PROBED_SIZE and then the | |||
PROBED_SIZE is increased using the search algorithm. | PROBED_SIZE is increased using the search algorithm. | |||
When a probe packet is sent and not acknowledged within the period | When a probe packet is sent and not acknowledged within the period | |||
of the PROBE_TIMER, the PROBE_COUNT is incremented and a new probe | of the PROBE_TIMER, the PROBE_COUNT is incremented and a new probe | |||
packet is transmitted. | packet is transmitted. | |||
The state is exited to enter SEARCH_COMPLETE when the PROBE_COUNT | The state is exited to enter SEARCH_COMPLETE when the PROBE_COUNT | |||
reaches MAX_PROBES, a validated PTB is received that corresponds | reaches MAX_PROBES, a validated PTB is received that corresponds | |||
to the last successfully probed size (PTB_SIZE = PLPMTU), or a | to the last successfully probed size (PL_PTB_SIZE = PLPMTU), or a | |||
probe of size MAX_PMTU is acknowledged (PLPMTU = MAX_PMTU). | probe of size MAX_PLPMTU is acknowledged (PLPMTU = MAX_PLPMTU). | |||
When a black hole is detected in the SEARCHING state, this causes | When a black hole is detected in the SEARCHING state, this causes | |||
the PL sender to enter the BASE state. | the PL sender to enter the BASE state. | |||
SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates a successful | SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates a successful | |||
end to the SEARCHING state. DPLPMTUD remains in this state until | end to the SEARCHING state. DPLPMTUD remains in this state until | |||
either the PMTU_RAISE_TIMER expires or a black hole is detected. | either the PMTU_RAISE_TIMER expires or a black hole is detected. | |||
When DPLPMTUD uses an unacknowledged PL and is in the | When DPLPMTUD uses an unacknowledged PL and is in the | |||
SEARCH_COMPLETE state, a CONFIRMATION_TIMER periodically resets | SEARCH_COMPLETE state, a CONFIRMATION_TIMER periodically resets | |||
the PROBE_COUNT and schedules a probe packet with the size of the | the PROBE_COUNT and schedules a probe packet with the size of the | |||
PLPMTU. If MAX_PROBES successive PLPMTU-sized probes fail to be | PLPMTU. If MAX_PROBES successive PLPMTU-sized probes fail to be | |||
acknowledged the method enters the BASE state. When used with an | acknowledged the method enters the BASE state. When used with an | |||
acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to | acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to | |||
generate PLPMTU probes in this state. | generate PLPMTU probes in this state. | |||
ERROR: The ERROR state represents the case where either the network | ERROR: The ERROR state represents the case where either the network | |||
path is not known to support a PLPMTU of at least the BASE_PMTU | path is not known to support a PLPMTU of at least the BASE_PLPMTU | |||
size or when there is contradictory information about the network | size or when there is contradictory information about the network | |||
path that would otherwise result in excessive variation in the MPS | path that would otherwise result in excessive variation in the MPS | |||
signalled to the higher layer. The state implements a method to | signaled to the higher layer. The state implements a method to | |||
mitigate oscillation in the state-event engine. It signals a | mitigate oscillation in the state-event engine. It signals a | |||
conservative value of the MPS to the higher layer by the PL. The | conservative value of the MPS to the higher layer by the PL. The | |||
state is exited when packet probes no longer detect the error. | state is exited when packet probes no longer detect the error. | |||
The PL sender then enters the SEARCHING state. | The PL sender then enters the SEARCHING state. | |||
Implementations are permitted to enable endpoint fragmentation if | Implementations are permitted to enable endpoint fragmentation if | |||
the DPLPMTUD is unable to validate MIN_PMTU within PROBE_COUNT | the DPLPMTUD is unable to validate MIN_PLPMTU within PROBE_COUNT | |||
probes. If DPLPMTUD is unable to validate MIN_PMTU the | probes. If DPLPMTUD is unable to validate MIN_PLPMTU the | |||
implementation will transition to the DISABLED state. | implementation will transition to the DISABLED state. | |||
Note: MIN_PMTU could be identical to BASE_PMTU, simplifying the | Note: MIN_PLPMTU could be identical to BASE_PLPMTU, simplifying | |||
actions in this state. | the actions in this state. | |||
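Editorial illustration (not part of either draft revision): the states and transitions described above can be approximated by a small state-event engine. This is a simplified sketch under stated assumptions. The class name `Dplpmtud`, its method names, and the fixed `step` increment are hypothetical. Timers and packet transmission are left to the caller. Exit from ERROR is reduced to "any acknowledged probe", and the handling of the PLPMTU value on black-hole detection is not modelled.

```python
from enum import Enum, auto
from typing import Optional


class State(Enum):
    DISABLED = auto()
    BASE = auto()
    SEARCHING = auto()
    SEARCH_COMPLETE = auto()
    ERROR = auto()


class Dplpmtud:
    """Toy state-event engine; the caller owns timers and packet I/O."""

    def __init__(self, base_plpmtu: int = 1200, max_plpmtu: int = 1500,
                 max_probes: int = 3, step: int = 100) -> None:
        self.base_plpmtu = base_plpmtu
        self.max_plpmtu = max_plpmtu
        self.max_probes = max_probes
        self.step = step  # assumed linear search increment
        self.state = State.DISABLED
        self.plpmtu: Optional[int] = None  # None until a probe is acked
        self.probed_size = base_plpmtu
        self.probe_count = 0

    def _enter_base(self) -> None:
        # On entry to BASE: PROBED_SIZE = BASE_PLPMTU, PROBE_COUNT = 0.
        self.state = State.BASE
        self.probed_size = self.base_plpmtu
        self.probe_count = 0

    def _next_probe_size(self) -> int:
        return min(self.plpmtu + self.step, self.max_plpmtu)

    def on_connectivity(self) -> None:
        if self.state is State.DISABLED:
            self._enter_base()

    def on_connectivity_lost(self) -> None:
        self.state = State.DISABLED  # entered from any other state

    def on_probe_acked(self) -> None:
        if self.state is State.DISABLED:
            return
        self.probe_count = 0
        if self.state is State.SEARCH_COMPLETE:
            return  # confirmation probe: the PLPMTU is still valid
        # BASE, SEARCHING or ERROR: the probed size is now confirmed.
        self.plpmtu = self.probed_size
        if self.plpmtu >= self.max_plpmtu:
            self.state = State.SEARCH_COMPLETE
        else:
            self.state = State.SEARCHING
            self.probed_size = self._next_probe_size()

    def on_probe_timer_expiry(self) -> None:
        if self.state is State.DISABLED:
            return
        self.probe_count += 1
        if self.probe_count < self.max_probes:
            return  # caller retransmits a probe of probed_size
        if self.state is State.BASE:
            self.state = State.ERROR
        elif self.state is State.SEARCHING:
            self.state = State.SEARCH_COMPLETE
            self.probed_size = self.plpmtu
        elif self.state is State.SEARCH_COMPLETE:
            self._enter_base()

    def on_raise_timer_expiry(self) -> None:
        if self.state is State.SEARCH_COMPLETE:
            self.state = State.SEARCHING
            self.probe_count = 0
            self.probed_size = self._next_probe_size()

    def on_black_hole(self) -> None:
        if self.state in (State.SEARCHING, State.SEARCH_COMPLETE):
            self._enter_base()

    def on_validated_ptb(self, pl_ptb_size: int) -> None:
        if self.state is State.BASE and pl_ptb_size < self.base_plpmtu:
            self.state = State.ERROR
        elif self.state is State.SEARCHING:
            if pl_ptb_size == self.plpmtu:
                self.state = State.SEARCH_COMPLETE
                self.probed_size = self.plpmtu
            elif self.plpmtu < pl_ptb_size < self.probed_size:
                # Use the reported size as the next search point.
                self.probed_size = pl_ptb_size
                self.probe_count = 0
```

A caller would invoke `on_probe_acked()` or `on_probe_timer_expiry()` for each probe of `probed_size` bytes that it sends, and would offer an MPS derived from `plpmtu` to the upper layer.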
5.3. Search to Increase the PLPMTU | 5.3. Search to Increase the PLPMTU | |||
This section describes the algorithms used by DPLPMTUD to search for | This section describes the algorithms used by DPLPMTUD to search for | |||
a larger PLPMTU. | a larger PLPMTU. | |||
5.3.1. Probing for a larger PLPMTU | 5.3.1. Probing for a larger PLPMTU | |||
Implementations use a search algorithm across the search range to | Implementations use a search algorithm across the search range to | |||
determine whether a larger PLPMTU can be supported across a network | determine whether a larger PLPMTU can be supported across a network | |||
path. | path. | |||
The method discovers the search range by confirming the minimum | The method discovers the search range by confirming the minimum | |||
PLPMTU and then using the probe method to select a PROBED_SIZE less | PLPMTU and then using the probe method to select a PROBED_SIZE less | |||
than or equal to MAX_PMTU. MAX_PMTU is the minimum of the local MTU | than or equal to MAX_PLPMTU. MAX_PLPMTU is the minimum of the local | |||
and EMTU_R (learned from the remote endpoint). The MAX_PMTU MAY be | MTU and EMTU_R (when this is learned from the remote endpoint). The | |||
reduced by an application that sets a maximum to the size of | MAX_PLPMTU MAY be reduced by an application that sets a maximum to | |||
datagrams it will send. | the size of datagrams it will send. | |||
The PROBE_COUNT is initialized to zero when the first probe with a | The PROBE_COUNT is initialized to zero when the first probe with a | |||
size greater than or equal to PLPMTU is sent. A timer is used to | size greater than or equal to PLPMTU is sent. A timer is used to | |||
trigger the sending of probe packets of size PROBED_SIZE, larger than | trigger the sending of probe packets of size PROBED_SIZE, larger than | |||
the PLPMTU. Each probe packet successfully sent to the remote peer | the PLPMTU. Each probe packet successfully sent to the remote peer | |||
is confirmed by acknowledgement at the PL, see Section 4.1. | is confirmed by acknowledgment at the PL, see Section 4.1. | |||
Each time a probe packet is sent to the destination, the PROBE_TIMER | Each time a probe packet is sent to the destination, the PROBE_TIMER | |||
is started. The timer is canceled when the PL receives | is started. The timer is canceled when the PL receives | |||
acknowledgment that the probe packet has been successfully sent | acknowledgment that the probe packet has been successfully sent | |||
across the path Section 4.1. This confirms that the PROBED_SIZE is | across the path Section 4.1. This confirms that the PROBED_SIZE is | |||
supported, and the PROBED_SIZE value is then assigned to the PLPMTU. | supported, and the PROBED_SIZE value is then assigned to the PLPMTU. | |||
The search algorithm can continue to send subsequent probe packets of | The search algorithm can continue to send subsequent probe packets of | |||
an increasing size. | an increasing size. | |||
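Editorial illustration (not part of either draft revision): the probing loop described above can be sketched as a simple upward search. The drafts do not mandate a particular search algorithm, so the fixed `step` increment is an assumption of this sketch. The function name `search_plpmtu` and the `probe` callable are also hypothetical: `probe` is assumed to send one probe packet of the given size and return True only if it is acknowledged before the PROBE_TIMER expires.

```python
from typing import Callable


def search_plpmtu(probe: Callable[[int], bool], base_plpmtu: int,
                  max_plpmtu: int, step: int = 16,
                  max_probes: int = 3) -> int:
    """Return the largest size confirmed by an acknowledged probe."""
    plpmtu = base_plpmtu  # assumed to be confirmed already (BASE state)
    while plpmtu < max_plpmtu:
        probed_size = min(plpmtu + step, max_plpmtu)
        probe_count = 0  # reset for each new PROBED_SIZE
        # Retry the same size until it is acknowledged or MAX_PROBES
        # consecutive probes have gone unacknowledged.
        while probe_count < max_probes and not probe(probed_size):
            probe_count += 1
        if probe_count == max_probes:
            break  # search completes; keep the last confirmed PLPMTU
        plpmtu = probed_size  # acknowledged: PROBED_SIZE becomes the PLPMTU
    return plpmtu
```

With a small `step` the result lands close to the actual PMTU at the cost of more probes. A larger step, or a different search strategy, trades accuracy for speed.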
If the timer expires before a probe packet is acknowledged, the probe | If the timer expires before a probe packet is acknowledged, the probe | |||
skipping to change at page 29, line 9 ¶ | skipping to change at page 30, line 9 ¶ | |||
A PL sender is able to detect inconsistency from the sequence of | A PL sender is able to detect inconsistency from the sequence of | |||
PLPMTU probes that are acknowledged or the sequence of PTB messages | PLPMTU probes that are acknowledged or the sequence of PTB messages | |||
that it receives. When inconsistent path information is detected, a | that it receives. When inconsistent path information is detected, a | |||
PL sender could use an alternate search mode that clamps the offered | PL sender could use an alternate search mode that clamps the offered | |||
MPS to a smaller value for a period of time. This avoids unnecessary | MPS to a smaller value for a period of time. This avoids unnecessary | |||
loss of packets. | loss of packets. | |||
5.4. Robustness to Inconsistent Paths | 5.4. Robustness to Inconsistent Paths | |||
Some paths could be unable to sustain packets of the BASE_PMTU size. | Some paths could be unable to sustain packets of the BASE_PLPMTU | |||
To be robust to these paths an implementation could implement the | size. To be robust to these paths an implementation could implement | |||
Error State. This allows fallback to a smaller than desired PLPMTU, | the Error State. This allows fallback to a smaller than desired | |||
rather than suffer connectivity failure. This could utilize methods | PLPMTU, rather than suffer connectivity failure. This could utilize | |||
such as endpoint IP fragmentation to enable the PL sender to | methods such as endpoint IP fragmentation to enable the PL sender to | |||
communicate using packets smaller than the BASE_PMTU. | communicate using packets smaller than the BASE_PLPMTU. | |||
6. Specification of Protocol-Specific Methods | 6. Specification of Protocol-Specific Methods | |||
DPLPMTUD requires protocol-specific details to be specified for each | DPLPMTUD requires protocol-specific details to be specified for each | |||
PL that is used. | PL that is used. | |||
The first subsection provides guidance on how to implement the | The first subsection provides guidance on how to implement the | |||
DPLPMTUD method as a part of an application using UDP or UDP-Lite. | DPLPMTUD method as a part of an application using UDP or UDP-Lite. | |||
The guidance also applies to other datagram services that do not | The guidance also applies to other datagram services that do not | |||
include a specific transport protocol (such as a tunnel | include a specific transport protocol (such as a tunnel | |||
skipping to change at page 30, line 8 ¶ | skipping to change at page 31, line 8 ¶ | |||
In addition, it is desirable that PMTU discovery is not performed by | In addition, it is desirable that PMTU discovery is not performed by | |||
multiple protocol layers. An application SHOULD avoid using DPLPMTUD | multiple protocol layers. An application SHOULD avoid using DPLPMTUD | |||
when the underlying transport system provides this capability. Using | when the underlying transport system provides this capability. Using | |||
a common method for managing the PLPMTU has benefits, both in the | a common method for managing the PLPMTU has benefits, both in the | |||
ability to share state between different processes and opportunities | ability to share state between different processes and opportunities | |||
to coordinate probing. | to coordinate probing. | |||
6.1.1. Application Request | 6.1.1. Application Request | |||
An application needs an application-layer protocol mechanism (such as | An application needs an application-layer protocol mechanism (such as | |||
a message acknowledgement method) that solicits a response from a | a message acknowledgment method) that solicits a response from a | |||
destination endpoint. The method SHOULD allow the sender to check | destination endpoint. The method SHOULD allow the sender to check | |||
the value returned in the response to provide additional protection | the value returned in the response to provide additional protection | |||
from off-path insertion of data [RFC8085]. Suitable methods include a | from off-path insertion of data [RFC8085]. Suitable methods include a | |||
parameter known only to the two endpoints, such as a session ID or | parameter known only to the two endpoints, such as a session ID or | |||
initialized sequence number. | initialized sequence number. | |||
6.1.2. Application Response | 6.1.2. Application Response | |||
An application needs an application-layer protocol mechanism to | An application needs an application-layer protocol mechanism to | |||
communicate the response from the destination endpoint. This | communicate the response from the destination endpoint. This | |||
response could indicate successful reception of the probe across the | response could indicate successful reception of the probe across the | |||
path, but could also indicate that some (or all) packets have failed | path, but could also indicate that some (or all) packets have failed | |||
to reach the destination. | to reach the destination. | |||
6.1.3. Sending Application Probe Packets | 6.1.3. Sending Application Probe Packets | |||
A probe packet that could carry an application data block, but the | A probe packet can carry an application data block, but the | |||
successful transmission of this data is at risk when used for | successful transmission of this data is at risk when used for | |||
probing. Some applications might prefer to use a probe packet that | probing. Some applications might prefer to use a probe packet that | |||
does not carry an application data block to avoid disruption to data | does not carry an application data block to avoid disruption to data | |||
transfer. | transfer. | |||
6.1.4. Initial Connectivity | 6.1.4. Initial Connectivity | |||
An application that does not have other higher-layer information | An application that does not have other higher-layer information | |||
confirming connectivity with the remote peer SHOULD implement a | confirming connectivity with the remote peer SHOULD implement a | |||
connectivity mechanism using acknowledged probe packets before | connectivity mechanism using acknowledged probe packets before | |||
skipping to change at page 30, line 49 ¶ | skipping to change at page 31, line 49 ¶ | |||
An application that does not have other higher-layer information | An application that does not have other higher-layer information | |||
confirming correct delivery of datagrams SHOULD implement the | confirming correct delivery of datagrams SHOULD implement the | |||
CONFIRMATION_TIMER to periodically send probe packets while in the | CONFIRMATION_TIMER to periodically send probe packets while in the | |||
SEARCH_COMPLETE state. | SEARCH_COMPLETE state. | |||
6.1.6. Handling of PTB Messages | 6.1.6. Handling of PTB Messages | |||
An application that is able and wishes to receive PTB messages MUST | An application that is able and wishes to receive PTB messages MUST | |||
perform ICMP validation as specified in Section 5.2 of [RFC8085]. | perform ICMP validation as specified in Section 5.2 of [RFC8085]. | |||
This requires that the application to check each received PTB | This requires that the application checks each received PTB message | |||
messages to validate it is received in response to transmitted | to validate that it was received in response to transmitted | |||
traffic and that the reported PTB_SIZE is less than the current | traffic and that the reported PL_PTB_SIZE is less than the current | |||
probed size (see Section 4.6.2). A validated PTB message MAY be used | probed size (see Section 4.6.2). A validated PTB message MAY be used | |||
as input to the DPLPMTUD algorithm, but MUST NOT be used directly to | as input to the DPLPMTUD algorithm, but MUST NOT be used directly to | |||
set the PLPMTU. | set the PLPMTU. | |||
6.2. DPLPMTUD for SCTP | 6.2. DPLPMTUD for SCTP | |||
Section 10.2 of [RFC4821] specified a recommended PLPMTUD probing | Section 10.2 of [RFC4821] specified a recommended PLPMTUD probing | |||
method for SCTP, and Section 7.3 of [RFC4960] recommended an | method for SCTP, and Section 7.3 of [RFC4960] recommended an | |||
endpoint apply the techniques in RFC 4821 on a per-destination-address | endpoint apply the techniques in RFC 4821 on a per-destination-address | |||
basis. The specification for DPLPMTUD continues the practice of | basis. The specification for DPLPMTUD continues the practice of | |||
skipping to change at page 32, line 10 ¶ | skipping to change at page 33, line 10 ¶ | |||
The HEARTBEAT chunk carries a Heartbeat Information parameter which | The HEARTBEAT chunk carries a Heartbeat Information parameter which | |||
includes, besides the information suggested in [RFC4960], the probe | includes, besides the information suggested in [RFC4960], the probe | |||
size, which is the size of the complete datagram. The size of the | size, which is the size of the complete datagram. The size of the | |||
PAD chunk is therefore computed by reducing the probing size by the | PAD chunk is therefore computed by reducing the probing size by the | |||
IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT | IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT | |||
request and the PAD chunk header. The payload of the PAD chunk | request and the PAD chunk header. The payload of the PAD chunk | |||
contains arbitrary data. | contains arbitrary data. | |||
Probing starts directly after the PL handshake, before data is sent. | Probing starts directly after the PL handshake, before data is sent. | |||
Assuming this behavior (i.e., the PMTU is smaller than or equal to | Assuming this behavior (i.e., the PMTU is smaller than or equal to | |||
the interface MTU), this process will take a few round trip time | the interface MTU), this process will take several round trip time | |||
periods, dependent on the number of PMTU probes sent. The Heartbeat | periods, dependent on the number of DPLPMTUD probes sent. The | |||
timer can be used to implement the PROBE_TIMER. | Heartbeat timer can be used to implement the PROBE_TIMER. | |||
6.2.1.3. Validating the Path with SCTP | 6.2.1.3. Validating the Path with SCTP | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT implement | Since SCTP provides an acknowledged PL, a sender MUST NOT implement | |||
the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.1.4. PTB Message Handling by SCTP | 6.2.1.4. PTB Message Handling by SCTP | |||
Normal ICMP validation MUST be performed as specified in Appendix C | Normal ICMP validation MUST be performed as specified in Appendix C | |||
of [RFC4960]. This requires that the first 8 bytes of the SCTP | of [RFC4960]. This requires that the first 8 bytes of the SCTP | |||
common header are quoted in the payload of the PTB message, which can | common header are quoted in the payload of the PTB message, which can | |||
be the case for ICMPv4 and is normally the case for ICMPv6. | be the case for ICMPv4 and is normally the case for ICMPv6. | |||
When a PTB message has been validated, the PTB_SIZE reported in the | When a PTB message has been validated, the PL_PTB_SIZE calculated | |||
PTB message SHOULD be used with the DPLPMTUD algorithm, providing | from the PTB_SIZE reported in the PTB message SHOULD be used with the | |||
that the reported PTB_SIZE is less than the current probe size (see | DPLPMTUD algorithm, providing that the reported PL_PTB_SIZE is less | |||
Section 4.6). | than the current probe size (see Section 4.6). | |||
6.2.2. DPLPMTUD for SCTP/UDP | 6.2.2. DPLPMTUD for SCTP/UDP | |||
The UDP encapsulation of SCTP is specified in [RFC6951]. | The UDP encapsulation of SCTP is specified in [RFC6951]. | |||
This specification updates the reference to RFC 4821 in section 5.6 | This specification updates the reference to RFC 4821 in section 5.6 | |||
of RFC 6951 to refer to XXXTHISRFCXXX. RFC 6951 is updated by | of RFC 6951 to refer to XXXTHISRFCXXX. RFC 6951 is updated by | |||
addition of the following sentence at the end of | addition of the following sentence at the end of | |||
section 5.6: "The RECOMMENDED method for determining the MTU of the | section 5.6: "The RECOMMENDED method for determining the MTU of the | |||
path is specified in XXXTHISRFCXXX". | path is specified in XXXTHISRFCXXX". | |||
skipping to change at page 33, line 23 ¶ | skipping to change at page 34, line 23 ¶ | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT implement | Since SCTP provides an acknowledged PL, a sender MUST NOT implement | |||
the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.2.2.4. Handling of PTB Messages by SCTP/UDP | 6.2.2.4. Handling of PTB Messages by SCTP/UDP | |||
ICMP validation MUST be performed for PTB messages as specified in | ICMP validation MUST be performed for PTB messages as specified in | |||
Appendix C of [RFC4960]. This requires that the first 8 bytes of the | Appendix C of [RFC4960]. This requires that the first 8 bytes of the | |||
SCTP common header are contained in the PTB message, which can be the | SCTP common header are contained in the PTB message, which can be the | |||
case for ICMPv4 (but note the UDP header also consumes a part of the | case for ICMPv4 (but note the UDP header also consumes a part of the | |||
quoted packet header) and is normally the case for ICMPv6. When the | quoted packet header) and is normally the case for ICMPv6. When the | |||
validation is completed, the PTB_SIZE indicated in the PTB message | validation is completed, the PL_PTB_SIZE calculated from the PTB_SIZE | |||
SHOULD be used with the DPLPMTUD providing that the reported PTB_SIZE | in the PTB message SHOULD be used with the DPLPMTUD providing that | |||
is less than the current probe size. | the reported PL_PTB_SIZE is less than the current probe size. | |||
6.2.3. DPLPMTUD for SCTP/DTLS | 6.2.3. DPLPMTUD for SCTP/DTLS | |||
The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | |||
specified in [RFC8261] . This is used for data channels in WebRTC | specified in [RFC8261]. This is used for data channels in WebRTC | |||
implementations. This specification updates the reference to RFC | implementations. This specification updates the reference to RFC | |||
4821 in section 5 of RFC 8261 to refer to XXXTHISRFCXXX. | 4821 in section 5 of RFC 8261 to refer to XXXTHISRFCXXX. | |||
XXX RFC EDITOR - please replace XXXTHISRFCXXX when published XXX | XXX RFC EDITOR - please replace XXXTHISRFCXXX when published XXX | |||
6.2.3.1. Initial Connectivity | 6.2.3.1. Initial Connectivity | |||
A sender can enter the BASE state as soon as SCTP connectivity has | A sender can enter the BASE state as soon as SCTP connectivity has | |||
been confirmed. | been confirmed. | |||
skipping to change at page 34, line 21 ¶ | skipping to change at page 35, line 21 ¶ | |||
QUIC [I-D.ietf-quic-transport] is a UDP-based transport that provides | QUIC [I-D.ietf-quic-transport] is a UDP-based transport that provides | |||
reception feedback. The UDP payload includes the QUIC packet header, | reception feedback. The UDP payload includes the QUIC packet header, | |||
protected payload, and any authentication fields. QUIC depends on a | protected payload, and any authentication fields. QUIC depends on a | |||
PMTU of at least 1280 bytes. | PMTU of at least 1280 bytes. | |||
Section 14 of [I-D.ietf-quic-transport] describes the path | Section 14 of [I-D.ietf-quic-transport] describes the path | |||
considerations when sending QUIC packets. It recommends the use of | considerations when sending QUIC packets. It recommends the use of | |||
PADDING frames to build the probe packet. Pure probe-only packets | PADDING frames to build the probe packet. Pure probe-only packets | |||
are constructed with PADDING frames and PING frames to create a | are constructed with PADDING frames and PING frames to create a | |||
padding only packet that will elicit an acknowledgement. Such | padding only packet that will elicit an acknowledgment. Such padding | |||
padding only packets enable probing without affecting the transfer of | only packets enable probing without affecting the transfer of other | |||
other QUIC frames. | QUIC frames. | |||
The recommendation for QUIC endpoints implementing DPLPMTUD is that an | The recommendation for QUIC endpoints implementing DPLPMTUD is that an | |||
MPS is maintained for each combination of local and remote IP | MPS is maintained for each combination of local and remote IP | |||
addresses [I-D.ietf-quic-transport]. If a QUIC endpoint determines | addresses [I-D.ietf-quic-transport]. If a QUIC endpoint determines | |||
that the PMTU between any pair of local and remote IP addresses has | that the PMTU between any pair of local and remote IP addresses has | |||
fallen below an acceptable MPS, it immediately ceases to send QUIC | fallen below the size required for an acceptable MPS, it immediately | |||
packets on the affected path. This could result in termination of | ceases to send QUIC packets on the affected path. This could result | |||
the connection if an alternative path cannot be found | in termination of the connection if an alternative path cannot be | |||
[I-D.ietf-quic-transport]. | found [I-D.ietf-quic-transport]. | |||
6.3.1. Initial Connectivity | 6.3.1. Initial Connectivity | |||
The base protocol is specified in [I-D.ietf-quic-transport]. This | The base protocol is specified in [I-D.ietf-quic-transport]. This | |||
provides an acknowledged PL. A sender can therefore enter the BASE | provides an acknowledged PL. A sender can therefore enter the BASE | |||
state as soon as connectivity has been confirmed. | state as soon as connectivity has been confirmed. | |||
6.3.2. Sending QUIC Probe Packets | 6.3.2. Sending QUIC Probe Packets | |||
A probe packet consists of a QUIC Header and a payload containing | A probe packet consists of a QUIC Header and a payload containing | |||
PADDING Frames and a PING Frame. PADDING Frames are a single octet | PADDING Frames and a PING Frame. PADDING Frames are a single octet | |||
(0x00) and several of these can be used to create a probe packet of | (0x00) and several of these can be used to create a probe packet of | |||
size PROBED_SIZE. QUIC provides an acknowledged PL. A sender can | size PROBED_SIZE. QUIC provides an acknowledged PL. A sender can | |||
therefore enter the BASE state as soon as connectivity has been | therefore enter the BASE state as soon as connectivity has been | |||
confirmed. | confirmed. | |||
The current specification of QUIC sets the following: | The current specification of QUIC sets the following: | |||
* BASE_PMTU: 1280. A QUIC sender pads initial packets to confirm | * BASE_PLPMTU: A QUIC sender pads initial packets to confirm the | |||
the path can support packets of the required size. | path can support packets of the required size, this sets the | |||
BASE_PLPMTU and MIN_PLPMTU. | ||||
* MIN_PMTU: 1280 bytes. A QUIC sender that determines the PLPMTU | * MIN_PLPMTU: A QUIC sender that determines the MIN_PLPMTU has | |||
has fallen below 1280 bytes MUST immediately stop sending on the | fallen MUST immediately stop sending on the affected path. | |||
affected path. | ||||
6.3.3. Validating the Path with QUIC | 6.3.3. Validating the Path with QUIC | |||
QUIC provides an acknowledged PL. A sender therefore MUST NOT | QUIC provides an acknowledged PL. A sender therefore MUST NOT | |||
implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
6.3.4. Handling of PTB Messages by QUIC | 6.3.4. Handling of PTB Messages by QUIC | |||
QUIC validates ICMP PTB messages. In addition to UDP Port | QUIC validates ICMP PTB messages. In addition to UDP Port | |||
validation, QUIC can validate an ICMP message by using other PL | validation, QUIC can validate an ICMP message by using other PL | |||
information (e.g., validation of connection IDs in the quoted packet | information (e.g., validation of connection identifiers (CIDs) in the | |||
of any received ICMP message). | quoted packet of any received ICMP message). | |||
7. Acknowledgements | 7. Acknowledgments | |||
This work was partially funded by the European Union's Horizon 2020 | This work was partially funded by the European Union's Horizon 2020 | |||
research and innovation programme under grant agreement No. 644334 | research and innovation programme under grant agreement No. 644334 | |||
(NEAT). The views expressed are solely those of the author(s). | (NEAT). The views expressed are solely those of the author(s). | |||
Thanks to all that have commented or contributed, the TSVWG and QUIC | Thanks to all that have commented or contributed, the TSVWG and QUIC | |||
working groups, and Mathew Calder and Julius Flohr for providing | working groups, and Mathew Calder and Julius Flohr for providing | |||
early implementations. | early implementations. | |||
8. IANA Considerations | 8. IANA Considerations | |||
skipping to change at page 36, line 41 ¶ | skipping to change at page 37, line 43 ¶ | |||
hole data by indicating a size larger than supported by the path. | hole data by indicating a size larger than supported by the path. | |||
Parallel forwarding paths SHOULD be considered. Section 5.4 | Parallel forwarding paths SHOULD be considered. Section 5.4 | |||
identifies the need for robustness in the method because the path | identifies the need for robustness in the method because the path | |||
information might be inconsistent. | information might be inconsistent. | |||
A node performing DPLPMTUD could experience conflicting information | A node performing DPLPMTUD could experience conflicting information | |||
about the size of supported probe packets. This could occur when | about the size of supported probe packets. This could occur when | |||
multiple paths are concurrently in use and these exhibit a | multiple paths are concurrently in use and these exhibit a | |||
different PMTU. If not considered, this could result in packets not | different PMTU. If not considered, this could result in packets not | |||
being delivered (black holed) when the PLPMTU is larger than the | being delivered (black holed) when the PLPMTU results in a packet | |||
smallest actual PMTU. | larger than the smallest actual PMTU. | |||
DPLPMTUD methods can introduce padding data to inflate the length of | DPLPMTUD methods can introduce padding data to inflate the length of | |||
the datagram to the total size required for a probe packet. The | the datagram to the total size required for a probe packet. The | |||
total size of a probe packet includes all headers and padding added | total size of a probe packet includes all headers and padding added | |||
to the payload data being sent (e.g., including security-related | to the payload data being sent (e.g., including security-related | |||
fields such as an AEAD tag and TLS record layer padding). The value | fields such as an AEAD tag and TLS record layer padding). The value | |||
of the padding data does not influence the DPLPMTUD search algorithm, | of the padding data does not influence the DPLPMTUD search algorithm, | |||
and therefore needs to be set consistent with the policy of the PL. | and therefore needs to be set consistent with the policy of the PL. | |||
If a PL can make use of cryptographic confidentiality or data- | If a PL can make use of cryptographic confidentiality or data- | |||
integrity mechanisms, then adding anything (e.g., padding) for | integrity mechanisms, then the design ought to avoid adding anything | |||
DPLPMTUD that is not protected by those cryptographic mechanisms is | (e.g., padding) to DPLPMTUD probe packets that is not also protected | |||
an anti-pattern to be avoided. | by those cryptographic mechanisms. | |||
10. References | 10. References | |||
10.1. Normative References | 10.1. Normative References | |||
[I-D.ietf-quic-transport] | [I-D.ietf-quic-transport] | |||
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | |||
and Secure Transport", Work in Progress, Internet-Draft, | and Secure Transport", Work in Progress, Internet-Draft, | |||
draft-ietf-quic-transport-27, 21 February 2020, | draft-ietf-quic-transport-27, 21 February 2020, | |||
<http://www.ietf.org/internet-drafts/draft-ietf-quic- | <http://www.ietf.org/internet-drafts/draft-ietf-quic- | |||
skipping to change at page 42, line 51 ¶ | skipping to change at page 44, line 4 ¶ | |||
* Removed section on DPLPMTUD with UDP Options. | * Removed section on DPLPMTUD with UDP Options. | |||
* Shortened the description of phases. | * Shortened the description of phases. | |||
Working group draft -09: | Working group draft -09: | |||
* Remove final mention of UDP Options | * Remove final mention of UDP Options | |||
* Add Initial Connectivity sections to each PL | * Add Initial Connectivity sections to each PL | |||
* Add to disable outgoing pmtu enforcement of packets | * Add to disable outgoing pmtu enforcement of packets | |||
Working group draft -10: | Working group draft -10: | |||
* Address comments from Lars Eggert | * Address comments from Lars Eggert | |||
* Reinforce that PROBE_COUNT is successive attempts to probe for any | * Reinforce that PROBE_COUNT is successive attempts to probe for any | |||
size | size | |||
* Redefine MAx_PROBES to 3 | * Redefine MAX_PROBES to 3 | |||
* Address PTB_SIZE of 0 or less than MIN_PMTU | * Address PTB_SIZE of 0 or less than MIN_PLPMTU | |||
Working group draft -11: | Working group draft -11: | |||
* Restore a sentence removed in previous rev | * Restore a sentence removed in previous rev | |||
* De-acronymise QUIC | * De-acronymise QUIC | |||
* Address some nits | * Address some nits | |||
Working group draft -12: | Working group draft -12: | |||
* Add TSVWG, QUIC and implementers to acknowledgements | * Add TSVWG, QUIC and implementers to acknowledgments | |||
* Shorten a diagram line. | * Shorten a diagram line. | |||
* Address nits from Julius and Wes. | * Address nits from Julius and Wes. | |||
* Be clearer when talking about IP layer caches | * Be clearer when talking about IP layer caches | |||
Working group draft -13, -14: | ||||
* Updated after WGLC. | ||||
Working group draft -15: | ||||
* Updated after AD evaluation and prepared for IETF-LC. | ||||
Working group draft -16: | ||||
* Updated text after SECDIR review. | ||||
Working group draft -17: | ||||
* Updated text after GENART and IETF-LC. | ||||
* Renamed BASE_MTU to BASE_PLPMTU, and MIN and MAX PMTU to PLPMTU | ||||
(because these are about a base for the PLPMTU), and ensured | ||||
consistent separation of PMTU and PLPMTU. | ||||
* Adopted US-style English throughout. | ||||
Authors' Addresses | Authors' Addresses | |||
Godred Fairhurst | Godred Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering | School of Engineering | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen | Aberdeen | |||
AB24 3UE | AB24 3UE | |||
United Kingdom | United Kingdom | |||
End of changes. 108 change blocks. | ||||
296 lines changed or deleted | 352 lines changed or added | |||
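
The SCTP text in Section 6.2 above says the PAD chunk size is computed by reducing the probe size by the IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT request and the PAD chunk header. The Python sketch below illustrates that arithmetic only. The function name, the parameter names and the fixed IP header sizes (no IPv4 options, no IPv6 extension headers) are assumptions made for this example and do not come from the draft; the sketch also ignores SCTP's 4-byte chunk alignment.

```python
# Illustrative sketch, not part of the draft: sizing the PAD chunk payload
# for an SCTP DPLPMTUD probe that bundles a HEARTBEAT chunk and a PAD chunk.

IPV4_HEADER = 20         # bytes; assumes no IPv4 options
IPV6_HEADER = 40         # bytes; assumes no IPv6 extension headers
SCTP_COMMON_HEADER = 12  # ports (4) + verification tag (4) + checksum (4)
PAD_CHUNK_HEADER = 4     # chunk type (1) + flags (1) + length (2)


def pad_payload_size(probe_size, heartbeat_chunk_len, ipv6=False):
    """Return the number of PAD payload bytes needed so that the complete
    datagram is exactly probe_size bytes long.

    probe_size          -- size of the complete datagram, including IP header
    heartbeat_chunk_len -- total length of the HEARTBEAT chunk (hypothetical
                           parameter: chunk header plus Heartbeat Information)
    ipv6                -- True when the probe is sent over IPv6
    """
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    overhead = (ip_header + SCTP_COMMON_HEADER
                + heartbeat_chunk_len + PAD_CHUNK_HEADER)
    if probe_size < overhead:
        # The requested probe cannot even hold the fixed headers.
        raise ValueError("probe size is smaller than the fixed overhead")
    return probe_size - overhead
```

For example, under these assumptions a 1400-byte IPv4 probe carrying a 48-byte HEARTBEAT chunk leaves 1400 - 20 - 12 - 48 - 4 bytes of PAD payload.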