draft-ietf-tsvwg-datagram-plpmtud-17.txt | draft-ietf-tsvwg-datagram-plpmtud-18.txt | |||
---|---|---|---|---|
Internet Engineering Task Force G. Fairhurst | Internet Engineering Task Force G. Fairhurst | |||
Internet-Draft T. Jones | Internet-Draft T. Jones | |||
Updates: 4821, 4960, 6951, 8085, 8261 (if University of Aberdeen | Updates: 4821, 4960, 6951, 8085, 8261 (if University of Aberdeen | |||
approved) M. Tuexen | approved) M. Tuexen | |||
Intended status: Standards Track I. Ruengeler | Intended status: Standards Track I. Ruengeler | |||
Expires: 24 September 2020 T. Voelker | Expires: 4 October 2020 T. Voelker | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
23 March 2020 | 2 April 2020 | |||
Packetization Layer Path MTU Discovery for Datagram Transports | Packetization Layer Path MTU Discovery for Datagram Transports | |||
draft-ietf-tsvwg-datagram-plpmtud-17 | draft-ietf-tsvwg-datagram-plpmtud-18 | |||
Abstract | Abstract | |||
This document describes a robust method for Path MTU Discovery | This document describes a robust method for Path MTU Discovery | |||
(PMTUD) for datagram Packetization Layers (PLs). It describes an | (PMTUD) for datagram Packetization Layers (PLs). It describes an | |||
extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | |||
MTU Discovery for IPv4 and IPv6. The method allows a PL, or a | MTU Discovery for IPv4 and IPv6. The method allows a PL, or a | |||
datagram application that uses a PL, to discover whether a network | datagram application that uses a PL, to discover whether a network | |||
path can support the current size of datagram. This can be used to | path can support the current size of datagram. This can be used to | |||
detect and reduce the message size when a sender encounters a packet | detect and reduce the message size when a sender encounters a packet | |||
skipping to change at page 2, line 15 ¶ | skipping to change at page 2, line 15 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on 24 September 2020. | This Internet-Draft will expire on 4 October 2020. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents (https://trustee.ietf.org/ | |||
license-info) in effect on the date of publication of this document. | license-info) in effect on the date of publication of this document. | |||
Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
skipping to change at page 2, line 41 ¶ | skipping to change at page 2, line 41 ¶ | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 4 | 1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 4 | |||
1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 6 | 1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 6 | |||
1.3. Path MTU Discovery for Datagram Services . . . . . . . . 7 | 1.3. Path MTU Discovery for Datagram Services . . . . . . . . 7 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 10 | 3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 10 | |||
4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 13 | 4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 13 | |||
4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 13 | 4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 13 | |||
4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 14 | 4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 15 | |||
4.3. Black Hole Detection . . . . . . . . . . . . . . . . . . 15 | 4.3. Black Hole Detection . . . . . . . . . . . . . . . . . . 15 | |||
4.4. The Maximum Packet Size (MPS) . . . . . . . . . . . . . . 16 | 4.4. The Maximum Packet Size (MPS) . . . . . . . . . . . . . . 16 | |||
4.5. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 17 | 4.5. Disabling the Effect of PMTUD . . . . . . . . . . . . . . 17 | |||
4.6. Response to PTB Messages . . . . . . . . . . . . . . . . 17 | 4.6. Response to PTB Messages . . . . . . . . . . . . . . . . 17 | |||
4.6.1. Validation of PTB Messages . . . . . . . . . . . . . 17 | 4.6.1. Validation of PTB Messages . . . . . . . . . . . . . 17 | |||
4.6.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 18 | 4.6.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 18 | |||
5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 20 | 5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 20 | |||
5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 20 | 5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 21 | |||
5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 21 | 5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 21 | 5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 22 | |||
5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 22 | 5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 22 | |||
5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 23 | 5.1.4. Overview of DPLPMTUD Phases . . . . . . . . . . . . . 23 | |||
5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 25 | 5.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 25 | |||
5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 28 | 5.3. Search to Increase the PLPMTU . . . . . . . . . . . . . . 28 | |||
5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 28 | 5.3.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 28 | |||
5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 29 | 5.3.2. Selection of Probe Sizes . . . . . . . . . . . . . . 29 | |||
5.3.3. Resilience to Inconsistent Path Information . . . . . 29 | 5.3.3. Resilience to Inconsistent Path Information . . . . . 29 | |||
5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 30 | 5.4. Robustness to Inconsistent Paths . . . . . . . . . . . . 30 | |||
6. Specification of Protocol-Specific Methods . . . . . . . . . 30 | 6. Specification of Protocol-Specific Methods . . . . . . . . . 30 | |||
6.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 30 | 6.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 30 | |||
skipping to change at page 3, line 46 ¶ | skipping to change at page 3, line 46 ¶ | |||
6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 35 | 6.3.1. Initial Connectivity . . . . . . . . . . . . . . . . 35 | |||
6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 35 | 6.3.2. Sending QUIC Probe Packets . . . . . . . . . . . . . 35 | |||
6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 36 | 6.3.3. Validating the Path with QUIC . . . . . . . . . . . . 36 | |||
6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 36 | 6.3.4. Handling of PTB Messages by QUIC . . . . . . . . . . 36 | |||
7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 36 | 7. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 36 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 36 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 36 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 38 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 38 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . 38 | 10.1. Normative References . . . . . . . . . . . . . . . . . . 38 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . 39 | 10.2. Informative References . . . . . . . . . . . . . . . . . 39 | |||
Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 40 | Appendix A. Revision Notes . . . . . . . . . . . . . . . . . . . 41 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 45 | |||
1. Introduction | 1. Introduction | |||
The IETF has specified datagram transport using UDP, SCTP, and DCCP, | The IETF has specified datagram transport using UDP, SCTP, and DCCP, | |||
as well as protocols layered on top of these transports (e.g., SCTP/ | as well as protocols layered on top of these transports (e.g., SCTP/ | |||
UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP | UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP | |||
network layer. This document describes a robust method for Path MTU | network layer. This document describes a robust method for Path MTU | |||
Discovery (PMTUD) that can be used with these transport protocols (or | Discovery (PMTUD) that can be used with these transport protocols (or | |||
the applications that use their transport service) to discover an | the applications that use their transport service) to discover an | |||
skipping to change at page 6, line 24 ¶ | skipping to change at page 6, line 24 ¶ | |||
validate the message, because validation depends on information | validate the message, because validation depends on information | |||
about the active transport flows at an endpoint node (e.g., the | about the active transport flows at an endpoint node (e.g., the | |||
socket/address pairs being used, and other protocol header | socket/address pairs being used, and other protocol header | |||
information). | information). | |||
* When a packet is encapsulated/tunneled over an encrypted | * When a packet is encapsulated/tunneled over an encrypted | |||
transport, the tunnel/encapsulation ingress might have | transport, the tunnel/encapsulation ingress might have | |||
insufficient context, or computational power, to reconstruct the | insufficient context, or computational power, to reconstruct the | |||
transport header that would be needed to perform validation. | transport header that would be needed to perform validation. | |||
* When an ICMP message is generated by a router in a network segment | ||||
that has inserted a header into a packet, the quoted packet could | ||||
contain additional protocol header information that was not | ||||
included in the original sent packet, and which the PL sender does | ||||
not process or may not know how to process. This could disrupt | ||||
the ability of the sender to validate this PTB message. | ||||
* A Network Address Translation (NAT) device that translates a | * A Network Address Translation (NAT) device that translates a | |||
packet header, ought to also translate ICMP messages and update | packet header, ought to also translate ICMP messages and update | |||
the ICMP quoted packet [RFC5508] in that message. If this is not | the ICMP quoted packet [RFC5508] in that message. If this is not | |||
correctly translated then the sender would not be able to | correctly translated then the sender would not be able to | |||
associate the message with the PL that originated the packet, and | associate the message with the PL that originated the packet, and | |||
hence this ICMP message cannot be validated. | hence this ICMP message cannot be validated. | |||
1.2. Packetization Layer Path MTU Discovery | 1.2. Packetization Layer Path MTU Discovery | |||
The term Packetization Layer (PL) has been introduced to describe the | The term Packetization Layer (PL) has been introduced to describe the | |||
skipping to change at page 9, line 20 ¶ | skipping to change at page 9, line 23 ¶ | |||
Effective PMTU: The Effective PMTU is the current estimated value | Effective PMTU: The Effective PMTU is the current estimated value | |||
for PMTU that is used by a PMTUD. This is equivalent to the | for PMTU that is used by a PMTUD. This is equivalent to the | |||
PLPMTU derived by PLPMTUD plus the size of any headers added below | PLPMTU derived by PLPMTUD plus the size of any headers added below | |||
the PL, including the IP layer headers. | the PL, including the IP layer headers. | |||
EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | |||
[RFC1122] as "the maximum IP datagram size that may be sent, for a | [RFC1122] as "the maximum IP datagram size that may be sent, for a | |||
particular combination of IP source and destination addresses...". | particular combination of IP source and destination addresses...". | |||
EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in | EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in | |||
[RFC1122] as the largest datagram size that can be reassembled by | [RFC1122] as "the largest datagram size that can be reassembled". | |||
EMTU_R (Effective MTU to receive). | ||||
Link: A Link is a communication facility or medium over which nodes | Link: A Link is a communication facility or medium over which nodes | |||
can communicate at the link layer, i.e., a layer below the IP | can communicate at the link layer, i.e., a layer below the IP | |||
layer. Examples are Ethernet LANs and Internet (or higher) layer | layer. Examples are Ethernet LANs and Internet (or higher) layer | |||
tunnels. | tunnels. | |||
Link MTU: The Link Maximum Transmission Unit (MTU) is the size in | Link MTU: The Link Maximum Transmission Unit (MTU) is the size in | |||
bytes of the largest IP packet, including the IP header and | bytes of the largest IP packet, including the IP header and | |||
payload, that can be transmitted over a link. Note that this | payload, that can be transmitted over a link. Note that this | |||
could more properly be called the IP MTU, to be consistent with | could more properly be called the IP MTU, to be consistent with | |||
skipping to change at page 9, line 47 ¶ | skipping to change at page 9, line 49 ¶ | |||
[RFC4821], that states "All links MUST enforce their MTU: links | [RFC4821], that states "All links MUST enforce their MTU: links | |||
that might non-deterministically deliver packets that are larger | that might non-deterministically deliver packets that are larger | |||
than their rated MTU MUST consistently discard such packets." | than their rated MTU MUST consistently discard such packets." | |||
MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU that | MAX_PLPMTU: The MAX_PLPMTU is the largest size of PLPMTU that | |||
DPLPMTUD will attempt to use. | DPLPMTUD will attempt to use. | |||
MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | MIN_PLPMTU: The MIN_PLPMTU is the smallest size of PLPMTU that | |||
DPLPMTUD will attempt to use. | DPLPMTUD will attempt to use. | |||
MPS: MPS: The Maximum Packet Size (MPS) is the largest size of | MPS: The Maximum Packet Size (MPS) is the largest size of | |||
application data block that can be sent across a network path by a | application data block that can be sent across a network path by a | |||
PL using a single Datagram. | PL using a single Datagram. | |||
Packet: A Packet is the IP header plus the IP payload. | Packet: A Packet is the IP header plus the IP payload. | |||
Packetization Layer (PL): The PL is a layer of the network stack | Packetization Layer (PL): The PL is a layer of the network stack | |||
that places data into packets and performs transport protocol | that places data into packets and performs transport protocol | |||
functions. Examples of a PL include: TCP, SCTP, SCTP over DTLS or | functions. Examples of a PL include: TCP, SCTP, SCTP over DTLS or | |||
QUIC. | QUIC. | |||
skipping to change at page 11, line 18 ¶ | skipping to change at page 11, line 21 ¶ | |||
able to transmit a packet larger than the PLPMTU. This is used | able to transmit a packet larger than the PLPMTU. This is used | |||
to send a probe packet. In IPv4, a probe packet MUST be sent | to send a probe packet. In IPv4, a probe packet MUST be sent | |||
with the Don't Fragment (DF) bit set in the IP header, and | with the Don't Fragment (DF) bit set in the IP header, and | |||
without network layer endpoint fragmentation. In IPv6, a probe | without network layer endpoint fragmentation. In IPv6, a probe | |||
packet is always sent without source fragmentation (as specified | packet is always sent without source fragmentation (as specified | |||
in section 5.4 of [RFC8201]). | in section 5.4 of [RFC8201]). | |||
3. Reception feedback: The destination PL endpoint is REQUIRED to | 3. Reception feedback: The destination PL endpoint is REQUIRED to | |||
provide a feedback method that indicates to the DPLPMTUD sender | provide a feedback method that indicates to the DPLPMTUD sender | |||
when a probe packet has been received by the destination PL | when a probe packet has been received by the destination PL | |||
endpoint. | endpoint. Section 6 provides examples of how a PL can provide | |||
this acknowledgment of received probe packets. | ||||
4. Probe loss recovery: It is RECOMMENDED to use probe packets that | 4. Probe loss recovery: It is RECOMMENDED to use probe packets that | |||
do not carry any user data that would require retransmission if | do not carry any user data that would require retransmission if | |||
lost. Most datagram transports permit this. If a probe packet | lost. Most datagram transports permit this. If a probe packet | |||
contains user data requiring retransmission in case of loss, the | contains user data requiring retransmission in case of loss, the | |||
PL (or layers above) are REQUIRED to arrange any retransmission/ | PL (or layers above) are REQUIRED to arrange any retransmission/ | |||
repair of any resulting loss. The PL is REQUIRED to be robust | repair of any resulting loss. The PL is REQUIRED to be robust | |||
in the case where probe packets are lost due to other reasons | in the case where probe packets are lost due to other reasons | |||
(including link transmission error, congestion). | (including link transmission error, congestion). | |||
skipping to change at page 12, line 23 ¶ | skipping to change at page 12, line 26 ¶ | |||
the interval between probe packets MUST be at least one RTT. If | the interval between probe packets MUST be at least one RTT. If | |||
transmission of probe packets is limited by the congestion | transmission of probe packets is limited by the congestion | |||
controller, this could result in transmission of probe packets | controller, this could result in transmission of probe packets | |||
being delayed or suspended during congestion. | being delayed or suspended during congestion. | |||
8. Loss of a probe packet SHOULD NOT be treated as an indication of | 8. Loss of a probe packet SHOULD NOT be treated as an indication of | |||
congestion and SHOULD NOT trigger a congestion control reaction | congestion and SHOULD NOT trigger a congestion control reaction | |||
[RFC4821], because this could result in unnecessary reduction of | [RFC4821], because this could result in unnecessary reduction of | |||
the sending rate. | the sending rate. | |||
9. An update to the PLPMTU (or MPS) MUST NOT modify the congestion | 9. An update to the PLPMTU (or MPS) MUST NOT increase the | |||
window measured in bytes [RFC4821]. Therefore, an increase in | congestion window measured in bytes [RFC4821]. Therefore, an | |||
the packet size does not cause an increase the data rate in | increase in the packet size does not cause an increase in the | |||
bytes per second. | data rate in bytes per second. | |||
10. Probing and flow control: Flow control at the PL concerns the | 10. A PL that maintains the congestion window in terms of a limit to | |||
the number of outstanding fixed size packets SHOULD adapt this | ||||
limit to compensate for the size of the actual packets. | ||||
11. Probing and flow control: Flow control at the PL concerns the | ||||
end-to-end flow of data using the PL service. This does not | end-to-end flow of data using the PL service. This does not | |||
apply to DPLPMTUD when probe packets use a design that does not | apply to DPLPMTUD when probe packets use a design that does not | |||
carry user data to the remote application. | carry user data to the remote application. | |||
11. Shared PLPMTU state: The PMTU value calculated from the PLPMTU | 12. Shared PLPMTU state: The PMTU value calculated from the PLPMTU | |||
MAY also be stored with the corresponding entry associated with | MAY also be stored with the corresponding entry associated with | |||
the destination in the IP layer cache, and used by other PL | the destination in the IP layer cache, and used by other PL | |||
instances. The specification of PLPMTUD [RFC4821] states: "If | instances. The specification of PLPMTUD [RFC4821] states: "If | |||
PLPMTUD updates the MTU for a particular path, all Packetization | PLPMTUD updates the MTU for a particular path, all Packetization | |||
Layer sessions that share the path representation (as described | Layer sessions that share the path representation (as described | |||
in Section 5.2 of [RFC4821]) SHOULD be notified to make use of | in Section 5.2 of [RFC4821]) SHOULD be notified to make use of | |||
the new MTU". Such methods MUST be robust to the wide variety | the new MTU". Such methods MUST be robust to the wide variety | |||
of underlying network forwarding behaviors. Section 5.2 of | of underlying network forwarding behaviors. Section 5.2 of | |||
[RFC8201] provides guidance on the caching of PMTU information | [RFC8201] provides guidance on the caching of PMTU information | |||
and also the relation to IPv6 flow labels. | and also the relation to IPv6 flow labels. | |||
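The congestion-window items above (an update to the PLPMTU must not increase the window measured in bytes, and a PL that counts the window in fixed-size packets should compensate for the actual packet size) can be illustrated with a minimal sketch. The function name, its parameters, and the one-packet floor are assumptions made for this illustration and are not defined by the draft.

```python
def rescale_packet_cwnd(cwnd_packets: int, old_packet_size: int,
                        new_packet_size: int) -> int:
    """Return a packet-count limit whose byte equivalent does not grow.

    Hypothetical helper: the draft gives the requirement, not this function.
    """
    if cwnd_packets < 1 or old_packet_size <= 0 or new_packet_size <= 0:
        raise ValueError("window must be >= 1 packet and sizes must be positive")
    cwnd_bytes = cwnd_packets * old_packet_size
    # Floor division keeps the rescaled window at or below the old byte count.
    # Assumption: keep a floor of one packet so the flow can still send; in
    # that corner case the byte window can exceed the previous value.
    return max(1, cwnd_bytes // new_packet_size)


if __name__ == "__main__":
    # 10 packets of 1200 bytes is 12000 bytes, which fits 8 packets of 1500.
    print(rescale_packet_cwnd(10, 1200, 1500))
```

With these inputs the limit drops from 10 to 8 packets, so a larger PLPMTU changes the packet size without raising the number of bytes in flight.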
skipping to change at page 14, line 29 ¶ | skipping to change at page 14, line 36 ¶ | |||
Probing using application data and padding data: A probe packet that | Probing using application data and padding data: A probe packet that | |||
contains a data block supplied by an application that is combined | contains a data block supplied by an application that is combined | |||
with padding to inflate the length of the datagram to the size of | with padding to inflate the length of the datagram to the size of | |||
the probe packet. | the probe packet. | |||
Probing using application data: A probe packet that contains a data | Probing using application data: A probe packet that contains a data | |||
block supplied by an application that matches the size of the | block supplied by an application that matches the size of the | |||
probe packet. This method requests the application to issue a | probe packet. This method requests the application to issue a | |||
data block of the desired probe size. | data block of the desired probe size. | |||
A PL that uses a probe packet carrying an application data and needs | A PL that uses a probe packet carrying application data and needs | |||
protection from the loss of this probe packet, could perform | protection from the loss of this probe packet could perform | |||
transport-layer retransmission/repair of the data block (e.g., by | transport-layer retransmission/repair of the data block (e.g., by | |||
retransmission after loss is detected or by duplicating the data | retransmission after loss is detected or by duplicating the data | |||
block in a datagram without the padding data). This retransmited | block in a datagram without the padding data). This retransmitted | |||
data block might possibly need to be sent using a smaller PLPMTU, | data block might possibly need to be sent using a smaller PLPMTU, | |||
which could need the PL to use a smaller packet size to traverse | which could need the PL to use a smaller packet size to traverse | |||
the end-to-end path. (This could utilize endpoint network-layer | the end-to-end path. (This could utilize endpoint network-layer | |||
fragmentation or a PL that can re-segment the data block into multiple datagrams). | fragmentation or a PL that can re-segment the data block into multiple datagrams). | |||
DPLPMTUD MAY choose to use only one of these methods to simplify the | DPLPMTUD MAY choose to use only one of these methods to simplify the | |||
implementation. | implementation. | |||
Probe messages sent by a PL MUST contain enough information to | Probe messages sent by a PL MUST contain enough information to | |||
uniquely identify the probe within Maximum Segment Lifetime (e.g., | uniquely identify the probe within Maximum Segment Lifetime (e.g., | |||
skipping to change at page 15, line 21 ¶ | skipping to change at page 15, line 28 ¶ | |||
A PL that does not acknowledge data reception (e.g., UDP and UDP- | A PL that does not acknowledge data reception (e.g., UDP and UDP- | |||
Lite) is unable itself to detect when the packets that it sends are | Lite) is unable itself to detect when the packets that it sends are | |||
discarded because their size is greater than the actual PMTU. These | discarded because their size is greater than the actual PMTU. These | |||
PLs need to rely on an application protocol to detect this loss. | PLs need to rely on an application protocol to detect this loss. | |||
Section 6 specifies this function for a set of IETF-specified | Section 6 specifies this function for a set of IETF-specified | |||
protocols. | protocols. | |||
4.3. Black Hole Detection | 4.3. Black Hole Detection | |||
The description that follows uses the set of constants defined in | ||||
Section 5.1.2 and variables defined in Section 5.1.3. | ||||
Black Hole Detection is triggered by an indication that the network | Black Hole Detection is triggered by an indication that the network | |||
path could be unable to support the current PLPMTU size. | path could be unable to support the current PLPMTU size. | |||
There are three ways to detect black holes: | There are three ways to detect black holes: | |||
* A validated PTB message can be received that indicates a | * A validated PTB message can be received that indicates a | |||
PL_PTB_SIZE less than the current PLPMTU. A DPLPMTUD method MUST | PL_PTB_SIZE less than the current PLPMTU. A DPLPMTUD method MUST | |||
NOT rely solely on this method. | NOT rely solely on this method. | |||
* A PL can use the DPLPMTUD probing mechanism to periodically | * A PL can use the DPLPMTUD probing mechanism to periodically | |||
skipping to change at page 15, line 48 ¶ | skipping to change at page 16, line 10 ¶ | |||
* A PL can utilize an event that indicates the network path no | * A PL can utilize an event that indicates the network path no | |||
longer sustains the sender's PLPMTU size. This could use a | longer sustains the sender's PLPMTU size. This could use a | |||
mechanism implemented within the PL to detect excessive loss of | mechanism implemented within the PL to detect excessive loss of | |||
data sent with a specific packet size and then conclude that this | data sent with a specific packet size and then conclude that this | |||
excessive loss could be a result of an invalid PLPMTU (as in | excessive loss could be a result of an invalid PLPMTU (as in | |||
PLPMTUD for TCP [RFC4821]). | PLPMTUD for TCP [RFC4821]). | |||
A PL MAY inhibit sending probe packets when no application data has | A PL MAY inhibit sending probe packets when no application data has | |||
been sent since the previous probe packet. A PL preferring to use an | been sent since the previous probe packet. A PL preferring to use an | |||
up-to-date PLPMTU once user data is sent again, MAY choose to | up-to-date PLPMTU once user data is sent again MAY choose to continue | |||
continue PLPMTU discovery for each path. However, this could result | PLPMTU discovery for each path. However, this could result in | |||
in additional packets being sent. | additional packets being sent. | |||
When the method detects the current PLPMTU is not supported, DPLPMTUD | When the method detects the current PLPMTU is not supported, DPLPMTUD | |||
sets a lower PLPMTU, and sets a lower MPS. The PL then confirms that | sets a lower PLPMTU, and sets a lower MPS. The PL then confirms that | |||
the new PLPMTU can be successfully used across the path. A probe | the new PLPMTU can be successfully used across the path. A probe | |||
packet could need to have a size less than the size of the data block | packet could need to have a size less than the size of the data block | |||
generated by the application. | generated by the application. | |||
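The black hole handling described in Section 4.3 can be sketched roughly as follows: a validated PTB message or excessive loss at the current size triggers a reduction of the PLPMTU and MPS, after which the new size still has to be confirmed. The class, the field names, the loss threshold of 3, the fixed fallback size, and the derivation of the MPS by subtracting a PL overhead are all assumptions made for this illustration; the draft defines its own constants and variables in Sections 5.1.2 and 5.1.3.

```python
from dataclasses import dataclass


@dataclass
class BlackHoleDetector:
    plpmtu: int                   # current PLPMTU in bytes
    fallback_plpmtu: int          # hypothetical: size to fall back to
    pl_overhead: int = 0          # hypothetical: PL header bytes removed to get the MPS
    loss_threshold: int = 3       # hypothetical: consecutive losses seen as a black hole
    consecutive_losses: int = 0
    needs_confirmation: bool = False

    @property
    def mps(self) -> int:
        # Assumption: the MPS is the PLPMTU minus the PL's own overhead.
        return self.plpmtu - self.pl_overhead

    def on_validated_ptb(self, pl_ptb_size: int) -> None:
        # A validated PTB reporting less than the PLPMTU signals a black hole.
        if pl_ptb_size < self.plpmtu:
            self._reduce()

    def on_packet_acked(self, size: int) -> None:
        # An acknowledged full-size packet confirms the current PLPMTU.
        if size >= self.plpmtu:
            self.consecutive_losses = 0
            self.needs_confirmation = False

    def on_packet_lost(self, size: int) -> None:
        # Only losses above the fallback size hint at an unsupported PLPMTU.
        if size <= self.fallback_plpmtu:
            return
        self.consecutive_losses += 1
        if self.consecutive_losses >= self.loss_threshold:
            self._reduce()

    def _reduce(self) -> None:
        # Lower the PLPMTU (and with it the MPS); the new size must be confirmed.
        self.plpmtu = min(self.plpmtu, self.fallback_plpmtu)
        self.consecutive_losses = 0
        self.needs_confirmation = True
```

A sender would feed loss and acknowledgment events from its PL into such an object and stop using the old MPS as soon as `needs_confirmation` is set. The sketch deliberately does not rely on PTB messages alone, in line with the first bullet of the list above.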
4.4. The Maximum Packet Size (MPS) | 4.4. The Maximum Packet Size (MPS) | |||
The result of probing determines a usable PLPMTU, which is used to | The result of probing determines a usable PLPMTU, which is used to | |||
skipping to change at page 17, line 4 ¶ | skipping to change at page 17, line 14 ¶ | |||
If DPLPMTUD results in a change to the MPS, the application needs to | If DPLPMTUD results in a change to the MPS, the application needs to | |||
adapt to the new MPS. A particular case can arise when packets have | adapt to the new MPS. A particular case can arise when packets have | |||
been sent with a size less than the MPS and the PLPMTU was | been sent with a size less than the MPS and the PLPMTU was | |||
subsequently reduced. If these packets are lost, the PL MAY segment | subsequently reduced. If these packets are lost, the PL MAY segment | |||
the data using the new MPS. If a PL is unable to re-segment a | the data using the new MPS. If a PL is unable to re-segment a | |||
previously sent datagram (e.g., [RFC4960]), then the sender either | previously sent datagram (e.g., [RFC4960]), then the sender either | |||
discards the datagram or could perform retransmission using network- | discards the datagram or could perform retransmission using network- | |||
layer fragmentation to form multiple IP packets not larger than the | layer fragmentation to form multiple IP packets not larger than the | |||
PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is | PLPMTU. For IPv4, the use of endpoint fragmentation by the sender is | |||
preferred over clearing the DF-bit in the IPv4 header. Operational | preferred over clearing the DF bit in the IPv4 header. Operational | |||
experience reveals that IP fragmentation can reduce the reliability | experience reveals that IP fragmentation can reduce the reliability | |||
of Internet communication [I-D.ietf-intarea-frag-fragile], which may | of Internet communication [I-D.ietf-intarea-frag-fragile], which may | |||
reduce the success of retransmission. | reduce the success of retransmission. | |||
4.5. Disabling the Effect of PMTUD | 4.5. Disabling the Effect of PMTUD | |||
A PL implementing this specification MUST suspend network layer | A PL implementing this specification MUST suspend network layer | |||
processing of outgoing packets that enforces a PMTU | processing of outgoing packets that enforces a PMTU | |||
[RFC1191][RFC8201] for each flow utilizing DPLPMTUD, and instead use | [RFC1191][RFC8201] for each flow utilizing DPLPMTUD, and instead use | |||
DPLPMTUD to control the size of packets that are sent by a flow. | DPLPMTUD to control the size of packets that are sent by a flow. | |||
skipping to change at page 17, line 38 ¶ | skipping to change at page 17, line 48 ¶ | |||
both of which are referred to as PTB messages in this document. | both of which are referred to as PTB messages in this document. | |||
4.6.1. Validation of PTB Messages | 4.6.1. Validation of PTB Messages | |||
This section specifies utilization of PTB messages. | This section specifies utilization of PTB messages. | |||
* A simple implementation MAY ignore received PTB messages and in | * A simple implementation MAY ignore received PTB messages and in | |||
this case the PLPMTU is not updated when a PTB message is | this case the PLPMTU is not updated when a PTB message is | |||
received. | received. | |||
* An implementation that supports PTB messages MUST validate | * A PL that supports PTB messages MUST validate these messages | |||
messages before they are further processed. | before they are further processed. | |||
A PL that receives a PTB message from a router or middlebox, performs | A PL that receives a PTB message from a router or middlebox performs | |||
ICMP validation as specified in Section 5.2 of [RFC8085][RFC8201]. | ICMP validation as specified in Section 5.2 of [RFC8085][RFC8201]. | |||
Because DPLPMTUD operates at the PL, the PL needs to check that each | Because DPLPMTUD operates at the PL, the PL needs to check that each | |||
received PTB message is received in response to a packet transmitted | received PTB message is received in response to a packet transmitted | |||
by the endpoint PL performing DPLPMTUD. | by the endpoint PL performing DPLPMTUD. | |||
The PL MUST check the protocol information in the quoted packet | The PL MUST check the protocol information in the quoted packet | |||
carried in an ICMP PTB message payload to validate the message | carried in an ICMP PTB message payload to validate the message | |||
originated from the sending node. This validation includes | originated from the sending node. This validation includes | |||
determining that the combination of the IP addresses, the protocol, | determining that the combination of the IP addresses, the protocol, | |||
the source port and destination port match those returned in the | the source port and destination port match those returned in the | |||
quoted packet - this is also necessary for the PTB message to be | quoted packet - this is also necessary for the PTB message to be | |||
passed to the corresponding PL. | passed to the corresponding PL. | |||
The validation SHOULD utilize information that it is not simple for | The validation SHOULD utilize information that it is not simple for | |||
an off-path attacker to determine [RFC8085]. For example, by | an off-path attacker to determine [RFC8085]. For example, it could | |||
checking the value of a protocol header field known only to the two | check the value of a protocol header field known only to the two PL | |||
PL endpoints. A datagram application that uses well-known source and | endpoints. A datagram application that uses well-known source and | |||
destination ports ought to also rely on other information to complete | destination ports ought to also rely on other information to complete | |||
this validation. | this validation. | |||
These checks are intended to provide protection from packets that | These checks are intended to provide protection from packets that | |||
originate from a node that is not on the network path. A PTB message | originate from a node that is not on the network path. A PTB message | |||
that does not complete the validation MUST NOT be further utilized by | that does not complete the validation MUST NOT be further utilized by | |||
the DPLPMTUD method. | the DPLPMTUD method. | |||
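The validation steps above could be sketched as follows. This is an illustration under stated assumptions: the FlowKey type, the validate_ptb function, and the idea of a per-flow token quoted back in the PTB payload are hypothetical names introduced here, and parsing of the ICMP message itself is out of scope.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowKey:
    """Addresses, protocol and ports identifying one flow (hypothetical type)."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def validate_ptb(quoted: FlowKey, flow: FlowKey,
                 quoted_token: Optional[bytes] = None,
                 expected_token: Optional[bytes] = None) -> bool:
    """Return True only if the packet quoted in a PTB matches this flow.

    The flow check is the MUST-level comparison of addresses, protocol and
    ports. The token check stands in for the SHOULD-level use of a header
    field that an off-path attacker cannot easily guess.
    """
    if quoted != flow:
        return False
    if expected_token is not None:
        if quoted_token is None:
            return False
        return hmac.compare_digest(quoted_token, expected_token)
    return True
```

A PTB message for which validate_ptb returns False would not be passed any further into DPLPMTUD.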
PTB messages that have been validated MAY be utilized by the DPLPMTUD | PTB messages that have been validated MAY be utilized by the DPLPMTUD | |||
algorithm, but MUST NOT be used directly to set the PLPMTU. The | algorithm, but MUST NOT be used directly to set the PLPMTU. The | |||
skipping to change at page 18, line 48 ¶ | skipping to change at page 19, line 11 ¶ | |||
This section provides a summary of how PTB messages can be utilized. | This section provides a summary of how PTB messages can be utilized. | |||
This processing depends on the PL_PTB_SIZE and the current value of a | This processing depends on the PL_PTB_SIZE and the current value of a | |||
set of variables: | set of variables: | |||
PL_PTB_SIZE < MIN_PLPMTU | PL_PTB_SIZE < MIN_PLPMTU | |||
* Invalid PL_PTB_SIZE, see Section 4.6.1. | * Invalid PL_PTB_SIZE, see Section 4.6.1. | |||
* PTB message ought to be discarded without further processing | * PTB message ought to be discarded without further processing | |||
(i.e., PLPMTU is not modified). | (i.e., PLPMTU is not modified). | |||
* The information could be utilized as an input to a trigger that | * The information could be utilized as an input that triggers | |||
would enable a resilience mode. | enabling a resilience mode (see Section 5.3.3). | |||
MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU | MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU | |||
* A robust PL MAY enter an error state (see Section 5.2) for an | * A robust PL MAY enter an error state (see Section 5.2) for an | |||
IPv4 path when the PL_PTB_SIZE reported in the PTB message is | IPv4 path when the PL_PTB_SIZE reported in the PTB message is | |||
larger than or equal to 68 bytes [RFC0791] and when this is | larger than or equal to 68 bytes [RFC0791] and when this is | |||
less than the BASE_PLPMTU. | less than the BASE_PLPMTU. | |||
* A robust PL MAY enter an error state (see Section 5.2) for an | * A robust PL MAY enter an error state (see Section 5.2) for an | |||
IPv6 path when the PL_PTB_SIZE reported in the PTB message is | IPv6 path when the PL_PTB_SIZE reported in the PTB message is | |||
larger than or equal to 1280 bytes [RFC8200] and when this is | larger than or equal to 1280 bytes [RFC8200] and when this is | |||
less than the BASE_PLPMTU. | less than the BASE_PLPMTU. | |||
PL_PTB_SIZE = PLPMTU | ||||
* Completes the search for a larger PLPMTU. | ||||
PL_PTB_SIZE > PROBED_SIZE | ||||
* Inconsistent network signal. | ||||
* PTB message ought to be discarded without further processing | ||||
(i.e., PLPMTU is not modified). | ||||
* The information could be utilized as an input to trigger | ||||
enabling a resilience mode. | ||||
BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU | BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU | |||
* This could be an indication of a black hole. The PLPMTU SHOULD | * This could be an indication of a black hole. The PLPMTU SHOULD | |||
be set to BASE_PLPMTU (the PLPMTU is reduced to the BASE_PLPMTU | be set to BASE_PLPMTU (the PLPMTU is reduced to the BASE_PLPMTU | |||
to avoid unnecessary packet loss when a black hole is | to avoid unnecessary packet loss when a black hole is | |||
encountered). | encountered). | |||
* The PL ought to start a search to quickly discover the new | * The PL ought to start a search to quickly discover the new | |||
PLPMTU. The PL_PTB_SIZE reported in the PTB message can be | PLPMTU. The PL_PTB_SIZE reported in the PTB message can be | |||
used to initialize a search algorithm. | used to initialize a search algorithm. | |||
PL_PTB_SIZE = PLPMTU | ||||
* Completes the search for a larger PLPMTU. | ||||
PLPMTU < PL_PTB_SIZE < PROBED_SIZE | PLPMTU < PL_PTB_SIZE < PROBED_SIZE | |||
* The PLPMTU continues to be valid, but the size of a packet used | * The PLPMTU continues to be valid, but the size of a packet used | |||
to search (PROBED_SIZE) was larger than the actual PMTU. | to search (PROBED_SIZE) was larger than the actual PMTU. | |||
* The PLPMTU is not updated. | * The PLPMTU is not updated. | |||
* The PL can use the reported PL_PTB_SIZE from the PTB message as | * The PL can use the reported PL_PTB_SIZE from the PTB message as | |||
the next search point when it resumes the search algorithm. | the next search point when it resumes the search algorithm. | |||
PL_PTB_SIZE > PROBED_SIZE | ||||
* Inconsistent network signal. | ||||
* PTB message ought to be discarded without further processing | ||||
(i.e., PLPMTU is not modified). | ||||
* The information could be utilized as an input to trigger | ||||
enabling a resilience mode. | ||||
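The cases listed above can be sketched as a single classification function. This is a hedged illustration: the enum and function names are hypothetical, the cases follow the order of the newer column, and two boundary choices are assumptions made here because the list leaves them open (a PL_PTB_SIZE equal to MIN_PLPMTU is grouped with the error-state case, following the "larger than or equal to" wording of the bullets, and a PL_PTB_SIZE equal to PROBED_SIZE is treated as inconsistent).

```python
from enum import Enum, auto


class PtbAction(Enum):
    DISCARD_INVALID = auto()        # PL_PTB_SIZE < MIN_PLPMTU
    MAY_ENTER_ERROR = auto()        # below BASE_PLPMTU
    BLACK_HOLE_USE_BASE = auto()    # BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU
    SEARCH_COMPLETE = auto()        # PL_PTB_SIZE = PLPMTU
    NEXT_SEARCH_POINT = auto()      # PLPMTU < PL_PTB_SIZE < PROBED_SIZE
    DISCARD_INCONSISTENT = auto()   # PL_PTB_SIZE > PROBED_SIZE


def classify_ptb(pl_ptb_size: int, min_plpmtu: int, base_plpmtu: int,
                 plpmtu: int, probed_size: int) -> PtbAction:
    """Map a validated PL_PTB_SIZE onto one of the cases in the list.

    Assumes min_plpmtu <= base_plpmtu <= plpmtu <= probed_size.
    """
    if pl_ptb_size < min_plpmtu:
        return PtbAction.DISCARD_INVALID
    if pl_ptb_size < base_plpmtu:
        return PtbAction.MAY_ENTER_ERROR
    if pl_ptb_size < plpmtu:
        return PtbAction.BLACK_HOLE_USE_BASE
    if pl_ptb_size == plpmtu:
        return PtbAction.SEARCH_COMPLETE
    if pl_ptb_size < probed_size:
        return PtbAction.NEXT_SEARCH_POINT
    # Assumption: equality with PROBED_SIZE is handled like the ">" case.
    return PtbAction.DISCARD_INCONSISTENT
```

The function only names the case. Whether a PL acts on it (for example, entering the error state or a resilience mode) remains a choice of the PL, as the MAY and "could" wording above indicates.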
5. Datagram Packetization Layer PMTUD | 5. Datagram Packetization Layer PMTUD | |||
This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | This section specifies Datagram PLPMTUD (DPLPMTUD). The method can | |||
be introduced at various points (as indicated with * in the figure | be introduced at various points (as indicated with * in the figure | |||
below) in the IP protocol stack to discover the PLPMTU so that an | below) in the IP protocol stack to discover the PLPMTU so that an | |||
application can utilize an appropriate MPS for the current network | application can utilize an appropriate MPS for the current network | |||
path. | path. | |||
DPLPMTUD SHOULD NOT be used by an upper PL or application if it is | DPLPMTUD SHOULD NOT be used by an upper PL or application if it is | |||
already used in a lower layer, DPLPMTUD SHOULD only be performed once | already used in a lower layer. DPLPMTUD SHOULD only be performed once | |||
between a pair of endpoints. A PL MUST adjust the MPS indicated by | between a pair of endpoints. A PL MUST adjust the MPS indicated by | |||
DPLPMTUD to account for any additional overhead introduced by the PL. | DPLPMTUD to account for any additional overhead introduced by the PL. | |||
+----------------------+ | +----------------------+ | |||
| Application* | | | Application* | | |||
+-----+------------+---+ | +-----+------------+---+ | |||
| | | | | | |||
+---+--+ +--+--+ | +---+--+ +--+--+ | |||
| QUIC*| |SCTP*| | | QUIC*| |SCTP*| | |||
+---+--+ +-+-+-+ | +---+--+ +-+-+-+ | |||
skipping to change at page 21, line 23 ¶ | skipping to change at page 21, line 28 ¶ | |||
timer value are provided in section 3.1.1 of the UDP Usage | timer value are provided in section 3.1.1 of the UDP Usage | |||
Guidelines [RFC8085]. | Guidelines [RFC8085]. | |||
PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a | PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a | |||
sender will continue to use the current PLPMTU, after which it re- | sender will continue to use the current PLPMTU, after which it re- | |||
enters the Search phase. This timer has a period of 600 seconds, | enters the Search phase. This timer has a period of 600 seconds, | |||
as recommended by PLPMTUD [RFC4821]. | as recommended by PLPMTUD [RFC4821]. | |||
DPLPMTUD MAY inhibit sending probe packets when no application | DPLPMTUD MAY inhibit sending probe packets when no application | |||
data has been sent since the previous probe packet. A PL | data has been sent since the previous probe packet. A PL | |||
preferring to use an up-to-data PMTU once user data is sent again, | preferring to use an up-to-date PMTU once user data is sent again, | |||
can choose to continue PMTU discovery for each path. However, | can choose to continue PMTU discovery for each path. However, | |||
this could result in sending additional packets. | this could result in sending additional packets. | |||
CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST | CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST | |||
NOT be used. For other PLs, the CONFIRMATION_TIMER is configured | NOT be used. For other PLs, the CONFIRMATION_TIMER is configured | |||
to the period a PL sender waits before confirming the current | to the period a PL sender waits before confirming the current | |||
PLPMTU is still supported. This is less than the PMTU_RAISE_TIMER | PLPMTU is still supported. This is less than the PMTU_RAISE_TIMER | |||
and used to decrease the PLPMTU (e.g., when a black hole is | and used to decrease the PLPMTU (e.g., when a black hole is | |||
encountered). Confirmation needs to be frequent enough when data | encountered). Confirmation needs to be frequent enough when data | |||
is flowing that the sending PL does not black hole extensive | is flowing that the sending PL does not black hole extensive | |||
amounts of traffic. Guidance on selection of the timer value is | amounts of traffic. Guidance on selection of the timer value is | |||
provided in section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | provided in section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | |||
DPLPMTUD MAY inhibit sending probe packets when no application | DPLPMTUD MAY inhibit sending probe packets when no application | |||
data has been sent since the previous probe packet. A PL | data has been sent since the previous probe packet. A PL | |||
preferring to use an up-to-date PMTU once user data is sent again, | preferring to use an up-to-date PMTU once user data is sent again, | |||
can choose to continue PMTU discovery for each path. However, | can choose to continue PMTU discovery for each path. However, | |||
this could result in sending additional packets. | this could result in sending additional packets. | |||
An implementation could implement the various timers using a single | The various timers could be implemented using a single timer. | |||
timer. | ||||
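One way to drive several logical timers from a single timer is to track a deadline per timer name and arm the one real timer for the earliest deadline. The sketch below is illustrative only: the SingleTimer class is hypothetical, time is passed in explicitly rather than read from a clock, and the 30 second CONFIRMATION_TIMER period is an assumed value (the text only requires it to be less than the PMTU_RAISE_TIMER).

```python
from typing import Dict, List, Optional

PMTU_RAISE_PERIOD = 600.0    # seconds, as stated for PMTU_RAISE_TIMER
CONFIRMATION_PERIOD = 30.0   # seconds, assumed value for illustration


class SingleTimer:
    """Multiplex named logical timers onto one deadline."""

    def __init__(self) -> None:
        self._deadlines: Dict[str, float] = {}

    def start(self, name: str, now: float, period: float) -> None:
        self._deadlines[name] = now + period

    def stop(self, name: str) -> None:
        self._deadlines.pop(name, None)

    def next_deadline(self) -> Optional[float]:
        """Time at which the one real timer should fire, if any."""
        return min(self._deadlines.values(), default=None)

    def expire(self, now: float) -> List[str]:
        """Remove and return the names of all timers due at 'now'."""
        due = sorted((d, n) for n, d in self._deadlines.items() if d <= now)
        for _, name in due:
            del self._deadlines[name]
        return [name for _, name in due]
```

An event loop would sleep until next_deadline(), call expire(), and then dispatch on the returned names.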
5.1.2. Constants | 5.1.2. Constants | |||
The following constants are defined: | The following constants are defined: | |||
MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT | |||
counter (see Section 5.1.3). MAX_PROBES represents the limit for | counter (see Section 5.1.3). MAX_PROBES represents the limit for | |||
the number of consecutive probe attempts of any size. Search | the number of consecutive probe attempts of any size. Search | |||
algorithms benefit from a MAX_PROBES valugreater than 1 because | algorithms benefit from a MAX_PROBES value greater than 1 because | |||
this can provide robustness to isolated packet loss. The default | this can provide robustness to isolated packet loss. The default | |||
value of MAX_PROBES is 3. | value of MAX_PROBES is 3. | |||
MIN_PLPMTU: The MIN_PLPMTU is the smallest allowed probe packet | MIN_PLPMTU: The MIN_PLPMTU is the smallest allowed probe packet | |||
size. For IPv6, this value is 1280 bytes, as specified in | size. For IPv6, this value is 1280 bytes, as specified in | |||
[RFC8200]. For IPv4, the minimum value is 68 bytes. | [RFC8200]. For IPv4, the minimum value is 68 bytes. | |||
Note: An IPv4 router is required to be able to forward a datagram | Note: An IPv4 router is required to be able to forward a datagram | |||
of 68 bytes without further fragmentation. This is the combined | of 68 bytes without further fragmentation. This is the combined | |||
size of an IPv4 header and the minimum fragment size of 8 bytes. | size of an IPv4 header and the minimum fragment size of 8 bytes. | |||
skipping to change at page 23, line 28 ¶ | skipping to change at page 24, line 5 ¶ | |||
Figure 3: Relationships between packet size constants and variables | Figure 3: Relationships between packet size constants and variables | |||
5.1.4. Overview of DPLPMTUD Phases | 5.1.4. Overview of DPLPMTUD Phases | |||
This section provides a high-level informative view of the DPLPMTUD | This section provides a high-level informative view of the DPLPMTUD | |||
method, by describing the movement of the method through several | method, by describing the movement of the method through several | |||
phases of operation. More detail is available in the state machine | phases of operation. More detail is available in the state machine | |||
Section 5.2. | Section 5.2. | |||
+------+ | +------+ | |||
+------->| Base |-----------------+ Connectivity | +------->| Base |-----------------+ Connectivity | |||
| +------+ | or BASE_PLPMTU | | +------+ | or BASE_PLPMTU | |||
| | | confirmation failed | | | | confirmation failed | |||
| | v | | | v | |||
| | Connectivity +-------+ | | | Connectivity +-------+ | |||
| | and BASE_PLPMTU | Error | | | | and BASE_PLPMTU | Error | | |||
| | confirmed +-------+ | | | confirmed +-------+ | |||
| | | Consistent | | | | Consistent | |||
| v | connectivity | | v | connectivity | |||
PLPMTU | +--------+ | and BASE_PLPMTU | Black Hole | +--------+ | and BASE_PLPMTU | |||
confirmation | | Search |<---------------+ confirmed | detected | | Search |<---------------+ confirmed | |||
failed | +--------+ | | +--------+ | |||
| ^ | | | ^ | | |||
| | | | | | | | |||
| Raise | | Search | | Raise | | Search | |||
| timer | | algorithm | | timer | | algorithm | |||
| expired | | completed | | expired | | completed | |||
| | | | | | | | |||
| | v | | | v | |||
| +-----------------+ | | +-----------------+ | |||
+---| Search Complete | | +---| Search Complete | | |||
+-----------------+ | +-----------------+ | |||
Figure 4: DPLPMTUD Phases | Figure 4: DPLPMTUD Phases | |||
Base: The Base Phase confirms connectivity to the remote peer using | Base: The Base Phase confirms connectivity to the remote peer using | |||
packets of the BASE_PLPMTU. This phase is implicit for a | packets of the BASE_PLPMTU. This phase is implicit for a | |||
connection-oriented PL (where it can be performed in a PL | connection-oriented PL (where it can be performed in a PL | |||
connection handshake). A connectionless PL sends a probe packet | connection handshake). A connectionless PL sends a probe packet | |||
and uses acknowledgment of this probe packet to confirm that the | and uses acknowledgment of this probe packet to confirm that the | |||
remote peer is reachable. | remote peer is reachable. | |||
skipping to change at page 24, line 41 ¶ | skipping to change at page 25, line 18 ¶ | |||
Complete Phase. | Complete Phase. | |||
A PL could respond to PTB messages using the PTB to advance or | A PL could respond to PTB messages using the PTB to advance or | |||
terminate the search, see Section 4.6. | terminate the search, see Section 4.6. | |||
Search Complete: The Search Complete Phase is entered when the | Search Complete: The Search Complete Phase is entered when the | |||
PLPMTU is supported across the network path. A PL can use a | PLPMTU is supported across the network path. A PL can use a | |||
CONFIRMATION_TIMER to periodically repeat a probe packet for the | CONFIRMATION_TIMER to periodically repeat a probe packet for the | |||
current PLPMTU size. If the sender is unable to confirm | current PLPMTU size. If the sender is unable to confirm | |||
reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL | reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL | |||
signals a lack of reachability, DPLPMTUD enters the Base phase. | signals a lack of reachability, a black hole has been detected and | |||
DPLPMTUD enters the Base phase. | ||||
The PMTU_RAISE_TIMER is used to periodically resume the search | The PMTU_RAISE_TIMER is used to periodically resume the search | |||
phase to discover if the PLPMTU can be raised. Black Hole | phase to discover if the PLPMTU can be raised. Black Hole | |||
Detection causes the sender to enter the Base Phase. | Detection causes the sender to enter the Base Phase. | |||
Error: The Error Phase is entered when there is conflicting or | Error: The Error Phase is entered when there is conflicting or | |||
invalid PLPMTU information for the path (e.g., a failure to | invalid PLPMTU information for the path (e.g., a failure to | |||
support the BASE_PLPMTU) that causes DPLPMTUD to be unable to | support the BASE_PLPMTU) that causes DPLPMTUD to be unable to | |||
progress and the PLPMTU is lowered. | progress and the PLPMTU is lowered. | |||
DPLPMTUD remains in the Error Phase until a consistent view of the | DPLPMTUD remains in the Error Phase until a consistent view of the | |||
path can be discovered and it has also been confirmed that the | path can be discovered and it has also been confirmed that the | |||
path supports the BASE_PLPMTU (or DPLPMTUD is suspended). | path supports the BASE_PLPMTU (or DPLPMTUD is suspended). | |||
An implementation that only reduces the PLPMTU to a suitable size | A method that only reduces the PLPMTU to a suitable size would be | |||
would be sufficient to ensure reliable operation, but can be very | sufficient to ensure reliable operation, but can be very inefficient | |||
inefficient when the actual PMTU changes or when the method (for | when the actual PMTU changes or when the method (for whatever reason) | |||
whatever reason) makes a suboptimal choice for the PLPMTU. | makes a suboptimal choice for the PLPMTU. | |||
A full implementation of DPLPMTUD provides an algorithm enabling the | A full implementation of DPLPMTUD provides an algorithm enabling the | |||
DPLPMTUD sender to increase the PLPMTU following a change in the | DPLPMTUD sender to increase the PLPMTU following a change in the | |||
characteristics of the path, such as when a link is reconfigured with | characteristics of the path, such as when a link is reconfigured with | |||
a larger MTU, or when there is a change in the set of links traversed | a larger MTU, or when there is a change in the set of links traversed | |||
by an end-to-end flow (e.g., after a routing or path fail-over | by an end-to-end flow (e.g., after a routing or path fail-over | |||
decision). | decision). | |||
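The phase transitions drawn in Figure 4 can be written as a small lookup table. This is an informative sketch only: the enum and event names are hypothetical labels for the arrows in the figure, only the arrows shown there are encoded, and leaving the phase unchanged for any other combination is an assumption made here. The normative behaviour is the state machine in Section 5.2.

```python
from enum import Enum, auto


class Phase(Enum):
    BASE = auto()
    SEARCH = auto()
    SEARCH_COMPLETE = auto()
    ERROR = auto()


class Event(Enum):
    BASE_CONFIRMED = auto()              # connectivity and BASE_PLPMTU confirmed
    BASE_CONFIRMATION_FAILED = auto()    # connectivity or BASE_PLPMTU not confirmed
    SEARCH_ALGORITHM_COMPLETED = auto()
    RAISE_TIMER_EXPIRED = auto()
    BLACK_HOLE_DETECTED = auto()


TRANSITIONS = {
    (Phase.BASE, Event.BASE_CONFIRMED): Phase.SEARCH,
    (Phase.BASE, Event.BASE_CONFIRMATION_FAILED): Phase.ERROR,
    (Phase.ERROR, Event.BASE_CONFIRMED): Phase.SEARCH,
    (Phase.SEARCH, Event.SEARCH_ALGORITHM_COMPLETED): Phase.SEARCH_COMPLETE,
    (Phase.SEARCH_COMPLETE, Event.RAISE_TIMER_EXPIRED): Phase.SEARCH,
    (Phase.SEARCH_COMPLETE, Event.BLACK_HOLE_DETECTED): Phase.BASE,
}


def next_phase(phase: Phase, event: Event) -> Phase:
    """Follow an arrow from Figure 4, or stay in place if none applies."""
    return TRANSITIONS.get((phase, event), phase)
```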
5.2. State Machine | 5.2. State Machine | |||
skipping to change at page 29, line 41 ¶ | skipping to change at page 29, line 41 ¶ | |||
sizes from a table of common PMTU sizes. When selecting the | sizes from a table of common PMTU sizes. When selecting the | |||
appropriate next size to search, an implementer ought to also | appropriate next size to search, an implementer ought to also | |||
consider that there can be common sizes of MPS that applications seek | consider that there can be common sizes of MPS that applications seek | |||
to use, and there could be common sizes of MTU used within the | to use, and there could be common sizes of MTU used within the | |||
network. | network. | |||
5.3.3. Resilience to Inconsistent Path Information | 5.3.3. Resilience to Inconsistent Path Information | |||
A decision to increase the PLPMTU needs to be resilient to the | A decision to increase the PLPMTU needs to be resilient to the | |||
possibility that information learned about the network path is | possibility that information learned about the network path is | |||
inconsistent. A path is inconsistent, when, for example, probe | inconsistent. A path is inconsistent when, for example, probe | |||
packets are lost due to other reasons (i.e., not packet size) or due | packets are lost due to other reasons (i.e., not packet size) or due | |||
to frequent path changes. Frequent path changes could occur by | to frequent path changes. Frequent path changes could occur by | |||
unexpected "flapping" - where some packets from a flow pass along one | unexpected "flapping" - where some packets from a flow pass along one | |||
path, but other packets follow a different path with different | path, but other packets follow a different path with different | |||
properties. | properties. | |||
A PL sender is able to detect inconsistency from the sequence of | A PL sender is able to detect inconsistency from the sequence of | |||
PLPMTU probes that are acknowledged or the sequence of PTB messages | PLPMTU probes that are acknowledged or the sequence of PTB messages | |||
that it receives. When inconsistent path information is detected, a | that it receives. When inconsistent path information is detected, a | |||
PL sender could use an alternate search mode that clamps the offered | PL sender could use an alternate search mode that clamps the offered | |||
MPS to a smaller value for a period of time. This avoids unnecessary | MPS to a smaller value for a period of time. This avoids unnecessary | |||
loss of packets. | loss of packets. | |||
5.4. Robustness to Inconsistent Paths | 5.4. Robustness to Inconsistent Paths | |||
Some paths could be unable to sustain packets of the BASE_PLPMTU | Some paths could be unable to sustain packets of the BASE_PLPMTU | |||
size. To be robust to these paths an implementation could implement | size. The Error State could be implemented to provide robustness to | |||
the Error State. This allows fallback to a smaller than desired | such paths. This allows fallback to a smaller than desired PLPMTU, | |||
PLPMTU, rather than suffer connectivity failure. This could utilize | rather than suffer connectivity failure. This could utilize methods | |||
methods such as endpoint IP fragmentation to enable the PL sender to | such as endpoint IP fragmentation to enable the PL sender to | |||
communicate using packets smaller than the BASE_PLPMTU. | communicate using packets smaller than the BASE_PLPMTU. | |||
6. Specification of Protocol-Specific Methods | 6. Specification of Protocol-Specific Methods | |||
DPLPMTUD requires protocol-specific details to be specified for each | DPLPMTUD requires protocol-specific details to be specified for each | |||
PL that is used. | PL that is used. | |||
The first subsection provides guidance on how to implement the | The first subsection provides guidance on how to implement the | |||
DPLPMTUD method as a part of an application using UDP or UDP-Lite. | DPLPMTUD method as a part of an application using UDP or UDP-Lite. | |||
The guidance also applies to other datagram services that do not | The guidance also applies to other datagram services that do not | |||
skipping to change at page 31, line 11 ¶ | skipping to change at page 31, line 11 ¶ | |||
use common method for managing the PLPMTU has benefits, both in the | use common method for managing the PLPMTU has benefits, both in the | |||
ability to share state between different processes and opportunities | ability to share state between different processes and opportunities | |||
to coordinate probing. | to coordinate probing. | |||
6.1.1. Application Request | 6.1.1. Application Request | |||
An application needs an application-layer protocol mechanism (such as | An application needs an application-layer protocol mechanism (such as | |||
a message acknowledgment method) that solicits a response from a | a message acknowledgment method) that solicits a response from a | |||
destination endpoint. The method SHOULD allow the sender to check | destination endpoint. The method SHOULD allow the sender to check | |||
the value returned in the response to provide additional protection | the value returned in the response to provide additional protection | |||
from off-path insertion of data [RFC8085], suitable methods include a | from off-path insertion of data [RFC8085]. Suitable methods include | |||
parameter known only to the two endpoints, such as a session ID or | a parameter known only to the two endpoints, such as a session ID or | |||
initialized sequence number. | initialized sequence number. | |||
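A probe request that carries a value known only to the two endpoints could be sketched as follows. This is an assumption-laden illustration: the wire layout (a random token followed by zero padding), the 8-byte token length, and the function names are hypothetical and not defined by this document, and the size argument is the application payload size rather than the IP packet size.

```python
import hmac
import secrets
from typing import Tuple


def build_probe(size: int, token_len: int = 8) -> Tuple[bytes, bytes]:
    """Return (token, probe) where probe is 'size' bytes and starts with token."""
    if size < token_len:
        raise ValueError("probe too small to carry the token")
    token = secrets.token_bytes(token_len)
    return token, token + bytes(size - token_len)


def response_matches(expected_token: bytes, response: bytes) -> bool:
    """Check that a response echoes the token sent in the probe."""
    echoed = response[:len(expected_token)]
    return hmac.compare_digest(echoed, expected_token)
```

The sender keeps the token until the response arrives or the probe is declared lost. A response that does not echo the token is ignored, which is the additional protection from off-path insertion that the paragraph above asks for.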
6.1.2. Application Response | 6.1.2. Application Response | |||
An application needs an application-layer protocol mechanism to | An application needs an application-layer protocol mechanism to | |||
communicate the response from the destination endpoint. This | communicate the response from the destination endpoint. This | |||
response could indicate successful reception of the probe across the | response could indicate successful reception of the probe across the | |||
path, but could also indicate that some (or all) packets have failed | path, but could also indicate that some (or all) packets have failed | |||
to reach the destination. | to reach the destination. | |||
skipping to change at page 36, line 4 ¶ | skipping to change at page 36, line 4 ¶ | |||
A probe packet consists of a QUIC Header and a payload containing | A probe packet consists of a QUIC Header and a payload containing | |||
PADDING Frames and a PING Frame. PADDING Frames are a single octet | PADDING Frames and a PING Frame. PADDING Frames are a single octet | |||
(0x00) and several of these can be used to create a probe packet of | (0x00) and several of these can be used to create a probe packet of | |||
size PROBED_SIZE. QUIC provides an acknowledged PL; a sender can | size PROBED_SIZE. QUIC provides an acknowledged PL; a sender can | |||
therefore enter the BASE state as soon as connectivity has been | therefore enter the BASE state as soon as connectivity has been | |||
confirmed. | confirmed. | |||
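The frame payload of such a probe packet could be assembled as sketched below. This covers only the unprotected frame sequence: the QUIC header, packet protection, and the exact overhead they add are outside this sketch, so the overhead parameter and the function name are hypothetical. The PADDING value 0x00 is taken from the text above; 0x01 is the PING frame type in QUIC.

```python
PADDING_FRAME = b"\x00"  # single-octet PADDING frame, as described above
PING_FRAME = b"\x01"     # PING frame type in QUIC


def probe_payload(probed_size: int, overhead: int) -> bytes:
    """Build one PING frame followed by PADDING frames.

    'overhead' is the number of bytes of PROBED_SIZE consumed outside the
    frame payload (IP, UDP and QUIC headers, authentication tag). Its value
    depends on the connection and is an input here, not computed.
    """
    payload_len = probed_size - overhead
    if payload_len < 1:
        raise ValueError("no room for a PING frame")
    return PING_FRAME + PADDING_FRAME * (payload_len - 1)
```

Because the PING frame elicits an acknowledgment, the sender learns that a probe of PROBED_SIZE was delivered without needing a separate confirmation mechanism.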
The current specification of QUIC sets the following: | The current specification of QUIC sets the following: | |||
* BASE_PLPMTU: A QUIC sender pads initial packets to confirm the | * BASE_PLPMTU: A QUIC sender pads initial packets to confirm the | |||
path can support packets of the required size, this sets the | path can support packets of the required size, which sets the | |||
BASE_PLPMTU and MIN_PLPMTU. | BASE_PLPMTU and MIN_PLPMTU. | |||
* MIN_PLPMTU: A QUIC sender that determines the MIN_PLPMTU has | * MIN_PLPMTU: A QUIC sender that determines the MIN_PLPMTU has | |||
fallen MUST immediately stop sending on the affected path. | fallen MUST immediately stop sending on the affected path. | |||
6.3.3. Validating the Path with QUIC | 6.3.3. Validating the Path with QUIC | |||
QUIC provides an acknowledged PL. A sender therefore MUST NOT | QUIC provides an acknowledged PL. A sender therefore MUST NOT | |||
implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state. | |||
skipping to change at page 36, line 50 ¶ | skipping to change at page 36, line 50 ¶ | |||
The security considerations for the use of UDP and SCTP are provided | The security considerations for the use of UDP and SCTP are provided | |||
in the referenced RFCs. | in the referenced RFCs. | |||
To avoid excessive load, the interval between individual probe | To avoid excessive load, the interval between individual probe | |||
packets MUST be at least one RTT, and the interval between rounds of | packets MUST be at least one RTT, and the interval between rounds of | |||
probing is determined by the PMTU_RAISE_TIMER. | probing is determined by the PMTU_RAISE_TIMER. | |||
A PL sender needs to ensure that the method used to confirm reception | A PL sender needs to ensure that the method used to confirm reception | |||
of probe packets protects from off-path attackers injecting packets | of probe packets protects from off-path attackers injecting packets | |||
into the path. This protection if provided in IETF-defined protocols | into the path. This protection is provided in IETF-defined protocols | |||
(e.g., TCP, SCTP) using a randomly-initialized sequence number. A | (e.g., TCP, SCTP) using a randomly-initialized sequence number. A | |||
description of one way to do this when using UDP is provided in | description of one way to do this when using UDP is provided in | |||
section 5.1 of [RFC8085]. | section 5.1 of [RFC8085]. | |||
There are cases where ICMP Packet Too Big (PTB) messages are not | There are cases where ICMP Packet Too Big (PTB) messages are not | |||
delivered due to policy, configuration or equipment design (see | delivered due to policy, configuration or equipment design (see | |||
Section 1.1), this method therefore does not rely upon PTB messages | Section 1.1). This method therefore does not rely upon PTB messages | |||
being received, but is able to utilize these when they are received | being received, but is able to utilize these when they are received | |||
by the sender. PTB messages could potentially be used to cause a | by the sender. PTB messages could potentially be used to cause a | |||
node to inappropriately reduce the PLPMTU. A node supporting | node to inappropriately reduce the PLPMTU. A node supporting | |||
DPLPMTUD MUST therefore appropriately validate the payload of PTB | DPLPMTUD MUST therefore appropriately validate the payload of PTB | |||
messages to ensure these are received in response to transmitted | messages to ensure these are received in response to transmitted | |||
traffic (i.e., a reported error condition that corresponds to a | traffic (i.e., a reported error condition that corresponds to a | |||
datagram actually sent by the path layer, see Section 4.6.1). | datagram actually sent by the path layer, see Section 4.6.1). | |||
An on-path attacker, able to create a PTB message could forge PTB | An on-path attacker able to create a PTB message could forge PTB | |||
messages that include a valid quoted IP packet. Such an attack could | messages that include a valid quoted IP packet. Such an attack could | |||
be used to drive down the PLPMTU. There are two ways this method can | be used to drive down the PLPMTU. There are two ways this method can | |||
be mitigated against such attacks: First, by ensuring that a PL | be mitigated against such attacks: First, by ensuring that a PL | |||
sender never reduces the PLPMTU below the base size, solely in | sender never reduces the PLPMTU below the base size, solely in | |||
response to receiving a PTB message. This is achieved by first | response to receiving a PTB message. This is achieved by first | |||
entering the BASE state when such a message is received. Second, the | entering the BASE state when such a message is received. Second, the | |||
design does not require processing of PTB messages; a PL sender could | design does not require processing of PTB messages; a PL sender could | |||
therefore suspend processing of PTB messages (e.g., in a robustness | therefore suspend processing of PTB messages (e.g., in a robustness | |||
mode after detecting that subsequent probes actually confirm that a | mode after detecting that subsequent probes actually confirm that a | |||
size larger than the PTB_SIZE is supported by a path). | size larger than the PTB_SIZE is supported by a path). | |||
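The two mitigations described above might be sketched as below. This is an illustrative model under stated assumptions, not the draft's state machine: the class `PlpmtuState`, its method names, the `ignore_ptb` flag, and the BASE_PLPMTU value of 1200 are hypothetical, and the PTB message is assumed to have already passed the validation in Section 4.6.1.

```python
from dataclasses import dataclass
from typing import Optional

BASE_PLPMTU = 1200  # assumed value, for illustration only


@dataclass
class PlpmtuState:
    plpmtu: int
    state: str = "SEARCH_COMPLETE"
    ignore_ptb: bool = False  # hypothetical "robustness mode" flag

    def on_validated_ptb(self, ptb_size: int) -> None:
        """Handle a PTB message that already passed payload validation."""
        if self.ignore_ptb:
            return  # second mitigation: PTB processing is suspended
        if ptb_size < self.plpmtu:
            # First mitigation: a PTB alone never takes the PLPMTU below the
            # base size; the sender falls back to the BASE state instead.
            self.state = "BASE"
            self.plpmtu = BASE_PLPMTU

    def on_probe_confirmed(self, probe_size: int,
                           last_ptb_size: Optional[int]) -> None:
        """A probe was acknowledged; compare it with the last reported PTB size."""
        if last_ptb_size is not None and probe_size > last_ptb_size:
            # The path supports more than the PTB claimed, so stop trusting PTBs.
            self.ignore_ptb = True
```

Whether and when a sender leaves the robustness mode again is a policy decision that this sketch leaves open.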
Parsing the quoted packet inside a PTB message can introduce additional | ||||
per-packet processing at the PL sender. This processing SHOULD be | ||||
limited to avoid a denial of service attack when arbitrary headers | ||||
are included. Rate-limiting the processing could result in PTB | ||||
messages not being received by a PL; however, the DPLPMTUD method is | ||||
robust to such loss. | ||||
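One possible way to limit PTB processing is a token bucket, sketched below. The draft does not prescribe a mechanism; the class `PtbRateLimiter` and its parameters are assumptions made for this example. PTB messages that arrive when no token is available are simply dropped, which the method tolerates.

```python
from typing import Optional


class PtbRateLimiter:
    """Hypothetical token bucket bounding how many PTB messages are parsed."""

    def __init__(self, rate_per_sec: float, burst: int) -> None:
        self.rate = rate_per_sec      # tokens added per second
        self.burst = burst            # maximum number of stored tokens
        self.tokens = float(burst)    # start with a full bucket
        self.last: Optional[float] = None

    def allow(self, now: float) -> bool:
        # Refill according to the time elapsed since the previous call.
        if self.last is not None:
            refill = (now - self.last) * self.rate
            self.tokens = min(float(self.burst), self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # parse this PTB message
        return False      # drop it without parsing
```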
The successful processing of an ICMP message can trigger a probe when | The successful processing of an ICMP message can trigger a probe when | |||
the reported PTB size is valid, but this does not directly update the | the reported PTB size is valid, but this does not directly update the | |||
PLPMTU for the path. This prevents a message attempting to black | PLPMTU for the path. This prevents a message attempting to black | |||
hole data by indicating a size larger than supported by the path. | hole data by indicating a size larger than supported by the path. | |||
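The separation between "a PTB can suggest a probe size" and "only a confirmed probe raises the PLPMTU" can be illustrated as follows. The class `Prober`, its fields, and the range check used to decide that a reported size is usable are assumptions for this sketch, not the draft's exact rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prober:
    plpmtu: int                          # largest size confirmed by an acknowledged probe
    probed_size: int                     # size of the most recent probe packet
    pending_probe: Optional[int] = None  # next probe size suggested by a PTB

    def on_validated_ptb(self, ptb_size: int) -> None:
        # Assumed check: only a size between the confirmed PLPMTU and the
        # last probed size is used, and only to schedule a probe.
        if self.plpmtu < ptb_size < self.probed_size:
            self.pending_probe = ptb_size
        # The PLPMTU itself is never changed here.

    def on_probe_acked(self, size: int) -> None:
        # Only an acknowledged probe can raise the PLPMTU.
        if size > self.plpmtu:
            self.plpmtu = size
        if self.pending_probe == size:
            self.pending_probe = None
```

A forged PTB reporting an oversized value therefore cannot cause data packets larger than the path supports to be sent.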
Parallel forwarding paths SHOULD be considered. Section 5.4 | It is possible that the information about a path is not stable. This | |||
identifies the need for robustness in the method because the path | could be a result of forwarding across more than one path that has a | |||
information might be inconsistent. | different actual PMTU or a single path that presents a varying PMTU. The | |||
design of a PLPMTUD implementation SHOULD consider how to mitigate | ||||
the effects of varying path information. One possible mitigation is | ||||
to provide robustness (see Section 5.4) in the method that avoids | ||||
oscillation in the MPS. | ||||
A node performing DPLPMTUD could experience conflicting information | A node performing DPLPMTUD could experience conflicting information | |||
about the size of supported probe packets. This could occur when | about the size of supported probe packets. This could occur when | |||
there are multiple paths are concurrently in use and these exhibit a | multiple paths are concurrently in use and these exhibit a different | |||
different PMTU. If not considered, this could result in packets not | PMTU. If not considered, this could result in packets not being | |||
being delivered (black holed) when the PLPMTU results in a packet | delivered (black holed) when the PLPMTU results in a packet larger | |||
larger than the smallest actual PMTU. | than the smallest actual PMTU. | |||
DPLPMTUD methods can introduce padding data to inflate the length of | DPLPMTUD methods can introduce padding data to inflate the length of | |||
the datagram to the total size required for a probe packet. The | the datagram to the total size required for a probe packet. The | |||
total size of a probe packet includes all headers and padding added | total size of a probe packet includes all headers and padding added | |||
to the payload data being sent (e.g., including security-related | to the payload data being sent (e.g., including security-related | |||
fields such as an AEAD tag and TLS record layer padding). The value | fields such as an AEAD tag and TLS record layer padding). The value | |||
of the padding data does not influence the DPLPMTUD search algorithm, | of the padding data does not influence the DPLPMTUD search algorithm, | |||
and therefore needs to be set consistent with the policy of the PL. | and therefore needs to be set consistent with the policy of the PL. | |||
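The size accounting described above amounts to simple arithmetic, sketched here. The function name and its parameters are hypothetical; `trailer_len` stands for fields such as an AEAD tag that count toward the total probe size.

```python
def padding_needed(probe_size: int, header_len: int,
                   payload_len: int, trailer_len: int = 0) -> int:
    """Return how many padding bytes bring a datagram up to probe_size.

    probe_size is the total packet size, so all headers, the payload,
    and any trailer (for example an AEAD tag) are subtracted from it.
    """
    pad = probe_size - header_len - payload_len - trailer_len
    if pad < 0:
        raise ValueError("datagram already exceeds the probe size")
    return pad
```

The content of the padding bytes is left to the policy of the PL, as noted above.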
If a PL can make use of cryptographic confidentiality or data- | If a PL can make use of cryptographic confidentiality or data- | |||
skipping to change at page 45, line 11 ¶ | skipping to change at page 45, line 23 ¶ | |||
Working group draft -17: | Working group draft -17: | |||
* Updated text after GENART and IETF-LC. | * Updated text after GENART and IETF-LC. | |||
* Renamed BASE_MTU to BASE_PLPMTU, and MIN and MAX PMTU to PLPMTU | * Renamed BASE_MTU to BASE_PLPMTU, and MIN and MAX PMTU to PLPMTU | |||
(because these are about a base for the PLPMTU), and ensured | (because these are about a base for the PLPMTU), and ensured | |||
consistent separation of PMTU and PLPMTU. | consistent separation of PMTU and PLPMTU. | |||
* Adopted US-style English throughout. | * Adopted US-style English throughout. | |||
Working group draft -18: | ||||
* Updated text and addressed nits from OPSDIR, ART and IESG reviews. | ||||
* Ordered PTB processing based on PL_PTB_SIZE. | ||||
Authors' Addresses | Authors' Addresses | |||
Godred Fairhurst | Godred Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering | School of Engineering | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen | Aberdeen | |||
AB24 3UE | AB24 3UE | |||
United Kingdom | United Kingdom | |||
End of changes. 46 change blocks. | ||||
96 lines changed or deleted | 128 lines changed or added | |||