Internet Engineering Task Force                             G. Fairhurst
Internet-Draft                                                  T. Jones
Updates: 4821 (if approved)                       University of Aberdeen
Intended status: Standards Track                               M. Tuexen
Expires: August 22, 2019                                    I. Ruengeler
                                                              T. Voelker
                                 Muenster University of Applied Sciences
                                                       February 18, 2019

     Packetization Layer Path MTU Discovery for Datagram Transports
                  draft-ietf-tsvwg-datagram-plpmtud-07

Abstract

This document describes a robust method for Path MTU Discovery (PMTUD) for datagram Packetization Layers (PLs). The document describes an extension to RFC 1191 and RFC 8201, which specify ICMP-based Path MTU Discovery for IPv4 and IPv6. The method allows a PL, or a datagram application that uses a PL, to discover whether a network path can support the current size of datagram. This can be used to detect and reduce the message size when a sender encounters a network black hole (where packets are discarded, and no ICMP message is received). The method can also probe a network path with progressively larger packets to find whether the maximum packet size can be increased. This allows a sender to determine an appropriate packet size, providing functionality for datagram transports that is equivalent to the Packetization Layer PMTUD specification for TCP, specified in RFC 4821.

The document also provides implementation notes for incorporating Datagram PMTUD into IETF datagram transports or applications that use datagram transports.

When published, this specification updates RFC 4821.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 22, 2019.

Copyright Notice

Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Classical Path MTU Discovery
     1.2.  Packetization Layer Path MTU Discovery
     1.3.  Path MTU Discovery for Datagram Services
   2.  Terminology
   3.  Features Required to Provide Datagram PLPMTUD
   4.  DPLPMTUD Mechanisms
     4.1.  PLPMTU Probe Packets
     4.2.  Confirmation of Probed Packet Size
     4.3.  Detection of Unsupported PLPMTU Size, aka Black Holes
     4.4.  Response to PTB Messages
       4.4.1.  Validation of PTB Messages
       4.4.2.  Use of PTB Messages
   5.  Datagram Packetization Layer PMTUD
     5.1.  DPLPMTUD Components
       5.1.1.  Timers
       5.1.2.  Constants
       5.1.3.  Variables
     5.2.  Overview of DPLPMTUD Phases
       5.2.1.  BASE_PMTU Confirmation Phase
       5.2.2.  Search Phase
         5.2.2.1.  Resilience to Inconsistent Path Information
       5.2.3.  Search Complete Phase
       5.2.4.  PROBE_BASE Phase
       5.2.5.  ERROR Phase
         5.2.5.1.  Robustness to Inconsistent Path
       5.2.6.  DISABLED Phase
     5.3.  State Machine
     5.4.  Search to Increase the PLPMTU
       5.4.1.  Probing for a Larger PLPMTU
       5.4.2.  Selection of Probe Sizes
       5.4.3.  Resilience to Inconsistent Path Information
   6.  Specification of Protocol-Specific Methods
     6.1.  Application support for DPLPMTUD with UDP or UDP-Lite
       6.1.1.  Application Request
       6.1.2.  Application Response
       6.1.3.  Sending Application Probe Packets
       6.1.4.  Validating the Path
       6.1.5.  Handling of PTB Messages
     6.2.  DPLPMTUD with UDP Options
       6.2.1.  UDP Probe Request Option
       6.2.2.  UDP Probe Response Option
     6.3.  DPLPMTUD for SCTP
       6.3.1.  SCTP/IPv4 and SCTP/IPv6
         6.3.1.1.  Sending SCTP Probe Packets
         6.3.1.2.  Validating the Path with SCTP
         6.3.1.3.  PTB Message Handling by SCTP
       6.3.2.  DPLPMTUD for SCTP/UDP
         6.3.2.1.  Sending SCTP/UDP Probe Packets
         6.3.2.2.  Validating the Path with SCTP/UDP
         6.3.2.3.  Handling of PTB Messages by SCTP/UDP
       6.3.3.  DPLPMTUD for SCTP/DTLS
         6.3.3.1.  Sending SCTP/DTLS Probe Packets
         6.3.3.2.  Validating the Path with SCTP/DTLS
         6.3.3.3.  Handling of PTB Messages by SCTP/DTLS
     6.4.  DPLPMTUD for QUIC
       6.4.1.  Sending QUIC Probe Packets
       6.4.2.  Validating the Path with QUIC
       6.4.3.  Handling of PTB Messages by QUIC
   7.  Acknowledgements
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Event-driven state changes
   Appendix B.  Revision Notes
   Authors' Addresses

1. Introduction

The IETF has specified datagram transport using UDP, SCTP, and DCCP, as well as protocols layered on top of these transports (e.g., SCTP/UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP network layer. This document describes a robust method for Path MTU Discovery (PMTUD) that may be used with these transport protocols (or the applications that use their transport service) to discover an appropriate size of packet to use across an Internet path.

1.1. Classical Path MTU Discovery

Classical Path Maximum Transmission Unit Discovery (PMTUD) can be used with any transport that is able to process ICMP Packet Too Big (PTB) messages (e.g., [RFC1191] and [RFC8201]).
The term PTB message is applied to both IPv4 ICMP Unreachable messages (Type 3) that carry the error Fragmentation Needed (Type 3, Code 4) [RFC0792] and ICMPv6 Packet Too Big messages (Type 2) [RFC4443]. When a sender receives a PTB message, it reduces the effective MTU to the value reported as the Link MTU in the PTB message, and uses a method that from time to time increases the packet size in an attempt to discover an increase in the supported PMTU. The packets sent with a size larger than the current effective PMTU are known as probe packets.

Packets not intended as probe packets are either fragmented to the current effective PMTU, or the attempt to send fails with an error code. Applications are sometimes provided with a primitive to let them read the Maximum Packet Size (MPS), derived from the current effective PMTU.

Classical PMTUD is subject to protocol failures. One failure arises when traffic using a packet size larger than the actual PMTU is black-holed (all datagrams sent with this size, or larger, are silently discarded without the sender receiving PTB messages). This could arise when the PTB messages are not delivered back to the sender for some reason (see for example [RFC2923]).

Examples where PTB messages are not delivered include:

   o  The generation of ICMP messages is usually rate limited. This may result in no PTB messages being sent to the sender (see section 2.4 of [RFC4443]).

   o  ICMP messages are increasingly filtered by middleboxes (including firewalls) [RFC4890]. A stateful firewall could be configured with a policy to block incoming ICMP messages, which would prevent reception of PTB messages by endpoints behind this firewall.

   o  When the router issuing the ICMP message drops a tunneled packet, the resulting ICMP message will be directed to the tunnel ingress.
This tunnel endpoint is responsible for forwarding the ICMP message and also processing the quoted packet within the payload field to remove the effect of the tunnel, and return a correctly formatted ICMP message to the sender [I-D.ietf-intarea-tunnels]. Failure to do this results in black-holing.

   o  Asymmetry in forwarding can result in there being no route back to the original sender, which would prevent an ICMP message being delivered to the sender. This can also be an issue when policy-based routing is used, Equal Cost Multipath (ECMP) routing is used, or a middlebox acts as an application load balancer. An example is where the path towards the server is chosen by ECMP routing depending on bytes in the IP payload. In this case, when a packet sent by the server encounters a problem after the ECMP router, then any resulting ICMP message needs to also be directed by the ECMP router towards the same server (i.e., ICMP messages need to follow the same path as the flows to which they correspond). Failure to do this results in black-holing.

   o  There are cases where the next hop destination fails to receive a packet because of its size. This could be due to misconfiguration of the layer 2 path between nodes, for instance the MTU configured in a layer 2 switch, or misconfiguration of the Maximum Receive Unit (MRU). If the packet is dropped by the link, this will not cause a PTB message to be sent, and results in consequent black-holing.

Another failure could result if a node that is not on the network path sends a PTB message that attempts to force the sender to change the effective PMTU [RFC8201]. A sender can protect itself from reacting to such messages by utilising the quoted packet within a PTB message payload to validate that the received PTB message was generated in response to a packet that had actually originated from the sender. However, there are situations where a sender would be unable to provide this validation.
Examples where validation of the PTB message is not possible include:

   o  When a router issuing the ICMP message implements RFC792 [RFC0792], it is only required to include the first 64 bits of the IP payload of the packet within the quoted payload. This may be insufficient to perform the tunnel processing described in the previous bullet. There could be insufficient bytes remaining for the sender to interpret the quoted transport information. The recommendation in RFC1812 [RFC1812] is that IPv4 routers return a quoted packet with as much of the original datagram as possible without the length of the ICMP datagram exceeding 576 bytes. (IPv6 routers include as much of the invoking packet as possible without the ICMPv6 packet exceeding 1280 bytes [RFC4443].)

   o  The use of tunnels/encryption can reduce the size of the quoted packet returned to the original source address, increasing the risk that there could be insufficient bytes remaining for the sender to interpret the quoted transport information.

   o  Even when the PTB message includes sufficient bytes of the quoted packet, the network layer could lack sufficient context to validate the message, because validation depends on information about the active transport flows at an endpoint node (e.g., the socket/address pairs being used, and other protocol header information).

   o  When a packet is encapsulated/tunneled over an encrypted transport, the tunnel/encapsulation ingress might have insufficient context, or computational power, to reconstruct the transport header that would be needed to perform validation.

1.2. Packetization Layer Path MTU Discovery

The term Packetization Layer (PL) has been introduced to describe the layer that is responsible for placing data blocks into the payload of IP packets and selecting an appropriate MPS. This function is often performed by a transport protocol, but can also be performed by other encapsulation methods working above the transport layer.
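As an illustration of how a PL might select an MPS, the following sketch subtracts lower-layer header sizes from a packet size estimate. It is not part of this specification. The function name and parameters are hypothetical, and the header sizes assume an IPv4 header without options, an IPv6 header without extension headers, and UDP as the only PL overhead.

```python
# Illustrative sketch only; names and defaults are assumptions, not part of
# the specification.
IPV4_HEADER = 20  # bytes, IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header without extension headers
UDP_HEADER = 8    # bytes


def mps_from_plpmtu(plpmtu: int, ipv6: bool = False,
                    pl_overhead: int = UDP_HEADER) -> int:
    """Return the largest application data block for a given packet size.

    plpmtu      -- current packet size estimate (IP header plus IP payload)
    ipv6        -- True when the path uses IPv6
    pl_overhead -- bytes of headers added by the PL (UDP assumed by default)
    """
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    mps = plpmtu - ip_header - pl_overhead
    if mps <= 0:
        raise ValueError("packet size too small for the assumed headers")
    return mps
```

For example, under these assumptions a packet size of 1500 bytes over IPv4 with UDP leaves 1472 bytes for the application data block. A PL that adds its own headers, or a path that uses IP options or extension headers, would need a larger overhead value.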
In contrast to PMTUD, Packetization Layer Path MTU Discovery (PLPMTUD) [RFC4821] does not rely upon reception and validation of PTB messages. It is therefore more robust than Classical PMTUD. This has become the recommended approach for implementing PMTU discovery with TCP.

It uses a general strategy where the PL sends probe packets to search for the largest size of unfragmented datagram that can be sent over a network path. The probe packets are sent with a progressively larger packet size. If a probe packet is successfully delivered (as determined by the PL), then the PLPMTU is raised to the size of the successful probe. If no response is received to a probe packet, the method reduces the probe size. This PLPMTU is used to set the application MPS.

PLPMTUD introduces flexibility in the implementation of PMTU discovery. At one extreme, it can be configured to only perform PTB black hole detection and recovery to increase the robustness of Classical PMTUD, or at the other extreme, all PTB processing can be disabled and PLPMTUD can completely replace Classical PMTUD.

PLPMTUD can also include additional consistency checks without increasing the risk of black-holing. For instance, the information available at the PL, or higher layers, makes PTB message validation more straightforward.

1.3. Path MTU Discovery for Datagram Services

Section 5 of this document presents a set of algorithms for datagram protocols to discover the largest size of unfragmented datagram that can be sent over a network path. The method described relies on features of the PL described in Section 3 and applies to transport protocols operating over IPv4 and IPv6. It does not require cooperation from the lower layers, although it can utilise PTB messages when these received messages are made available to the PL.
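The general probing strategy outlined in Section 1.2 (raise the PLPMTU after a confirmed probe, reduce the probe size after an unconfirmed one) can be sketched as follows. This is only a minimal illustration and is not the algorithm of Section 5. The callback probe_ok and the parameters base, max_pmtu and step are hypothetical. The sketch assumes that the base size already works, and it treats a single unconfirmed probe as a size limit, whereas a practical method also has to tolerate probes lost for other reasons such as congestion.

```python
from typing import Callable


def search_plpmtu(probe_ok: Callable[[int], bool], base: int,
                  max_pmtu: int, step: int = 64) -> int:
    """Return the largest packet size confirmed by probing.

    probe_ok(size) is a hypothetical callback that sends one probe packet of
    the given size and returns True when the PL confirms its delivery.
    """
    plpmtu = base            # largest size confirmed so far (assumed to work)
    too_big = max_pmtu + 1   # smallest size known (or assumed) not to work
    failed = False           # becomes True after the first unconfirmed probe

    while too_big - plpmtu > 1:
        if failed:
            # Reduce the probe size: try halfway between the bounds.
            probe = (plpmtu + too_big) // 2
        else:
            # Probe with a progressively larger size, capped at max_pmtu.
            probe = min(plpmtu + step, max_pmtu)

        if probe_ok(probe):
            plpmtu = probe   # raise the PLPMTU to the confirmed size
        else:
            too_big = probe
            failed = True

    return plpmtu
```

A simple way to exercise the sketch is to simulate a path with a fixed limit, for example search_plpmtu(lambda size: size <= 1400, base=1200, max_pmtu=1500). Choosing a linear step followed by bisection is one possible design; other probe size selection strategies are equally valid.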
The UDP Usage Guidelines [RFC8085] state "an application SHOULD either use the Path MTU information provided by the IP layer or implement Path MTU Discovery (PMTUD)", but do not provide a mechanism for discovering the largest size of unfragmented datagram that can be used on a network path. Prior to this document, PLPMTUD had not been specified for UDP.

Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the Stream Control Transmission Protocol (SCTP). SCTP utilises probe packets consisting of a minimal sized HEARTBEAT chunk bundled with a PAD chunk as defined in [RFC4820], but RFC4821 does not provide a complete specification. The present document provides the details to complete that specification.

The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires implementations to support Classical PMTUD and states that a DCCP sender "MUST maintain the MPS allowed for each active DCCP session". It also defines the current congestion control MPS (CCMPS) supported by a network path. It recommends use of PMTUD, and suggests use of control packets (DCCP-Sync) as path probe packets, because they do not risk application data loss. The method defined in this specification could be used with DCCP.

Section 6 specifies the method for a set of transports, and provides information to enable the implementation of PLPMTUD with other datagram transports and applications that use datagram transports.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Other terminology is directly copied from [RFC4821], and the definitions in [RFC1122].

Actual PMTU: The Actual PMTU is the PMTU of a network path between a sender PL and a destination PL, which the DPLPMTUD algorithm seeks to determine.
Black Holed: Packets are Black Holed when the sender is unaware that packets are not delivered to the destination endpoint (e.g., when the sender transmits packets of a particular size with a previously known effective PMTU and they are silently discarded by the network, but the sender is not made aware by ICMP messages of a change to the path that resulted in a smaller PLPMTU).

Classical Path MTU Discovery: Classical PMTUD is a process described in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to learn the largest size of unfragmented datagram that can be used across a network path.

Datagram: A datagram is a transport-layer protocol data unit, transmitted in the payload of an IP packet.

Effective PMTU: The Effective PMTU is the current estimated value for PMTU that is used by PMTUD. This is equivalent to the PLPMTU derived by PLPMTUD.

EMTU_S: The Effective MTU for sending (EMTU_S) is defined in [RFC1122] as "the maximum IP datagram size that may be sent, for a particular combination of IP source and destination addresses...".

EMTU_R: The Effective MTU for receiving (EMTU_R, "Effective MTU to receive") is designated in [RFC1122] as the largest datagram size that can be reassembled.

Link: A Link is a communication facility or medium over which nodes can communicate at the link layer, i.e., a layer below the IP layer. Examples are Ethernet LANs and Internet (or higher) layer tunnels.

Link MTU: The Link Maximum Transmission Unit (MTU) is the size in bytes of the largest IP packet, including the IP header and payload, that can be transmitted over a link. Note that this could more properly be called the IP MTU, to be consistent with how other standards organizations use the acronym. This includes the IP header, but excludes link layer headers and other framing that is not part of IP or the IP payload. Other standards organizations generally define the link MTU to include the link layer headers.
MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU that DPLPMTUD will attempt to use. MPS: The Maximum Packet Size (MPS) is the largest size of application data block that can be sent across a network path. In DPLPMTUD this quantity is derived from the PLPMTU by taking into consideration the size of the lower protocol layer headers. MIN_PMTU: The MIN_PMTU is the smallest size of PLPMTU that DPLPMTUD will attempt to use. Packet: A Packet is the IP header plus the IP payload. Packetization Layer (PL): The Packetization Layer (PL) is the layer of the network stack that places data into packets and performs transport protocol functions. Path: The Path is the set of links and routers traversed by a packet between a source node and a destination node by a particular flow. Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU of all the links forming a network path between a source node and a destination node. PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB message that indicates next hop link MTU of a router along the path. PLPMTU: The Packetization Layer PMTU is an estimate of the actual PMTU provided by the DPLPMTUD algorithm. PLPMTUD: Packetization Layer Path MTU Discovery (PLPMTUD), the method described in this document for datagram PLs, which is an extension to Classical PMTU Discovery. Probe packet: A probe packet is a datagram sent with a purposely chosen size (typically the current PLPMTU or larger) to detect if packets of this size can be successfully sent end-to-end across the network path. 3. Features Required to Provide Datagram PLPMTUD TCP PLPMTUD has been defined using standard TCP protocol mechanisms. All of the requirements in [RFC4821] also apply to the use of the technique with a datagram PL. Unlike TCP, some datagram PLs require additional mechanisms to implement PLPMTUD. There are eight requirements for performing the datagram PLPMTUD method described in this specification: 1. 
PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide information about the maximum size of packet that can be transmitted by the sender on the local link (the local Link MTU). It MAY utilize similar information about the receiver when this is supplied (note this could be less than EMTU_R). This avoids implementations trying to send probe packets that can not be transmitted by the local link. Too high of a value could reduce the efficiency of the search algorithm. Some applications also have a maximum transport protocol data unit (PDU) size, in which case there is no benefit from probing for a size larger than this (unless a transport allows multiplexing multiple applications PDUs into the same datagram). 2. PLPMTU: A datagram application using a transport layer not supporting fragmentationmessage isREQUIRED to be ableapplied tochooseboth IPv4 ICMP Unreachable messages (type 3) that carry thesize of datagrams sent toerror Fragmentation Needed (Type 3, Code 4) [RFC0792] and ICMPv6 packet too big messages (Type 2) [RFC4443]. When a sender receives a PTB message, it reduces thenetwork, upeffective MTU to thePLPMTU, or a smallervalue(suchreported as theMPS) derived from this. This value is managed byLink MTU in theDPLPMTUD method. The PLPMTU (specified asPTB message, and a method that from time-to-time increases theeffective PMTUpacket size inSection 1 of [RFC1191]) is equivalentattempt tothe EMTU_S (specifieddiscover an increase in[RFC1122]). 3. Probe packets: On request, a DPLPMTUD sender is REQUIRED to be able to transmit a packet larger thanthePLMPMTU. This is used to send a probe packet. In IPv4, a probe packet MUST besupported PMTU. The packets sent withthe Don't Fragment (DF) bit set in the IP header, and without network layer endpoint fragmentation. In IPv6,aprobe packet is always sent without source fragmentation (as specified in section 5.4 of [RFC8201]). 4. 
Processing PTB messages: A DPLPMTUD sender MAY optionally utilize PTB messages received fromsize larger than thenetwork layer to help identify when a network path doescurrent effective PMTU are known as probe packets. Packets notsupport the current size ofintended as probepacket. Any received PTB message MUST be validated before it is usedpackets are either fragmented toupdatethePLPMTU discovery information [RFC8201]. This validation confirms thatcurrent effective PMTU, or thePTB message was sent in responseattempt to send fails with an error code. Applications are sometimes provided with apacket originating by the sender, and needsprimitive tobe performed beforelet them read thePLPMTU discovery method reactsMaximum Packet Size (MPS), derived from the current effective PMTU. Classical PMTUD is subject to protocol failures. One failure arises when traffic using a packet size larger than the actual PMTU is black-holed (all datagrams sent with this size, or larger, are discarded). This could arise when the PTBmessage. A PTB message MUST NOT be usedmessages are not delivered back toincreasethePLPMTU [RFC8201]. 5. Reception feedback:sender for some reason (see for example [RFC2923]). Examples where PTB messages are not delivered include: * Thedestination PL endpointgeneration of ICMP messages isREQUIRED to provide a feedback method that indicatesusually rate limited. This could result in no PTB messages being generated to theDPLPMTUDsenderwhen a probe packet has been received(see section 2.4 of [RFC4443]) * ICMP messages can be filtered bythe destination PL endpoint. The mechanism needs tomiddleboxes (including firewalls) [RFC4890]. A stateful firewall could berobustconfigured with a policy to block incoming ICMP messages, which would prevent reception of PTB messages to a sending endpoint behind this firewall. * When thepossibility that packets could be significantly delayed alongrouter issuing the ICMP message drops anetwork path. 
The local PL endpoint attunneled packet, thesending node is REQUIRED to pass this feedbackresulting ICMP message will be directed to thesender-side DPLPMTUD method. 6. Probe loss recovery: Ittunnel ingress. This tunnel endpoint isRECOMMENDED to use probe packets that do not carry any user data. Most datagram transports permit this. If a proberesponsible for forwarding the ICMP message and also processing the quoted packetcontains user data requiring retransmission in case of loss,within thePL (or layers above) are REQUIREDpayload field toarrange any retransmission/repairremove the effect ofany resulting loss. DPLPMTUD is REQUIREDthe tunnel, and return a correctly formatted ICMP message tobe robustthe sender [I-D.ietf-intarea-tunnels]. Failure to do this prevents the PTB message reaching the original sender. * Asymmetry in forwarding can result in there being no return route to thecase where probe packets are lost dueoriginal sender, which would prevent an ICMP message being delivered toother reasons (including link transmission error, congestion). 7. Probing and congestion control: The DPLPMTUD sender treats isolated loss of a probe packet (withthe sender. This issue can also arise when policy- based routing is used, Equal Cost Multipath (ECMP) routing is used, orwithoutacorresponding PTB message)middlebox acts asa potential indication of a PMTU limit foran application load balancer. An example is where thepath. Loss ofpath towards the server is chosen by ECMP routing depending on bytes in the IP payload. In this case, when aprobepacketSHOULD NOT be treated as an indication of congestion andsent by theloss SHOULD NOT directly triggerserver encounters acongestion control reaction [RFC4821]. 8. Shared PLPMTU state: The PLPMTU value couldproblem after the ECMP router, then any resulting ICMP message needs to also bestored withdirected by thecorresponding entry inECMP router towards thedestination cache and used by other PL instances. 
The specification of PLPMTUD [RFC4821] states: "If PLPMTUD updatesoriginal sender. * There are additional cases where theMTU fornext hop destination fails to receive aparticular path, all Packetization Layer sessions that share the path representation (as described in Section 5.2packet because of[RFC4821]) SHOULDits size. This could benotifieddue tomake usemisconfiguration of thenew MTU". Such methods MUST be robust tolayer 2 path between nodes, for instance thewide variety of underlying network forwarding behaviours, PLPMTU adjustments based on shared PLPMTU values should be incorporatedMTU configured inthe search algorithms. Section 5.2a layer 2 switch, or misconfiguration of[RFC8201] provides guidance onthecaching of PMTU information and alsoMaximum Receive Unit (MRU). If therelation to IPv6 flow labels. In addition,packet is dropped by thefollowing principles are stated for design oflink, this will not cause aDPLPMTUD method: o MPS: A method is REQUIREDPTB message tosignal an appropriate MPSbe sent to thehigher layer using the PL. The value of the MPS can change followingoriginal sender. Another failure could result if achange to the path. It is RECOMMENDEDnode thatmethods avoid forcing an application to use an arbitrary small MPS (PLPMTU) for transmission while the methodissearching for the currently supported PLPMTU. Datagram PLs donotnecessarily support fragmentation of PDUs larger thanon the network path sends a PTB message that attempts to force a sender to change thePLPMTU.effective PMTU [RFC8201]. Areduced MPSsender canadversely impactprotect itself from reacting to such messages by utilising theperformance ofquoted packet within adatagram application. 
o Path validation: It is RECOMMENDED that methods are robustPTB message payload topath changesvalidate thatcould have occurred sincethepath characteristics were last confirmed, andreceived PTB message was generated in response to a packet that had actually originated from thepossibility of inconsistent path information being received. o Datagram reordering: A method is REQUIRED tosender. However, there are situations where a sender would berobustunable to provide this validation. Examples where validation of thepossibility thatPTB message is not possible include: * When aflow encounters reordering, orrouter issuing thetraffic (including probe packets)ICMP message implements RFC792 [RFC0792], it isdivided over more than one network path. o Whenonly required toprobe: It is RECOMMENDED that methods determine whetherinclude thepath capacity has increased since it last measuredfirst 64 bits of thepath. This determines whenIP payload of thepath should again be probed. 4. DPLPMTUD Mechanisms This section listspacket within theprotocol mechanisms used in this specification. 4.1. PLPMTU Probe Packets The DPLPMTUD method relies uponquoted payload. There could be insufficient bytes remaining for thePLsenderbeing able to generate probe packets with a specific size. TCP is able to generate these probe packets by choosing to appropriately segment data being sent [RFC4821]. In contrast, a datagram PL that needs to construct a probe packet has to either request an applicationtosend a data block thatinterpret the quoted transport information. Note: The recommendation in RFC1812 [RFC1812] islarger thanthatgenerated by an application, or to utilise padding functions to extendIPv4 routers return a quoted packet with as much of the original datagrambeyondas possible without thesizelength of theapplication data block. Protocols that permit exchangeICMP datagram exceeding 576 bytes. 
   IPv6 routers include as much of the invoking packet as possible without the ICMPv6 packet exceeding 1280 bytes [RFC4443].

*  The use of tunnels/encryption can reduce the size of the quoted packet returned to the original source address, increasing the risk that there could be insufficient bytes remaining for the sender to interpret the quoted transport information.

*  Even when the PTB message includes sufficient bytes of the quoted packet, the network layer could lack sufficient context to validate the message, because validation depends on information about the active transport flows at an endpoint node (e.g., the socket/address pairs being used, and other protocol header information).

*  When a packet is encapsulated/tunneled over an encrypted transport, the tunnel/encapsulation ingress might have insufficient context, or computational power, to reconstruct the transport header that would be needed to perform validation.

1.2.  Packetization Layer Path MTU Discovery

The term Packetization Layer (PL) has been introduced to describe the layer that is responsible for placing data blocks into the payload of IP packets and selecting an appropriate MPS. This function is often performed by a transport protocol, but can also be performed by other encapsulation methods working above the transport layer.

In contrast to PMTUD, Packetization Layer Path MTU Discovery (PLPMTUD) [RFC4821] does not rely upon reception and validation of PTB messages. It is therefore more robust than Classical PMTUD. This has become the recommended approach for implementing PMTU discovery with TCP.

It uses a general strategy where the PL sends probe packets to search for the largest size of unfragmented datagram that can be sent over a network path. Probe packets are sent with a progressively larger packet size. If a probe packet is successfully delivered (as determined by the PL), then the PLPMTU is raised to the size of the successful probe. If no response is received to a probe packet, the method reduces the probe size. The result of probing with the PLPMTU is used to set the application MPS.

PLPMTUD introduces flexibility in the implementation of PMTU discovery.
At one extreme, it can be configured to only perform ICMP Black Hole Detection and recovery to increase the robustness of Classical PMTUD, or at the other extreme, all PTB processing can be disabled and PLPMTUD can completely replace Classical PMTUD.

PLPMTUD can also include additional consistency checks without increasing the risk that data is lost when probing to discover the path MTU. For example, information available at the PL, or higher layers, enables received PTB messages to be validated before being utilized.

1.3.  Path MTU Discovery for Datagram Services

Section 5 of this document presents a set of algorithms for datagram protocols to discover the largest size of unfragmented datagram that can be sent over a network path. The method described relies on features of the PL described in Section 3 and applies to transport protocols operating over IPv4 and IPv6. It does not require cooperation from the lower layers, although it can utilize PTB messages when these received messages are made available to the PL.

The UDP Usage Guidelines [RFC8085] state "an application SHOULD either use the Path MTU information provided by the IP layer or implement Path MTU Discovery (PMTUD)", but do not provide a mechanism for discovering the largest size of unfragmented datagram that can be used on a network path. Prior to this document, PLPMTUD had not been specified for UDP.

Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the Stream Control Transmission Protocol (SCTP). SCTP utilizes probe packets consisting of a minimal sized HEARTBEAT chunk bundled with a PAD chunk as defined in [RFC4820], but RFC4821 does not provide a complete specification. The present document provides the details to complete that specification.

The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires implementations to support Classical PMTUD and states that a DCCP sender "MUST maintain the MPS allowed for each active DCCP session". It also defines the current congestion control MPS (CCMPS) supported by a network path.
This recommends use of PMTUD, and suggests use of control packets (DCCP-Sync) as path probe packets, because they do not risk application data loss. The method defined in this specification could be used with DCCP.

Section 6 specifies the method for a set of transports, and provides information to enable the implementation of PLPMTUD with other datagram transports and applications that use datagram transports.

2.  Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Other terminology is directly copied from [RFC4821], and the definitions in [RFC1122].

Actual PMTU:  The Actual PMTU is the PMTU of a network path between a sender PL and a destination PL, which the DPLPMTUD algorithm seeks to determine.

Black Hole:  A Black Hole is encountered when a sender is unaware that packets are not being delivered to the destination end point.
Two types of Black Hole are relevant to DPLPMTUD:

   Packet Black Hole:  Packets encounter a Packet Black Hole when packets are not delivered to the destination endpoint (e.g., when the sender transmits packets of a particular size with a previously known effective PMTU and they are discarded by the network).

   ICMP Black Hole:  An ICMP Black Hole is encountered when the sender is unaware that packets are not delivered to the destination endpoint because PTB messages are not received by the originating PL sender.

Black holed:  Traffic is black-holed when the sender is unaware that packets are not being delivered. This could be due to a Packet Black Hole or an ICMP Black Hole.

Classical Path MTU Discovery:  Classical PMTUD is a process described in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to learn the largest size of unfragmented datagram that can be used across a network path.

Datagram:  A datagram is a transport-layer protocol data unit, transmitted in the payload of an IP packet.

Effective PMTU:  The Effective PMTU is the current estimated value for PMTU that is used by PMTUD. This is equivalent to the PLPMTU derived by PLPMTUD.
EMTU_S:  The Effective MTU for sending (EMTU_S) is defined in [RFC1122] as "the maximum IP datagram size that may be sent, for a particular combination of IP source and destination addresses...".

EMTU_R:  The Effective MTU for receiving (EMTU_R) is designated in [RFC1122] as the largest datagram size that can be reassembled.

Link:  A Link is a communication facility or medium over which nodes can communicate at the link layer, i.e., a layer below the IP layer. Examples are Ethernet LANs and Internet (or higher) layer tunnels.

Link MTU:  The Link Maximum Transmission Unit (MTU) is the size in bytes of the largest IP packet, including the IP header and payload, that can be transmitted over a link. Note that this could more properly be called the IP MTU, to be consistent with how other standards organizations use the acronym. This includes the IP header, but excludes link layer headers and other framing that is not part of IP or the IP payload. Other standards organizations generally define the link MTU to include the link layer headers.

MAX_PMTU:  The MAX_PMTU is the largest size of PLPMTU that DPLPMTUD will attempt to use.
MPS:  The Maximum Packet Size (MPS) is the largest size of application data block that can be sent across a network path by a PL. In DPLPMTUD this quantity is derived from the PLPMTU by taking into consideration the size of the lower protocol layer headers. Probe packets generated by DPLPMTUD can have a size larger than the MPS.

MIN_PMTU:  The MIN_PMTU is the smallest size of PLPMTU that DPLPMTUD will attempt to use.

Packet:  A Packet is the IP header plus the IP payload.

Packetization Layer (PL):  The Packetization Layer (PL) is the layer of the network stack that places data into packets and performs transport protocol functions.

Path:  The Path is the set of links and routers traversed by a packet between a source node and a destination node by a particular flow.

Path MTU (PMTU):  The Path MTU (PMTU) is the minimum of the Link MTU of all the links forming a network path between a source node and a destination node.
PTB_SIZE:  The PTB_SIZE is a value reported in a validated PTB message that indicates the next hop link MTU of a router along the path.

PLPMTU:  The Packetization Layer PMTU is an estimate of the actual PMTU provided by the DPLPMTUD algorithm.

PLPMTUD:  Packetization Layer Path MTU Discovery (PLPMTUD), the method described in this document for datagram PLs, which is an extension to Classical PMTU Discovery.

Probe packet:  A probe packet is a datagram sent with a purposely chosen size (typically the current PLPMTU or larger) to detect if packets of this size can be successfully sent end-to-end across the network path.

3.  Features Required to Provide Datagram PLPMTUD

TCP PLPMTUD has been defined using standard TCP protocol mechanisms. All of the requirements in [RFC4821] also apply to the use of the technique with a datagram PL. Unlike TCP, some datagram PLs require additional mechanisms to implement PLPMTUD.

There are eight requirements for performing the datagram PLPMTUD method described in this specification:

1.  PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide information about the maximum size of packet that can be transmitted by the sender on the local link (the local Link MTU). It MAY utilize similar information about the receiver when this is supplied (note this could be less than EMTU_R). This avoids implementations trying to send probe packets that can not be transmitted by the local link. Too high a value could reduce the efficiency of the search algorithm. Some applications also have a maximum transport protocol data unit (PDU) size, in which case there is no benefit from probing for a size larger than this (unless a transport allows multiplexing multiple application PDUs into the same datagram).

2.  PLPMTU: A datagram application using a PL not supporting fragmentation is REQUIRED to be able to choose the size of datagrams sent to the network, up to the PLPMTU, or a smaller value (such as the MPS) derived from this. This value is managed by the DPLPMTUD method. The PLPMTU (specified as the effective PMTU in Section 1 of [RFC1191]) is equivalent to the EMTU_S (specified in [RFC1122]).

3.  Probe packets: On request, a DPLPMTUD sender is REQUIRED to be able to transmit a packet larger than the PLPMTU. This is used to send a probe packet. In IPv4, a probe packet MUST be sent with the Don't Fragment (DF) bit set in the IP header, and without network layer endpoint fragmentation. In IPv6, a probe packet is always sent without source fragmentation (as specified in section 5.4 of [RFC8201]).

4.  Processing PTB messages: A DPLPMTUD sender MAY optionally utilize PTB messages received from the network layer to help identify when a network path does not support the current size of probe packet. Any received PTB message MUST be validated before it is used to update the PLPMTU discovery information [RFC8201]. This validation confirms that the PTB message was sent in response to a packet originating from the sender, and needs to be performed before the PLPMTU discovery method reacts to the PTB message. A PTB message MUST NOT be used to increase the PLPMTU [RFC8201].

5.  Reception feedback: The destination PL endpoint is REQUIRED to provide a feedback method that indicates to the DPLPMTUD sender when a probe packet has been received by the destination PL endpoint. The mechanism needs to be robust to the possibility that packets could be significantly delayed along a network path. The local PL endpoint at the sending node is REQUIRED to pass this feedback to the sender DPLPMTUD method.

6.  Probe loss recovery: It is RECOMMENDED to use probe packets that do not carry any user data. Most datagram transports permit this. If a probe packet contains user data requiring retransmission in case of loss, the PL (or layers above) are REQUIRED to arrange any retransmission/repair of any resulting loss. DPLPMTUD is REQUIRED to be robust in the case where probe packets are lost due to other reasons (including link transmission error, congestion).

7.  Probing and congestion control: The DPLPMTUD sender treats isolated loss of a probe packet (with or without a corresponding PTB message) as a potential indication of a PMTU limit for the path. Loss of a probe packet SHOULD NOT be treated as an indication of congestion and the loss SHOULD NOT directly trigger a congestion control reaction [RFC4821].

8.  Shared PLPMTU state: The PLPMTU value could also be stored with the corresponding entry in
    the destination cache and used by other PL instances. The specification of PLPMTUD [RFC4821] states: "If PLPMTUD updates the MTU for a particular path, all Packetization Layer sessions that share the path representation (as described in Section 5.2 of [RFC4821]) SHOULD be notified to make use of the new MTU". Such methods MUST be robust to the wide variety of underlying network forwarding behaviors. PLPMTU adjustments based on shared PLPMTU values should be incorporated in the search algorithms. Section 5.2 of [RFC8201] provides guidance on the caching of PMTU information and also the relation to IPv6 flow labels.

In addition, the following principles are stated for design of a DPLPMTUD method:

*  MPS: A method is REQUIRED to signal an appropriate MPS to the higher layer using the PL. The value of the MPS can change following a change to the path. It is RECOMMENDED that methods avoid forcing an application to use an arbitrary small MPS (PLPMTU) for transmission while the method is searching for the currently supported PLPMTU. Datagram PLs do not necessarily support fragmentation of PDUs larger than the PLPMTU.
   A reduced MPS can adversely impact the performance of a datagram application.

*  Path validation: It is RECOMMENDED that methods are robust to path changes that could have occurred since the path characteristics were last confirmed, and to the possibility of inconsistent path information being received.

*  Datagram reordering: A method is REQUIRED to be robust to the possibility that a flow encounters reordering, or the traffic (including probe packets) is divided over more than one network path.

*  When to probe: It is RECOMMENDED that methods determine whether the path has changed since it last measured the path. This can help determine when to probe the path again.

4.  DPLPMTUD Mechanisms

This section lists the protocol mechanisms used in this specification.

4.1.  PLPMTU Probe Packets

The DPLPMTUD method relies upon the PL sender being able to generate probe packets with a specific size. TCP is able to generate these probe packets by choosing to appropriately segment data being sent [RFC4821]. In contrast, a datagram
PL that needs to construct a probe packet has to either request an application to send a data block that is larger than that generated by an application, or to utilize padding functions to extend a datagram beyond the size of the application data block. Protocols that permit exchange of control messages (without an application data block) could alternatively prefer to generate a probe packet by extending a control message with padding data.

A receiver needs to be able to distinguish an in-band data block from any added padding. This is needed to ensure that any added padding is not passed on to an application at the receiver.

This results in three possible ways that a sender can create a probe packet, listed in order of preference:

Probing using padding data:  A probe packet that contains only control information together with any padding, which is needed to inflate the packet to the size required for the probe packet.
   Since these probe packets do not carry an application-supplied data block, they do not typically require retransmission, although they do still consume network capacity and incur endpoint processing.

Probing using application data and padding data:  A probe packet that contains a data block supplied by an application that is combined with padding to inflate the length of the datagram to the size required for the probe packet. If the application/transport needs protection from the loss of this probe packet, the application/transport could perform transport-layer retransmission/repair of the data block (e.g., by retransmission after loss is detected or by duplicating the data block in a datagram without the padding data).

Probing using application data:  A probe packet that contains a data block supplied by an application that matches the size required for the probe packet. This method requests the application to issue a data block of the desired probe size.
   If the application/transport needs protection from the loss of an unsuccessful probe packet, the application/transport needs then to perform transport-layer retransmission/repair of the data block (e.g., by retransmission after loss is detected).

A PL that uses a probe packet carrying an application data block could need to retransmit this application data block if the probe fails. This could need the PL to re-fragment the data block to a smaller packet size that is expected to traverse the end-to-end path (which could utilize endpoint network-layer or PL fragmentation when these are available).

DPLPMTUD MAY choose to use only one of these methods to simplify the implementation.

Probe messages sent by a PL MUST contain enough information to uniquely identify the probe within Maximum Segment Lifetime, while being robust to reordering and replay of probe response and PTB messages.

4.2.  Confirmation of Probed Packet Size

The PL needs a method to determine (confirm) when probe packets have been successfully received end-to-end across a network path.

Transport protocols can include end-to-end methods that detect and report reception of
specific datagrams that they send (e.g., DCCP and SCTP provide keep-alive/heartbeat features). When supported, this mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of a probe packet.

A PL that does not acknowledge data reception (e.g., UDP and UDP-Lite) is unable itself to detect when the packets that it sends are discarded because their size is greater than the actual PMTU. These PLs need to either rely on an application protocol to detect this loss, or make use of an additional transport method such as UDP-Options [I-D.ietf-tsvwg-udp-options].

Section 6 specifies this function for a set of IETF-specified protocols.

4.3.  Detection of Unsupported PLPMTU Size, aka Black Hole Detection

A PL sender needs to reduce the PLPMTU when it discovers the actual PMTU supported by a network path is less than the PLPMTU. This can be triggered when a validated PTB message is received, or
by another event that indicates the network path no longer sustains the current packet size, such as a loss report from the PL, or repeated lack of response to probe packets sent to confirm the PLPMTU. Detection is followed by a reduction of the PLPMTU.

This is performed by sending packet probes of size PLPMTU to verify that a network path still supports the last acknowledged PLPMTU size. There are two alternative mechanisms:

* A PL can rely upon a mechanism implemented within the
PL to detect excessive loss of data sent with a specific packet size and then conclude that this excessive loss could be a result of an invalid PMTU (as in PLPMTUD for TCP [RFC4821]).

* A PL can use the DPLPMTUD probing mechanism to periodically generate probe packets of the size of the current PLPMTU (e.g., using the confirmation timer, Section 5.1.1). A timer tracks whether acknowledgments are received. Successive loss of probes is an indication that the current path no longer supports the PLPMTU (e.g., when the number of probe packets sent without receiving an acknowledgement, PROBE_COUNT, becomes greater than MAX_PROBES).

A PL MAY inhibit sending probe packets when no application data has been sent since the previous probe packet. A PL preferring to use an up-to-date PLPMTU once user data is sent again MAY choose to continue PLPMTU discovery for each path. However, this may result in additional packets being sent.

When the method detects the current PLPMTU is not supported, DPLPMTUD sets a lower MPS. The PL then confirms that the updated PLPMTU can be successfully used across the path. The PL could need to send a probe packet with a size less than the size of the data block generated by an application.
In this case, the PL could provide a way to fragment a datagram at the PL, or use a control packet as the packet probe.

4.4. Response to PTB Messages

This method requires the DPLPMTUD sender to validate any received PTB message before using the PTB information. The response to a PTB message depends on the PTB_SIZE indicated in the PTB message, the state of the PLPMTUD state machine, and the IP protocol being used.

Section 4.4.1 first describes validation for both IPv4 ICMP Unreachable messages (type 3) and ICMPv6 packet too big messages, both of which are referred to as PTB messages in this document.

4.4.1. Validation of PTB Messages

This section specifies utilization of PTB messages.

* A simple implementation MAY ignore received PTB messages and in this case the PLPMTU is not updated when a PTB message is received.

* An implementation that supports PTB messages MUST validate messages before they are further processed.

A PL that receives a PTB message from a router or middlebox performs ICMP validation as specified in Section 5.2 of [RFC8085][RFC8201]. Because DPLPMTUD operates at the PL, the PL needs to check that each received PTB message is received in response to a packet transmitted by the endpoint PL performing DPLPMTUD.
The PL MUST check the protocol information in the quoted packet carried in an ICMP PTB message payload to validate the message originated from the sending node. This validation includes determining that the combination of the IP addresses, the protocol, the source port and destination port match those returned in the quoted packet; this is also necessary for the PTB message to be passed to the corresponding PL.

The validation SHOULD utilize information that it is not simple for an off-path attacker to determine [RFC8085]. For example, by checking the value of a protocol header field known only to the two PL endpoints. A datagram application that uses well-known source and destination ports ought to also rely on other information to complete this validation.

These checks are intended to provide protection from packets that originate from a node that is not on the network path. A PTB message that does not complete the validation MUST NOT be further utilized by the DPLPMTUD method.

PTB messages that have been validated MAY be utilized by the DPLPMTUD algorithm, but MUST NOT be used directly to set the PLPMTU. A method that utilizes these PTB messages can improve the speed at which the algorithm detects an appropriate PLPMTU, compared to one that relies solely on probing. Section 4.4.2 describes this processing.

4.4.2. Use of PTB Messages

A set of checks are intended to provide protection from a router that reports an unexpected PTB_SIZE. The PL also needs to
check that the indicated PTB_SIZE is less than the size used by probe packets and larger than the minimum size accepted.

This section provides a summary of how PTB messages can be utilized. This processing depends on the PTB_SIZE and the current value of a set of variables:

MIN_PMTU < PTB_SIZE < BASE_PMTU

* A robust PL MAY enter an error state (see Section 5.2) for an IPv4 path when the PTB_SIZE reported in the PTB message is larger than or equal to 68 bytes and when this is less than the BASE_PMTU.

* A robust PL MAY enter an error state (see Section 5.2) for an IPv6 path when the PTB_SIZE reported in the PTB message is larger than or equal to 1280 bytes and when this is less than the BASE_PMTU.

PTB_SIZE = PLPMTU

* Completes the search for a larger PLPMTU.

PTB_SIZE > PROBED_SIZE

* Inconsistent network signal.

* PTB message ought to be discarded without further processing (e.g., PLPMTU not modified).

* The information could be utilized as an input to trigger enabling a resilience mode.

BASE_PMTU <= PTB_SIZE < PLPMTU

* Black Hole Detection is triggered and the PLPMTU ought to be set to BASE_PMTU.

* The PL could use the PTB_SIZE reported in the PTB message to initialize a search algorithm.
PLPMTU < PTB_SIZE < PROBED_SIZE

* The PLPMTU continues to be valid, but the last PROBED_SIZE searched was larger than the actual PMTU.

* The PLPMTU is not updated.

* The PL can use the reported PTB_SIZE from the PTB message as the next search point when it resumes the search algorithm.

xxx Author Note: Do we want to specify how to handle PTB Message with PTB_SIZE <= 0? xxx

5. Datagram Packetization Layer PMTUD

This section specifies Datagram PLPMTUD (DPLPMTUD). The method can be introduced at various points (as indicated with * in the figure below) in the IP protocol stack to discover the PLPMTU so that an application can utilize an appropriate MPS for the current network path. DPLPMTUD SHOULD NOT be used by an application if it is already used in a lower layer.
+----------------------+
|     Application*     |
+--+-------+-----+---+-+
   |       |     |   |
+--+---+ +-+---+ | +-+---+
| QUIC*| |UDPO*| | |SCTP*|
+--+---+ +-+---+ | +-+---+
   |       |     |   |
   +-------+--+  |   |
              |  |   |
            +-+--+-+ |
            | UDP  | |
            +---+--+ |
                |    |
+---------------+----+-+
|  Network Interface   |
+----------------------+

Figure 1: Examples where DPLPMTUD can be implemented

The central idea of DPLPMTUD is probing by a sender. Probe packets are sent to find the maximum size
of a user message that can be completely transferred across the network path from the sender to the destination.

The following sections identify the components needed for implementation, provide an overview of the phases of operation, and specify the state machine and search algorithm.

5.1. DPLPMTUD Components

This section describes the timers, constants, and variables of DPLPMTUD.

5.1.1. Timers

The method utilizes up to three timers:

PROBE_TIMER: The PROBE_TIMER is configured to expire after a period longer than the maximum time to receive an acknowledgment to a probe packet. This value MUST NOT be smaller than 1 second, and SHOULD be larger than 15 seconds. Guidance on selection of the timer value is provided in section 3.1.1 of the UDP Usage Guidelines [RFC8085].

If the PL has a path Round Trip Time (RTT) estimate and timely acknowledgements, the PROBE_TIMER can be derived from the PL RTT estimate.
PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a sender will continue to use the current PLPMTU, after which it re-enters the Search phase. This timer has a period of 600 seconds, as recommended by PLPMTUD [RFC4821].

DPLPMTUD MAY inhibit sending probe packets when no application data has been sent since the previous probe packet. A PL preferring to use an up-to-date PMTU once user data is sent again can choose to continue PMTU discovery for each path. However, this may result in sending additional packets.

CONFIRMATION_TIMER: When an acknowledged PL is used, this timer MUST NOT be used. For other PLs, the CONFIRMATION_TIMER is configured to the period a PL sender waits before confirming the current PLPMTU is still supported. This is less than the PMTU_RAISE_TIMER and is used to decrease the PLPMTU (e.g., when a black hole is encountered). Confirmation needs to be frequent enough when data is flowing that the sending PL does not black hole extensive amounts of traffic. Guidance on selection of the timer value is provided in section 3.1.1 of the UDP Usage Guidelines [RFC8085].

DPLPMTUD MAY inhibit sending probe packets when no application data has been
sent since the previous probe packet. A PL preferring to use an up-to-date PMTU once user data is sent again can choose to continue PMTU discovery for each path. However, this may result in sending additional packets.

An implementation could implement the various timers using a single timer.

5.1.2. Constants

The following constants are defined:

MAX_PROBES: The MAX_PROBES is the maximum value of the PROBE_COUNT counter (see Section 5.1.3). The default value of MAX_PROBES is 10.

MIN_PMTU: The MIN_PMTU is the smallest allowed probe packet size. For IPv6, this value is 1280 bytes, as specified in [RFC2460]. For IPv4, the minimum value is 68 bytes.

Note: An IPv4 router is required to be able to forward a datagram of 68 bytes without further fragmentation. This is the combined size of an IPv4 header and the minimum fragment size of 8 bytes. In addition, receivers are required to be able to reassemble fragmented datagrams at least up to 576 bytes, as stated in section 3.3.3 of [RFC1122].

MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU. This has to be less than or equal to the minimum of the local MTU of the outgoing interface and the destination PMTU for receiving. An application, or PL, MAY reduce the
MAX_PMTU when there is no need to send packets larger than a specific size.

BASE_PMTU: The BASE_PMTU is a configured size expected to work for most paths. The size is equal to or larger than the MIN_PMTU and smaller than the MAX_PMTU. In the case of IPv6, this value is 1280 bytes [RFC2460]. When using IPv4, a size of 1200 bytes is RECOMMENDED.

5.1.3. Variables

This method utilizes a set of variables:

PROBED_SIZE: The PROBED_SIZE is the size of the current probe packet. This is a tentative value for the PLPMTU, which is awaiting confirmation by an acknowledgment.

PROBE_COUNT: The PROBE_COUNT is a count of the number of unsuccessful probe packets that have been sent with a size of PROBED_SIZE. The value is initialized to zero when a particular size of PROBED_SIZE is first attempted.

The figure below illustrates the relationship between the packet size constants and
variables at a point of time when the DPLPMTUD algorithm performs path probing to increase the size of the PLPMTU. A probe packet has been sent of size PROBED_SIZE. Once this is acknowledged, the PLPMTU will raise to PROBED_SIZE allowing the DPLPMTUD algorithm to further increase PROBED_SIZE towards the actual PMTU.

MIN_PMTU                                        MAX_PMTU
  <-------------------------------------------------->
        |           |           |           |
        v           |           |           v
    BASE_PMTU       |           v      Actual PMTU
                    |      PROBED_SIZE
                    v
                  PLPMTU

Figure 2: Relationships between packet size constants and variables

5.1.4. Overview of DPLPMTUD Phases

This section provides a high-level informative view of the
DPLPMTUD method, by describing the movement of the method through several phases of operation. More detail is available in the state machine (Section 5.2).

                       +------+
              +------->| Base |----------------+ Connectivity
              |        +------+                | or BASE_PMTU
              |           |                    | confirmation failed
              |           |                    v
              |           | Connectivity   +-------+
              |           | and BASE_PMTU  | Error |
              |           | confirmed      +-------+
              |           |                    | Consistent
              |           v                    | connectivity
 PLPMTU       |       +--------+               | and BASE_PMTU
 confirmation |       | Search |<--------------+ confirmed
 failed       |       +--------+
              |          ^  |
              |          |  |
              |    Raise |  | Search
              |    timer |  | algorithm
              |  expired |  | completed
              |          |  |
              |          |  v
              |   +-----------------+
              +---| Search Complete |
                  +-----------------+

Figure 3: DPLPMTUD Phases

BASE_PMTU Confirmation Phase

* The BASE_PMTU Confirmation Phase confirms connectivity to the remote peer. This phase is implicit for a connection-oriented PL (where it can be performed in a PL connection handshake). A connectionless PL needs to send an acknowledged probe packet to confirm that the remote peer is reachable.

* The sender also confirms that BASE_PMTU is supported across the network path.

* A PL that does not wish to support a path with a
PLPMTU less than BASE_PMTU can simplify the phase into a single step by performing the connectivity checks with a probe of the BASE_PMTU size.

* Once confirmed, DPLPMTUD enters the Search Phase. If this phase fails to confirm, DPLPMTUD enters the Error Phase.

Search Phase

* The Search Phase utilizes a search algorithm to send probe packets to seek to increase the PLPMTU.

* The algorithm concludes when it has found a suitable PLPMTU, by entering the Search Complete Phase.

* A PL could respond to PTB messages using the PTB to advance or terminate the search, see Section 4.4.

* Black Hole Detection can also terminate the search by entering the BASE_PMTU Confirmation phase.

Search Complete Phase

* The Search Complete Phase is entered when the PLPMTU is supported across the network path.

* A PL can use a CONFIRMATION_TIMER to periodically repeat a probe packet for the current PLPMTU size.
If the sender is unable to confirm reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL signals a lack of reachability, DPLPMTUD enters the BASE_PMTU Confirmation phase.

* The PMTU_RAISE_TIMER is used to periodically resume the search phase to discover if the PLPMTU can be raised.

* Black Hole Detection or receipt of a validated PTB message (Section 4.4.1) can cause the sender to enter the BASE_PMTU Confirmation Phase.

Error Phase

* The Error Phase is entered when there is conflicting or invalid PLPMTU information for the path (e.g., a failure to support the BASE_PMTU) that causes DPLPMTUD to be unable to progress, and the PLPMTU is lowered.

* DPLPMTUD remains in the Error Phase until a consistent view of the path can be discovered and it has also been confirmed that the path supports the BASE_PMTU (or DPLPMTUD is suspended).

* Note: MIN_PMTU may be identical to BASE_PMTU, simplifying the actions in this phase.
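
The movement between the four phases described above can be summarized as a small transition table. The following is a non-normative sketch in Python. The names Phase, Event, TRANSITIONS, and next_phase are illustrative assumptions and are not defined by this specification, and the events are a simplification of the conditions listed for each phase.

```python
from enum import Enum, auto


class Phase(Enum):
    BASE = auto()             # BASE_PMTU Confirmation Phase
    SEARCH = auto()           # Search Phase
    SEARCH_COMPLETE = auto()  # Search Complete Phase
    ERROR = auto()            # Error Phase


class Event(Enum):
    # Hypothetical event names chosen for this sketch only.
    BASE_CONFIRMED = auto()       # connectivity and BASE_PMTU confirmed
    BASE_FAILED = auto()          # connectivity or BASE_PMTU confirmation failed
    SEARCH_DONE = auto()          # search algorithm found a suitable PLPMTU
    RAISE_TIMER_EXPIRED = auto()  # PMTU_RAISE_TIMER expired
    PLPMTU_FAILED = auto()        # black hole detected, validated PTB received,
                                  # or PLPMTU confirmation failed
    PATH_CONSISTENT = auto()      # consistent view and BASE_PMTU confirmed


TRANSITIONS = {
    (Phase.BASE, Event.BASE_CONFIRMED): Phase.SEARCH,
    (Phase.BASE, Event.BASE_FAILED): Phase.ERROR,
    (Phase.SEARCH, Event.SEARCH_DONE): Phase.SEARCH_COMPLETE,
    (Phase.SEARCH, Event.PLPMTU_FAILED): Phase.BASE,
    (Phase.SEARCH_COMPLETE, Event.RAISE_TIMER_EXPIRED): Phase.SEARCH,
    (Phase.SEARCH_COMPLETE, Event.PLPMTU_FAILED): Phase.BASE,
    (Phase.ERROR, Event.PATH_CONSISTENT): Phase.SEARCH,
}


def next_phase(phase, event):
    """Return the next phase; events with no listed transition keep the phase."""
    return TRANSITIONS.get((phase, event), phase)
```

For example, next_phase(Phase.SEARCH_COMPLETE, Event.RAISE_TIMER_EXPIRED) yields Phase.SEARCH, matching the periodic resumption of the search. Suspension of DPLPMTUD and the detailed timer handling are deliberately left out of this sketch.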
An implementation that only reduces the PLPMTU to a suitable size would be sufficient to ensure reliable operation, but can be very inefficient when the actual PMTU changes or when the method (for whatever reason) makes a suboptimal choice for the PLPMTU.

A full implementation of DPLPMTUD provides an algorithm enabling the DPLPMTUD sender to increase the PLPMTU following a change in the characteristics of the path, such as when a link is reconfigured with a larger MTU, or when there is a change in the set of links traversed by an end-to-end flow (e.g., after a routing or path fail-over decision).

5.2. State Machine

A state machine for DPLPMTUD is depicted in Figure 4. If multipath or multihoming is supported, a state machine is needed for each path.

Note: Some state changes are not shown to simplify the diagram.
      |         |
      | Start   | PL indicates loss
      |         |  of connectivity
      v         v
  +---------------+                                   +---------------+
  |    DISABLED   |                                   |     ERROR     |
  +---------------+        PROBE_TIMER expiry:        +---------------+
          | PL indicates   PROBE_COUNT = MAX_PROBES or   ^         |
          | connectivity   PTB: PTB_SIZE < BASE_PMTU     |         |
          +--------------------+         +---------------+         |
                               |         |                         |
                               v         |       BASE_PMTU Probe   |
                            +---------------+          acked       |
                            |      BASE     |----------------------+
                            +---------------+                      |
    Black hole detected or    ^ |    ^  ^ Black hole detected or   |
    PTB: PTB_SIZE < PLPMTU    | |    |  | PTB: PTB_SIZE < PLPMTU   |
   +--------------------------+ |    |  +--------------------+     |
   |                            +----+                       |     |
   |                     PROBE_TIMER expiry:                 |     |
   |                   PROBE_COUNT < MAX_PROBES              |     |
   |                                                         |     |
   |                PMTU_RAISE_TIMER expiry                  |     |
   |    +-----------------------------------------------+    |     |
   |    |                                               |    |     |
   |    |                                               v    |     v
  +---------------+                                 +---------------+
  |SEARCH_COMPLETE|                                 |   SEARCHING   |
  +---------------+                                 +---------------+
   |    ^    ^                                         |    |    ^
   |    |    |                                         |    |    |
   |    |    +-----------------------------------------+    |    |
   |    |      MAX_PMTU Probe acked or PROBE_TIMER          |    |
   |    |      expiry: PROBE_COUNT = MAX_PROBES or          |    |
   +----+      PTB: PTB_SIZE = PLPMTU                       +----+
  CONFIRMATION_TIMER expiry:            PROBE_TIMER expiry:
  PROBE_COUNT < MAX_PROBES or           PROBE_COUNT < MAX_PROBES or
  PLPMTU Probe acked                    Probe acked or PTB:
                                        PLPMTU < PTB_SIZE < PROBED_SIZE

Figure 4: State machine for Datagram PLPMTUD

The following states are defined:

DISABLED: The DISABLED state is the initial state before probing has started.
It is also entered from any other state, when the PL indicates loss of connectivity. This state is left, once the PL indicates connectivity to the remote PL.

BASE: The BASE state is used to confirm that the BASE_PMTU size is supported by the network path and is designed to allow an application to continue working when there are transient reductions in the actual PMTU. It also seeks to avoid long periods where traffic is black holed while searching for a larger PLPMTU.

On entry, the PROBED_SIZE is set to the BASE_PMTU size and the PROBE_COUNT is set to zero.

Each time a probe packet is sent, the PROBE_TIMER is started. The state is exited when the probe packet is acknowledged, and the PL sender enters the SEARCHING state.

The state is also left when the PROBE_COUNT reaches MAX_PROBES or a received PTB message is validated. This causes the PL sender to enter the ERROR state.
SEARCHING: The SEARCHING state is the main probing state. This state is entered when probing for the BASE_PMTU was successful. The PROBE_COUNT is set to zero when the first probe packet is sent for each probe size. Each time a probe packet is acknowledged, the PLPMTU is set to the PROBED_SIZE, and then the PROBED_SIZE is increased using the search algorithm. When a probe packet is sent and not acknowledged within the period of the PROBE_TIMER, the PROBE_COUNT is incremented and the probe packet is retransmitted. The state is exited when the PROBE_COUNT reaches MAX_PROBES, a received PTB message is validated, a probe of size MAX_PMTU is acknowledged, or a black hole is detected.

SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates a successful end to the SEARCHING state.
DPLPMTUD remains in this state until either the PMTU_RAISE_TIMER expires, a received PTB message is validated, or a black hole is detected. When DPLPMTUD uses an unacknowledged PL and is in the SEARCH_COMPLETE state, a CONFIRMATION_TIMER periodically resets the PROBE_COUNT and schedules a probe packet with the size of the PLPMTU. If the probe packet fails to be acknowledged after MAX_PROBES attempts, the method enters the BASE state. When used with an acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to generate PLPMTU probes in this state.

ERROR: The ERROR state represents the case where either the network path is not known to support a PLPMTU of at least the BASE_PMTU size or when there is contradictory information about the network path that would otherwise result in excessive variation in the MPS signalled to the higher layer. The state implements a method to mitigate oscillation in the state-event engine. It signals a conservative value of the
MPS to the higher layer by the PL. The state is exited when packet probes no longer detect the error or when the PL indicates that connectivity has been lost.

Implementations are permitted to enable endpoint fragmentation if the DPLPMTUD is unable to validate MIN_PMTU within PROBE_COUNT probes. If DPLPMTUD is unable to validate MIN_PMTU the implementation should transition to the DISABLED state.

5.3. Search to Increase the PLPMTU

This section describes the algorithms used by DPLPMTUD to search for a larger PLPMTU.

5.3.1. Probing for a larger PLPMTU

Implementations use a search algorithm across the search range to determine whether a larger PLPMTU can be supported across a network path. The method discovers the search range by confirming the minimum PLPMTU and
then using the probe method to select a PROBED_SIZE less than or equal to MAX_PMTU. MAX_PMTU is the minimum of the local MTU and EMTU_R (learned from the remote endpoint). The MAX_PMTU MAY be reduced by an application that sets a maximum to the size of datagrams it will send.

The PROBE_COUNT is initialized to zero when a probe packet is first sent with a particular size. A timer is used by the search algorithm to trigger the sending of probe packets of size PROBED_SIZE, larger than the PLPMTU. Each probe packet successfully sent to the remote peer is confirmed by acknowledgement at the PL, see Section 4.1.

Each time a probe packet is sent to the destination, the PROBE_TIMER is started. The timer is canceled when the PL receives acknowledgment that the probe packet has been successfully sent across the path (see Section 4.1). This confirms that the PROBED_SIZE is supported, and the PROBED_SIZE value is then assigned to the PLPMTU. The search algorithm can continue to send subsequent probe packets of an increasing size.
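
The selection of MAX_PMTU and the handling of acknowledged and unacknowledged probes in the SEARCHING state can be illustrated with a minimal sketch. It is not part of the specification: the names Search, max_pmtu, start_probe, on_ack and on_probe_timer_expiry are hypothetical, the values MAX_PROBES = 3 and STEP = 32 are assumptions chosen only for illustration, and the fixed-step rule stands in for whatever search algorithm an implementation actually uses.

```python
from dataclasses import dataclass

MAX_PROBES = 3  # assumed value; the maximum number of probes is configurable
STEP = 32       # hypothetical step; the text does not fix a search algorithm


def max_pmtu(local_mtu, emtu_r, app_limit=None):
    """MAX_PMTU: minimum of the local MTU and EMTU_R, optionally
    reduced by an application-set maximum datagram size."""
    limit = min(local_mtu, emtu_r)
    return limit if app_limit is None else min(limit, app_limit)


@dataclass
class Search:
    plpmtu: int            # largest size confirmed so far
    max_pmtu: int          # upper bound of the search range
    probed_size: int = 0   # size of the probe currently outstanding
    probe_count: int = 0
    state: str = "SEARCHING"

    def start_probe(self, size):
        # PROBE_COUNT is reset when a new probe size is first sent.
        self.probed_size = min(size, self.max_pmtu)
        self.probe_count = 0

    def on_ack(self):
        # An acknowledged probe confirms PROBED_SIZE as the new PLPMTU.
        self.plpmtu = self.probed_size
        if self.plpmtu >= self.max_pmtu:
            self.state = "SEARCH_COMPLETE"
        else:
            self.start_probe(self.plpmtu + STEP)

    def on_probe_timer_expiry(self):
        # An unacknowledged probe is counted; the caller retransmits it
        # unless MAX_PROBES has been reached, which ends the search.
        self.probe_count += 1
        if self.probe_count >= MAX_PROBES:
            self.state = "SEARCH_COMPLETE"
```

In this sketch a probe that is never acknowledged leaves the PLPMTU at the last confirmed size, which matches the intent that a failed probe does not change the size offered to the higher layer.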
If the timer expires before a probe packet is acknowledged, the probe has failed to confirm the PROBED_SIZE. Each time the PROBE_TIMER expires, the PROBE_COUNT is incremented, the PROBE_TIMER is reinitialized, and a probe packet of the same size is retransmitted (the replicated probe improves the resilience to loss). The maximum number of retransmissions for a particular size is configured (MAX_PROBES). If the value of the PROBE_COUNT reaches MAX_PROBES, probing will stop, and the PL sender enters the SEARCH_COMPLETE state.

5.3.2. Selection of Probe Sizes

The search algorithm needs to determine a minimum useful gain in PLPMTU. It would not be constructive for a PL sender to attempt to probe for all sizes. This would incur unnecessary load on the path and has the undesirable effect of slowing the time to reach a more optimal MPS. Implementations SHOULD select the set of probe packet sizes to maximize the gain in PLPMTU from each search step.

Implementations could optimize the search procedure by selecting step sizes from a table of common PMTU sizes. When selecting the appropriate next size to search, an implementor ought to also consider that there can be common sizes of
MPS that applications seek to use, and there could be common sizes of MTU used within the network.

5.3.3. Resilience to Inconsistent Path Information

A decision to increase the PLPMTU needs to be resilient to the possibility that information learned about the network path is inconsistent. A path is inconsistent when, for example, probe packets are lost due to other reasons (i.e., not packet size) or due to frequent path changes. Frequent path changes could occur by unexpected "flapping" - where some packets from a flow pass along one path, but other packets follow a different path with different properties.

A PL sender is able to detect inconsistency from the sequence of PLPMTU probes that it sends or the sequence of PTB messages that it receives. When inconsistent path information is detected, a PL sender could use an alternate search mode that clamps the offered MPS to a smaller value for a period of time. This avoids unnecessary loss of packets due to MTU limitation.

5.4. Robustness to Inconsistent Paths

Some paths could be unable to sustain packets of the BASE_PMTU size. To be robust to these paths an implementation could implement the ERROR state. This allows fallback to a smaller than desired PLPMTU, rather than suffer connectivity failure. This could utilize methods such as endpoint IP fragmentation to enable the PL sender to communicate using packets smaller than the BASE_PMTU.

6. Specification of Protocol-Specific Methods

This section specifies protocol-specific details for datagram PLPMTUD for IETF-specified transports.

The first subsection provides guidance on how to implement the DPLPMTUD method as a part of an application using UDP or UDP-Lite. The guidance also applies to other datagram services that do not include a specific transport protocol (such as a tunnel encapsulation). The following subsections describe how DPLPMTUD can be implemented as a part of the transport service, allowing applications using the service to benefit from discovery of the PLPMTU without themselves needing to implement this method.

6.1. Application support for DPLPMTUD with UDP or UDP-Lite

The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do not define a method in the RFC-series that supports PLPMTUD. In particular, the UDP transport does not provide the transport layer features needed to implement datagram PLPMTUD.

The DPLPMTUD method can be implemented as a part of an application built directly or indirectly on UDP or UDP-Lite, but relies on higher-layer protocol features to implement the method [RFC8085].
Some primitives used by DPLPMTUD might not be available via the Datagram API (e.g., the ability to access the PLPMTU cache, or interpret received PTB messages).

In addition, it is desirable that PMTU discovery is not performed by multiple protocol layers. An application SHOULD avoid using DPLPMTUD when the underlying transport system provides this capability. Using a common method for managing the PLPMTU has benefits, both in the ability to share state between different processes and in opportunities to coordinate probing.

6.1.1. Application Request

An application needs an application-layer protocol mechanism (such as a message acknowledgement method) that solicits a response from a destination endpoint. The method SHOULD allow the sender to check the value returned in the response to provide additional protection from off-path insertion of data [RFC8085]. Suitable methods include a parameter known only to the two endpoints, such as a session ID or initialized sequence number.

6.1.2. Application Response

An application needs an application-layer protocol mechanism to communicate the response from the destination endpoint. This response may indicate successful reception of the
probe across the path, but could also indicate that some (or all) packets have failed to reach the destination.

6.1.3. Sending Application Probe Packets

A probe packet may carry an application data block, but the successful transmission of this data is at risk when used for probing. Some applications may prefer to use a probe packet that does not carry an application data block to avoid disruption to data transfer.

6.1.4. Validating the Path

An application that does not have other higher-layer information confirming correct delivery of datagrams SHOULD implement the CONFIRMATION_TIMER to periodically send probe packets while in the SEARCH_COMPLETE state.

6.1.5. Handling of PTB Messages

An application that is able and wishes to receive PTB messages MUST perform ICMP validation as specified in Section 5.2 of [RFC8085]. This requires the application to check each received PTB message to validate that it is received in response to transmitted traffic and that the
reported PTB_SIZE is less than the current probed size (see Section 4.4.2). A validated PTB message MAY be used as input to the DPLPMTUD algorithm, but MUST NOT be used directly to set the PLPMTU.

6.2. DPLPMTUD for SCTP

Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing method for SCTP. It recommends the use of the PAD chunk, defined in [RFC4820], to be attached to a minimum length HEARTBEAT chunk to build a probe packet. This enables probing without affecting the transfer of user messages and without interfering with congestion control. This is preferred to using DATA chunks (with padding as required) as path probes.

XXX Author Note: Future versions of this document might define a parameter contained in the INIT and INIT ACK chunk to indicate the remote peer MTU to the local peer. However, multihoming makes this a bit complex, so it might not be worth doing. XXX

6.2.1. SCTP/IPv4 and SCTP/IPv6

The base protocol is specified in [RFC4960]. This provides an acknowledged PL. A sender can therefore enter the BASE state as soon as connectivity has been confirmed.

6.2.1.1. Sending SCTP Probe Packets

Probe packets consist of an SCTP common header followed by a HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control the length of the probe packet. The HEARTBEAT chunk is used to trigger the sending of a HEARTBEAT ACK chunk. The reception of the HEARTBEAT ACK chunk acknowledges reception of a successful probe.

The HEARTBEAT chunk carries a Heartbeat Information parameter which should include, besides the information suggested in [RFC4960], the probe size, which is the size of the complete datagram.
The size of the PAD chunk is therefore computed by reducing the probing size by the IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT request and the PAD chunk header. The payload of the PAD chunk contains arbitrary data.

To avoid fragmentation of retransmitted data, probing starts right after the PL handshake, before data is sent. Assuming this behavior (i.e., the PMTU is smaller than or equal to the interface MTU), this process will take a few round trip time periods depending on the number of PMTU sizes probed. The Heartbeat timer can be used to implement the PROBE_TIMER.

6.2.1.2. Validating the Path with SCTP

Since SCTP provides an acknowledged PL, a sender MUST NOT implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.2.1.3. PTB Message Handling by SCTP

Normal ICMP validation MUST be performed as specified in Appendix C of [RFC4960]. This requires that the first 8 bytes of the SCTP common header are quoted in the payload of the PTB message, which can be the case for ICMPv4 and is normally the case for ICMPv6.

When a PTB message has been validated, the PTB_SIZE reported in the PTB message SHOULD be used with the DPLPMTUD algorithm, providing that the reported PTB_SIZE is less than the current probe size (see Section 4.4).

6.2.2. DPLPMTUD for SCTP/UDP

The UDP encapsulation of SCTP is specified in [RFC6951].

6.2.2.1. Sending SCTP/UDP Probe Packets

Packet probing can be performed as specified in Section 6.2.1.1. The maximum payload is reduced by 8 bytes, which has to be considered when filling the PAD chunk.

6.2.2.2. Validating the Path with SCTP/UDP

Since SCTP provides an acknowledged PL, a sender MUST NOT implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.2.2.3. Handling of PTB Messages by SCTP/UDP

ICMP validation MUST be performed for PTB messages as specified in Appendix C of [RFC4960]. This requires that the first 8 bytes of the SCTP common header are contained in the PTB message, which can be the case for ICMPv4 (but note the UDP header also consumes a part of the quoted packet header) and is normally the case for ICMPv6. When the validation is completed, the PTB_SIZE indicated in the PTB message SHOULD be used with the DPLPMTUD providing that the reported PTB_SIZE is less than the current probe size.

6.2.3. DPLPMTUD for SCTP/DTLS

The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is specified in [RFC8261]. It is used for data channels in WebRTC implementations.

6.2.3.1. Sending SCTP/DTLS Probe Packets

Packet probing can be done as specified in Section 6.2.1.1.

6.2.3.2. Validating the Path with SCTP/DTLS

Since SCTP provides an acknowledged PL, a sender MUST NOT implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.2.3.3. Handling of PTB Messages by SCTP/DTLS

It is not possible to perform ICMP validation as specified in [RFC4960], since even if the ICMP message
payload contains sufficient information, the reflected SCTP common header would be encrypted. Therefore it is not possible to process PTB messages at the PL.

6.3. DPLPMTUD for QUIC

Quick UDP Internet Connection (QUIC) [I-D.ietf-quic-transport] is a UDP-based transport that provides reception feedback. The UDP payload includes the QUIC packet header, protected payload, and any authentication fields. QUIC depends on a PMTU of at least 1280 bytes.

Section 14.1 of [I-D.ietf-quic-transport] describes the path considerations when sending QUIC packets. It recommends the use of PADDING frames to build the probe packet. Pure probe-only packets are constructed with PADDING frames and PING frames to create a padding-only packet that will elicit an acknowledgement. Such padding-only packets enable probing without affecting the transfer of other QUIC frames.

The recommendation for QUIC endpoints implementing DPLPMTUD is that an MPS is maintained for each combination of local and remote IP addresses [I-D.ietf-quic-transport]. If a QUIC endpoint determines that the PMTU between any pair of local and remote IP addresses has fallen below an acceptable MPS, it needs to immediately cease sending QUIC packets on the affected path. This could result in termination of the
connection if an alternative path cannot be found [I-D.ietf-quic-transport].

6.3.1. Sending QUIC Probe Packets

A probe packet consists of a QUIC Header and a payload containing PADDING Frames and a PING Frame. PADDING Frames are a single octet (0x00) and
several of these can be used to create a probe packet of size PROBED_SIZE. QUIC provides an acknowledged PL. A sender can therefore enter the BASE state as soon as connectivity has been confirmed.

The current specification of QUIC sets the following:

*  BASE_PMTU: 1200 bytes. A QUIC sender needs to pad initial packets to 1200 bytes to confirm the path can support packets of a useful size.

*  MIN_PMTU: 1200 bytes. A QUIC sender that determines the PMTU has fallen below 1200 bytes MUST immediately stop sending on the affected path.

6.3.2. Validating the Path with QUIC

QUIC provides an acknowledged PL. A sender therefore MUST NOT implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.

6.3.3. Handling of PTB Messages by QUIC

QUIC operates over the UDP transport, and the guidelines on ICMP validation as specified in
Section 5.2 of [RFC8085] therefore apply. In addition to UDP Port validation QUIC can validate an ICMP message by looking for valid Connection IDs in the quoted packet.

6.4. DPLPMTUD for UDP-Options

UDP Options [I-D.ietf-tsvwg-udp-options] provides a way to extend UDP to provide new transport mechanisms.

Support for using DPLPMTUD with UDP-Options is defined in the UDP-Options specification [I-D.ietf-tsvwg-udp-options].

7. Acknowledgements

This work was partially funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No. 644334 (NEAT). The views expressed are solely those of the author(s).

8. IANA Considerations

This memo includes no request to IANA.

If there are no requirements for IANA, the
section will be removed during conversion into an RFC by the RFC Editor.

9. Security Considerations

The security considerations for the use of UDP and SCTP are provided in the referenced RFCs. Security guidance for applications using UDP is provided in the UDP Usage Guidelines [RFC8085]. Specifically, the generation of probe packets is regarded as a "Low Data-Volume Application", described in Section 3.1.3 of [RFC8085]. This recommends that a sender limit the generation of probe packets to an average rate lower than one probe per 3 seconds.

A PL sender needs to ensure that the method used to confirm reception of probe packets offers protection from off-path attackers injecting packets into the path. This protection is provided in IETF-defined protocols (e.g., TCP, SCTP) using a randomly-initialized sequence number. A description of one way to do this when using UDP is provided in Section 5.1 of [RFC8085].

There are cases where ICMP Packet Too Big (PTB) messages are not delivered due to policy, configuration, or equipment design (see Section 1.1). This method therefore does not rely upon PTB messages being received, but is able to utilize these when they are received by the sender. PTB messages could potentially be used to cause a node to inappropriately reduce the PLPMTU. A node supporting DPLPMTUD MUST therefore appropriately validate the payload of PTB messages to ensure these are received in response to transmitted traffic (i.e., a reported error condition that corresponds to a datagram actually sent by the path layer, see Section 4.4.1).

An on-path attacker, able to
create a PTB message, could forge PTB messages that include a valid quoted IP packet. Such an attack could be used to drive down the PLPMTU. There are two ways this method can mitigate such attacks: First, by ensuring that a PL sender never reduces the PLPMTU below the base size solely in response to receiving a PTB message. This is achieved by first entering the BASE state when such a message is received.
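The first mitigation can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the specification: the class PtbHandler, the method on_validated_ptb, and the string state labels are hypothetical names, the BASE_PMTU value of 1200 bytes is only an example, and clamping a reported size that lies between the base size and the current PLPMTU is a simplification. The sketch only shows that a validated PTB message never lowers the PLPMTU below the base size and that the sender moves to the BASE state to re-confirm the path by probing.

```python
from dataclasses import dataclass

BASE_PMTU = 1200  # assumed base size in bytes (illustrative value only)


@dataclass
class PtbHandler:
    """Hypothetical per-path DPLPMTUD sender state (names are assumptions)."""
    plpmtu: int
    state: str = "SEARCH_COMPLETE"

    def on_validated_ptb(self, ptb_size: int) -> None:
        """React to a PTB message that has already passed validation."""
        if ptb_size >= self.plpmtu:
            # The reported size does not call for a reduction; ignore it.
            return
        # Never go below the base size on the word of a PTB message alone:
        # clamp the new value and re-confirm the path from the BASE state.
        self.plpmtu = max(ptb_size, BASE_PMTU)
        self.state = "BASE"
```

For example, a sender with a PLPMTU of 1500 bytes that receives a validated PTB message reporting 576 bytes ends up with a PLPMTU of 1200 bytes in the BASE state, rather than 576 bytes.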
Second, the design does not require processing of PTB messages. A PL sender could therefore suspend processing of PTB messages (e.g., in a robustness mode after detecting that subsequent probes actually confirm that a size larger than the PTB_SIZE is supported by a path).

Parallel forwarding paths SHOULD be considered. Section 5.4 identifies the need for robustness in the method when the path information may be inconsistent. A node performing DPLPMTUD could experience conflicting information about the size of supported probe packets. This could occur when multiple paths are concurrently in use and these exhibit different PMTUs. If not considered, this could result in data being black holed when the PLPMTU is larger than the smallest PMTU across the current paths.

10. References

10.1. Normative References

[I-D.ietf-quic-transport] Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed and Secure Transport", draft-ietf-quic-transport-20 (work in
progress), 23 April 2019, <http://www.ietf.org/internet-drafts/draft-ietf-quic-transport-20.txt>.

[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI 10.17487/RFC0768, August 1980, <https://www.rfc-editor.org/info/rfc768>.

[RFC1191] Mogul, J.C. and S.E. Deering, "Path MTU discovery", RFC 1191, DOI 10.17487/RFC1191, November 1990, <https://www.rfc-editor.org/info/rfc1191>.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, December 1998, <https://www.rfc-editor.org/info/rfc2460>.

[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., and G. Fairhurst, Ed., "The Lightweight User Datagram Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July 2004, <https://www.rfc-editor.org/info/rfc3828>.

[RFC4820] Tuexen, M., Stewart, R., and P.
Lei, "Padding Chunk and Parameter for the Stream Control Transmission Protocol (SCTP)", RFC 4820, DOI 10.17487/RFC4820, March 2007, <https://www.rfc-editor.org/info/rfc4820>.

[RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", RFC 4960, DOI 10.17487/RFC4960, September 2007, <https://www.rfc-editor.org/info/rfc4960>.

[RFC6951] Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream Control Transmission Protocol (SCTP) Packets for End-Host to End-Host Communication", RFC 6951, DOI 10.17487/RFC6951, May 2013, <https://www.rfc-editor.org/info/rfc6951>.

[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, March 2017, <https://www.rfc-editor.org/info/rfc8085>.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.

[RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., "Path MTU Discovery for IP version 6", STD 87, RFC 8201, DOI 10.17487/RFC8201, July 2017, <https://www.rfc-editor.org/info/rfc8201>.

[RFC8261] Tuexen, M., Stewart, R., Jesup, R., and S. Loreto, "Datagram Transport Layer Security (DTLS) Encapsulation of SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November 2017, <https://www.rfc-editor.org/info/rfc8261>.

10.2. Informative References

[I-D.ietf-intarea-tunnels] Touch, J. and M. Townsley, "IP Tunnels in the
Internet Architecture", draft-ietf-intarea-tunnels-09 (work in progress), 19 July 2018, <http://www.ietf.org/internet-drafts/draft-ietf-intarea-tunnels-09.txt>.

[I-D.ietf-tsvwg-udp-options] Touch, J., "Transport Options for UDP", draft-ietf-tsvwg-udp-options-07 (work in progress), 8 March 2019, <http://www.ietf.org/internet-drafts/draft-ietf-tsvwg-udp-options-07.txt>.

[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, RFC 792, DOI 10.17487/RFC0792, September 1981, <https://www.rfc-editor.org/info/rfc792>.

[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - Communication Layers", STD 3, RFC 1122, DOI 10.17487/RFC1122, October 1989, <https://www.rfc-editor.org/info/rfc1122>.

[RFC1812] Baker, F., Ed., "Requirements for IP Version 4 Routers", RFC 1812, DOI 10.17487/RFC1812, June 1995, <https://www.rfc-editor.org/info/rfc1812>.

[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", RFC 2923, DOI 10.17487/RFC2923, September 2000, <https://www.rfc-editor.org/info/rfc2923>.

[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram Congestion Control Protocol (DCCP)", RFC 4340, DOI 10.17487/RFC4340, March 2006, <https://www.rfc-editor.org/info/rfc4340>.

[RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", STD 89, RFC 4443, DOI 10.17487/RFC4443, March 2006, <https://www.rfc-editor.org/info/rfc4443>.

[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, <https://www.rfc-editor.org/info/rfc4821>.

[RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering ICMPv6 Messages in Firewalls", RFC 4890, DOI 10.17487/RFC4890, May 2007, <https://www.rfc-editor.org/info/rfc4890>.

Appendix A.
Revision Notes

Note to RFC-Editor: please remove this entire section prior to publication.

Individual draft -00:

* Comments and corrections are welcome directly to the authors or via the IETF TSVWG working group mailing list.

* This update is proposed for WG comments.

Individual draft -01:

* Contains the first representation of the algorithm, showing the states and timers.

* This update is proposed for WG comments.

Individual draft -02:

* Contains updated representation of the algorithm, and textual corrections.

* The text describing when to set the effective PMTU has not yet been validated by the authors.

* To determine security against off-path attacks: we need to decide whether a received PTB message SHOULD/MUST be validated. The text on how to handle a PTB message indicating a link MTU larger than the probe has not yet been validated by the authors.

* No text currently describes how to handle inconsistent results from arbitrary re-routing along different parallel paths.

* This update is proposed for WG comments.

Working Group draft -00:

* This draft follows a successful adoption call for TSVWG.

* There is still work to complete, please comment on this draft.

Working Group draft -01:

* This draft includes improved introduction.

* The draft is updated to require ICMP validation prior to accepting PTB messages - this to be confirmed by WG.

* Section added to discuss Selection of Probe Size - methods to be evaluated and recommendations to be considered.

* Section added to align with work proposed in the QUIC WG.
Working Group draft -02:

* The draft was updated based on feedback from the WG, and a detailed review by Magnus Westerlund.

* The document updates RFC 4821.

* Requirements list updated.

* Added more explicit discussion of a simpler black-hole detection mode.

* This draft includes reorganisation of the section on IETF protocols.

* Added more discussion of implementation within an application.

* Added text on flapping paths.

* Replaced 'effective MTU' with new term PLPMTU.

Working Group draft -03:

* Updated figures.

* Added more discussion on blackhole detection.

* Added figure describing just blackhole detection.

* Added figure relating MPS sizes.

Working Group draft -04:

* Described phases and named these consistently.

* Corrected transition from confirmation directly to the search phase (Base has been checked).

* Redrawn state diagrams.

* Renamed BASE_MTU to BASE_PMTU (because it is a base for the PMTU).

* Clarified Error state.

* Clarified suspending DPLPMTUD.

* Verified normative text in requirements section.

* Removed duplicate text.

* Changed all text to refer to /packet probe/probe packet/ /validation/verification/ added term /Probe Confirmation/ and clarified BlackHole detection.

Working Group draft -05:

* Updated security considerations.

* Feedback after speaking with Joe Touch helped improve UDP-Options description.

Working Group draft -06:

* Updated description of ICMP issues in section 1.1.

* Update to description of QUIC.

Working Group draft -07:

* Moved description of the PTB processing method from the PTB requirements section.

* Clarified what is performed in the PTB validation check.

* Updated security consideration to explain PTB security without needing to read the rest of the document.

* Reformatted state machine diagram.

Working Group draft -08:

* Moved to rfcxml v3+.

* Rendered diagrams to svg in html version.

* Removed Appendix A. Event-driven state changes.

* Removed section on DPLPMTUD with UDP Options.

* Shortened the description of phases.
Authors' Addresses

Godred Fairhurst
University of Aberdeen
School of Engineering, Fraser Noble Building
Aberdeen
AB24 3UE
United Kingdom

Email: gorry@erg.abdn.ac.uk

Tom Jones
University of Aberdeen
School of Engineering, Fraser Noble Building
Aberdeen
AB24 3UE
United Kingdom

Email: tom@erg.abdn.ac.uk

Michael Tuexen
Muenster University of Applied Sciences
Stegerwaldstrasse 39
48565 Steinfurt
Germany

Email: tuexen@fh-muenster.de

Irene Ruengeler
Muenster University of Applied Sciences
Stegerwaldstrasse 39
48565 Steinfurt
Germany

Email: i.ruengeler@fh-muenster.de

Timo Voelker
Muenster University of Applied Sciences
Stegerwaldstrasse 39
48565 Steinfurt
Germany

Email: timo.voelker@fh-muenster.de