draft-ietf-tsvwg-datagram-plpmtud-01.txt | draft-ietf-tsvwg-datagram-plpmtud-02.txt | |||
---|---|---|---|---|
Internet Engineering Task Force G. Fairhurst | Internet Engineering Task Force G. Fairhurst | |||
Internet-Draft T. Jones | Internet-Draft T. Jones | |||
Intended status: Standards Track University of Aberdeen | Updates: 4821 (if approved) University of Aberdeen | |||
Expires: September 6, 2018 M. Tuexen | Intended status: Standards Track M. Tuexen | |||
I. Ruengeler | Expires: December 08, 2018 I. Ruengeler | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
March 05, 2018 | June 08, 2018 | |||
Packetization Layer Path MTU Discovery for Datagram Transports | Packetization Layer Path MTU Discovery for Datagram Transports | |||
draft-ietf-tsvwg-datagram-plpmtud-01 | draft-ietf-tsvwg-datagram-plpmtud-02 | |||
Abstract | Abstract | |||
This document describes a robust method for Path MTU Discovery | This document describes a robust method for Path MTU Discovery | |||
(PMTUD) for datagram Packetization layers. The method allows a | (PMTUD) for datagram Packetization layers. The document describes an | |||
Packetization Layer (PL), or a datagram application that uses a PL, | extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | |||
to probe an network path with progressively larger packets to | MTU Discovery for IPv4 and IPv6. The method allows a Packetization | |||
determine a maximum packet size. The document describes an extension | Layer (PL), or a datagram application that uses a PL, to discover | |||
to RFC 1191 and RFC 8201, which specify ICMP-based Path MTU Discovery | whether a network path can support the current size of datagram and | |||
for IPv4 and IPv6. This provides functionality for datagram | to probe a network path with progressively larger packets to find | |||
transports that is equivalent to the Packetization layer PMTUD | whether the maximum packet size can be increased. This allows a | |||
specification for TCP, specified in RFC4821. | sender to determine an appropriate packet size. This provides | |||
functionality for datagram transports that is equivalent to the | ||||
Packetization layer PMTUD specification for TCP, specified in | ||||
RFC4821. | ||||
The document also provides implementation notes for incorporating | ||||
Datagram PMTUD into IETF Datagram transports or applications that use | ||||
transports. | ||||
When published, this specification updates RFC4821. | When published, this specification updates RFC4821. | |||
Status of This Memo | Status of this Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on December 08, 2018. | ||||
This Internet-Draft will expire on September 6, 2018. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents (http://trustee.ietf.org/ | |||
(http://trustee.ietf.org/license-info) in effect on the date of | license-info) in effect on the date of publication of this document. | |||
publication of this document. Please review these documents | Please review these documents carefully, as they describe your rights | |||
carefully, as they describe your rights and restrictions with respect | and restrictions with respect to this document. Code Components | |||
to this document. Code Components extracted from this document must | extracted from this document must include Simplified BSD License text | |||
include Simplified BSD License text as described in Section 4.e of | as described in Section 4.e of the Trust Legal Provisions and are | |||
the Trust Legal Provisions and are provided without warranty as | provided without warranty as described in the Simplified BSD License. | |||
described in the Simplified BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 3 | 1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . . 3 | |||
1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 4 | 1.2. Packetization Layer Path MTU Discovery . . . . . . . . . . 4 | |||
1.3. Path MTU Discovery for Datagram Services . . . . . . . . 5 | 1.3. Path MTU Discovery for Datagram Services . . . . . . . . . 5 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 7 | 3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 8 | |||
3.1. PMTU Probe Packets . . . . . . . . . . . . . . . . . . . 10 | 3.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . . 10 | |||
3.2. Validation of the Current Effective PMTU . . . . . . . . 11 | 3.2. Validation of Probe Packet Size . . . . . . . . . . . . . 11 | |||
3.3. Reduction of the Effective PMTU . . . . . . . . . . . . . 11 | 3.3. Reducing the PLPMTU: Confirming Path Characteristics . . . 12 | |||
4. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 12 | 3.4. Increasing the PLPMTU: Supporting Path Changes . . . . . . 12 | |||
4.1. Probing . . . . . . . . . . . . . . . . . . . . . . . . . 12 | 3.5. Robustness to inconsistent Path information . . . . . . . 12 | |||
4.2. Verification and Use of PTB Messages . . . . . . . . . . 13 | 4. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . . 13 | |||
4.3. Timers . . . . . . . . . . . . . . . . . . . . . . . . . 13 | 4.1. PROBE_SEARCH: Probing for a larger PLPMTU . . . . . . . . 13 | |||
4.4. Constants . . . . . . . . . . . . . . . . . . . . . . . . 14 | 4.2. The PROBE_DONE state . . . . . . . . . . . . . . . . . . . 14 | |||
4.5. Variables . . . . . . . . . . . . . . . . . . . . . . . . 14 | 4.3. Verification and Use of PTB Messages . . . . . . . . . . . 14 | |||
4.6. Selecting PROBED_SIZE . . . . . . . . . . . . . . . . . . 15 | 4.4. Timers . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
4.7. State Machine . . . . . . . . . . . . . . . . . . . . . . 15 | 4.5. Constants . . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
5. Specification of Protocol-Specific Methods . . . . . . . . . 18 | 4.6. Variables . . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
5.1. DPLPMTUD for UDP and UDP-Lite . . . . . . . . . . . . . . 18 | 4.7. Selecting PROBED_SIZE . . . . . . . . . . . . . . . . . . 16 | |||
5.1.1. UDP Options . . . . . . . . . . . . . . . . . . . . . 18 | 4.8. Black Hole Detection . . . . . . . . . . . . . . . . . . . 17 | |||
5.1.2. UDP Options Required for PLPMTUD . . . . . . . . . . 18 | 4.9. State Machine . . . . . . . . . . . . . . . . . . . . . . 17 | |||
5.1.2.1. Echo Request Option . . . . . . . . . . . . . . . 19 | 5. Specification of Protocol-Specific Methods . . . . . . . . . . 20 | |||
5.1.2.2. Echo Response Option . . . . . . . . . . . . . . 19 | 5.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 20 | |||
5.1.3. Sending UDP-Option Probe Packets . . . . . . . . . . 19 | 5.1.1. Application Request . . . . . . . . . . . . . . . . . 20 | |||
5.1.4. Validating the Path with UDP Options . . . . . . . . 20 | 5.1.2. Application Response . . . . . . . . . . . . . . . . . 20 | |||
5.1.5. Handling of PTB Messages by UDP . . . . . . . . . . . 20 | 5.1.3. Sending Application Probe Packets . . . . . . . . . . 21 | |||
5.2. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 20 | 5.1.4. Validating the Path . . . . . . . . . . . . . . . . . 21 | |||
5.2.1. SCTP/IP4 and SCTP/IPv6 . . . . . . . . . . . . . . . 20 | 5.1.5. Handling of PTB Messages . . . . . . . . . . . . . . . 21 | |||
5.2.1.1. Sending SCTP Probe Packets . . . . . . . . . . . 20 | 5.2. DPLPMTUD with UDP Options . . . . . . . . . . . . . . . . 21 | |||
5.2.1.2. Validating the Path with SCTP . . . . . . . . . . 21 | 5.2.1. UDP Request Option . . . . . . . . . . . . . . . . . . 22 | |||
5.2.1.3. PTB Message Handling by SCTP . . . . . . . . . . 21 | 5.2.2. UDP Response Option . . . . . . . . . . . . . . . . . 22 | |||
5.2.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 21 | 5.3. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 22 | |||
5.2.2.1. Sending SCTP/UDP Probe Packets . . . . . . . . . 21 | 5.3.1. SCTP/IP4 and SCTP/IPv6 . . . . . . . . . . . . . . . . 22 | |||
5.2.2.2. Validating the Path with SCTP/UDP . . . . . . . . 21 | 5.3.1.1. Sending SCTP Probe Packets . . . . . . . . . . . . 22 | |||
5.2.2.3. Handling of PTB Messages by SCTP/UDP . . . . . . 21 | 5.3.1.2. Validating the Path with SCTP . . . . . . . . . . 23 | |||
5.2.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 22 | 5.3.1.3. PTB Message Handling by SCTP . . . . . . . . . . . 23 | |||
5.2.3.1. Sending SCTP/DTLS Probe Packets . . . . . . . . . 22 | ||||
5.2.3.2. Validating the Path with SCTP/DTLS . . . . . . . 22 | 5.3.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 23 | |||
5.2.3.3. Handling of PTB Messages by SCTP/DTLS . . . . . . 22 | 5.3.2.1. Sending SCTP/UDP Probe Packets . . . . . . . . . . 23 | |||
5.3. PMTUD for QUIC . . . . . . . . . . . . . . . . . . . . . 22 | 5.3.2.2. Validating the Path with SCTP/UDP . . . . . . . . 23 | |||
5.3.1. Sending QUIC Probe Packets . . . . . . . . . . . . . 22 | 5.3.2.3. Handling of PTB Messages by SCTP/UDP . . . . . . . 24 | |||
5.3.2. Validating the Path with QUIC . . . . . . . . . . . . 23 | 5.3.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . . 24 | |||
5.3.3. Handling of PTB Messages by QUIC . . . . . . . . . . 23 | 5.3.3.1. Sending SCTP/DTLS Probe Packets . . . . . . . . . 24 | |||
5.4. Other IETF Transports . . . . . . . . . . . . . . . . . . 23 | 5.3.3.2. Validating the Path with SCTP/DTLS . . . . . . . . 24 | |||
5.5. DPLPMTUD by Applications . . . . . . . . . . . . . . . . 23 | 5.3.3.3. Handling of PTB Messages by SCTP/DTLS . . . . . . 24 | |||
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 24 | 5.4. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 24 | |||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 24 | 5.4.1. Sending QUIC Probe Packets . . . . . . . . . . . . . . 24 | |||
8. Security Considerations . . . . . . . . . . . . . . . . . . . 24 | 5.4.2. Validating the Path with QUIC . . . . . . . . . . . . 25 | |||
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 24 | 5.4.3. Handling of PTB Messages by QUIC . . . . . . . . . . . 25 | |||
9.1. Normative References . . . . . . . . . . . . . . . . . . 24 | 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 25 | |||
9.2. Informative References . . . . . . . . . . . . . . . . . 26 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 | |||
Appendix A. Event-driven state changes . . . . . . . . . . . . . 26 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 26 | |||
Appendix B. Revision Notes . . . . . . . . . . . . . . . . . . . 29 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 26 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 30 | 9.1. Normative References . . . . . . . . . . . . . . . . . . . 26 | |||
9.2. Informative References . . . . . . . . . . . . . . . . . . 28 | ||||
Appendix A. Event-driven state changes . . . . . . . . . . . . . . 28 | ||||
Appendix B. Revision Notes . . . . . . . . . . . . . . . . . . . . 31 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 32 | ||||
1. Introduction | 1. Introduction | |||
The IETF has specified datagram transport using UDP, SCTP, and DCCP, | The IETF has specified datagram transport using UDP, SCTP, and DCCP, | |||
as well as protocols layered on top of these transports (e.g., SCTP/ | as well as protocols layered on top of these transports (e.g., SCTP/ | |||
UDP, DCCP/UDP). | UDP, DCCP/UDP) and directly over the IP network layer. This document | |||
describes a robust method for Path MTU Discovery (PMTUD) that may be | ||||
used with these transport protocols (or the applications that use | ||||
their transport service) to discover an appropriate size of packet to | ||||
use across an Internet path. | ||||
1.1. Classical Path MTU Discovery | 1.1. Classical Path MTU Discovery | |||
Classical Path Maximum Transmission Unit Discovery (PMTUD) can be | Classical Path Maximum Transmission Unit Discovery (PMTUD) can be | |||
used with any transport that is able to process ICMP Packet Too Big | used with any transport that is able to process ICMP Packet Too Big | |||
(PTB) messages (e.g., [RFC1191] and [RFC8201]). The term PTB message | (PTB) messages (e.g., [RFC1191] and [RFC8201]). The term PTB message | |||
is applied to both IPv4 ICMP Unreachable messages (type 3) that carry | is applied to both IPv4 ICMP Unreachable messages (type 3) that carry | |||
the error Fragmentation Needed (Type 3, Code 4) and ICMPv6 packet too | the error Fragmentation Needed (Type 3, Code 4) and ICMPv6 packet too | |||
big messages (Type 2). When a sender receives a PTB message, it | big messages (Type 2). When a sender receives a PTB message, it | |||
reduces the effective Path MTU (PMTU) to the value reported as the | reduces the effective MTU to the value reported as the Link MTU in | |||
Link MTU in the PTB message, and uses a method that from time-to-time | the PTB message, and uses a method that from time-to-time increases the | |||
increases the packet size in an attempt to discover an increase in the | packet size in an attempt to discover an increase in the supported PMTU. | |||
supported PMTU. The packets sent with a size larger than the current | The packets sent with a size larger than the current effective PMTU | |||
effective PMTU are known as probe packets. | are known as probe packets. | |||
Packets not intended as probe packets are either fragmented to the | Packets not intended as probe packets are either fragmented to the | |||
current effective PMTU, or the attempt to send fails with an error | current effective PMTU, or the attempt to send fails with an error | |||
code. Applications are sometimes provided with a primitive to let | code. Applications are sometimes provided with a primitive to let | |||
them read the maximum packet size, derived from the current effective | them read the maximum packet size, derived from the current effective | |||
PMTU. | PMTU. | |||
Classical PMTUD is subject to protocol failures. One failure arises | Classical PMTUD is subject to protocol failures. One failure arises | |||
when traffic using a packet size larger than the actual supported | when traffic using a packet size larger than the actual PMTU is | |||
PMTU is black-holed (all datagrams sent with this size are silently | black-holed (all datagrams sent with this size, or larger, are | |||
discarded without the sender receiving ICMP PTB messages. This could | silently discarded without the sender receiving ICMP PTB messages). | |||
arise when the ICMP messages are not delivered back to the sender for | This could arise when the PTB messages are not delivered back to the | |||
some reason [RFC2923]). For example, ICMP messages are increasingly | sender for some reason [RFC2923]. For example, ICMP messages are | |||
filtered by middleboxes (including firewalls) [RFC4890]. Also, in | increasingly filtered by middleboxes (including firewalls) [RFC4890]. | |||
some cases are not correctly processed by tunnel endpoints. | A stateful firewall could be configured with a policy to block | |||
incoming ICMP messages, which would prevent reception of PTB messages | ||||
to endpoints behind this firewall. Other examples include cases | ||||
where PTB messages are not correctly processed/generated by tunnel | ||||
endpoints. | ||||
Another failure could result if a node not on the network path sends | Another failure could result if a node that is not on the network | |||
a PTB that attempts to force the sender to change the effective PMTU | path sends a PTB message that attempts to force the sender to change | |||
[RFC8201]. A sender can protect itself from reacting to such | the effective PMTU [RFC8201]. A sender can protect itself from | |||
messages by utilising the quoted packet within the PTB message | reacting to such messages by utilising the quoted packet within a PTB | |||
payload to verify that the received PTB message was generated in | message payload to verify that the received PTB message was generated | |||
response to a packet that had actually been sent. However, there are | in response to a packet that had actually originated from the sender. | |||
situations where a sender would be unable to provide this | However, there are situations where a sender would be unable to | |||
verification. | provide this verification. | |||
Examples where verification is not possible include: | Examples where verification is not possible include: | |||
o When the router issuing the ICMP message is acting on a tunneled | o When the router issuing the ICMP message is acting on a tunneled | |||
packet the ICMP message is directed to the tunnel endpoint. This | packet, the ICMP message will be directed to the tunnel endpoint. | |||
endpoint is responsible for processed in the quoted packet in the | This tunnel endpoint is responsible for forwarding the ICMP | |||
payload field to remove the effect of the tunnel, and return the | message and also processing the quoted packet within the payload | |||
ICMP message to the sender. Failure to do this results in black- | field to remove the effect of the tunnel, and return a correctly | |||
holing. | formatted ICMP message to the sender. Failure to do this results | |||
in black-holing. | ||||
o When the router issuing the ICMP message implements RFC792 | o When a router issuing the ICMP message implements RFC792 | |||
[RFC0792], which only requires the quoted payload to include the | [RFC0792], it is only required to include the first 64 bits of | |||
first 64 bits of the IP payload of the packet, and the ICMP | the IP payload of the packet within the quoted payload. This may be | |||
message occurs within a tunnel. Even if the decpasulated message | insufficient to perform the tunnel processing described in the | |||
is processed by the tunnel endpoint, there could be insufficient | previous bullet. Even if the decapsulated message is processed by | |||
bytes remaining for the sender to read the quoted transport | the tunnel endpoint, there could be insufficient bytes remaining | |||
information. RFC1812 [RFC1812] requires routers to return the | for the sender to interpret the quoted transport information. | |||
full packet if possible, often the case for IPv4 when used the | RFC1812 [RFC1812] requires routers to return the full packet if | |||
path includes tunnels; or where the packet has been encapsulated/ | possible, often the case for IPv4 when used the path includes | |||
tunneled over an encrypted transport and it is not possible to | tunnels; or where the packet has been encapsulated/tunneled over | |||
determine the original transport header. | an encrypted transport and it is not possible to determine the | |||
original transport header. | ||||
o Even when the PTB message includes sufficient bytes of the quoted | o Even when the PTB message includes sufficient bytes of the quoted | |||
packet, the network layer could lack sufficient context to perform | packet, the network layer could lack sufficient context to perform | |||
verification, because this depends on information about the active | verification, because this depends on information about the active | |||
transport flows at an endpoint node (e.g., the socket/address | transport flows at an endpoint node (e.g., the socket/address | |||
pairs being used, and other protocol header information). | pairs being used, and other protocol header information). | |||
1.2. Packetization Layer Path MTU Discovery | 1.2. Packetization Layer Path MTU Discovery | |||
The term Packetization Layer (PL) has been introduced to describe the | The term Packetization Layer (PL) has been introduced to describe the | |||
layer that is responsible for placing data blocks into the payload of | layer that is responsible for placing data blocks into the payload of | |||
packets and selecting an appropriate maximum packet size. This | IP packets and selecting an appropriate Maximum Packet Size (MPS). | |||
function is often performed by a transport protocol, but can also be | This function is often performed by a transport protocol, but can | |||
performed by other encapsulation methods working above the transport. | also be performed by other encapsulation methods working above the | |||
PTB verification is more straightforward at the PL or at a higher | transport. | |||
layer. | ||||
In contrast to PMTUD, Packetization Layer Path MTU Discovery | In contrast to PMTUD, Packetization Layer Path MTU Discovery | |||
(PLPMTUD) [RFC4821] does not rely upon reception and verification of | (PLPMTUD) [RFC4821] does not rely upon reception and verification of | |||
PTB messages. It is therefore more robust than Classical PMTUD. | PTB messages. It is therefore more robust than Classical PMTUD. This | |||
This has become the recommended approach for implementing PMTU | has become the recommended approach for implementing PMTU discovery | |||
discovery with TCP. | with TCP. | |||
It uses a general strategy where the PL sends probe packets to search | It uses a general strategy where the PL sends probe packets to search | |||
for an appropriate PMTU. The probe packets are sent a progressively | for the largest size of unfragmented datagram that can be sent over a | |||
larger packet size. If a probe packet is successfully delivered (as | path. The probe packets are sent with a progressively larger packet | |||
determined by the PL), then the effective Path MTU is raised to the | size. If a probe packet is successfully delivered (as determined by | |||
size of the successful probe. If no response is received to a probe | the PL), then the PLPMTU is raised to the size of the successful | |||
packet, the method reduces the probe size. | probe. If no response is received to a probe packet, the method | |||
reduces the probe size. This PLPMTU is used to set the application | ||||
MPS. | ||||
PLPMTUD introduces flexibility in the implementation of PMTU | PLPMTUD introduces flexibility in the implementation of PMTU | |||
discovery. At one extreme, it can be configured to only perform PTB | discovery. At one extreme, it can be configured to only perform PTB | |||
black hole detection and recovery to increase the robustness of | black hole detection and recovery to increase the robustness of | |||
Classical PMTUD, or at the other extreme, all PTB processing can be | Classical PMTUD, or at the other extreme, all PTB processing can be | |||
disabled and PLPMTUD can completely replace Classical PMTUD. PLPMTUD | disabled and PLPMTUD can completely replace Classical PMTUD. | |||
can also include additional consistency checks without increasing the | ||||
risk of increased black-holing. | PLPMTUD can also include additional consistency checks without | |||
increasing the risk of increased black-holing. For instance, the | ||||
information available at the PL, or higher layers, makes PTB | ||||
verification more straightforward. | ||||
1.3. Path MTU Discovery for Datagram Services | 1.3. Path MTU Discovery for Datagram Services | |||
Section 4 of this document presents a set of algorithms for datagram | Section 4 of this document presents a set of algorithms for datagram | |||
protocols to discover a maximum size for the effective PMTU across a | protocols to discover the largest size of unfragmented datagram that | |||
path. The methods described rely on features of the PL (Section 3) and | can be sent over a path. The method described relies on features of | |||
apply to transport protocols over IPv4 and IPv6. It does not require | the PL (Section 3) and applies to transport protocols operating over IPv4 | |||
cooperation from the lower layers (except that they are consistent | and IPv6. It does not require cooperation from the lower layers, | |||
about which packet sizes are acceptable). A method can utilise ICMP | although it can utilise ICMP PTB messages when these received | |||
PTB messages when these received messages are made available to the | messages are made available to the PL. | |||
PL. | ||||
The UDP-Guidelines [RFC8085] state "an application SHOULD either use | The UDP Usage Guidelines [RFC8085] state "an application SHOULD | |||
the Path MTU information provided by the IP layer or implement Path | either use the Path MTU information provided by the IP layer or | |||
MTU Discovery (PMTUD)", but do not provide a mechanism for | implement Path MTU Discovery (PMTUD)", but do not provide a | |||
discovering the largest size of unfragmented datagram that can be | mechanism for discovering the largest size of unfragmented datagram | |||
used on a path. Prior to this document, PLPMTUD had not been | that can be used on a path. Prior to this document, PLPMTUD had not | |||
specified for UDP. | been specified for UDP. | |||
Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the | Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the | |||
Stream Control Transmission Protocol (SCTP). SCTP utilises heartbeat | Stream Control Transmission Protocol (SCTP). SCTP utilises heartbeat | |||
messages as probe packets, but RFC4821 does not provide a complete | messages as probe packets, but RFC4821 does not provide a complete | |||
specification. This document provides the details to complete that | specification. This document provides the details to complete that | |||
specification. | specification. | |||
The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | |||
implementations to support Classical PMTUD and states that a DCCP | implementations to support Classical PMTUD and states that a DCCP | |||
sender "MUST maintain the maximum packet size (MPS) allowed for each | sender "MUST maintain the MPS allowed for each active DCCP session". | |||
active DCCP session". It also defines the current congestion control | It also defines the current congestion control MPS (CCMPS) supported | |||
maximum packet size (CCMPS) supported by a path. This recommends use | by a path. This recommends use of PMTUD, and suggests use of control | |||
of PMTUD, and suggests use of control packets (DCCP-Sync) as path | packets (DCCP-Sync) as path probe packets, because they do not risk | |||
probe packets, because they do not risk application data loss. The | application data loss. The method defined in this specification | |||
method defined in this specification could be used with DCCP. | could be used with DCCP. | |||
Section 5 specifies the method for a set of transports, and provides | Section 5 specifies the method for a set of transports, and provides | |||
information to enable the implementation of PLPMTUD with other | information to enable the implementation of PLPMTUD with other | |||
datagram transports and applications that use datagram transports. | datagram transports and applications that use datagram transports. | |||
2. Terminology | 2. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in [RFC2119]. | document are to be interpreted as described in [RFC2119]. | |||
Other terminology is directly copied from [RFC4821], and the | Other terminology is directly copied from [RFC4821], and the | |||
definitions in [RFC1122]. | definitions in [RFC1122]. | |||
Black-Holed: When the sender is unaware that packets are not | Black-Holed: When the sender is unaware that packets are not | |||
delivered to the destination endpoint (e.g., when the sender | delivered to the destination endpoint (e.g., when the sender | |||
transmits packets of a particular size with a previously known | transmits packets of a particular size with a previously known | |||
PMTU, but is unaware of a change to the path that resulted in a | effective PMTU (also referred to as the PLPMTU), but is unaware of | |||
smaller PMTU). | a change to the path that resulted in a smaller PLPMTU). | |||
Classical Path MTU Discovery: Classical PMTUD is a process described | Classical Path MTU Discovery: Classical PMTUD is a process described | |||
in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to | in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to | |||
learn the largest size of unfragmented datagram that can be used | learn the largest size of unfragmented datagram that can be used | |||
across a path. | across a path. | |||
Datagram: A datagram is a transport-layer protocol data unit, | Datagram: A datagram is a transport-layer protocol data unit, | |||
transmitted in the payload of an IP packet. | transmitted in the payload of an IP packet. | |||
Effective PMTU: The current estimated value for PMTU that is used by | Effective PMTU: The current estimated value for PMTU that is used by | |||
a Packetization Layer. | a PMTUD. This is equivalent to the PLPMTU derived by PLPMTUD. | |||
EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | |||
[RFC1122] as "the maximum IP datagram size that may be sent, for a | [RFC1122] as "the maximum IP datagram size that may be sent, for a | |||
particular combination of IP source and destination addresses...". | particular combination of IP source and destination addresses...". | |||
EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in | EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in | |||
[RFC1122] as the largest datagram size that can be reassembled by | [RFC1122] as the largest datagram size that can be reassembled by | |||
EMTU_R ("Effective MTU to receive"). | EMTU_R ("Effective MTU to receive"). | |||
Link: A communication facility or medium over which nodes can | Link: A communication facility or medium over which nodes can | |||
communicate at the link layer, i.e., a layer below the IP layer. | communicate at the link layer, i.e., a layer below the IP layer. | |||
Examples are Ethernet LANs and Internet (or higher) layer | Examples are Ethernet LANs and Internet (or higher) layer | |||
tunnels. | tunnels. | |||
Link MTU: The Maximum Transmission Unit (MTU) is the size in bytes | Link MTU: The Maximum Transmission Unit (MTU) is the size in bytes of | |||
of the largest IP packet, including the IP header and payload, | the largest IP packet, including the IP header and payload, that | |||
that can be transmitted over a link. Note that this could more | can be transmitted over a link. Note that this could more | |||
properly be called the IP MT, to be consistent with how other | properly be called the IP MTU, to be consistent with how other | |||
standards organizations use the acronym MT. This includes the IP | standards organizations use the acronym MTU. This includes the IP | |||
header, but excludes link layer headers and other framing that is | header, but excludes link layer headers and other framing that is | |||
not part of IP or the IP payload. Other standards organizations | not part of IP or the IP payload. Other standards organizations | |||
generally define link MTU to include the link layer headers. | generally define link MTU to include the link layer headers. | |||
MPS: The Maximum Packet Size (MPS), the largest size of application | MPS: The Maximum Packet Size (MPS) is the largest size of application | |||
data block that can be sent unfragmented across a path. In | data block that can be sent unfragmented across a path. In | |||
PLPMTUD this quantity is derived from Effective PMTU by taking | DPLPMTUD this quantity is derived from PLPMTU by taking into | |||
into consideration the size of the application and lower protocol | consideration the size of the application and lower protocol layer | |||
layer headers, and can be limited by the application protocol. | headers. | |||
Packet: An IP header plus the IP payload. | Packet: An IP header plus the IP payload. | |||
Packetization Layer (PL): The layer of the network stack that places | Packetization Layer (PL): The layer of the network stack that places | |||
data into packets and performs transport protocol functions. | data into packets and performs transport protocol functions. | |||
Path: The set of links and routers traversed by a packet between a | Path: The set of links and routers traversed by a packet between a | |||
source node and a destination node. | source node and a destination node by a particular flow. | |||
Path MTU (PMTU): The minimum of the link MTU of all the links | Path MTU (PMTU): The minimum of the Link MTU of all the links forming | |||
forming a path between a source node and a destination node. | a path between a source node and a destination node. | |||
PLPMTUD: Packetization Layer Path MTU Discovery, the method | PLPMTU: The estimate of the actual PMTU provided by the DPLPMTUD | |||
described in this document for datagram PLs, which is an extension | algorithm. | |||
to Classical PMTU Discovery. | ||||
Probe packet: A datagram sent with a purposely chosen size | PLPMTUD: Packetization Layer Path MTU Discovery, the method described | |||
(typically larger than the current Effective PMTU or MPS) to | in this document for datagram PLs, which is an extension to | |||
detect if messages of this size can be successfully sent along the | Classical PMTU Discovery. | |||
end-to-end path. | ||||
Probe packet: A datagram sent with a purposely chosen size (typically | ||||
larger than the current PLPMTU) to detect if packets of this size | ||||
can be successfully sent end-to-end across the network path. | |||
3. Features Required to Provide Datagram PLPMTUD | 3. Features Required to Provide Datagram PLPMTUD | |||
TCP PLPMTUD has been defined using standard TCP protocol mechanisms. | TCP PLPMTUD has been defined using standard TCP protocol mechanisms. | |||
All of the requirements in [RFC4821] also apply to use of the | All of the requirements in [RFC4821] also apply to use of the | |||
technique with a datagram PL. Unlike TCP, some datagram PLs require | technique with a datagram PL. Unlike TCP, some datagram PLs require | |||
additional mechanisms to implement PLPMTUD. | additional mechanisms to implement PLPMTUD. | |||
There are nine requirements for performing the datagram PLPMTUD | There are eight requirements for performing the datagram PLPMTUD | |||
method described in this specification: | method described in this specification: | |||
1. PMTU parameters: A PLPMTUD sender is REQUIRED to provide | 1. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide | |||
information about the maximum size of packet that can be | information about the maximum size of packet that can be | |||
transmitted by the sender on the local link (the link MTU and MAY | transmitted by the sender on the local link (the local Link MTU). | |||
utilize similar information about the receiver when this is | It MAY utilize similar information about the receiver when this | |||
supplied (note this could be less than EMTU_R). Some | is supplied (note this could be less than EMTU_R). This avoids | |||
applications also have a maximum transport protocol data unit | implementations trying to send probe packets that can not be | |||
(PDU) size, in which case there is no benefit from probing for a | transmitted by the local link. Too high a value may reduce the | |||
size larger than this (unless a transport allows multiplexing | efficiency of the search algorithm. Some applications also have | |||
multiple applications PDUs into the same datagram). | a maximum transport protocol data unit (PDU) size, in which case | |||
there is no benefit from probing for a size larger than this | ||||
(unless a transport allows multiplexing multiple applications | ||||
PDUs into the same datagram). | ||||
2. Effective PMTU: A datagram application MUST be able to choose the | 2. PLPMTU: A datagram application MUST be able to choose the size of | |||
size of datagrams sent to the network, up to the effective PMTU, | datagrams sent to the network, up to the PLPMTU, or a smaller | |||
or a smaller value (such as the MPS) derived from this. This | value (such as the MPS) derived from this. This value is managed | |||
value is managed by the PMTUD method. The effective PMTU | by the DPLPMTUD method. The PLPMTU (specified as the effective | |||
(specified in Section 1 of [RFC1191]) is equivalent to the EMTU_S | PMTU in Section 1 of [RFC1191]) is equivalent to the EMTU_S | |||
(specified in [RFC1122]). | (specified in [RFC1122]). | |||
3. Probe packets: On request, a PLPMTUD sender is REQUIRED to be | 3. Probe packets: On request, a PLPMTUD sender is REQUIRED to be | |||
able to transmit a packet larger than the current effective PMTU | able to transmit a packet larger than the PLPMTU. This can be | |||
(but always with a total size less than the link MTU). The | used to send a probe packet. In IPv4, a probe packet MUST be | |||
method can use this as a probe packet. In IPv4, a probe packet | sent with the Don't Fragment (DF) bit set in the IP header, and | |||
is always sent with the Don't Fragment (DF) bit set in the IP | without network layer endpoint fragmentation. In IPv6, a probe | |||
header, and without network layer endpoint fragmentation. In | packet is always sent without source fragmentation (as specified | |||
IPv6, a probe packet is always sent without source fragmentation | in section 5.4 of [RFC8201]). | |||
(as specified in section 5.4 of [RFC8201]). | ||||
4. Processing PTB messages: A PLPMTUD sender MAY optionally utilize | 4. Processing PTB messages: A DPLPMTUD sender MAY optionally utilize | |||
PTB messages received from the network layer to help identify | PTB messages received from the network layer to help identify | |||
when a path does not support the current size of packet probe. | when a path does not support the current size of packet probe. | |||
Any received PTB message SHOULD/MUST be verified before it is | Any received PTB message MUST be verified before it is used to | |||
used to update the PMTU discovery information [RFC8201]. This | update the PLPMTU discovery information [RFC8201]. This | |||
verification confirms that the PTB message was sent in response | verification confirms that the PTB message was sent in response | |||
to a packet originated by the sender, and needs to be performed | to a packet originated by the sender, and needs to be performed | |||
before the PMTU discovery method reacts to the PTB message. When | before the PLPMTU discovery method reacts to the PTB message. | |||
the router link MTU is indicated in the PTB message this MAY be | When the router link MTU is indicated in the PTB message this MAY | |||
used by datagram PLPMTUD to reduce the size of a probe, but MUST | be used by DPLPMTUD to reduce the probe size but MUST NOT be used | |||
NOT be used to increase the effective PMTU ([RFC8201]). | to increase the PLPMTU ([RFC8201]). Verification SHOULD utilise | |||
information that can not be simply determined by an off-path | ||||
attacker, for example, by checking the value of a protocol header | ||||
field known only to the two PL endpoints. (Some datagram | ||||
applications use well-known source and destination ports and | ||||
therefore this check needs to rely on other information.) | ||||
5. Reception feedback: The destination PL endpoint is REQUIRED to | 5. Reception feedback: The destination PL endpoint is REQUIRED to | |||
provide a feedback method that indicates to the PLPMTUD sender | provide a feedback method that indicates to the DPLPMTUD sender | |||
when a probe packet has been received by the destination | when a probe packet has been received by the destination PL | |||
endpoint. The local PL endpoint at the sending node is REQUIRED | endpoint. The local PL endpoint at the sending node is REQUIRED | |||
to pass this feedback to the sender-side PLPMTUD method. | to pass this feedback to the sender-side DPLPMTUD method. | |||
6. Probing and congestion control: The isolated loss of a probe | 6. Probing and congestion control: The isolated loss of a probe | |||
packet SHOULD NOT be treated as an indication of congestion and | packet SHOULD NOT be treated as an indication of congestion and | |||
its loss does not directly trigger a congestion control reaction | its loss SHOULD NOT directly trigger a congestion control | |||
[RFC4821]. | reaction [RFC4821]. | |||
7. Probe loss recovery: If the data block carried by a probe message | 7. Probe loss recovery: If the data block carried by a probe message | |||
needs to be sent reliably, the PL (or layers above) MUST arrange | needs to be sent reliably, the PL (or layers above) MUST arrange | |||
retransmission/repair of any resulting loss. This method MUST be | retransmission/repair of any resulting loss. This method MUST be | |||
robust in the case where probe packets are lost due to other | robust in the case where probe packets are lost due to other | |||
reasons (including link transmission error, congestion). The | reasons (including link transmission error, congestion). The | |||
PLPMTUD method treats isolated loss of a probe packet (with or | DPLPMTUD method treats isolated loss of a probe packet (with or | |||
without a PTB message) as a potential indication of a PMTU limit | without a PTB message) as a potential indication of a PMTU limit | |||
on the path. The PL MAY retransmit any data included in a lost | on the path, but not as an indication of congestion [CC]. | |||
probe packet without adjusting its congestion window [RFC4821]. | ||||
8. Cached effective PMTU: The sender MUST cache the effective PMTU | ||||
value used by an instance of the PL between probes and needs also | ||||
to consider the disruption that could be incurred by an | ||||
unsuccessful probe - both upon the flow that incurs a probe loss, | ||||
and other flows that experience the effect of additional probe | ||||
traffic. | ||||
9. Shared effective PMTU state: The PMTU value could also be stored | 8. Shared PLPMTU state: The PLPMTU value could also be stored with | |||
with the corresponding entry in the destination cache and used by | the corresponding entry in the destination cache and used by | |||
other PL instances. The specification of PLPMTUD [RFC4821] | other PL instances. The specification of PLPMTUD [RFC4821] | |||
states: "If PLPMTUD updates the MTU for a particular path, all | states: "If PLPMTUD updates the MTU for a particular path, all | |||
Packetization Layer sessions that share the path representation | Packetization Layer sessions that share the path representation | |||
(as described in Section 5.2 of [RFC4821]) SHOULD be notified to | (as described in Section 5.2 of [RFC4821]) SHOULD be notified to | |||
make use of the new MTU and make the required congestion control | make use of the new MTU and make the required congestion control | |||
adjustments". Such methods need to robust to the wide variety of | adjustments". Such methods need to be robust to the wide variety | |||
underlying network forwarding behaviours. Section 5.2 of | of underlying network forwarding behaviours. PLPMTU adjustments | |||
[RFC8201] provides guidance on the caching of PMTU information | based on shared PLPMTU values should be incorporated in the | |||
and also the relation to IPv6 flow labels. | search algorithms. Section 5.2 of [RFC8201] provides guidance on | |||
the caching of PMTU information and also the relation to IPv6 | ||||
flow labels. | ||||
In addition the following design principles are stated: | In addition, the following principles are stated for design of a | |||
DPLPMTUD method: | ||||
o Suitable MPS: The PLPMTUD method SHOULD avoid forcing an | o MPS: A method MUST signal appropriate MPS to the higher layer | |||
application to use an arbitrary small MPS (effective PMTU) for | using the PL. This may change following a change to the path. The | |||
transmission while the method is searching for the currently | method SHOULD avoid forcing an application to use an arbitrary | |||
supported PMTU. Datagram PLs do not necessarily support | small MPS (PLPMTU) for transmission while the method is searching | |||
fragmentation of PDUs larger than the PMTU. A reduced MPS can | for the currently supported PLPMTU. Datagram PLs do not | |||
adversely impact the performance of a datagram application. | necessarily support fragmentation of PDUs larger than the PLPMTU. | |||
A reduced MPS can adversely impact the performance of a datagram | ||||
application. | ||||
o Path validation: The PLPMTUD method MUST be robust to path changes | o Path validation: A method MUST be robust to path changes that | |||
that could have occurred since the path characteristics were last | could have occurred since the path characteristics were last | |||
confirmed. | confirmed, and to the possibility of inconsistent path information | |||
being received. | ||||
o Datagram reordering: A method MUST be robust to the possibility | o Datagram reordering: A method MUST be robust to the possibility | |||
that a flow encounters reordering, or has the traffic (including | that a flow encounters reordering, or has the traffic (including | |||
probe packets) divided over more than one network path. | probe packets) divided over more than one network path. | |||
o When to probe: The PLPMTUD method SHOULD determine whether the | o When to probe: A method SHOULD determine whether the path capacity | |||
path capacity has increased since it last measured the path. This | has increased since it last measured the path. This determines | |||
determines when the path should again be probed. | when the path should again be probed. | |||
3.1. PMTU Probe Packets | 3.1. PLPMTU Probe Packets | |||
PMTU discovery relies upon the sender being able to generate probe | The DPLPMTUD method relies upon the PL sender being able to generate | |||
messages with a specific size. TCP is able to generate probe packets | probe messages with a specific size. TCP is able to generate these | |||
by choosing to appropriately segment data being sent [RFC4821]. | probe packets by choosing to appropriately segment data being sent | |||
[RFC4821]. | ||||
In contrast, a datagram PL that needs to construct a probe packet has | In contrast, a datagram PL that needs to construct a probe packet has | |||
to either request an application to send a data block that is larger | to either request an application to send a data block that is larger | |||
than that generated by an application, or to utilise padding | than that generated by an application, or to utilise padding | |||
functions to extend a datagram beyond the size of the application | functions to extend a datagram beyond the size of the application | |||
data block. Protocols that permit exchange of control messages | data block. Protocols that permit exchange of control messages | |||
(without an application data block) could alternatively prefer to | (without an application data block) could alternatively prefer to | |||
generate a probe packet by extending a control message with padding | generate a probe packet by extending a control message with padding | |||
data. | data. | |||
When the method fails to validate the PMTU for the path, it may be | When the method fails to validate the PLPMTU, it may be required to | |||
required to send a probe packet with a size less than the size of the | send a probe packet with a size less than the size of the data block | |||
data block generated by an application. In this case, the PL could | generated by an application. In this case, the PL could provide a | |||
provide a way to fragment a datagram at the PL, or could instead | way to fragment a datagram at the PL, or could instead utilise a | |||
utilise a control packet with padding. | control packet with padding. | |||
A receiver needs to be able to distinguish an in-band data block from | A receiver needs to be able to distinguish an in-band data block from | |||
any added padding. This is needed to ensure that any added padding | any added padding. This is needed to ensure that any added padding | |||
is not passed on to an application at the receiver. | is not passed on to an application at the receiver. | |||
This results in three possible ways that a sender can create a probe | This results in three possible ways that a sender can create a probe | |||
packet: | packet listed in order of preference: | |||
Probing using application data: A probe packet that contains a data | Probing using padding data: A probe packet that contains only control | |||
block supplied by an application that matches the size required | information together with any padding needed to inflate the packet | |||
for the probe. This method requests the application to issue a | to the size required for the probe packet. Since these probe | |||
data block of the desired probe size. If the application/ | packets do not carry an application-supplied data block, they do | |||
transport needs protection from the loss of an unsuccessful probe | not typically require retransmission, although they do still | |||
packet, the application/transport needs then to perform transport- | consume network capacity and incur endpoint processing. | |||
layer retransmission/repair of the data block (e.g., by | ||||
retransmission after loss is detected or by duplicating the data | ||||
block in a datagram without the padding). | ||||
Probing using application data and padding data: A probe packet that | Probing using application data and padding data: A probe packet that | |||
contains a data block supplied by an application that is combined | contains a data block supplied by an application that is combined | |||
with padding to inflate the length of the datagram to the size | with padding to inflate the length of the datagram to the size | |||
required for the probe. If the application/transport needs | required for the probe packet. If the application/transport needs | |||
protection from the loss of this probe packet, the application/ | protection from the loss of this probe packet, the application/ | |||
transport may perform transport-layer retransmission/repair of the | transport may perform transport-layer retransmission/repair of the | |||
data block (e.g., by retransmission after loss is detected or by | data block (e.g., by retransmission after loss is detected or by | |||
duplicating the data block in a datagram without the padding | duplicating the data block in a datagram without the padding | |||
data). | data). | |||
Probing using padding data: A probe packet that contains only | Probing using application data: A probe packet that contains a data | |||
control information together with any padding needed to inflate | block supplied by an application that matches the size required | |||
the packet to the size required for the probe. Since these probe | for the probe packet. This method requests the application to | |||
packets do not carry an application-supplied data block, they do | issue a data block of the desired probe size. If the application/ | |||
not typically require retransmission, although they do still | transport needs protection from the loss of an unsuccessful probe | |||
consume network capacity and incur endpoint processing. | packet, the application/transport needs then to perform transport- | |||
layer retransmission/repair of the data block (e.g., by | ||||
retransmission after loss is detected). | ||||
A datagram PLPMTUD MAY choose to use only one of these methods to | A PL that uses a probe packet carrying an application data block, | |||
simplify the implementation. | could need to retransmit this application data block if the probe | |||
fails. This could need the PL to re-fragment the data block to a | ||||
smaller packet size that is expected to traverse the end-to-end path | ||||
(which could utilise network-layer or PL fragmentation when these are | ||||
available). | ||||
3.2. Validation of the Current Effective PMTU | DPLPMTUD MAY choose to use only one of these methods to simplify the | |||
implementation. | ||||
3.2. Validation of Probe Packet Size | ||||
The PL needs a method to determine when probe packets have been | The PL needs a method to determine when probe packets have been | |||
successfully received end-to-end across a network path. | successfully received end-to-end across a network path. | |||
Transport protocols can include end-to-end methods that detect and | Transport protocols can include end-to-end methods that detect and | |||
report reception of specific datagrams that they send (e.g., DCCP and | report reception of specific datagrams that they send (e.g., DCCP and | |||
SCTP provide keep-alive/heartbeat features). When supported, this | SCTP provide keep-alive/heartbeat features). When supported, this | |||
mechanism SHOULD also be used by PLPMTUD to acknowledge reception of | mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of | |||
a probe packet. | a probe packet. | |||
A PL that does not acknowledge data reception (e.g., UDP and UDP- | A PL that does not acknowledge data reception (e.g., UDP and UDP- | |||
Lite) is unable to detect when the packets it sends are discarded | Lite) is unable to detect when the packets that it sends are | |||
because their size is greater than the actual PMTUD. These PLs need | discarded because their size is greater than the actual PMTU. These | |||
to either rely on an application protocol to detect this, or make use | PLs need to either rely on an application protocol to detect this | |||
of an additional transport method such as UDP-Options | loss, or make use of an additional transport method such as UDP- | |||
[I-D.ietf-tsvwg-udp-options]. In addition, they might need to send | Options [I-D.ietf-tsvwg-udp-options]. In addition, they might need | |||
reachability probes (e.g., periodically solicit a response from the | to send reachability probes (e.g., periodically solicit a response | |||
destination) to determine whether the current effective PMTU is still | from the destination) to determine whether the last successfully | |||
supported by the network path. | probed PLPMTU is still supported by the network path. | |||
Section 4 specifies this function for a set of IETF-specified | Section 4 specifies this function for a set of IETF-specified | |||
protocols. | protocols. | |||
3.3. Reduction of the Effective PMTU | 3.3. Reducing the PLPMTU: Confirming Path Characteristics | |||
When the current effective PMTU is no longer supported by the network | If the DPLPMTUD method detects that a packet with the PLPMTU size is | |||
path, the transport needs to detect this and reduce the effective | not supported by the network path, then the DPLPMTUD method needs to | |||
PMTU. | validate the PLPMTU. This can happen when a validated PTB message is | |||
received, or another event that indicates the network path no longer | ||||
sustains this packet size, such as a loss report from the PL. | ||||
o A PL that sends a datagram larger than the actual PMTU that | All implementations of DPLPMTUD are REQUIRED to provide support that | |||
includes no application data block, or one that does not attempt | reduces the PLPMTU when the actual PMTU supported by a network path | |||
to provide any retransmission, can send a new probe packet with an | is less than the PLPMTU. | |||
updated probe size. | ||||
o A PL that wishes to resend the application data block, could then | 3.4. Increasing the PLPMTU: Supporting Path Changes | |||
need to re-fragment the data block to a smaller packet size that | ||||
is expected to traverse the end-to-end path. This could utilise | ||||
network-layer or PL fragmentation when these are available. A | ||||
fragmented datagram MUST NOT be used as a probe packet (see | ||||
[RFC8201]). | ||||
A method can additionally utilise PTB messages to detect when the | An implementation that only reduces the PLPMTU to a suitable size is | |||
actual PMTU supported by a network path is less than the current size | sufficient to ensure reliable operation, but may be very inefficient | |||
of datagrams (or probe messages) that are being sent. | when the actual PMTU changes or when the method (for whatever reason) | |||
makes a suboptimal choice for the PLPMTU. | ||||
A full implementation of the DPLPMTUD method is RECOMMENDED to | ||||
provide a way for the sending PL endpoint to detect when the PLPMTU | ||||
is smaller than the actual PMTU size. This allows the sender to | ||||
increase the PLPMTU following a change in the characteristics of the | ||||
path, such as when a link is reconfigured with a larger MTU, or when | ||||
there is a change in the set of links traversed by an end-to-end flow | ||||
(e.g. after a routing or fail-over decision). | ||||
3.5. Robustness to inconsistent Path information | ||||
The decision to increase the PLPMTU needs to be robust to the | ||||
possibility that information learned about the path is inconsistent | ||||
(this could happen when probe packets are lost due to other reasons, | ||||
or some of the packets in a flow are forwarded along a portion of the | ||||
path that supports a different PMTU). | ||||
Frequent path changes could occur due to unexpected "flapping" - | ||||
where some packets from a flow pass along one path, but other packets | ||||
follow a different path with different properties. DPLPMTUD can be | ||||
made robust to these anomalies by introducing hysteresis into the | ||||
decision to increase the Maximum Packet Size. | ||||
XXX A future revision of this section will recommend | ||||
appropriate methods to provide robustness. XXX | ||||
4. Datagram Packetization Layer PMTUD | 4. Datagram Packetization Layer PMTUD | |||
This section specifies Datagram PLPMTUD. | This section specifies Datagram PLPMTUD (DPLPMTUD). This method can | |||
be introduced at various points in the IP protocol stack, to discover | ||||
the PLPMTU so that the application can use an MPS appropriate to the | ||||
current network path. | ||||
The central idea of PLPMTU discovery is probing by a sender. Probe | (preamble) | |||
packets of increasing size are sent to find out the maximum size of a | ||||
user message that is completely transferred across the network path | ||||
from the sender to the destination. | ||||
4.1. Probing | +-----------+ | |||
| APP* | | ||||
+-----------+ | ||||
__|| | | |___ | ||||
___/ | | | \ | ||||
__/ | | | \__ | ||||
+------++-----+ | +------+ | | ||||
| QUIC*||UDPO*| | | SCTP*| | | ||||
+------++-----+ | +-+-----+ | | ||||
+-----+ +------+ | ||||
| UDP | | SCTP*| | ||||
+-----+ +------+ | ||||
| | | ||||
+----------------------+ | ||||
| Network Interface | | ||||
+----------------------+ | ||||
The PLPMTUD method utilises a timer to trigger the generation of | (postamble) | |||
probe packets. The probe_timer is started each time a probe packet | ||||
is sent to the destination and is cancelled when receipt of the probe | ||||
packet is acknowledged. | ||||
The PROBE_COUNT is initialised to zero when a probe packet is first | The central idea of DPLPMTUD is probing by a sender. Probe packets | |||
sent with a particular size. Each time the probe_timer expires, the | of increasing size are sent to find out the maximum size of user | |||
PROBE_COUNT is incremented, and a probe packet of the same size is | message that is completely transferred across the network path from | |||
retransmitted. The maximum number of retransmissions per probing | the sender to the destination. | |||
size is configured (MAX_PROBES). If the value of the PROBE_COUNT | ||||
reaches MAX_PROBES, probing will be stopped and the last successfully | ||||
probed PMTU is set as the effective PMTU. | ||||
Once probing is completed, the sender continues to use the effective | 4.1. PROBE_SEARCH: Probing for a larger PLPMTU | |||
PMTU until either a PTB message is received or the PMTU_RAISE_TIMER | ||||
expires. If the PL is unable to verify reachability to the | ||||
destination endpoint after probing has completed, the method uses a | ||||
REACHABILITY_TIMER to periodically repeat a probe packet for the | ||||
current effective PMTU size, while the PMTU_RAISE_TIMER is running. | ||||
If the resulting probe packet is not acknowledged (i.e. the | ||||
PROBE_TIMER expires), the method re-starts probing for the PMTU. | ||||
4.2. Verification and Use of PTB Messages | The DPLPMTUD method utilises probe packets to confirm that a packet | |||
of size PROBE_SIZE can traverse the network path. The PROBE_COUNT is | ||||
initialised to zero when a probe packet is first sent with a | ||||
particular size. | ||||
A timer is used to trigger the generation of probe packets. The | ||||
probe_timer is started each time a probe packet is sent to the | ||||
destination and is cancelled when receipt of the probe packet is | ||||
acknowledged. The PROBE_SIZE is confirmed, and this value is then | ||||
assigned to PLPMTU. The DPLPMTUD method may send subsequent probes | ||||
of an increasing size. Increasing probes follows a search strategy | ||||
as discussed in Section 4.7. | ||||
Each time the probe_timer expires, the PROBE_COUNT is incremented, | ||||
the probe_timer is reinitialised, and a probe packet of the same size | ||||
is retransmitted. | ||||
The maximum number of retransmissions for a PROBE_SIZE is configured | ||||
(MAX_PROBES). If the value of the PROBE_COUNT reaches MAX_PROBES, | ||||
probing will stop. | ||||
4.2. The PROBE_DONE state | ||||
When the PL sender completes probing for a larger PLPMTU, it enters | ||||
the PROBE_DONE state. This starts the PMTU_RAISE_TIMER. While this is | ||||
running, the PLPMTU remains at the value set in the last successful | ||||
probe packet. | ||||
If the PL is designed in a way that is unable to verify reachability | ||||
to the destination endpoint after probing has completed, the method | ||||
uses a REACHABILITY_TIMER to periodically repeat a probe packet for | ||||
the current PLPMTU size, while the PMTU_RAISE_TIMER is running. If | ||||
the REACHABILITY_TIMER expires, the method exits the PROBE_DONE | ||||
state. The done state is also exited when a verified PTB message is | ||||
received. | ||||
If the PMTU_RAISE_TIMER expires, the PL sender also exits the | ||||
PROBE_DONE state, but in this case resumes probing from the size of | ||||
the PLPMTU. | ||||
4.3. Verification and Use of PTB Messages | ||||
This section describes processing for both IPv4 ICMP Unreachable | This section describes processing for both IPv4 ICMP Unreachable | |||
messages (type 3) and ICMPv6 packet too big messages. | messages (type 3) and ICMPv6 packet too big messages. | |||
A node that receives a PTB message from a router or middlebox, MUST | A node that receives a PTB message from a router or middlebox, MUST | |||
verify the PTB message. The node checks the protocol information in | verify the PTB message. The node checks the protocol information in | |||
the quoted payload to verify that the message originated from the | the quoted payload to verify that the message originated from the | |||
sending node. The node also checks that the reported MTU size is | sending node. The node also checks that the reported MTU size is | |||
less than the size used by packet probes. PTB messages are discarded | less than the size used by packet probes. PTB messages are discarded | |||
if they fail to pass these checks, or where there is insufficient | if they fail to pass these checks, or where there is insufficient | |||
ICMP payload to perform these checks. The checks are intended to | ICMP payload to perform these checks. The checks are intended to | |||
provide protection from packets that originate from a node that is | provide protection from packets that originate from a node that is | |||
not on the network path or a node that attempts to report a larger | not on the network path or a node that attempts to report a larger | |||
MTU than the current probe size. | MTU than the current probe size. | |||
PTB messages that have been verified can be utilised by the DPLPMTUD | PTB messages that have been verified can be utilised by the DPLPMTUD | |||
algorithm. A method that utilises these PTB messages can improve | algorithm. A method that utilises these PTB messages can improve | |||
performance compared to one that relies solely on probing. | performance compared to one that relies solely on probing. | |||
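The PTB checks described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name verify_ptb, the `matches_sent` callback (which stands in for the PL-specific comparison of the quoted payload against packets the node actually sent), and the 8-byte minimum quoted length are all hypothetical choices made for the example.

```python
from typing import Callable


def verify_ptb(quoted: bytes,
               reported_mtu: int,
               probe_size: int,
               matches_sent: Callable[[bytes], bool],
               min_quoted_len: int = 8) -> bool:
    """Return True only if a PTB message passes all verification checks.

    quoted         -- protocol information quoted in the ICMP payload
    reported_mtu   -- MTU value reported by the PTB message
    probe_size     -- size currently used by probe packets
    matches_sent   -- assumed PL-specific check against sent packets
    min_quoted_len -- assumed minimum payload needed to run the checks
    """
    if len(quoted) < min_quoted_len:
        return False  # insufficient ICMP payload: discard
    if not matches_sent(quoted):
        return False  # quoted packet did not originate from this node
    # The reported MTU has to be less than the size used by probes.
    return reported_mtu < probe_size
```

A message for which verify_ptb() returns False is discarded; one that passes can be offered to the DPLPMTUD algorithm as input.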
4.3. Timers | 4.4. Timers | |||
This method utilises three timers: | The method in the previous subsections utilises three timers: | |||
PROBE_TIMER: Configured to expire after a period longer than the | PROBE_TIMER: Configured to expire after a period longer than the | |||
maximum time to receive an acknowledgment to a probe packet. This | maximum time to receive an acknowledgment to a probe packet. This | |||
value MUST be larger than 1 second, and SHOULD be larger than 15 | value MUST be larger than 1 second, and SHOULD be larger than 15 | |||
seconds. Guidance on selection of the timer value is provided in | seconds. Guidance on selection of the timer value is provided in | |||
section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | |||
PMTU_RAISE_TIMER: Configured to the period a sender ought to | If the PL has an RTT estimate and timely acknowledgements, the | |||
continue to use the current effective PMTU, after which it | PROBE_TIMER can be derived from the PL RTT estimate. | |||
re-commences probing for a higher PMTU. This timer has a period of | ||||
600 secs, as recommended by PLPMTUD [RFC4821]. | ||||
REACHABILITY_TIMER: Configured to the period a sender ought to wait | PMTU_RAISE_TIMER: Configured to the period a sender ought to continue | |||
before confirming the current effective PMTU is still supported. | to use the current PLPMTU, after which it re-commences probing for a | |||
This is less than the PMTU_RAISE_TIMER. | higher PMTU. This timer has a period of 600 secs, as recommended | |||
by PLPMTUD [RFC4821]. | ||||
An application that needs to employ keep-alive messages to deliver | REACHABILITY_TIMER: Configured to the period a sender ought to wait | |||
useful service over UDP SHOULD NOT transmit them more frequently | before confirming the current PLPMTU is still supported. This is | |||
than once every 15 seconds and SHOULD use longer intervals when | less than the PMTU_RAISE_TIMER and used to decrease the PLPMTU | |||
possible. DPLPMTUD ought to suspend reachability probes when no | (e.g. when a black hole is encountered). | |||
application data has been sent since the previous probe packet. | ||||
Guidance on selection of the timer value is provided in section | DPLPMTUD ought to suspend reachability probes when no application | |||
3.1.1 of the UDP Usage Guidelines [RFC8085]. | data has been sent since the previous probe packet. Guidance on | |||
selection of the timer value is provided in section 3.1.1 of the | ||||
UDP Usage Guidelines [RFC8085]. DPLPMTUD ought to be suspended or | ||||
only sent in conjunction with out traffic during periods of | ||||
dormancy. This verification needs to be frequent enough when data | ||||
is flowing that extensive amounts of traffic are not black-holed. | ||||
An implementation could implement the various timers using a single | An implementation could implement the various timers using a single | |||
timer process. | timer process. | |||
4.4. Constants | 4.5. Constants | |||
The following constants are defined: | The following constants are defined: | |||
MAX_PROBES: The maximum value of the PROBE_ERROR_COUNTER. The | MAX_PROBES: The maximum value of the PROBE_ERROR_COUNTER. The default | |||
default value of MAX_PROBES is 10. | value of MAX_PROBES is 10. | |||
MIN_PMTU: The smallest allowed probe packet size. This value is | MIN_PMTU: The smallest allowed probe packet size. For IPv6, this | |||
1280 bytes, as specified in [RFC2460]. For IPv4, the minimum | value is 1280 bytes, as specified in [RFC2460]. For IPv4, the | |||
value is 68 bytes. (An IPv4 router is required to be able to | minimum value is 68 bytes. (An IPv4 router is required to be able | |||
forward a datagram of 68 octets without further fragmentation. | to forward a datagram of 68 octets without further fragmentation. | |||
This is the combined size of an IPv4 header and the minimum | This is the combined size of an IPv4 header and the minimum | |||
fragment size of 8 octets.) | fragment size of 8 octets.) | |||
BASE_PMTU: The BASE_PMTU is considered a size that ought to work | BASE_PMTU: The BASE_PMTU is considered a size that ought to work in | |||
in most cases. The size is equal to or larger than the minimum | most cases. The size is equal to or larger than the minimum | |||
permitted and smaller than the maximum allowed. In the case of | permitted and smaller than the maximum allowed. In the case of | |||
IPv6, this value is 1280 bytes [RFC2460]. When using IPv4, a size | IPv6, this value is 1280 bytes [RFC2460]. When using IPv4, a size | |||
of 1200 is RECOMMENDED. | of 1200 bytes is RECOMMENDED. | |||
MAX_PMTU: The MAX_PMTU is the largest size of PMTU that is probed. | MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU that is probed. | |||
This has to be less than or equal to the minimum of the local MTU | This has to be less than or equal to the minimum of the local MTU | |||
of the outgoing interface and the destination effective MTU for | of the outgoing interface and the destination PLMTU for receiving. | |||
receiving. An application or PL may reduce this when it knows | An application or PL may reduce this when it knows there is no | |||
there is no need to send packets above a specific size. | need to send packets above a specific size. | |||
4.5. Variables | 4.6. Variables | |||
This method utilises a set of variables: | This method utilises a set of variables: | |||
effective PMTU: The effective PMTU is the maximum size of datagram | PROBE_TIMER: Configured to expire after a period longer than the | |||
that the method has currently determined can be supported along | maximum time to receive an acknowledgment to a probe packet. This | |||
the entire path. | value MUST be larger than 1 second, and SHOULD be larger than 15 | |||
seconds. Guidance on selection of the timer value is provided in | ||||
section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | ||||
PROBED_SIZE: The PROBED_SIZE is the size of the current probe | A PL with RTT estimates may use values smaller than 1 second, | |||
packet. This is a tentative value for the effective PMTU, which | derived from its RTT estimate, to speed up detection of | |||
is awaiting confirmation by an acknowledgment. | connectivity issues on the path. | |||
PROBE_COUNT: This is a count of the number of unsuccessful probe | PROBED_SIZE: The PROBED_SIZE is the size of the current probe packet. | |||
packets that have been sent with size PROBED_SIZE. The value is | This is a tentative value for the PLPMTU, which is awaiting | |||
confirmation by an acknowledgment. | ||||
PROBE_COUNT: This is a count of the number of unsuccessful probe | ||||
packets that have been sent with size PROBED_SIZE. The value is | ||||
initialised to zero when a particular size of PROBED_SIZE is first | initialised to zero when a particular size of PROBED_SIZE is first | |||
attempted. | attempted. | |||
PTB_SIZE: The PTB_SIZE is the value returned by a verified PTB message | PTB_SIZE: The PTB_SIZE is the value returned by a verified PTB message | |||
indicating the local MTU size of a router along the path. | indicating the local MTU size of a router along the path. | |||
4.6. Selecting PROBED_SIZE | 4.7. Selecting PROBED_SIZE | |||
Implementations discover the search range by validating the minimum | Implementations discover the search range by validating the minimum | |||
path MTU and then using the probe method to select a PROBED_SIZE less | path MTU and then using the probe method to select a PROBED_SIZE less | |||
than or equal to the maximum PMTU_MAX. Where PMTU_MAX is the minimum | than or equal to the maximum PMTU_MAX. Where PMTU_MAX is the minimum | |||
of the the local link MTU and EMTU_R (learned from the remote | of the local link MTU and EMTU_R (learned from the remote endpoint). | |||
endpoint). The PMTU_MAX MAY be constrained by an application that | The PMTU_MAX MAY be constrained by an application that has a maximum | |||
has a maximum to the size of datagrams it wishes to send. | to the size of datagrams it wishes to send. | |||
Implementations use a search algorithm to choose probe sizes within | Implementations use a search algorithm to choose probe sizes within | |||
the search range. XXX The current method does not specify or | the search range. | |||
recommend a specific methods for selecting a probe size. One simple | ||||
method is to increase the size of probe in increments until it fails, | ||||
other methods may use tables to select probe sizes, or search | ||||
algorithms - this part to be expanded based on experience and | ||||
consideration of methods XXX | ||||
xxx A future version of this section will detail example methods for | ||||
selecting probe size values, but does not plan to mandate a single | ||||
method. xxx | ||||
Implementations MAY optimise the search procedure by selecting step | Implementations MAY optimise the search procedure by selecting step | |||
sizes from a table of common MTU sizes. | sizes from a table of common PMTU sizes. | |||
Implementations SHOULD select probe sizes to maximise the gain in | Implementations SHOULD select probe sizes to maximise the gain in | |||
PMTU each search step. Implementations ought to take into | PLPMTU each search step. Implementations ought to take into | |||
consideration useful probe size steps and a minimum useful gain in | consideration useful probe size steps and a minimum useful gain in | |||
PMTU. | PLPMTU. | |||
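One possible table-driven selection of the next probe size can be sketched as follows. The document does not mandate a search algorithm, so this is only an example: the contents of COMMON_SIZES, the function name next_probe_size, and the default `min_gain` of 8 bytes are assumptions chosen for illustration.

```python
import bisect
from typing import Optional

# Illustrative table of candidate sizes in ascending order. The values
# are assumptions for this example and are not taken from the document.
COMMON_SIZES = [1280, 1400, 1492, 1500, 4352, 9000]


def next_probe_size(plpmtu: int, max_pmtu: int,
                    min_gain: int = 8) -> Optional[int]:
    """Return the next size to probe, or None if no useful gain remains.

    plpmtu   -- last size confirmed by an acknowledged probe
    max_pmtu -- upper bound of the search range
    min_gain -- assumed minimum useful increase in bytes
    """
    if max_pmtu - plpmtu < min_gain:
        return None  # remaining range is too small to be worth probing
    first_larger = bisect.bisect_right(COMMON_SIZES, plpmtu)
    for size in COMMON_SIZES[first_larger:]:
        if size > max_pmtu:
            break
        if size - plpmtu >= min_gain:
            return size
    # No table entry fits below the bound, so probe the bound itself.
    return max_pmtu
```

For example, with a confirmed size of 1280 and an upper bound of 1500, this sketch probes 1400 next; once 1500 is confirmed it returns None and the search ends.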
4.7. State Machine | 4.8. Black Hole Detection | |||
A state machine for Datagram PLPMTUD is depicted in Figure 1. If | The DPLPMTUD method can be used to detect paths that fail to support | |||
multihoming is supported, a state machine is needed for each active | a packet size, but return no PTB message. The black hole detection | |||
path. | function detects such cases and responds by reducing the PLPMTU, | |||
allowing the endpoint to inform the application of the reduced MPS | ||||
and accordingly send smaller packets. Black Hole detection is | ||||
triggered by the reachability function. | ||||
PROBE_TIMER expiry | 4.9. State Machine | |||
(PROBE_COUNT = MAX_PROBES) | ||||
+-------------+ +--------------+ | ||||
=->| PROBE_START |--------------->|PROBE_DISABLED| | ||||
PROBE_TIMER expiry | +-------------+ +--------------+ | ||||
(PROBE_COUNT = | | | | ||||
MAX_PROBES) ------- | Connectivity confirmed | ||||
v | ||||
----------- +------------+ -- PROBE_TIMER expiry | ||||
MAX_PMTU acked or | | PROBE_BASE | | (PROBE_COUNT < | ||||
PTB (>= BASE_PMTU)| -----> +------------+ <- MAX_PROBES) | ||||
---------------- | /\ | | | ||||
| | | | | PTB | ||||
| PMTU_RAISE_TIMER| | | | (PTB_SIZE < BASE_PMTU) | ||||
| or reachability | | | | or | ||||
| (PROBE_COUNT | | | | PROBE_TIMER expiry | ||||
| = MAX_PROBES) | | | | (PROBE_COUNT = MAX_PROBES) | ||||
| ------------- | | \ | ||||
| | PTB | | \ | ||||
| | (< PROBED_SIZE)| | \ | ||||
| | | | ---------------- | ||||
| | | | | | ||||
| | | | Probe | | ||||
| | | | acked | | ||||
v | | v v | ||||
+------------+ +--------------+ Probe +-------------+ | ||||
| PROBE_DONE |<-------------- | PROBE_SEARCH |<-------| PROBE_ERROR | | ||||
+------------+ MAX_PMTU acked +--------------+ acked +-------------+ | ||||
/\ | or /\ | | ||||
| | PROBE_TIMER expiry | | | ||||
| |(PROBE_COUNT = MAX_PROBES) | | | ||||
| | | | | ||||
------ -------- | ||||
Reachability probe acked PROBE_TIMER expiry | ||||
or PROBE_TIMER expiry (PROBE_COUNT < MAX_PROBES) | ||||
(PROBE_COUNT < MAX_PROBES) or | ||||
Probe acked | ||||
Figure 1: State machine for Datagram PLPMTUD | A state machine for DPLPMTUD is depicted in Figure 2. If multihoming | |||
is supported, a state machine is needed for each active path. | ||||
XXX State machine to be updated to describe handling of validated PTB | PROBE_TIMER expiry | |||
messages XXX | (PROBE_COUNT = MAX_PROBES) | |||
+-------------+ +--------------+ | ||||
=->| PROBE_START |--------------->|PROBE_DISABLED| | ||||
PROBE_TIMER expiry | +-------------+ +--------------+ | ||||
(PROBE_COUNT = | | | | ||||
MAX_PROBES) ------- | Connectivity confirmed | ||||
v | ||||
----------- +------------+ -- PROBE_TIMER expiry | ||||
MAX_PMTU acked or | | PROBE_BASE | | (PROBE_COUNT < | ||||
PTB (>= BASE_PMTU)| -----> +------------+ <- MAX_PROBES) | ||||
---------------- | /\ | | | ||||
| | | | | PTB | ||||
| PMTU_RAISE_TIMER| | | | (PTB_SIZE < BASE_PMTU) | ||||
| or reachability | | | | or | ||||
| (PROBE_COUNT | | | | PROBE_TIMER expiry | ||||
| = MAX_PROBES) | | | | (PROBE_COUNT = MAX_PROBES) | ||||
| ------------- | | \ | ||||
| | PTB | | \ | ||||
| | (< PROBED_SIZE)| | \ | ||||
| | | | ---------------- | ||||
| | | | | | ||||
| | | | Probe | | ||||
| | | | acked | | ||||
v | | v v | ||||
+------------+ +--------------+ Probe +-------------+ | ||||
| PROBE_DONE |<-------------- | PROBE_SEARCH |<-------| PROBE_ERROR | | ||||
+------------+ MAX_PMTU acked +--------------+ acked +-------------+ | ||||
/\ | or /\ | | ||||
| | PROBE_TIMER expiry | | | ||||
| |(PROBE_COUNT = MAX_PROBES) | | | ||||
| | | | | ||||
------ -------- | ||||
Reachability probe acked PROBE_TIMER expiry | ||||
or PROBE_TIMER expiry (PROBE_COUNT < MAX_PROBES) | ||||
(PROBE_COUNT < MAX_PROBES) or | ||||
Probe acked | ||||
XXX Method may be updated to clarify how probe sizes are used during | XXX A future version of this document will update the state machine | |||
probing XXX | to describe handling of validated PTB messages. XXX | |||
The following states are defined to reflect the probing process: | The following states are defined to reflect the probing process: | |||
PROBE_START: The PROBE_START state is the initial state before | PROBE_START: The PROBE_START state is the initial state before | |||
probing has started. PLPMTUD is not performed in this state. The | probing has started. PLPMTUD is not performed in this state. The | |||
state transitions to PROBE_BASE, when a path has been confirmed, | state transitions to PROBE_BASE, when a path has been confirmed, | |||
i.e. when a sent packet has been acknowledged on this path. The | i.e. when a sent packet has been acknowledged on this path. Any | |||
effective PMTU is set to the BASE_PMTU size. Probing ought to | transport method may be used to exit PROBE_BASE as long as the | |||
start immediately after connection setup to prevent the loss of | sent packet is acknowledged by the other side. The PLPMTU is set | |||
user data. | to the BASE_PMTU size. Probing ought to start immediately after | |||
connection setup to prevent the loss of user data. | ||||
PROBE_BASE: The PROBE_BASE state is the starting point for probing | PROBE_BASE: The PROBE_BASE state is the starting point for probing | |||
with datagram PLPMTUD. It is used to confirm whether the | with datagram PLPMTUD. It is used to confirm whether the BASE_PMTU | |||
BASE_PMTU size is supported by the network path. On entry, the | size is supported by the network path. On entry, the PROBED_SIZE | |||
PROBED_SIZE is set to the BASE_PMTU size and the PROBE_COUNT is | is set to the BASE_PMTU size and the PROBE_COUNT is set to zero. | |||
set to zero. A probe packet is sent, and the PROBE_TIMER is | A probe packet is sent, and the PROBE_TIMER is started. The state | |||
started. The state is left when the PROBE_COUNT reaches | is left when the PROBE_COUNT reaches MAX_PROBES; a PTB message is | |||
MAX_PROBES; a PTB message is verified, or a probe packet is | verified, or a probe packet is acknowledged. | |||
acknowledged. | ||||
PROBE_SEARCH: The PROBE_SEARCH state is the main probing state. | PROBE_SEARCH: The PROBE_SEARCH state is the main probing state. This | |||
This state is entered either when probing for the BASE_PMTU was | state is entered either when probing for the BASE_PMTU was | |||
successful or when there is a successful reachability test in the | successful or when there is a successful reachability test in the | |||
PROBE_ERROR state. On entry, the effective PMTU is set to the | PROBE_ERROR state. On entry, the PLPMTU is set to the last | |||
last acknowledged PROBED_SIZE. | acknowledged PROBED_SIZE. | |||
The PROBE_COUNT is set to zero when the first probe packet is sent | The PROBE_COUNT is set to zero when the first probe packet is sent | |||
for each probed size. Each time a probe packet is acknowledged, | for each probe size. Each time a probe packet is acknowledged, | |||
the effective PMTU is set to the PROBED_SIZE, and then the | the PLPMTU is set to the PROBED_SIZE, and then the PROBED_SIZE is | |||
PROBED_SIZE is increased. | increased. | |||
When a probe packet is sent and not acknowledged within the period | When a probe packet is sent and not acknowledged within the period | |||
of the PROBE_TIMER, the PROBE_COUNT is incremented and the probe | of the PROBE_TIMER, the PROBE_COUNT is incremented and the probe | |||
packet is retransmitted. The state is exited when the PROBE_COUNT | packet is retransmitted. The state is exited when the PROBE_COUNT | |||
reaches MAX_PROBES; a PTB message is verified; or a probe of size | reaches MAX_PROBES; a PTB message is verified; or a probe of size | |||
PMTU_MAX is acknowledged. | PMTU_MAX is acknowledged. | |||
PROBE_ERROR: The PROBE_ERROR state represents the case where the | PROBE_ERROR: The PROBE_ERROR state represents the case where the | |||
network path is not known to support an effective PMTU of at least | network path is not known to support a PLPMTU of at least the | |||
the BASE_PMTU size. It is entered when either a probe of size | BASE_PMTU size. It is entered when either a probe of size | |||
BASE_PMTU has not been acknowledged or a verified PTB message | BASE_PMTU has not been acknowledged or a verified PTB message | |||
indicates a smaller link MTU than the BASE_PMTU. On entry, the | indicates a smaller link MTU than the BASE_PMTU. On entry, the | |||
PROBE_COUNT is set to zero and the PROBED_SIZE is set to the | PROBE_COUNT is set to zero and the PROBED_SIZE is set to the | |||
MIN_PMTU size, and the effective PMTU is reset to MIN_PMTU size. | MIN_PMTU size, and the PLPMTU is reset to MIN_PMTU size. In this | |||
In this state, a probe packet is sent, and the PROBE_TIMER is | state, a probe packet is sent, and the PROBE_TIMER is started. | |||
started. The state transitions to the PROBE_SEARCH state when a | The state transitions to the PROBE_SEARCH state when a probe | |||
probe packet is acknowledged. | packet is acknowledged. | |||
PROBE_DONE: The PROBE_DONE state indicates a successful end to a | PROBE_DONE: The PROBE_DONE state indicates a successful end to a | |||
probing phase. Datagram PLPMTUD remains in this state until | probing phase. DPLPMTUD remains in this state until either the | |||
either the PMTU_RAISE_TIMER expires or a received PTB message is | PMTU_RAISE_TIMER expires or a received PTB message is verified. | |||
verified. | ||||
When PLPMTUD uses an unacknowledged PL and is in the PROBE_DONE | When PLPMTUD uses an unacknowledged PL and is in the PROBE_DONE | |||
state, a REACHABILITY_TIMER periodically resets the PROBE_COUNT | state, a REACHABILITY_TIMER periodically resets the PROBE_COUNT | |||
and schedules a probe packet with the size of the effective PMTU. | and schedules a probe packet with the size of the PLPMTU. If the | |||
If the probe packet fails to be acknowledged after MAX_PROBES | probe packet fails to be acknowledged after MAX_PROBES attempts, | |||
attempts, the method enters the PROBE_BASE state. When used with | the method enters the PROBE_BASE state. When used with an | |||
an acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to | acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to | |||
probe in this state. | probe in this state. | |||
PROBE_DISABLED: The PROBE_DISABLED state indicates that connectivity | PROBE_DISABLED: The PROBE_DISABLED state indicates that connectivity | |||
could not be established. DPLPMTUD MUST NOT probe in this state. | could not be established. DPLPMTUD MUST NOT probe in this state. | |||
Appendix A contains an informative description of key events. | Appendix A contains an informative description of key | |||
events. | ||||
5. Specification of Protocol-Specific Methods | 5. Specification of Protocol-Specific Methods | |||
This section specifies protocol-specific details for datagram PLPMTUD | This section specifies protocol-specific details for datagram PLPMTUD | |||
for IETF-specified transports. | for IETF-specified transports. | |||
5.1. DPLPMTUD for UDP and UDP-Lite | The first subsection provides guidance on how to implement the | |||
DPLPMTUD method as a part of an application using UDP or UDP-Lite. | ||||
The guidance also applies to other datagram services that do not | ||||
include a specific transport protocol (such as a tunnel | ||||
encapsulation). The following subsections describe how DPLPMTUD can be | ||||
implemented as a part of the transport service, allowing applications | ||||
using the service to benefit from discovery of the PLPMTU without | ||||
themselves needing to implement this method. | ||||
The current specifications of UDP [RFC0768] and UDP-LIte [RFC3828] do | 5.1. Application support for DPLPMTUD with UDP or UDP-Lite | |||
not define a method in the RFC-series that supports PLPMTUD. In | ||||
particular, these transports do not provide the transport layer | ||||
features needed to implement datagram PLPMTUD, and any support for | ||||
Datagram PLPMTUD would therefore need to rely on higher-layer | ||||
protocol features [RFC8085]. | ||||
5.1.1. UDP Options | The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do | |||
not define a method in the RFC-series that supports PLPMTUD. In | ||||
particular, the UDP transport does not provide the transport layer | ||||
features needed to implement datagram PLPMTUD. | ||||
UDP-Options [I-D.ietf-tsvwg-udp-options] supply the additional | The DPLPMTUD method can be implemented as a part of an application | |||
functionality required to implement datagram PLPMTUD. This enables | built directly or indirectly on UDP or UDP-Lite, but relies on | |||
padding to be added to UDP datagrams and can be used to provide | higher-layer protocol features to implement the method [RFC8085]. | |||
feedback acknowledgement of received probe packets. | ||||
5.1.2. UDP Options Required for PLPMTUD | Some primitives used by DPLPMTUD might not be available via the | |||
Datagram API (e.g., the ability to access the PLPMTU cache, or | ||||
interpret received ICMP PTB messages). | ||||
This subsection proposes two new UDP-Options that add support for | In addition, it is desirable that PMTU discovery is not performed by | |||
requesting a datagram response be sent and to mark this datagram as a | multiple protocol layers. An application SHOULD avoid implementing | |||
response to a request. | DPLPMTUD when the underlying transport system provides this | |||
capability. Using a common method for manging the PLPMTU has | ||||
benefits, both in the ability to share state between different | ||||
processes and opportunities to coordinate probing. | ||||
XXX Future versions of the spec may define a parameter in an Option | 5.1.1. Application Request | |||
to indicate the EMTU_R to the peer that can be used to initialise | ||||
PMTU_MAX. XXX | ||||
5.1.2.1. Echo Request Option | An application needs an application-layer protocol mechanism (such as | |||
a message acknowledgement method) that solicits a response from a | ||||
destination endpoint. The method SHOULD allow the sender to check | ||||
the value returned in the response to provide additional protection | ||||
from off-path insertion of data [RFC8085]; suitable methods include a | ||||
parameter known only to the two endpoints, such as a session ID or | ||||
initialised sequence number. | ||||
The Echo Request Option allows a sending endpoint to solicit a | 5.1.2. Application Response | |||
response from a destination endpoint. | An application needs an application-layer protocol mechanism to | |||
communicate the response from the destination endpoint. This | ||||
response may indicate successful reception of the probe across the | ||||
path, but could also indicate that some (or all) packets have failed | ||||
to reach the destination. | ||||
The Echo Request carries a four byte token set by the sender. This | 5.1.3. Sending Application Probe Packets | |||
token can be set to a value that is likely to be known only to the | ||||
sender (and becomes known to nodes along the end-to-end path). The | ||||
sender can then check the value returned in the response to provide | ||||
additional protection from off-path insertion of data [RFC8085]. | ||||
+---------+--------+-----------------+ | A probe packet may carry an application data block, but the | |||
| Kind=9 | Len=6 | Token | | successful transmission of this data is at risk when used for | |||
+---------+--------+-----------------+ | probing. Some applications may prefer to use a probe packet that | |||
1 byte 1 byte 4 bytes | does not carry an application data block to avoid disruption to | |||
normal data transfer. | ||||
Figure 2: UDP ECHOREQ Option Format | 5.1.4. Validating the Path | |||
5.1.2.2. Echo Response Option | An application that does not have other higher-layer information | |||
confirming correct delivery of datagrams SHOULD implement the | ||||
REACHABILITY_TIMER to periodically send probe packets while in the | ||||
PROBE_DONE state. | ||||
The Echo Response Option is generated by the PL in response to | 5.1.5. Handling of PTB Messages | |||
reception of a previously received Echo Request. The Token field | ||||
associates the response with the Token value carried in the most | ||||
recently-received Echo Request. The rate of generation of UDP | ||||
packets carrying an Echo Response Option MAY be rate-limited. | ||||
+---------+--------+-----------------+ | An application that is able and wishes to receive PTB messages MUST | |||
| Kind=10 | Len=6 | Token | | perform ICMP verification as specified in Section 5.2 of [RFC8085]. | |||
+---------+--------+-----------------+ | This requires that the application verifies each received PTB | |||
1 byte 1 byte 4 bytes | message to confirm that it was received in response to transmitted | |||
traffic and that the reported link MTU is less than the current probe | ||||
size. A verified PTB message MAY be used as input to the DPLPMTUD | ||||
algorithm, but MUST NOT be used directly to set the PLPMTU. | ||||
Figure 3: UDP ECHORES Option Format | 5.2. DPLPMTUD with UDP Options | |||
5.1.3. Sending UDP-Option Probe Packets | UDP-Options [I-D.ietf-tsvwg-udp-options] can supply the additional | |||
functionality required to implement DPLPMTUD within the UDP transport | ||||
service. This avoids the need for applications to implement the | ||||
DPLPMTUD method. | ||||
This method specifies a probe packet that does not carry an | This enables padding to be added to UDP datagrams and can be used to | |||
application data block. The probe packet consists of a UDP datagram | provide feedback acknowledgement of received probe packets. | |||
header followed by a UDP Option containing the ECHOREQ option, which | ||||
is followed by NOP Options to pad the remainder of the datagram | ||||
payload to the probe size. NOP padding is used to control the length | ||||
of the probe packet. | ||||
A UDP Option carrying the ECHORES option is used to provide feedback | The specification also defines two UDP Options to support DPLPMTUD. | |||
when a probe packet is received at the destination endpoint. | ||||
5.1.4. Validating the Path with UDP Options | Section 5.6 of [I-D.ietf-tsvwg-udp-options] defines the MSS option | |||
which allows the local sender to indicate the EMTU_R to the peer. | ||||
This option can be used to initialise PMTU_MAX. An application | ||||
wishing to avoid the effects of MSS-Clamping (where a middlebox | ||||
changes the advertised TCP maximum segment size) ought to use a | ||||
cryptographic method to encrypt this parameter. | ||||
Since UDP is an unacknowledged PL, a sender that does not have | 5.2.1. UDP Request Option | |||
higher-layer information confirming correct delivery of datagrams | ||||
SHOULD implement the REACHABILITY_TIMER to periodically send probe | ||||
packets while in the PROBE_DONE state. | ||||
5.1.5. Handling of PTB Messages by UDP | The Request Option allows a sending endpoint to solicit a response | |||
from a destination endpoint. | ||||
Normal ICMP verification MUST be performed as specified in | The Request Option carries a four byte token set by the sender. This | |||
Section 5.2 of [RFC8085]. This requires that the PL verifies each | token can be set to a value that is likely to be known only to the | |||
received PTB messages to verify these are received in response to | sender (and becomes known to nodes along the end-to-end path). The | |||
transmitted traffic and that the reported LInk MTU is less than the | sender can then check the value returned in the response to provide | |||
current probe size. A verified PTB message MAY be used as input to | additional protection from off-path insertion of data [RFC8085]. | |||
the PLPMTUD algorithm. | ||||
5.2. DPLPMTUD for SCTP | +---------+--------+-----------------+ | |||
| Kind=9 | Len=6 | Token | | ||||
+---------+--------+-----------------+ | ||||
1 byte 1 byte 4 bytes | ||||
5.2.2. UDP Response Option | ||||
The Response Option is generated by the PL in response to reception | ||||
of a previously received Echo Request. The Token field associates | ||||
the response with the Token value carried in the most recently- | ||||
received Echo Request. The rate of generation of UDP packets | ||||
carrying a Response Option MAY be rate-limited. | ||||
+---------+--------+-----------------+ | ||||
| Kind=10 | Len=6 | Token | | ||||
+---------+--------+-----------------+ | ||||
1 byte 1 byte 4 bytes | ||||
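The two option layouts above can be illustrated with a short sketch. This is not part of the specification: the function names are hypothetical, and network byte order for the Token is an assumption, since the diagrams give only field sizes and Kind values.

```python
import struct

REQUEST_KIND = 9    # Kind value shown in the Request Option diagram
RESPONSE_KIND = 10  # Kind value shown in the Response Option diagram
OPTION_LEN = 6      # 1 byte Kind + 1 byte Len + 4 byte Token


def encode_option(kind, token):
    """Pack Kind, Len and a 4-byte Token (network byte order is assumed)."""
    return struct.pack("!BBI", kind, OPTION_LEN, token)


def decode_option(data):
    """Return (kind, token) from the first 6 bytes of an option."""
    if len(data) < OPTION_LEN:
        raise ValueError("option too short")
    kind, length, token = struct.unpack("!BBI", data[:OPTION_LEN])
    if length != OPTION_LEN:
        raise ValueError("unexpected option length")
    return kind, token


def make_response(request_bytes):
    """Build a Response Option that echoes the Token of a Request Option."""
    kind, token = decode_option(request_bytes)
    if kind != REQUEST_KIND:
        raise ValueError("not a Request Option")
    return encode_option(RESPONSE_KIND, token)


def response_matches(response_bytes, expected_token):
    """Sender-side check that a response carries the Token that was sent."""
    kind, token = decode_option(response_bytes)
    return kind == RESPONSE_KIND and token == expected_token
```

A sender would typically pick an unpredictable Token for each probe and accept a response only when response_matches() returns True for that Token.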
5.3. DPLPMTUD for SCTP | ||||
Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing | Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing | |||
method for SCTP. It recommends the use of the PAD chunk, defined in | method for SCTP. It recommends the use of the PAD chunk, defined in | |||
[RFC4820] to be attached to a minimum length HEARTBEAT chunk to build | [RFC4820] to be attached to a minimum length HEARTBEAT chunk to build | |||
a probe packet. This enables probing without affecting the transfer | a probe packet. This enables probing without affecting the transfer | |||
of user messages and without interfering with congestion control. | of user messages and without interfering with congestion control. | |||
This is preferred to using DATA chunks (with padding as required) as | This is preferred to using DATA chunks (with padding as required) as | |||
path probes. | path probes. | |||
XXX Future versions of this specification might define a parameter | XXX Future versions of this document might define a parameter | |||
contained in the INIT and INIT ACK chunk to indicate the MTU to the | contained in the INIT and INIT ACK chunk to indicate the remote peer | |||
peer. However, multihoming makes this a bit complex, so it might not | MTU to the local peer. However, multihoming makes this a bit | |||
be worth doing. XXX | complex, so it might not be worth doing. XXX | |||
5.2.1. SCTP/IPv4 and SCTP/IPv6 | ||||
The base protocol is specified in [RFC4960]. | 5.3.1. SCTP/IPv4 and SCTP/IPv6 | |||
5.2.1.1. Sending SCTP Probe Packets | The base protocol is specified in [RFC4960]. This provides an | |||
acknowledged PL. A sender can therefore enter the PROBE_BASE state as | ||||
soon as connectivity has been confirmed. | ||||
5.3.1.1. Sending SCTP Probe Packets | ||||
Probe packets consist of an SCTP common header followed by a | Probe packets consist of an SCTP common header followed by a | |||
HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control | HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control | |||
the length of the probe packet. The HEARTBEAT chunk is used to | the length of the probe packet. The HEARTBEAT chunk is used to | |||
trigger the sending of a HEARTBEAT ACK chunk. The reception of the | trigger the sending of a HEARTBEAT ACK chunk. The reception of the | |||
HEARTBEAT ACK chunk acknowledges reception of a successful probe. | HEARTBEAT ACK chunk acknowledges reception of a successful probe. | |||
The HEARTBEAT chunk carries a Heartbeat Information parameter which | The HEARTBEAT chunk carries a Heartbeat Information parameter which | |||
should include, besides the information suggested in [RFC4960], the | should include, besides the information suggested in [RFC4960], the | |||
probing size, which is the MTU size the complete datagram will add up | probe size, which is the size of the complete datagram. The size of | |||
to. The size of the PAD chunk is therefore computed by reducing the | the PAD chunk is therefore computed by reducing the probing size by | |||
probing size by the IPv4 or IPv6 header size, the SCTP common header, | the IPv4 or IPv6 header size, the SCTP common header, the HEARTBEAT | |||
the HEARTBEAT request and the PAD chunk header. The payload of the | request and the PAD chunk header. The payload of the PAD chunk | |||
PAD chunk contains arbitrary data. | contains arbitrary data. | |||
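The PAD chunk size calculation described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name is hypothetical, the IP header sizes assume no IPv4 options and no IPv6 extension headers, and the HEARTBEAT chunk length is supplied by the caller because it depends on the Heartbeat Information carried. The optional 8-byte reduction corresponds to the UDP header used when SCTP is encapsulated in UDP.

```python
IPV4_HEADER = 20         # assumes no IPv4 options
IPV6_HEADER = 40         # assumes no IPv6 extension headers
UDP_HEADER = 8           # only when SCTP is encapsulated in UDP
SCTP_COMMON_HEADER = 12  # SCTP common header size
PAD_CHUNK_HEADER = 4     # chunk type, flags and length


def pad_payload_size(probe_size, heartbeat_chunk_len, ipv6=False,
                     udp_encaps=False):
    """Return the number of PAD payload bytes needed so that the
    complete datagram adds up to probe_size bytes."""
    overhead = IPV6_HEADER if ipv6 else IPV4_HEADER
    if udp_encaps:
        overhead += UDP_HEADER
    overhead += SCTP_COMMON_HEADER + heartbeat_chunk_len + PAD_CHUNK_HEADER
    payload = probe_size - overhead
    if payload < 0:
        raise ValueError("probe size is smaller than the fixed overhead")
    return payload
```

For example, a 1400-byte IPv4 probe with a 44-byte HEARTBEAT chunk leaves 1400 - 20 - 12 - 44 - 4 = 1320 bytes for the PAD payload.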
To avoid fragmentation of retransmitted data, probing starts right | To avoid fragmentation of retransmitted data, probing starts right | |||
after the handshake, before data is sent. Assuming normal behaviour | after the handshake, before data is sent. Assuming normal behaviour | |||
(i.e., the PMTU is smaller than or equal to the interface MTU), this | (i.e., the PMTU is smaller than or equal to the interface MTU), this | |||
process will take a few round trip time periods depending on the | process will take a few round trip time periods depending on the | |||
number of PMTU sizes probed. The Heartbeat timer can be used to | number of PMTU sizes probed. The Heartbeat timer can be used to | |||
implement the PROBE_TIMER. | implement the PROBE_TIMER. | |||
5.2.1.2. Validating the Path with SCTP | 5.3.1.2. Validating the Path with SCTP | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT | Since SCTP provides an acknowledged PL, a sender MUST NOT | |||
implement the REACHABILITY_TIMER while in the PROBE_DONE state. | implement the REACHABILITY_TIMER while in the PROBE_DONE state. | |||
5.2.1.3. PTB Message Handling by SCTP | 5.3.1.3. PTB Message Handling by SCTP | |||
Normal ICMP verification MUST be performed as specified in Appendix C | Normal ICMP verification MUST be performed as specified in Appendix C | |||
of [RFC4960]. This requires that the first 8 bytes of the SCTP | of [RFC4960]. This requires that the first 8 bytes of the SCTP | |||
common header are quoted in the payload of the PTB message, which can | common header are quoted in the payload of the PTB message, which can | |||
be the case for ICMPv4 and is normally the case for ICMPv6. | be the case for ICMPv4 and is normally the case for ICMPv6. | |||
When a PTB message has been verified, the router Link MTU indicated | When a PTB message has been verified, the router Link MTU indicated | |||
in the PTB message SHOULD be used with the PLPMTUD algorithm, | in the PTB message SHOULD be used with the DPLPMTUD algorithm, | |||
providing that the reported Link MTU is less than the current probe | providing that the reported Link MTU is less than the current probe | |||
size. | size. | |||
5.2.2. DPLPMTUD for SCTP/UDP | 5.3.2. DPLPMTUD for SCTP/UDP | |||
The UDP encapsulation of SCTP is specified in [RFC6951]. | The UDP encapsulation of SCTP is specified in [RFC6951]. | |||
5.2.2.1. Sending SCTP/UDP Probe Packets | 5.3.2.1. Sending SCTP/UDP Probe Packets | |||
Packet probing can be performed as specified in Section 5.2.1.1. The | Packet probing can be performed as specified in Section 5.3.1.1. The | |||
maximum payload is reduced by 8 bytes, which has to be considered | maximum payload is reduced by 8 bytes, which has to be considered | |||
when filling the PAD chunk. | when filling the PAD chunk. | |||
5.2.2.2. Validating the Path with SCTP/UDP | 5.3.2.2. Validating the Path with SCTP/UDP | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT | Since SCTP provides an acknowledged PL, a sender MUST NOT | |||
implement the REACHABILITY_TIMER while in the PROBE_DONE state. | implement the REACHABILITY_TIMER while in the PROBE_DONE state. | |||
5.2.2.3. Handling of PTB Messages by SCTP/UDP | 5.3.2.3. Handling of PTB Messages by SCTP/UDP | |||
Normal ICMP verification MUST be performed for PTB messages as | Normal ICMP verification MUST be performed for PTB messages as | |||
specified in Appendix C of [RFC4960]. This requires that the first 8 | specified in Appendix C of [RFC4960]. This requires that the first 8 | |||
bytes of the SCTP common header are contained in the PTB message, | bytes of the SCTP common header are contained in the PTB message, | |||
which can be the case for ICMPv4 (but note the UDP header also | which can be the case for ICMPv4 (but note the UDP header also | |||
consumes a part of the quoted packet header) and is normally the case | consumes a part of the quoted packet header) and is normally the case | |||
for ICMPv6. When the verification is completed, the router Link MTU | for ICMPv6. When the verification is completed, the router Link MTU | |||
size indicated in the PTB message SHOULD be used with the PLPMTUD | size indicated in the PTB message SHOULD be used with the DPLPMTUD | |||
algorithm providing that the reported Link MTU is less than the | providing that the reported Link MTU is less than the current probe | |||
current probe size. | size. | |||
5.2.3. DPLPMTUD for SCTP/DTLS | 5.3.3. DPLPMTUD for SCTP/DTLS | |||
The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | |||
specified in [I-D.ietf-tsvwg-sctp-dtls-encaps]. It is used for data | specified in [I-D.ietf-tsvwg-sctp-dtls-encaps]. It is used for data | |||
channels in WebRTC implementations. | channels in WebRTC implementations. | |||
5.2.3.1. Sending SCTP/DTLS Probe Packets | 5.3.3.1. Sending SCTP/DTLS Probe Packets | |||
Packet probing can be done as specified in Section 5.2.1.1. | Packet probing can be done as specified in Section 5.3.1.1. | |||
5.2.3.2. Validating the Path with SCTP/DTLS | 5.3.3.2. Validating the Path with SCTP/DTLS | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT | Since SCTP provides an acknowledged PL, a sender MUST NOT | |||
implement the REACHABILITY_TIMER while in the PROBE_DONE state. | implement the REACHABILITY_TIMER while in the PROBE_DONE state. | |||
5.2.3.3. Handling of PTB Messages by SCTP/DTLS | 5.3.3.3. Handling of PTB Messages by SCTP/DTLS | |||
It is not possible to perform normal ICMP verification as specified | It is not possible to perform normal ICMP verification as specified | |||
in [RFC4960], since even if the ICMP message payload contains | in [RFC4960], since even if the ICMP message payload contains | |||
sufficient information, the reflected SCTP common header would be | sufficient information, the reflected SCTP common header would be | |||
encrypted. Therefore it is not possible to process PTB messages at | encrypted. Therefore it is not possible to process PTB messages at | |||
the PL. | the PL. | |||
5.3. PMTUD for QUIC | 5.4. DPLPMTUD for QUIC | |||
XXX New section XXX | Quick UDP Internet Connection (QUIC) [I-D.ietf-quic-transport] is a | |||
UDP-based transport that provides reception feedback. | ||||
Quick UDP Internet Connection (QUIC) is a UDP-based transport that | Section 9.2 of [I-D.ietf-quic-transport] describes the path | |||
provides reception feedback [I-D.ietf-quic-transport]. | considerations when sending QUIC packets. It recommends the use of | |||
PADDING frames to build the probe packet. This enables probing | ||||
without affecting the transfer of other QUIC frames. | ||||
Section 9.2 of [I-D.ietf-quic-transport] details the path | This provides an acknowledged PL. A sender can therefore enter the | |||
considerations when sending QUIC packets. It recommends the use of | PROBE_BASE state as soon as connectivity has been confirmed. | |||
PADDING frames to build the probe packet. This enables probing | ||||
without affecting the transfer of other frames. | ||||
5.3.1. Sending QUIC Probe Packets | 5.4.1. Sending QUIC Probe Packets | |||
A probe packet consists of a QUIC Header and a payload containing | ||||
only PADDING Frames. PADDING Frames are a single octet (0x00) and | ||||
several of these can be used to create a probe packet of size | ||||
PROBED_SIZE. QUIC provides an acknowledged PL. A sender can therefore | ||||
enter the PROBE_BASE state as soon as connectivity has been | ||||
confirmed. | ||||
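The construction of a padding-only probe payload can be sketched as follows. This is a minimal illustration: the function name and the overhead parameter are hypothetical, and the caller is assumed to know the combined size of the IP, UDP and QUIC headers (and any other per-packet overhead), which this sketch does not derive.

```python
PADDING_FRAME = b"\x00"  # a PADDING frame is a single zero octet


def build_probe_payload(probed_size, overhead):
    """Return a payload of PADDING frames so that the payload plus the
    caller-supplied overhead adds up to probed_size bytes."""
    payload_len = probed_size - overhead
    if payload_len <= 0:
        raise ValueError("PROBED_SIZE does not exceed the overhead")
    return PADDING_FRAME * payload_len
```

The result contains only 0x00 octets, so a receiver that parses it sees nothing but PADDING frames.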
Probe packets consist of a QUIC Header and a payload containing only | The current specification of QUIC sets the following: | |||
PADDING Frames. PADDING Frames are a single octet (0x00) and | ||||
several of these can be used to create a probe packet of size | ||||
PROBED_SIZE. | ||||
A QUIC sender needs to pad initial packets to 1200 bytes to validate | o BASE_PMTU: 1200. A QUIC sender needs to pad initial packets to | |||
the path can support packets of a useful size. If a QUIC sender | 1200 bytes to validate the path can support packets of a useful | |||
determines the PMTU on a path has fallen below 1280 octets it MUST | size. | |||
immediately stop sending on the affected path. | ||||
5.3.2. Validating the Path with QUIC | o MIN_PMTU: 1200 bytes. A QUIC sender that determines the PMTU has | |||
fallen below 1200 bytes MUST immediately stop sending on the | ||||
affected path. | ||||
Since QUIC provides an acknowledged PL, a sender MUST NOT | 5.4.2. Validating the Path with QUIC | |||
QUIC provides an acknowledged PL. A sender therefore MUST NOT | ||||
implement the REACHABILITY_TIMER while in the PROBE_DONE state. | implement the REACHABILITY_TIMER while in the PROBE_DONE state. | |||
5.3.3. Handling of PTB Messages by QUIC | 5.4.3. Handling of PTB Messages by QUIC | |||
QUIC does not specify any methods for validating ICMP responses, but | QUIC operates over the UDP transport, and the guidelines on ICMP | |||
does provide some guidelines to make it harder for an off-path | verification as specified in Section 5.2 of [RFC8085] therefore | |||
attacker to inject ICMP messages. | apply. Although QUIC does not currently specify a method for | |||
validating ICMP responses, it does provide some guidelines to make it | ||||
harder for an off-path attacker to inject ICMP messages. | ||||
o Set the IPv4 Don't Fragment (DF) bit on a small proportion of | o Set the IPv4 Don't Fragment (DF) bit on a small proportion of | |||
packets, so that most invalid ICMP messages arrive when there are | packets, so that most invalid ICMP messages arrive when there are | |||
no DF packets outstanding, and can therefore be identified as | no DF packets outstanding, and can therefore be identified as | |||
spurious. | spurious. | |||
o Store additional information from the IP or UDP headers from DF | o Store additional information from the IP or UDP headers from DF | |||
packets (for example, the IP ID or UDP checksum) to further | packets (for example, the IP ID or UDP checksum) to further | |||
authenticate incoming Datagram Too Big messages. | authenticate incoming Datagram Too Big messages. | |||
o Any reduction in PMTU due to a report contained in an ICMP packet | o Any reduction in PMTU due to a report contained in an ICMP packet | |||
is provisional until QUIC's loss detection algorithm determines | is provisional until QUIC's loss detection algorithm determines | |||
that the packet is actually lost. | that the packet is actually lost. | |||
XXX The above list was pulled whole from quic-transport XXX | XXX The above list was pulled whole from quic-transport - input is | |||
invited from QUIC contributors. XXX | ||||
5.4. Other IETF Transports | ||||
XXX This section to be updated in a later revision. XXX | ||||
5.5. DPLPMTUD by Applications | ||||
Applications that use the Datagram API (e.g., applications built | ||||
directly or indirectly on UDP) can implement DPLPMTUD. Some | ||||
primitives used by DPLPMTUD might not be available via this interface | ||||
(e.g., the ability to access the PMTU cache, or interpret received | ||||
ICMP PTB messages). | ||||
In addition, it is important that PMTUD is not performed by multiple | ||||
protocol layers. | ||||
XXX This section will be completed in a future revision of this ID | ||||
XXX | ||||
6. Acknowledgements | 6. Acknowledgements | |||
This work was partially funded by the European Union's Horizon 2020 | This work was partially funded by the European Union's Horizon 2020 | |||
research and innovation programme under grant agreement No. 644334 | research and innovation programme under grant agreement No. 644334 | |||
(NEAT). The views expressed are solely those of the author(s). | (NEAT). The views expressed are solely those of the author(s). | |||
7. IANA Considerations | 7. IANA Considerations | |||
This memo includes no request to IANA. | This memo includes no request to IANA. | |||
XXX If new UDP Options are specified in this document, a request to | XXX If new UDP Options are specified in this document, a request to | |||
IANA will be included here. XXX | IANA will be included here. XXX | |||
If there are no requirements for IANA, the section will be removed | If there are no requirements for IANA, the section will be removed | |||
during conversion into an RFC by the RFC Editor. | during conversion into an RFC by the RFC Editor. | |||
8. Security Considerations | 8. Security Considerations | |||
The security considerations for the use of UDP and SCTP are provided | The security considerations for the use of UDP and SCTP are provided | |||
in the referenced RFCs. Security guidance for applications using UDP | in the referenced RFCs. Security guidance for applications using UDP | |||
is provided in the UDP-Guidelines [RFC8085]. | is provided in the UDP Usage Guidelines [RFC8085]. | |||
PTB messages could potentially be used to cause a node to | There are cases where PTB messages are not delivered due to policy, | |||
inappropriately reduce the effective PMTU. A node supporting PLPMTUD | configuration or equipment design (see Section 1.1), this method | |||
MUST appropriately verify the payload of PTB messages to ensure these | therefore does not rely upon PTB messages being received, but is able | |||
are received in response to transmitted traffic (i.e., a reported | to utilise these when they are received by the sender. PTB messages | |||
error condition that corresponds to a datagram actually sent by the | could potentially be used to cause a node to inappropriately reduce | |||
path layer). | the PLPMTU. A node supporting DPLPMTUD MUST therefore appropriately | |||
verify the payload of PTB messages to ensure these are received in | ||||
response to transmitted traffic (i.e., a reported error condition | ||||
that corresponds to a datagram actually sent by the path layer). | ||||
XXX Determine if parallel forwarding paths needs to be considered. | Parallel forwarding paths may need to be considered. Section 3.5 | |||
XXX | identifies the need for robustness in the method when the path | |||
information may be inconsistent. | ||||
A node performing PLPMTUD could experience conflicting information | A node performing DPLPMTUD could experience conflicting information | |||
about the size of supported probe packets. This could occur when | about the size of supported probe packets. This could occur when | |||
multiple paths are concurrently in use and these exhibit a | multiple paths are concurrently in use and these exhibit a | |||
different PMTU. If not considered, this could result in data being | different PMTU. If not considered, this could result in data being | |||
blackholed when the effective PMTU is larger than the smallest PMTU | black holed when the PLPMTU is larger than the smallest PMTU across | |||
across the current paths. | the current paths. | |||
An on-path attacker could forge PTB messages to drive down the PLPMTU. | ||||
9. References | 9. References | |||
9.1. Normative References | 9.1. Normative References | |||
[I-D.ietf-quic-transport] | [I-D.ietf-quic-transport] | |||
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | |||
and Secure Transport", draft-ietf-quic-transport-04 (work | and Secure Transport", Internet-Draft draft-ietf-quic- | |||
in progress), June 2017. | transport-04, June 2017. | |||
[I-D.ietf-tsvwg-sctp-dtls-encaps] | [I-D.ietf-tsvwg-sctp-dtls-encaps] | |||
Tuexen, M., Stewart, R., Jesup, R., and S. Loreto, "DTLS | Tuexen, M., Stewart, R., Jesup, R. and S. Loreto, "DTLS | |||
Encapsulation of SCTP Packets", draft-ietf-tsvwg-sctp- | Encapsulation of SCTP Packets", Internet-Draft draft-ietf- | |||
dtls-encaps-09 (work in progress), January 2015. | tsvwg-sctp-dtls-encaps-09, January 2015. | |||
[I-D.ietf-tsvwg-udp-options] | [I-D.ietf-tsvwg-udp-options] | |||
Touch, J., "Transport Options for UDP", draft-ietf-tsvwg- | Touch, J., "Transport Options for UDP", Internet-Draft | |||
udp-options-01 (work in progress), June 2017. | draft-ietf-tsvwg-udp-options-01, June 2017. | |||
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, | [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, | |||
DOI 10.17487/RFC0768, August 1980, <https://www.rfc- | August 1980. | |||
editor.org/info/rfc768>. | ||||
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, | [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, | |||
RFC 792, DOI 10.17487/RFC0792, September 1981, | RFC 792, DOI 10.17487/RFC0792, September 1981, <https:// | |||
<https://www.rfc-editor.org/info/rfc792>. | www.rfc-editor.org/info/rfc792>. | |||
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | |||
Communication Layers", STD 3, RFC 1122, | Communication Layers", STD 3, RFC 1122, DOI 10.17487/ | |||
DOI 10.17487/RFC1122, October 1989, <https://www.rfc- | RFC1122, October 1989, <https://www.rfc-editor.org/info/ | |||
editor.org/info/rfc1122>. | rfc1122>. | |||
[RFC1812] Baker, F., Ed., "Requirements for IP Version 4 Routers", | [RFC1812] Baker, F., Ed., "Requirements for IP Version 4 Routers", | |||
RFC 1812, DOI 10.17487/RFC1812, June 1995, | RFC 1812, DOI 10.17487/RFC1812, June 1995, <https://www | |||
<https://www.rfc-editor.org/info/rfc1812>. | .rfc-editor.org/info/rfc1812>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
DOI 10.17487/RFC2119, March 1997, <https://www.rfc- | ||||
editor.org/info/rfc2119>. | ||||
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | |||
(IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | |||
December 1998, <https://www.rfc-editor.org/info/rfc2460>. | December 1998, <https://www.rfc-editor.org/info/rfc2460>. | |||
[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., | [RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., | |||
and G. Fairhurst, Ed., "The Lightweight User Datagram | and G. Fairhurst, Ed., "The Lightweight User Datagram | |||
Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July | Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July | |||
2004, <https://www.rfc-editor.org/info/rfc3828>. | 2004, <https://www.rfc-editor.org/info/rfc3828>. | |||
[RFC4820] Tuexen, M., Stewart, R., and P. Lei, "Padding Chunk and | [RFC4820] Tuexen, M., Stewart, R. and P. Lei, "Padding Chunk and | |||
Parameter for the Stream Control Transmission Protocol | Parameter for the Stream Control Transmission Protocol | |||
(SCTP)", RFC 4820, DOI 10.17487/RFC4820, March 2007, | (SCTP)", RFC 4820, DOI 10.17487/RFC4820, March 2007, | |||
<https://www.rfc-editor.org/info/rfc4820>. | <https://www.rfc-editor.org/info/rfc4820>. | |||
[RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", | [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", | |||
RFC 4960, DOI 10.17487/RFC4960, September 2007, | RFC 4960, DOI 10.17487/RFC4960, September 2007, <https:// | |||
<https://www.rfc-editor.org/info/rfc4960>. | www.rfc-editor.org/info/rfc4960>. | |||
[RFC6951] Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream | [RFC6951] Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream | |||
Control Transmission Protocol (SCTP) Packets for End-Host | Control Transmission Protocol (SCTP) Packets for End-Host | |||
to End-Host Communication", RFC 6951, | to End-Host Communication", RFC 6951, DOI 10.17487/ | |||
DOI 10.17487/RFC6951, May 2013, <https://www.rfc- | RFC6951, May 2013, <https://www.rfc-editor.org/info/ | |||
editor.org/info/rfc6951>. | rfc6951>. | |||
[RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage | [RFC8085] Eggert, L., Fairhurst, G. and G. Shepherd, "UDP Usage | |||
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, | Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, | |||
March 2017, <https://www.rfc-editor.org/info/rfc8085>. | March 2017, <https://www.rfc-editor.org/info/rfc8085>. | |||
[RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | [RFC8201] McCann, J., Deering, S., Mogul, J. and R. Hinden, Ed., | |||
"Path MTU Discovery for IP version 6", STD 87, RFC 8201, | "Path MTU Discovery for IP version 6", STD 87, RFC 8201, | |||
DOI 10.17487/RFC8201, July 2017, <https://www.rfc- | DOI 10.17487/RFC8201, July 2017, <https://www.rfc- | |||
editor.org/info/rfc8201>. | editor.org/info/rfc8201>. | |||
9.2. Informative References | 9.2. Informative References | |||
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, | [RFC1191] Mogul, J.C. and S.E. Deering, "Path MTU discovery", RFC | |||
DOI 10.17487/RFC1191, November 1990, <https://www.rfc- | 1191, DOI 10.17487/RFC1191, November 1990, <https://www | |||
editor.org/info/rfc1191>. | .rfc-editor.org/info/rfc1191>. | |||
[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", | [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", RFC | |||
RFC 2923, DOI 10.17487/RFC2923, September 2000, | 2923, DOI 10.17487/RFC2923, September 2000, <https://www | |||
<https://www.rfc-editor.org/info/rfc2923>. | .rfc-editor.org/info/rfc2923>. | |||
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | [RFC4340] Kohler, E., Handley, M. and S. Floyd, "Datagram Congestion | |||
Congestion Control Protocol (DCCP)", RFC 4340, | Control Protocol (DCCP)", RFC 4340, March 2006. | |||
DOI 10.17487/RFC4340, March 2006, <https://www.rfc- | ||||
editor.org/info/rfc4340>. | ||||
[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU | [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU | |||
Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, | Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, | |||
<https://www.rfc-editor.org/info/rfc4821>. | <https://www.rfc-editor.org/info/rfc4821>. | |||
[RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering | [RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering | |||
ICMPv6 Messages in Firewalls", RFC 4890, | ICMPv6 Messages in Firewalls", RFC 4890, DOI 10.17487/ | |||
DOI 10.17487/RFC4890, May 2007, <https://www.rfc- | RFC4890, May 2007, <https://www.rfc-editor.org/info/ | |||
editor.org/info/rfc4890>. | rfc4890>. | |||
Appendix A. Event-driven state changes | Appendix A. Event-driven state changes | |||
This appendix contains an informative description of key events: | This appendix contains an informative description of key events: | |||
Path Setup: When a new path is initiated, the state is set to | Path Setup: When a new path is initiated, the state is set to | |||
PROBE_START. As soon as the path is confirmed, the state changes | PROBE_START. As soon as the path is confirmed, the state changes | |||
to PROBE_BASE and the probing mechanism for this path is started. | to PROBE_BASE and the probing mechanism for this path is started. | |||
The first probe packet is sent with the size of the BASE_PMTU. | The first probe packet is sent with the size of the BASE_PMTU. | |||
Arrival of an Acknowledgment: Depending on the probing state, the | Arrival of an Acknowledgment: Depending on the probing state, the | |||
reaction differs according to Figure 4, which is just a | reaction differs according to Figure 5, which is just a | |||
simplification of Figure 1 focusing on this event. | simplification of Figure 2 focusing on this event. | |||
+--------------+ +----------------+ | +--------------+ +----------------+ | |||
| PROBE_START | --3------------------------------->| PROBE_DISABLED | | | PROBE_START | --3------------------------------->| PROBE_DISABLED | | |||
+--------------+ --4-----------\ +----------------+ | +--------------+ --4-----------\ +----------------+ | |||
\ | \ | |||
+--------------+ \ | +--------------+ \ | |||
| PROBE_ERROR | --------------- \ | | PROBE_ERROR | --------------- \ | |||
+--------------+ \ \ | +--------------+ \ \ | |||
\ \ | \ \ | |||
+--------------+ \ \ +--------------+ | +--------------+ \ \ +--------------+ | |||
| PROBE_BASE | --1---------- \ ------------> | PROBE_BASE | | | PROBE_BASE | --1---------- \ ------------> | PROBE_BASE | | |||
+--------------+ --2----- \ \ +--------------+ | +--------------+ --2----- \ \ +--------------+ | |||
\ \ \ | \ \ \ | |||
+--------------+ \ \ ------------> +--------------+ | +--------------+ \ \ ------------> +--------------+ | |||
| PROBE_SEARCH | --2--- \ -----------------> | PROBE_SEARCH | | | PROBE_SEARCH | --2--- \ -----------------> | PROBE_SEARCH | | |||
+--------------+ --1---\----\---------------------> +--------------+ | +--------------+ --1---\----\---------------------> +--------------+ | |||
\ \ | \ \ | |||
+--------------+ \ \ +--------------+ | +--------------+ \ \ +--------------+ | |||
| PROBE_DONE | \ -------------------> | PROBE_DONE | | | PROBE_DONE | \ -------------------> | PROBE_DONE | | |||
+--------------+ -----------------------> +--------------+ | +--------------+ -----------------------> +--------------+ | |||
Condition 1: The maximum PMTU size has not yet been reached. | Condition 1: The maximum PMTU size has not yet been reached. | |||
Condition 2: The maximum PMTU size has been reached. Condition 3: | Condition 2: The maximum PMTU size has been reached. Condition 3: | |||
Probe Timer expires and PROBE_COUNT = MAX_PROBES. Condition 4: | Probe Timer expires and PROBE_COUNT = MAX_PROBES. Condition 4: | |||
PROBE_ACK received. | PROBE_ACK received. | |||
Figure 4: State changes at the arrival of an acknowledgment | Probing timeout: The PROBE_COUNT is initialised to zero each time the | |||
value of PROBED_SIZE is changed. The PROBE_TIMER is started each | ||||
Probing timeout: The PROBE_COUNT is initialised to zero each time | time a probe packet is sent. It is stopped when an acknowledgment | |||
the value of PROBED_SIZE is changed. The PROBE_TIMER is started | arrives that confirms delivery of a probe packet. If the probe | |||
each time a probe packet is sent. It is stopped when an | packet is not acknowledged before the PROBE_TIMER expires, the | |||
acknowledgment arrives that confirms delivery of a probe packet. | PROBE_ERROR_COUNTER is incremented. When the PROBE_COUNT equals | |||
If the probe packet is not acknowledged before the PROBE_TIMER | the value MAX_PROBES, the state is changed, otherwise a new probe | |||
expires, the PROBE_ERROR_COUNTER is incremented. When the | packet of the same size (PROBED_SIZE) is resent. The state | |||
PROBE_COUNT equals the value MAX_PROBES, the state is changed, | transitions are illustrated in Figure 6. This shows a | |||
otherwise a new probe packet of the same size (PROBED_SIZE) is | simplification of Figure 2 with a focus only on this event. | |||
resent. The state transitions are illustrated in Figure 5. This | ||||
shows a simplification of Figure 1 with a focus only on this | ||||
event. | ||||
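The probing timeout handling described above can be sketched as follows. This is an informal illustration, not the normative state machine: the class and method names are hypothetical, the MAX_PROBES value is illustrative only, PROBE_COUNT and PROBE_ERROR_COUNTER are assumed to be the same counter, and the transition table is one reading of the labelled arrows in the timer-expiry figure.

```python
MAX_PROBES = 3  # illustrative value only; not taken from this document

# Assumed transitions once PROBE_COUNT reaches MAX_PROBES.
ON_MAX_PROBES = {
    "PROBE_START": "PROBE_DISABLED",
    "PROBE_BASE": "PROBE_ERROR",
    "PROBE_SEARCH": "PROBE_DONE",
}


class ProbeState:
    def __init__(self, state, probed_size):
        self.state = state
        self.probed_size = probed_size
        self.probe_count = 0

    def set_probed_size(self, size):
        """PROBE_COUNT is reset each time PROBED_SIZE changes."""
        self.probed_size = size
        self.probe_count = 0

    def on_probe_timer_expiry(self):
        """Return True if a probe of the same size should be resent,
        False if the maximum was reached and the state was updated."""
        self.probe_count += 1
        if self.probe_count >= MAX_PROBES:
            self.state = ON_MAX_PROBES.get(self.state, self.state)
            return False
        return True
```

With MAX_PROBES set to 3, two expiries in PROBE_SEARCH each trigger a resend of the same PROBED_SIZE, and the third moves the sketch to PROBE_DONE.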
+--------------+ +----------------+ | +--------------+ +----------------+ | |||
| PROBE_START |----------------------------------->| PROBE_DISABLED | | | PROBE_START |----------------------------------->| PROBE_DISABLED | | |||
+--------------+ +----------------+ | +--------------+ +----------------+ | |||
+--------------+ +--------------+ | +--------------+ +--------------+ | |||
| PROBE_ERROR | -----------------> | PROBE_ERROR | | | PROBE_ERROR | -----------------> | PROBE_ERROR | | |||
+--------------+ / +--------------+ | +--------------+ / +--------------+ | |||
/ | / | |||
+--------------+ --2----------/ +--------------+ | +--------------+ --2----------/ +--------------+ | |||
| PROBE_BASE | --1------------------------------> | PROBE_BASE | | | PROBE_BASE | --1------------------------------> | PROBE_BASE | | |||
+--------------+ +--------------+ | +--------------+ +--------------+ | |||
+--------------+ +--------------+ | +--------------+ +--------------+ | |||
| PROBE_SEARCH | --1------------------------------> | PROBE_SEARCH | | | PROBE_SEARCH | --1------------------------------> | PROBE_SEARCH | | |||
+--------------+ --2--------- +--------------+ | +--------------+ --2--------- +--------------+ | |||
\ | \ | |||
+--------------+ \ +--------------+ | +--------------+ \ +--------------+ | |||
| PROBE_DONE | -------------------> | PROBE_DONE | | | PROBE_DONE | -------------------> | PROBE_DONE | | |||
+--------------+ +--------------+ | +--------------+ +--------------+ | |||
Condition 1: The maximum number of probe packets has not been | Condition 1: The maximum number of probe packets has not been | |||
reached. Condition 2: The maximum number of probe packets has been | reached. Condition 2: The maximum number of probe packets has been | |||
reached. | reached. | |||
Figure 5: State changes at the expiration of the probe timer | PMTU raise timer timeout: The path through the network can change | |||
PMTU raise timer timeout: The path through the network can change | ||||
over time. It is impossible to discover whether a path change has | over time. It is impossible to discover whether a path change has | |||
increased the actual PMTU by exchanging packets less than or equal | increased the actual PMTU by exchanging packets less than or equal | |||
to the effective PMTU. This requires PLPMTUD to periodically send | to the PLPMTU. This requires PLPMTUD to periodically send a probe | |||
a probe packet to detect whether a larger PMTU is possible. This | packet to detect whether a larger PMTU is possible. This probe | |||
probe packet is generated by the PMTU_RAISE_TIMER. When the timer | packet is generated by the PMTU_RAISE_TIMER. When the timer | |||
expires, probing is restarted with the BASE_PMTU and the state is | expires, probing is restarted with the BASE_PMTU and the state is | |||
changed to PROBE_BASE. | changed to PROBE_BASE. | |||
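The two timer events described above can be sketched in Python. This is an illustrative reading of the text and of the probe-timer figure, not text from either draft revision. The class name `Prober`, the method names, and the values chosen for `MAX_PROBES` and `BASE_PMTU` are assumptions. The text names both PROBE_ERROR_COUNTER and PROBE_COUNT; the sketch assumes they are the same counter. It also assumes that the unlabelled arrows in the figure (PROBE_START, PROBE_ERROR, PROBE_DONE) apply once the probe limit is reached.

```python
from dataclasses import dataclass
from typing import Optional

MAX_PROBES = 3    # assumed placeholder value, not taken from the draft
BASE_PMTU = 1200  # assumed placeholder value, not taken from the draft

# Next state once PROBE_COUNT has reached MAX_PROBES (condition 2 in the figure).
# Assumption: unlabelled arrows in the figure are treated the same way.
ON_PROBE_LIMIT = {
    "PROBE_START": "PROBE_DISABLED",
    "PROBE_ERROR": "PROBE_ERROR",
    "PROBE_BASE": "PROBE_ERROR",
    "PROBE_SEARCH": "PROBE_DONE",
    "PROBE_DONE": "PROBE_DONE",
}


@dataclass
class Prober:
    state: str = "PROBE_BASE"
    probed_size: int = BASE_PMTU
    probe_count: int = 0

    def set_probed_size(self, size: int) -> None:
        # PROBE_COUNT is initialised to zero each time PROBED_SIZE changes.
        self.probed_size = size
        self.probe_count = 0

    def on_probe_timer_expired(self) -> Optional[int]:
        """Return the size to resend, or None if the state changed instead."""
        self.probe_count += 1
        if self.probe_count >= MAX_PROBES:
            self.state = ON_PROBE_LIMIT[self.state]  # condition 2
            return None
        return self.probed_size  # condition 1: resend a probe of the same size

    def on_raise_timer_expired(self) -> int:
        """PMTU_RAISE_TIMER: restart probing from BASE_PMTU in PROBE_BASE."""
        self.set_probed_size(BASE_PMTU)
        self.state = "PROBE_BASE"
        return self.probed_size
```

With these placeholder values, a prober in PROBE_SEARCH resends the same probe size twice and moves to PROBE_DONE on the third expiry; a later raise-timer expiry returns it to PROBE_BASE with the counter reset.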
Arrival of an ICMP message: The active probing of the path can be | Arrival of an ICMP message: The active probing of the path can be | |||
supported by the arrival of PTB messages sent by routers or | supported by the arrival of PTB messages sent by routers or | |||
middleboxes with a link MTU that is smaller than the probe packet | middleboxes with a link MTU that is smaller than the probe packet | |||
size. If the PTB message includes the router link MTU, three | size. If the PTB message includes the router link MTU, three | |||
cases can be distinguished: | cases can be distinguished: | |||
1. The indicated link MTU in the PTB message is between the | 1. The indicated link MTU in the PTB message is between the | |||
already probed and effective MTU and the probe that triggered | already probed and PLPMTU and the probe that triggered the PTB | |||
the PTB message. | message. | |||
2. The indicated link MTU in the PTB message is smaller than the | 2. The indicated link MTU in the PTB message is smaller than the | |||
effective PMTU. | PLPMTU. | |||
3. The indicated link MTU in the PTB message is equal to the | 3. The indicated link MTU in the PTB message is equal to the | |||
BASE_PMTU. | BASE_PMTU. | |||
In the first case, the PROBE_BASE state transitions to the PROBE_ERROR | In the first case, the PROBE_BASE state transitions to the PROBE_ERROR | |||
state. In the PROBE_SEARCH state, a new probe packet is sent with | state. In the PROBE_SEARCH state, a new probe packet is sent with | |||
the size reported by the PTB message. Its result is handled | the size reported by the PTB message. Its result is handled | |||
according to the former events. | according to the former events. | |||
The second case could be a result of a network re-configuration. | The second case could be a result of a network re-configuration. | |||
If the reported link MTU in the PTB message is greater than the | If the reported link MTU in the PTB message is greater than the | |||
BASE_PMTU, the probing starts again with a value of PROBE_BASE. | BASE_PMTU, the probing starts again with a value of PROBE_BASE. | |||
Otherwise, the method enters the state PROBE_ERROR. | Otherwise, the method enters the state PROBE_ERROR. | |||
In the third case, the maximum possible PMTU has been reached. | In the third case, the maximum possible PMTU has been reached. | |||
This ought to be probed again, because there could be a link | This ought to be probed again, because there could be a link | |||
further along the path with a still smaller MTU. | further along the path with a still smaller MTU. | |||
Note: Not all routers include the link MTU size when they send a | Note: Not all routers include the link MTU size when they send a | |||
PTB message. If the PTB message does not indicate the link MTU, | PTB message. If the PTB message does not indicate the link MTU, | |||
the probe is handled in the same way as condition 2 of Figure 5. | the probe is handled in the same way as condition 2 of Figure 6. | |||
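The PTB cases above can be read as a small decision function. The following Python sketch is one possible interpretation, not text from either draft revision. The function name `classify_ptb`, its return convention, and the `BASE_PMTU` value are assumptions. It also assumes that the PTB message has already been validated, that case 3 is checked before case 2 (the two overlap when the PLPMTU exceeds BASE_PMTU), that "between" in case 1 excludes both end points, and that "probing starts again" means returning to the PROBE_BASE state with a probe of BASE_PMTU.

```python
from typing import Optional, Tuple

BASE_PMTU = 1200  # assumed placeholder value, not taken from the draft


def classify_ptb(state: str, plpmtu: int, probed_size: int,
                 ptb_mtu: Optional[int]) -> Tuple[str, Optional[int]]:
    """Return (next_state, next_probe_size); next_probe_size is None if no probe is sent."""
    if ptb_mtu is None:
        # PTB without a link MTU: handled like "maximum number of probes reached".
        on_limit = {"PROBE_BASE": "PROBE_ERROR", "PROBE_SEARCH": "PROBE_DONE"}
        return on_limit.get(state, state), None
    if ptb_mtu == BASE_PMTU:
        # Case 3: probe again, a later link could have a still smaller MTU.
        return state, BASE_PMTU
    if ptb_mtu < plpmtu:
        # Case 2: possible network re-configuration.
        if ptb_mtu > BASE_PMTU:
            return "PROBE_BASE", BASE_PMTU
        return "PROBE_ERROR", None
    if plpmtu < ptb_mtu < probed_size:
        # Case 1: the reported MTU lies between the PLPMTU and the failed probe.
        if state == "PROBE_BASE":
            return "PROBE_ERROR", None
        if state == "PROBE_SEARCH":
            return "PROBE_SEARCH", ptb_mtu
    # Anything else is not covered by the text; the sketch leaves the state alone.
    return state, None
```

For example, under these assumptions a prober in PROBE_SEARCH with a PLPMTU of 1300 and an outstanding 1500-byte probe that receives a PTB reporting 1400 stays in PROBE_SEARCH and probes next with 1400 bytes.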
Appendix B. Revision Notes | Appendix B. Revision Notes | |||
Note to RFC-Editor: please remove this entire section prior to | Note to RFC-Editor: please remove this entire section prior to | |||
publication. | publication. | |||
Individual draft -00: | Individual draft -00: | |||
o Comments and corrections are welcome directly to the authors or | o Comments and corrections are welcome directly to the authors or | |||
via the IETF TSVWG working group mailing list. | via the IETF TSVWG working group mailing list. | |||
skipping to change at page 30, line 36 ¶ | skipping to change at page 32, line 25 ¶ | |||
o This draft includes improved introduction. | o This draft includes improved introduction. | |||
o The draft is updated to require ICMP validation prior to accepting | o The draft is updated to require ICMP validation prior to accepting | |||
PTB messages - this to be confirmed by WG | PTB messages - this to be confirmed by WG | |||
o Section added to discuss Selection of Probe Size - methods to be | o Section added to discuss Selection of Probe Size - methods to be | |||
evaluated and recommendations to be considered | evaluated and recommendations to be considered | |||
o Section added to align with work proposed in the QUIC WG. | o Section added to align with work proposed in the QUIC WG. | |||
Working Group draft -02: | ||||
o The draft was updated based on feedback from the WG, and a | ||||
detailed review by Magnus Westerlund. | ||||
o The document updates RFC 4821. | ||||
o Requirements list updated. | ||||
o Added more explicit discussion of a simpler black-hole detection | ||||
mode. | ||||
o This draft includes reorganisation of the section on IETF | ||||
protocols. | ||||
o Added more discussion of implementation within an application. | ||||
o Added text on flapping paths. | ||||
o Replaced 'effective MTU' with new term PLPMTU. | ||||
Authors' Addresses | Authors' Addresses | |||
Godred Fairhurst | Godred Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering | School of Engineering | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen AB24 3U | Aberdeen, AB24 3U | |||
UK | UK | |||
Email: gorry@erg.abdn.ac.uk | Email: gorry@erg.abdn.ac.uk | |||
Tom Jones | Tom Jones | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering | School of Engineering | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen AB24 3U | Aberdeen, AB24 3U | |||
UK | UK | |||
Email: tom@erg.abdn.ac.uk | Email: tom@erg.abdn.ac.uk | |||
Michael Tuexen | Michael Tuexen | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
Stegerwaldstrasse 39 | Stegerwaldstrasse 39 | |||
Steinfurt 48565 | Steinfurt, 48565 | |||
DE | DE | |||
Email: tuexen@fh-muenster.de | Email: tuexen@fh-muenster.de | |||
Irene Ruengeler | Irene Ruengeler | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
Stegerwaldstrasse 39 | Stegerwaldstrasse 39 | |||
Steinfurt 48565 | Steinfurt, 48565 | |||
DE | DE | |||
Email: i.ruengeler@fh-muenster.de | Email: i.ruengeler@fh-muenster.de | |||
End of changes. 228 change blocks. | ||||
736 lines changed or deleted | 889 lines changed or added | |||