draft-ietf-tsvwg-datagram-plpmtud-02.txt | draft-ietf-tsvwg-datagram-plpmtud-03.txt | |||
---|---|---|---|---|
Internet Engineering Task Force G. Fairhurst | Internet Engineering Task Force G. Fairhurst | |||
Internet-Draft T. Jones | Internet-Draft T. Jones | |||
Updates: 4821 (if approved) University of Aberdeen | Updates: 4821 (if approved) University of Aberdeen | |||
Intended status: Standards Track M. Tuexen | Intended status: Standards Track M. Tuexen | |||
Expires: December 08, 2018 I. Ruengeler | Expires: January 3, 2019 I. Ruengeler | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
June 08, 2018 | July 02, 2018 | |||
Packetization Layer Path MTU Discovery for Datagram Transports | Packetization Layer Path MTU Discovery for Datagram Transports | |||
draft-ietf-tsvwg-datagram-plpmtud-02 | draft-ietf-tsvwg-datagram-plpmtud-03 | |||
Abstract | Abstract | |||
This document describes a robust method for Path MTU Discovery | This document describes a robust method for Path MTU Discovery | |||
(PMTUD) for datagram Packetization layers. The document describes an | (PMTUD) for datagram Packetization layers. The document describes an | |||
extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path | |||
MTU Discovery for IPv4 and IPv6. The method allows a Packetization | MTU Discovery for IPv4 and IPv6. The method allows a Packetization | |||
Layer (PL), or a datagram application that uses a PL, to discover | Layer (PL), or a datagram application that uses a PL, to discover | |||
whether a network path can support the current size of datagram and | whether a network path can support the current size of datagram. | |||
to probe a network path with progressively larger packets to find | This can be used to detect and reduce the message size when a sender | |||
whether the maxium packet size can be increased. This allows a | encounters a network black hole (where packets are discarded, and no | |||
sender to determine an appropriate packet size. This provides | ICMP message is received). The method can also probe a network path | |||
functionality for datagram transports that is equivalent to the | with progressively larger packets to find whether the maximum packet | |||
Packetization layer PMTUD specification for TCP, specified in | size can be increased. This allows a sender to determine an | |||
RFC4821. | appropriate packet size, providing functionality for datagram | |||
transports that is equivalent to the Packetization layer PMTUD | ||||
specification for TCP, specified in RFC4821. | ||||
The document also provides implementation notes for incorporating | The document also provides implementation notes for incorporating | |||
Datagram PMTUD into IETF Datagram transports or applications that use | Datagram PMTUD into IETF Datagram transports or applications that use | |||
transports. | transports. | |||
When published, this specification updates RFC4821. | When published, this specification updates RFC4821. | |||
Status of this Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on December 08, 2018. | ||||
This Internet-Draft will expire on January 3, 2019. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (http://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Simplified BSD License text | to this document. Code Components extracted from this document must | |||
as described in Section 4.e of the Trust Legal Provisions and are | include Simplified BSD License text as described in Section 4.e of | |||
provided without warranty as described in the Simplified BSD License. | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . . 3 | 1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 3 | |||
1.2. Packetization Layer Path MTU Discovery . . . . . . . . . . 4 | 1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 5 | |||
1.3. Path MTU Discovery for Datagram Services . . . . . . . . . 5 | 1.3. Path MTU Discovery for Datagram Services . . . . . . . . 6 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 8 | 3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 8 | |||
3.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . . 10 | 3.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 10 | |||
3.2. Validation of Probe Packet Size . . . . . . . . . . . . . 11 | 3.2. Validation of Probe Packet Size . . . . . . . . . . . . . 12 | |||
3.3. Reducing the PLPMTU: Confirming Path Characteristics . . . 12 | 3.3. Reducing the PLPMTU: Confirming Path Characteristics . . 12 | |||
3.4. Increasing the PLPMTU: Supporting Path Changes . . . . . . 12 | 3.4. Increasing the PLPMTU: Supporting Path Changes . . . . . 13 | |||
3.5. Robustness to inconsistent Path information . . . . . . . 12 | 3.5. Robustness to inconsistent Path information . . . . . . . 13 | |||
4. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . . 13 | 4. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 13 | |||
4.1. PROBE_SEARCH: Probing for a larger PLPMTU . . . . . . . . 13 | 4.1. PROBE_SEARCH: Probing for a larger PLPMTU . . . . . . . . 14 | |||
4.2. The PROBE_DONE state . . . . . . . . . . . . . . . . . . . 14 | 4.2. The PROBE_DONE state . . . . . . . . . . . . . . . . . . 15 | |||
4.3. Verification and Use of PTB Messages . . . . . . . . . . . 14 | 4.3. Validation and Use of PTB Messages . . . . . . . . . . . 15 | |||
4.4. Timers . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | 4.4. Timers . . . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
4.5. Constants . . . . . . . . . . . . . . . . . . . . . . . . 15 | 4.5. Constants . . . . . . . . . . . . . . . . . . . . . . . . 16 | |||
4.6. Variables . . . . . . . . . . . . . . . . . . . . . . . . 16 | 4.6. Variables . . . . . . . . . . . . . . . . . . . . . . . . 17 | |||
4.7. Selecting PROBED_SIZE . . . . . . . . . . . . . . . . . . 16 | 4.7. Selecting PROBED_SIZE . . . . . . . . . . . . . . . . . . 18 | |||
4.8. Black Hole Detection . . . . . . . . . . . . . . . . . . . 17 | 4.8. Simple Black Hole Detection . . . . . . . . . . . . . . . 18 | |||
4.9. State Machine . . . . . . . . . . . . . . . . . . . . . . 17 | 4.8.1. Simple Black Hole Detection State Machine . . . . . . 19 | |||
5. Specification of Protocol-Specific Methods . . . . . . . . . . 20 | 4.9. Full State Machine . . . . . . . . . . . . . . . . . . . 20 | |||
5.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 20 | 5. Specification of Protocol-Specific Methods . . . . . . . . . 23 | |||
5.1.1. Application Request . . . . . . . . . . . . . . . . . 20 | 5.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 23 | |||
5.1.2. Application Response . . . . . . . . . . . . . . . . . 20 | 5.1.1. Application Request . . . . . . . . . . . . . . . . . 24 | |||
5.1.3. Sending Application Probe Packets . . . . . . . . . . 21 | 5.1.2. Application Response . . . . . . . . . . . . . . . . 24 | |||
5.1.4. Validating the Path . . . . . . . . . . . . . . . . . 21 | 5.1.3. Sending Application Probe Packets . . . . . . . . . . 24 | |||
5.1.5. Handling of PTB Messages . . . . . . . . . . . . . . . 21 | 5.1.4. Validating the Path . . . . . . . . . . . . . . . . . 24 | |||
5.2. DPLPMTUD with UDP Options . . . . . . . . . . . . . . . . 21 | 5.1.5. Handling of PTB Messages . . . . . . . . . . . . . . 24 | |||
5.2.1. UDP Request Option . . . . . . . . . . . . . . . . . . 22 | 5.2. DPLPMTUD with UDP Options . . . . . . . . . . . . . . . . 24 | |||
5.2.2. UDP Response Option . . . . . . . . . . . . . . . . . 22 | 5.2.1. UDP Request Option . . . . . . . . . . . . . . . . . 25 | |||
5.3. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 22 | 5.2.2. UDP Response Option . . . . . . . . . . . . . . . . . 25 | |||
5.3.1. SCTP/IP4 and SCTP/IPv6 . . . . . . . . . . . . . . . . 22 | 5.3. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 26 | |||
5.3.1.1. Sending SCTP Probe Packets . . . . . . . . . . . . 22 | 5.3.1. SCTP/IP4 and SCTP/IPv6 . . . . . . . . . . . . . . . 26 | |||
5.3.1.2. Validating the Path with SCTP . . . . . . . . . . 23 | 5.3.1.1. Sending SCTP Probe Packets . . . . . . . . . . . 26 | |||
5.3.1.3. PTB Message Handling by SCTP . . . . . . . . . . . 23 | 5.3.1.2. Validating the Path with SCTP . . . . . . . . . . 27 | |||
5.3.1.3. PTB Message Handling by SCTP . . . . . . . . . . 27 | ||||
5.3.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 23 | 5.3.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 27 | |||
5.3.2.1. Sending SCTP/UDP Probe Packets . . . . . . . . . . 23 | 5.3.2.1. Sending SCTP/UDP Probe Packets . . . . . . . . . 27 | |||
5.3.2.2. Validating the Path with SCTP/UDP . . . . . . . . 23 | 5.3.2.2. Validating the Path with SCTP/UDP . . . . . . . . 27 | |||
5.3.2.3. Handling of PTB Messages by SCTP/UDP . . . . . . . 24 | 5.3.2.3. Handling of PTB Messages by SCTP/UDP . . . . . . 27 | |||
5.3.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . . 24 | 5.3.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 28 | |||
5.3.3.1. Sending SCTP/DTLS Probe Packets . . . . . . . . . 24 | 5.3.3.1. Sending SCTP/DTLS Probe Packets . . . . . . . . . 28 | |||
5.3.3.2. Validating the Path with SCTP/DTLS . . . . . . . . 24 | 5.3.3.2. Validating the Path with SCTP/DTLS . . . . . . . 28 | |||
5.3.3.3. Handling of PTB Messages by SCTP/DTLS . . . . . . 24 | 5.3.3.3. Handling of PTB Messages by SCTP/DTLS . . . . . . 28 | |||
5.4. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 24 | 5.4. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 28 | |||
5.4.1. Sending QUIC Probe Packets . . . . . . . . . . . . . . 24 | 5.4.1. Sending QUIC Probe Packets . . . . . . . . . . . . . 28 | |||
5.4.2. Validating the Path with QUIC . . . . . . . . . . . . 25 | 5.4.2. Validating the Path with QUIC . . . . . . . . . . . . 29 | |||
5.4.3. Handling of PTB Messages by QUIC . . . . . . . . . . . 25 | 5.4.3. Handling of PTB Messages by QUIC . . . . . . . . . . 29 | |||
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 25 | 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 29 | |||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 29 | |||
8. Security Considerations . . . . . . . . . . . . . . . . . . . 26 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 30 | |||
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 26 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 30 | |||
9.1. Normative References . . . . . . . . . . . . . . . . . . . 26 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 30 | |||
9.2. Informative References . . . . . . . . . . . . . . . . . . 28 | 9.2. Informative References . . . . . . . . . . . . . . . . . 32 | |||
Appendix A. Event-driven state changes . . . . . . . . . . . . . . 28 | Appendix A. Event-driven state changes . . . . . . . . . . . . . 32 | |||
Appendix B. Revision Notes . . . . . . . . . . . . . . . . . . . . 31 | Appendix B. Revision Notes . . . . . . . . . . . . . . . . . . . 35 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 32 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 37 | |||
1. Introduction | 1. Introduction | |||
The IETF has specified datagram transport using UDP, SCTP, and DCCP, | The IETF has specified datagram transport using UDP, SCTP, and DCCP, | |||
as well as protocols layered on top of these transports (e.g., SCTP/ | as well as protocols layered on top of these transports (e.g., SCTP/ | |||
UDP, DCCP/UDP) and directly over the IP network layer. This document | UDP, DCCP/UDP) and directly over the IP network layer. This document | |||
describes a robust method for Path MTU Discovery (PMTUD) that may be | describes a robust method for Path MTU Discovery (PMTUD) that may be | |||
used with these transport protocols (or the applications that use | used with these transport protocols (or the applications that use | |||
their transport service) to discover an appropriate size of packet to | their transport service) to discover an appropriate size of packet to | |||
use across an Internet path. | use across an Internet path. | |||
1.1. Classical Path MTU Discovery | 1.1. Classical Path MTU Discovery | |||
Classical Path Maximum Transmission Unit Discovery (PMTUD) can be | Classical Path Maximum Transmission Unit Discovery (PMTUD) can be | |||
used with any transport that is able to process ICMP Packet Too Big | used with any transport that is able to process ICMP Packet Too Big | |||
(PTB) messages (e.g., [RFC1191] and [RFC8201]). The term PTB message | (PTB) messages (e.g., [RFC1191] and [RFC8201]). The term PTB message | |||
is applied to both IPv4 ICMP Unreachable messages (type 3) that carry | is applied to both IPv4 ICMP Unreachable messages (type 3) that carry | |||
the error Fragmentation Needed (Type 3, Code 4) and ICMPv6 packet too | the error Fragmentation Needed (Type 3, Code 4) and ICMPv6 packet too | |||
big messages (Type 2). When a sender receives a PTB message, it | big messages (Type 2). When a sender receives a PTB message, it | |||
reduces the effective MTU to the value reported as the Link MTU in | reduces the effective MTU to the value reported as the Link MTU in | |||
the PTB message, and uses a method that from time to time increases the | the PTB message, and uses a method that from time to time increases the | |||
packet size in an attempt to discover an increase in the supported PMTU. | packet size in an attempt to discover an increase in the supported PMTU. | |||
The packets sent with a size larger than the current effective PMTU | The packets sent with a size larger than the current effective PMTU | |||
are known as probe packets. | are known as probe packets. | |||
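
As an illustration only (it is not part of either draft revision), the PTB classification described above could be sketched in Python as follows. The function name is_ptb and its parameters are hypothetical; the ICMP type and code values are the ones stated in the text.

```python
def is_ptb(ip_version: int, icmp_type: int, icmp_code: int) -> bool:
    """Return True if an ICMP message counts as a PTB message.

    Hypothetical helper: the name and signature are assumptions made
    for this sketch and do not come from the draft.
    """
    if ip_version == 4:
        # IPv4: Destination Unreachable (type 3), Fragmentation Needed (code 4)
        return icmp_type == 3 and icmp_code == 4
    if ip_version == 6:
        # ICMPv6: Packet Too Big (type 2)
        return icmp_type == 2
    return False
```

A caller would apply a check like this before looking at the reported Link MTU, so that other ICMP errors are never treated as PMTU signals.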
Packets not intended as probe packets are either fragmented to the | Packets not intended as probe packets are either fragmented to the | |||
current effective PMTU, or the attempt to send fails with an error | current effective PMTU, or the attempt to send fails with an error | |||
code. Applications are sometimes provided with a primitive to let | code. Applications are sometimes provided with a primitive to let | |||
them read the maximum packet size, derived from the current effective | them read the maximum packet size, derived from the current effective | |||
PMTU. | PMTU. | |||
Classical PMTUD is subject to protocol failures. One failure arises | Classical PMTUD is subject to protocol failures. One failure arises | |||
when traffic using a packet size larger than the actual PMTU is | when traffic using a packet size larger than the actual PMTU is | |||
black-holed (all datagrams sent with this size, or larger, are | black-holed (all datagrams sent with this size, or larger, are | |||
silently discarded without the sender receiving ICMP PTB messages). | silently discarded without the sender receiving ICMP PTB messages). | |||
This could arise when the PTB messages are not delivered back to the | This could arise when the PTB messages are not delivered back to the | |||
sender for some reason [RFC2923]. For example, ICMP messages are | sender for some reason [RFC2923]. For example, ICMP messages are | |||
increasingly filtered by middleboxes (including firewalls) [RFC4890]. | increasingly filtered by middleboxes (including firewalls) [RFC4890]. | |||
A stateful firewall could be configured with a policy to block | A stateful firewall could be configured with a policy to block | |||
incoming ICMP messages, which would prevent reception of PTB messages | incoming ICMP messages, which would prevent reception of PTB messages | |||
to endpoints behind this firewall. Other examples include cases | to endpoints behind this firewall. Other examples include cases | |||
where PTB messages are not correctly processed/generated by tunnel | where PTB messages are not correctly processed/generated by tunnel | |||
endpoints. | endpoints. | |||
Another failure could result if a node that is not on the network | Another failure could result if a node that is not on the network | |||
path sends a PTB message that attempts to force the sender to change | path sends a PTB message that attempts to force the sender to change | |||
the effective PMTU [RFC8201]. A sender can protect itself from | the effective PMTU [RFC8201]. A sender can protect itself from | |||
reacting to such messages by utilising the quoted packet within a PTB | reacting to such messages by utilising the quoted packet within a PTB | |||
message payload to verify that the received PTB message was generated | message payload to validate that the received PTB message was | |||
in response to a packet that had actually originated from the sender. | generated in response to a packet that had actually originated from | |||
However, there are situations where a sender would be unable to | the sender. However, there are situations where a sender would be | |||
provide this verification. | unable to provide this validation. | |||
Examples where verification is not possible include: | Examples where validation of the PTB message is not possible include: | |||
o When the router issuing the ICMP message is acting on a tunneled | o When the router issuing the ICMP message is acting on a tunneled | |||
packet, the ICMP message will be directed to the tunnel endpoint. | packet, the ICMP message will be directed to the tunnel endpoint. | |||
This tunnel endpoint is responsible for forwarding the ICMP | This tunnel endpoint is responsible for forwarding the ICMP | |||
message and also processing the quoted packet within the payload | message and also processing the quoted packet within the payload | |||
field to remove the effect of the tunnel, and return a correctly | field to remove the effect of the tunnel, and return a correctly | |||
formatted ICMP message to the sender. Failure to do this results | formatted ICMP message to the sender. Failure to do this results | |||
in black-holing. | in black-holing. | |||
o When a router issuing the ICMP message implements RFC792 | o When a router issuing the ICMP message implements RFC792 | |||
skipping to change at page 4, line 51 ¶ | skipping to change at page 5, line 19 ¶ | |||
previous bullet. Even if the decapsulated message is processed by | previous bullet. Even if the decapsulated message is processed by | |||
the tunnel endpoint, there could be insufficient bytes remaining | the tunnel endpoint, there could be insufficient bytes remaining | |||
for the sender to interpret the quoted transport information. | for the sender to interpret the quoted transport information. | |||
RFC1812 [RFC1812] requires routers to return the full packet if | RFC1812 [RFC1812] requires routers to return the full packet if | |||
possible, often the case for IPv4 when used the path includes | possible, often the case for IPv4 when used the path includes | |||
tunnels; or where the packet has been encapsulated/tunneled over | tunnels; or where the packet has been encapsulated/tunneled over | |||
an encrypted transport and it is not possible to determine the | an encrypted transport and it is not possible to determine the | |||
original transport header ). | original transport header ). | |||
o Even when the PTB message includes sufficient bytes of the quoted | o Even when the PTB message includes sufficient bytes of the quoted | |||
packet, the network layer could lack sufficient context to perform | packet, the network layer could lack sufficient context to | |||
verification, because this depends on information about the active | validate the message, because this depends on information about | |||
transport flows at an endpoint node (e.g., the socket/address | the active transport flows at an endpoint node (e.g., the socket/ | |||
pairs being used, and other protocol header information). | address pairs being used, and other protocol header information). | |||
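
The dependence on endpoint context noted in the bullets above can be made concrete with a minimal Python sketch. It assumes UDP over IPv4 only, and it assumes the endpoint keeps a set of active flows; the names Flow, quoted_udp_flow and ptb_is_plausible are hypothetical and do not appear in the draft. The check shows why a truncated quote or a missing flow table prevents validation.

```python
import socket
import struct
from typing import Optional, Set, Tuple

# (source address, source port, destination address, destination port)
Flow = Tuple[str, int, str, int]


def quoted_udp_flow(quoted: bytes) -> Optional[Flow]:
    """Extract the UDP/IPv4 flow from the packet quoted in a PTB message.

    Returns None when the quote is too short, is not IPv4, or is not UDP.
    """
    if len(quoted) < 20 or quoted[0] >> 4 != 4:
        return None
    ihl = (quoted[0] & 0x0F) * 4          # IPv4 header length in bytes
    if ihl < 20 or quoted[9] != 17:       # protocol 17 is UDP
        return None
    if len(quoted) < ihl + 4:             # need both UDP port fields
        return None
    src = socket.inet_ntoa(quoted[12:16])
    dst = socket.inet_ntoa(quoted[16:20])
    sport, dport = struct.unpack("!HH", quoted[ihl:ihl + 4])
    return (src, sport, dst, dport)


def ptb_is_plausible(quoted: bytes, active_flows: Set[Flow]) -> bool:
    """Accept a PTB message only if its quoted packet matches a flow we sent on."""
    flow = quoted_udp_flow(quoted)
    return flow is not None and flow in active_flows
```

A real implementation would also need to cover IPv6 and other transports, and could compare further header fields. This sketch only demonstrates the flow-matching step.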
1.2. Packetization Layer Path MTU Discovery | 1.2. Packetization Layer Path MTU Discovery | |||
The term Packetization Layer (PL) has been introduced to describe the | The term Packetization Layer (PL) has been introduced to describe the | |||
layer that is responsible for placing data blocks into the payload of | layer that is responsible for placing data blocks into the payload of | |||
IP packets and selecting an appropriate Maximum Packet Size (MPS). | IP packets and selecting an appropriate Maximum Packet Size (MPS). | |||
This function is often performed by a transport protocol, but can | This function is often performed by a transport protocol, but can | |||
also be performed by other encapsulation methods working above the | also be performed by other encapsulation methods working above the | |||
transport. | transport. | |||
In contrast to PMTUD, Packetization Layer Path MTU Discovery | In contrast to PMTUD, Packetization Layer Path MTU Discovery | |||
(PLPMTUD) [RFC4821] does not rely upon reception and verification of | (PLPMTUD) [RFC4821] does not rely upon reception and validation of | |||
PTB messages. It is therefore more robust than Classical PMTUD. This | PTB messages. It is therefore more robust than Classical PMTUD. | |||
has become the recommended approach for implementing PMTU discovery | This has become the recommended approach for implementing PMTU | |||
with TCP. | discovery with TCP. | |||
It uses a general strategy where the PL sends probe packets to search | It uses a general strategy where the PL sends probe packets to search | |||
for the largest size of unfragmented datagram that can be sent over a | for the largest size of unfragmented datagram that can be sent over a | |||
path. The probe packets are sent with a progressively larger packet | path. The probe packets are sent with a progressively larger packet | |||
size. If a probe packet is successfully delivered (as determined by | size. If a probe packet is successfully delivered (as determined by | |||
the PL), then the PLPMTU is raised to the size of the successful | the PL), then the PLPMTU is raised to the size of the successful | |||
probe. If no response is received to a probe packet, the method | probe. If no response is received to a probe packet, the method | |||
reduces the probe size. This PLPMTU is used to set the application | reduces the probe size. This PLPMTU is used to set the application | |||
MPS. | MPS. | |||
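
The general strategy in the paragraph above could be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the search algorithm specified in Section 4. The names search_plpmtu, send_probe, base, ceiling and step are hypothetical, and halving the step is only one possible way to "reduce the probe size".

```python
from typing import Callable


def search_plpmtu(send_probe: Callable[[int], bool],
                  base: int, ceiling: int, step: int = 32) -> int:
    """Search upward from `base` for the largest confirmed probe size.

    send_probe(size) is an assumed callback that returns True when the PL
    confirms delivery of a probe of `size` bytes, and False when no
    response is received.
    """
    plpmtu = base
    probe = min(base + step, ceiling)
    while step > 0 and probe > plpmtu:
        if send_probe(probe):
            plpmtu = probe        # raise the PLPMTU to the successful probe size
        else:
            step //= 2            # no response: try a smaller increase next
        probe = min(plpmtu + step, ceiling)
    return plpmtu
```

For example, with a path that delivers packets of up to 1400 bytes, search_plpmtu(lambda size: size <= 1400, 1200, 1500) converges on 1400. The result would then be used to set the application MPS.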
PLPMTUD introduces flexibility in the implementation of PMTU | PLPMTUD introduces flexibility in the implementation of PMTU | |||
discovery. At one extreme, it can be configured to only perform PTB | discovery. At one extreme, it can be configured to only perform PTB | |||
black hole detection and recovery to increase the robustness of | black hole detection and recovery to increase the robustness of | |||
Classical PMTUD, or at the other extreme, all PTB processing can be | Classical PMTUD, or at the other extreme, all PTB processing can be | |||
disabled and PLPMTUD can completely replace Classical PMTUD. | disabled and PLPMTUD can completely replace Classical PMTUD. | |||
PLPMTUD can also include additional consistency checks without | PLPMTUD can also include additional consistency checks without | |||
increasing the risk of black-holing. For instance, the | increasing the risk of black-holing. For instance, the | |||
information available at the PL, or higher layers, makes PTB | information available at the PL, or higher layers, makes PTB | |||
verification more straightforward. | validation more straightforward. | |||
1.3. Path MTU Discovery for Datagram Services | 1.3. Path MTU Discovery for Datagram Services | |||
Section 4 of this document presents a set of algorithms for datagram | Section 4 of this document presents a set of algorithms for datagram | |||
protocols to discover the largest size of unfragmented datagram that | protocols to discover the largest size of unfragmented datagram that | |||
can be sent over a path. The method described relies on features of | can be sent over a path. The method described relies on features of | |||
the PL (Section 3) and applies to transport protocols operating over IPv4 | the PL (Section 3) and applies to transport protocols operating over IPv4 | |||
and IPv6. It does not require cooperation from the lower layers, | and IPv6. It does not require cooperation from the lower layers, | |||
although it can utilise ICMP PTB messages when these received | although it can utilise ICMP PTB messages when these received | |||
messages are made available to the PL. | messages are made available to the PL. | |||
The UDP Usage Guidelines [RFC8085] state "an application SHOULD | The UDP Usage Guidelines [RFC8085] state "an application SHOULD | |||
either use the Path MTU information provided by the IP layer or | either use the Path MTU information provided by the IP layer or | |||
implement Path MTU Discovery (PMTUD)", but does not provide a | implement Path MTU Discovery (PMTUD)", but does not provide a | |||
mechanism for discovering the largest size of unfragmented datagram | mechanism for discovering the largest size of unfragmented datagram | |||
that can be used on a path. Prior to this document, PLPMTUD had not | that can be used on a path. Prior to this document, PLPMTUD had not | |||
been specified for UDP. | been specified for UDP. | |||
Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the | Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the | |||
Stream Control Transmission Protocol (SCTP). SCTP utilises heartbeat | Stream Control Transmission Protocol (SCTP). SCTP utilises heartbeat | |||
messages as probe packets, but RFC4821 does not provide a complete | messages as probe packets, but RFC4821 does not provide a complete | |||
specification. This document provides the details to complete that | specification. This document provides the details to complete that | |||
specification. | specification. | |||
The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires | |||
implementations to support Classical PMTUD and states that a DCCP | implementations to support Classical PMTUD and states that a DCCP | |||
sender "MUST maintain the MPS allowed for each active DCCP session". | sender "MUST maintain the MPS allowed for each active DCCP session". | |||
It also defines the current congestion control MPS (CCMPS) supported | It also defines the current congestion control MPS (CCMPS) supported | |||
by a path. This recommends use of PMTUD, and suggests use of control | by a path. This recommends use of PMTUD, and suggests use of control | |||
packets (DCCP-Sync) as path probe packets, because they do not risk | packets (DCCP-Sync) as path probe packets, because they do not risk | |||
skipping to change at page 6, line 33 ¶ | skipping to change at page 7, line 8 ¶ | |||
2. Terminology | 2. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in [RFC2119]. | document are to be interpreted as described in [RFC2119]. | |||
Other terminology is directly copied from [RFC4821], and the | Other terminology is directly copied from [RFC4821], and the | |||
definitions in [RFC1122]. | definitions in [RFC1122]. | |||
Black-Holed: When the sender is unaware that packets are not | Black-Holed: When the sender is unaware that packets are not | |||
delivered to the destination endpoint (e.g., when the sender | delivered to the destination endpoint (e.g., when the sender | |||
transmits packets of a particular size with a previously known | transmits packets of a particular size with a previously known | |||
effective PMTU (also referred to as the PLPMTU), but is unaware of | effective PMTU (also referred to as the PLPMTU), but is unaware of | |||
a change to the path that resulted in a smaller PLPMTU). | a change to the path that resulted in a smaller PLPMTU). | |||
Classical Path MTU Discovery: Classical PMTUD is a process described | Classical Path MTU Discovery: Classical PMTUD is a process described | |||
in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to | in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to | |||
learn the largest size of unfragmented datagram that can be used | learn the largest size of unfragmented datagram that can be used | |||
across a path. | across a path. | |||
Datagram: A datagram is a transport-layer protocol data unit, | Datagram: A datagram is a transport-layer protocol data unit, | |||
transmitted in the payload of an IP packet. | transmitted in the payload of an IP packet. | |||
Effective PMTU: The current estimated value for PMTU that is used by | Effective PMTU: The current estimated value for PMTU that is used by | |||
a PMTUD. This is equivalent to the PLPMTU derived by PLPMTUD. | a PMTUD. This is equivalent to the PLPMTU derived by PLPMTUD. | |||
EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | EMTU_S: The Effective MTU for sending (EMTU_S) is defined in | |||
[RFC1122] as "the maximum IP datagram size that may be sent, for a | [RFC1122] as "the maximum IP datagram size that may be sent, for a | |||
particular combination of IP source and destination addresses...". | particular combination of IP source and destination addresses...". | |||
EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in | EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in | |||
[RFC1122] as the largest datagram size that can be reassembled by | [RFC1122] as the largest datagram size that can be reassembled by | |||
EMTU_R ("Effective MTU to receive"). | EMTU_R ("Effective MTU to receive"). | |||
Link: A communication facility or medium over which nodes can | Link: A communication facility or medium over which nodes can | |||
communicate at the link layer, i.e., a layer below the IP layer. | communicate at the link layer, i.e., a layer below the IP layer. | |||
Examples are Ethernet LANs and Internet (or higher) layer | Examples are Ethernet LANs and Internet (or higher) layer | |||
tunnels. | tunnels. | |||
Link MTU: The Maximum Transmission Unit (MTU) is the size in bytes of | Link MTU: The Maximum Transmission Unit (MTU) is the size in bytes | |||
the largest IP packet, including the IP header and payload, that | of the largest IP packet, including the IP header and payload, | |||
can be transmitted over a link. Note that this could more | that can be transmitted over a link. Note that this could more | |||
properly be called the IP MTU, to be consistent with how other | properly be called the IP MTU, to be consistent with how other | |||
standards organizations use the acronym MTU. This includes the IP | standards organizations use the acronym MTU. This includes the IP | |||
header, but excludes link layer headers and other framing that is | header, but excludes link layer headers and other framing that is | |||
not part of IP or the IP payload. Other standards organizations | not part of IP or the IP payload. Other standards organizations | |||
generally define link MTU to include the link layer headers. | generally define link MTU to include the link layer headers. | |||
MPS: The Maximum Packet Size (MPS) is the largest size of application | MPS: The Maximum Packet Size (MPS) is the largest size of | |||
data block that can be sent unfragmented across a path. In | application data block that can be sent unfragmented across a | |||
DPLPMTUD this quantity is derived from PLPMTU by taking into | path. In DPLPMTUD this quantity is derived from PLPMTU by taking | |||
consideration the size of the application and lower protocol layer | into consideration the size of the application and lower protocol | |||
headers. | layer headers. | |||
Packet: An IP header plus the IP payload. | Packet: An IP header plus the IP payload. | |||
Packetization Layer (PL): The layer of the network stack that places | Packetization Layer (PL): The layer of the network stack that places | |||
data into packets and performs transport protocol functions. | data into packets and performs transport protocol functions. | |||
Path: The set of links and routers traversed by a packet between a | Path: The set of links and routers traversed by a packet between a | |||
source node and a destination node by a particular flow. | source node and a destination node by a particular flow. | |||
Path MTU (PMTU): The minimum of the Link MTU of all the links forming | Path MTU (PMTU): The minimum of the Link MTU of all the links | |||
a path between a source node and a destination node. | forming a path between a source node and a destination node. | |||
PLPMTU: The estimate of the actual PMTU provided by the DPLPMTUD | PLPMTU: The estimate of the actual PMTU provided by the DPLPMTUD | |||
algorithm. | algorithm. | |||
PLPMTUD: Packetization Layer Path MTU Discovery, the method described | PLPMTUD: Packetization Layer Path MTU Discovery, the method | |||
in this document for datagram PLs, which is an extension to | described in this document for datagram PLs, which is an extension | |||
Classical PMTU Discovery. | to Classical PMTU Discovery. | |||
Probe packet: A datagram sent with a purposely chosen size (typically | Probe packet: A datagram sent with a purposely chosen size | |||
larger than the current PLPMTU) to detect if packets of this size | (typically larger than the current PLPMTU) to detect if packets of | |||
can be successfully sent end-to-end across the network path. | this size can be successfully sent end-to-end across the network | |||
path. | ||||
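The relationship between the Link MTU, PMTU, and MPS definitions above can be illustrated with a short worked example. This is a minimal sketch, not part of either draft revision: the function names are hypothetical, and the header sizes assume an IPv4 header without options (20 bytes) and a UDP header (8 bytes).

```python
IPV4_HEADER = 20  # assumption: IPv4 header without options
UDP_HEADER = 8    # UDP header size


def path_mtu(link_mtus):
    """PMTU: the minimum Link MTU over all links forming the path."""
    if not link_mtus:
        raise ValueError("a path needs at least one link")
    return min(link_mtus)


def max_packet_size(plpmtu, header_overhead):
    """MPS: the PLPMTU less the protocol headers carried in each packet."""
    mps = plpmtu - header_overhead
    if mps <= 0:
        raise ValueError("headers do not fit within the PLPMTU")
    return mps


# A path crossing three links; the 1400-byte link limits the PMTU.
pmtu = path_mtu([1500, 1400, 9000])
mps = max_packet_size(pmtu, IPV4_HEADER + UDP_HEADER)
```

In this sketch the PLPMTU is taken to equal the actual PMTU; in practice the PLPMTU is only the estimate produced by the search and can be smaller.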
3. Features Required to Provide Datagram PLPMTUD | 3. Features Required to Provide Datagram PLPMTUD | |||
TCP PLPMTUD has been defined using standard TCP protocol mechanisms. | TCP PLPMTUD has been defined using standard TCP protocol mechanisms. | |||
All of the requirements in [RFC4821] also apply to use of the | All of the requirements in [RFC4821] also apply to use of the | |||
technique with a datagram PL. Unlike TCP, some datagram PLs require | technique with a datagram PL. Unlike TCP, some datagram PLs require | |||
additional mechanisms to implement PLPMTUD. | additional mechanisms to implement PLPMTUD. | |||
There are eight requirements for performing the datagram PLPMTUD | There are eight requirements for performing the datagram PLPMTUD | |||
method described in this specification: | method described in this specification: | |||
1. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide | 1. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide | |||
information about the maximum size of packet that can be | information about the maximum size of packet that can be | |||
transmitted by the sender on the local link (the local Link MTU). | transmitted by the sender on the local link (the local Link MTU). | |||
It MAY utilize similar information about the receiver when this | It MAY utilize similar information about the receiver when this | |||
is supplied (note this could be less than EMTU_R). This avoids | is supplied (note this could be less than EMTU_R). This avoids | |||
implementations trying to send probe packets that can not be | implementations trying to send probe packets that can not be | |||
transmitted by the local link. Too high a value may reduce the | transmitted by the local link. Too high a value may reduce the | |||
efficiency of the search algorithm. Some applications also have | efficiency of the search algorithm. Some applications also have | |||
a maximum transport protocol data unit (PDU) size, in which case | a maximum transport protocol data unit (PDU) size, in which case | |||
there is no benefit from probing for a size larger than this | there is no benefit from probing for a size larger than this | |||
(unless a transport allows multiplexing multiple application | (unless a transport allows multiplexing multiple application | |||
PDUs into the same datagram). | PDUs into the same datagram). | |||
2. PLPMTU: A datagram application MUST be able to choose the size of | 2. PLPMTU: A datagram application MUST be able to choose the size of | |||
datagrams sent to the network, up to the PLPMTU, or a smaller | datagrams sent to the network, up to the PLPMTU, or a smaller | |||
value (such as the MPS) derived from this. This value is managed | value (such as the MPS) derived from this. This value is managed | |||
by the DPLPMTUD method. The PLPMTU (specified as the effective | by the DPLPMTUD method. The PLPMTU (specified as the effective | |||
PMTU in Section 1 of [RFC1191]) is equivalent to the EMTU_S | PMTU in Section 1 of [RFC1191]) is equivalent to the EMTU_S | |||
(specified in [RFC1122]). | (specified in [RFC1122]). | |||
3. Probe packets: On request, a PLPMTUD sender is REQUIRED to be | 3. Probe packets: On request, a PLPMTUD sender is REQUIRED to be | |||
able to transmit a packet larger than the PLPMTU. This can be | able to transmit a packet larger than the PLPMTU. This can be | |||
used to send a probe packet. In IPv4, a probe packet MUST be | used to send a probe packet. In IPv4, a probe packet MUST be | |||
sent with the Don't Fragment (DF) bit set in the IP header, and | sent with the Don't Fragment (DF) bit set in the IP header, and | |||
without network layer endpoint fragmentation. In IPv6, a probe | without network layer endpoint fragmentation. In IPv6, a probe | |||
packet is always sent without source fragmentation (as specified | packet is always sent without source fragmentation (as specified | |||
in section 5.4 of [RFC8201]). | in section 5.4 of [RFC8201]). | |||
4. Processing PTB messages: A DPLPMTUD sender MAY optionally utilize | 4. Processing PTB messages: A DPLPMTUD sender MAY optionally utilize | |||
PTB messages received from the network layer to help identify | PTB messages received from the network layer to help identify | |||
when a path does not support the current size of packet probe. | when a path does not support the current size of packet probe. | |||
Any received PTB message MUST be verified before it is used to | Any received PTB message MUST be validated before it is used to | |||
update the PLPMTU discovery information [RFC8201]. This | update the PLPMTU discovery information [RFC8201]. This | |||
verification confirms that the PTB message was sent in response | validation confirms that the PTB message was sent in response to | |||
to a packet originated by the sender, and needs to be performed | a packet originated by the sender, and needs to be performed | |||
before the PLPMTU discovery method reacts to the PTB message. | before the PLPMTU discovery method reacts to the PTB message. | |||
When the router link MTU is indicated in the PTB message this MAY | When the router link MTU is indicated in the PTB message this MAY | |||
be used by DPLPMTUD to reduce the probe size but MUST NOT be used | be used by DPLPMTUD to reduce the probe size but MUST NOT be used | |||
to increase the PLPMTU ([RFC8201]). Verification SHOULD utilise | to increase the PLPMTU ([RFC8201]). This validation SHOULD | |||
information that can not be simply determined by an off-path | utilise information that can not be simply determined by an off- | |||
attacker, for example, by checking the value of a protocol header | path attacker, for example, by checking the value of a protocol | |||
field known only to the two PL endpoints. (Some datagram | header field known only to the two PL endpoints. (Some datagram | |||
applications use well-known source and destination ports and | applications use well-known source and destination ports and | |||
therefore this check needs to rely on other information.) | therefore this check needs to rely on other information.) | |||
5. Reception feedback: The destination PL endpoint is REQUIRED to | 5. Reception feedback: The destination PL endpoint is REQUIRED to | |||
provide a feedback method that indicates to the DPLPMTUD sender | provide a feedback method that indicates to the DPLPMTUD sender | |||
when a probe packet has been received by the destination PL | when a probe packet has been received by the destination PL | |||
endpoint. The local PL endpoint at the sending node is REQUIRED | endpoint. The local PL endpoint at the sending node is REQUIRED | |||
to pass this feedback to the sender-side DPLPMTUD method. | to pass this feedback to the sender-side DPLPMTUD method. | |||
6. Probing and congestion control: The isolated loss of a probe | 6. Probing and congestion control: The isolated loss of a probe | |||
packet SHOULD NOT be treated as an indication of congestion and | packet SHOULD NOT be treated as an indication of congestion and | |||
its loss SHOULD NOT directly trigger a congestion control | its loss SHOULD NOT directly trigger a congestion control | |||
reaction [RFC4821]. | reaction [RFC4821]. | |||
7. Probe loss recovery: If the data block carried by a probe message | 7. Probe loss recovery: If the data block carried by a probe packet | |||
needs to be sent reliably, the PL (or layers above) MUST arrange | needs to be sent reliably, the PL (or layers above) MUST arrange | |||
retransmission/repair of any resulting loss. This method MUST be | retransmission/repair of any resulting loss. This method MUST be | |||
robust in the case where probe packets are lost due to other | robust in the case where probe packets are lost due to other | |||
reasons (including link transmission error, congestion). The | reasons (including link transmission error, congestion). The | |||
DPLPMTUD method treats isolated loss of a probe packet (with or | DPLPMTUD method treats isolated loss of a probe packet (with or | |||
without a PTB message) as a potential indication of a PMTU limit | without a PTB message) as a potential indication of a PMTU limit | |||
on the path, but not as an indication of congestion [CC]. | on the path, but not as an indication of congestion Paragraph 6. | |||
8. Shared PLPMTU state: The PLPMTU value could also be stored with | 8. Shared PLPMTU state: The PLPMTU value could also be stored with | |||
the corresponding entry in the destination cache and used by | the corresponding entry in the destination cache and used by | |||
other PL instances. The specification of PLPMTUD [RFC4821] | other PL instances. The specification of PLPMTUD [RFC4821] | |||
states: "If PLPMTUD updates the MTU for a particular path, all | states: "If PLPMTUD updates the MTU for a particular path, all | |||
Packetization Layer sessions that share the path representation | Packetization Layer sessions that share the path representation | |||
(as described in Section 5.2 of [RFC4821]) SHOULD be notified to | (as described in Section 5.2 of [RFC4821]) SHOULD be notified to | |||
make use of the new MTU and make the required congestion control | make use of the new MTU and make the required congestion control | |||
adjustments". Such methods need to be robust to the wide variety | adjustments". Such methods need to be robust to the wide variety | |||
of underlying network forwarding behaviours. PLPMTU adjustments | of underlying network forwarding behaviours. PLPMTU adjustments | |||
based on shared PLPMTU values should be incorporated in the | based on shared PLPMTU values should be incorporated in the | |||
search algorithms. Section 5.2 of [RFC8201] provides guidance on | search algorithms. Section 5.2 of [RFC8201] provides guidance on | |||
the caching of PMTU information and also the relation to IPv6 | the caching of PMTU information and also the relation to IPv6 | |||
flow labels. | flow labels. | |||
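The PTB handling in requirement 4 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the method defined by the draft: the function names are hypothetical, and validation is reduced to comparing a token quoted in the PTB payload against a value known only to the two PL endpoints. A validated PTB may lower the size of the next probe, but it is never used to raise the PLPMTU.

```python
import hmac


def ptb_is_valid(quoted_token: bytes, expected_token: bytes) -> bool:
    """Hypothetical check: the packet quoted in the PTB carries our token."""
    # compare_digest avoids leaking the token through timing differences.
    return hmac.compare_digest(quoted_token, expected_token)


def next_probe_size(probe_size: int, ptb_mtu: int, valid: bool) -> int:
    """Return the probe size to use after receiving a PTB message."""
    if not valid:
        return probe_size            # unvalidated PTB messages are ignored
    return min(probe_size, ptb_mtu)  # only ever reduce, never increase
```

For example, a validated PTB reporting 1400 bytes reduces a 1500-byte probe to 1400 bytes, while a PTB reporting 9000 bytes leaves a 1400-byte probe unchanged.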
In addition, the following principles are stated for design of a | In addition, the following principles are stated for design of a | |||
DPLPMTUD method: | DPLPMTUD method: | |||
o MPS: A method MUST signal an appropriate MPS to the higher layer | o MPS: A method MUST signal an appropriate MPS to the higher layer | |||
using the PL. This may change following a change to the path. The | using the PL. This may change following a change to the path. | |||
method SHOULD avoid forcing an application to use an arbitrarily | The method SHOULD avoid forcing an application to use an arbitrarily | |||
small MPS (PLPMTU) for transmission while the method is searching | small MPS (PLPMTU) for transmission while the method is searching | |||
for the currently supported PLPMTU. Datagram PLs do not | for the currently supported PLPMTU. Datagram PLs do not | |||
necessarily support fragmentation of PDUs larger than the PLPMTU. | necessarily support fragmentation of PDUs larger than the PLPMTU. | |||
A reduced MPS can adversely impact the performance of a datagram | A reduced MPS can adversely impact the performance of a datagram | |||
application. | application. | |||
o Path validation: A method MUST be robust to path changes that | o Path validation: A method MUST be robust to path changes that | |||
could have occurred since the path characteristics were last | could have occurred since the path characteristics were last | |||
confirmed, and to the possibility of inconsistent path information | confirmed, and to the possibility of inconsistent path information | |||
being received. | being received. | |||
o Datagram reordering: A method MUST be robust to the possibility | o Datagram reordering: A method MUST be robust to the possibility | |||
that a flow encounters reordering, or has the traffic (including | that a flow encounters reordering, or has the traffic (including | |||
probe packets) divided over more than one network path. | probe packets) divided over more than one network path. | |||
o When to probe: A method SHOULD determine whether the path capacity | o When to probe: A method SHOULD determine whether the path capacity | |||
has increased since it last measured the path. This determines | has increased since it last measured the path. This determines | |||
when the path should again be probed. | when the path should again be probed. | |||
3.1. PLPMTU Probe Packets | 3.1. PLPMTU Probe Packets | |||
The DPLPMTUD method relies upon the PL sender being able to generate | The DPLPMTUD method relies upon the PL sender being able to generate | |||
probe messages with a specific size. TCP is able to generate these | probe packets with a specific size. TCP is able to generate these | |||
probe packets by choosing to appropriately segment data being sent | probe packets by choosing to appropriately segment data being sent | |||
[RFC4821]. | [RFC4821]. | |||
In contrast, a datagram PL that needs to construct a probe packet has | In contrast, a datagram PL that needs to construct a probe packet has | |||
to either request an application to send a data block that is larger | to either request an application to send a data block that is larger | |||
than that generated by an application, or to utilise padding | than that generated by an application, or to utilise padding | |||
functions to extend a datagram beyond the size of the application | functions to extend a datagram beyond the size of the application | |||
data block. Protocols that permit exchange of control messages | data block. Protocols that permit exchange of control messages | |||
(without an application data block) could alternatively prefer to | (without an application data block) could alternatively prefer to | |||
generate a probe packet by extending a control message with padding | generate a probe packet by extending a control message with padding | |||
skipping to change at page 11, line 10 ¶ | skipping to change at page 11, line 27 ¶ | |||
way to fragment a datagram at the PL, or could instead utilise a | way to fragment a datagram at the PL, or could instead utilise a | |||
control packet with padding. | control packet with padding. | |||
A receiver needs to be able to distinguish an in-band data block from | A receiver needs to be able to distinguish an in-band data block from | |||
any added padding. This is needed to ensure that any added padding | any added padding. This is needed to ensure that any added padding | |||
is not passed on to an application at the receiver. | is not passed on to an application at the receiver. | |||
This results in three possible ways that a sender can create a probe | This results in three possible ways that a sender can create a probe | |||
packet, listed in order of preference: | packet, listed in order of preference: | |||
Probing using padding data: A probe packet that contains only control | Probing using padding data: A probe packet that contains only | |||
information together with any padding needed to inflate the packet | control information together with any padding needed to inflate | |||
to the size required for the probe packet. Since these probe | the packet to the size required for the probe packet. Since these | |||
packets do not carry an application-supplied data block, they do | probe packets do not carry an application-supplied data block, they | |||
not typically require retransmission, although they do still | do not typically require retransmission, although they do still | |||
consume network capacity and incur endpoint processing. | consume network capacity and incur endpoint processing. | |||
Probing using application data and padding data: A probe packet that | Probing using application data and padding data: A probe packet that | |||
contains a data block supplied by an application that is combined | contains a data block supplied by an application that is combined | |||
with padding to inflate the length of the datagram to the size | with padding to inflate the length of the datagram to the size | |||
required for the probe packet. If the application/transport needs | required for the probe packet. If the application/transport needs | |||
protection from the loss of this probe packet, the application/ | protection from the loss of this probe packet, the application/ | |||
transport may perform transport-layer retransmission/repair of the | transport may perform transport-layer retransmission/repair of the | |||
data block (e.g., by retransmission after loss is detected or by | data block (e.g., by retransmission after loss is detected or by | |||
duplicating the data block in a datagram without the padding | duplicating the data block in a datagram without the padding | |||
data). | data). | |||
Probing using application data: A probe packet that contains a data | Probing using application data: A probe packet that contains a data | |||
block supplied by an application that matches the size required | block supplied by an application that matches the size required | |||
for the probe packet. This method requests the application to | for the probe packet. This method requests the application to | |||
issue a data block of the desired probe size. If the application/ | issue a data block of the desired probe size. If the application/ | |||
transport needs protection from the loss of an unsuccessful probe | transport needs protection from the loss of an unsuccessful probe | |||
packet, the application/transport needs then to perform transport- | packet, the application/transport needs then to perform transport- | |||
layer retransmission/repair of the data block (e.g., by | layer retransmission/repair of the data block (e.g., by | |||
retransmission after loss is detected). | retransmission after loss is detected). | |||
A PL that uses a probe packet carrying an application data block, | A PL that uses a probe packet carrying an application data block, | |||
could need to retransmit this application data block if the probe | could need to retransmit this application data block if the probe | |||
skipping to change at page 11, line 53 ¶ | skipping to change at page 12, line 22 ¶ | |||
DPLPMTUD MAY choose to use only one of these methods to simplify the | DPLPMTUD MAY choose to use only one of these methods to simplify the | |||
implementation. | implementation. | |||
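Probing with application data and padding requires that the receiver can separate the data block from the padding. The sketch below shows one way this could be done. It assumes a hypothetical framing in which a 2-byte length field precedes the data block; neither the framing nor the function names come from the draft, and the size argument is the PL payload size rather than the size of the IP packet.

```python
import struct

LEN_FIELD = struct.Struct("!H")  # hypothetical 2-byte data-length prefix


def build_probe(data: bytes, payload_size: int) -> bytes:
    """Pad a data block (possibly empty) up to the requested payload size."""
    if len(data) > 0xFFFF:
        raise ValueError("data block too large for the length field")
    needed = LEN_FIELD.size + len(data)
    if needed > payload_size:
        raise ValueError("data block does not fit in the probe")
    padding = bytes(payload_size - needed)  # zero-filled padding
    return LEN_FIELD.pack(len(data)) + data + padding


def strip_padding(payload: bytes) -> bytes:
    """Recover the data block at the receiver, discarding the padding."""
    if len(payload) < LEN_FIELD.size:
        raise ValueError("payload too short")
    (length,) = LEN_FIELD.unpack_from(payload)
    end = LEN_FIELD.size + length
    if end > len(payload):
        raise ValueError("length field exceeds payload")
    return payload[LEN_FIELD.size:end]
```

Calling build_probe with an empty data block yields a probe that carries only the length field and padding, which is analogous to the padding-only probe listed first in the order of preference.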
3.2. Validation of Probe Packet Size | 3.2. Validation of Probe Packet Size | |||
The PL needs a method to determine when probe packets have been | The PL needs a method to determine when probe packets have been | |||
successfully received end-to-end across a network path. | successfully received end-to-end across a network path. | |||
Transport protocols can include end-to-end methods that detect and | Transport protocols can include end-to-end methods that detect and | |||
report reception of specific datagrams that they send (e.g., DCCP and | report reception of specific datagrams that they send (e.g., DCCP and | |||
SCTP provide keep-alive/heartbeat features). When supported, this | SCTP provide keep-alive/heartbeat features). When supported, this | |||
mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of | mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of | |||
a probe packet. | a probe packet. | |||
A PL that does not acknowledge data reception (e.g., UDP and UDP- | A PL that does not acknowledge data reception (e.g., UDP and UDP- | |||
Lite) is unable to detect when the packets that it sends are | Lite) is unable to detect when the packets that it sends are | |||
discarded because their size is greater than the actual PMTU. These | discarded because their size is greater than the actual PMTU. These | |||
PLs need to either rely on an application protocol to detect this | PLs need to either rely on an application protocol to detect this | |||
loss, or make use of an additional transport method such as UDP- | loss, or make use of an additional transport method such as UDP- | |||
Options [I-D.ietf-tsvwg-udp-options]. In addition, they might need | Options [I-D.ietf-tsvwg-udp-options]. In addition, they might need | |||
to send reachability probes (e.g., periodically solicit a response | to send reachability probes (e.g., periodically solicit a response | |||
from the destination) to determine whether the last successfully | from the destination) to determine whether the last successfully | |||
probed PLPMTU is still supported by the network path. | probed PLPMTU is still supported by the network path. | |||
Section 4 specifies this function for a set of IETF-specified | Section 4 specifies this function for a set of IETF-specified | |||
protocols. | protocols. | |||
3.3. Reducing the PLPMTU: Confirming Path Characteristics | 3.3. Reducing the PLPMTU: Confirming Path Characteristics | |||
If the DPLPMTUD method detects that a packet with the PLPMTU size is | If the DPLPMTUD method detects that a packet with the PLPMTU size is | |||
not supported by the network path, then the DPLPMTUD method needs to | not supported by the network path, then the DPLPMTUD method needs to | |||
validate the PLPMTU. This can happen when a validated PTB message is | validate the PLPMTU. This can happen when a validated PTB message is | |||
received, or another event that indicates the network path no longer | received, or another event that indicates the network path no longer | |||
sustains this packet size, such as a loss report from the PL. | sustains this packet size, such as a loss report from the PL. | |||
All implementations of DPLPMTUD are REQUIRED to provide support that | All implementations of DPLPMTUD are REQUIRED to provide support that | |||
reduces the PLPMTU when the actual PMTU supported by a network path | reduces the PLPMTU when the actual PMTU supported by a network path | |||
is less than the PLPMTU. | is less than the PLPMTU. | |||
3.4. Increasing the PLPMTU: Supporting Path Changes | 3.4. Increasing the PLPMTU: Supporting Path Changes | |||
An implementation that only reduces the PLPMTU to a suitable size is | An implementation that only reduces the PLPMTU to a suitable size is | |||
sufficient to ensure reliable operation, but may be very inefficient | sufficient to ensure reliable operation, but may be very inefficient | |||
when the actual PMTU changes or when the method (for whatever reason) | when the actual PMTU changes or when the method (for whatever reason) | |||
makes a suboptimal choice for the PLPMTU. | makes a suboptimal choice for the PLPMTU. | |||
A full implementation of the DPLPMTUD method is RECOMMENDED to | A full implementation of the DPLPMTUD method is RECOMMENDED to | |||
provide a way for the sending PL endpoint to detect when the PLPMTU | provide a way for the sending PL endpoint to detect when the PLPMTU | |||
is smaller than the actual PMTU size. This allows the sender to | is smaller than the actual PMTU size. This allows the sender to | |||
increase the PLPMTU following a change in the characteristics of the | increase the PLPMTU following a change in the characteristics of the | |||
path, such as when a link is reconfigured with a larger MTU, or when | path, such as when a link is reconfigured with a larger MTU, or when | |||
there is a change in the set of links traversed by an end-to-end flow | there is a change in the set of links traversed by an end-to-end flow | |||
(e.g., after a routing or fail-over decision). | (e.g., after a routing or fail-over decision). | |||
3.5. Robustness to inconsistent Path information | 3.5. Robustness to inconsistent Path information | |||
The decision to increase the PLPMTU needs to be robust to the | The decision to increase the PLPMTU needs to be robust to the | |||
possibility that information learned about the path is inconsistent | possibility that information learned about the path is inconsistent | |||
(this could happen when probe packets are lost due to other reasons, | (this could happen when probe packets are lost due to other reasons, | |||
or some of the packets in a flow are forwarded along a portion of the | or some of the packets in a flow are forwarded along a portion of the | |||
path that supports a different PMTU). | path that supports a different PMTU). | |||
Frequent path changes could occur due to unexpected "flapping" - | Frequent path changes could occur due to unexpected "flapping" - | |||
where some packets from a flow pass along one path, but other packets | where some packets from a flow pass along one path, but other packets | |||
follow a different path with different properties. DPLPMTUD can be | follow a different path with different properties. DPLPMTUD can be | |||
made robust to these anomalies by introducing hysteresis into the | made robust to these anomalies by introducing hysteresis into the | |||
decision to increase the Maximum Packet Size. | decision to increase the Maximum Packet Size. | |||
XXX A future revision of this section will recommend | XXX A future revision of this section will recommend | |||
appropriate methods to provide robustness. XXX | appropriate methods to provide robustness. XXX | |||
4. Datagram Packetization Layer PMTUD | 4. Datagram Packetization Layer PMTUD | |||
This section specifies Datagram PLPMTUD (DPLPMTUD). This method can | This section specifies Datagram PLPMTUD (DPLPMTUD). This method can | |||
be introduced at various points in the IP protocol stack, to discover | be introduced at various points in the IP protocol stack, to discover | |||
the PLPMTU so that the application can use an MPS appropriate to the | the PLPMTU so that the application can use an MPS appropriate to the | |||
current network path. | current network path. | |||
[Figure: protocol stack diagram with boxes for APP*, QUIC*, UDPO*, SCTP*, UDP, and Network Interface] | Figure 1: Examples where DPLPMTUD can be implemented [protocol stack diagram with boxes for APP*, QUIC*, UDPO*, SCTP*, UDP, and Network Interface] | |||
The central idea of DPLPMTUD is probing by a sender. Probe packets | The central idea of DPLPMTUD is probing by a sender. Probe packets | |||
of increasing size are sent to find out the maximum size of user | are sent to find out the maximum size of user message that is | |||
message that is completely transferred across the network path from | completely transferred across the network path from the sender to the | |||
the sender to the destination. | destination. | |||
There are various functions performed by the algorithm: | ||||
4.1. PROBE_SEARCH: Probing for a larger PLPMTU | 4.1. PROBE_SEARCH: Probing for a larger PLPMTU | |||
The DPLPMTUD method utilises probe packets to confirm that a packet | The DPLPMTUD method utilises probe packets to confirm that a packet | |||
of size PROBE_SIZE can travere the network path. The PROBE_COUNT is | of size PROBED_SIZE can traverse the network path. The PROBE_COUNT | |||
initialised to zero when a probe packet is first sent with a | is initialised to zero when a probe packet is first sent with a | |||
particular size. | particular size. | |||
A timer is used to trigger the generation of probe packets. The | A timer is used to trigger the generation of probe packets. The | |||
probe_timer is started each time a probe packet is sent to the | probe_timer is started each time a probe packet is sent to the | |||
destination and is cancelled when receipt of the probe packet is | destination and is cancelled when receipt of the probe packet is | |||
acknowledged. THE PROBE_SIZE is confirmed, and this value is then | acknowledged. The PROBED_SIZE is confirmed, and this value is then | |||
assigned to PLPMTU. The DPLPMTUD method may send subsequent probes | assigned to PLPMTU. The DPLPMTUD method may send subsequent probes | |||
of an increasing size. Increasing probes follows a search strategy | of an increasing size. Increasing probes follow a search strategy as | |||
as discussed in Section 4.7. | discussed in Section 4.7. | |||
Each time the probe_timer expires, the PROBE_COUNT is incremented, | Each time the probe_timer expires, the PROBE_COUNT is incremented, | |||
teh probe_timer is reinitialised, and a probe packet of the same size | the probe_timer is reinitialised, and a probe packet of the same size | |||
is retransmitted. | is retransmitted. | |||
The maximum number of retransmissions for a PROBE_SIZE is configured | The maximum number of retransmissions for a PROBED_SIZE is configured | |||
(MAX_PROBES). If the value of the PROBE_COUNT reaches MAX_PROBES, | (MAX_PROBES). If the value of the PROBE_COUNT reaches MAX_PROBES, | |||
probing will stop. | probing will stop and the method enters the PROBE_DONE state. | |||
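The counting rules of Section 4.1 can be sketched as follows. This is a minimal illustration and not part of either draft: the class name ProbeSearch, its method names, and the caller-supplied next_size are assumptions made for the example. Only MAX_PROBES (default 10, Section 4.5), PROBE_COUNT, PROBED_SIZE and PLPMTU come from the text.

```python
from dataclasses import dataclass

MAX_PROBES = 10  # default value given in Section 4.5


@dataclass
class ProbeSearch:
    """Hypothetical holder for the per-path probing variables."""
    plpmtu: int
    probed_size: int
    probe_count: int = 0
    done: bool = False

    def on_probe_acked(self, next_size: int) -> None:
        # The acknowledged PROBED_SIZE is confirmed and assigned to PLPMTU.
        self.plpmtu = self.probed_size
        # A larger size is tried next; PROBE_COUNT restarts at zero for it.
        self.probed_size = next_size
        self.probe_count = 0

    def on_probe_timer_expiry(self) -> bool:
        """Return True if a probe of the same size should be retransmitted."""
        self.probe_count += 1
        if self.probe_count >= MAX_PROBES:
            # PROBE_COUNT has reached MAX_PROBES: probing stops.
            self.done = True
            return False
        return True
```

In this sketch the PLPMTU only changes when a probe is acknowledged, so a run of timer expiries leaves the last confirmed value in place.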
4.2. The PROBE_DONE state | 4.2. The PROBE_DONE state | |||
When the PL sender complete probing for a larger PLPMTU, it enters | When the PL sender completes probing for a larger PLPMTU, it enters | |||
the PROBE_DONE state. This starts the PMTU_RAISE_TIMER. While this | the PROBE_DONE state. This starts the PMTU_RAISE_TIMER. While this | |||
timer is running, the PLPMTU remains at the value set in the last | timer is running, the PLPMTU remains at the value set in the last | |||
successful probe packet. | successful probe packet. | |||
If the PL is designed in a way that is unable to verify reachability | If the PL is designed in a way that is unable to validate | |||
to the destination endpoint after probing has completed, the method | reachability to the destination endpoint after probing has completed, | |||
uses a REACHABILITY_TIMER to periodically repeat a probe packet for | the method uses a REACHABILITY_TIMER to periodically repeat a probe | |||
the current PLPMTU size, while the PMTU_RAISE_TIMER is running. If | packet for the current PLPMTU size, while the PMTU_RAISE_TIMER is | |||
the REACHABILITY_TIMER expires, the method exits the PROBE_DONE | running. If the REACHABILITY_TIMER expires, the method exits the | |||
state. The done state is also exited when a verified PTB message is | PROBE_DONE state. The done state is also exited when a validated PTB | |||
received. | message is received. | |||
If the PMTU_RAISE_TIMER expires, the PL sender also exits the | If the PMTU_RAISE_TIMER expires, the PL sender also exits the | |||
PROBE_DONE state, but in this case resumes probing from the size of | PROBE_DONE state, but in this case resumes probing from the size of | |||
the PLPMTU. | the PLPMTU. | |||
4.3. Verification and Use of PTB Messages | 4.3. Validation and Use of PTB Messages | |||
This section describes processing for both IPv4 ICMP Unreachable | This section describes processing for both IPv4 ICMP Unreachable | |||
messages (type 3) and ICMPv6 packet too big messages. | messages (type 3) and ICMPv6 packet too big messages. | |||
A node that receives a PTB message from a router or middlebox, MUST | A PL that receives a PTB message from a router or middlebox, MUST | |||
verify the PTB message. The node checks the protocol information in | validate the PTB message. The PL checks the protocol information in | |||
the quoted payload to verify that the message originated from the | the quoted payload to validate the message originated from the | |||
sending node. The node also checks that the reported MTU size is | sending node. The node also checks that the reported link MTU size | |||
less than the size used by packet probes. PTB messages are discarded | is less than the size used by packet probes. PTB messages are | |||
if they fail to pass these checks, or where there is insufficient | discarded if they fail to pass these checks, or where there is | |||
ICMP payload to perform these checks. The checks are intended to | insufficient ICMP payload to perform these checks. The checks are | |||
provide protection from packets that originate from a node that is | intended to provide protection from packets that originate from a | |||
not on the network path or a node that attempts to report a larger | node that is not on the network path or a node that attempts to | |||
MTU than the current probe size. | report a larger link MTU than the current probe size. | |||
PTB messages that have been verified can be utilised by the DPLPMTUD | PTB messages that have been validated can be utilised by the DPLPMTUD | |||
algorithm. A method that utilises these PTB messages can improve | algorithm. A method that utilises these PTB messages can improve the | |||
performance compared to one that relies solely on probing. | speed at which the algorithm detects an appropriate PLPMTU | |||
compared to one that relies solely on probing. | ||||
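The checks described in Section 4.3 could be expressed roughly as below. This is a sketch under stated assumptions: the function name validate_ptb, the Flow tuple, and the idea of tracking sent flows in a set are hypothetical and are not defined by the draft. A quoted_flow of None stands for a PTB message with insufficient ICMP payload to perform the checks.

```python
from typing import Optional, Set, Tuple

# Hypothetical flow identifier: (source address, source port,
# destination address, destination port) taken from the quoted payload.
Flow = Tuple[str, int, str, int]


def validate_ptb(quoted_flow: Optional[Flow],
                 reported_mtu: int,
                 sent_flows: Set[Flow],
                 probed_size: int) -> bool:
    """Return True only if the PTB message passes every check."""
    if quoted_flow is None:
        # Insufficient ICMP payload to perform the checks: discard.
        return False
    if quoted_flow not in sent_flows:
        # The quoted packet did not originate from this sender: discard.
        return False
    # The reported MTU has to be less than the size used by probes.
    return reported_mtu < probed_size
```

A message that fails any check is simply ignored in this sketch; how a validated message changes the PLPMTU is left to the rest of the method.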
4.4. Timers | 4.4. Timers | |||
The method in the previous subsections utilises three timers: | The method in the previous subsections utilises three timers: | |||
PROBE_TIMER: Configured to expire after a period longer than the | PROBE_TIMER: Configured to expire after a period longer than the | |||
maximum time to receive an acknowledgment to a probe packet. This | maximum time to receive an acknowledgment to a probe packet. This | |||
value MUST be larger than 1 second, and SHOULD be larger than 15 | value MUST be larger than 1 second, and SHOULD be larger than 15 | |||
seconds. Guidance on selection of the timer value is provided in | seconds. Guidance on selection of the timer value is provided in | |||
section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | |||
If the PL has an RTT estimate and timely acknowledgements, the | If the PL has an RTT estimate and timely acknowledgements, the | |||
PROBE_TIMER can be derived from the PL RTT estimate. | PROBE_TIMER can be derived from the PL RTT estimate. | |||
PMTU_RAISE_TIMER: Configured to the period a sender ought to continue | PMTU_RAISE_TIMER: Configured to the period a sender ought to | |||
to use the current PLPMTU, after which it re-commences probing for a | continue to use the current PLPMTU, after which it re-commences | |||
higher PMTU. This timer has a period of 600 secs, as recommended | probing for a higher PMTU. This timer has a period of 600 secs, | |||
by PLPMTUD [RFC4821]. | as recommended by PLPMTUD [RFC4821]. | |||
REACHABILITY_TIMER: Configured to the period a sender ought to wait | REACHABILITY_TIMER: Configured to the period a sender ought to wait | |||
before confirming the current PLPMTU is still supported. This is | before confirming the current PLPMTU is still supported. This is | |||
less than the PMTU_RAISE_TIMER and used to decrease the PLPMTU | less than the PMTU_RAISE_TIMER and used to decrease the PLPMTU | |||
(e.g. when a black hole is encountered). | (e.g. when a black hole is encountered). | |||
DPLPMTUD ought to suspend reachability probes when no application | DPLPMTUD ought to suspend reachability probes when no application | |||
data has been sent since the previous probe packet. Guidance on | data has been sent since the previous probe packet. Guidance on | |||
selection of the timer value is provided in section 3.1.1 of the | selection of the timer value is provided in section 3.1.1 of the | |||
UDP Usage Guidelines [RFC8085]. DPLPMTUD ought to be suspended or | UDP Usage Guidelines [RFC8085]. DPLPMTUD ought to be suspended or | |||
only sent in conjunction with out traffic during periods of | only sent in conjunction with out traffic during periods of | |||
dormancy. This verification needs to be frequent enough when data | dormancy. This PLPMTU validation needs to be frequent enough when | |||
is flowing that you do not black hole extensive amounts of traffic. | data is flowing that the sending PL does not black hole extensive | |||
amounts of traffic. | ||||
An implementation could implement the various timers using a single | An implementation could implement the various timers using a single | |||
timer process. | timer process. | |||
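One way to drive the three timers from a single timer process is a queue of deadlines ordered by expiry time. The sketch below is illustrative only: the class SingleTimer and its methods are hypothetical, and time is passed in by the caller rather than read from a clock so the behaviour is easy to follow.

```python
import heapq
from typing import List, Tuple


class SingleTimer:
    """Hypothetical single timer process serving several named timers."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, str]] = []

    def start(self, name: str, now: float, period: float) -> None:
        # Restarting a timer replaces any earlier deadline of that name.
        self.cancel(name)
        heapq.heappush(self._heap, (now + period, name))

    def cancel(self, name: str) -> None:
        self._heap = [(t, n) for (t, n) in self._heap if n != name]
        heapq.heapify(self._heap)

    def expired(self, now: float) -> List[str]:
        """Pop and return the names of all timers due at or before now."""
        fired = []
        while self._heap and self._heap[0][0] <= now:
            fired.append(heapq.heappop(self._heap)[1])
        return fired
```

For example, a caller might start "PROBE_TIMER" when a probe is sent, cancel it when the probe is acknowledged, and start "PMTU_RAISE_TIMER" with a period of 600 seconds on entering PROBE_DONE.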
4.5. Constants | 4.5. Constants | |||
The following constants are defined: | The following constants are defined: | |||
MAX_PROBES: The maximum value of the PROBE_COUNT. The default | MAX_PROBES: The maximum value of the PROBE_COUNT. The | |||
value of MAX_PROBES is 10. | default value of MAX_PROBES is 10. | |||
MIN_PMTU: The smallest allowed probe packet size. For IPv6, this | MIN_PMTU: The smallest allowed probe packet size. For IPv6, this | |||
value is 1280 bytes, as specified in [RFC2460]. For IPv4, the | value is 1280 bytes, as specified in [RFC2460]. For IPv4, the | |||
minimum value is 68 bytes. (An IPv4 router is required to be able | minimum value is 68 bytes. (An IPv4 router is required to be able | |||
to forward a datagram of 68 octets without further fragmentation. | to forward a datagram of 68 octets without further fragmentation. | |||
This is the combined size of an IPv4 header and the minimum | This is the combined size of an IPv4 header and the minimum | |||
fragment size of 8 octets.) | fragment size of 8 octets.) | |||
BASE_PMTU: The BASE_PMTU is considered a size that ought to work in | BASE_PMTU: The BASE_PMTU is considered a size that ought to work | |||
most cases. The size is equal to or larger than the minimum | in most cases. The size is equal to or larger than the minimum | |||
permitted and smaller than the maximum allowed. In the case of | permitted and smaller than the maximum allowed. In the case of | |||
IPv6, this value is 1280 bytes [RFC2460]. When using IPv4, a size | IPv6, this value is 1280 bytes [RFC2460]. When using IPv4, a size | |||
of 1200 bytes is RECOMMENDED. | of 1200 bytes is RECOMMENDED. | |||
MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU that is probed. | MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU that is probed. | |||
This has to be less than or equal to the minimum of the local MTU | This has to be less than or equal to the minimum of the local MTU | |||
of the outgoing interface and the destination PLMTU for receiving. | of the outgoing interface and the destination PLMTU for receiving. | |||
An application or PL may reduce this when it knows there is no | An application or PL may reduce this when it knows there is no | |||
need to send packets above a specific size. | need to send packets above a specific size. | |||
The figure below illustrates the relationship between some of these | ||||
variables, in this case when the DPLPMTUD algorithm performs path | ||||
probing to increase the size of the PLPMTU. The MPS is less than the | ||||
PLPMTU. A probe packet has been sent of size PROBED_SIZE. When this | ||||
is acknowledged, the PLPMTU will be raised to PROBED_SIZE allowing | ||||
the PROBED_SIZE to be increased towards the actual PMTU. | ||||
MIN_PMTU PMTU_MAX | ||||
<------------------------------------------------------> | ||||
| | | | | | ||||
V | | | V | ||||
BASE_PMTU V | V Actual PMTU | ||||
MPS | PROBED_SIZE | ||||
V | ||||
PLPMTU | ||||
Figure 2: Relationships between probe and packet sizes | ||||
4.6. Variables | 4.6. Variables | |||
This method utilises a set of variables: | This method utilises a set of variables: | |||
PROBE_TIMER: Configured to expire after a period longer than the | PROBE_TIMER: Configured to expire after a period longer than the | |||
maximum time to receive an acknowledgment to a probe packet. This | maximum time to receive an acknowledgment to a probe packet. This | |||
value MUST be larger than 1 second, and SHOULD be larger than 15 | value MUST be larger than 1 second, and SHOULD be larger than 15 | |||
seconds. Guidance on selection of the timer value is provided in | seconds. Guidance on selection of the timer value is provided in | |||
section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | section 3.1.1 of the UDP Usage Guidelines [RFC8085]. | |||
PL with RTT estimates may use values smaller than 1 second | PL with RTT estimates may use values smaller than 1 second | |||
derived from their RTT estimate to speed up detection of | derived from their RTT estimate to speed up detection of | |||
connectivity issues on the path. | connectivity issues on the path. | |||
PROBED_SIZE: The PROBED_SIZE is the size of the current probe packet. | PROBED_SIZE: The PROBED_SIZE is the size of the current probe | |||
This is a tentative value for the PLPMTU, which is awaiting | packet. This is a tentative value for the PLPMTU, which is | |||
confirmation by an acknowledgment. | awaiting confirmation by an acknowledgment. | |||
PROBE_COUNT: This is a count of the number of unsuccessful probe | PROBE_COUNT: This is a count of the number of unsuccessful probe | |||
packets that have been sent with size PROBED_SIZE. The value is | packets that have been sent with size PROBED_SIZE. The value is | |||
initialised to zero when a particular size of PROBED_SIZE is first | initialised to zero when a particular size of PROBED_SIZE is first | |||
attempted. | attempted. | |||
PTB_SIZE: The PTB_SIZE is the value returned by a verified PTB message | PTB_SIZE: The PTB_SIZE is the value returned by a validated PTB message | |||
indicating the local MTU size of a router along the path. | indicating the local MTU size of a router along the path. | |||
4.7. Selecting PROBED_SIZE | 4.7. Selecting PROBED_SIZE | |||
Implementations discover the search range by validating the minimum | Implementations discover the search range by validating the minimum | |||
path MTU and then using the probe method to select a PROBED_SIZE less | path MTU and then using the probe method to select a PROBED_SIZE less | |||
than or equal to the maximum PMTU_MAX, where PMTU_MAX is the minimum | than or equal to the maximum PMTU_MAX, where PMTU_MAX is the minimum | |||
of the local link MTU and EMTU_R (learned from the remote endpoint). | of the local link MTU and EMTU_R (learned from the remote endpoint). | |||
The PMTU_MAX MAY be constrained by an application that has a maximum | The PMTU_MAX MAY be constrained by an application that has a maximum | |||
to the size of datagrams it wishes to send. | to the size of datagrams it wishes to send. | |||
Implementations use a search algorithm to choose probe sizes within | Implementations use a search algorithm to choose probe sizes within | |||
the search range. | the search range. | |||
xxx A future version of this section will detail example methods for | xxx A future version of this section will detail example methods for | |||
selecting probe size values, but does not plan to mandate a single | selecting probe size values, but does not plan to mandate a single | |||
method. xxx | method. xxx | |||
Implementations MAY optimise the search procedure by selecting step | Implementations MAY optimise the search procedure by selecting step | |||
sizes from a table of common PMTU sizes. | sizes from a table of common PMTU sizes. | |||
Implementations SHOULD select probe sizes to maximise the gain in | Implementations SHOULD select probe sizes to maximise the gain in | |||
PLPMTU each search step. Implementations ought to take into | PLPMTU each search step. Implementations ought to take into | |||
consideration useful probe size steps and a minimum useful gain in | consideration useful probe size steps and a minimum useful gain in | |||
PLPMTU. | PLPMTU. | |||
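Since the draft leaves the search method open, the following is only one possible sketch of table-driven step selection with a minimum useful gain. The function name next_probed_size, the min_gain default of 8 bytes, and the entries in COMMON_SIZES are all assumptions chosen for illustration; none of them is taken from the draft.

```python
import bisect
from typing import List, Optional

# Illustrative table only; a real implementation would choose its own values.
COMMON_SIZES: List[int] = [1280, 1400, 1492, 1500, 4352, 9000]


def next_probed_size(plpmtu: int,
                     pmtu_max: int,
                     table: List[int] = COMMON_SIZES,
                     min_gain: int = 8) -> Optional[int]:
    """Pick the next PROBED_SIZE, or None when no useful gain remains.

    The table is assumed to be sorted in increasing order.
    """
    if pmtu_max - plpmtu < min_gain:
        # Even probing PMTU_MAX would gain less than the useful minimum.
        return None
    # First table entry that improves on the PLPMTU by at least min_gain.
    i = bisect.bisect_left(table, plpmtu + min_gain)
    if i < len(table) and table[i] <= pmtu_max:
        return table[i]
    # No suitable table entry within range: probe the upper bound itself.
    return pmtu_max
```

Here pmtu_max stands for the upper bound described above, that is, the minimum of the local link MTU and EMTU_R, possibly reduced by the application.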
4.8. Black Hole Detection | 4.8. Simple Black Hole Detection | |||
The DPLPMTUD method can be used to detect paths that fail to support | The DPLPMTUD method can be used to provide black hole detection. | |||
a packet size, but return no PTB message. The black hole detection | This enables a reduction of the PLPMTU when a PL sender encounters a | |||
function detects such cases and responds by reducing the PLPMTU, | path that fails to support the current MPS and also fails to return a | |||
allowing the endpoint to inform the application of the reduced MPS | PTB message to the sender. | |||
and accordingly send smaller packets. Black Hole detection is | ||||
triggered by the reachability function. | ||||
4.9. State Machine | The simple method starts by setting the PLPMTU to the BASE_PMTU. | |||
When the method detects that communication is not possible with this | ||||
size of packet, the PLPMTU is reduced, until an operable message size | ||||
is reached or the PLPMTU reaches the BASE_MTU size. The method | ||||
enables a sending PL to inform an application of the reduced MPS and | ||||
accordingly send smaller packets. | ||||
A state machine for DPLPMTUD is depicted in Figure 2. If multihoming | The simple black hole detection method does not seek to increase the | |||
is supported, a state machine is needed for each active path. | PLPMTU. This makes it vulnerable to transient reductions in the | |||
actual PLPMTU, which could result in a PLPMTU lower than the actual | ||||
PMTU. | ||||
PROBE_TIMER expiry | The full method is specified in Section 4.9. | |||
(PROBE_COUNT = MAX_PROBES) | ||||
+-------------+ +--------------+ | 4.8.1. Simple Black Hole Detection State Machine | |||
=->| PROBE_START |--------------->|PROBE_DISABLED| | ||||
PROBE_TIMER expiry | +-------------+ +--------------+ | The PL sender starts with the PLPMTU and PROBED_SIZE set to the | |||
(PROBE_COUNT = | | | | BASE_PMTU. | |||
MAX_PROBES) ------- | Connectivity confirmed | ||||
v | While a PL has a PLPMTU greater than the BASE_MTU, the PL needs to | |||
----------- +------------+ -- PROBE_TIMER expiry | send probe packets at the PROBED_SIZE to revalidate the PLPMTU. | |||
MAX_PMTU acked or | | PROBE_BASE | | (PROBE_COUNT < | Black hole detection is also triggered by lack of reachability at the | |||
PTB (>= BASE_PMTU)| -----> +------------+ <- MAX_PROBES) | PL. When the PL sender detects that multiple transmissions of | |||
---------------- | /\ | | | packets of PROBED_SIZE are no longer being acknowledged (e.g., When | |||
| | | | | PTB | the number of probe packets sent without receiving an acknowledgement | |||
| PMTU_RAISE_TIMER| | | | (PTB_SIZE < BASE_PMTU) | (PROBE_COUNT) becomes greater than the MAX_PROBES), the PL concludes | |||
| or reachability | | | | or | that it has detected a black hole and reduces PLPMTU. | |||
| (PROBE_COUNT | | | | PROBE_TIMER expiry | ||||
| = MAX_PROBES) | | | | (PROBE_COUNT = MAX_PROBES) | The connectivity check may be performed by the protocol | |||
| ------------- | | \ | implementing the PL (as in PLPMTUD for TCP [RFC4821]). When the | |||
| | PTB | | \ | application using the PL does not regularly send packets of size | |||
| | (< PROBED_SIZE)| | \ | PROBED_SIZE, additional probe packets need to be sent by PL using the | |||
| | | | ---------------- | reachability timer (Section 4.4). | |||
| | | | | | ||||
| | | | Probe | | If the method reduces the PLPMTU to the MIN_PMTU, the method | |||
| | | | acked | | concludes the path does not support the MIN_PMTU. | |||
v | | v v | ||||
+------------+ +--------------+ Probe +-------------+ | If multihoming is supported, a state machine is needed for each | |||
| PROBE_DONE |<-------------- | PROBE_SEARCH |<-------| PROBE_ERROR | | active path. | |||
+------------+ MAX_PMTU acked +--------------+ acked +-------------+ | ||||
/\ | or /\ | | The state machine for a simple black hole detection mechanism is | |||
| | PROBE_TIMER expiry | | | depicted in Figure 3. | |||
| |(PROBE_COUNT = MAX_PROBES) | | | ||||
| | | | | XXX a future version of the simple black hole detection state machine | |||
------ -------- | might consider icmp PTB messages XXX | |||
Reachability probe acked PROBE_TIMER expiry | +------------+ | |||
or PROBE_TIMER expiry (PROBE_COUNT < MAX_PROBES) | | PROBE_START| | |||
(PROBE_COUNT < MAX_PROBES) or | +-----+------+ | |||
Probe acked | | Connectivity confirmed | |||
| (reachability tests start) | ||||
PROBE_COUNT >= V | ||||
MAX_PROBES +------------+ | ||||
+---------------| PROBE_BASE +->-+ | ||||
| +-----+------+ | | ||||
| | ^ | PROBE_COUNT < MAX_PROBES | ||||
| | +-----+ | ||||
| V | ||||
| | PROBE_ACK | ||||
| PROBE_COUNT | | ||||
| = MAX_PROBES +------------+ | ||||
| (reduce +-<-+ PROBE_DONE +->-+ | ||||
| PLPMTU) | +------+-----+ | | ||||
| | ^ | ^ | PROBE_COUNT < MAX_PROBES | ||||
| | | | | | (Continue probing) | ||||
| +-----+ | +-----+ | ||||
V V | ||||
+------------+ | | ||||
| PROBE_ERROR|<------------+ | ||||
+------------+ | ||||
Figure 3: State machine for detecting black holes | ||||
4.9. Full State Machine | ||||
A full state machine for DPLPMTUD is depicted in Figure 4. If | ||||
multihoming is supported, a state machine is needed for each active | ||||
path. | ||||
PROBE_TIMER expiry | ||||
(PROBE_COUNT = MAX_PROBES) | ||||
+-------------+ +--------------+ | ||||
+->| PROBE_START +--------------->|PROBE_DISABLED| | ||||
PROBE_TIMER expiry | +--+-------+--+ +--------------+ | ||||
(PROBE_COUNT = | | | | ||||
MAX_PROBES) +-----+ | Connectivity confirmed | ||||
v | ||||
+---------- +------------+ -+ PROBE_TIMER expiry | ||||
MAX_PMTU acked or | | PROBE_BASE | | (PROBE_COUNT < | ||||
PTB (>= BASE_PMTU)| +----> +--------+---+ <+ MAX_PROBES) | ||||
+---------------+ | /\ | | | ||||
| | | | | PTB | ||||
| PMTU_RAISE_TIMER| | | | (PTB_SIZE < BASE_PMTU) | ||||
| or reachability | | | | or | ||||
| (PROBE_COUNT | | | | PROBE_TIMER expiry | ||||
| = MAX_PROBES) | | | | (PROBE_COUNT = MAX_PROBES) | ||||
| +-----------+ | | \ | ||||
| | PTB | | \ | ||||
| | (< PROBED_SIZE)| | \ | ||||
| | | | ---------------+ | ||||
| | | | | | ||||
| | | | Probe | | ||||
| | | | acked | | ||||
v | | v v | ||||
+----------+-+ +----+---------+ Probe +-------------+ | ||||
| PROBE_DONE |<-------------- | PROBE_SEARCH |<-------| PROBE_ERROR | | ||||
+------+-----+ MAX_PMTU acked +------------+-+ acked +-------------+ | ||||
/\ | or /\ | | ||||
| | PROBE_TIMER expiry | | | ||||
| |(PROBE_COUNT = MAX_PROBES) | | | ||||
| | | | | ||||
+----+ +------+ | ||||
Reachability probe acked PROBE_TIMER expiry | ||||
or PROBE_TIMER expiry (PROBE_COUNT < MAX_PROBES) | ||||
(PROBE_COUNT < MAX_PROBES) or | ||||
Probe acked | ||||
Figure 4: State machine for Datagram PLPMTUD | ||||
XXX A future version of this document will update the state machine | XXX A future version of this document will update the state machine | |||
to describe handling of validated PTB messages. XXX | to describe handling of validated PTB messages. XXX | |||
The following states are defined to reflect the probing process: | The following states are defined to reflect the probing process: | |||
PROBE_START: The PROBE_START state is the initial state before | PROBE_START: The PROBE_START state is the initial state before | |||
probing has started. PLPMTUD is not performed in this state. The | probing has started. PLPMTUD is not performed in this state. The | |||
state transitions to PROBE_BASE, when a path has been confirmed, | state transitions to PROBE_BASE, when a path has been confirmed, | |||
i.e. when a sent packet has been acknowledged on this path. Any | i.e. when a sent packet has been acknowledged on this path. Any | |||
transport method may be used to exit PROBE_BASE as long as the | transport method may be used to exit PROBE_BASE as long as the | |||
sent packet is acknowledged by the other side. The PLPMTU is set | sent packet is acknowledged by the other side. The PLPMTU is set | |||
to the BASE_PMTU size. Probing ought to start immediately after | to the BASE_PMTU size. Probing ought to start immediately after | |||
connection setup to prevent the loss of user data. | connection setup to prevent the loss of user data. | |||
PROBE_BASE: The PROBE_BASE state is the starting point for probing | PROBE_BASE: The PROBE_BASE state is the starting point for probing | |||
with datagram PLPMTUD. It is used to confirm whether the BASE_PMTU | with datagram PLPMTUD. It is used to confirm whether the | |||
size is supported by the network path. On entry, the PROBED_SIZE | BASE_PMTU size is supported by the network path. On entry, the | |||
is set to the BASE_PMTU size and the PROBE_COUNT is set to zero. | PROBED_SIZE is set to the BASE_PMTU size and the PROBE_COUNT is | |||
A probe packet is sent, and the PROBE_TIMER is started. The state | set to zero. A probe packet is sent, and the PROBE_TIMER is | |||
is left when the PROBE_COUNT reaches MAX_PROBES; a PTB message is | started. The state is left when the PROBE_COUNT reaches | |||
verified, or a probe packet is acknowledged. | MAX_PROBES; a PTB message is validated, or a probe packet is | |||
acknowledged. | ||||
PROBE_SEARCH: The PROBE_SEARCH state is the main probing state. This | PROBE_SEARCH: The PROBE_SEARCH state is the main probing state. | |||
state is entered either when probing for the BASE_PMTU was | This state is entered either when probing for the BASE_PMTU was | |||
successful or when there is a successful reachability test in the | successful or when there is a successful reachability test in the | |||
PROBE_ERROR state. On entry, the PLPMTU is set to the last | PROBE_ERROR state. On entry, the PLPMTU is set to the last | |||
acknowledged PROBED_SIZE. | acknowledged PROBED_SIZE. | |||
The PROBE_COUNT is set to zero when the first probe packet is sent | The PROBE_COUNT is set to zero when the first probe packet is sent | |||
for each probe size. Each time a probe packet is acknowledged, | for each probe size. Each time a probe packet is acknowledged, | |||
the PLPMTU is set to the PROBED_SIZE, and then the PROBED_SIZE is | the PLPMTU is set to the PROBED_SIZE, and then the PROBED_SIZE is | |||
increased. | increased. | |||
When a probe packet is sent and not acknowledged within the period | When a probe packet is sent and not acknowledged within the period | |||
of the PROBE_TIMER, the PROBE_COUNT is incremented and the probe | of the PROBE_TIMER, the PROBE_COUNT is incremented and the probe | |||
packet is retransmitted. The state is exited when the PROBE_COUNT | packet is retransmitted. The state is exited when the PROBE_COUNT | |||
reaches MAX_PROBES; a PTB message is verified; or a probe of size | reaches MAX_PROBES; a PTB message is validated; or a probe of size | |||
PMTU_MAX is acknowledged. | PMTU_MAX is acknowledged. | |||
PROBE_ERROR: The PROBE_ERROR state represents the case where the | PROBE_ERROR: The PROBE_ERROR state represents the case where the | |||
network path is not known to support a PLPMTU of at least the | network path is not known to support a PLPMTU of at least the | |||
BASE_PMTU size. It is entered when either a probe of size | BASE_PMTU size. It is entered when either a probe of size | |||
BASE_PMTU has not been acknowledged or a verified PTB message | BASE_PMTU has not been acknowledged or a validated PTB message | |||
indicates a smaller link MTU than the BASE_PMTU. On entry, the | indicates a smaller link MTU than the BASE_PMTU. On entry, the | |||
PROBE_COUNT is set to zero and the PROBED_SIZE is set to the | PROBE_COUNT is set to zero and the PROBED_SIZE is set to the | |||
MIN_PMTU size, and the PLPMTU is reset to MIN_PMTU size. In this | MIN_PMTU size, and the PLPMTU is reset to MIN_PMTU size. In this | |||
state, a probe packet is sent, and the PROBE_TIMER is started. | state, a probe packet is sent, and the PROBE_TIMER is started. | |||
The state transitions to the PROBE_SEARCH state when a probe | The state transitions to the PROBE_SEARCH state when a probe | |||
packet is acknowledged. | packet is acknowledged. | |||
PROBE_DONE: The PROBE_DONE state indicates a successful end to a | PROBE_DONE: The PROBE_DONE state indicates a successful end to a | |||
probing phase. DPLPMTUD remains in this state until either the | probing phase. DPLPMTUD remains in this state until either the | |||
PMTU_RAISE_TIMER expires or a received PTB message is verified. | PMTU_RAISE_TIMER expires or a received PTB message is validated. | |||
When PLPMTUD uses an unacknowledged PL and is in the PROBE_DONE | When PLPMTUD uses an unacknowledged PL and is in the PROBE_DONE | |||
state, a REACHABILITY_TIMER periodically resets the PROBE_COUNT | state, a REACHABILITY_TIMER periodically resets the PROBE_COUNT | |||
and schedules a probe packet with the size of the PLPMTU. If the | and schedules a probe packet with the size of the PLPMTU. If the | |||
probe packet fails to be acknowledged after MAX_PROBES attempts, | probe packet fails to be acknowledged after MAX_PROBES attempts, | |||
the method enters the PROBE_BASE state. When used with an | the method enters the PROBE_BASE state. When used with an | |||
acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to | acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to | |||
probe in this state. | probe in this state. | |||
PROBE_DISABLED: The PROBE_DISABLED state indicates that connectivity | PROBE_DISABLED: The PROBE_DISABLED state indicates that connectivity | |||
could not be established. DPLPMTUD MUST NOT probe in this state. | could not be established. DPLPMTUD MUST NOT probe in this state. | |||
Appendix Appendix A contains an informative description of key | Appendix A contains an informative description of key events. | |||
events. | ||||
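The state descriptions above can be summarised as a transition table. The sketch below is an informal reading of that text, not a normative rendering of the figure: the Event names are invented for the example, and transitions driven by the PMTU_RAISE_TIMER or by validated PTB messages in the PROBE_SEARCH and PROBE_DONE states are deliberately left out, since the draft notes their handling is still to be specified.

```python
from enum import Enum, auto


class State(Enum):
    PROBE_START = auto()
    PROBE_BASE = auto()
    PROBE_SEARCH = auto()
    PROBE_ERROR = auto()
    PROBE_DONE = auto()
    PROBE_DISABLED = auto()


class Event(Enum):
    # Hypothetical event names chosen for this sketch.
    CONNECTIVITY_CONFIRMED = auto()
    PROBE_ACKED = auto()
    MAX_PROBES_REACHED = auto()   # PROBE_COUNT reached MAX_PROBES
    PTB_BELOW_BASE = auto()       # validated PTB with MTU < BASE_PMTU
    PMTU_MAX_ACKED = auto()


TRANSITIONS = {
    (State.PROBE_START, Event.CONNECTIVITY_CONFIRMED): State.PROBE_BASE,
    (State.PROBE_START, Event.MAX_PROBES_REACHED): State.PROBE_DISABLED,
    (State.PROBE_BASE, Event.PROBE_ACKED): State.PROBE_SEARCH,
    (State.PROBE_BASE, Event.MAX_PROBES_REACHED): State.PROBE_ERROR,
    (State.PROBE_BASE, Event.PTB_BELOW_BASE): State.PROBE_ERROR,
    (State.PROBE_ERROR, Event.PROBE_ACKED): State.PROBE_SEARCH,
    (State.PROBE_SEARCH, Event.MAX_PROBES_REACHED): State.PROBE_DONE,
    (State.PROBE_SEARCH, Event.PMTU_MAX_ACKED): State.PROBE_DONE,
    # Reachability probes unanswered after MAX_PROBES attempts.
    (State.PROBE_DONE, Event.MAX_PROBES_REACHED): State.PROBE_BASE,
}


def step(state: State, event: Event) -> State:
    """Return the next state; unlisted pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

If multihoming is supported, one such state value would be kept per active path.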
5. Specification of Protocol-Specific Methods | 5. Specification of Protocol-Specific Methods | |||
This section specifies protocol-specific details for datagram PLPMTUD | This section specifies protocol-specific details for datagram PLPMTUD | |||
for IETF-specified transports. | for IETF-specified transports. | |||
The first subsection provides guidance on how to implement the | The first subsection provides guidance on how to implement the | |||
DPLPMTUD method as a part of an application using UDP or UDP-Lite. | DPLPMTUD method as a part of an application using UDP or UDP-Lite. | |||
The guidance also applies to other datagram services that do not | The guidance also applies to other datagram services that do not | |||
include a specific transport protocol (such as a tunnel | include a specific transport protocol (such as a tunnel | |||
encapsulation). The following subsections describe how DPLPMTUD can be | encapsulation). The following subsections describe how DPLPMTUD can | |||
implemented as a part of the transport service, allowing applications | be implemented as a part of the transport service, allowing | |||
using the service to benefit from discovery of the PLPMTU without | applications using the service to benefit from discovery of the | |||
themselves needing to implement this method. | PLPMTU without themselves needing to implement this method. | |||
5.1. Application support for DPLPMTUD with UDP or UDP-Lite | 5.1. Application support for DPLPMTUD with UDP or UDP-Lite | |||
The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do | The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do | |||
not define a method in the RFC-series that supports PLPMTUD. In | not define a method in the RFC-series that supports PLPMTUD. In | |||
particular, the UDP transport does not provide the transport layer | particular, the UDP transport does not provide the transport layer | |||
features needed to implement datagram PLPMTUD. | features needed to implement datagram PLPMTUD. | |||
The DPLPMTUD method can be implemented as a part of an application | The DPLPMTUD method can be implemented as a part of an application | |||
built directly or indirectly on UDP or UDP-Lite, but relies on | built directly or indirectly on UDP or UDP-Lite, but relies on | |||
higher-layer protocol features to implement the method [RFC8085]. | higher-layer protocol features to implement the method [RFC8085]. | |||
Some primitives used by DPLPMTUD might not be available via the | Some primitives used by DPLPMTUD might not be available via the | |||
Datagram API (e.g., the ability to access the PLPMTU cache, or | Datagram API (e.g., the ability to access the PLPMTU cache, or | |||
interpret received ICMP PTB messages). | interpret received ICMP PTB messages). | |||
skipping to change at page 21, line 28 ¶ | skipping to change at page 24, line 41 ¶ | |||
5.1.4. Validating the Path | 5.1.4. Validating the Path | |||
An application that does not have other higher-layer information | An application that does not have other higher-layer information | |||
confirming correct delivery of datagrams SHOULD implement the | confirming correct delivery of datagrams SHOULD implement the | |||
REACHABILITY_TIMER to periodically send probe packets while in the | REACHABILITY_TIMER to periodically send probe packets while in the | |||
PROBE_DONE state. | PROBE_DONE state. | |||
5.1.5. Handling of PTB Messages | 5.1.5. Handling of PTB Messages | |||
An application that is able and wishes to receive PTB messages MUST | An application that is able and wishes to receive PTB messages MUST | |||
perform ICMP verification as specified in Section 5.2 of [RFC8085]. | perform ICMP validation as specified in Section 5.2 of [RFC8085]. | |||
This requires that the application verifies each received PTB | This requires the application to check each received PTB | |||
message to verify that it is received in response to transmitted | message to validate that it is received in response to transmitted | |||
traffic and that the reported link MTU is less than the current probe | traffic and that the reported link MTU is less than the current probe | |||
size. A verified PTB message MAY be used as input to the DPLPMTUD | size. A validated PTB message MAY be used as input to the DPLPMTUD | |||
algorithm, but MUST NOT be used directly to set the PLPMTU. | algorithm, but MUST NOT be used directly to set the PLPMTU. | |||
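The two checks described above can be sketched as follows. This is a minimal illustration, not part of the specification: the function name, the use of an opaque `quoted_id` to match a PTB message against transmitted datagrams, and the `sent_ids` set are all hypothetical, and how an application obtains the quoted packet from the ICMP message is platform-specific.

```python
from typing import Optional, Set


def candidate_mtu_from_ptb(quoted_id: bytes,
                           reported_mtu: int,
                           sent_ids: Set[bytes],
                           probe_size: int) -> Optional[int]:
    """Return a value to feed into the DPLPMTUD algorithm, or None to ignore the PTB.

    quoted_id and sent_ids are hypothetical: they stand for whatever the
    application uses to match the quoted packet against datagrams it sent.
    """
    # Check 1: the PTB must be a response to traffic this sender transmitted.
    if quoted_id not in sent_ids:
        return None
    # Check 2: the reported link MTU must be less than the current probe size.
    if reported_mtu >= probe_size:
        return None
    # The result is only an input to the search algorithm; the caller
    # must not assign it to the PLPMTU directly.
    return reported_mtu
```

A caller would pass the returned value to its search logic as a hint, and would still confirm the size with a probe before using it.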
5.2. DPLPMTUD with UDP Options | 5.2. DPLPMTUD with UDP Options | |||
UDP-Options [I-D.ietf-tsvwg-udp-options] can supply the additional | UDP-Options [I-D.ietf-tsvwg-udp-options] can supply the additional | |||
functionality required to implement DPLPMTUD within the UDP transport | functionality required to implement DPLPMTUD within the UDP transport | |||
service. This avoids the need for applications to implement the | service. This avoids the need for applications to implement the | |||
DPLPMTUD method. | DPLPMTUD method. | |||
This enables padding to be added to UDP datagrams and can be used to | This enables padding to be added to UDP datagrams and can be used to | |||
provide feedback acknowledgement of received probe packets. | provide feedback acknowledgement of received probe packets. | |||
The specification also defines two UDP Options to support DPLPMTUD. | The specification also defines two UDP Options to support DPLPMTUD. | |||
Section 5.6 of [I-D.ietf-tsvwg-udp-options] defines the MSS option | Section 5.6 of [I-D.ietf-tsvwg-udp-options] defines the MSS option | |||
which allows the local sender to indicate the EMTU_R to the peer. | which allows the local sender to indicate the EMTU_R to the peer. | |||
This option can be used to initialise PMTU_MAX. An application | This option can be used to initialise PMTU_MAX. An application | |||
wishing to avoid the effects of MSS-Clamping (where a middlebox | wishing to avoid the effects of MSS-Clamping (where a middlebox | |||
changes the advertised TCP maximum segment size) ought to use a | changes the advertised TCP maximum segment size) ought to use a | |||
cryptographic method to encrypt this parameter. | cryptographic method to encrypt this parameter. | |||
5.2.1. UDP Request Option | 5.2.1. UDP Request Option | |||
The Request Option allows a sending endpoint to solicit a response | The Request Option allows a sending endpoint to solicit a response | |||
from a destination endpoint. | from a destination endpoint. | |||
The Request Option carries a four byte token set by the sender. This | The Request Option carries a four byte token set by the sender. This | |||
token can be set to a value that is likely to be known only to the | token can be set to a value that is likely to be known only to the | |||
sender (and becomes known to nodes along the end-to-end path). The | sender (and becomes known to nodes along the end-to-end path). The | |||
sender can then check the value returned in the response to provide | sender can then check the value returned in the response to provide | |||
additional protection from off-path insertion of data [RFC8085]. | additional protection from off-path insertion of data [RFC8085]. | |||
+---------+--------+-----------------+ | +---------+--------+-----------------+ | |||
| Kind=9 | Len=6 | Token | | | Kind=9 | Len=6 | Token | | |||
+---------+--------+-----------------+ | +---------+--------+-----------------+ | |||
1 byte 1 byte 4 bytes | 1 byte 1 byte 4 bytes | |||
Figure 5: UDP REQ Option Format | ||||
5.2.2. UDP Response Option | 5.2.2. UDP Response Option | |||
The Response Option is generated by the PL in response to reception | The Response Option is generated by the PL in response to reception | |||
of a previously received Echo Request. The Token field associates | of a previously received Echo Request. The Token field associates | |||
the response with the Token value carried in the most | the response with the Token value carried in the most | |||
recently-received Echo Request. The generation of UDP packets | recently-received Echo Request. The generation of UDP packets | |||
carrying a Response Option MAY be rate-limited. | carrying a Response Option MAY be rate-limited. | |||
+---------+--------+-----------------+ | +---------+--------+-----------------+ | |||
| Kind=10 | Len=6 | Token | | | Kind=10 | Len=6 | Token | | |||
+---------+--------+-----------------+ | +---------+--------+-----------------+ | |||
1 byte 1 byte 4 bytes | 1 byte 1 byte 4 bytes | |||
Figure 6: UDP RES Option Format | ||||
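The option layouts in Figures 5 and 6 can be illustrated with a short encoder and decoder. This is a sketch only: the Kind values 9 and 10 and the length 6 are taken from the figures above, while the function names and the use of a random token are assumptions made for illustration. How the option is placed in the UDP options area is outside the scope of this example.

```python
import secrets
import struct

KIND_REQ = 9     # Kind value shown in Figure 5
KIND_RES = 10    # Kind value shown in Figure 6
OPTION_LEN = 6   # 1 byte Kind + 1 byte Len + 4 bytes Token


def new_token() -> bytes:
    """Pick a 4-byte token that is hard for an off-path node to guess."""
    return secrets.token_bytes(4)


def encode_option(kind: int, token: bytes) -> bytes:
    """Serialise Kind, Len and Token in the order shown in the figures."""
    if len(token) != 4:
        raise ValueError("token must be exactly 4 bytes")
    return struct.pack("!BB4s", kind, OPTION_LEN, token)


def decode_option(data: bytes):
    """Parse one option and return (kind, token)."""
    kind, length, token = struct.unpack("!BB4s", data[:OPTION_LEN])
    if length != OPTION_LEN:
        raise ValueError("unexpected option length")
    return kind, token


def response_matches(request_token: bytes, response: bytes) -> bool:
    """True if the option is a Response that echoes the token we sent."""
    kind, token = decode_option(response)
    return kind == KIND_RES and secrets.compare_digest(token, request_token)
```

A receiver would decode the Request Option and call `encode_option(KIND_RES, token)` with the same token; the sender then uses `response_matches` to confirm that the probe was acknowledged.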
5.3. DPLPMTUD for SCTP | 5.3. DPLPMTUD for SCTP | |||
Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing | Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing | |||
method for SCTP. It recommends the use of the PAD chunk, defined in | method for SCTP. It recommends the use of the PAD chunk, defined in | |||
[RFC4820] to be attached to a minimum length HEARTBEAT chunk to build | [RFC4820] to be attached to a minimum length HEARTBEAT chunk to build | |||
a probe packet. This enables probing without affecting the transfer | a probe packet. This enables probing without affecting the transfer | |||
of user messages and without interfering with congestion control. | of user messages and without interfering with congestion control. | |||
This is preferred to using DATA chunks (with padding as required) as | This is preferred to using DATA chunks (with padding as required) as | |||
path probes. | path probes. | |||
XXX Future versions of this document might define a parameter | XXX Future versions of this document might define a parameter | |||
contained in the INIT and INIT ACK chunk to indicate the remote peer | contained in the INIT and INIT ACK chunk to indicate the remote peer | |||
MTU to the local peer. However, multihoming makes this a bit | MTU to the local peer. However, multihoming makes this a bit | |||
complex, so it might not be worth doing. XXX | complex, so it might not be worth doing. XXX | |||
5.3.1. SCTP/IPv4 and SCTP/IPv6 | 5.3.1. SCTP/IPv4 and SCTP/IPv6 | |||
The base protocol is specified in [RFC4960]. This provides an | The base protocol is specified in [RFC4960]. This provides an | |||
acknowledged PL. A sender can therefore enter the PROBE_BASE state as | acknowledged PL. A sender can therefore enter the PROBE_BASE state | |||
soon as connectivity has been confirmed. | as soon as connectivity has been confirmed. | |||
5.3.1.1. Sending SCTP Probe Packets | 5.3.1.1. Sending SCTP Probe Packets | |||
Probe packets consist of an SCTP common header followed by a | Probe packets consist of an SCTP common header followed by a | |||
HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control | HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control | |||
the length of the probe packet. The HEARTBEAT chunk is used to | the length of the probe packet. The HEARTBEAT chunk is used to | |||
trigger the sending of a HEARTBEAT ACK chunk. The reception of the | trigger the sending of a HEARTBEAT ACK chunk. The reception of the | |||
HEARTBEAT ACK chunk acknowledges reception of a successful probe. | HEARTBEAT ACK chunk acknowledges reception of a successful probe. | |||
The HEARTBEAT chunk carries a Heartbeat Information parameter which | The HEARTBEAT chunk carries a Heartbeat Information parameter which | |||
should include, besides the information suggested in [RFC4960], the | should include, besides the information suggested in [RFC4960], the | |||
probe size, which is the size of the complete datagram. The size of | probe size, which is the size of the complete datagram. The size of | |||
the PAD chunk is therefore computed by reducing the probing size by | the PAD chunk is therefore computed by reducing the probing size by | |||
skipping to change at page 23, line 32 ¶ | skipping to change at page 27, line 12 ¶ | |||
number of PMTU sizes probed. The Heartbeat timer can be used to | number of PMTU sizes probed. The Heartbeat timer can be used to | |||
implement the PROBE_TIMER. | implement the PROBE_TIMER. | |||
5.3.1.2. Validating the Path with SCTP | 5.3.1.2. Validating the Path with SCTP | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT | Since SCTP provides an acknowledged PL, a sender MUST NOT | |||
implement the REACHABILITY_TIMER while in the PROBE_DONE state. | implement the REACHABILITY_TIMER while in the PROBE_DONE state. | |||
5.3.1.3. PTB Message Handling by SCTP | 5.3.1.3. PTB Message Handling by SCTP | |||
Normal ICMP verification MUST be performed as specified in Appendix C | Normal ICMP validation MUST be performed as specified in Appendix C | |||
of [RFC4960]. This requires that the first 8 bytes of the SCTP | of [RFC4960]. This requires that the first 8 bytes of the SCTP | |||
common header are quoted in the payload of the PTB message, which can | common header are quoted in the payload of the PTB message, which can | |||
be the case for ICMPv4 and is normally the case for ICMPv6. | be the case for ICMPv4 and is normally the case for ICMPv6. | |||
When a PTB message has been verified, the router Link MTU indicated | When a PTB message has been validated, the router Link MTU indicated | |||
in the PTB message SHOULD be used with the DPLPMTUD algorithm, | in the PTB message SHOULD be used with the DPLPMTUD algorithm, | |||
providing that the reported Link MTU is less than the current probe | providing that the reported Link MTU is less than the current probe | |||
size. | size. | |||
5.3.2. DPLPMTUD for SCTP/UDP | 5.3.2. DPLPMTUD for SCTP/UDP | |||
The UDP encapsulation of SCTP is specified in [RFC6951]. | The UDP encapsulation of SCTP is specified in [RFC6951]. | |||
5.3.2.1. Sending SCTP/UDP Probe Packets | 5.3.2.1. Sending SCTP/UDP Probe Packets | |||
Packet probing can be performed as specified in Section 5.3.1.1. The | Packet probing can be performed as specified in Section 5.3.1.1. The | |||
maximum payload is reduced by 8 bytes, which has to be considered | maximum payload is reduced by 8 bytes, which has to be considered | |||
when filling the PAD chunk. | when filling the PAD chunk. | |||
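The effect of the 8-byte UDP header on the PAD chunk can be illustrated with the arithmetic below. This is a sketch under stated assumptions, not the computation given in Section 5.3.1.1: it assumes an IPv4 header without options (20 bytes) or an IPv6 header without extension headers (40 bytes), the 12-byte SCTP common header, and a HEARTBEAT chunk whose total length is supplied by the caller. The function name and parameters are hypothetical, and chunk alignment to 4-byte boundaries is not handled.

```python
IPV4_HEADER = 20         # assumption: no IPv4 options
IPV6_HEADER = 40         # assumption: no IPv6 extension headers
UDP_HEADER = 8           # the 8-byte reduction for SCTP/UDP
SCTP_COMMON_HEADER = 12
PAD_CHUNK_HEADER = 4     # type, flags and length fields of the PAD chunk


def pad_chunk_size(probe_size: int,
                   heartbeat_chunk_len: int,
                   ipv6: bool = False,
                   udp_encaps: bool = False) -> int:
    """Return the size of the PAD chunk (including its 4-byte header)
    needed to bring the whole datagram up to probe_size bytes."""
    overhead = IPV6_HEADER if ipv6 else IPV4_HEADER
    if udp_encaps:
        overhead += UDP_HEADER
    overhead += SCTP_COMMON_HEADER + heartbeat_chunk_len
    remaining = probe_size - overhead
    if remaining < PAD_CHUNK_HEADER:
        raise ValueError("probe size too small to hold a PAD chunk")
    return remaining
```

For the same probe size, the PAD chunk for SCTP/UDP is 8 bytes smaller than for SCTP sent directly over IP.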
5.3.2.2. Validating the Path with SCTP/UDP | 5.3.2.2. Validating the Path with SCTP/UDP | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT | Since SCTP provides an acknowledged PL, a sender MUST NOT | |||
implement the REACHABILITY_TIMER while in the PROBE_DONE state. | implement the REACHABILITY_TIMER while in the PROBE_DONE state. | |||
5.3.2.3. Handling of PTB Messages by SCTP/UDP | 5.3.2.3. Handling of PTB Messages by SCTP/UDP | |||
Normal ICMP verification MUST be performed for PTB messages as | Normal ICMP validation MUST be performed for PTB messages as | |||
specified in Appendix C of [RFC4960]. This requires that the first 8 | specified in Appendix C of [RFC4960]. This requires that the first 8 | |||
bytes of the SCTP common header are contained in the PTB message, | bytes of the SCTP common header are contained in the PTB message, | |||
which can be the case for ICMPv4 (but note the UDP header also | which can be the case for ICMPv4 (but note the UDP header also | |||
consumes a part of the quoted packet header) and is normally the case | consumes a part of the quoted packet header) and is normally the case | |||
for ICMPv6. When the verification is completed, the router Link MTU | for ICMPv6. When the validation is completed, the router Link MTU | |||
size indicated in the PTB message SHOULD be used with the DPLPMTUD | size indicated in the PTB message SHOULD be used with the DPLPMTUD | |||
algorithm, providing that the reported link MTU is less than the current probe | algorithm, providing that the reported link MTU is less than the current probe | |||
size. | size. | |||
5.3.3. DPLPMTUD for SCTP/DTLS | 5.3.3. DPLPMTUD for SCTP/DTLS | |||
The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is | |||
specified in [I-D.ietf-tsvwg-sctp-dtls-encaps]. It is used for data | specified in [RFC8261]. It is used for data channels in WebRTC | |||
channels in WebRTC implementations. | implementations. | |||
5.3.3.1. Sending SCTP/DTLS Probe Packets | 5.3.3.1. Sending SCTP/DTLS Probe Packets | |||
Packet probing can be done as specified in Section 5.3.1.1. | Packet probing can be done as specified in Section 5.3.1.1. | |||
5.3.3.2. Validating the Path with SCTP/DTLS | 5.3.3.2. Validating the Path with SCTP/DTLS | |||
Since SCTP provides an acknowledged PL, a sender MUST NOT | Since SCTP provides an acknowledged PL, a sender MUST NOT | |||
implement the REACHABILITY_TIMER while in the PROBE_DONE state. | implement the REACHABILITY_TIMER while in the PROBE_DONE state. | |||
5.3.3.3. Handling of PTB Messages by SCTP/DTLS | 5.3.3.3. Handling of PTB Messages by SCTP/DTLS | |||
It is not possible to perform normal ICMP verification as specified | It is not possible to perform normal ICMP validation as specified in | |||
in [RFC4960], since even if the ICMP message payload contains | [RFC4960], since even if the ICMP message payload contains sufficient | |||
sufficient information, the reflected SCTP common header would be | information, the reflected SCTP common header would be encrypted. | |||
encrypted. Therefore it is not possible to process PTB messages at | Therefore it is not possible to process PTB messages at the PL. | |||
the PL. | ||||
5.4. DPLPMTUD for QUIC | 5.4. DPLPMTUD for QUIC | |||
Quick UDP Internet Connection (QUIC) [I-D.ietf-quic-transport] is a | Quick UDP Internet Connection (QUIC) [I-D.ietf-quic-transport] is a | |||
UDP-based transport that provides reception feedback. | UDP-based transport that provides reception feedback. | |||
Section 9.2 of [I-D.ietf-quic-transport] describes the path | Section 9.2 of [I-D.ietf-quic-transport] describes the path | |||
considerations when sending QUIC packets. It recommends the use of | considerations when sending QUIC packets. It recommends the use of | |||
PADDING frames to build the probe packet. This enables probing | PADDING frames to build the probe packet. This enables probing | |||
without affecting the transfer of other QUIC frames. | without affecting the transfer of other QUIC frames. | |||
This provides an acknowledged PL. A sender can therefore enter the | This provides an acknowledged PL. A sender can therefore enter the | |||
PROBE_BASE state as soon as connectivity has been confirmed. | PROBE_BASE state as soon as connectivity has been confirmed. | |||
5.4.1. Sending QUIC Probe Packets | 5.4.1. Sending QUIC Probe Packets | |||
A probe packet consists of a QUIC Header and a payload containing | A probe packet consists of a QUIC Header and a payload containing | |||
only PADDING Frames. PADDING Frames are a single octet (0x00) and | only PADDING Frames. PADDING Frames are a single octet (0x00) and | |||
several of these can be used to create a probe packet of size | several of these can be used to create a probe packet of size | |||
PROBED_SIZE. QUIC provides an acknowledged PL. A sender can therefore | PROBED_SIZE. QUIC provides an acknowledged PL. A sender can | |||
enter the PROBE_BASE state as soon as connectivity has been | therefore enter the PROBE_BASE state as soon as connectivity has been | |||
confirmed. | confirmed. | |||
The current specification of QUIC sets the following: | The current specification of QUIC sets the following: | |||
o BASE_PMTU: 1200 bytes. A QUIC sender needs to pad initial packets to | o BASE_PMTU: 1200 bytes. A QUIC sender needs to pad initial packets to | |||
1200 bytes to validate that the path can support packets of a useful | 1200 bytes to validate that the path can support packets of a useful | |||
size. | size. | |||
o MIN_PMTU: 1200 bytes. A QUIC sender that determines the PMTU has | o MIN_PMTU: 1200 bytes. A QUIC sender that determines the PMTU has | |||
fallen below 1200 bytes MUST immediately stop sending on the | fallen below 1200 bytes MUST immediately stop sending on the | |||
affected path. | affected path. | |||
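The construction of a padding-only probe and the 1200-byte floor can be sketched as follows. The example is illustrative only: it treats the QUIC header as an opaque byte string, assumes that PROBED_SIZE counts the QUIC packet alone, and ignores packet protection, which a real QUIC sender applies. The function names are hypothetical.

```python
BASE_PMTU = 1200          # bytes, as listed above
MIN_PMTU = 1200           # bytes, as listed above
PADDING_FRAME = b"\x00"   # a PADDING frame is a single 0x00 octet


def build_probe(header: bytes, probed_size: int) -> bytes:
    """Return header plus enough PADDING frames to reach probed_size bytes.

    header is a placeholder for an already-encoded QUIC header.
    """
    padding_len = probed_size - len(header)
    if padding_len < 1:
        raise ValueError("probed_size leaves no room for PADDING frames")
    return header + PADDING_FRAME * padding_len


def may_send_on_path(pmtu: int) -> bool:
    """A sender stops using a path whose PMTU has fallen below MIN_PMTU."""
    return pmtu >= MIN_PMTU
```

Under these assumptions, the first probe on a path would be built with `build_probe(header, BASE_PMTU)`.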
5.4.2. Validating the Path with QUIC | 5.4.2. Validating the Path with QUIC | |||
QUIC provides an acknowledged PL. A sender therefore MUST NOT | QUIC provides an acknowledged PL. A sender therefore MUST NOT | |||
implement the REACHABILITY_TIMER while in the PROBE_DONE state. | implement the REACHABILITY_TIMER while in the PROBE_DONE state. | |||
5.4.3. Handling of PTB Messages by QUIC | 5.4.3. Handling of PTB Messages by QUIC | |||
QUIC operates over the UDP transport, and the guidelines on ICMP | QUIC operates over the UDP transport, and the guidelines on ICMP | |||
verification as specified in Section 5.2 of [RFC8085] therefore | validation as specified in Section 5.2 of [RFC8085] therefore apply. | |||
apply. Although QUIC does not currently specify a method for | Although QUIC does not currently specify a method for validating ICMP | |||
validating ICMP responses, it does provide some guidelines to make it | responses, it does provide some guidelines to make it harder for an | |||
harder for an off-path attacker to inject ICMP messages. | off-path attacker to inject ICMP messages. | |||
o Set the IPv4 Don't Fragment (DF) bit on a small proportion of | o Set the IPv4 Don't Fragment (DF) bit on a small proportion of | |||
packets, so that most invalid ICMP messages arrive when there are | packets, so that most invalid ICMP messages arrive when there are | |||
no DF packets outstanding, and can therefore be identified as | no DF packets outstanding, and can therefore be identified as | |||
spurious. | spurious. | |||
o Store additional information from the IP or UDP headers from DF | o Store additional information from the IP or UDP headers from DF | |||
packets (for example, the IP ID or UDP checksum) to further | packets (for example, the IP ID or UDP checksum) to further | |||
authenticate incoming Datagram Too Big messages. | authenticate incoming Datagram Too Big messages. | |||
o Any reduction in PMTU due to a report contained in an ICMP packet | o Any reduction in PMTU due to a report contained in an ICMP packet | |||
is provisional until QUIC's loss detection algorithm determines | is provisional until QUIC's loss detection algorithm determines | |||
that the packet is actually lost. | that the packet is actually lost. | |||
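One way to combine the three guidelines above is sketched below. This is a hypothetical illustration and not a method defined by QUIC: the class name, the choice of the IP ID as the stored header field, and the callback names are assumptions. A PTB report is ignored unless it quotes an outstanding DF packet, and an accepted report only reduces the PMTU once loss of that packet is confirmed.

```python
from typing import Dict


class PtbFilter:
    """Tracks DF packets in flight and applies PTB reports provisionally."""

    def __init__(self, pmtu: int) -> None:
        self.pmtu = pmtu                           # confirmed PMTU
        self.outstanding_df: Dict[int, int] = {}   # IP ID -> packet size
        self.provisional: Dict[int, int] = {}      # IP ID -> reported MTU

    def on_df_sent(self, ip_id: int, size: int) -> None:
        # Remember header information from each packet sent with DF set.
        self.outstanding_df[ip_id] = size

    def on_ptb(self, quoted_ip_id: int, reported_mtu: int) -> bool:
        # A report that matches no outstanding DF packet is treated as spurious.
        if quoted_ip_id not in self.outstanding_df:
            return False
        self.provisional[quoted_ip_id] = reported_mtu
        return True

    def on_acked(self, ip_id: int) -> None:
        # The packet arrived, so any report about it is discarded.
        self.outstanding_df.pop(ip_id, None)
        self.provisional.pop(ip_id, None)

    def on_loss_confirmed(self, ip_id: int) -> None:
        # Loss detection agrees with the report: apply the reduction.
        self.outstanding_df.pop(ip_id, None)
        mtu = self.provisional.pop(ip_id, None)
        if mtu is not None:
            self.pmtu = min(self.pmtu, mtu)
```

The sketch keeps the confirmed and provisional values separate so that a forged report cannot lower the PMTU on its own.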
XXX The above list was pulled whole from quic-transport - input is | XXX The above list was pulled whole from quic-transport - input is | |||
invited from QUIC contributors. XXX | invited from QUIC contributors. XXX | |||
6. Acknowledgements | 6. Acknowledgements | |||
This work was partially funded by the European Union's Horizon 2020 | This work was partially funded by the European Union's Horizon 2020 | |||
research and innovation programme under grant agreement No. 644334 | research and innovation programme under grant agreement No. 644334 | |||
(NEAT). The views expressed are solely those of the author(s). | (NEAT). The views expressed are solely those of the author(s). | |||
7. IANA Considerations | 7. IANA Considerations | |||
This memo includes no request to IANA. | This memo includes no request to IANA. | |||
XXX If new UDP Options are specified in this document, a request to | XXX If new UDP Options are specified in this document, a request to | |||
IANA will be included here. XXX | IANA will be included here. XXX | |||
If there are no requirements for IANA, the section will be removed | If there are no requirements for IANA, the section will be removed | |||
during conversion into an RFC by the RFC Editor. | during conversion into an RFC by the RFC Editor. | |||
8. Security Considerations | 8. Security Considerations | |||
skipping to change at page 26, line 23 ¶ | skipping to change at page 30, line 22 ¶ | |||
The security considerations for the use of UDP and SCTP are provided | The security considerations for the use of UDP and SCTP are provided | |||
in the referenced RFCs. Security guidance for applications using UDP | in the referenced RFCs. Security guidance for applications using UDP | |||
is provided in the UDP Usage Guidelines [RFC8085]. | is provided in the UDP Usage Guidelines [RFC8085]. | |||
There are cases where PTB messages are not delivered due to policy, | There are cases where PTB messages are not delivered due to policy, | |||
configuration or equipment design (see Section 1.1). This method | configuration or equipment design (see Section 1.1). This method | |||
therefore does not rely upon PTB messages being received, but is able | therefore does not rely upon PTB messages being received, but is able | |||
to utilise these when they are received by the sender. PTB messages | to utilise these when they are received by the sender. PTB messages | |||
could potentially be used to cause a node to inappropriately reduce | could potentially be used to cause a node to inappropriately reduce | |||
the PLPMTU. A node supporting DPLPMTUD MUST therefore appropriately | the PLPMTU. A node supporting DPLPMTUD MUST therefore appropriately | |||
verify the payload of PTB messages to ensure these are received in | validate the payload of PTB messages to ensure these are received in | |||
response to transmitted traffic (i.e., a reported error condition | response to transmitted traffic (i.e., a reported error condition | |||
that corresponds to a datagram actually sent by the path layer). | that corresponds to a datagram actually sent by the path layer). | |||
Parallel forwarding paths may need to be considered. Section 3.5 | Parallel forwarding paths may need to be considered. Section 3.5 | |||
identifies the need for robustness in the method when the path | identifies the need for robustness in the method when the path | |||
information may be inconsistent. | information may be inconsistent. | |||
A node performing DPLPMTUD could experience conflicting information | A node performing DPLPMTUD could experience conflicting information | |||
about the size of supported probe packets. This could occur when | about the size of supported probe packets. This could occur when | |||
multiple paths are concurrently in use and these exhibit a | multiple paths are concurrently in use and these exhibit a | |||
different PMTU. If not considered, this could result in data being | different PMTU. If not considered, this could result in data being | |||
black holed when the PLPMTU is larger than the smallest PMTU across | black holed when the PLPMTU is larger than the smallest PMTU across | |||
the current paths. | the current paths. | |||
An on-path attacker could forge PTB messages to drive down the PLPMTU | An on-path attacker could forge PTB messages to drive down the PLPMTU | |||
9. References | 9. References | |||
9.1. Normative References | 9.1. Normative References | |||
[I-D.ietf-quic-transport] | [I-D.ietf-quic-transport] | |||
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed | |||
and Secure Transport", Internet-Draft draft-ietf-quic- | and Secure Transport", draft-ietf-quic-transport-13 (work | |||
transport-04, June 2017. | in progress), June 2018. | |||
[I-D.ietf-tsvwg-sctp-dtls-encaps] | ||||
Tuexen, M., Stewart, R., Jesup, R. and S. Loreto, "DTLS | ||||
Encapsulation of SCTP Packets", Internet-Draft draft-ietf- | ||||
tsvwg-sctp-dtls-encaps-09, January 2015. | ||||
[I-D.ietf-tsvwg-udp-options] | [I-D.ietf-tsvwg-udp-options] | |||
Touch, J., "Transport Options for UDP", Internet-Draft | Touch, J., "Transport Options for UDP", draft-ietf-tsvwg- | |||
draft-ietf-tsvwg-udp-options-01, June 2017. | udp-options-04 (work in progress), July 2018. | |||
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, | [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, | |||
August 1980. | DOI 10.17487/RFC0768, August 1980, | |||
<https://www.rfc-editor.org/info/rfc768>. | ||||
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, | [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, | |||
RFC 792, DOI 10.17487/RFC0792, September 1981, <https:// | RFC 792, DOI 10.17487/RFC0792, September 1981, | |||
www.rfc-editor.org/info/rfc792>. | <https://www.rfc-editor.org/info/rfc792>. | |||
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | |||
Communication Layers", STD 3, RFC 1122, DOI 10.17487/ | Communication Layers", STD 3, RFC 1122, | |||
RFC1122, October 1989, <https://www.rfc-editor.org/info/ | DOI 10.17487/RFC1122, October 1989, | |||
rfc1122>. | <https://www.rfc-editor.org/info/rfc1122>. | |||
[RFC1812] Baker, F., Ed., "Requirements for IP Version 4 Routers", | [RFC1812] Baker, F., Ed., "Requirements for IP Version 4 Routers", | |||
RFC 1812, DOI 10.17487/RFC1812, June 1995, <https://www | RFC 1812, DOI 10.17487/RFC1812, June 1995, | |||
.rfc-editor.org/info/rfc1812>. | <https://www.rfc-editor.org/info/rfc1812>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | ||||
<https://www.rfc-editor.org/info/rfc2119>. | ||||
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | |||
(IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | |||
December 1998, <https://www.rfc-editor.org/info/rfc2460>. | December 1998, <https://www.rfc-editor.org/info/rfc2460>. | |||
[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E.Ed., | [RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., | |||
and G. Fairhurst, Ed., "The Lightweight User Datagram | and G. Fairhurst, Ed., "The Lightweight User Datagram | |||
Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July | Protocol (UDP-Lite)", RFC 3828, DOI 10.17487/RFC3828, July | |||
2004, <https://www.rfc-editor.org/info/rfc3828>. | 2004, <https://www.rfc-editor.org/info/rfc3828>. | |||
[RFC4820] Tuexen, M., Stewart, R. and P. Lei, "Padding Chunk and | [RFC4820] Tuexen, M., Stewart, R., and P. Lei, "Padding Chunk and | |||
Parameter for the Stream Control Transmission Protocol | Parameter for the Stream Control Transmission Protocol | |||
(SCTP)", RFC 4820, DOI 10.17487/RFC4820, March 2007, | (SCTP)", RFC 4820, DOI 10.17487/RFC4820, March 2007, | |||
<https://www.rfc-editor.org/info/rfc4820>. | <https://www.rfc-editor.org/info/rfc4820>. | |||
[RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", | [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", | |||
RFC 4960, DOI 10.17487/RFC4960, September 2007, <https:// | RFC 4960, DOI 10.17487/RFC4960, September 2007, | |||
www.rfc-editor.org/info/rfc4960>. | <https://www.rfc-editor.org/info/rfc4960>. | |||
[RFC6951] Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream | [RFC6951] Tuexen, M. and R. Stewart, "UDP Encapsulation of Stream | |||
Control Transmission Protocol (SCTP) Packets for End-Host | Control Transmission Protocol (SCTP) Packets for End-Host | |||
to End-Host Communication", RFC 6951, DOI 10.17487/ | to End-Host Communication", RFC 6951, | |||
RFC6951, May 2013, <https://www.rfc-editor.org/info/ | DOI 10.17487/RFC6951, May 2013, | |||
rfc6951>. | <https://www.rfc-editor.org/info/rfc6951>. | |||
[RFC8085] Eggert, L., Fairhurst, G. and G. Shepherd, "UDP Usage | [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage | |||
Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, | Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, | |||
March 2017, <https://www.rfc-editor.org/info/rfc8085>. | March 2017, <https://www.rfc-editor.org/info/rfc8085>. | |||
[RFC8201] McCann, J., Deering, S., Mogul, J. and R. Hinden, Ed., | [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | |||
"Path MTU Discovery for IP version 6", STD 87, RFC 8201, | "Path MTU Discovery for IP version 6", STD 87, RFC 8201, | |||
DOI 10.17487/RFC8201, July 2017, <https://www.rfc- | DOI 10.17487/RFC8201, July 2017, | |||
editor.org/info/rfc8201>. | <https://www.rfc-editor.org/info/rfc8201>. | |||
[RFC8261] Tuexen, M., Stewart, R., Jesup, R., and S. Loreto, | ||||
"Datagram Transport Layer Security (DTLS) Encapsulation of | ||||
SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November | ||||
2017, <https://www.rfc-editor.org/info/rfc8261>. | ||||
9.2. Informative References | 9.2. Informative References | |||
[RFC1191] Mogul, J.C. and S.E. Deering, "Path MTU discovery", RFC | [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, | |||
1191, DOI 10.17487/RFC1191, November 1990, <https://www | DOI 10.17487/RFC1191, November 1990, | |||
.rfc-editor.org/info/rfc1191>. | <https://www.rfc-editor.org/info/rfc1191>. | |||
[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", RFC | [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", | |||
2923, DOI 10.17487/RFC2923, September 2000, <https://www | RFC 2923, DOI 10.17487/RFC2923, September 2000, | |||
.rfc-editor.org/info/rfc2923>. | <https://www.rfc-editor.org/info/rfc2923>. | |||
[RFC4340] Kohler, E., Handley, M. and S. Floyd, "Datagram Congestion | [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | |||
Control Protocol (DCCP)", RFC 4340, March 2006. | Congestion Control Protocol (DCCP)", RFC 4340, | |||
DOI 10.17487/RFC4340, March 2006, | ||||
<https://www.rfc-editor.org/info/rfc4340>. | ||||
[RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU | [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU | |||
Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, | Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, | |||
<https://www.rfc-editor.org/info/rfc4821>. | <https://www.rfc-editor.org/info/rfc4821>. | |||
[RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering | [RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering | |||
ICMPv6 Messages in Firewalls", RFC 4890, DOI 10.17487/ | ICMPv6 Messages in Firewalls", RFC 4890, | |||
RFC4890, May 2007, <https://www.rfc-editor.org/info/ | DOI 10.17487/RFC4890, May 2007, | |||
rfc4890>. | <https://www.rfc-editor.org/info/rfc4890>. | |||
Appendix A. Event-driven state changes | Appendix A. Event-driven state changes | |||
This appendix contains an informative description of key events: | This appendix contains an informative description of key events: | |||
Path Setup: When a new path is initiated, the state is set to | Path Setup: When a new path is initiated, the state is set to | |||
PROBE_START. As soon as the path is confirmed, the state changes | PROBE_START. As soon as the path is confirmed, the state changes | |||
to PROBE_BASE and the probing mechanism for this path is started. | to PROBE_BASE and probing for this path is started. The first | |||
the first probe packet is sent with the size of the BASE_PMTU. | probe packet is sent with the size of the BASE_PMTU. | |||
Arrival of an Acknowledgment: Depending on the probing state, the | Arrival of an Acknowledgment: Depending on the probing state, the | |||
reaction differs according to Figure 5, which is just a | reaction differs according to Figure 7, which is a simplification | |||
simplification of Figure 2 focusing on this event. | of Figure 4 focusing on this event. | |||
+--------------+ +----------------+ | +--------------+ +----------------+ | |||
| PROBE_START | --3------------------------------->| PROBE_DISABLED | | | PROBE_START | --3------------------------------->| PROBE_DISABLED | | |||
+--------------+ --4-----------\ +----------------+ | +--------------+ --4-----------\ +----------------+ | |||
\ | \ | |||
+--------------+ \ | +--------------+ \ | |||
| PROBE_ERROR | --------------- \ | | PROBE_ERROR | --------------- \ | |||
+--------------+ \ \ | +--------------+ \ \ | |||
\ \ | \ \ | |||
+--------------+ \ \ +--------------+ | +--------------+ \ \ +--------------+ | |||
| PROBE_BASE | --1---------- \ ------------> | PROBE_BASE | | | PROBE_BASE | --1---------- \ ------------> | PROBE_BASE | | |||
+--------------+ --2----- \ \ +--------------+ | +--------------+ --2----- \ \ +--------------+ | |||
\ \ \ | \ \ \ | |||
+--------------+ \ \ ------------> +--------------+ | +--------------+ \ \ ------------> +--------------+ | |||
| PROBE_SEARCH | --2--- \ -----------------> | PROBE_SEARCH | | | PROBE_SEARCH | --2--- \ -----------------> | PROBE_SEARCH | | |||
+--------------+ --1---\----\---------------------> +--------------+ | +--------------+ --1---\----\---------------------> +--------------+ | |||
\ \ | \ \ | |||
+--------------+ \ \ +--------------+ | +--------------+ \ \ +--------------+ | |||
| PROBE_DONE | \ -------------------> | PROBE_DONE | | | PROBE_DONE | \ -------------------> | PROBE_DONE | | |||
+--------------+ -----------------------> +--------------+ | +--------------+ -----------------------> +--------------+ | |||
Condition 1: The maximum PMTU size has not yet been reached. | Condition 1: The maximum PMTU size has not yet been reached. | |||
Condition 2: The maximum PMTU size has been reached. Condition 3: | Condition 2: The maximum PMTU size has been reached. Condition 3: | |||
Probe Timer expires and PROBE_COUNT = MAX_PROBEs. Condition 4: | Probe Timer expires and PROBE_COUNT = MAX_PROBEs. Condition 4: | |||
PROBE_ACK received. | PROBE_ACK received. | |||
Probing timeout: The PROBE_COUNT is initialised to zero each time the | Figure 7: State changes at the arrival of an acknowledgment | |||
value of PROBED_SIZE is changed. The PROBE_TIMER is started each | ||||
time a probe packet is sent. It is stopped when an acknowledgment | ||||
arrives that confirms delivery of a probe packet. If the probe | ||||
packet is not acknowledged before the PROBE_TIMER expires, the | ||||
PROBE_ERROR_COUNTER is incremented. When the PROBE_COUNT equals | ||||
the value MAX_PROBES, the state is changed, otherwise a new probe | ||||
packet of the same size (PROBED_SIZE) is resent. The state | ||||
transitions are illustrated in Figure 6. This shows a | ||||
simplification of Figure 2 with a focus only on this event. | ||||
+--------------+ +----------------+ | Probing timeout: The PROBE_COUNT is initialised to zero each time | |||
| PROBE_START |----------------------------------->| PROBE_DISABLED | | the value of PROBED_SIZE is changed and when an acknowledgment | |||
+--------------+ +----------------+ | confirming delivery of a probe packet arrives. The PROBE_TIMER is | |||
started each time a probe packet is sent. It is stopped when an | ||||
acknowledgment arrives that confirms delivery of a probe packet of | ||||
PROBED_SIZE. If the probe packet is not acknowledged before the | ||||
PROBE_TIMER expires, the PROBE_COUNT is incremented. When the | ||||
PROBE_COUNT equals the value MAX_PROBES, the state is changed, | ||||
otherwise a new probe packet of the same size (PROBED_SIZE) is | ||||
resent. The state transitions are illustrated in Figure 8. This | ||||
shows a simplification of Figure 4 with a focus only on this | ||||
event. | ||||
+--------------+ +--------------+ | +--------------+ +----------------+ | |||
| PROBE_ERROR | -----------------> | PROBE_ERROR | | | PROBE_START |----------------------------------->| PROBE_DISABLED | | |||
+--------------+ / +--------------+ | +--------------+ +----------------+ | |||
/ | ||||
+--------------+ --2----------/ +--------------+ | ||||
| PROBE_BASE | --1------------------------------> | PROBE_BASE | | ||||
+--------------+ +--------------+ | ||||
+--------------+ +--------------+ | +--------------+ +--------------+ | |||
| PROBE_SEARCH | --1------------------------------> | PROBE_SEARCH | | | PROBE_ERROR | -----------------> | PROBE_ERROR | | |||
+--------------+ --2--------- +--------------+ | +--------------+ / +--------------+ | |||
\ | / | |||
+--------------+ \ +--------------+ | +--------------+ --2----------/ +--------------+ | |||
| PROBE_DONE | -------------------> | PROBE_DONE | | | PROBE_BASE | --1------------------------------> | PROBE_BASE | | |||
+--------------+ +--------------+ | +--------------+ +--------------+ | |||
+--------------+ +--------------+ | ||||
| PROBE_SEARCH | --1------------------------------> | PROBE_SEARCH | | ||||
+--------------+ --2--------- +--------------+ | ||||
\ | ||||
+--------------+ \ +--------------+ | ||||
| PROBE_DONE | -------------------> | PROBE_DONE | | ||||
+--------------+ +--------------+ | ||||
Condition 1: The maximum number of probe packets has not been | Condition 1: The maximum number of probe packets has not been | |||
reached. Condition 2: The maximum number of probe packets has been | reached. Condition 2: The maximum number of probe packets has been | |||
reached. | reached. | |||
PMTU raise timer timeout: The path through the network can change | Figure 8: State changes at the expiration of the probe timer | |||
PMTU raise timer timeout: The path through the network can change | ||||
over time. It is impossible to discover whether a path change has | over time. It is impossible to discover whether a path change has | |||
increased the actual PMTU by exchanging packets less than or equal | increased the actual PMTU by exchanging packets less than or equal | |||
to the PLPMTU. This requires PLPMTUD to periodically send a probe | to the PLPMTU. This requires PLPMTUD to periodically send a probe | |||
packet to detect whether a larger PMTU is possible. This probe | packet to detect whether a larger PMTU is possible. This probe | |||
packet is generated by the PMTU_RAISE_TIMER. When the timer | packet is generated by the PMTU_RAISE_TIMER. When the timer | |||
expires, probing is restarted with the BASE_PMTU and the state is | expires, probing is restarted with the BASE_PMTU and the state is | |||
changed to PROBE_BASE. | changed to PROBE_BASE. | |||
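
The probe timer and PMTU raise timer events described above can be sketched as follows. This is a minimal illustration of the -03 text and Figure 8, not the draft's normative algorithm. The class name Prober, its method names, and the values chosen for MAX_PROBES and BASE_PMTU are assumptions made for this example only.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    PROBE_START = auto()
    PROBE_DISABLED = auto()
    PROBE_ERROR = auto()
    PROBE_BASE = auto()
    PROBE_SEARCH = auto()
    PROBE_DONE = auto()


MAX_PROBES = 3    # assumed value, for illustration only
BASE_PMTU = 1200  # assumed value, for illustration only

# Next state when the probe timer expires and PROBE_COUNT has reached
# MAX_PROBES, read from Figure 8 (PROBE_ERROR and PROBE_DONE stay put).
ON_MAX_PROBES = {
    State.PROBE_START: State.PROBE_DISABLED,
    State.PROBE_BASE: State.PROBE_ERROR,
    State.PROBE_SEARCH: State.PROBE_DONE,
    State.PROBE_ERROR: State.PROBE_ERROR,
    State.PROBE_DONE: State.PROBE_DONE,
}


@dataclass
class Prober:
    """Hypothetical holder for the per-path probing variables."""
    state: State = State.PROBE_BASE
    probed_size: int = BASE_PMTU
    probe_count: int = 0

    def set_probed_size(self, size: int) -> None:
        # PROBE_COUNT is initialised to zero whenever PROBED_SIZE changes.
        self.probed_size = size
        self.probe_count = 0

    def on_probe_ack(self) -> None:
        # An acknowledgment confirming a probe also resets PROBE_COUNT.
        self.probe_count = 0

    def on_probe_timer_expired(self):
        """Return the size to resend, or None if the state was changed."""
        self.probe_count += 1
        if self.probe_count >= MAX_PROBES:
            self.state = ON_MAX_PROBES.get(self.state, self.state)
            return None
        return self.probed_size  # resend a probe of the same size

    def on_raise_timer_expired(self) -> int:
        # Probing restarts with BASE_PMTU and the state becomes PROBE_BASE.
        self.state = State.PROBE_BASE
        self.set_probed_size(BASE_PMTU)
        return self.probed_size
```

With these assumed values, a prober in PROBE_SEARCH resends the same probe twice and moves to PROBE_DONE on the third expiry; a later raise timer expiry returns it to PROBE_BASE with a probe of BASE_PMTU.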
Arrival of an ICMP message: The active probing of the path can be | Arrival of a PTB message: The active probing of the path can be | |||
supported by the arrival of PTB messages sent by routers or | supported by the arrival of a PTB message sent by a router or | |||
middleboxes with a link MTU that is smaller than the probe packet | middleboxes indicating the router's local link MTU. Two cases can | |||
size. If the PTB message includes the router link MTU, three | be distinguished: | |||
cases can be distinguished: | ||||
1. The indicated link MTU in the PTB message lies between the | 1. The indicated link MTU in the PTB message lies between the | |||
already probed PLMTU and the size of the probe that triggered | already probed PLPMTU and the size of the probe that triggered | |||
the PTB message. | the PTB message. | |||
2. The indicated link MTU in the PTB message is smaller than the | 2. The indicated link MTU in the PTB message is smaller than the | |||
PLPMTU. | PLPMTU. | |||
3. The indicated link MTU in the PTB message is equal to the | ||||
BASE_PMTU. | ||||
In the first case, the PROBE_BASE state transitions to the PROBE_ERROR | In the first case, the PROBE_BASE state transitions to the PROBE_ERROR | |||
state. In the PROBE_SEARCH state, a new probe packet is sent with | state. In the PROBE_SEARCH state, a new probe packet is sent with | |||
the size reported by the PTB message. Its result is handled | the size reported by the PTB message. Its result is handled | |||
according to the former events. | according to the former events. | |||
The second case could be a result of a network re-configuration. | The second case could be a result of a network re-configuration. | |||
If the reported link MTU in the PTB message is greater than the | If the reported link MTU in the PTB message is greater than the | |||
BASE_MTU, the probing starts again with a value of PROBE_BASE. | BASE_MTU, the probing starts again with a value of PROBE_BASE. | |||
Otherwise, the method enters the state PROBE_ERROR. | Otherwise, the method enters the state PROBE_ERROR. | |||
In the third case, the maximum possible PMTU has been reached. | ||||
This ought to be probed again, because there could be a link | ||||
further along the path with a still smaller MTU. | ||||
Note: Not all routers include the link MTU size when they send a | Note: Not all routers include the link MTU size when they send a | |||
PTB message. If the PTB message does not indicate the link MTU, | PTB message. If the PTB message does not indicate the link MTU, | |||
the probe is handled in the same way as condition 2 of Figure 6. | the probe is handled in the same way as condition 2 of Figure 8. | |||
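
The two PTB cases and the note above can be sketched as a single decision function. This is an illustrative reading of the -03 text under stated assumptions: the function handle_ptb, the Action names, and the BASE_PMTU value are hypothetical; the lower bound of case 1 is treated as inclusive; and a reported link MTU that is not smaller than the probe is returned as UNSPECIFIED because this appendix does not define that case.

```python
from enum import Enum, auto
from typing import Optional


class Action(Enum):
    ENTER_PROBE_ERROR = auto()      # move to the PROBE_ERROR state
    PROBE_AT_PTB_SIZE = auto()      # send a new probe of the reported size
    RESTART_AT_PROBE_BASE = auto()  # start probing again from PROBE_BASE
    TREAT_AS_MAX_PROBES = auto()    # handle like condition 2 of Figure 8
    UNSPECIFIED = auto()            # not covered by this appendix


BASE_PMTU = 1200  # assumed value, for illustration only


def handle_ptb(state: str, plpmtu: int, probe_size: int,
               ptb_mtu: Optional[int]) -> Action:
    """Classify a PTB message; ptb_mtu is None if no link MTU was reported."""
    if ptb_mtu is None:
        # Note: not all routers include the link MTU in a PTB message.
        return Action.TREAT_AS_MAX_PROBES
    if ptb_mtu < plpmtu:
        # Case 2: possibly a network re-configuration.
        if ptb_mtu > BASE_PMTU:
            return Action.RESTART_AT_PROBE_BASE
        return Action.ENTER_PROBE_ERROR
    if ptb_mtu < probe_size:
        # Case 1: reported MTU lies between the PLPMTU and the probe size.
        if state == "PROBE_BASE":
            return Action.ENTER_PROBE_ERROR
        if state == "PROBE_SEARCH":
            return Action.PROBE_AT_PTB_SIZE
    return Action.UNSPECIFIED
```

For example, in PROBE_SEARCH with a PLPMTU of 1300 and a 1500-byte probe, a PTB message reporting 1400 leads to a new probe of 1400 bytes, whose result is then handled by the acknowledgment and timeout events.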
Appendix B. Revision Notes | Appendix B. Revision Notes | |||
Note to RFC-Editor: please remove this entire section prior to | Note to RFC-Editor: please remove this entire section prior to | |||
publication. | publication. | |||
Individual draft -00: | Individual draft -00: | |||
o Comments and corrections are welcome directly to the authors or | o Comments and corrections are welcome directly to the authors or | |||
via the IETF TSVWG working group mailing list. | via the IETF TSVWG working group mailing list. | |||
skipping to change at page 31, line 48 ¶ | skipping to change at page 35, line 47 ¶ | |||
states and timers | states and timers | |||
o This update is proposed for WG comments. | o This update is proposed for WG comments. | |||
Individual draft -02: | Individual draft -02: | |||
o Contains updated representation of the algorithm, and textual | o Contains updated representation of the algorithm, and textual | |||
corrections. | corrections. | |||
o The text describing when to set the effective PMTU has not yet | o The text describing when to set the effective PMTU has not yet | |||
been verified by the authors | been validated by the authors | |||
o To determine security to off-path-attacks: We need to decide | o To determine security to off-path-attacks: We need to decide | |||
whether a received PTB message SHOULD/MUST be verified? The text | whether a received PTB message SHOULD/MUST be validated? The text | |||
on how to handle a PTB message indicating a link MTU larger than | on how to handle a PTB message indicating a link MTU larger than | |||
the probe has not yet been verified by the authors | the probe has not yet been validated by the authors | |||
o No text currently describes how to handle inconsistent results | o No text currently describes how to handle inconsistent results | |||
from arbitrary re-routing along different parallel paths | from arbitrary re-routing along different parallel paths | |||
o This update is proposed for WG comments. | o This update is proposed for WG comments. | |||
Working Group draft -00: | Working Group draft -00: | |||
o This draft follows a successful adoption call for TSVWG | o This draft follows a successful adoption call for TSVWG | |||
skipping to change at page 32, line 46 ¶ | skipping to change at page 36, line 49 ¶ | |||
o This draft includes reorganisation of the section on IETF | o This draft includes reorganisation of the section on IETF | |||
protocols. | protocols. | |||
o Added more discussion of implementation within an application. | o Added more discussion of implementation within an application. | |||
o Added text on flapping paths. | o Added text on flapping paths. | |||
o Replaced 'effective MTU' with new term PLPMTU. | o Replaced 'effective MTU' with new term PLPMTU. | |||
Working Group draft -03: | ||||
o Updated figures | ||||
o Added more discussion on blackhole detection | ||||
o Added figure describing just blackhole detection | ||||
o Added figure relating MPS sizes | ||||
o Updated full state machine artwork for clarity | ||||
o Changed all text to refer to /packet probes/ /validation/ (rather | ||||
than /verification/). | ||||
Authors' Addresses | Authors' Addresses | |||
Godred Fairhurst | Godred Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering | School of Engineering | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen, AB24 3U | Aberdeen AB24 3U | |||
UK | UK | |||
Email: gorry@erg.abdn.ac.uk | Email: gorry@erg.abdn.ac.uk | |||
Tom Jones | Tom Jones | |||
University of Aberdeen | University of Aberdeen | |||
School of Engineering | School of Engineering | |||
Fraser Noble Building | Fraser Noble Building | |||
Aberdeen, AB24 3U | Aberdeen AB24 3U | |||
UK | UK | |||
Email: tom@erg.abdn.ac.uk | Email: tom@erg.abdn.ac.uk | |||
Michael Tuexen | Michael Tuexen | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
Stegerwaldstrasse 39 | Stegerwaldstrasse 39 | |||
Steinfurt, 48565 | Steinfurt 48565 | |||
DE | DE | |||
Email: tuexen@fh-muenster.de | Email: tuexen@fh-muenster.de | |||
Irene Ruengeler | Irene Ruengeler | |||
Muenster University of Applied Sciences | Muenster University of Applied Sciences | |||
Stegerwaldstrasse 39 | Stegerwaldstrasse 39 | |||
Steinfurt, 48565 | Steinfurt 48565 | |||
DE | DE | |||
Email: i.ruengeler@fh-muenster.de | Email: i.ruengeler@fh-muenster.de | |||
End of changes. 173 change blocks. | ||||
468 lines changed or deleted | 592 lines changed or added | |||