draft-ietf-intarea-gue-01.txt | draft-ietf-intarea-gue-02.txt | |||
---|---|---|---|---|
Internet Area WG T. Herbert | Internet Area WG T. Herbert | |||
Internet-Draft Quantonium | Internet-Draft Quantonium | |||
Intended status: Standard track L. Yong | Intended status: Standard track L. Yong | |||
Expires September 14, 2017 Huawei USA | Expires October 28, 2017 Huawei USA | |||
O. Zia | O. Zia | |||
Microsoft | Microsoft | |||
March 13, 2017 | April 26, 2017 | |||
Generic UDP Encapsulation | Generic UDP Encapsulation | |||
draft-ietf-intarea-gue-01 | draft-ietf-intarea-gue-02 | |||
Status of this Memo | Status of this Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
Drafts. | Drafts. | |||
skipping to change at page 1, line 35 ¶ | skipping to change at page 1, line 35 ¶ | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
This Internet-Draft will expire on September 14, 2017. | This Internet-Draft will expire on October 28, 2017. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2017 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 3, line 40 ¶ | skipping to change at page 3, line 40 ¶ | |||
3.3. Flags and extension fields . . . . . . . . . . . . . . . . 10 | 3.3. Flags and extension fields . . . . . . . . . . . . . . . . 10 | |||
3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 10 | 3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 10 | |||
3.3.2. Example GUE header with extension fields . . . . . . . 11 | 3.3.2. Example GUE header with extension fields . . . . . . . 11 | |||
3.4. Private data . . . . . . . . . . . . . . . . . . . . . . . 12 | 3.4. Private data . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
3.5. Message types . . . . . . . . . . . . . . . . . . . . . . . 12 | 3.5. Message types . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
3.5.1. Control messages . . . . . . . . . . . . . . . . . . . 12 | 3.5.1. Control messages . . . . . . . . . . . . . . . . . . . 12 | |||
3.5.2. Data messages . . . . . . . . . . . . . . . . . . . . . 13 | 3.5.2. Data messages . . . . . . . . . . . . . . . . . . . . . 13 | |||
3.6. Hiding the transport layer protocol number . . . . . . . . 13 | 3.6. Hiding the transport layer protocol number . . . . . . . . 13 | |||
4. Version 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | 4. Version 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
4.1. Direct encapsulation of IPv4 . . . . . . . . . . . . . . . 14 | 4.1. Direct encapsulation of IPv4 . . . . . . . . . . . . . . . 14 | |||
4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 15 | 4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 14 | |||
5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 | 5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 16 | 5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 16 | |||
5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 16 | 5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 16 | |||
5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 16 | 5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 16 | |||
5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 17 | 5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 17 | |||
5.4.1. Processing a received data message . . . . . . . . . . 17 | 5.4.1. Processing a received data message . . . . . . . . . . 17 | |||
5.4.2. Processing a received control message . . . . . . . . . 18 | 5.4.2. Processing a received control message . . . . . . . . . 18 | |||
5.5. Router and switch operation . . . . . . . . . . . . . . . . 18 | 5.5. Router and switch operation . . . . . . . . . . . . . . . . 18 | |||
5.6. Middlebox interactions . . . . . . . . . . . . . . . . . . 18 | 5.6. Middlebox interactions . . . . . . . . . . . . . . . . . . 18 | |||
5.6.1. Inferring connection semantics . . . . . . . . . . . . 19 | 5.6.1. Inferring connection semantics . . . . . . . . . . . . 19 | |||
5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
5.7. Checksum Handling . . . . . . . . . . . . . . . . . . . . . 19 | 5.7. Checksum Handling . . . . . . . . . . . . . . . . . . . . . 19 | |||
5.7.1. Requirements . . . . . . . . . . . . . . . . . . . . . 19 | 5.7.1. Requirements . . . . . . . . . . . . . . . . . . . . . 19 | |||
5.7.2. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 20 | 5.7.2. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 20 | |||
5.7.3. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 20 | 5.7.3. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 20 | |||
5.8. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 21 | 5.8. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 21 | |||
5.9. Congestion control . . . . . . . . . . . . . . . . . . . . 21 | 5.9. Congestion control . . . . . . . . . . . . . . . . . . . . 21 | |||
5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 21 | 5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 22 | |||
5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 22 | 5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 22 | |||
5.11.1. Flow classification . . . . . . . . . . . . . . . . . 22 | 5.11.1. Flow classification . . . . . . . . . . . . . . . . . 22 | |||
5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 23 | 5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 23 | |||
5.12 Negotiation of acceptable flags and extension fields . . . 24 | 5.12 Negotiation of acceptable flags and extension fields . . . 24 | |||
6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 24 | 6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 24 | |||
6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 24 | 6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 24 | |||
6.2 Comparison of GUE to other encapsulations . . . . . . . . . 25 | 6.2 Comparison of GUE to other encapsulations . . . . . . . . . 25 | |||
7. Security Considerations . . . . . . . . . . . . . . . . . . . . 26 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 26 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 26 | 8. IANA Consideration . . . . . . . . . . . . . . . . . . . . . . 27 | |||
8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 26 | 8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 27 | |||
8.2. GUE version number . . . . . . . . . . . . . . . . . . . . 28 | 8.2. GUE version number . . . . . . . . . . . . . . . . . . . . 28 | |||
8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 28 | 8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 28 | |||
8.4. Flag-fields . . . . . . . . . . . . . . . . . . . . . . . . 28 | 8.4. Flag-fields . . . . . . . . . . . . . . . . . . . . . . . . 28 | |||
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 29 | 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 29 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 29 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 29 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . . 29 | 10.1. Normative References . . . . . . . . . . . . . . . . . . . 29 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . . 30 | 10.2. Informative References . . . . . . . . . . . . . . . . . . 30 | |||
Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 33 | Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 33 | |||
A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 33 | A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 33 | |||
A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 33 | A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 33 | |||
skipping to change at page 8, line 33 ¶ | skipping to change at page 8, line 33 ¶ | |||
~ Private data (optional) ~ | ~ Private data (optional) ~ | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
The contents of the UDP header are: | The contents of the UDP header are: | |||
o Source port: If connection semantics (section 5.6.1) are applied | o Source port: If connection semantics (section 5.6.1) are applied | |||
to an encapsulation, this is set to the local source port for | to an encapsulation, this is set to the local source port for | |||
the connection. When connection semantics are not applied, this | the connection. When connection semantics are not applied, this | |||
is set to a flow entropy value for use with ECMP (Equal-Cost | is set to a flow entropy value for use with ECMP (Equal-Cost | |||
Multi-Path [RFC2992]). The properties of flow entropy are | Multi-Path [RFC2992]); the properties of flow entropy are | |||
described in section 5.11. | described in section 5.11. | |||
o Destination port: If connection semantics (section 5.6.1) are | o Destination port: If connection semantics (section 5.6.1) are | |||
applied to an encapsulation, this is set to the destination port | applied to an encapsulation, this is set to the destination port | |||
for the tuple. If connection semantics are not applied this is | for the tuple. If connection semantics are not applied this is | |||
set to the GUE assigned port number, 6080. | set to the GUE assigned port number, 6080. | |||
o Length: Canonical length of the UDP packet (length of UDP header | o Length: Canonical length of the UDP packet (length of UDP header | |||
and payload). | and payload). | |||
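The UDP field rules above can be sketched as follows for the case where connection semantics are not applied. This is a minimal illustration, not part of either draft version: the helper names, the use of CRC32 as a flow hash, and the 49152-65535 source port range are assumptions made for the example. The checksum is left as a caller-supplied placeholder because checksum handling is covered separately in section 5.7.

```python
import struct
import zlib

GUE_PORT = 6080  # GUE assigned port number, as stated in the text


def flow_entropy_port(inner_flow: bytes) -> int:
    """Hypothetical helper: map an inner-flow identifier to a source port.

    CRC32 and the 49152-65535 range are illustrative assumptions only.
    """
    return 49152 + (zlib.crc32(inner_flow) % 16384)


def build_udp_header(src_port: int, payload_len: int, checksum: int = 0) -> bytes:
    """Pack the 8-byte UDP header; Length covers the UDP header plus payload."""
    return struct.pack("!HHHH", src_port, GUE_PORT, 8 + payload_len, checksum)
```

Packets of the same inner flow hash to the same source port, so ECMP keeps them on one path.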
skipping to change at page 9, line 45 ¶ | skipping to change at page 9, line 45 ¶ | |||
(when the C-bit is not set) or GUE control message type (when the C- | (when the C-bit is not set) or GUE control message type (when the C- | |||
bit is set). | bit is set). | |||
3.2.1 Proto field | 3.2.1 Proto field | |||
When the C-bit is not set, the proto/ctype field MUST contain an IANA | When the C-bit is not set, the proto/ctype field MUST contain an IANA | |||
Internet Protocol Number. The protocol number is interpreted relative | Internet Protocol Number. The protocol number is interpreted relative | |||
to the IP protocol that encapsulates the UDP packet (i.e. protocol of | to the IP protocol that encapsulates the UDP packet (i.e. protocol of | |||
the outer IP header). The protocol number serves as an indication of | the outer IP header). The protocol number serves as an indication of | |||
the type of the next protocol header which is contained in the GUE | the type of the next protocol header which is contained in the GUE | |||
payload at the offset indicated in Hlen. Intermediate devices may | payload at the offset indicated in Hlen. Intermediate devices MAY | |||
parse the GUE payload per the number in the proto/ctype field, and | parse the GUE payload per the number in the proto/ctype field, and | |||
header flags cannot affect the interpretation of the proto/ctype | header flags cannot affect the interpretation of the proto/ctype | |||
field. | field. | |||
When the outer IP protocol is IPv4, the proto field MUST be set to a | When the outer IP protocol is IPv4, the proto field MUST be set to a | |||
valid IP protocol number usable with IPv4; it MUST NOT be set to a | valid IP protocol number usable with IPv4; it MUST NOT be set to a | |||
number for IPv6 extension headers or ICMPv6 options (number 58). An | number for IPv6 extension headers or ICMPv6 options (number 58). An | |||
exception is that the destination options extension header using the | exception is that the destination options extension header using the | |||
PadN option MAY be used with IPv4 as described in section 3.6. The | PadN option MAY be used with IPv4 as described in section 3.6. The | |||
"no next header" protocol number (59) also MAY be used with IPv4 as | "no next header" protocol number (59) also MAY be used with IPv4 as | |||
skipping to change at page 10, line 34 ¶ | skipping to change at page 10, line 34 ¶ | |||
3.2.2 Ctype field | 3.2.2 Ctype field | |||
When the C-bit is set, the proto/ctype field MUST be set to a valid | When the C-bit is set, the proto/ctype field MUST be set to a valid | |||
control message type. A value of zero indicates that the GUE payload | control message type. A value of zero indicates that the GUE payload | |||
requires further interpretation to deduce the control type. This | requires further interpretation to deduce the control type. This | |||
might be the case when the payload is a fragment of a control | might be the case when the payload is a fragment of a control | |||
message, where only the reassembled packet can be interpreted as a | message, where only the reassembled packet can be interpreted as a | |||
control message. | control message. | |||
Control messages will be defined in an IANA registry. Control message | Control messages will be defined in an IANA registry. Control message | |||
types 1 through 127 may be defined in by RFCs. Types 128 through 255 | types 1 through 127 may be defined in standards. Types 128 through | |||
are reserved to be user defined for experimentation or private | 255 are reserved to be user defined for experimentation or private | |||
control messages. | control messages. | |||
This document does not specify any standard control message types | This document does not specify any standard control message types | |||
other than type 0. | other than type 0. | |||
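The control type ranges described above can be expressed as a small classifier. This is only a sketch; the function name and the returned labels are hypothetical and do not come from the draft.

```python
def classify_ctype(ctype: int) -> str:
    """Classify an 8-bit GUE control message type by the ranges in section 3.2.2."""
    if not 0 <= ctype <= 255:
        raise ValueError("ctype must fit in 8 bits")
    if ctype == 0:
        # Payload needs further interpretation (e.g. a fragment of a control message).
        return "needs-further-interpretation"
    if ctype <= 127:
        # Types 1 through 127: to be defined via the IANA registry.
        return "registry-defined"
    # Types 128 through 255: user defined, experimental or private.
    return "experimental-or-private"
```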
3.3. Flags and extension fields | 3.3. Flags and extension fields | |||
Flags and associated extension fields are the primary mechanism of | Flags and associated extension fields are the primary mechanism of | |||
extensibility in GUE. As mentioned in section 3.1, GUE header flags | extensibility in GUE. As mentioned in section 3.1, GUE header flags | |||
indicate the presence of optional extension fields in the GUE header. | indicate the presence of optional extension fields in the GUE header. | |||
[GUEXTENS] defines a basic set of GUE extensions. | [GUEXTENS] defines a basic set of GUE extensions. | |||
3.3.1. Requirements | 3.3.1. Requirements | |||
There are sixteen flag bits in the GUE header. Some flags indicate | There are sixteen flag bits in the GUE header. Some flags indicate | |||
the presence of extension fields. The size of an extension field | presence of extension fields. The size of an extension field | |||
indicated by a flag MUST be fixed. | indicated by a flag MUST be fixed. | |||
Flags can be paired together to allow different lengths for an | Flags can be paired together to allow different lengths for an | |||
extension field. For example, if two flag bits are paired, a field | extension field. For example, if two flag bits are paired, a field | |||
can possibly be three different lengths-- that is bit value of 00 | can possibly be three different lengths-- that is bit value of 00 | |||
indicates no field present; 01, 10, and 11 indicate three possible | indicates no field present; 01, 10, and 11 indicate three possible | |||
lengths for the field. Regardless of how flag bits are paired, the | lengths for the field. Regardless of how flag bits are paired, the | |||
lengths and offsets of optional fields corresponding to a set of | lengths and offsets of optional fields corresponding to a set of | |||
flags MUST be well defined. | flags MUST be well defined. | |||
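As an illustration of paired flags, the sketch below extracts a group of adjacent flag bits and maps a two-bit value to a field length. The length table is purely hypothetical (the draft does not assign these values here), and bit 0 is taken to be the most significant bit of the 16-bit flags field, matching the header diagrams.

```python
# Hypothetical mapping from a 2-bit paired flag value to a field length in bytes.
# 00 means no field is present; the other three lengths are made up for the example.
PAIRED_LENGTHS = {0b00: 0, 0b01: 4, 0b10: 8, 0b11: 16}


def paired_value(flags: int, first_bit: int, width: int) -> int:
    """Return `width` adjacent flag bits starting at `first_bit` (bit 0 = MSB)."""
    shift = 16 - first_bit - width
    if shift < 0 or first_bit < 0 or width <= 0:
        raise ValueError("bit range outside the 16-bit flags field")
    return (flags >> shift) & ((1 << width) - 1)


def field_length(flags: int, first_bit: int) -> int:
    """Length in bytes of the field selected by a 2-bit paired flag."""
    return PAIRED_LENGTHS[paired_value(flags, first_bit, 2)]
```

Because each paired value maps to one fixed length, the offset of every later field can still be computed from the flags alone.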
skipping to change at page 11, line 32 ¶ | skipping to change at page 11, line 32 ¶ | |||
packet. The processing order of extensions defined in [GUEEXTENS] | packet. The processing order of extensions defined in [GUEEXTENS] | |||
demonstrates this property. | demonstrates this property. | |||
Flags (or paired flags) are idempotent such that new flags MUST NOT | Flags (or paired flags) are idempotent such that new flags MUST NOT | |||
cause reinterpretation of old flags. Also, new flags MUST NOT alter | cause reinterpretation of old flags. Also, new flags MUST NOT alter | |||
interpretation of other elements in the GUE header nor how the | interpretation of other elements in the GUE header nor how the | |||
message is parsed (for instance, in a data message the proto/ctype | message is parsed (for instance, in a data message the proto/ctype | |||
field always holds an IP protocol number as an invariant). | field always holds an IP protocol number as an invariant). | |||
The set of available flags can be extended in the future by defining | The set of available flags can be extended in the future by defining | |||
a "flag extensions bit" that refers to a field containing an | a "flag extensions bit" that refers to a field containing a new set | |||
additional set of flags. | of flags. | |||
3.3.2. Example GUE header with extension fields | 3.3.2. Example GUE header with extension fields | |||
An example GUE header for a data message encapsulating an IPv4 packet | An example GUE header for a data message encapsulating an IPv4 packet | |||
and containing the VNID and Security extension fields (both defined | and containing the VNID and Security extension fields (both defined | |||
in [GUEXTENS]) is shown below: | in [GUEXTENS]) is shown below: | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| 0 |0| 3 | 94 |1|0 0 1| 0 | | | 0 |0| 3 | 94 |1|0 0 1| 0 | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| VNID | | | VNID | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | | | | | |||
+ Security + | + Security + | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
In the above example, the first flag bit is set which indicates that | In the above example, the first flag bit is set which indicates that | |||
the VNID extension is present this is a 32 bit field. The second | the VNID extension is present which is a 32 bit field. The second | |||
through fourth bits of the flags are paired flags that indicate the | through fourth bits of the flags are paired flags that indicate the | |||
presence of a security field with eight possible sizes. In this | presence of a security field with seven possible sizes. In this | |||
example 001 indicates a sixty-four bit security field. | example 001 indicates a sixty-four bit security field. | |||
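The example header can be decoded with a short parser. The sketch below assumes the bit layout shown in the diagram (2-bit version, 1-bit C, 5-bit Hlen, 8-bit proto/ctype, 16-bit flags) and that Hlen counts 32-bit words following the first four bytes, which is consistent with Hlen = 3 for a 4-byte VNID plus an 8-byte security field. The class and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class GueHeader(NamedTuple):
    version: int
    control: bool
    hlen: int          # 32-bit words after the first four bytes
    proto_ctype: int
    flags: int


def parse_gue_base(data: bytes) -> GueHeader:
    """Parse the first 32 bits of a GUE header and check the buffer length."""
    if len(data) < 4:
        raise ValueError("truncated GUE header")
    first, proto_ctype, flags = struct.unpack("!BBH", data[:4])
    hdr = GueHeader(first >> 6, bool((first >> 5) & 1), first & 0x1F,
                    proto_ctype, flags)
    if len(data) < 4 + hdr.hlen * 4:
        raise ValueError("buffer shorter than Hlen indicates")
    return hdr


# First word of the example: version 0, C = 0, Hlen = 3, proto = 94, flags = 1001 0000 ...
example = bytes.fromhex("035e9000") + bytes(12)  # zeroed VNID and security fields
hdr = parse_gue_base(example)
vnid_present = bool(hdr.flags >> 15)        # first flag bit
security_code = (hdr.flags >> 12) & 0b111   # second through fourth flag bits
```

Here `security_code` is the paired-flag value 001 that the text associates with a sixty-four bit security field.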
3.4. Private data | 3.4. Private data | |||
An implementation MAY use private data for its own use. The private | An implementation MAY use private data for its own use. The private | |||
data immediately follows the last field in the GUE header and is not | data immediately follows the last field in the GUE header and is not | |||
a fixed length. This data is considered part of the GUE header and | a fixed length. This data is considered part of the GUE header and | |||
MUST be accounted for in header length (Hlen). The length of the | MUST be accounted for in header length (Hlen). The length of the | |||
private data MUST be a multiple of four and is determined by | private data MUST be a multiple of four and is determined by | |||
subtracting the offset of private data in the GUE header from the | subtracting the offset of private data in the GUE header from the | |||
skipping to change at page 13, line 29 ¶ | skipping to change at page 13, line 28 ¶ | |||
3.5.2. Data messages | 3.5.2. Data messages | |||
Data messages carry encapsulated packets that are addressed to the | Data messages carry encapsulated packets that are addressed to the | |||
protocol stack for the associated protocol. Data messages are a | protocol stack for the associated protocol. Data messages are a | |||
primary means of encapsulation and can be used to create tunnels for | primary means of encapsulation and can be used to create tunnels for | |||
overlay networks. | overlay networks. | |||
Data messages are indicated in GUE header when the C-bit is not set. | Data messages are indicated in GUE header when the C-bit is not set. | |||
The payload of a data message is interpreted as an encapsulated | The payload of a data message is interpreted as an encapsulated | |||
packet of an Internet protocol indicated in the proto/ctype field. | packet of an Internet protocol indicated in the proto/ctype field. | |||
The encapsulated packet immediately follows the GUE header. | The packet immediately follows the GUE header. | |||
3.6. Hiding the transport layer protocol number | 3.6. Hiding the transport layer protocol number | |||
The GUE header indicates the Internet protocol of an encapsulated | The GUE header indicates the Internet protocol of the encapsulated | |||
packet. A protocol number is either contained in the Proto/ctype | packet. A protocol number is either contained in the Proto/ctype | |||
field of the primary GUE header or in the Payload Type field of a GUE | field of the primary GUE header or in the Payload Type field of a GUE | |||
Transform extension field (used to encrypt the payload with DTLS, | Transform extension field (used to encrypt the payload with DTLS, | |||
[GUEEXTENS). If the transport protocol number needs to be hidden from | [GUEEXTENS]). If the transport protocol number needs to be hidden | |||
the network, then a trivial destination options can be used. | from the network, then a trivial destination options can be used. | |||
The PadN destination option [RFC2460] can be used to encode the | The PadN destination option [RFC2460] can be used to encode the | |||
transport protocol as a next header of an extension header (and | transport protocol as a next header of an extension header (and | |||
maintain alignment of encapsulated transport headers). The | maintain alignment of encapsulated transport headers). The | |||
Proto/ctype field or Payload Type field of the GUE Transform field is | Proto/ctype field or Payload Type field of the GUE Transform field is | |||
set to 60 to indicate that the first encapsulated header is a | set to 60 to indicate that the first encapsulated header is a | |||
destination options extension header. | destination options extension header. | |||
The format of the extension header is below: | The format of the extension header is below: | |||
skipping to change at page 14, line 25 ¶ | skipping to change at page 14, line 23 ¶ | |||
version and coincide with the first two bits of the version number in | version and coincide with the first two bits of the version number in | |||
the IP header. The first two version bits of IPv4 and IPv6 are 01, so | the IP header. The first two version bits of IPv4 and IPv6 are 01, so | |||
we use GUE version 1 for direct IP encapsulation which makes two bits | we use GUE version 1 for direct IP encapsulation which makes two bits | |||
of GUE version to also be 01. | of GUE version to also be 01. | |||
This technique is effectively a means to compress out the GUE header | This technique is effectively a means to compress out the GUE header | |||
when encapsulating IPv4 or IPv6 packets and there are no flags or | when encapsulating IPv4 or IPv6 packets and there are no flags or | |||
extension fields present. This method is compatible to use on the | extension fields present. This method is compatible to use on the | |||
same port number as packets with the GUE header (GUE version 0 | same port number as packets with the GUE header (GUE version 0 | |||
packets). This technique saves encapsulation overhead on costly links | packets). This technique saves encapsulation overhead on costly links | |||
for the common use case of IP encapsulation, and also obviates the | for the common use of IP encapsulation, and also obviates the need to | |||
need to allocate a separate port number for IP-over-UDP | allocate a separate port number for IP-over-UDP encapsulation. | |||
encapsulation. | ||||
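A receiver sharing one port between both formats could tell them apart from the first byte of the UDP payload, roughly as sketched below. The function name and return labels are hypothetical; the only facts relied on are that the top two bits carry the GUE version and that IPv4 (0100) and IPv6 (0110) version nibbles both begin with 01.

```python
def classify_gue_payload(udp_payload: bytes) -> str:
    """Classify a UDP payload received on the GUE port by its first byte."""
    if not udp_payload:
        raise ValueError("empty UDP payload")
    first = udp_payload[0]
    gue_version = first >> 6          # top two bits
    if gue_version == 0:
        return "gue-v0"               # a GUE header follows
    if gue_version == 1:
        ip_version = first >> 4       # full IP version nibble
        if ip_version == 4:
            return "ipv4"             # directly encapsulated IPv4 packet
        if ip_version == 6:
            return "ipv6"             # directly encapsulated IPv6 packet
    return "drop"                     # unknown GUE or IP version
```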
4.1. Direct encapsulation of IPv4 | 4.1. Direct encapsulation of IPv4 | |||
The format for encapsulating IPv4 directly in UDP is: | The format for encapsulating IPv4 directly in UDP is: | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ | |||
| Source port | Destination port | | | | Source port | Destination port | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP | |||
skipping to change at page 16, line 34 ¶ | skipping to change at page 16, line 34 ¶ | |||
packets. In this case the encapsulator and decapsulator nodes are the | packets. In this case the encapsulator and decapsulator nodes are the | |||
tunnel endpoints. These could be routers that provide network tunnels | tunnel endpoints. These could be routers that provide network tunnels | |||
on behalf of communicating hosts. | on behalf of communicating hosts. | |||
5.2. Transport layer encapsulation | 5.2. Transport layer encapsulation | |||
When encapsulating layer 4 packets, the encapsulator and decapsulator | When encapsulating layer 4 packets, the encapsulator and decapsulator | |||
should be co-resident with the hosts. In this case, the encapsulation | should be co-resident with the hosts. In this case, the encapsulation | |||
headers are inserted between the IP header and the transport packet. | headers are inserted between the IP header and the transport packet. | |||
The addresses in the IP header refer to both the endpoints of the | The addresses in the IP header refer to both the endpoints of the | |||
encapsulation and the endpoints for terminating the transport | encapsulation and the endpoints for terminating the the transport | |||
protocol. Note that the transport layer ports in the encapsulated | protocol. Note that the transport layer ports in the encapsulated | |||
packet are independent of the UDP ports in the outer packet. | packet are independent of the UDP ports in the outer packet. | |||
Details about performing transport layer encapsulation are discussed | Details about performing transport layer encapsulation are discussed | |||
in [TOU]. | in [TOU]. | |||
5.3. Encapsulator operation | 5.3. Encapsulator operation | |||
Encapsulators create GUE data messages, set the fields of the UDP | Encapsulators create GUE data messages, set the fields of the UDP | |||
header, set flags and optional extension fields in the GUE header, | header, set flags and optional extension fields in the GUE header, | |||
and forward packets to a decapsulator. | and forward packets to a decapsulator. | |||
An encapsulator can be an end host originating the packets of a flow, | An encapsulator can be an end host originating the packets of a flow, | |||
or can be a network device performing encapsulation on behalf of | or can be a network device performing encapsulation on behalf of | |||
hosts (routers implementing tunnels for instance). In either case, | hosts (routers implementing tunnels for instance). In either case, | |||
the intended target (decapsulator) is indicated by the outer | the intended target (decapsulator) is indicated by the outer | |||
destination IP address and destination port in the UDP header. | destination IP address and destination port in the UDP header. | |||
If an encapsulator is tunneling packets -- that is encapsulating | If an encapsulator is tunneling packets -- that is encapsulating | |||
packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, or ESP | packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP | |||
tunnel mode) -- it SHOULD follow standard conventions for tunneling | tunnel mode) -- it SHOULD follow standard conventions for tunneling | |||
of one protocol over another. For instance, if an IP packet is being | of one protocol over another. For instance, if an IP packet is being | |||
encapsulated in GUE then diffserv interaction [RFC2983] and ECN | encapsulated in GUE then diffserv interaction [RFC2983] and ECN | |||
propagation for tunnels [RFC6040] SHOULD be followed. | propagation for tunnels [RFC6040] SHOULD be followed. | |||
5.4. Decapsulator operation | 5.4. Decapsulator operation | |||
A decapsulator performs decapsulation of GUE packets. A decapsulator | A decapsulator performs decapsulation of GUE packets. A decapsulator | |||
is addressed by the outer destination IP address of a GUE packet. | is addressed by the outer destination IP address of a GUE packet. | |||
The decapsulator validates packets, including fields of the GUE | The decapsulator validates packets, including fields of the GUE | |||
skipping to change at page 17, line 32 ¶ | skipping to change at page 17, line 32 ¶ | |||
unsupported payload type, or an otherwise malformed header, it MUST | unsupported payload type, or an otherwise malformed header, it MUST | |||
drop the packet. Such events MAY be logged subject to configuration | drop the packet. Such events MAY be logged subject to configuration | |||
and rate limiting of logging messages. No error message is returned | and rate limiting of logging messages. No error message is returned | |||
back to the encapsulator. Note that set flags in a GUE header that | back to the encapsulator. Note that set flags in a GUE header that | |||
are unknown to a decapsulator MUST NOT be ignored. If a GUE packet is | are unknown to a decapsulator MUST NOT be ignored. If a GUE packet is | |||
received by a decapsulator with unknown flags, the packet MUST be | received by a decapsulator with unknown flags, the packet MUST be | |||
dropped. | dropped. | |||
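The unknown-flags rule above can be sketched as follows. This is a minimal illustration, not part of either draft revision: it assumes the GUE flags field is available as a 16-bit integer, and the names KNOWN_FLAGS_MASK, has_unknown_flags, and decapsulator_accepts as well as the mask value are hypothetical.

```python
# Hypothetical mask of the flag bits this decapsulator understands.
# The value is arbitrary and chosen only for illustration.
KNOWN_FLAGS_MASK = 0x8000


def has_unknown_flags(flags: int, known_mask: int = KNOWN_FLAGS_MASK) -> bool:
    """Return True if any set bit in the (assumed 16-bit) flags field
    falls outside the set of flags known to the decapsulator."""
    return (flags & ~known_mask & 0xFFFF) != 0


def decapsulator_accepts(flags: int) -> bool:
    """A packet with unknown set flags is dropped, never ignored."""
    return not has_unknown_flags(flags)
```

A middlebox, by contrast, would not apply this check, since section 5.6 forbids it from dropping packets merely because of unknown flags.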
5.4.1. Processing a received data message | 5.4.1. Processing a received data message | |||
If a valid data message is received, the UDP and GUE headers are | If a valid data message is received, the UDP headers are removed from | |||
(logically) removed from the packet. The outer IP header remains | the packet. The outer IP header remains intact and the next protocol | |||
intact and the next protocol in the IP header is set to the protocol | in the IP header is set to the protocol from the proto field in the | |||
from the proto field in the GUE header. The resulting packet is then | GUE header. The resulting packet is then resubmitted into the | |||
resubmitted into the protocol stack to process that packet as though | protocol stack to process that packet as though it was received with | |||
it was received with the protocol in the GUE header. | the protocol in the GUE header. | |||
As an example, consider that a data message is received where GUE | As an example, consider that a data message is received where GUE | |||
encapsulates an IP packet. In this case the proto field in the GUE | encapsulates an IP packet. In this case the proto field in the GUE | |||
header is set to 94 for IPIP: | header is set to 94 for IPIP: | |||
+-------------------------------------+ | +-------------------------------------+ | |||
| IP header (next proto = 17,UDP) | | | IP header (next proto = 17,UDP) | | |||
|-------------------------------------| | |-------------------------------------| | |||
| UDP | | | UDP | | |||
|-------------------------------------| | |-------------------------------------| | |||
skipping to change at page 18, line 28 ¶ | skipping to change at page 18, line 28 ¶ | |||
If a valid control message is received, the packet MUST be processed | If a valid control message is received, the packet MUST be processed | |||
as a control message. The specific processing to be performed depends | as a control message. The specific processing to be performed depends | |||
on the ctype in the GUE header. | on the ctype in the GUE header. | |||
5.5. Router and switch operation | 5.5. Router and switch operation | |||
Routers and switches SHOULD forward GUE packets as standard UDP/IP | Routers and switches SHOULD forward GUE packets as standard UDP/IP | |||
packets. The outer five-tuple should contain sufficient information | packets. The outer five-tuple should contain sufficient information | |||
to perform flow classification corresponding to the flow of the inner | to perform flow classification corresponding to the flow of the inner | |||
packet. A switch does not normally need to parse a GUE header, and | packet. A router does not normally need to parse a GUE header, and | |||
none of the flags or extension fields in the GUE header are expected | none of the flags or extension fields in the GUE header are expected | |||
to affect routing. | to affect routing. In cases where the outer five-tuple does not | |||
provide sufficient entropy for flow classification, for instance UDP | ||||
ports are fixed to provide connection semantics (section 5.6.1), then | ||||
the encapsulated packet MAY be parsed to determine flow entropy. | ||||
A router MUST NOT modify a GUE header when forwarding a packet. It | A router MUST NOT modify a GUE header when forwarding a packet. It | |||
MAY encapsulate a GUE packet in another GUE packet, for instance to | MAY encapsulate a GUE packet in another GUE packet, for instance to | |||
implement a network tunnel (i.e. by encapsulating an IP packet with a | implement a network tunnel (i.e. by encapsulating an IP packet with a | |||
GUE payload in another IP packet as a GUE payload). In this case, the | GUE payload in another IP packet as a GUE payload). In this case, the | |||
router takes the role of an encapsulator, and the corresponding | router takes the role of an encapsulator, and the corresponding | |||
decapsulator is the logical endpoint of the tunnel. When | decapsulator is the logical endpoint of the tunnel. When | |||
encapsulating a GUE packet within another GUE packet, there are no | encapsulating a GUE packet within another GUE packet, there are no | |||
specified provisions to automatically copy GUE flags or fields to the | provisions to automatically copy GUE flags or fields to the outer GUE | |||
outer GUE header. Each layer of encapsulation is considered | header. Each layer of encapsulation is considered independent. | |||
independent. | ||||
5.6. Middlebox interactions | 5.6. Middlebox interactions | |||
A middlebox MAY interpret some flags and extension fields of the GUE | A middlebox MAY interpret some flags and extension fields of the GUE | |||
header for classification purposes, but is not required to understand | header for classification purposes, but is not required to understand | |||
any of the flags or extension fields in GUE packets. A middlebox | any of the flags or extension fields in GUE packets. A middlebox | |||
MUST NOT drop a GUE packet merely because there are flags unknown to | MUST NOT drop a GUE packet merely because there are flags unknown to | |||
it. The header length in the GUE header allows a middlebox to inspect | it. The header length in the GUE header allows a middlebox to inspect | |||
the payload packet without needing to parse the flags or extension | the payload packet without needing to parse the flags or extension | |||
fields. | fields. | |||
5.6.1. Inferring connection semantics | 5.6.1. Inferring connection semantics | |||
A middlebox might infer bidirectional connection semantics for a UDP | A middlebox might infer bidirectional connection semantics for a UDP | |||
flow. For instance, a stateful firewall might create a five-tuple | flow. For instance, a stateful firewall might create a five-tuple | |||
rule to match flows on egress, and a corresponding five-tuple rule | rule to match flows on egress, and a corresponding five-tuple rule | |||
for matching ingress packets where the roles of source and | for matching ingress packets where the roles of source and | |||
destination are reversed for the IP addresses and UDP port numbers. | destination are reversed for the IP addresses and UDP port numbers. | |||
To operate in this environment, a GUE tunnel SHOULD be configured to | To operate in this environment, a GUE tunnel should be configured to | |||
assume connected semantics defined by the UDP five-tuple and the use | assume connected semantics defined by the UDP five-tuple and the use | |||
of GUE encapsulation needs to be symmetric between both endpoints. | of GUE encapsulation needs to be symmetric between both endpoints. | |||
The source port set in the UDP header MUST be the destination port | The source port set in the UDP header MUST be the destination port | |||
the peer would set for replies. In this case the UDP source port for | the peer would set for replies. In this case the UDP source port for | |||
a tunnel would be a fixed value and not set to be flow entropy as | a tunnel would be a fixed value and not set to be flow entropy as | |||
described in section 5.11. | described in section 5.11. | |||
The selection of whether to make the UDP source port fixed or set to | The selection of whether to make the UDP source port fixed or set to | |||
a flow entropy value for each packet sent SHOULD be configurable for | a flow entropy value for each packet sent SHOULD be configurable for | |||
a tunnel. | a tunnel. The default MUST be to set the flow entropy value in the | |||
UDP source port. | ||||
5.6.2. NAT | 5.6.2. NAT | |||
IP address and port translation can be performed on the UDP/IP | IP address and port translation can be performed on the UDP/IP | |||
headers adhering to the requirements for NAT with UDP [RFC4787]. In | headers adhering to the requirements for NAT with UDP [RFC4787]. In | |||
the case of stateful NAT, connection semantics MUST be applied to a | the case of stateful NAT, connection semantics MUST be applied to a | |||
GUE tunnel as described in section 5.6.1. GUE endpoints MAY also | GUE tunnel as described in section 5.6.1. GUE endpoints MAY also | |||
invoke STUN [RFC5389] or ICE [RFC5245] to manage NAT port mappings | invoke STUN [RFC5389] or ICE [RFC5245] to manage NAT port mappings | |||
for encapsulations. | for encapsulations. | |||
skipping to change at page 20, line 22 ¶ | skipping to change at page 20, line 26 ¶ | |||
implementation considerations. The IPv4 header includes a checksum | implementation considerations. The IPv4 header includes a checksum | |||
that protects against mis-delivery of the packet due to corruption | that protects against mis-delivery of the packet due to corruption | |||
of IP addresses. The UDP checksum potentially provides protection | of IP addresses. The UDP checksum potentially provides protection | |||
against corruption of the UDP header, GUE header, and GUE payload. | against corruption of the UDP header, GUE header, and GUE payload. | |||
Enabling or disabling the use of checksums is a deployment | Enabling or disabling the use of checksums is a deployment | |||
consideration that should take into account the risk and effects of | consideration that should take into account the risk and effects of | |||
packet corruption, and whether the packets in the network are | packet corruption, and whether the packets in the network are | |||
already adequately protected by other, possibly stronger mechanisms | already adequately protected by other, possibly stronger mechanisms | |||
such as the Ethernet CRC. If an encapsulator sets a zero UDP | such as the Ethernet CRC. If an encapsulator sets a zero UDP | |||
checksum for IPv4, it SHOULD use the GUE header checksum as | checksum for IPv4, it SHOULD use the GUE header checksum as | |||
described in [GUEEXTENS]. | described in [GUEEXTENS] assuming there are no other mechanisms used | |||
to protect the GUE packet. | ||||
When a decapsulator receives a packet, the UDP checksum field MUST | When a decapsulator receives a packet, the UDP checksum field MUST | |||
be processed. If the UDP checksum is non-zero, the decapsulator MUST | be processed. If the UDP checksum is non-zero, the decapsulator MUST | |||
verify the checksum before accepting the packet. By default, a | verify the checksum before accepting the packet. By default, a | |||
decapsulator SHOULD accept UDP packets with a zero checksum. A node | decapsulator SHOULD accept UDP packets with a zero checksum. A node | |||
MAY be configured to disallow zero checksums per [RFC1122]. | MAY be configured to disallow zero checksums per [RFC1122]. | |||
Configuration of zero checksums can be selective. For instance, zero | Configuration of zero checksums can be selective. For instance, zero | |||
checksums might be disallowed from certain hosts that are known to | checksums might be disallowed from certain hosts that are known to | |||
be sending over paths subject to packet corruption. If verification | be sending over paths subject to packet corruption. If verification | |||
of a non-zero checksum fails, a decapsulator lacks the capability to | of a non-zero checksum fails, a decapsulator lacks the capability to | |||
verify a non-zero checksum, or a packet with a zero checksum was | verify a non-zero checksum, or a packet with a zero checksum was | |||
received and the decapsulator is configured to disallow zero | received and the decapsulator is configured to disallow zero | |||
checksums, the packet MUST be dropped. | checksums, the packet MUST be dropped. | |||
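The receive-side rules for the UDP checksum with IPv4 can be summarized as a small decision function. This is a sketch under stated assumptions, not text from the draft: the function name and parameters are hypothetical, and checksum_valid is assumed to be True or False when the decapsulator verified a non-zero checksum and None when it lacks the capability to verify one.

```python
from typing import Optional


def accept_ipv4_gue_packet(udp_checksum: int,
                           checksum_valid: Optional[bool],
                           allow_zero_checksum: bool = True) -> bool:
    """Return True if the packet may be accepted, False if it is dropped.

    udp_checksum        -- value of the UDP checksum field (0 means unused)
    checksum_valid      -- verification result; None if the decapsulator
                           cannot verify non-zero checksums (assumption)
    allow_zero_checksum -- local policy; accepting is the default
    """
    if udp_checksum == 0:
        # Zero checksum: accepted unless configuration disallows it.
        return allow_zero_checksum
    # Non-zero checksum: it must have been verified successfully.
    return checksum_valid is True
```

The allow_zero_checksum argument could be chosen per sender, which corresponds to the selective configuration described above.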
5.7.3. UDP Checksum with IPv6 | 5.7.3. UDP Checksum with IPv6 | |||
In IPv6, there is no checksum in the IPv6 header that protects | In IPv6, there is no checksum in the IPv6 header that protects | |||
against mis-delivery due to address corruption. Therefore, when GUE | against mis-delivery due to address corruption. Therefore, when GUE | |||
is used over IPv6, either the UDP checksum or the GUE header | is used over IPv6, either the UDP checksum or the GUE header | |||
checksum SHOULD be used. The UDP checksum and GUE header checksum | checksum SHOULD be used unless there are alternative mechanisms in | |||
SHOULD NOT be used at the same time since that would be mostly | use that protect against misdelivery. The UDP checksum and GUE | |||
redundant. | header checksum SHOULD NOT be used at the same time since that would | |||
be mostly redundant. | ||||
If neither the UDP checksum nor the GUE header checksum is used, then | If neither the UDP checksum nor the GUE header checksum is used, then | |||
the requirements for using zero IPv6 UDP checksums in [RFC6935] and | the requirements for using zero IPv6 UDP checksums in [RFC6935] and | |||
[RFC6936] MUST be met. | [RFC6936] MUST be met. | |||
When a decapsulator receives a packet, the UDP checksum field MUST | When a decapsulator receives a packet, the UDP checksum field MUST | |||
be processed. If the UDP checksum is non-zero, the decapsulator MUST | be processed. If the UDP checksum is non-zero, the decapsulator MUST | |||
verify the checksum before accepting the packet. By default a | verify the checksum before accepting the packet. By default a | |||
decapsulator MUST only accept UDP packets with a zero checksum if | decapsulator MUST only accept UDP packets with a zero checksum if | |||
the GUE header checksum is used and is verified. If verification of | the GUE header checksum is used and is verified. If verification of | |||
skipping to change at page 21, line 24 ¶ | skipping to change at page 21, line 29 ¶ | |||
and fragmentation in conjunction with networking tunnels | and fragmentation in conjunction with networking tunnels | |||
(encapsulation of layer 2 or layer 3 packets) SHOULD be followed. | (encapsulation of layer 2 or layer 3 packets) SHOULD be followed. | |||
Details are described in MTU and Fragmentation Issues with In-the- | Details are described in MTU and Fragmentation Issues with In-the- | |||
Network Tunneling [RFC4459]. | Network Tunneling [RFC4459]. | |||
If a packet is fragmented before encapsulation in GUE, all the | If a packet is fragmented before encapsulation in GUE, all the | |||
related fragments MUST be encapsulated using the same UDP source | related fragments MUST be encapsulated using the same UDP source | |||
port. An operator SHOULD set MTU to account for encapsulation | port. An operator SHOULD set MTU to account for encapsulation | |||
overhead and reduce the likelihood of fragmentation. | overhead and reduce the likelihood of fragmentation. | |||
Alternatively to IP fragmentation, the GUE fragmentation extension | Alternative to IP fragmentation, the GUE fragmentation extension can | |||
can be used. GUE fragmentation is described in [GUEEXTENS]. | be used. GUE fragmentation is described in [GUEEXTENS]. | |||
5.9. Congestion control | 5.9. Congestion control | |||
Per requirements of [RFC5405], if the IP traffic encapsulated with | Per requirements of [RFC5405], if the IP traffic encapsulated with | |||
GUE implements proper congestion control, no additional mechanisms | GUE implements proper congestion control, no additional mechanisms | |||
should be required. | should be required. | |||
In the case that the encapsulated traffic does not implement any or | In the case that the encapsulated traffic does not implement any or | |||
sufficient control, or it is not known whether a transmitter will | sufficient control, or it is not known whether a transmitter will | |||
consistently implement proper congestion control, then congestion | consistently implement proper congestion control, then congestion | |||
skipping to change at page 24, line 4 ¶ | skipping to change at page 24, line 11 ¶ | |||
MAY use the value to match further receive packets for steering | MAY use the value to match further receive packets for steering | |||
decisions, but MUST NOT assume that the hash uniquely or | decisions, but MUST NOT assume that the hash uniquely or | |||
permanently identifies a flow. | permanently identifies a flow. | |||
o Input to the flow entropy calculation is not restricted to ports | o Input to the flow entropy calculation is not restricted to ports | |||
and addresses; input could include flow label from an IPv6 | and addresses; input could include flow label from an IPv6 | |||
packet, SPI from an ESP packet, or other flow related state in | packet, SPI from an ESP packet, or other flow related state in | |||
the encapsulator that is not necessarily conveyed in the packet. | the encapsulator that is not necessarily conveyed in the packet. | |||
o The assignment function for flow entropy SHOULD be randomly | o The assignment function for flow entropy SHOULD be randomly | |||
seeded to mitigate denial of service attacks. The seed may be | seeded to mitigate denial of service attacks. The seed SHOULD be | |||
changed periodically. | changed periodically. | |||
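One way to derive a flow entropy value for the UDP source port with a randomly seeded function is sketched below. This is an illustration only: the draft does not mandate a particular hash, the choice of keyed BLAKE2b is an assumption, the port range 49152-65535 is an assumed ephemeral range, and all names (SEED, flow_entropy_port) are hypothetical.

```python
import hashlib
import os
import struct

# Random seed chosen at startup (assumption: 16 bytes). It could be
# regenerated periodically, at the cost of flows changing paths.
SEED = os.urandom(16)

# Assumed ephemeral port range for the outer UDP source port.
PORT_MIN, PORT_MAX = 49152, 65535


def flow_entropy_port(src_ip: bytes, dst_ip: bytes, proto: int,
                      sport: int, dport: int, seed: bytes = SEED) -> int:
    """Map the inner flow identifiers to an outer UDP source port.

    The same inputs and seed always yield the same port, so all packets
    (and all fragments) of one inner flow share a source port.
    """
    data = src_ip + dst_ip + struct.pack("!BHH", proto, sport, dport)
    digest = hashlib.blake2b(data, digest_size=4, key=seed).digest()
    value = int.from_bytes(digest, "big")
    return PORT_MIN + value % (PORT_MAX - PORT_MIN + 1)
```

Other inputs, such as an IPv6 flow label or an ESP SPI, could be appended to the hashed data in the same way.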
5.12 Negotiation of acceptable flags and extension fields | 5.12 Negotiation of acceptable flags and extension fields | |||
An encapsulator and decapsulator need to achieve agreement about GUE | An encapsulator and decapsulator need to achieve agreement about GUE | |||
parameters that will be used in communications. Parameters include GUE | parameters that will be used in communications. Parameters include GUE | |||
version, flags and extension fields that can be used, security | version, flags and extension fields that can be used, security | |||
algorithms and keys, supported protocols and control messages, etc. | algorithms and keys, supported protocols and control messages, etc. | |||
This document proposes different general methods to accomplish this, | This document proposes different general methods to accomplish this, | |||
however the details of implementing these are considered out of | however the details of implementing these are considered out of | |||
scope. | scope. | |||
Possible negotiation methods are: | General methods for this are: | |||
o Configuration. The parameters used for a tunnel are configured | o Configuration. The parameters used for a tunnel are configured | |||
at each endpoint. | at each endpoint. | |||
o Negotiation. A tunnel negotiation can be performed. This could | o Negotiation. A tunnel negotiation can be performed. This could | |||
be accomplished in-band of GUE using control messages or private | be accomplished in-band of GUE using control messages or private | |||
data. | data. | |||
o Via a control plane. Parameters for communicating with a tunnel | o Via a control plane. Parameters for communicating with a tunnel | |||
endpoint can be set in a control plane protocol (such as that | endpoint can be set in a control plane protocol (such as that | |||
skipping to change at page 24, line 45 ¶ | skipping to change at page 24, line 52 ¶ | |||
This section presents the motivation for GUE with respect to other | This section presents the motivation for GUE with respect to other | |||
encapsulation methods. | encapsulation methods. | |||
6.1. Benefits of GUE | 6.1. Benefits of GUE | |||
* GUE is a generic encapsulation protocol. GUE can encapsulate | * GUE is a generic encapsulation protocol. GUE can encapsulate | |||
protocols that are represented by an IP protocol number. This | protocols that are represented by an IP protocol number. This | |||
includes layer 2, layer 3, and layer 4 protocols. | includes layer 2, layer 3, and layer 4 protocols. | |||
* GUE is an extensible encapsulation protocol. Standard optional | * GUE is an extensible encapsulation protocol. Standardized | |||
data such as security, virtual networking identifiers, | optional data such as security, virtual networking identifiers, | |||
and fragmentation are being defined. | and fragmentation are being defined. | |||
* For extensibility, GUE uses flag fields as opposed to TLVs as | * For extensibility, GUE uses flag fields as opposed to TLVs as | |||
some other encapsulation protocols do. Flag fields are strictly | some other encapsulation protocols do. Flag fields are strictly | |||
ordered, allow random access, and an efficient use of header | ordered, allow random access, and are efficient in use of header | |||
space. | space. | |||
* GUE allows private data to be sent as part of the encapsulation. | * GUE allows private data to be sent as part of the encapsulation. | |||
This permits experimentation or customization in deployment. | This permits experimentation or customization in deployment. | |||
* GUE allows sending of control messages such as OAM using the | * GUE allows sending of control messages such as OAM using the | |||
same GUE header format (for routing purposes) as normal data | same GUE header format (for routing purposes) as normal data | |||
messages. | messages. | |||
* GUE maximizes deliverability of non-UDP and non-TCP protocols. | * GUE maximizes deliverability of non-UDP and non-TCP protocols. | |||
skipping to change at page 25, line 32 ¶ | skipping to change at page 25, line 38 ¶ | |||
provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784], | provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784], | |||
MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling | MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling | |||
layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN | layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN | |||
[RFC7348] are proposals for encapsulation of layer 2 packets for | [RFC7348] are proposals for encapsulation of layer 2 packets for | |||
network virtualization. IPIP [RFC2003] and Generic packet tunneling | network virtualization. IPIP [RFC2003] and Generic packet tunneling | |||
in IPv6 [RFC2473] provide methods for tunneling IP packets over IP. | in IPv6 [RFC2473] provide methods for tunneling IP packets over IP. | |||
Several proposals exist for encapsulating packets over UDP including | Several proposals exist for encapsulating packets over UDP including | |||
ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN | ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN | |||
[RFC7348], LISP [RFC6830] which encapsulates layer 3 packets, | [RFC7348], LISP [RFC6830] which encapsulates layer 3 packets, | |||
MPLS/UDP [RFC7510], and Generic UDP Encapsulation for IP Tunneling | MPLS/UDP [7510], and Generic UDP Encapsulation for IP Tunneling (GRE | |||
(GRE over UDP)[RFC8086]. Generic UDP tunneling [GUT] is a proposal | over UDP)[RFC8086]. Generic UDP tunneling [GUT] is a proposal similar | |||
similar to GUE in that it aims to tunnel packets of IP protocols over | to GUE in that it aims to tunnel packets of IP protocols over UDP. | |||
UDP. | ||||
GUE has the following discriminating features: | GUE has the following discriminating features: | |||
o UDP encapsulation leverages specialized network device | o UDP encapsulation leverages specialized network device | |||
processing for efficient transport. The semantics for using the | processing for efficient transport. The semantics for using the | |||
UDP source port for flow entropy as input to ECMP are defined in | UDP source port for flow entropy as input to ECMP are defined in | |||
section 5.11. | section 5.11. | |||
o GUE permits encapsulation of arbitrary IP protocols, which | o GUE permits encapsulation of arbitrary IP protocols, which | |||
includes layer 2, 3, and 4 protocols. | includes layer 2, 3, and 4 protocols. | |||
skipping to change at page 26, line 21 ¶ | skipping to change at page 26, line 27 ¶ | |||
to parse the full encapsulation header. | to parse the full encapsulation header. | |||
o Private data in the encapsulation header allows local | o Private data in the encapsulation header allows local | |||
customization and experimentation while being compatible with | customization and experimentation while being compatible with | |||
processing in network nodes (routers and middleboxes). | processing in network nodes (routers and middleboxes). | |||
o GUE includes both data messages (encapsulation of packets) and | o GUE includes both data messages (encapsulation of packets) and | |||
control messages (such as OAM). | control messages (such as OAM). | |||
o The flags-field model facilitates efficient implementation of | o The flags-field model facilitates efficient implementation of | |||
extensibility in hardware. For example, a TCAM can be used to | extensibility in hardware. | |||
parse a known set of N flags where the number of entries in the | ||||
TCAM is 2^N. By contrast, the number of TCAM entries needed to | For instance a TCAM can be used to parse a known set of N flags | |||
parse a set of N arbitrarily ordered TLVs is approximately e*N!. | where the number of entries in the TCAM is 2^N. | |||
For comparison, the number of TCAM entries needed to parse a set | ||||
of N arbitrarily ordered TLVs is approximately e*N!. | ||||
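To give a sense of scale for the two estimates above, the following sketch evaluates 2^N and e*N! for a given N. The function names are hypothetical, and the TLV figure is rounded because the text states it only as an approximation.

```python
import math


def tcam_entries_flags(n: int) -> int:
    """Entries needed to match a known set of n flags: 2^n."""
    return 2 ** n


def tcam_entries_tlvs(n: int) -> int:
    """Approximate entries needed for n arbitrarily ordered TLVs: e * n!."""
    return round(math.e * math.factorial(n))
```

For N = 4 this gives 16 entries for flags against roughly 65 for TLVs, and the gap widens quickly because the factorial grows much faster than the power of two.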
7. Security Considerations | 7. Security Considerations | |||
There are two important considerations of security with respect to | There are two important considerations of security with respect to | |||
GUE. | GUE. | |||
o Authentication and integrity of the GUE header. | o Authentication and integrity of the GUE header. | |||
o Authentication, integrity, and confidentiality of the GUE | o Authentication, integrity, and confidentiality of the GUE | |||
payload. | payload. | |||
GUE security is provided by extensions for security defined in | GUE security is provided by extensions for security defined in | |||
[GUEEXTENS]. These extensions include methods to authenticate the GUE | [GUEEXTENS]. These extensions include methods to authenticate the GUE | |||
header and encrypt the GUE payload. | header and encrypt the GUE payload. | |||
The GUE header can be authenticated using a security extension for an | The GUE header can be authenticated using a security extension for an | |||
HMAC. Securing the GUE payload can be accomplished by use of the GUE | HMAC. Securing the GUE payload can be accomplished by use of the GUE | |||
Payload Transform that can provide DTLS [RFC6347] in the payload of a | Payload Transform. This extension can be used to perform DTLS in the | |||
GUE packet to encrypt the payload. | payload of a GUE packet to encrypt the payload. | |||
A hash function for computing flow entropy (section 5.11) SHOULD be | A hash function for computing flow entropy (section 5.11) SHOULD be | |||
randomly seeded to mitigate some possible denial of service attacks. | randomly seeded to mitigate some possible denial of service attacks. | |||
8. IANA Considerations | 8. IANA Consideration | |||
8.1. UDP source port | 8.1. UDP source port | |||
A user UDP port number has been assigned for GUE: | A user UDP port number has been assigned for GUE: | |||
Service Name: gue | Service Name: gue | |||
Transport Protocol(s): UDP | Transport Protocol(s): UDP | |||
Assignee: Tom Herbert <therbert@google.com> | Assignee: Tom Herbert <tom@herbertland.com> | |||
Contact: Tom Herbert <therbert@google.com> | Contact: Tom Herbert <tom@herbertland.com> | |||
Description: Generic UDP Encapsulation | Description: Generic UDP Encapsulation | |||
Reference: draft-herbert-gue | Reference: draft-herbert-gue | |||
Port Number: 6080 | Port Number: 6080 | |||
Service Code: N/A | Service Code: N/A | |||
Known Unauthorized Uses: N/A | Known Unauthorized Uses: N/A | |||
Assignment Notes: N/A | Assignment Notes: N/A | |||
8.2. GUE version number | 8.2. GUE version number | |||
IANA is requested to set up a registry for the GUE version number. | IANA is requested to set up a registry for the GUE version number. | |||
skipping to change at page 29, line 46 ¶ | skipping to change at page 29, line 46 ¶ | |||
valuable input on this draft. | valuable input on this draft. | |||
10. References | 10. References | |||
10.1. Normative References | 10.1. Normative References | |||
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI | [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI | |||
10.17487/RFC0768, August 1980, <http://www.rfc- | 10.17487/RFC0768, August 1980, <http://www.rfc- | |||
editor.org/info/rfc768>. | editor.org/info/rfc768>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | |||
Requirement Levels", BCP 14, RFC 2119, DOI | Communication Layers", STD 3, RFC 1122, DOI | |||
10.17487/RFC2119, March 1997, <http://www.rfc- | 10.17487/RFC1122, October 1989, <http://www.rfc- | |||
editor.org/info/rfc2119>. | editor.org/info/rfc1122>. | |||
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an | |||
(IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | IANA Considerations Section in RFCs", RFC 2434, DOI | |||
December 1998, <http://www.rfc-editor.org/info/rfc2460>. | 10.17487/RFC2434, October 1998, <http://www.rfc- | |||
editor.org/info/rfc2434>. | ||||
[RFC2983] Black, D., "Differentiated Services and Tunnels", RFC | [RFC2983] Black, D., "Differentiated Services and Tunnels", RFC | |||
2983, DOI 10.17487/RFC2983, October 2000, <http://www.rfc- | 2983, DOI 10.17487/RFC2983, October 2000, <http://www.rfc- | |||
editor.org/info/rfc2983>. | editor.org/info/rfc2983>. | |||
[RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion | [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion | |||
Notification", RFC 6040, DOI 10.17487/RFC6040, November | Notification", RFC 6040, DOI 10.17487/RFC6040, November | |||
2010, <http://www.rfc-editor.org/info/rfc6040>. | 2010, <http://www.rfc-editor.org/info/rfc6040>. | |||
[RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and | [RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and | |||
skipping to change at page 30, line 31 ¶ | skipping to change at page 30, line 32 ¶ | |||
[RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement | [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement | |||
for the Use of IPv6 UDP Datagrams with Zero Checksums", | for the Use of IPv6 UDP Datagrams with Zero Checksums", | |||
RFC 6936, DOI 10.17487/RFC6936, April 2013, | RFC 6936, DOI 10.17487/RFC6936, April 2013, | |||
<http://www.rfc-editor.org/info/rfc6936>. | <http://www.rfc-editor.org/info/rfc6936>. | |||
[RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- | [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- | |||
Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April | Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April | |||
2006, <http://www.rfc-editor.org/info/rfc4459>. | 2006, <http://www.rfc-editor.org/info/rfc4459>. | |||
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | 10.2. Informative References | |||
Communication Layers", STD 3, RFC 1122, DOI | ||||
10.17487/RFC1122, October 1989, <http://www.rfc- | ||||
editor.org/info/rfc1122>. | ||||
[RFC6335] Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S. | [RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., | |||
Cheshire, "Internet Assigned Numbers Authority (IANA) | and G. Fairhurst, Ed., "The Lightweight User Datagram | |||
Procedures for the Management of the Service Name and | Protocol (UDP-Lite)", RFC 3828, July 2004, | |||
Transport Protocol Port Number Registry", BCP 165, RFC | <http://www.rfc-editor.org/info/rfc3828>. | |||
6335, DOI 10.17487/RFC6335, August 2011, <http://www.rfc- | ||||
editor.org/info/rfc6335>. | ||||
10.2. Informative References | [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, | |||
L., Sridhar, T., Bursell, M., and C. Wright, "Virtual | ||||
eXtensible Local Area Network (VXLAN): A Framework for | ||||
Overlaying Virtualized Layer 2 Networks over Layer 3 | ||||
Networks", RFC 7348, August 2014, <http://www.rfc- | ||||
editor.org/info/rfc7348>. | ||||
[RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path | [RFC7605] Touch, J., "Recommendations on Using Assigned Transport | |||
Algorithm", RFC 2992, DOI 10.17487/RFC2992, November 2000, | Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, | |||
<http://www.rfc-editor.org/info/rfc2992>. | August 2015, <http://www.rfc-editor.org/info/rfc7605>. | |||
[RFC7637] Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network | ||||
Virtualization Using Generic Routing Encapsulation", RFC | ||||
7637, DOI 10.17487/RFC7637, September 2015, | ||||
<http://www.rfc-editor.org/info/rfc7637>. | ||||
[RFC8086] Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE- | ||||
in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086, | ||||
March 2017, <http://www.rfc-editor.org/info/rfc8086>. | ||||
[RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, | ||||
"Encapsulating MPLS in UDP", RFC 7510, DOI | ||||
10.17487/RFC7510, April 2015, <http://www.rfc- | ||||
editor.org/info/rfc7510>. | ||||
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | ||||
Congestion Control Protocol (DCCP)", RFC 4340, DOI | ||||
10.17487/RFC4340, March 2006, <http://www.rfc- | ||||
editor.org/info/rfc4340>. | ||||
[RFC4787] Audet, F., Ed., and C. Jennings, "Network Address | [RFC4787] Audet, F., Ed., and C. Jennings, "Network Address | |||
Translation (NAT) Behavioral Requirements for Unicast | Translation (NAT) Behavioral Requirements for Unicast | |||
UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January | UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January | |||
2007, <http://www.rfc-editor.org/info/rfc4787>. | 2007, <http://www.rfc-editor.org/info/rfc4787>. | |||
[RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, | [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, | |||
"Session Traversal Utilities for NAT (STUN)", RFC 5389, | "Session Traversal Utilities for NAT (STUN)", RFC 5389, | |||
DOI 10.17487/RFC5389, October 2008, <http://www.rfc- | DOI 10.17487/RFC5389, October 2008, <http://www.rfc- | |||
editor.org/info/rfc5389>. | editor.org/info/rfc5389>. | |||
[RFC5245] Rosenberg, J., "Interactive Connectivity Establishment | [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment | |||
(ICE): A Protocol for Network Address Translator (NAT) | (ICE): A Protocol for Network Address Translator (NAT) | |||
Traversal for Offer/Answer Protocols", RFC 5245, DOI | Traversal for Offer/Answer Protocols", RFC 5245, DOI | |||
10.17487/RFC5245, April 2010, <http://www.rfc- | 10.17487/RFC5245, April 2010, <http://www.rfc- | |||
editor.org/info/rfc5245>. | editor.org/info/rfc5245>. | |||
[RFC8086] Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE- | ||||
in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086, | ||||
March 2017, <http://www.rfc-editor.org/info/rfc8086>. | ||||
[RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines | [RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines | |||
for Application Designers", BCP 145, RFC 5405, DOI | for Application Designers", BCP 145, RFC 5405, DOI | |||
10.17487/RFC5405, November 2008, <http://www.rfc- | 10.17487/RFC5405, November 2008, <http://www.rfc- | |||
editor.org/info/rfc5405>. | editor.org/info/rfc5405>. | |||
[RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label | [RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label | |||
for Equal Cost Multipath Routing and Link Aggregation in | for Equal Cost Multipath Routing and Link Aggregation in | |||
Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011, | Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011, | |||
<http://www.rfc-editor.org/info/rfc6438>. | <http://www.rfc-editor.org/info/rfc6438>. | |||
[RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI | ||||
10.17487/RFC2003, October 1996, <http://www.rfc- | ||||
editor.org/info/rfc2003>. | ||||
[RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M. | ||||
Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC | ||||
3948, DOI 10.17487/RFC3948, January 2005, <http://www.rfc- | ||||
editor.org/info/rfc3948>. | ||||
[RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The | ||||
Locator/ID Separation Protocol (LISP)", RFC 6830, DOI | ||||
10.17487/RFC6830, January 2013, <http://www.rfc- | ||||
editor.org/info/rfc6830>. | ||||
[RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling | [RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling | |||
Ethernet Frames in IP Datagrams", RFC 3378, DOI | Ethernet Frames in IP Datagrams", RFC 3378, DOI | |||
10.17487/RFC3378, September 2002, <http://www.rfc- | 10.17487/RFC3378, September 2002, <http://www.rfc- | |||
editor.org/info/rfc3378>. | editor.org/info/rfc3378>. | |||
[RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. | [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. | |||
Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, | Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, | |||
DOI 10.17487/RFC2784, March 2000, <http://www.rfc- | DOI 10.17487/RFC2784, March 2000, <http://www.rfc- | |||
editor.org/info/rfc2784>. | editor.org/info/rfc2784>. | |||
[RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., | [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., | |||
"Encapsulating MPLS in IP or Generic Routing Encapsulation | "Encapsulating MPLS in IP or Generic Routing Encapsulation | |||
(GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, | (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, | |||
<http://www.rfc-editor.org/info/rfc4023>. | <http://www.rfc-editor.org/info/rfc4023>. | |||
[RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, | [RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, | |||
G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", | G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", | |||
RFC 2661, DOI 10.17487/RFC2661, August 1999, | RFC 2661, DOI 10.17487/RFC2661, August 1999, | |||
<http://www.rfc-editor.org/info/rfc2661>. | <http://www.rfc-editor.org/info/rfc2661>. | |||
[RFC7637] Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network | ||||
Virtualization Using Generic Routing Encapsulation", RFC | ||||
7637, DOI 10.17487/RFC7637, September 2015, | ||||
<http://www.rfc-editor.org/info/rfc7637>. | ||||
[RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, | ||||
L., Sridhar, T., Bursell, M., and C. Wright, "Virtual | ||||
eXtensible Local Area Network (VXLAN): A Framework for | ||||
Overlaying Virtualized Layer 2 Networks over Layer 3 | ||||
Networks", RFC 7348, August 2014, <http://www.rfc- | ||||
editor.org/info/rfc7348>. | ||||
[RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI | ||||
10.17487/RFC2003, October 1996, <http://www.rfc- | ||||
editor.org/info/rfc2003>. | ||||
[RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in | ||||
IPv6 Specification", RFC 2473, DOI 10.17487/RFC2473, | ||||
December 1998, <http://www.rfc-editor.org/info/rfc2473>. | ||||
[RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M. | ||||
Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC | ||||
3948, DOI 10.17487/RFC3948, January 2005, <http://www.rfc- | ||||
editor.org/info/rfc3948>. | ||||
[RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The | ||||
Locator/ID Separation Protocol (LISP)", RFC 6830, DOI | ||||
10.17487/RFC6830, January 2013, <http://www.rfc- | ||||
editor.org/info/rfc6830>. | ||||
[RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, | ||||
"Encapsulating MPLS in UDP", RFC 7510, DOI | ||||
10.17487/RFC7510, April 2015, <http://www.rfc- | ||||
editor.org/info/rfc7510>. | ||||
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | ||||
Congestion Control Protocol (DCCP)", RFC 4340, DOI | ||||
10.17487/RFC4340, March 2006, <http://www.rfc- | ||||
editor.org/info/rfc4340>. | ||||
[GUEEXTENS] Herbert, T., Yong, L., and Templin, F., "Extensions for | [GUEEXTENS] Herbert, T., Yong, L., and Templin, F., "Extensions for | |||
Generic UDP Encapsulation" draft-herbert-gue-extensions-00 | Generic UDP Encapsulation" draft-herbert-gue-extensions-00 | |||
[GUE4NVO3] Yong, L., Herbert, T., Zia, O., "Generic UDP | [GUE4NVO3] Yong, L., Herbert, T., Zia, O., "Generic UDP | |||
Encapsulation (GUE) for Network Virtualization Overlay" | Encapsulation (GUE) for Network Virtualization Overlay" | |||
draft-hy-nvo3-gue-4-nvo-03 | draft-hy-nvo3-gue-4-nvo-03 | |||
[TOU] Herbert, T., "Transport layer protocols over UDP" draft- | [GUESEC] Yong, L., Herbert, T., "Generic UDP Encapsulation (GUE) for | |||
herbert-transports-over-udp-00 | Secure Transport" draft-hy-gue-4-secure-transport-03 | |||
[CIRCBRK] Fairhurst, G., "Network Transport Circuit Breakers", | ||||
[TCPUDP] Chesire, S., Graessley, J., and McGuire, R., | [TCPUDP] Chesire, S., Graessley, J., and McGuire, R., | |||
"Encapsulation of TCP and other Transport Protocols over | "Encapsulation of TCP and other Transport Protocols over | |||
UDP" draft-cheshire-tcp-over-udp-00 | UDP" draft-cheshire-tcp-over-udp-00 | |||
[TOU] Herbert, T., "Transport layer protocols over UDP" draft- | ||||
herbert-transports-over-udp-00 | ||||
[GUT] Manner, J., Varia, N., and Briscoe, B., "Generic UDP | [GUT] Manner, J., Varia, N., and Briscoe, B., "Generic UDP | |||
Tunnelling (GUT) draft-manner-tsvwg-gut-02.txt" | Tunnelling (GUT) draft-manner-tsvwg-gut-02.txt" | |||
[CIRCBRK] Fairhurst, G., "Network Transport Circuit Breakers", | ||||
draft-ietf-tsvwg-circuit-breaker-15 | ||||
[LCO] Cree, E., https://www.kernel.org/doc/Documentation/ | [LCO] Cree, E., https://www.kernel.org/doc/Documentation/ | |||
networking/checksum-offloads.txt | networking/checksum-offloads.txt | |||
Appendix A: NIC processing for GUE | Appendix A: NIC processing for GUE | |||
This appendix provides some guidelines for Network Interface Cards | This appendix provides some guidelines for Network Interface Cards | |||
(NICs) to implement common offloads and accelerations to support GUE. | (NICs) to implement common offloads and accelerations to support GUE. | |||
Note that most of this discussion is generally applicable to other | Note that most of this discussion is generally applicable to other | |||
methods of UDP based encapsulation. | methods of UDP based encapsulation. | |||
skipping to change at page 33, line 43 ¶ | skipping to change at page 33, line 39 ¶ | |||
GUE encapsulation is compatible with multi-queue NICs that support | GUE encapsulation is compatible with multi-queue NICs that support | |||
five-tuple hash calculation for UDP/IP packets as input to RSS. The | five-tuple hash calculation for UDP/IP packets as input to RSS. The | |||
flow entropy in the UDP source port ensures classification of the | flow entropy in the UDP source port ensures classification of the | |||
encapsulated flow even in the case that the outer source and | encapsulated flow even in the case that the outer source and | |||
destination addresses are the same for all flows (e.g. all flows are | destination addresses are the same for all flows (e.g. all flows are | |||
going over a single tunnel). | going over a single tunnel). | |||
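The effect of the source port entropy on queue selection can be sketched as follows. This is a minimal illustration, not a NIC implementation: SHA-256 stands in for whatever hash function the NIC actually uses, and the function name rss_queue, the queue count, the addresses, and the destination port value are assumptions made for the example.

```python
import hashlib
import socket
import struct

GUE_PORT = 6080  # illustrative destination port for the tunnel


def rss_queue(src_ip, dst_ip, src_port, dst_port, num_queues=8, proto=17):
    """Pick a receive queue from the outer UDP/IPv4 five-tuple.

    SHA-256 is only a stand-in for the NIC's real hash function.
    """
    key = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
           + struct.pack("!BHH", proto, src_port, dst_port))
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_queues


# Single tunnel: outer addresses and destination port are identical for
# every flow, so only the source port (the flow entropy) varies.
queues_used = {
    rss_queue("192.0.2.1", "198.51.100.2", sport, GUE_PORT)
    for sport in range(49152, 49152 + 1000)
}
print(sorted(queues_used))
```

Because the source port is part of the hash input, flows sharing one tunnel are still spread over several queues, while all packets of one flow map to the same queue.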
By default, UDP RSS support is often disabled in NICs to avoid out- | By default, UDP RSS support is often disabled in NICs to avoid out- | |||
of-order reception that can occur when UDP packets are fragmented. As | of-order reception that can occur when UDP packets are fragmented. As | |||
discussed above, fragmentation of GUE packets is mostly avoided by | discussed above, fragmentation of GUE packets is mostly avoided by | |||
fragmenting packets before entering a tunnel, GUE fragmentation, path | fragmenting packets before entering a tunnel, GUE fragmentation, path | |||
MTU discovery in higher layer protocols, or operator adjusting MTUs. | MTU discovery in higher layer protocols, or operator adjusting MTUs. | |||
Other UDP traffic might not implement such procedures to avoid | Other UDP traffic might not implement such procedures to avoid | |||
fragmentation, so enabling UDP RSS support in the NIC might be a | fragmentation, so enabling UDP RSS support in the NIC might be a | |||
considered tradeoff during configuration. | considered tradeoff during configuration. | |||
A.2. Checksum offload | A.2. Checksum offload | |||
Many NICs provide capabilities to calculate standard ones complement | Many NICs provide capabilities to calculate standard ones complement | |||
payload checksum for packets in transmit or receive. When using GUE | payload checksum for packets in transmit or receive. When using GUE | |||
encapsulation, there are at least two checksums that are of interest: | encapsulation, there are at least two checksums that are of interest: | |||
the encapsulated packet's transport checksum, and the UDP checksum in | the encapsulated packet's transport checksum, and the UDP checksum in | |||
the outer header. | the outer header. | |||
A.2.1. Transmit checksum offload | A.2.1. Transmit checksum offload | |||
NICs can provide a protocol agnostic method to offload transmit | NICs can provide a protocol agnostic method to offload transmit | |||
checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with | checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with | |||
skipping to change at page 35, line 26 ¶ | skipping to change at page 35, line 21 ¶ | |||
checksum-complete value for the UDP packet is the "not" of the pseudo | checksum-complete value for the UDP packet is the "not" of the pseudo | |||
header checksum. In this way, checksum-unnecessary can be converted | header checksum. In this way, checksum-unnecessary can be converted | |||
to checksum-complete. So, if the NIC provides checksum-unnecessary | to checksum-complete. So, if the NIC provides checksum-unnecessary | |||
for the outer UDP header in an encapsulation, checksum conversion can | for the outer UDP header in an encapsulation, checksum conversion can | |||
be done so that the checksum-complete value is derived and can be | be done so that the checksum-complete value is derived and can be | |||
used by the stack to validate checksums in the encapsulated packet. | used by the stack to validate checksums in the encapsulated packet. | |||
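The conversion described above can be checked with a small sketch. It assumes an IPv4 outer header and a UDP checksum that is non-zero and that the NIC has already verified (checksum-unnecessary). The helper names csum16, pseudo_header, build_udp, and unnecessary_to_complete are hypothetical and exist only for this illustration.

```python
import socket
import struct


def csum16(data: bytes) -> int:
    """Folded 16-bit ones' complement sum of data (not inverted)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, udp_len: int) -> bytes:
    """IPv4 pseudo header for UDP (protocol 17)."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!BBH", 0, 17, udp_len))


def build_udp(src, dst, sport, dport, payload: bytes) -> bytes:
    """Build a UDP header plus payload with a valid checksum."""
    length = 8 + len(payload)
    hdr = struct.pack("!HHHH", sport, dport, length, 0)
    csum = ~csum16(pseudo_header(src, dst, length) + hdr + payload) & 0xFFFF
    csum = csum or 0xFFFF  # a computed zero is transmitted as all ones
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def unnecessary_to_complete(src, dst, udp_len) -> int:
    """Derive checksum-complete for a verified UDP packet:
    the "not" of the pseudo header checksum."""
    return ~csum16(pseudo_header(src, dst, udp_len)) & 0xFFFF


SRC, DST = "192.0.2.1", "198.51.100.2"
udp = build_udp(SRC, DST, 49152, 6080, b"GUE payload!")
derived = unnecessary_to_complete(SRC, DST, len(udp))
```

Here derived equals the ones' complement sum computed directly over the UDP header and payload, without touching the packet data. The stack can then use it as a checksum-complete value when validating checksums inside the encapsulation.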
A.3. Transmit Segmentation Offload | A.3. Transmit Segmentation Offload | |||
Transmit Segmentation Offload (TSO) is a NIC feature where a host | Transmit Segmentation Offload (TSO) is a NIC feature where a host | |||
provides a large (greater than MTU size) TCP packet to the NIC, which | provides a large (>MTU size) TCP packet to the NIC, which in turn | |||
in turn splits the packet into separate segments and transmits each | splits the packet into separate segments and transmits each one. This | |||
one. This is useful to reduce CPU load on the host. | is useful to reduce CPU load on the host. | |||
The process of TSO can be generalized as: | The process of TSO can be generalized as: | |||
- Split the TCP payload into segments which allow packets with | - Split the TCP payload into segments which allow packets with | |||
size less than or equal to MTU. | size less than or equal to MTU. | |||
- For each created segment: | - For each created segment: | |||
1. Replicate the TCP header and all preceding headers of the | 1. Replicate the TCP header and all preceding headers of the | |||
original packet. | original packet. | |||
skipping to change at page 36, line 42 ¶ | skipping to change at page 36, line 37 ¶ | |||
fabricate a single meaningful header from all the coalesced packets. | fabricate a single meaningful header from all the coalesced packets. | |||
The conservative approach to supporting LRO for GUE would be to | The conservative approach to supporting LRO for GUE would be to | |||
assign packets to the same flow only if they have identical five- | assign packets to the same flow only if they have identical five- | |||
tuple and were encapsulated the same way. That is the outer IP | tuple and were encapsulated the same way. That is the outer IP | |||
addresses, the outer UDP ports, GUE protocol, GUE flags and fields, | addresses, the outer UDP ports, GUE protocol, GUE flags and fields, | |||
and inner five tuple are all identical. | and inner five tuple are all identical. | |||
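A minimal sketch of the conservative matching rule follows. The type and field names (FiveTuple, GueLroKey, same_lro_flow) are hypothetical, and the GUE flags and fields are treated as opaque values rather than parsed.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src: str
    dst: str
    proto: int
    sport: int
    dport: int


class GueLroKey(NamedTuple):
    """Everything that has to match before two packets are coalesced."""
    outer_src: str
    outer_dst: str
    outer_sport: int
    outer_dport: int
    gue_proto: int
    gue_flags: int
    gue_fields: bytes  # optional fields, compared byte for byte
    inner: FiveTuple


def same_lro_flow(a: GueLroKey, b: GueLroKey) -> bool:
    """Conservative rule: coalesce only if every component is identical."""
    return a == b


inner = FiveTuple("10.0.0.1", "10.0.0.2", 6, 40000, 443)
k1 = GueLroKey("192.0.2.1", "198.51.100.2", 49152, 6080, 4, 0, b"", inner)
k2 = k1._replace(gue_flags=0x8000, gue_fields=b"\x00\x00\x00\x01")
```

In this sketch same_lro_flow(k1, k1) is true, while k1 and k2 are kept apart even though their outer and inner five-tuples match, because they were encapsulated differently.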
Appendix B: Implementation considerations | Appendix B: Implementation considerations | |||
This appendix is informational and does not constitute a normative | ||||
part of this document. | ||||
B.1. Privileged ports | B.1. Privileged ports | |||
Using the source port to contain a flow entropy value disallows the | Using the source port to contain a flow entropy value disallows the | |||
security method of a receiver enforcing that the source port be a | security method of a receiver enforcing that the source port be a | |||
privileged port. Privileged ports are defined by some operating | privileged port. Privileged ports are defined by some operating | |||
systems to restrict source port binding. Unix, for instance, | systems to restrict source port binding. Unix, for instance, | |||
considers port numbers less than 1024 to be privileged. | considers port numbers less than 1024 to be privileged. | |||
Enforcing that packets are sent from a privileged port is widely | Enforcing that packets are sent from a privileged port is widely | |||
considered an inadequate security mechanism and has been mostly | considered an inadequate security mechanism and has been mostly | |||
skipping to change at page 37, line 31 ¶ | skipping to change at page 37, line 28 ¶ | |||
Low level data path protocols, such as GUE, are often supported in | Low level data path protocols, such as GUE, are often supported in | |||
high speed network device hardware. Variable length header (VLH) | high speed network device hardware. Variable length header (VLH) | |||
protocols like GUE are often considered difficult to efficiently | protocols like GUE are often considered difficult to efficiently | |||
implement in hardware. In order to retain the important | implement in hardware. In order to retain the important | |||
characteristics of an extensible and robust protocol, hardware | characteristics of an extensible and robust protocol, hardware | |||
vendors may practice "constrained flexibility". In this model, only | vendors may practice "constrained flexibility". In this model, only | |||
certain combinations or protocol header parameterizations are | certain combinations or protocol header parameterizations are | |||
implemented in hardware fast path. Each such parameterization is | implemented in hardware fast path. Each such parameterization is | |||
fixed length so that the particular instance can be optimized as a | fixed length so that the particular instance can be optimized as a | |||
fixed length protocol. In the case of GUE, this constitutes specific | fixed length protocol. In the case of GUE this constitutes specific | |||
combinations of GUE flags, fields, and next protocol. The selected | combinations of GUE flags, fields, and next protocol. The selected | |||
combinations would naturally be the most common cases which form the | combinations would naturally be the most common cases which form the | |||
"fast path", and other combinations are assumed to take the "slow | "fast path", and other combinations are assumed to take the "slow | |||
path". | path". | |||
In time, needs and requirements of the protocol may change which may | In time, needs and requirements of the protocol may change which may | |||
manifest themselves as new parameterizations to be supported in the | manifest themselves as new parameterizations to be supported in the | |||
fast path. To allow this extensibility, a device practicing | fast path. To allow this extensibility, a device practicing | |||
constrained flexibility should allow the fast path parameterizations | constrained flexibility should allow the fast path parameterizations | |||
to be programmable. | to be programmable. | |||
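One way to picture a programmable fast path is a table of fixed-length parameterizations that can be extended at run time. The sketch below is illustrative only: the class FastPathTable, its methods, and the example flag and header length values are assumptions, not part of the GUE specification.

```python
class FastPathTable:
    """Programmable set of fixed-length GUE parameterizations."""

    def __init__(self):
        self._entries = set()

    def program(self, flags, hlen, proto):
        """Add one (flags, header length, next protocol) combination."""
        self._entries.add((flags, hlen, proto))

    def classify(self, flags, hlen, proto):
        """Return "fast" for programmed combinations, "slow" otherwise."""
        return "fast" if (flags, hlen, proto) in self._entries else "slow"


table = FastPathTable()
table.program(flags=0x0000, hlen=0, proto=4)    # no optional fields, IPv4 payload

# A combination that is not yet programmed takes the slow path.
before = table.classify(flags=0x8000, hlen=1, proto=41)

# Requirements change: the same combination is added to the fast path.
table.program(flags=0x8000, hlen=1, proto=41)
after = table.classify(flags=0x8000, hlen=1, proto=41)
```

Each entry describes a header of known, fixed size, so a device could optimize it as a fixed length protocol while any unprogrammed combination is still handled correctly on the slow path.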
End of changes. 57 change blocks. | ||||
138 lines changed or deleted | 143 lines changed or added | |||