--- 1/draft-ietf-ngtrans-isatap-11.txt 2006-02-05 00:50:59.000000000 +0100 +++ 2/draft-ietf-ngtrans-isatap-12.txt 2006-02-05 00:50:59.000000000 +0100 @@ -1,22 +1,22 @@ Network Working Group F. Templin Internet-Draft Nokia -Expires: July 18, 2003 T. Gleeson +Expires: July 25, 2003 T. Gleeson Cisco Systems K.K. M. Talwar D. Thaler Microsoft Corporation - January 17, 2003 + January 24, 2003 Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) - draft-ietf-ngtrans-isatap-11.txt + draft-ietf-ngtrans-isatap-12.txt Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. @@ -25,21 +25,21 @@ and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http:// www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. - This Internet-Draft will expire on July 18, 2003. + This Internet-Draft will expire on July 25, 2003. Copyright Notice Copyright (C) The Internet Society (2003). All Rights Reserved. Abstract This document specifies an Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) that connects IPv6 hosts and routers within IPv4 sites. ISATAP treats the site's IPv4 infrastructure as a link layer @@ -55,25 +55,25 @@ 4. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 5. Basic IPv6 Operation . . . . . . . . . . . . . . . . . . . . . 4 6. Automatic Tunneling . . . . . . . . . . . . . . . . . . . . . 5 7. Neighbor Discovery . . . . . . . . . . . . . . . . . . . . . . 7 8. Deployment Considerations . . . . . . . . . . . . . . . 
. . . 10 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 10. Security considerations . . . . . . . . . . . . . . . . . . . 11 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 12 Normative References . . . . . . . . . . . . . . . . . . . . . 12 Informative References . . . . . . . . . . . . . . . . . . . . 13 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 14 - A. Major Changes . . . . . . . . . . . . . . . . . . . . . . . . 15 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 15 + A. Major Changes . . . . . . . . . . . . . . . . . . . . . . . . 16 B. Rationale for Interface Identifier Construction . . . . . . . 17 - C. Dynamic MTU Discovery . . . . . . . . . . . . . . . . . . . . 18 - Intellectual Property and Copyright Statements . . . . . . . . 22 + C. ISATAP Interface MTU Considerations . . . . . . . . . . . . . 18 + Intellectual Property and Copyright Statements . . . . . . . . 23 1. Introduction This document presents a simple approach called the Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) that enables incremental deployment of IPv6 [1] within IPv4 [2] sites. ISATAP allows dual-stack nodes that do not share a physical link with an IPv6 router to automatically tunnel packets to the IPv6 next-hop address through IPv4, i.e., the site's IPv4 infrastructure is treated as a link layer for IPv6. @@ -95,23 +95,23 @@ o enables incremental deployment of IPv6 hosts within IPv4 sites with no aggregation scaling issues at border gateways o requires no special IPv4 services within the site (e.g., multicast) o supports both stateless address autoconfiguration and manual configuration o supports networks that use non-globally unique IPv4 addresses - (e.g., when private address allocations [18] are used) + (e.g., when private address allocations [10] are used) - o compatible with other NGTRANS mechanisms (e.g., 6to4 [19]) + o compatible with other NGTRANS mechanisms (e.g., 6to4 [11]) 3. 
Requirements The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this document, are to be interpreted as described in [3]. This document also makes use of internal conceptual variables to describe protocol behavior and external variables that an implementation must allow system administrators to change. The @@ -139,21 +139,21 @@ addresses on the ISATAP link. ISATAP interface: a node's attachment to an ISATAP link. advertising ISATAP interface: same meaning as "advertising interface" in ([4], section 6.2.2). ISATAP address: an on-link address on an ISATAP interface and with an interface - identifier constructed as specified in Section 5.2 + identifier constructed as specified in Section 5.1 5. Basic IPv6 Operation ISATAP links transmit IPv6 packets via automatic tunnels using the site's IPv4 infrastructure as a link layer for IPv6, i.e., IPv6 treats the site's IPv4 infrastructure as a Non-Broadcast, Multiple Access (NBMA) link layer. The following considerations for IPv6 on ISATAP links are noted: 5.1 Interface Identifiers and Unicast Addresses @@ -167,340 +167,354 @@ ISATAP addresses are constructed as follows: | 64 bits | 32 bits | 32 bits | +------------------------------+---------------+----------------+ | global or local-use unicast | 0000:5EFE | IPv4 Address | | prefix | | of ISATAP link | +------------------------------+---------------+----------------+ 5.2 ISATAP Link/Interface Configuration - An ISATAP link consists of one or more underlying links that support + ISATAP links consist of one or more underlying links that support IPv4 for tunneling within a site. ISATAP interfaces are configured over ISATAP links; each IPv4 address assigned to an underlying link is seen as a link-layer address for ISATAP. - At least one link-layer address per advertising ISATAP interface - SHOULD be added to the Potential Routers List (see Section 7.3.1). 
+ Neighbor discovery on ISATAP links (see: Section 7) provides the + functional equivalent of unicast virtual circuits (VCs) required for + other NBMA media types ([6], section 4.6). Neighbor state + information MAY be kept in the Conceptual Neighbor Cache ([4], + section 5.1). 5.3 Link Layer Address Options With reference to ([6], section 5.2), when the [NTL] and [STL] fields in an ISATAP link layer address option encode 0, the [NBMA Number] field encodes a 4-octet IPv4 address. 5.4 Multicast and Anycast - As for any IPv6 interface, an ISATAP interface is required to - recognize certain IPv6 multicast and anycast addresses ([5], section - 2.8). Mechanisms for sending multicast and anycast packets (e.g., - [20]) are left as future work. + ISATAP interfaces recognize a node's required addresses as specified + in ([5], section 2.8). + + Mechanisms for multicast/anycast emulation on ISATAP links (e.g., + adaptations of MLD [12], PIM-SM [13], MARS [14], etc.) are subject + for future companion document(s). 6. Automatic Tunneling The common tunneling mechanisms specified in ([7], sections 2 and 3) are used, with the following noted considerations for ISATAP: 6.1 Dual IP Layer Operation ISATAP uses the same specification found in ([7], section 2). That is, ISATAP nodes provide complete IPv4 and IPv6 implementations and are able to send and receive both IPv4 and IPv6 packets. Address configuration and DNS considerations are the same as ([7], sections 2.1 through 2.3). -6.2 Encapsulation +6.2 Encapsulation/Decapsulation - The specification in ([7], section 3.1) is used. Additionally, the - IPv6 next-hop address for packets sent on an ISATAP link MUST be an - ISATAP address; other packets are discarded and an ICMPv6 destination - unreachable indication with code 3 (Address Unreachable) [8] is - returned to the source. + The specifications in ([7], sections 3.1 and 3.6) are used. 
+ Additionally, the IPv6 next-hop address for packets encapsulated on + an ISATAP link MUST be an ISATAP address; other packets are discarded + and an ICMPv6 destination unreachable indication with code 3 (Address + Unreachable) ([8], section 3.1) is returned to the source. 6.3 Tunnel MTU and Fragmentation ISATAP automatic tunnel interfaces may be configured over multiple underlying links with diverse maximum transmission units (MTUs). The minimum MTU for IPv6 interfaces is 1280 bytes ([1], Section 5), but the following considerations apply for ISATAP interfaces: o Nearly all IPv4 nodes connect to physical links with MTUs of 1500 bytes or larger (e.g., Ethernet) o Sub-IPv4 layer encapsulations (e.g., VPN) may occur on some paths o Commonly-deployed VPN interfaces use an MTU of 1400 bytes To maximize efficiency and minimize IPv4 fragmentation for the - predominant deployment case, ISATAP interfaces that do not use a - dynamic MTU discovery mechanism SHOULD set LinkMTU ([4], Section - 6.3.2 ) to no more than 1380 bytes (1400 minus 20 bytes for IPv4 - encapsulation). LinkMTU MAY be set to larger values on ISATAP - interfaces that use a dynamic MTU discovery mechanism. Appendix C - provides non-normative considerations for dynamic MTU discovery. + predominant deployment case, the ISATAP interface MTU, or "LinkMTU" + (see: [4], Section 6.3.2 ), SHOULD be set to no more than 1380 bytes + (1400 minus 20 bytes for IPv4 encapsulation). LinkMTU MAY be set to + larger values when a dynamic link layer MTU discovery mechanism is + used or when a static MTU assignment is used and additional + fragmentation in the site's IPv4 network is deemed acceptable. See + Appendix C for non-normative ISATAP interface MTU considerations. - The ISATAP link layer encapsulates packets of size 1380 or smaller - with the Don't Fragment (DF) bit not set in the encapsualting IPv4 - header. 
+ When a dynamic MTU discovery mechanism is not used, the ISATAP link + layer encapsulates IPv6 packets with the Don't Fragment (DF) bit not + set in the encapsualting IPv4 header. 6.4 Handling IPv4 ICMP Errors IPv4 ICMP errors and ARP failures are processed as link error notifications. 6.5 Local-Use IPv6 Unicast Addresses - The specification in ([7], section 3.7) is not used. Instead, local use IPv6 unicast addresses are formed as specified in Section 5.1. 6.6 Ingress Filtering The specification in ([7], section 3.9) is used. In particular, - ISATAP nodes that forward decapsulated packets MUST be configured - with a list of source IPv4 address prefixes that are acceptable. + ISATAP nodes that forward decapsulated packets MUST verify the tunnel + source address is acceptable. 7. Neighbor Discovery - RFC 2461 [4] provides the following guidelines for non-broadcast - multiple access (NBMA) link support: + The specification in ([7], section 3.8) applies only to configured + tunnels. RFC 2461 [4] provides the following guidelines for + non-broadcast multiple access (NBMA) link support: "Redirect, Neighbor Unreachability Detection and next-hop determination should be implemented as described in this document. Address resolution and the mechanism for delivering Router Solicitations and Advertisements on NBMA links is not specified in this document." ISATAP links SHOULD implement Redirect, Neighbor Unreachability Detection, and next-hop determination exactly as specified in [4]. Address resolution and the mechanisms for delivering Router Solicitations and Advertisements for ISATAP links are not specified - by [4]; instead, they are specified in this document. + by [4]; instead, they are specified in the following sections of this + document. 7.1 Address Resolution and Neighbor Unreachability Detection ISATAP addresses are resolved to link-layer addresses (IPv4) by a static computation, i.e., the last four octets are treated as an IPv4 address. 
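The static computation in Section 7.1 can be sketched as follows. This is a minimal illustration, not part of either draft revision: the function names `make_isatap_address` and `resolve_isatap` are hypothetical, and the sketch assumes the address layout shown in Section 5.1 (64-bit prefix, then 0000:5EFE, then the IPv4 address in network byte order).

```python
import ipaddress

# First four octets of an ISATAP interface identifier (0000:5EFE, Section 5.1).
ISATAP_MARKER = bytes.fromhex("00005efe")


def make_isatap_address(prefix: str, v4: str) -> ipaddress.IPv6Address:
    """Build an ISATAP address from a 64-bit prefix and an IPv4 address.

    Hypothetical helper for illustration only.
    """
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("expected a 64-bit prefix")
    # Interface identifier = 0000:5EFE followed by the 4-octet IPv4 address.
    iid = ISATAP_MARKER + ipaddress.IPv4Address(v4).packed
    return ipaddress.IPv6Address(net.network_address.packed[:8] + iid)


def resolve_isatap(addr: str):
    """Return the IPv4 link-layer address embedded in an ISATAP address.

    Returns None when the interface identifier is not in ISATAP format.
    """
    packed = ipaddress.IPv6Address(addr).packed
    if packed[8:12] != ISATAP_MARKER:
        return None
    # The last four octets are treated as the IPv4 address.
    return ipaddress.IPv4Address(packed[12:])
```

For example, `resolve_isatap(str(make_isatap_address("fe80::/64", "192.0.2.1")))` recovers the IPv4 address that was embedded, with no message exchange involved.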
Following static address resolution, hosts SHOULD perform an initial - reachability confirmation by sending unicast Neighbor Solicitations - (NSs) and receiving a Neighbor Advertisement using the mechanisms - specified in ([4], sections 7.2.2-7.2.8). + reachability confirmation by sending Neighbor Solicitation (NS) + message(s) and receiving a Neighbor Advertisement (NA) message using + the mechanisms specified in ([4], section 7.2.). When the ISATAP + interface provides a multicast emulation mechanism (see: Section 5.4) + solicitations are sent to the solicited-node multicast address + corresponding to the target address. Otherwise, the solicitation is + sent to the target's unicast address. Hosts SHOULD additionally perform Neighbor Unreachability Detection (NUD) as specified in ([4], section 7.3). Routers MAY perform the above-specified reachability detection and NUD procedures, but this might not scale in all environments. All ISATAP nodes MUST send solicited neighbor advertisements ([4], section 7.2.4). 7.2 Duplicate Address Detection Duplicate Address Detection ([9], section 5.4) is not required for ISATAP addresses, since duplicate address detection is assumed already performed for the IPv4 addresses from which they derive. 7.3 Router and Prefix Discovery - Since ISATAP nodes will typically not receive unsolicited multicast - Router Advertisements, unicast mechanisms are required as specified - below: + The following sections describe mechanisms to support the router and + prefix discovery process ([4], section 6) on ISATAP links: 7.3.1 Conceptual Data Structures ISATAP nodes use the conceptual data structures Prefix List and Default Router List exactly as in ([4], section 5.1). 
ISATAP links - add a new conceptual data structure "Potential Router List" and the - following new configuration variable: + add a new conceptual data structure "Potential Router List" (PRL) and + the following new configuration variable: - ResolveInterval - Time between name service resolutions. Default and suggested - minimum: 1hr + PrlRefreshInterval + Time in seconds between successive refreshments of the PRL after + initialization. SHOULD be no less than 3,600 seconds. - A Potential Router List (PRL) is associated with every ISATAP link. - Each entry in the PRL has an IPv4 address and an associated timer. - The IPv4 address represents an advertising ISATAP interface, and is - used to construct the link-local ISATAP address for that interface. - The following sections specify the process for initializing the PRL: + Default: 3,600 seconds - When a node enables an ISATAP link, it discovers IPv4 addresses for - the PRL. The addresses MAY be established by a DHCPv4 [10] option + A PRL is associated with every ISATAP link. Each entry in the PRL + ("PRL(i)") has an IPv4 address ("V4ADDR(i)") that represents an + advertising ISATAP interface and an associated timer ("TIMER(i)"). + The process for initializing and refreshing the PRL is described + below: + + When a node enables an ISATAP link, it initializes the PRL with IPv4 + addresses. The addresses MAY be discovered via a DHCPv4 [15] option for ISATAP (option code TBD), manual configuration, or an unspecified alternate method (e.g., DHCPv4 vendor-specific option). When no other mechanisms are available, a DNS fully-qualified domain - name (FQDN) [21] established by an out-of-band method (e.g., DHCPv4, + name (FQDN) [16] established by an out-of-band method (e.g., DHCPv4, manual configuration, etc.) MAY be used. The FQDN is resolved into - IPv4 addresses through a static host file, a site-specific name - service, querying a DNS server within the site, or an unspecified - alternate method. 
The following notes apply: - - 1. Site administrators maintain a list of IPv4 addresses - representing advertising ISATAP interfaces and make them - available via one or more of the mechanisms described above. - - 2. There are no mandatory rules for the selection of a FQDN, but - manual configuration MUST be supported. + IPv4 addresses for the PRL through a static host file, a + site-specific name service, querying a DNS server within the site, or + an unspecified alternate method. There are no mandatory rules for + the selection of a FQDN, but manual configuration MUST be supported. + When DNS is used, client resolvers use the IPv4 transport. - 3. After initialization, nodes periodically re-initialize the PRL - (e.g., after ResolveInterval). When DNS is used, client - resolvers use the IPv4 transport. + After initialization, nodes periodically refresh the PRL (i.e., using + one or more of the methods described above) after PrlRefreshInterval. 7.3.2 Validation of Router Advertisements Messages The specification in ([4], section 6.1.2) is used. Additionally, received RA messages that contain Prefix Information options and/or encode non-zero values in the Cur Hop Limit, Router Lifetime, Reachable Time, or Retrans Timer fields (see: [4], section 4.2) MUST satisfy the following validity check for ISATAP: o the network-layer (IPv6) source address is an ISATAP address and - embeds an IPv4 address from the PRL + embeds V4ADDR(i) for some PRL(i) 7.3.3 Router Specification Routers with advertising ISATAP interfaces behave the same as - described in ([4], section 6.2). Advertising ISATAP interfaces send - RA messages to a node's unicast address, as permitted by ([4], - section 6.2.6). + described in ([4], section 6.2). As permitted by ([4], section + 6.2.6), advertising ISATAP interfaces SHOULD send unicast RA messages + to a soliciting host's address when the solicitation's source address + is not the unspecified address. 
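The additional validity check of Section 7.3.2 (the RA source address is an ISATAP address that embeds V4ADDR(i) for some PRL(i)) might be sketched as below. The function name `ra_source_is_valid` and the representation of the PRL as a plain list of IPv4 address strings are assumptions made for illustration; the check in ([4], section 6.1.2) and the conditions under which the ISATAP check applies are not modeled here.

```python
import ipaddress

# First four octets of an ISATAP interface identifier (0000:5EFE, Section 5.1).
ISATAP_MARKER = bytes.fromhex("00005efe")


def ra_source_is_valid(src: str, prl_v4addrs) -> bool:
    """Check that an RA source address embeds V4ADDR(i) for some PRL(i).

    `prl_v4addrs` is a hypothetical stand-in for the PRL: an iterable of
    IPv4 address strings, one per PRL entry.
    """
    packed = ipaddress.IPv6Address(src).packed
    if packed[8:12] != ISATAP_MARKER:
        return False  # interface identifier is not in ISATAP format
    embedded = ipaddress.IPv4Address(packed[12:])
    return embedded in {ipaddress.IPv4Address(a) for a in prl_v4addrs}
```

An RA from `fe80::5efe:c000:201` passes only if 192.0.2.1 is currently on the PRL, which is why the PRL acts as a trust basis rather than a reachability indicator.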
7.3.4 Host Specification + When no unsolicited RA messages containing prefix information options + and/or non-zero router lifetime values are received, hosts MAY send + Router Solicitation (RS) messages using the specification in Section + 7.3.4.1. RA messages (whether solicited or unsolicited) are + processed using the specification in Section 7.3.4.2. + 7.3.4.1 Sending Router Solicitations - All entries in the PRL are assumed to represent active advertising - ISATAP interfaces within the site, i.e., the PRL provides trust basis - only; not reachability detection. Hosts periodically solicit - information from one or more entries in the PRL ("PRL(i)") by sending - unicast Router Solicitation (RS) messages using PRL(i)'s IPv4 address - ("V4ADDR_PRL(i)") and associated timer ("TIMER(i)"). The manner of - selecting a PRL(i) for solicitation and/or deprecating a - previously-selected PRL(i) is outside the scope of this + All PRL(i)'s are assumed to represent active advertising ISATAP + interfaces within the site, i.e., the PRL provides trust basis only; + not reachability detection. Hosts periodically solicit information + from one or more PRL(i) by sending Router Solicitation (RS) messages. + The manner of selecting a PRL(i) for solicitation and/or deprecating + a previously-selected PRL(i) is outside the scope of this specification. Hosts add the following variable to support the solicitation process: MinRouterSolicitInterval - Minimum time between sending Router Solicitations. Default and - suggested minimum: 15min. + Minimum time in seconds between successive solicitations of the + same advertising ISATAP interface. SHOULD be no less than 900 + seconds. - When a PRL(i) is selected, the host sets TIMER(i) to - MinRouterSolicitInterval and initiates solicitation following a short - delay. 
Solicitation consists of sending RS messages to the ISATAP - link-local address constructed from V4ADDR_PRL(i), i.e., they are - sent to 'FE80::0:5EFE:V4ADDR_PRL(i)' instead of - 'All-Routers-multicast'. They are otherwise sent exactly as in ([4], - section 6.3.7). + Default: 900 seconds + Solicitation consists of sending RS messages using the interface's + link-local unicast addresses as the source address. When the ISATAP + interface provides a multicast emulation mechanism (see: Section + 5.4), RS messages are sent to the All-Routers multicast address. + Otherwise, they are sent to the link-local ISATAP address constructed + from V4ADDR(i) for some PRL(i) selected for solicitation. The RS + messages are otherwise sent exactly as in ([4], section 6.3.7). 7.3.4.2 Processing Router Advertisements Hosts process received RA messages exactly as in ([4], section 6.3.4) - and ([9], section 5.5.3) except that, when an RA message contains an - MTU option, hosts SHOULD NOT copy the option's value into the ISATAP - interface LinkMTU. Instead, when the ISATAP link layer implements a - per-neighbor path MTU cache, hosts SHOULD copy the MTU option's value - into the cache entry for the router that sent the RA message (see: - Appendix C). + and ([9], section 5.5.3). (But, see Appendix C for non-normative + considerations for RA messages containing MTU options.) - When the network-layer source address in an RA message is an ISATAP - address that embeds V4ADDR_PRL(i) for some PRL(i) selected for - solicitation, hosts additionally reset TIMER(i). Let "MIN_LIFETIME" - be the minimum value in the router lifetime or valid lifetime of any - prefixes advertised in the RA message. Then, TIMER(i) is reset to: + When the source address of the RA message is an ISATAP address that + embeds V4ADDR(i) for some PRL(i) selected for solicitation, hosts + additionally reset TIMER(i). 
Let "MIN_LIFETIME" be the minimum value + in the router lifetime or the lifetime(s) encoded in options included + in the RA message. Then, TIMER(i) is reset to: MAX((0.5 * MIN_LIFETIME), MinRouterSolicitInterval) 8. Deployment Considerations 8.1 Host And Router Deployment Considerations For hosts, if an underlying link supports both IPv4 (over which ISATAP is implemented) and also supports IPv6 natively, then ISATAP MAY be enabled if the native IPv6 layer does not receive Router Advertisements (i.e., does not have connection with an IPv6 router). After a non-link-local address has been configured and a default router acquired on the native link, the host SHOULD discontinue the - router solicitation process described in the host specification and - allow existing ISATAP address configurations to expire as specified - in ([4], section 5.3) and ([9], section 5.5.4). Any ISATAP addresses - added to the DNS for this host should also be removed. In this way, - ISATAP use will gradually diminish as IPv6 routers are widely - deployed throughout the site. + router solicitation process described in the Host Specification + (Section 7.3.4) and allow existing ISATAP address configurations to + expire as specified in ([4], section 5.3) and ([9], section 5.5.4). + Any ISATAP addresses added to the DNS for this host should also be + removed. In this way, ISATAP use will gradually diminish as IPv6 + routers are widely deployed throughout the site. Routers MAY configure both a native IPv6 and ISATAP interface over the same physical link. Routing will operate as usual between these two domains. Note that the prefixes used on the ISATAP and native IPv6 interfaces will be distinct. The IPv4 address(es) configured on a router's advertising ISATAP interface(s) SHOULD be added (either automatically or manually) to the site's address records for advertising ISATAP interfaces. 
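The TIMER(i) reset rule from Section 7.3.4.2, MAX((0.5 * MIN_LIFETIME), MinRouterSolicitInterval), can be written out as a short sketch. The function name `next_timer` and its argument layout are hypothetical; the 900-second value is the default the revised text gives for MinRouterSolicitInterval.

```python
# Default from Section 7.3.4.1 of the revised draft, in seconds.
MIN_ROUTER_SOLICIT_INTERVAL = 900


def next_timer(router_lifetime, option_lifetimes,
               min_interval=MIN_ROUTER_SOLICIT_INTERVAL):
    """Compute the value TIMER(i) is reset to after a valid RA.

    `router_lifetime` is the RA's router lifetime and `option_lifetimes`
    the lifetimes encoded in its options, all in seconds (illustrative
    argument names, not taken from the draft).
    """
    # MIN_LIFETIME: smallest of the router lifetime and option lifetimes.
    min_lifetime = min([router_lifetime, *option_lifetimes])
    # Re-solicit at half the shortest lifetime, but never more often
    # than MinRouterSolicitInterval.
    return max(0.5 * min_lifetime, min_interval)
```

With a router lifetime of 9000 seconds and one option lifetime of 7200 seconds, the timer is reset to 3600 seconds; with a router lifetime of 600 seconds and no options, the 900-second floor applies.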
8.2 Site Administration Considerations The following considerations are noted for sites that deploy ISATAP: o ISATAP links are administratively defined by a set of advertising ISATAP interfaces and set of nodes which discover those interface addresses. Thus, ISATAP links are defined by administrative (not physical) boundaries. o Hosts and routers that use ISATAP can be deployed in an ad-hoc fashion. In particular, hosts can be deployed with little/no - advanced knowledge of existing routers, and routers can deployed - with no reconfiguration requirements for hosts. + advanced knowledge of existing routers, and routers can be + deployed with no reconfiguration requirements for hosts. - o ISATAP nodes periodically refresh the entries on the PRL. + o Site administrators maintain a list of IPv4 addresses representing + advertising ISATAP interfaces and make them available via one or + more of the mechanisms described in Section 7.3.1. ISATAP nodes + use this list to initialize and periodically refresh the PRL. Responsible site administration can reduce the control traffic. At a minimum, administrators SHOULD ensure that dynamically advertised information for the site's PRL is well maintained. 9. IANA Considerations - A DHCPv4 option code for ISATAP (TBD) [22] may be requested in the - event that this document (or a derivative thereof) is moved to + A DHCPv4 option code for ISATAP (TBD) [17] may be requested in the + event that this document or a derivative thereof is moved to standards track. + Modifications to the IANA "ethernet-numbers" registry (e.g., based on + text in Appendix B) may be requested in the event that this document + or a derivative thereof is moved to standards track. + 10. Security considerations ISATAP site border routers and firewalls MUST implement IPv6 ingress - filtering and MUST NOT allow packets with site-local source and/or - destination addresses (i.e., addresses with prefix FEC0::/10) to - enter or leave the site. 
+ filtering and MUST NOT forward packets with site-local source and/or + destination addresses outside of the site [18]. In addition to possible attacks against IPv6, security attacks against IPv4 must also be considered. In particular, border routers and firewalls MUST implement IPv4 ingress filtering and ip-protocol-41 filtering. Even with IPv4 and IPv6 ingress filtering, reflection attacks can - originate from nodes within an ISATAP site that spoof IPv6 source - addresses. Security mechanisms for reflection attack mitigation - (e.g., [11], [12], etc.) SHOULD be used in routers with advertising - ISATAP interfaces. At a minimum, ISATAP site border gateways MUST - log potential source address spoofing cases. + originate from compromised nodes within an ISATAP site that spoof + IPv6 source addresses. Security mechanisms for reflection attack + mitigation (e.g., [19], [20], etc.) SHOULD be used in routers with + advertising ISATAP interfaces. At a minimum, ISATAP site border + gateways MUST log potential source address spoofing cases. - (RFC 2461 [4], section 6.1.2) implies that nodes trust received - Router Advertisement (RA) messages from on-link routers, as indicated - by a value of 255 in the IPv6 'hop-limit' field. ISATAP links - require an additional validation check for received RA messages (see: - Section 7.3.2). + IPv6 Neighbor Discovery trust models and threats [21] apply also to + ISATAP. However, ([21], section 4.4.) shows that most of these + threats are mitigated in corporate networks that implement site + security mechanisms, i.e., the applicability space for ISATAP. ISATAP addresses do not support privacy extensions for stateless - address autoconfiguration [23]. However, since the ISATAP interface + address autoconfiguration [22]. 
However, since the ISATAP interface identifier is derived from the node's IPv4 address, ISATAP addresses do not have the same level of privacy concerns as IPv6 addresses that use an interface identifier derived from the MAC address. (This is - especially true when private address allocations [18] are used.) + especially true when private address allocations [10] are used.) 11. Acknowledgements Some of the ideas presented in this draft were derived from work at SRI with internal funds and contractual support. Government sponsors who supported the work include Monica Farah-Stapleton and Russell Langan from U.S. Army CECOM ASEO, and Dr. Allen Moshfegh from U.S. Office of Naval Research. Within SRI, Dr. Mike Frankel, J. Peter Marcotullio, Lou Rodriguez, and Dr. Ambatipudi Sastry supported the work and helped foster early interest. @@ -508,117 +522,131 @@ The following peer reviewers are acknowledged for taking the time to review a pre-release of this document and provide input: Jim Bound, Rich Draves, Cyndi Jung, Ambatipudi Sastry, Aaron Schrader, Ole Troan, Vlad Yasevich. The authors acknowledge members of the NGTRANS community who have made significant contributions to this effort, including Rich Draves, Alain Durand, Nathan Lutchansky, Karen Nielsen, Art Shelest, Margaret Wasserman, and Brian Zill. - The authors also wish to acknowledge the work of Quang Nguyen [24] + The authors also wish to acknowledge the work of Quang Nguyen [23] under the guidance of Dr. Lixia Zhang that proposed very similar ideas to those that appear in this document. This work was first brought to the authors' attention on September 20, 2002. Normative References [1] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998. - [2] Postel, J., "Internet Protocol", STD 5, RFC 791, September - 1981. + [2] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981. 
[3] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. - [4] Narten, T., Nordmark, E. and W. Simpson, "Neighbor Discovery - for IP Version 6 (IPv6)", RFC 2461, December 1998. + [4] Narten, T., Nordmark, E. and W. Simpson, "Neighbor Discovery for + IP Version 6 (IPv6)", RFC 2461, December 1998. [5] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", draft-ietf-ipngwg-addr-arch-v3-11 (work in progress), October 2002. [6] Armitage, G., Schulter, P., Jork, M. and G. Harter, "IPv6 over Non-Broadcast Multiple Access (NBMA) networks", RFC 2491, January 1999. [7] Gilligan, R. and E. Nordmark, "Basic Transition Mechanisms for IPv6 Hosts and Routers", draft-ietf-ngtrans-mech-v2-01 (work in progress), November 2002. [8] Conta, A. and S. Deering, "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", RFC 2463, December 1998. [9] Thomson, S. and T. Narten, "IPv6 Stateless Address Autoconfiguration", RFC 2462, December 1998. - [10] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131, - March 1997. - - [11] Savola, P., "Security Considerations for 6to4", - draft-savola-ngtrans-6to4-security-01 (work in progress), March - 2002. +Informative References - [12] Bellovin, S., Leech, M. and T. Taylor, "ICMP Traceback - Messages", draft-ietf-itrace-03 (work in progress), January - 2003. + [10] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G. and E. + Lear, "Address Allocation for Private Internets", BCP 5, RFC + 1918, February 1996. - [13] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, - November 1990. + [11] Carpenter, B. and K. Moore, "Connection of IPv6 Domains via + IPv4 Clouds", RFC 3056, February 2001. - [14] Postel, J., "Internet Control Message Protocol", STD 5, RFC - 792, September 1981. + [12] Deering, S., Fenner, W. and B. Haberman, "Multicast Listener + Discovery (MLD) for IPv6", RFC 2710, October 1999. 
- [15] Baker, F., "Requirements for IP Version 4 Routers", RFC 1812, - June 1995. + [13] Estrin, D., Farinacci, D., Helmy, A., Thaler, D., Deering, S., + Handley, M. and V. Jacobson, "Protocol Independent + Multicast-Sparse Mode (PIM-SM): Protocol Specification", RFC + 2362, June 1998. - [16] McCann, J., Deering, S. and J. Mogul, "Path MTU Discovery for - IP version 6", RFC 1981, August 1996. + [14] Armitage, G., "Support for Multicast over UNI 3.0/3.1 based ATM + Networks", RFC 2022, November 1996. - [17] Braden, R., "Requirements for Internet Hosts - Communication - Layers", STD 3, RFC 1122, October 1989. + [15] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131, + March 1997. -Informative References + [16] Mockapetris, P., "Domain names - implementation and + specification", STD 13, RFC 1035, November 1987. - [18] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G. and E. - Lear, "Address Allocation for Private Internets", BCP 5, RFC - 1918, February 1996. + [17] Droms, R., "Procedures and IANA Guidelines for Definition of + New DHCP Options and Message Types", BCP 43, RFC 2939, + September 2000. - [19] Carpenter, B. and K. Moore, "Connection of IPv6 Domains via - IPv4 Clouds", RFC 3056, February 2001. + [18] Hinden, R., "IPv6 Globally Unique Site-Local Addresses", + draft-hinden-ipv6-global-site-local-00 (work in progress), + December 2002. - [20] Thaler, D., "Support for Multicast over 6to4 Networks", - draft-ietf-ngtrans-6to4-multicast-01 (work in progress), July + [19] Savola, P., "Security Considerations for 6to4", + draft-savola-ngtrans-6to4-security-01 (work in progress), March 2002. - [21] Mockapetris, P., "Domain names - implementation and - specification", STD 13, RFC 1035, November 1987. + [20] Bellovin, S., Leech, M. and T. Taylor, "ICMP Traceback + Messages", draft-ietf-itrace-03 (work in progress), January + 2003. 
- [22] Droms, R., "Procedures and IANA Guidelines for Definition of - New DHCP Options and Message Types", BCP 43, RFC 2939, - September 2000. + [21] Nikander, P., "IPv6 Neighbor Discovery trust models and + threats", draft-ietf-send-psreq-01 (work in progress), January + 2003. - [23] Narten, T. and R. Draves, "Privacy Extensions for Stateless + [22] Narten, T. and R. Draves, "Privacy Extensions for Stateless Address Autoconfiguration in IPv6", RFC 3041, January 2001. - [24] Nguyen, Q., "http://irl.cs.ucla.edu/vet/report.ps", spring + [23] Nguyen, Q., "http://irl.cs.ucla.edu/vet/report.ps", spring 1998. - [25] Lahey, K., "TCP Problems with Path MTU Discovery", RFC 2923, + [24] Braden, R., "Requirements for Internet Hosts - Communication + Layers", STD 3, RFC 1122, October 1989. + + [25] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, + November 1990. + + [26] Postel, J., "Internet Control Message Protocol", STD 5, RFC + 792, September 1981. + + [27] Baker, F., "Requirements for IP Version 4 Routers", RFC 1812, + June 1995. + + [28] Lahey, K., "TCP Problems with Path MTU Discovery", RFC 2923, September 2000. - [26] Jacobson, V., Braden, B. and D. Borman, "TCP Extensions for + [29] McCann, J., Deering, S. and J. Mogul, "Path MTU Discovery for + IP version 6", RFC 1981, August 1996. + + [30] Jacobson, V., Braden, B. and D. Borman, "TCP Extensions for High Performance", RFC 1323, May 1992. - [27] Templin, F., "Neighbor Affiliation Protocol for + [31] Templin, F., "Neighbor Affiliation Protocol for IPv6-over-(foo)-over-IPv4", draft-templin-v6v4-ndisc-01 (work in progress), November 2002. Authors' Addresses Fred L. Templin Nokia 313 Fairchild Drive Mountain View, CA 94110 US @@ -647,20 +676,28 @@ Microsoft Corporation One Microsoft Way Redmond, WA 98052-6399 US Phone: +1 425 703 8835 EMail: dthaler@microsoft.com Appendix A. 
Major Changes + changes from version 11 to version 12: + + o Added comments from co-authors + + o Revised PRL initialization + + o Updated MTU section + changes from version 10 to version 11: o Added multicast/anycast subsection o Revised PRL initialization o Updated neighbor discovery, security consideration sections o Updated MTU section @@ -748,178 +784,228 @@ including support for encapsulating legacy EUI-48 interface identifiers (e.g., an IANA EUI-48 format multicast address such as: '01-00-5E-01-02-03' is encapsulated as: '01-00-5E-FF-FE-01-02-03'). But, the specification also provides a special TYPE (0xFE) to indicate an IPv4 address is embedded. Thus, when the first four octets of an IPv6 interface identifier are: '00-00-5E-FE' (note: the 'u/l' bit MUST be 0) the interface identifier is said to be in "ISATAP format" and the next four octets embed an IPv4 address encoded in network byte order. -Appendix C. Dynamic MTU Discovery +Appendix C. ISATAP Interface MTU Considerations ISATAP encapsulators and decapsulators are IPv6 neighbors that may be - separated by multiple link layer (IPv4) forwarding hops. When an - encapsulator's interface configures a LinkMTU ([4], Section 6.3.2) - value larger than 1380 bytes, a dynamic link layer (IPv4) mechanism - is required to discover per-neighbor path MTUs. The following text - gives non-normative considerations for dynamic MTU discovery. + separated by multiple link layer (IPv4) forwarding hops. Thus, the + path MTU of the underlying IPv4 network may determine the uni- + directional IPv6 per-neighbor MTU from the encapsulator to the + decapsulator. (Note that this constitutes the MTU of only one hop in + what may be a multiple-hop IPv6 path.) 
When the encapsulator's ISATAP + interface configures a large LinkMTU value (see: Section 6.3), + special considerations apply as described in the following + non-normative sections: - IPv4 path MTU discovery [13] uses ICMPv4 "fragmentation needed" - messages, but these generally do not provide enough information for - stateless translation to ICMPv6 "packet too big" messages (see: RFC - 792 [14] and RFC 1812 [15], section 4.3.2.3). Additionally, ICMPv4 - "fragmentation needed" messages can be spoofed, filtered, or not sent - at all by some forwarding nodes. Thus, IPv4 Path MTU discovery used - alone may be inadequate and can result in black holes that are - difficult to diagnose [25]. +C.1 Stateless (Static) MTU Assignment + + Nodes that connect to the Internet should be able to reassemble and/ + or discard IPv4 packets up to 64KB in length when the DF bit is not + set in the encapsulating IPv4 header. Nodes that cannot reassemble/ + discard maximum-length IPv4 packets are vulnerable to buffer overrun + attacks. This issue may be obviated for nodes that are accessed only + within a site (i.e., do not connect directly to the Internet) since + site border gateways, etc. can filter and discard fragments of large + packets before they reach constrained node(s). + + When the ISATAP encapsulator does not implement a dynamic link layer + mechanism to determine per-neighbor MTUs, all IPv6 packets are + encapsulated with the DF bit not set in the IPv4 header. + Additionally, LinkMTU may be set to a value that is no more than the + smallest Effective MTU to Receive (EMTU_R) (see: RFC 1122 [24], + section 3.3.2) for all potential decapsulators in the site. The + value chosen for LinkMTU must be at least 1280 bytes (the minimum + IPv6 MTU) and such that the potential worst-case level of + fragmentation in the underlying IPv4 network is deemed "acceptable" + by the site's standards. 
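The static LinkMTU rule described above can be sketched roughly as follows. This is a non-normative illustration only: the function name, the optional site_cap parameter (standing in for whatever limit a site considers "acceptable" fragmentation), and the subtraction of 20 bytes for an option-free IPv4 header are assumptions of this sketch, not part of the specification.

```python
IPV6_MIN_MTU = 1280    # minimum IPv6 MTU, as stated in the text
IPV4_HEADER_LEN = 20   # assumption: IPv4 encapsulation header without options


def static_link_mtu(emtu_r_values, site_cap=None):
    """Pick a static LinkMTU for an ISATAP interface (hypothetical helper).

    emtu_r_values: EMTU_R, in bytes, of every potential decapsulator.
    site_cap: optional upper bound reflecting the site's tolerance for
              IPv4 fragmentation (hypothetical parameter).
    """
    if not emtu_r_values:
        raise ValueError("at least one decapsulator EMTU_R is required")

    # No more than the smallest EMTU_R, less the encapsulation overhead.
    candidate = min(emtu_r_values) - IPV4_HEADER_LEN

    # Optionally lower the value to limit worst-case fragmentation.
    if site_cap is not None:
        candidate = min(candidate, site_cap)

    # LinkMTU must never fall below the minimum IPv6 MTU.
    if candidate < IPV6_MIN_MTU:
        raise ValueError("no LinkMTU of at least 1280 bytes is possible")
    return candidate
```

A site that prefers to avoid fragmentation on typical paths might pass a small site_cap; a site whose routers handle fragmentation well might omit it.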
+ + For example, when all decapsulators in the site are known to have an + EMTU_R of 10KB and the site's IPv4 routers are optimized for IPv4 + fragmentation, encapsulators may be able to use LinkMTU values as + large as 10KB (minus 20 bytes for IPv4 encapsulation). Conversely, + when IPv4 fragmentation causes performance degradation along some + paths, LinkMTU should be set to a smaller value. + + Nodes that use a static MTU assignment SHOULD copy the value in an + MTU option received in any Router Advertisement message into LinkMTU + for the ISATAP interface as specified in ([4], section 6.3.4). + +C.2 Stateful (Dynamic) MTU Determination + + When the encapsulator implements a dynamic MTU determination + mechanism it keeps a link layer cache of per-neighbor MTU values + (e.g., as ancillary data in the IPv6 neighbor cache, in the IPv4 path + MTU discovery cache, etc.). IPv4 path MTU discovery [25] uses ICMPv4 + "fragmentation needed" messages, but these generally do not provide + enough information for stateless translation to ICMPv6 "packet too + big" messages (see: RFC 792 [26] and RFC 1812 [27], section 4.3.2.3). + Additionally, ICMPv4 "fragmentation needed" messages can be spoofed, + filtered, or not sent at all by some forwarding nodes. Thus, IPv4 + Path MTU discovery used alone may be inadequate and can result in + black holes that are difficult to diagnose [28]. Alternate methods for determining per-neighbor MTUs should be used - when RFC 1191 path MTU discovery is deemed inadequate. In any - method, the encapsulator uses periodic and/or on-demand probing of - the IPv4 path to the decapsulator. The following three methods are + when RFC 1191 path MTU discovery is deemed inadequate. In these + methods, the encapsulator uses periodic and/or on-demand probing of + the IPv4 path to the decapsulator to initialize and update cache + entries. The following three probing methods (among others) are possible: 1. 
Encapsulator-driven - the encapsulator periodically sends probe packets with the DF bit set in the IPv4 header and waits for a positive acknowledgement from the decapsulator that the probe was received 2. Decapsulator-driven - the encapsulator sends all packets with the DF bit NOT set in the IPv4 header unless and until the decapsulator sends one or more "Fragmentation Experienced" indications 3. Hybrid - the encapsulator and decapsulator engage in a dialogue and use "intelligent" probing to monitor the path MTU These methods are discussed in detail in the following subsections: -C.1 Encapsulator-driven Method - +C.2.1 Encapsulator-driven Method In this method, the encapsulator sets the DF bit in the IPv4 header of probe packets. Probe packets may be sent either when the encapsulator's link layer forwards a large data packet to the decapsulator (i.e., on-demand) or when the path MTU for the decapsulator has not been verified for some time (i.e., periodic). IPv6 Neighbor Solicitation (NS) or ICMPv6 ECHO_REQUEST packets with padding bytes added could be used for this purpose, since successful delivery results in a positive acknowledgement that the probe succeeded by way of a response from the decapsulator. - While the decapsulator is being probed, the encapsulator maintains a - queue of packets that have the decapsulator as the IPv6 next-hop - address. The queue should be large enough to buffer the - (delay*bandwidth) product for the round-trip time (RTT) to the - decapsulator. If the probe succeeds, packets in the queue that are - no larger than the probe size are sent to the decapsulator. If the - probe fails, packets larger than the last known successful probe are - dropped and an ICMPv6 "packet too big" message returned to the sender - [16]. + While probing, the encapsulator maintains a queue of packets that + have the decapsulator as the IPv6 next-hop address. 
If the probe + succeeds, packets in the queue that are no larger than the probe size + are sent to the decapsulator. If the probe fails, packets that are + larger than the last known successful probe are dropped and an ICMPv6 + "packet too big" message returned to the sender [29]. The queue + should be large enough to buffer the (delay*bandwidth) product for + the round-trip time to the decapsulator. When smaller queues are + used, loss of packets that are too big for the yet-to-be-determined + path MTU may occur with no ICMPv6 "packet too big" message returned. + Such loss may occur only in rare instances, but may result in + unpredictable behavior in senders that base their adaptation solely + on ICMPv6 "packet too big" messages. This method has the advantage that the decapsulator need not implement any special mechanisms, since standard IPv6 request/response mechanisms are used. Additionally, the encapsulator is assured that any packets that are too large for the decapsulator to receive will be dropped by the network. Disadvantages for this method include the fact that probe packets do not carry data and thus - consume network resources. Additionally, packet queues may become - large on Long, Fat Networks (LFNs) (see: RFC 1323 [26]). + consume network resources. Additionally, queues may become large on + Long, Fat Networks (LFNs) (see: RFC 1323 [30]). -C.2 Decapsulator-driven Method +C.2.2 Decapsulator-driven Method In this method, the encapsulator sends all packets with the DF bit NOT set in the IPv4 header with the expectation that the decapsulator will send a "Fragmentation Experienced" indication if the IPv4 network fragments packets. In other words, the encapsulator simply sends all packets that are no larger than LinkMTU unless and until it receives "Fragmentation Experienced" messages from the decapsulator. 
The decapsulator can use IPv6 Router Advertisement (RA) messages with an MTU option as the means for both reporting fragmentation and informing the encapsulator of a new MTU value to use. - This method has the distinct advantages that the data packets - themselves are used as probes and no queueing on the encapsulator is - necessary. Additionally, fewer packets will be lost since the - decapsulator will quite often be able to reassemble packets - fragmented by the network. The primary disadvantage is that, using - the current specifications, the encapsulator has no way of knowing - whether a particular decapsulator implements the "fragmentation - experienced" signalling capability. However, the "fragmentation - experienced" indication can be trivially implemented in an - application on the decapsulator that uses the Berkeley Packet Filter - (aka, libpcap) to listen for fragmented packets from encapsulators. + This method has the advantage that the data packets themselves are + used as probes and no queuing on the encapsulator is necessary. + (When large data packets for probing are not available, smaller data + packets can be null-padded to the desired probe size by artificially + inflating the length field in the IPv4 header; leaving the IPv6 + length unchanged.) An additional advantage is that fewer packets will + be lost since the decapsulator will quite often be able to reassemble + packets fragmented by the network. The primary disadvantage for this + method is that, using the current specifications, the encapsulator + has no way of knowing whether a particular decapsulator implements + the "fragmentation experienced" signaling capability. However, the + "fragmentation experienced" indication can be trivially implemented + in an application on the decapsulator that uses the Berkeley Packet + Filter (aka, libpcap) to listen for fragmented packets from + encapsulators. 
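As a rough, non-normative illustration of the fragment check such a listening application would need, the sketch below inspects a captured IPv4 header and reports whether it belongs to a fragment of an IPv6-in-IPv4 (protocol 41) packet. The function name is hypothetical, and packet capture itself (e.g., via libpcap) is deliberately left out; only the header test is shown.

```python
import struct

IPPROTO_IPV6 = 41      # IPv4 protocol number for encapsulated IPv6
MF_FLAG = 0x2000       # "more fragments" bit in the flags/offset field
OFFSET_MASK = 0x1FFF   # 13-bit fragment offset


def is_fragmented_isatap(ipv4_header: bytes) -> bool:
    """Return True if the header is a fragment carrying IPv6-in-IPv4.

    Hypothetical helper: expects at least the 20-byte fixed IPv4 header.
    """
    if len(ipv4_header) < 20:
        raise ValueError("need at least 20 bytes of IPv4 header")

    if ipv4_header[0] >> 4 != 4:          # not an IPv4 packet
        return False
    if ipv4_header[9] != IPPROTO_IPV6:    # not IPv6-in-IPv4 encapsulation
        return False

    # Bytes 6-7 hold the 3 flag bits followed by the fragment offset.
    (flags_frag,) = struct.unpack_from("!H", ipv4_header, 6)

    # A fragment has MF set (first/middle) or a non-zero offset (last).
    return bool(flags_frag & (MF_FLAG | OFFSET_MASK))
```

An application built around this check would, on a True result, go on to send the RA message with an MTU option as described in the text; how it chooses the new MTU value is outside the scope of this sketch.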
- When fragmented packets arrive, the application sends IPv6 RA + When fragmented packets arrive, the decapsulator sends IPv6 RA messages with an MTU option to inform the encapsulator that fragmentation has been experienced and a new value for the neighbor's - MTU should be used. The application additionally sends ICMPv6 + MTU should be used. The decapsulator additionally sends ICMPv6 "packet too big" messages to the original source when a fragmented packet is not correctly reassembled. This function need not be built into the decapsulator's operating system and can be added as an - after-market feature. Finally, simply adding an extra bit in the RA - message header ([4], section 4.2) would provide a means for the + after-market feature. Finally, simply adding an extra bit in a + neighbor discovery message header would provide a means for the decapsulator to inform the encapsulator that dynamic MTU discovery is supported. -C.3 Hybrid Method +C.2.3 Hybrid Method In this method, the encapsulator and decapsulator engage in a "neighbor affiliation" protocol to negotiate link-layer parameters - such as MTU. (See: [27] for an example of such an approach.) This + such as MTU. (See: [31] for an example of such an approach.) This approach has the advantage that bi-directional links are used and both ends of the link have unambiguous knowledge that the other end - implements the protocol. However, the signalling protocol between - the endpoints is complicated and additional state is required in both - the encapsulator and decapsulator. + implements the protocol. However, the signaling protocol between the + endpoints is complicated and additional state is required in both the + encapsulator and decapsulator. The hybrid method seems best suited to + implementation in a reliable transport-layer protocol rather than at + the network/link layer. 
-C.4 Summary +C.2.4 Additional Notes - In summary, the decapsulator-based approach in Appendix C.2 has - distinct efficiency advantages over methods that engage the - encapsulator. Additionally, probing methods which use IPv4 - encapsulation with the DF bit NOT set may use LinkMTU values for the - ISATAP link that exceed the underlying link MTU size. Experimental - verification is called for which may eventually result in a - recommendation for proposed standard. + o In all dynamic methods, some packet loss due to link/buffer + restrictions may occur with no ICMPv6 "packet too big" message + returned to the sender. Unenlightened senders will interpret such + loss as loss due to congestion, which may result in longer + convergence to the actual path MTU. Enlightened senders will + interpret the loss as due to link/buffer restrictions and + immediately reduce their MTU estimate. -C.5 Additional Notes + o In all dynamic methods, when a Router Advertisement (RA) message + includes an MTU option hosts SHOULD NOT copy the option's value + into LinkMTU for the ISATAP interface. Instead, when the ISATAP + interface uses a per-neighbor path MTU cache, hosts SHOULD copy + the MTU option's value into the cache entry for the neighbor that + sent the RA message. This leaves an ambiguous interpretation for + processing received RA messages which could be eliminated if [4] + were modified to allow Neighbor Advertisement (NA) messages to + carry MTU options. - o In all methods, some packet loss due to link/buffer restrictions - may occur with no ICMPv6 "packet too big" message returned to the - sender. Unenlightened senders will interpret such loss as loss - due to congestion, which may result in longer convergence to the - actual path MTU. Enlightened senders will interpret the loss as - loss due to link/buffer restrictions and immediately reduce their - MTU estimate. 
+ o In all methods, a "minimum MTU" must be supported by all nodes for + multicast (i.e., even when multicast is emulated on the NBMA IPv4 + network.) The mechanisms described above speak only to the unicast + case for MTU determination. o To avoid denial-of-service attacks that would cause superfluous probing based on counting down/up by small increments, plateau - tables (e.g., [13], section 7) should be used when the actual MTU + tables (e.g., [25], section 7) should be used when the actual MTU value is indeterminate. o ICMPv4 "fragmentation needed" messages may result when a link restriction is encountered but may also come from denial of service attacks. Implementations should treat ICMPv4 "fragmentation needed" messages as "tentative" negative acknowledgments and apply heuristics to determine when to suspect an actual link restriction and when to ignore the messages. IPv6 packets lost due to actual link restrictions are perceived as lost due to congestion by the original source, but robust implementations minimize instances of such packet loss without ICMPv6 "packet too big" messages returned to the sender. - o Nodes that connect to the Internet are expected to be able to - reassemble or discard IPv4 packets up to 64KB in length when the - DF bit is not set in the encapsulating IPv4 header. Nodes that - cannot reassemble or discard maximum-length IPv4 packets are - vulnerable to attacks such as the "ping-of-death". - Intellectual Property Statement The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. 
Copies of