Internet Engineering Task Force                       Robert E. Gilligan
INTERNET-DRAFT                                             Erik Nordmark
                                                  Sun Microsystems, Inc.
                                                            May 15, 1995

            Transition Mechanisms for IPv6 Hosts and Routers
                 <draft-ietf-ngtrans-trans-mech-01.txt>

Abstract

This document specifies IPv4 compatibility mechanisms that can be implemented by IPv6 hosts and routers. These mechanisms include providing complete implementations of both versions of the Internet Protocol (IPv4 and IPv6), and tunneling IPv6 packets over IPv4 routing infrastructures. They are designed to allow IPv6 nodes to maintain complete compatibility with IPv4, which should greatly simplify the deployment of IPv6 in the Internet, and facilitate the eventual transition of the entire Internet to IPv6.

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

This Internet Draft expires on November 15, 1995.

1. Introduction

The key to a successful IPv6 transition is compatibility with the large installed base of IPv4 hosts and routers. Maintaining compatibility with IPv4 while deploying IPv6 will streamline the task of transitioning the Internet to IPv6.
This specification defines a set of mechanisms that IPv6 hosts and routers may implement in order to be compatible with IPv4 hosts and routers.

The mechanisms in this document are designed to be employed by IPv6 hosts and routers that need to interoperate with IPv4 hosts and utilize IPv4 routing infrastructures. We expect that most nodes in the Internet will need such compatibility for a long time to come, and perhaps even indefinitely.

However, IPv6 may be used in some environments where interoperability with IPv4 is not required. IPv6 nodes that are designed to be used in such environments need not use or even implement these mechanisms.

The mechanisms specified here include:

- Dual IP layer. Providing complete support for both IPv4 and IPv6 in hosts and routers.

- IPv6 over IPv4 tunneling. Encapsulating IPv6 packets within IPv4 headers to carry them over IPv4 routing infrastructures. Two types of tunneling are employed: configured and automatic.

Additional transition and compatibility mechanisms may be developed in the future. These will be specified in other documents.

1.2. Terminology

The following terms are used in this document:

Types of Nodes

IPv4-only node:
   A host or router that implements only IPv4. An IPv4-only node does not understand IPv6. The hosts and routers in the installed base of IPv4 that exist before the transition begins are IPv4-only nodes.

IPv6/IPv4 node:
   A host or router that implements both IPv4 and IPv6.

IPv6-only node:
   A host or router that implements IPv6, and does not implement IPv4. The operation of IPv6-only nodes is not addressed here.

IPv6 node:
   Any host or router that implements IPv6. IPv6/IPv4 and IPv6-only nodes are both IPv6 nodes.

IPv4 node:
   Any host or router that implements IPv4. IPv6/IPv4 and IPv4-only nodes are both IPv4 nodes.
Types of IPv6 Addresses

IPv4-compatible IPv6 address:
   An IPv6 address, assigned to an IPv6/IPv4 node, which bears the high-order 96-bit prefix 0:0:0:0:0:0, and an IPv4 address in the low-order 32-bits. IPv4-compatible addresses are used by the automatic tunneling mechanism.

IPv6-only address:
   The remainder of the IPv6 address space. An IPv6 address that bears a prefix other than 0:0:0:0:0:0.

Techniques Used in the Transition

IPv6-over-IPv4 tunneling:
   The technique of encapsulating IPv6 packets within IPv4 so that they can be carried across IPv4 routing infrastructures.

IPv6-in-IPv4 encapsulation:
   IPv6-over-IPv4 tunneling.

Configured tunneling:
   IPv6-over-IPv4 tunneling where the IPv4 tunnel endpoint address is determined by configuration information on the encapsulating node.

Automatic tunneling:
   IPv6-over-IPv4 tunneling where the IPv4 tunnel endpoint address is determined from the IPv4 address embedded in the IPv4-compatible destination address of the IPv6 packet.

1.3. Structure of this Document

The remainder of this document is organized into three sections:

- Section 2 discusses the IPv4-compatible address format.

- Section 3 discusses the operation of nodes with a dual IP layer, IPv6/IPv4 nodes.

- Section 4 discusses IPv6-over-IPv4 tunneling.

2. Addressing

The automatic tunneling mechanism uses a special type of IPv6 address, termed an "IPv4-compatible" address. An IPv4-compatible address is identified by an all-zeros 96-bit prefix, and holds an IPv4 address in the low-order 32-bits. IPv4-compatible addresses are structured as follows:

   |              96-bits                 |   32-bits    |
   +--------------------------------------+--------------+
   |            0:0:0:0:0:0               | IPv4 Address |
   +--------------------------------------+--------------+

            IPv4-Compatible IPv6 Address Format

IPv4-compatible addresses are assigned to IPv6/IPv4 nodes that support automatic tunneling.
Nodes that are configured with IPv4-compatible addresses may use the complete address as their IPv6 address, and use the embedded IPv4 address as their IPv4 address. The remainder of the IPv6 address space (that is, all addresses with 96-bit prefixes other than 0:0:0:0:0:0) are termed "IPv6-only Addresses." 3. Dual IP Layer The most straightforward way for IPv6 nodes to remain compatible with IPv4-only nodes is by providing a complete IPv4 implementation. IPv6 nodes that provide a complete IPv4 implementation in addition to their IPv6 implementation are called "IPv6/IPv4 nodes." IPv6/IPv4 nodes have the ability to send and receive both IPv4 and IPv6 packets. They can directly interoperate with IPv4 nodes using IPv4 packets, and also directly interoperate with IPv6 nodes using IPv6 packets. The dual IP layer technique may or may not be used in conjunction with the IPv6-over-IPv4 tunneling techniques, which are described in section 4. An IPv6/IPv4 node that supports tunneling may support only configured tunneling, or both configured and automatic tunneling. Thus three configurations are possible: - IPv6/IPv4 node that does not perform tunneling. - IPv6/IPv4 node that performs configured tunneling only. - IPv6/IPv4 node that performs configured tunneling and automatic tunneling. 3.1. Address Configuration Because they support both protocols, IPv6/IPv4 nodes may be configured with both IPv4 and IPv6 addresses. Although the two addresses may be related to each other, this is not required. IPv6/IPv4 nodes may be configured with IPv6 and IPv4 addresses that are unrelated to each other. Nodes that perform automatic tunneling are configured with IPv4-compatible IPv6 addresses. These may be viewed as single addresses that can serve both as IPv6 and IPv4 addresses. The entire 128-bit IPv4-compatible IPv6 address is used as the node's IPv6 address, while the IPv4 address embedded in low-order 32-bits serves as the node's IPv4 address. 
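The relationship between an IPv4 address and its IPv4-compatible IPv6 address can be illustrated with a short sketch. It is not part of the specification: the function names are hypothetical, and the code applies only the 96-bit all-zeros prefix test described above, using Python's standard `ipaddress` module.

```python
import ipaddress


def to_ipv4_compatible(ipv4: str) -> ipaddress.IPv6Address:
    """Prepend the all-zeros 96-bit prefix to a 32-bit IPv4 address."""
    v4 = ipaddress.IPv4Address(ipv4)
    # With a zero prefix, the 128-bit value equals the 32-bit value.
    return ipaddress.IPv6Address(int(v4))


def is_ipv4_compatible(addr: ipaddress.IPv6Address) -> bool:
    """True if the high-order 96 bits of the address are all zero."""
    return int(addr) >> 32 == 0


def embedded_ipv4(addr: ipaddress.IPv6Address) -> ipaddress.IPv4Address:
    """Return the IPv4 address held in the low-order 32 bits."""
    if not is_ipv4_compatible(addr):
        raise ValueError("not an IPv4-compatible IPv6 address")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
```

A node holding the IPv4 address 192.0.2.1 would, under this sketch, use `to_ipv4_compatible("192.0.2.1")` as its IPv6 address, and `embedded_ipv4()` recovers the IPv4 address from it.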
IPv6/IPv4 nodes may use the stateless IPv6 address configuration mechanism  or DHCP for IPv6  to acquire their IPv6 address. These mechanisms may provide either IPv4-compatible or IPv6-only IPv6 addresses. IPv6/IPv4 nodes may use IPv4 mechanisms to acquire their IPv4 addresses. IPv6/IPv4 nodes that perform automatic tunneling may also acquire their IPv4-compatible IPv6 addresses from another source: IPv4 address configuration protocols. A node may use any IPv4 address configuration mechanism to acquire its IPv4 address, then "map" that address into an IPv4-compatible IPv6 address by pre-pending it with the 96-bit prefix 0:0:0:0:0:0. This mode of configuration allows IPv6/IPv4 nodes to "leverage" the installed base of IPv4 address configuration servers. It can be particularly useful in environments where IPv6 routers and address configuration servers have not yet been deployed. The specific algorithm for acquiring an IPv4-compatible address using IPv4-based address configuration protocols is as follows: 1) The IPv6/IPv4 node uses standard IPv4 mechanisms or protocols to acquire its own IPv4 address. These include: - The Dynamic Host Configuration Protocol (DHCP)  - The Bootstrap Protocol (BOOTP)  - The Reverse Address Resolution Protocol (RARP)  - Manual configuration - Any other mechanism which accurately yields the node's own IPv4 address 2) The node uses this address as its IPv4 address. 3) The node prepends the 96-bit prefix 0:0:0:0:0:0 to the 32-bit IPv4 address that it acquired in step (1). The result is an IPv4-compatible IPv6 address with the node's own IPv4-address embedded in the low-order 32-bits. The node uses this address as its own IPv6 address. 3.1.1. IPv4 Loopback Address Many IPv4 implementations treat the address 127.0.0.1 as a "loopback address" -- an address to reach services located on the local machine. 
Per the host requirements specification, section 3.2.1.3, IPv4 packets addressed from or to the loopback address are not to be sent onto the network; they must remain entirely within the node. IPv6/IPv4 implementations may treat the IPv4-compatible IPv6 address ::127.0.0.1 as an IPv6 loopback address. Packets with this address should also remain entirely within the node, and not be transmitted onto the network.

3.2. DNS

The Domain Name System (DNS) is used in both IPv4 and IPv6 to map hostnames into addresses. A new resource record type named "AAAA" has been defined for IPv6 addresses. Since IPv6/IPv4 nodes must be able to interoperate directly with both IPv4 and IPv6 nodes, they must provide resolver libraries capable of dealing with IPv4 "A" records as well as IPv6 "AAAA" records.

Some sites use local host tables instead of, or in addition to, the DNS. Use of host tables may be particularly useful in the very early stages of transition before the DNS infrastructure has been converted to support AAAA records. Therefore, implementations may provide a host table mechanism in addition to their DNS resolver. Note that the local host table mechanism does not scale very well, so its use is not recommended for large sites. Further discussion of the host table issue can be found in section 6.1.1 of "Requirements for Internet Hosts -- Application and Support".

3.2.1. Handling Records for IPv4-Compatible Addresses

When an IPv4-compatible IPv6 address is assigned to an IPv6/IPv4 host that supports automatic tunneling, both A and AAAA records are listed in the DNS. The AAAA record holds the full IPv4-compatible IPv6 address, while the A record holds the low-order 32-bits of that address. The AAAA record is needed so that queries by IPv6 hosts can be satisfied. The A record is needed so that queries by IPv4-only hosts, whose resolver libraries only support the A record type, will locate the host.
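As an illustration only, the pairing of A and AAAA records for such a host can be checked mechanically. The helper name `records_match` is hypothetical and the check is a sketch of the rule stated above, not a DNS implementation.

```python
import ipaddress


def records_match(aaaa: str, a: str) -> bool:
    """True if the AAAA value is IPv4-compatible and the A value
    equals its low-order 32 bits."""
    v6 = int(ipaddress.IPv6Address(aaaa))
    v4 = int(ipaddress.IPv4Address(a))
    # High-order 96 bits must be zero; low-order 32 bits must equal A.
    return v6 >> 32 == 0 and (v6 & 0xFFFFFFFF) == v4
```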
DNS resolver libraries on IPv6/IPv4 nodes must be capable of handling both AAAA and A records. However, when a query locates an AAAA record holding an IPv4-compatible IPv6 address, and an A record holding the corresponding IPv4 address, the resolver library need not necessarily return both addresses. It has three options: - Return only the IPv6 address to the application. - Return only the IPv4 address to the application. - Return both addresses to the application. The selection of which address type to return in this case, or, if both addresses are returned, in which order they are listed, can affect what type of IP traffic is generated. If the IPv6 address is returned, the node will communicate with that destination using IPv6 packets (in most cases encapsulated in IPv4); If the IPv4 address is returned, the communication will use IPv4 packets. The way that DNS resolver implementations handle redundant records for IPv4-compatible addresses may depend on whether that implementation supports automatic tunneling, or whether it is enabled. For example, an implementation that does not support automatic tunneling would not return IPv4-compatible IPv6 addresses to applications because those destinations are generally only reachable via tunneling. On the other hand, those implementations in which automatic tunneling is supported and enabled may elect to return only the IPv4-compatible IPv6 address and not the IPv4 address. 4. IPv6-over-IPv4 Tunneling In most deployment scenarios, the IPv6 routing infrastructure will be built up over time. While the IPv6 infrastructure is being deployed, the existing IPv4 routing infrastructure can remain functional, and can be used to carry IPv6 traffic. Tunneling provides a way to utilize an existing IPv4 routing infrastructure to carry IPv6 traffic. IPv6/IPv4 hosts and routers can tunnel IPv6 datagrams over regions of IPv4 routing topology by encapsulating them within IPv4 packets. 
Tunneling can be used in a variety of ways: - Router-to-Router. IPv6/IPv4 routers interconnected by an IPv4 infrastructure can tunnel IPv6 packets between themselves. In this case, the tunnel spans one segment of the end-to-end path that the IPv6 packet takes. - Host-to-Router. IPv6/IPv4 hosts can tunnel IPv6 packets to an intermediary IPv6/IPv4 router that is reachable via an IPv4 infrastructure. This type of tunnel spans the first segment of the packet's end-to-end path. - Host-to-Host. IPv6/IPv4 hosts that are interconnected by an IPv4 infrastructure can tunnel IPv6 packets between themselves. In this case, the tunnel spans the entire end-to-end path that the packet takes. - Router-to-Host. IPv6/IPv4 routers can tunnel IPv6 packets to their final destination IPv6/IPv4 host. This tunnel spans only the last segment of the end-to-end path. Tunneling techniques are usually classified according to the mechanism by which the encapsulating node determines the address of the node at the end of the tunnel. In the first two tunneling methods listed above -- router-to-router and host-to-router -- the IPv6 packet is being tunneled to a router. The endpoint of this type of tunnel is an intermediary router which must decapsulate the IPv6 packet and forward it on to its final destination. When tunneling to a router, the endpoint of the tunnel is different from the destination of the packet being tunneled. So the addresses in the IPv6 packet being tunneled do not provide the IPv4 address of the tunnel endpoint. Instead, the tunnel endpoint address must be determined from configuration information on the node performing the tunneling. We use the term "configured tunneling" to describe the type of tunneling where the endpoint is explicitly configured. In the last two tunneling methods -- host-to-host and router-to-host -- the IPv6 packet is tunneled all the way to its final destination. The tunnel endpoint is the node to which the IPv6 packet is addressed. 
Since the endpoint of the tunnel is the destination of the IPv6 packet, the tunnel endpoint can be determined from the destination IPv6 address of that packet: if that address is an IPv4-compatible address, then the low-order 32-bits hold the IPv4 address of the destination node, and that can be used as the tunnel endpoint address. This technique avoids the need to explicitly configure the tunnel endpoint address. Deriving the tunnel endpoint address from the embedded IPv4 address of the packet's IPv6 address is termed "automatic tunneling".

The two tunneling techniques -- automatic and configured -- differ primarily in how they determine the tunnel endpoint address. Most of the underlying mechanisms are the same:

- The entry node of the tunnel (the encapsulating node) creates an encapsulating IPv4 header and transmits the encapsulated packet.

- The exit node of the tunnel (the decapsulating node) receives the encapsulated packet, removes the IPv4 header, updates the IPv6 header, and processes the received IPv6 packet.

- The encapsulating node may need to maintain soft state information for each tunnel, recording such parameters as the MTU of the tunnel and its path length, in order to process IPv6 packets forwarded into the tunnel and to correctly generate IPv6 ICMP error messages. Since the number of tunnels that any one host or router may be using may grow to be quite large, this state information can be cached and discarded when not in use.

The next section discusses the common mechanisms that apply to both types of tunneling. Subsequent sections discuss how the tunnel endpoint address is determined for automatic and configured tunneling.

4.1. Common Tunneling Mechanisms

The encapsulation of an IPv6 datagram in IPv4 is shown below:

                                +-------------+
                                |    IPv4     |
                                |   Header    |
        +-------------+         +-------------+
        |    IPv6     |         |    IPv6     |
        |   Header    |         |   Header    |
        +-------------+         +-------------+
        |  Transport  |         |  Transport  |
        |   Layer     |  ===>   |   Layer     |
        |   Header    |         |   Header    |
        +-------------+         +-------------+
        |             |         |             |
        ~    Data     ~         ~    Data     ~
        |             |         |             |
        +-------------+         +-------------+

                 Encapsulating IPv6 in IPv4

In addition to adding an IPv4 header, the encapsulating node also has to handle some more complex issues:

- Determine when to fragment and when to report an ICMP "packet too big" error back to the source.

- How to account for the tunnel in the IPv6 Hop Limit field.

- How to reflect IPv4 ICMP errors from routers along the tunnel path back to the source as IPv6 ICMP errors.

Those issues are discussed in the following sections.

4.1.1. Tunnel MTU and Fragmentation

The encapsulating node could view encapsulation as IPv6 using IPv4 as a link layer with a very large MTU (65535-20 bytes to be exact; 20 bytes "extra" are needed for the encapsulating IPv4 header). The encapsulating node would need only to report IPv6 ICMP "packet too big" errors back to the source for packets that exceed this MTU. However, such a scheme would be inefficient for two reasons:

1) It would result in more fragmentation than needed. IPv4 layer fragmentation should be avoided due to the performance problems caused by the loss unit being smaller than the retransmission unit.

2) Any IPv4 fragmentation occurring inside the tunnel would have to be reassembled at the tunnel endpoint. For tunnels that terminate at a router, this would require additional memory to reassemble the IPv4 fragments into a complete IPv6 packet before that packet could be forwarded onward.
The fragmentation inside the tunnel can be reduced to a minimum by having the encapsulating node track the IPv4 Path MTU across the tunnel, using the IPv4 Path MTU Discovery Protocol and recording the resulting path MTU. The IPv6 layer in the encapsulating node can then view a tunnel as a link layer with an MTU equal to the IPv4 path MTU, minus the size of the encapsulating IPv4 header.

Note that this does not completely eliminate IPv4 fragmentation in the case when the IPv4 path MTU would result in an IPv6 MTU less than 576 bytes. (Any link layer used by IPv6 has to have an MTU of at least 576 bytes.) In this case the IPv6 layer has to "see" a link layer with an MTU of 576 bytes and the encapsulating node has to use IPv4 fragmentation in order to forward the 576 byte IPv6 packets.

The encapsulating node can employ the following algorithm to determine when to forward an IPv6 packet that is larger than the tunnel's path MTU using IPv4 fragmentation, and when to return an IPv6 ICMP "packet too big" message:

   if (IPv4 path MTU - 20) is less than or equal to 576
      if packet is larger than 576 bytes
         Send IPv6 ICMP "packet too big" with MTU = 576.
         Drop packet.
      else
         Encapsulate but do not set the Don't Fragment
         flag in the IPv4 header. The resulting IPv4
         packet might be fragmented by the IPv4 layer on
         the encapsulating node or by some router along
         the IPv4 path.
      endif
   else
      if packet is larger than (IPv4 path MTU - 20)
         Send IPv6 ICMP "packet too big" with
         MTU = (IPv4 path MTU - 20).
         Drop packet.
      else
         Encapsulate and set the Don't Fragment flag
         in the IPv4 header.
      endif
   endif

Encapsulating nodes that have a large number of tunnels might not be able to store the IPv4 Path MTU for all tunnels. Such nodes can, at the expense of additional fragmentation in the network, avoid using the IPv4 Path MTU algorithm across the tunnel and instead use the MTU of the link layer (under IPv4) in the above algorithm in place of the IPv4 path MTU.
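The decision procedure of this section can be restated as a small function. This is only a sketch: the names `Action` and `tunnel_decision` are hypothetical, and the constants 20 and 576 are the IPv4 header size and minimum IPv6 MTU that this section assumes.

```python
from enum import Enum
from typing import Optional, Tuple

IPV4_HEADER_LEN = 20   # encapsulating header carries no IPv4 options
IPV6_MIN_MTU = 576     # minimum IPv6 link MTU assumed in this section


class Action(Enum):
    PACKET_TOO_BIG = "send IPv6 ICMP packet too big and drop"
    ENCAPSULATE_DF_CLEAR = "encapsulate, Don't Fragment not set"
    ENCAPSULATE_DF_SET = "encapsulate, Don't Fragment set"


def tunnel_decision(packet_len: int,
                    ipv4_path_mtu: int) -> Tuple[Action, Optional[int]]:
    """Return the action for an IPv6 packet of packet_len bytes and,
    for PACKET_TOO_BIG, the MTU value to report to the source."""
    tunnel_mtu = ipv4_path_mtu - IPV4_HEADER_LEN
    if tunnel_mtu <= IPV6_MIN_MTU:
        if packet_len > IPV6_MIN_MTU:
            return Action.PACKET_TOO_BIG, IPV6_MIN_MTU
        # Small packet on a narrow path: allow IPv4 fragmentation.
        return Action.ENCAPSULATE_DF_CLEAR, None
    if packet_len > tunnel_mtu:
        return Action.PACKET_TOO_BIG, tunnel_mtu
    return Action.ENCAPSULATE_DF_SET, None
```

For example, with an IPv4 path MTU of 1500 the tunnel appears to IPv6 as a link with an MTU of 1480 bytes, so a 1500-byte IPv6 packet is answered with "packet too big" reporting 1480.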
When the link-layer MTU is used in place of the IPv4 path MTU in this way, the Don't Fragment bit must not be set in the encapsulating IPv4 header.

4.1.2. Hop Limit

The IPv4 hops of an IPv6-over-IPv4 tunnel can be accounted for in one of two ways:

1) Each of the "hops" that an encapsulated IPv6 datagram takes through IPv4 routers can be reflected in the IPv6 hop limit field. For example, if the IPv4 path length of a tunnel is 5 hops, the IPv6 "hop limit" field is decremented by 5 when an IPv6 packet travels through the tunnel. We use the term "multi-hop" to describe tunnels that use this model.

2) The tunnel can be modeled as consuming only one IPv6 hop independent of its IPv4 path length. That is, the IPv6 hop limit is decremented only by 1 when an IPv6 packet traverses the tunnel. We use the term "single-hop" to describe tunnels that use this model.

These two models can be used to achieve different objectives. The multi-hop model can be useful to enforce the scope limitations imposed by the sender of the IPv6 datagram. It also makes the tunnel "traceroute detectable": by sending IPv6 packets with hop limit values that will cause them to "expire" within the tunnel, network management programs like "traceroute" can locate tunnels and determine their path length. Such programs cannot determine the addresses of the IPv4 routers within the tunnel, however. The single-hop model serves to hide the existence of a tunnel. Since a single-hop tunnel only "consumes" one IPv6 hop, it is opaque to users of the network, and is not detectable by network diagnostic tools such as traceroute.

The multi-hop model can be implemented by having the encapsulating node copy the IPv6 hop limit into the IPv4 TTL field when it composes the encapsulating packet, and having the decapsulating node copy the IPv4 TTL field back into the IPv6 hop limit field.

The single-hop model is implemented by having the encapsulating node select the IPv4 TTL independently of the IPv6 hop limit, and the decapsulating node not copying the IPv4 TTL into the hop limit field. The encapsulating and decapsulating nodes process the IPv6 hop limit field as they would if they were forwarding a packet on to any other datalink. That is, they decrement the hop limit by 1 when forwarding an IPv6 packet. (The originating node and final destination do not decrement the hop limit.) The TTL of the encapsulating IPv4 header is selected in an implementation dependent manner. The current suggested value is published in the "Assigned Numbers" RFC. Implementations may provide a mechanism to allow the administrator to configure the IPv4 TTL.

Implementations may provide either model or both. Implementations that provide both models may wish to give administrators the ability to configure which model is used for each tunnel. If implementations provide configurability, it is important that both ends of the tunnel -- the encapsulating and decapsulating nodes -- are configured to use the same model. If the tunnel endpoints are configured differently, packets could end up with an incorrect IPv6 hop limit.

No serious problems would result if the encapsulating node were configured to use the multi-hop model, but the decapsulating node was configured to use the single-hop model. The results would be the same as if both ends were configured to use the single-hop model. However, two failure modes can occur if the encapsulating node is configured to use the single-hop model and the decapsulating node is configured to use the multi-hop model:

- The IPv6 packet exits the tunnel with a larger hop limit than it had when entering the tunnel. This would occur if the amount of IPv4 TTL remaining when the packet reached the decapsulating node was larger than the IPv6 hop limit. This failure can be thought of as the IPv6 packet "gaining hop limit" when passing through the tunnel.

- The number of IPv6 hops "consumed" in passing through the tunnel is more than the IPv4 path length of the tunnel.
  This would occur if the difference between the IPv6 hop limit in the packet and the remaining IPv4 TTL was greater than the IPv4 path length of the tunnel. This failure can be thought of as the IPv6 packet "losing too much hop limit" when passing through the tunnel.

Note that in both of these cases, the original IPv6 hop limit is lost. Its value after transiting the tunnel is related only to the IPv4 TTL selected by the encapsulating node, which is not related to the hop limit in the IPv6 packet.

Of the two potential failure modes above, the first is more serious since it could cause a packet to "live forever". A routing loop which sent IPv6 packets through such a tunnel could cause an infinite cycle of packets, for example. The second failure mode would cause packets to expire prematurely.

The decapsulating node can implement a simple algorithm to prevent the "gaining hop limit" problem. This algorithm does not prevent the second problem. This algorithm is implemented as part of the process of decapsulating the IPv6 packet:

- If the tunnel is configured to use the "single-hop" model, do not modify the IPv6 hop limit field.

- If the tunnel is configured to use the "multi-hop" model, then:

   - If the IPv4 TTL field is greater than or equal to the IPv6 hop limit field, do not modify the IPv6 hop limit field.

   - Else, copy the IPv4 TTL field into the IPv6 hop limit field.

It is an open issue whether the "losing too much hop limit" problem is serious enough to require that a solution be developed.

Note that the decision about whether to copy the IPv4 TTL field into the hop limit field does not affect the requirement to decrement the hop limit field: if the encapsulating or decapsulating node is an IPv6 router that forwards the packet, it must decrement the IPv6 hop limit.

Note also that the hop limit problem affects only configured tunnels. Automatic tunnels terminate at the end node, where the packet is consumed, not forwarded, so the remaining hop limit is irrelevant.
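The decapsulation rule above amounts to never letting the hop limit increase. A minimal sketch follows, with a hypothetical function name; it covers only the decapsulation step, and the separate decrement performed by a forwarding router is not shown.

```python
def hop_limit_after_decapsulation(ipv6_hop_limit: int,
                                  ipv4_ttl: int,
                                  multi_hop: bool) -> int:
    """Return the IPv6 hop limit to use after removing the IPv4 header."""
    if not multi_hop:
        # Single-hop model: the IPv4 TTL is never copied back.
        return ipv6_hop_limit
    if ipv4_ttl >= ipv6_hop_limit:
        # Copying would let the packet "gain" hop limit, so keep it.
        return ipv6_hop_limit
    return ipv4_ttl
```

In the multi-hop case this is simply the smaller of the two values, which is why a mismatched encapsulating node that chose a large IPv4 TTL cannot cause the hop limit to grow.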
4.1.3. Handling IPv4 ICMP errors

The encapsulating node has to be able to handle IPv4 ICMP errors that are generated by routers interior to the tunnel. All such errors are returned to the encapsulating node since the encapsulating node is the IPv4 source of the packets.

Ideally the encapsulating node would want to convert these errors to IPv6 ICMP errors and send them back to the source of the original IPv6 datagram. However, this is infeasible since the IPv4 ICMP errors may not return enough of the "offending packet". Many IPv4 implementations only return the IPv4 header plus 8 bytes of the IPv4 payload, which will not even contain the complete IPv6 header, let alone enough higher level headers for the originating node to determine which application originated the packet that experienced the error.

For the purpose of this discussion there are two categories of errors:

1) ICMP errors that are needed to maintain connectivity. Only ICMP "packet too big" falls in this category; a persistent loss of ICMP "packet too big" messages would result in a black hole for large packets.

2) ICMP errors that are needed by network management tools like traceroute. These errors include ICMP unreachable and ICMP TTL expired.

The ICMP "packet too big" errors are handled according to IPv4 Path MTU Discovery and the resulting path MTU is recorded in the IPv4 layer. The recorded path MTU is used by IPv6 to determine if an IPv6 ICMP "packet too big" error has to be generated as described in section 4.1.1.

The other errors can be handled as described in the remainder of this section to make multi-hop tunnels "traceroute detectable."

Making a tunnel traceroute detectable is implemented by having the encapsulating node maintain "soft state" information about the tunnel. This state is created based on the IPv4 ICMP errors that are received in response to encapsulated packets.
When the encapsulating node prepares to send an IPv6 packet into a tunnel, it consults the tunnel state to determine if the packet is likely to generate an ICMP error inside the tunnel. If so, it generates an appropriate IPv6 ICMP error, which it sends back to the source of the IPv6 packet. It also encapsulates the packet and sends it into the tunnel. The latter is needed to quickly recover from transient error conditions.

Note that, since the IPv6 ICMP error message originates at the encapsulating node, not at the IPv4 router within the tunnel, the node that sent the original IPv6 packet does not receive the address of the IPv4 router. Thus a traceroute program cannot determine the addresses of the IPv4 routers within a tunnel, but it can detect their presence by noting that packets with a consecutive range of hop limits expire at the same router (the encapsulating router).

Tunnel state information is associated with the IPv4 address of the endpoint of the tunnel and can include:

- The MTU of the tunnel. Its use is described in section 4.1.1.

- Reachability of the endpoint of the tunnel.

- If the endpoint of the tunnel is unreachable, the IPv4 address of the router reporting unreachability.

- Path length of the tunnel (number of IPv4 hops to the endpoint).

- For each TTL 't' between 1 and the path length of the tunnel, the IPv4 address of the router that was last known to be 't' hops into the tunnel.

Maintaining the IPv4 addresses of the routers internal to the tunnel is not strictly necessary for correct operation, but is useful for network management.

The tunnel state does not have to be allocated until an ICMP error is received. In the absence of tunnel state, the tunnel MTU can be assumed to be the MTU of the outgoing interface, the path length one hop and the endpoint being reachable.

When the encapsulating node receives an IPv4 ICMP error where the "offending packet" is an IPv6-in-IPv4 packet (i.e. an IPv4 packet with an IP protocol field of 41), the encapsulating node updates the tunnel state associated with the IPv4 destination in the "offending packet". The update depends on the type of ICMP error:

- Host or network unreachable: Mark the tunnel endpoint as unreachable and record the source of the ICMP error as the source of unreachability.

- Time exceeded in transit: The TTL "consumed" before reaching the router that sent the time exceeded message is extracted from the IPv6 hop limit field in the "offending packet" (the IPv6 hop limit field is in the first 8 bytes of the IPv6 header, thus it will be returned in the ICMP packet). Compute the updated tunnel path length as the maximum of the currently recorded path length and the extracted IPv6 hop limit. Record the source of the ICMP error as the router at 'IPv6 hop limit' hops into the tunnel.

- "Packet too big": Use the IPv4 Path MTU Discovery algorithm to update the tunnel MTU.

- For all other ICMP errors log a network management event.

When the encapsulating node prepares to forward an IPv6 packet into the tunnel it performs the following checks against the tunnel state:

- If the tunnel endpoint is unreachable, it generates an IPv6 ICMP "destination unreachable" message.

- If the hop limit is less than the recorded tunnel TTL (the recorded path length of the tunnel), it generates an IPv6 ICMP "time exceeded" message.

- If the packet would violate the tunnel MTU, it generates an IPv6 ICMP "packet too big" message, as specified in section 4.1.1.
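The soft state and the two procedures that use it can be sketched as follows. Everything named here (`TunnelState`, `on_ipv4_icmp_error`, `errors_before_forwarding`, the string labels for error kinds) is hypothetical and chosen for illustration. The sketch leaves "packet too big" to IPv4 Path MTU Discovery, simplifies the MTU check to a single comparison, and omits the logging of other error types.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TunnelState:
    mtu: int                                  # default: outgoing interface MTU
    reachable: bool = True                    # default: endpoint reachable
    unreachable_reporter: Optional[str] = None
    path_length: int = 1                      # default: one hop
    routers: Dict[int, str] = field(default_factory=dict)  # hops in -> router


def on_ipv4_icmp_error(state: TunnelState, kind: str, reporter: str,
                       offending_hop_limit: Optional[int] = None) -> None:
    """Update soft state from an IPv4 ICMP error about a protocol-41 packet."""
    if kind == "unreachable":
        state.reachable = False
        state.unreachable_reporter = reporter
    elif kind == "time-exceeded" and offending_hop_limit is not None:
        # The hop limit in the returned IPv6 header is the TTL consumed.
        state.path_length = max(state.path_length, offending_hop_limit)
        state.routers[offending_hop_limit] = reporter


def errors_before_forwarding(state: TunnelState, packet_len: int,
                             hop_limit: int) -> List[str]:
    """List the IPv6 ICMP errors to generate; the packet is sent regardless."""
    errors = []
    if not state.reachable:
        errors.append("destination unreachable")
    if hop_limit < state.path_length:
        errors.append("time exceeded")
    if packet_len > state.mtu:
        errors.append("packet too big")
    return errors
```

Keeping the defaults in the dataclass mirrors the rule that state need not be allocated until the first ICMP error arrives: a freshly created `TunnelState` behaves like a reachable one-hop tunnel.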
The IPv6handling of other types of ICMP error messagemessages depends on how much information is sent back to the source ofincluded in the IPv6 packet, and includes as much of"packet in error" field, which holds the original IPv6encapsulated packet as will fit. The source IPv6 addressthat caused the error. Many older IPv4 routers return only 8 bytes of data beyond the ICMP message is thatIPv4 header of the encapsulating node. That original IPv6packet in error, which is also forwarded intonot enough to include the tunnel. The algorithm as described above quickly returns IPv6 ICMP errors as a resultaddress fields of the IPv6 header. More modern IPv4 ICMP errors from insiderouters may return enough data beyond the tunnel. In orderIPv4 header to determine wheninclude the error condition is lifted, it relies on: - A timeout. All tunnel state, exceptentire IPv6 header and possibly even the tunnel MTU, should be discarded after at most 30 seconds after it was created.data beyond that. If the error condition still existsoffending packet includes enough data, the encapsulating node may extract the encapsulated IPv6 packet and packets continueuse it to flow through that tunnel, IPv4generating an IPv6 ICMP errors will continuemessage directed back to arrive and they will cause a refresh ofthe tunnel state. The tunnel MTU is timed outoriginating IPv6 node, as described inshown below: +--------------+ | IPv4 Path MTU Discovery .Header | | dst = encaps | | node | +--------------+ | ICMP | | Header | - Data packets are always sent into the tunnel, even when the encapsulating- +--------------+ | IPv4 Header | | src = encaps | IPv4 | node generates| +--------------+ - - Packet | IPv6 | | Header | Original IPv6 in +--------------+ Packet - | Transport | Can be used to Error | Header | generate an +--------------+ IPv6 ICMP error message. 
This means that packets will get through as soon as theICMP | | error condition withinmessage ~ Data ~ back to the tunnel is relieved, although error reports may continue for a short period thereafter.source. | | - - +--------------+ - - IPv4 ICMP Error Message Returned to Encapsulating Node 4.1.4. IPv4 Header Construction When encapsulating an IPv6 packet in an IPv4 datagram, the IPv4 header fields are set as follows: Version: 4 IP Header Length in 32-bit words: 5 (There are no IPv4 options in the encapsulating header.) Type of Service: 0 Total Length: Payload length from IPv6 header plus length of IPv6 and IPv4 headers (i.e. a constant 60 bytes). Identification: Generated uniquely as for any IPv4 packet transmitted by the system. Flags: Set the Don't Fragment (DF) flag as specified in section 4.1.1. Set the More Fragments (MF) bit as necessary if fragmenting. Fragment offset: Set as necessary if fragmenting. Time to Live: If tunnel is configured as multi-hop: Copied from the IPv6 hop limit field. If tunnel is configured as single-hop:Set to pre-configured value.in implementation-specific manner. Protocol: 41 (Assigned payload type number for IPv6) Header Checksum: Calculate the checksum of the IPv4 header. Source Address: IPv4 address of outgoing interface of the encapsulating node. Destination Address: IPv4 address of of tunnel endpoint. Any IPv6 options are preserved in the packet (after the IPv6 header). 4.1.5. Decapsulating IPv6-in-IPv4 Packets When an IPv6/IPv4 host or a router receives an IPv4 datagram that is addressed to one of its own IPv4 address, and the value of the protocol field is 41, it removes the IPv4 header and submits the IPv6 datagram to its IPv6 layer code. 
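The header construction of section 4.1.4 and the decapsulation check just described can be sketched as follows. This is only an illustrative sketch, not part of the specification: the function names (`ipv4_checksum`, `encapsulate`, `decapsulate`) and their parameters are hypothetical, the TTL is simply supplied by the caller, addresses are passed as 4-byte strings, and fragmentation and reassembly are not handled.

```python
import struct

IPPROTO_IPV6 = 41  # payload type number assigned to IPv6 (section 4.1.4)


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, complemented."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(ipv6_packet: bytes, src: bytes, dst: bytes, ttl: int,
                ident: int = 0, dont_fragment: bool = False) -> bytes:
    """Prepend a 20-byte, option-free IPv4 header to an IPv6 packet.

    Assumption: the whole packet fits in one datagram (no fragmenting).
    """
    flags_and_offset = 0x4000 if dont_fragment else 0  # DF bit, offset 0
    header = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,            # Version 4, header length 5 words
        0,                       # Type of Service
        20 + len(ipv6_packet),   # Total Length: IPv4 header + IPv6 packet
        ident,                   # Identification (caller-supplied here)
        flags_and_offset,
        ttl,
        IPPROTO_IPV6,
        0,                       # checksum placeholder
        src,
        dst,
    )
    checksum = struct.pack("!H", ipv4_checksum(header))
    return header[:10] + checksum + header[12:] + ipv6_packet


def decapsulate(datagram: bytes, local_addrs: set):
    """Return the inner IPv6 packet, or None if the datagram is not an
    IPv6-in-IPv4 packet addressed to one of this node's IPv4 addresses."""
    if len(datagram) < 20 or datagram[0] >> 4 != 4:
        return None
    header_len = (datagram[0] & 0x0F) * 4
    if header_len < 20 or len(datagram) < header_len:
        return None
    if datagram[9] != IPPROTO_IPV6 or datagram[16:20] not in local_addrs:
        return None
    return datagram[header_len:]
```

A caller would pass the tunnel endpoint as `dst` when encapsulating; the receiving node passes the set of its own IPv4 addresses to `decapsulate` and hands any non-None result to its IPv6 layer.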
The decapsulation is shown below:

        +-------------+
        |    IPv4     |
        |   Header    |
        +-------------+                 +-------------+
        |    IPv6     |                 |    IPv6     |
        |   Header    |                 |   Header    |
        +-------------+                 +-------------+
        |  Transport  |                 |  Transport  |
        |   Layer     |      ===>       |   Layer     |
        |   Header    |                 |   Header    |
        +-------------+                 +-------------+
        |             |                 |             |
        ~    Data     ~                 ~    Data     ~
        |             |                 |             |
        +-------------+                 +-------------+

                    Decapsulating IPv6 from IPv4

When decapsulating the IPv6-in-IPv4 packet, only the hop limit field of the IPv6 header is modified:

   If tunnel is configured as single-hop:
      Do not modify the IPv6 hop limit field.

   If tunnel is configured as multi-hop:
      If the IPv4 TTL field is greater than or equal to the IPv6 hop limit field, do not modify the IPv6 hop limit field. Else, copy the IPv4 TTL field into the IPv6 hop limit field.

The encapsulating IPv4 header is then discarded. If the packet is subsequently forwarded, its hop limit is decremented by one.

The decapsulating node performs IPv4 reassembly before decapsulating the IPv6 packet. All IPv6 options are preserved even if the encapsulating IPv4 packet is fragmented.

After the IPv6 packet is decapsulated, it is processed the same as any received IPv6 packet.

4.2. Configured Tunneling

In configured tunneling, the tunnel endpoint address is determined from configuration information in the encapsulating node. For each tunnel, the encapsulating node must store the tunnel endpoint address. When an IPv6 packet is transmitted over a tunnel, the tunnel endpoint address configured for that tunnel is used as the destination address for the encapsulating IPv4 header.

The determination of which packets to tunnel is usually made by routing information on the encapsulating node. This is usually done via a routing table, which directs packets based on their destination address using the prefix mask and match technique.

4.2.1. Default Configured Tunnel

Nodes that are connected to IPv4 routing infrastructures may use a configured tunnel to reach an IPv6 "backbone".
If the IPv4 address of an IPv6/IPv4 router bordering the backbone is known, a tunnel can be configured to that router. This tunnel can be configured into the routing table as a "default route". That is, all IPv6 destination addresses will match the route and could potentially traverse the tunnel. Since the "mask length" of such a default route is zero, it will be used only if there are no other routes with a longer mask that match the destination.

The tunnel endpoint address of such a default tunnel could be the IPv4 address of one IPv6/IPv4 router at the border of the IPv6 backbone. Alternatively, the tunnel endpoint could be an IPv4 "anycast address". With this approach, multiple IPv6/IPv4 routers at the border advertise IPv4 reachability to the same IPv4 address. All of these routers accept packets to this address as their own, and will decapsulate IPv6 packets tunneled to this address. When an IPv6/IPv4 node sends an encapsulated packet to this address, it will be delivered to only one of the border routers, but the sending node will not know which one. The IPv4 routing system will generally carry the traffic to the closest router.

Using a default tunnel to an IPv4 "anycast address" provides a high degree of robustness since multiple border routers can be provided, and, using the normal fallback mechanisms of IPv4 routing, traffic will automatically switch to another router when one goes down.

4.3. Automatic Tunneling

In automatic tunneling, the tunnel endpoint address is determined from the packet being tunneled. The destination IPv6 address in the packet must be an IPv4-compatible address. If it is, the IPv4 address component of that address (the low-order 32 bits) is extracted and used as the tunnel endpoint address. IPv6 packets that are not addressed to an IPv4-compatible address cannot be tunneled using automatic tunneling.
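The endpoint extraction described above can be illustrated with a short sketch using Python's standard `ipaddress` module. The function name `automatic_tunnel_endpoint` is hypothetical, and the sketch assumes that "IPv4-compatible" simply means the high-order 96 bits are zero. It does not special-case the unspecified (::) or loopback (::1) addresses, which also fall inside that prefix.

```python
import ipaddress

# All-zeros 96-bit prefix carried by IPv4-compatible IPv6 addresses.
IPV4_COMPAT_PREFIX = ipaddress.IPv6Network("::/96")


def automatic_tunnel_endpoint(ipv6_destination: str):
    """Return the IPv4 tunnel endpoint for an IPv4-compatible IPv6
    destination, or None if the address cannot be auto-tunneled."""
    addr = ipaddress.IPv6Address(ipv6_destination)
    if addr not in IPV4_COMPAT_PREFIX:
        return None
    # The low-order 32 bits are the embedded IPv4 address.
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
```

For example, a destination written as `::10.1.2.3` yields the endpoint 10.1.2.3, while an address outside the all-zeros prefix yields None and must be handled by other means, such as a configured tunnel.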
IPv6/IPv4 nodes need to determine which IPv6 packets can be sent via automatic tunneling. One technique is to use the IPv6 routing table to direct automatic tunneling. An implementation can have a special static routing table entry for the prefix 0:0:0:0:0:0/96. (That is, a route to the all-zeros prefix with a 96-bit mask.) Packets that match this prefix are sent to a pseudo-interface driver which performs automatic tunneling. Since all IPv4-compatible IPv6 addresses will match this prefix, all packets to those destinations will be auto-tunneled.

4.4. Default Sending Algorithm

This section presents a combined IPv4 and IPv6 sending algorithm that IPv6/IPv4 nodes can use. The algorithm can be used to determine when to send IPv4 packets, when to send IPv6 packets, and when to perform automatic and configured tunneling. It illustrates how the techniques of dual IP layer, configured tunneling, and automatic tunneling can be used together. The algorithm has the following properties:

   - Sends IPv4 packets to all IPv4 destinations.

   - Sends IPv6 packets to all IPv6 destinations on the same link.

   - Using automatic tunneling, sends IPv6 packets encapsulated in IPv4 to IPv6 destinations with IPv4-compatible addresses that are located off-link.

   - Sends IPv6 packets to IPv6 destinations located off-link when IPv6 routers are present.

   - Using the default IPv6 tunnel, sends IPv6 packets encapsulated in IPv4 to IPv6 destinations with IPv6-only addresses when no IPv6 routers are present.

The algorithm is as follows:

1) If the address of the end node is an IPv4 address then:

   1.1) If the destination is located on the attached link, then send an IPv4 packet addressed to the end node.

   1.2) If the destination is located off-link, then:

      1.2.1) If there is an IPv4 router on link, then send an IPv4 format packet. The IPv4 destination address is the IPv4 address of the end node. The datalink address is the datalink address of the IPv4 router.

      1.2.2) Else, the destination is treated as "unreachable" because it is located off-link and there are no on-link routers.

2) If the address of the end node is an IPv4-compatible IPv6 address (i.e. bears the prefix 0:0:0:0:0:0), then:

   2.1) If the destination is located on the attached link, then send an IPv6 format packet (not encapsulated). The IPv6 destination address is the IPv6 address of the end node. The datalink address is the datalink address of the end node.

   2.2) If the destination is located off-link, then:

      2.2.1) If there is an IPv4 router on the attached link, then send an IPv6 packet encapsulated in IPv4. The IPv6 destination address is the address of the end node. The IPv4 destination address is the low-order 32 bits of the end node's address. The datalink address is the datalink address of the IPv4 router.

      2.2.2) Else, if there is an IPv6 router on the attached link, then send an IPv6 format packet. The IPv6 destination address is the IPv6 address of the end node. The datalink address is the datalink address of the IPv6 router.

      2.2.3) Else, the destination is treated as "unreachable" because it is located off-link and there are no on-link routers.

3) If the address of the end node is an IPv6-only address, then:

   3.1) If the destination is located on the attached link, then send an IPv6 format packet. The IPv6 destination address is the IPv6 address of the end node. The datalink address is the datalink address of the end node.

   3.2) If the destination is located off-link, then:

      3.2.1) If there is an IPv6 router on the attached link, then send an IPv6 format packet. The IPv6 destination address is the IPv6 address of the end node. The datalink address is the datalink address of the IPv6 router.

      3.2.2) Else, if the destination is reachable via a configured tunnel, and there is an IPv4 router on the attached link, then send an IPv6 packet encapsulated in IPv4. The IPv6 destination address is the address of the end node. The IPv4 destination address is the configured IPv4 address of the tunnel endpoint. The datalink address is the datalink address of the IPv4 router.

      3.2.3) Else, the destination is treated as "unreachable" because it is located off-link and there are no on-link IPv6 routers.

A summary of these sending rules is given in the table below:

End         | End     | IPv4    | IPv6    | Packet |      |      |
Node        | Node    | Router  | Router  | Format | IPv6 | IPv4 | DLink
Address     | On      | On      | On      | To     | Dest | Dest | Dest
Type        | Link?   | Link?   | Link?   | Send   | Addr | Addr | Addr
------------+---------+---------+---------+--------+------+------+------
IPv4        | Yes     | N/A     | N/A     | IPv4   | N/A  | E4   | EL
------------+---------+---------+---------+--------+------+------+------
IPv4        | No      | Yes     | N/A     | IPv4   | N/A  | E4   | RL
------------+---------+---------+---------+--------+------+------+------
IPv4        | No      | No      | N/A     | UNRCH  | N/A  | N/A  | N/A
------------+---------+---------+---------+--------+------+------+------
IPv4-compat | Yes     | N/A     | N/A     | IPv6   | E6   | N/A  | EL
------------+---------+---------+---------+--------+------+------+------
IPv4-compat | No      | Yes     | N/A     | IPv6/4 | E6   | E4   | RL
------------+---------+---------+---------+--------+------+------+------
IPv4-compat | No      | No      | Yes     | IPv6   | E6   | N/A  | RL
------------+---------+---------+---------+--------+------+------+------
IPv4-compat | No      | No      | No      | UNRCH  | N/A  | N/A  | N/A
------------+---------+---------+---------+--------+------+------+------
IPv6-only   | Yes     | N/A     | N/A     | IPv6   | E6   | N/A  | EL
------------+---------+---------+---------+--------+------+------+------
IPv6-only   | No      | N/A     | Yes     | IPv6   | E6   | N/A  | RL
------------+---------+---------+---------+--------+------+------+------
IPv6-only   | No      | Yes     | No      | IPv6/4 | E6   | T4   | RL
------------+---------+---------+---------+--------+------+------+------
IPv6-only   | No      | No      | No      | UNRCH  | N/A  | N/A  | N/A
------------+---------+---------+---------+--------+------+------+------

Key to Abbreviations
--------------------
N/A:    Not applicable or does not matter.
E6:     IPv6 address of end node.
E4:     IPv4 address of end node (low-order 32 bits of IPv4-compatible address).
EL:     Datalink address of end node.
T4:     IPv4 address of the tunnel endpoint.
R6:     IPv6 address of router.
R4:     IPv4 address of router.
RL:     Datalink address of router.
IPv4:   IPv4 packet format.
IPv6:   IPv6 packet format.
IPv6/4: IPv6 encapsulated in IPv4 packet format.
UNRCH:  Destination is unreachable. Don't send a packet.

4.4.1. On/Off Link Determination

Part of the process of determining what packet format to use includes determining whether a destination is located on an attached link or not. IPv4 and IPv6 employ different mechanisms. IPv4 uses an algorithm in which the destination address and the interface address are both logically ANDed with the netmask of the interface and then compared. If the resulting two values match, then the destination is located on-link. This algorithm is discussed in more detail in Section 3.3.1.1 of the host requirements specification (RFC 1122). IPv6 uses the neighbor discovery algorithm described in "IPv6 Neighbor Discovery -- Processing".

IPv6/IPv4 nodes need to use both methods:

   - If a destination is an IPv4 address, then the on/off link determination is made by comparison with the netmask, as described in RFC 1122 section 3.3.1.1.

   - If a destination is represented by an IPv4-compatible IPv6 address (prefix 0:0:0:0:0:0), the decision is made using the IPv4 netmask comparison algorithm applied to the low-order 32 bits (IPv4 address part) of the destination address.
   - If the destination is represented by an IPv6-only address (prefix other than 0:0:0:0:0:0), the on/off link determination is made using the IPv6 neighbor discovery mechanism.

5. Acknowledgements

We would like to thank the members of the IPng working group and the IPng transition working group for their many contributions and extensive review of this document. Special thanks to Jim Bound, Ross Callon, and Bob Hinden for many helpful suggestions and to John Moy for suggesting the IPv4 "anycast address" default tunnel technique.

6. Authors' Address

Robert E. Gilligan
Sun Microsystems, Inc.
2550 Garcia Ave.
Mailstop UMTV 05-44
Mountain View, California 94043
415-336-1012 (voice)
415-336-6015 (fax)
Bob.Gilligan@Eng.Sun.COM

Erik Nordmark
Sun Microsystems, Inc.
2550 Garcia Ave.
Mailstop UMTV 05-44
Mountain View, California 94043
415-336-2788 (voice)
415-336-6015 (fax)
Erik.Nordmark@Eng.Sun.COM

7. References

W. Croft, J. Gilmore. "Bootstrap Protocol". RFC 951. September 1985.

R. Droms. "Dynamic Host Configuration Protocol". RFC 1541. October 1993.

J. Bound, Y. Rekhter, S. Thompson. "Dynamic Host Configuration Protocol for IPv6". Internet Draft <draft-ietf-dhc-dhcpv6-00.txt>. February 1995.

S. Deering, R. Hinden. "Internet Protocol, Version 6 (IPv6) Specification". Internet Draft <draft-ietf-ipngwg-ipv6-spec-01.txt>. March 1995.

S. Thompson. "IPv6 Stateless Address Autoconfiguration". Internet Draft <draft-ietf-addrconf-ipv6-auto-01.txt>. March 1995.

S. Thompson, C. Huitema. "DNS Extensions to support IP version 6". Internet Draft <draft-ietf-ipngwg-dns-00.txt>. March 1995.

W. A. Simpson. "IPv6 Neighbor Discovery -- Processing". Internet Draft <draft-simpson-ipv6-discov-process-02.txt>. February 1995.

J. Mogul, S. Deering. "Path MTU Discovery". RFC 1191. November 1990.

R. Finlayson, T. Mann, J. Mogul, M. Theimer. "Reverse Address Resolution Protocol". RFC 903. June 1984.

R. Braden. "Requirements for Internet Hosts - Application And Support". RFC 1123. October 1989.

R. Braden. "Requirements for Internet Hosts - Communication Layers". RFC 1122. October 1989.

A. Conta, S. Deering. "ICMP for the Internet Protocol Version 6 (IPv6)". Internet Draft <draft-ietf-ipngwg-icmp-01.txt>. February 1995.

C. Kent and J. Mogul. "Fragmentation Considered Harmful". In Proc. SIGCOMM '87 Workshop on Frontiers in Computer Communications Technology. August 1987.