draft-ietf-ipngwg-esd-analysis-00.txt   draft-ietf-ipngwg-esd-analysis-01.txt 
INTERNET-DRAFT Matt Crawford INTERNET-DRAFT Matt Crawford
Fermilab Fermilab
<draft-ietf-ipngwg-esd-analysis-00.txt> Allison Mankin <draft-ietf-ipngwg-esd-analysis-01.txt> Allison Mankin
ISI ISI
Thomas Narten Thomas Narten
IBM IBM
John W. Stewart, III John W. Stewart, III
ISI ISI
Lixia Zhang Lixia Zhang
UCLA UCLA
IPng Analysis of the GSE Proposal July 30, 1997
<draft-ietf-ipngwg-esd-analysis-00.txt> Separating Identifiers and Locators in Addresses:
An Analysis of the GSE Proposal for IPv6
<draft-ietf-ipngwg-esd-analysis-01.txt>
Status of this Memo Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts. working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.'' material or to cite them other than as "work in progress."
To learn the current status of any Internet-Draft, please check the To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ds.internic.net (US East Coast), nic.nordu.net Directories on ds.internic.net (US East Coast), nic.nordu.net
(Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific
Rim). Rim).
Distribution of this memo is unlimited. Distribution of this memo is unlimited.
This Internet Draft expires September, 1997. This Internet Draft expires January 30, 1998.
Abstract Abstract
On February 27-28 1997, the IPng Working Group held an interim On February 27-28, 1997, the IPng Working Group held an interim
meeting in Palo Alto, California to consider adopting Mike O'Dell's meeting in Palo Alto, California to consider adopting Mike O'Dell's
``GSE - An Alternate Addressing Architecture for IPv6'' proposal 'GSE - An Alternate Addressing Architecture for IPv6' proposal [GSE].
[GSE]. In GSE, 16-byte IPv6 addresses are split into three portions: In GSE, 16-byte IPv6 addresses are split into three portions: a
a globally unique End System Designator (ESD), a Site Topology globally unique End System Designator (ESD), a Site Topology
Partition (STP) and a Routing Goop (RG) portion. The STP corresponds Partition (STP) and a Routing Goop (RG) portion. The STP corresponds
(roughly) to a site's subnet portion of an IPv4 address, whereas the (roughly) to a site's subnet portion of an IPv4 address, whereas the
RG identifies the attachment point to the public Internet. Routers RG identifies the attachment point to the public Internet. Routers
use the RG+STP portions of addresses to route packets to the link to use the RG+STP portions of addresses (called 'Routing Stuff' in this
which the destination is directly attached; the ESD is used to document) to route packets to the link to which the destination is
deliver the packet across the last hop link. An important idea in GSE directly attached; the ESD is used to deliver the packet across the
is that nodes within a Site would not need to know the RG portion of last hop link. An important idea in GSE is that nodes within a site
their addresses. Border routers residing between a Site and its do not know the RG portion of their addresses. A border router at the
Internet connect point would dynamically replace the RG part of site's Internet connect point would dynamically replace the RG part
source addresses of all outgoing IP datagrams, and the RG part of of source addresses of all outgoing IP datagrams and the RG part of
destination addresses on incoming traffic. destination addresses on incoming traffic.
This document provides a detailed analysis of the GSE plan. Much of This document provides a detailed analysis of the GSE plan. Much of
the analysis presented here is an expansion of official meeting the analysis presented here is an expansion of official meeting
minutes, though it also includes issues uncovered by the authors in minutes, though it also includes issues uncovered by the authors in
the process of fully fleshing out the analysis. In summary, the the process of fully fleshing out the analysis. In summary, the
consensus of the attendees of the PAL1 meeting was that having working group eventually decided that the full addresses of nodes
routers rewrite the Routing Goop portion of addresses should not be within a site should not be hidden from those nodes, so as a result
adopted, though other parts of the GSE plan should (e.g., having it is not necessary for routers to rewrite the Routing Goop portion
globally unique ESDs). After completing the first draft of this of addresses. However, other parts of the GSE plan were adopted
document, the authors still strongly concur with this outcome. (e.g., having 64-bit interface identifiers with an option for
specifying them as globally unique and easing the renumbering of the
high-order portion of addresses within DNS).
As a first draft, this document should not be considered to represent In addition to analyzing the GSE proposal in particular, the document
the views of the IPng Working Group. Instead, it should be viewed as also studies the general issue of separating network layer addresses
the rough consensus of the PAL1 attendees and the strong consensus of into two separate values satisfying location and identification
the five authors. It is hoped that this first draft of the document purposes, respectively.
will be the catalyst for discussions to refine the written analysis,
and especially the conclusions, so that after some number of
iterations it will represent the consensus of the working group.
Contents Contents
Status of this Memo.......................................... 1 Status of this Memo.......................................... 1
1. Introduction............................................. 4 1. Introduction............................................. 4
2. Addressing and Routing in IPv4........................... 5 2. Addressing and Routing in IPv4........................... 5
2.1. The Need for Aggregation............................ 6 2.1. The Need for Aggregation............................ 7
2.2. The Pre-CIDR Internet............................... 7 2.2. The Pre-CIDR Internet............................... 7
2.3. CIDR and Provider-Based Addressing.................. 8 2.3. CIDR and Provider-Based Addressing.................. 8
2.4. Multihoming and Aggregation......................... 11 2.4. Multi-Homing and Aggregation........................ 11
3. GSE Background........................................... 12 3. GSE Background........................................... 14
3.1. Motivation For GSE.................................. 12 3.1. Motivation For GSE.................................. 14
3.2. GSE Address Format.................................. 13 3.2. GSE Address Format.................................. 15
3.3. Routing Stuff (RG and STP).......................... 14 3.3. Routing Stuff (RG and STP).......................... 15
3.4. End-System Designator............................... 15 3.4. End-System Designator............................... 17
3.5. Address Rewriting by Border Routers................. 16 3.5. Address Rewriting by Border Routers................. 18
3.6. Renumbering and Rehoming Mid-Level ISPs............. 17 3.6. Renumbering and Rehoming Mid-Level ISPs............. 19
3.7. Support for Multihomed Sites........................ 18 3.7. Support for Multi-Homed Sites....................... 20
3.8. Explicit Non-Goals for GSE.......................... 19 3.8. Explicit Non-Goals for GSE.......................... 21
4. Analysis of GSE's Advantages and Disadvantages........... 19 4. Analysis of GSE's Advantages and Disadvantages........... 21
4.1. End System Designator............................... 19 4.1. End System Designator............................... 21
4.1.1. IP Addresses in the IPv4 Internet.............. 19 4.1.1. Uniqueness Enforcement in the IPv4 Internet.... 21
4.1.2. Overloading Addresses: Network Layer Issues.... 20 4.1.2. Overloading Addresses: Network Layer Issues.... 23
4.1.3. Overloading Addresses: Transport Layer Issues.. 22 4.1.3. Overloading Addresses: Transport Layer Issues.. 24
4.1.4. Benefits of Globally Unique ESDs............... 23 4.1.4. Potential Benefits of Globally Unique ESDs..... 25
4.1.5. ESD: Network Layer Issues...................... 24 4.1.5. ESD: Network Layer Issues...................... 26
4.1.6. ESD: Transport Layer Issues.................... 25 4.1.6. ESD: Transport Layer Issues.................... 28
4.1.7. ESD: Application Layer Issues.................. 32 4.1.7. On The Uniqueness Of ESDs...................... 34
4.1.8. When ESDs are Not Unique....................... 34 4.1.8. DNS PTR Queries................................ 35
4.1.9. DNS PTR Queries................................ 36 4.1.9. Reverse Mapping of ESDs........................ 37
4.1.10. Reverse Mapping of ESDs....................... 38 4.1.10. Reverse Mapping of Complete GSE Addresses..... 38
4.1.11. Reverse Mapping of Complete GSE Addresses..... 39 4.1.11. The ICMP "Who Are You" Message................ 39
4.1.12. The ICMP ``Who Are You'' Message.............. 40 4.2. Renumbering and Domain Name System (DNS) Issues..... 40
4.2. Renumbering and Domain Name System (DNS) Issues..... 41 4.2.1. How Frequently Can We Renumber?................ 40
4.2.1. How Frequently Can We Renumber?................ 41 4.2.2. Efficient DNS support for Site Renumbering..... 41
4.2.2. Efficient DNS support for Site Renumbering..... 42 4.2.3. Two-Faced DNS.................................. 42
4.2.3. Synthesizing AAAA Records...................... 43 4.2.4. Bootstrapping Issues........................... 43
4.2.4. Two-Faced DNS.................................. 43 4.2.5. Renumbering and Reverse DNS Lookups............ 44
4.2.5. Bootstrapping Issues........................... 44 4.3. Address Rewriting Routers........................... 44
4.2.6. DNS PTR RRs Not Needed......................... 45 4.3.1. Load Balancing................................. 45
4.2.7. Renumbering and Reverse DNS Lookups............ 45 4.3.2. End-To-End Argument: Don't Hide RG from Hosts.. 45
4.3. Address Rewriting Routers........................... 46 4.4. Multi-Homing........................................ 46
4.3.1. Load Balancing................................. 46
4.3.2. End-To-End Argument: Don't Hide RG from Hosts.. 47
4.4. Multi-homing........................................ 47
5. Recommendations.......................................... 49
6. Security Considerations.................................. 50 5. Results.................................................. 48
6. Security Considerations.................................. 49
7. Acknowledgments.......................................... 50 7. Acknowledgments.......................................... 49
8. References............................................... 51 8. References............................................... 49
9. Authors' Addresses....................................... 52 9. Authors' Addresses....................................... 51
1. Introduction 1. Introduction
In October of 1996, Mike O'Dell published an Internet-Draft (dubbed In October of 1996, Mike O'Dell published an Internet-Draft (dubbed
``8+8'' that proposed significant changes to the IPv6 addressing "8+8") that proposed significant changes to the IPv6 addressing
architecture. The 8+8 proposal was the topic of considerable architecture. The 8+8 proposal was the topic of considerable
discussion at the December, 1996 IETF meeting in San Jose. Because discussion at the December 1996 IETF meeting in San Jose. Because the
the proposal offered both potential benefits (e.g., enhanced routing proposal offered both potential benefits (e.g., enhanced routing
scalability) and risks (e.g., changes to the basic IPv6 scalability) and risks (e.g., changes to the basic IPv6
architecture), the IPng Working Group held an interim meeting on architecture), the IPng Working Group held an interim meeting on
February 27-28, 1997 to consider adopting the 8+8 proposal. The February 27-28, 1997 to consider adopting the 8+8 proposal. The
meeting, which over 45 people attended, was held at Sun meeting, which over 45 people attended, was held at Sun
Microsystems' PAL1 facility in Palo Alto, CA. Microsystems' PAL1 facility in Palo Alto, CA.
Shortly before the interim meeting, an updated version of the Shortly before the interim meeting, an updated version of the
Internet-Draft was produced, in which the name of the proposal was Internet-Draft was produced, in which the name of the proposal was
changed from ``8+8'' to ``GSE,'' for the three separate components of changed from "8+8" to "GSE," to identify the three separate
the address: Global, Site and End-System Designator. This last components of the address: Global, Site and End-System Designator.
version of the GSE proposal was published as an Informational RFC This last version of the GSE proposal was published as an
[GSE] for historical purposes. Informational RFC [GSE] for historical purposes.
The stated purpose of the meeting was to evaluate the GSE proposal
and make a firm decision to either:
1) Definitely adopt GSE for IPv6,
2) Adopt GSE contingent upon certain other documents being
successfully completed by the April, 1997 IETF, or
3) Definitely don't adopt it. The purpose of the meeting was to evaluate the GSE proposal and
decide whether to adopt it in whole or in part or to reject it.
The well-attended meeting generated high caliber, focused technical The well-attended meeting generated high caliber, focused technical
discussions on the issues involved, with participation by almost all discussions on the issues involved, with participation by almost all
of the attendees. By the middle of the second day there was unanimous of the attendees. By the middle of the second day there was unanimous
agreement by the attendees that the GSE proposal as written presented agreement by the attendees that the GSE proposal as written presented
too many risks and should not be adopted as the basis for IPv6. too many risks and should not be adopted as the basis for IPv6.
However, the attendees also concluded that some of the issues However, the attendees also concluded that some of the issues
discussed in the GSE proposal were equally applicable to the current discussed in the GSE proposal were equally applicable to the current
IPv6 provider-based addressing plan and had enough benefit to warrant IPv6 provider-based addressing plan and had enough benefit to warrant
making changes to some existing IPv6 documents. These changes further consideration apart from the GSE address format. These
include: changes include:
1) Making changes to the IPv6 provider-based addressing document, 1) Making changes to the IPv6 provider-based addressing document to
to facilitate increased aggregation. facilitate increased aggregation.
2) Creating hard boundaries in IPv6 addresses to clearly 2) Creating hard boundaries in IPv6 addresses to clearly
distinguish between the portions used to identify hosts, for distinguish between the portions used for identifying hosts and
routing within a Site, and for routing within the Public for routing.
Internet.
3) Designating the low-order 8 bytes of IPv6 addresses to be a 3) Having an option to indicate that the low-order 8 bytes of an
globally unique End System Designator (ESD). This change has IPv6 address is a globally unique End System Designator (ESD).
potential benefits to future transport protocols (e.g., TCPng). This change has potential benefits to future transport protocols
(e.g., TCPng).
4) Make a clear distinction between the ``locator'' part of an 4) Making a clear distinction between the "locator" part of an
address and the ``identifier'' part of the address. The former address and the "identifier" part of the address. The former is
is used to route a packet to its end point, the latter is used used to route a packet to its end-point, the latter is used to
to identify an end point, independent of the path used to identify an end-point, independent of the path used to deliver
deliver the packet. Although this is a potentially revolutionary the packet.
change to IPv6 addressing model, existing transport protocols
such as TCP and UDP will not take advantage of the split. Future
transport protocols (e.g., TCPng), however, may.
5) Making changes to the way AAAA records are stored within the 5) Making changes to the way AAAA records are stored within the
DNS, so that renumbering a Site (e.g., when a Site changes ISPs) DNS, so that renumbering a site (e.g., when a site changes ISPs)
requires few changes to the DNS database in order to effectively requires few changes to the DNS database in order to effectively
change all of a Site's address AAAA RRs. change all of a site's address AAAA RRs.
The remainder of this document attempts to capture the debate and While this document does contain an analysis of the specific
discussion that led to the above changes. mechanisms of the GSE proposal, much of the document's analysis applies
to any proposal in which the identifying and locating properties of
an address (which are combined in IPv4) are split apart into
separable pieces.
2. Addressing and Routing in IPv4 2. Addressing and Routing in IPv4
Before dealing with details of GSE, we present some background about Before dealing with details of GSE, we present some background about
how routing and addressing work in ``classical IP'' (i.e., IPv4). We how routing and addressing work in "classical IP" (i.e., IPv4). We
present this background because the GSE proposal makes a fairly present this background because the GSE proposal makes a fairly
major change to the base model. In order to properly evaluate the major change to the base model. In order to properly evaluate the
benefits of GSE, one must understand what problems in IPv4 it alleges benefits of GSE, one must understand what problems in IPv4 it alleges
to improve or fix. to improve or fix.
The structure and semantics of a network layer protocol's addresses The structure and semantics of a network layer protocol's addresses
are absolutely core to that protocol. Addressing substantially are absolutely core to that protocol. Addressing substantially
impacts the way packets are routed, the ability of a protocol to impacts the way packets are routed, the ability of a protocol to
scale and the kinds of functionality higher layer protocols can scale and the kinds of functionality higher layer protocols can
provide. Indeed, addressing is intertwined with both routing and provide. Indeed, addressing is intertwined with both routing and
transport layer issues; a change in any one of these can impact transport layer issues; a change in any one of these can impact
another. Issues of administration and operation (e.g., address another. Issues of administration and operation (e.g., address
allocation and required renumbering), while not part of the pure allocation and required renumbering), while not part of the pure
exercise of engineering a network layer protocol, turn out to be exercise of engineering a network layer protocol, turn out to be
critical to the viability of that protocol in a global and commercial critical to the scalability of that protocol in a global and
network. The interaction between addressing, routing and especially commercial network. The interaction between addressing, routing and
aggregation, is particularly relevant to this document, so some time especially aggregation is particularly relevant to this document, so
will be spent describing it. some time will be spent describing it.
Addresses in IPv4 serve two purposes: Addresses in IPv4 serve two purposes:
1) Unique identification of an interface. That is, the IP address 1) Unique identification of an interface. An IP address by itself
by itself identifies which interface a packet should be identifies which interface a packet should be delivered to.
delivered to.
2) Location information of that interface. Routers extract location 2) Location information of that interface. Routers extract location
information from packets in order to route them towards their information from a packet's destination address in order to
ultimate destination. That is, addresses identify ``where'' the route it towards its ultimate destination. That is, addresses
intended recipient is located within the Internet topology. identify "where" the intended recipient is located within the
Internet topology.
For scalability, the location information contained in addresses
must be aggregatable. In practice, this means nodes
topologically close to each other (e.g., connected to the same
link, residing at the same site, or customers of the same ISP)
must use addresses that share a common prefix.
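
As a rough illustration of the common-prefix requirement (a sketch that is not part of either draft; the function name and the sample addresses are hypothetical), the number of leading bits shared by a set of IPv4 addresses can be computed as follows:

```python
import ipaddress

def common_prefix_len(addrs):
    """Return how many leading bits all IPv4 addresses in addrs share."""
    values = [int(ipaddress.IPv4Address(a)) for a in addrs]
    diff = 0
    for v in values[1:]:
        # Any bit that differs from the first address is set in diff.
        diff |= values[0] ^ v
    # Bits above the highest differing bit are common to every address.
    return 32 - diff.bit_length()

# Hypothetical hosts at one site versus hosts at unrelated sites.
same_site = ["128.128.1.10", "128.128.7.20", "128.128.200.3"]
unrelated = ["128.128.1.10", "18.0.0.5", "192.2.2.9"]

print(common_prefix_len(same_site))   # 16
print(common_prefix_len(unrelated))   # 0
```

Under these assumptions, the hosts at one site can be covered by a single 16-bit prefix, while the unrelated addresses share no leading bits and so cannot be summarized at all.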
What is important to note is that these identification and location What is important to note is that these identification and location
requirements have been met through the use of the same value, namely requirements have been met through the use of the same value, namely
the IP address. As will be noted repeatedly in this document, the the IP address. As will be noted repeatedly in this document, the
``over-loading'' of IPv4 addresses with multiple semantics has some "over-loading" of IPv4 addresses with multiple semantics has some
undesirable implications. For example, the embedding of IPv4 undesirable implications. For example, the embedding of IPv4
addresses within transport protocol addresses that identify the end addresses within transport protocol addresses that identify the end-
point of a connection involves those transport protocols with point of a connection couples those transport protocols with routing.
routing. This entanglement is inconsistent with a strictly layered This entanglement is inconsistent with a strictly layered model in
model in which routing would be a completely independent function of which routing would be a completely independent function of the
the network layer and not directly impact the transport layer. network layer and not directly impact the transport layer.
In addition to architectural uncleanliness, combining the locator and Combining locator and identifier functions also has the practical
identifier has the practical impact of complicating the support for impact of complicating the support for mobility. In a mobile
mobility. In a mobile environment, the location of an end-station may environment, the location of an end-station may change even though
change even though its identity stays the same; transport connections its identity stays the same; ideally, transport connections should be
should be able to survive such changes. In IPv4, however, one cannot able to survive such changes. In IPv4, however, one cannot change the
change the locator without also changing the identifier. locator without also changing the identifier. Consequently,
Consequently, conventional wisdom for some time has been that having conventional wisdom for some time has been that having separate
separate values for location and identification could be of values for location and identification could be of significant
significant benefit. The GSE proposal attempts to make such a benefit. The GSE proposal attempts to make such a separation.
separation.
This document frequently uses mobility as an example to demonstrate
the pros and cons of separating the identifier from the locator.
However, the reader should note the fundamental equivalence between
the problems faced by mobile hosts and the problem faced by sites
that change providers yet don't want to be required to renumber their
network. When a site changes providers, it moves (topologically) in
much the same way a mobile node does when it moves from one place to
another. Consequently, techniques that help (or hinder) mobility are
often relevant to the issue of site renumbering.
2.1. The Need for Aggregation 2.1. The Need for Aggregation
IPv4 has seen a number of different addressing schemes. Since the IPv4 has seen a number of different addressing schemes. Since the
original specification, the two major additions have been subnetting original specification, the two major additions have been subnetting
and classless routing. The purpose of adding subnetting was to allow and classless routing. The motivation for adding subnetting was to
a collection of tens or hundreds of networks located at one site to allow a collection of networks located at one site to be viewed from
be viewed from afar as being just one IP network (i.e., to aggregate afar as being just one IP network (i.e., to aggregate all of the
all of the individual networks into one bigger network). A practical individual networks into one bigger network). The practical benefit
benefit of subnetting was that all of a site's hosts, even if of subnetting was that all of a site's hosts, even if scattered among
scattered among tens or hundreds of LANs, could be reached via a tens or hundreds of LANs, could be represented via a single routing
single routing table entry in routers located far from the site. In table entry in routers located far from the site. In contrast, prior
contrast, prior to subnetting, a site with ten LANs might advertise to subnetting, a site with ten LANs would advertise ten separate
ten separate routing table entries to the routing subsystem of the network entries, and all routers would have to maintain ten separate
Internet. entries, even though they contained redundant information.
The benefits of aggregation should be clear. The amount of work The benefits of aggregation should be clear. The amount of work
involved in computing forwarding tables from routing tables is involved in computing forwarding tables from routing tables is
dependent in part on the number of network routes (i.e., dependent in part on the number of network routes (i.e.,
destinations) to which best paths are computed. If each site has 10 destinations) to which best paths are computed. If each site has 10
internal networks, and each of those individual networks is internal networks, and each of those networks is individually
advertised to the global routing subsystem as individual routing advertised to the global routing subsystem, the complexity of
entries, the complexity of computing forwarding tables can easily be computing forwarding tables can easily be an order of magnitude
an order of magnitude greater than if each site advertised just a greater than if each site advertised just a single entry that covered
single route that covered all of the addresses used within the site. all of the addresses used within the site.
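
The order-of-magnitude claim can be sketched in a few lines. This is an illustration only: the site block, the /24 size of each LAN, the count of 1000 sites, and the helper name are assumptions chosen for the example, not values from the GSE proposal.

```python
import ipaddress

# Hypothetical site block, subnetted into ten internal LANs.
site = ipaddress.ip_network("128.128.0.0/16")
lans = list(site.subnets(new_prefix=24))[:10]

# Every LAN falls inside the site block, so one entry for the block
# is enough for routers located far from the site.
assert all(lan.subnet_of(site) for lan in lans)

def routes_advertised(sites, lans_per_site, aggregated):
    """Entries the global routing subsystem must carry."""
    return sites * (1 if aggregated else lans_per_site)

print(routes_advertised(1000, 10, aggregated=False))  # 10000
print(routes_advertised(1000, 10, aggregated=True))   # 1000
```

With ten LANs per site, advertising each LAN individually multiplies the number of entries by ten, which matches the order-of-magnitude difference described above.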
2.2. The Pre-CIDR Internet 2.2. The Pre-CIDR Internet
In the early days of the Internet, the Internet's topology and its In the early days of the Internet, the Internet's topology and its
addressing were treated as orthogonal. Specifically, when a site addressing were treated as orthogonal. Specifically, when a site
wanted to connect to the Internet, it approached a centralized wanted to connect to the Internet, it approached a centralized
address allocation authority to obtain an address and then approached address allocation authority to obtain an address and then approached
a provider about procuring connectivity. This procedure for address a provider about procuring connectivity. This procedure for address
allocation resulted in a system where the addresses used by customers allocation resulted in a system where the addresses used by customers
of a certain provider bore little relation to the addresses used by of the same provider bore little relation to the addresses used by
other customers of that provider. In other words, though the topology other customers of that provider. In other words, though the topology
of the Internet was mostly hierarchical, the addressing was not (in a of the Internet was mostly hierarchical (i.e., customers connected to
global sense), and little aggregation of routes took place. An only one provider and the same path was used to reach all customers
example of such a topology and addressing scheme is shown in Figure 1. of the same provider), the addressing was not, and little aggregation
of routes took place. An example of such a topology and addressing
scheme is shown in Figure 1.
+----------------+ +----------------+
| |------- Customer1 (192.2.2.0) | |------- Customer1 (192.2.2.0)
| |------- Customer2 (128.128.0.0) | |------- Customer2 (128.128.0.0)
| Provider A |------- Customer3 (18.0.0.0) | Provider A |------- Customer3 (18.0.0.0)
| |------- Customer4 (193.3.3.0) | |------- Customer4 (193.3.3.0)
| |------- Customer5 (194.4.4.0) | |------- Customer5 (194.4.4.0)
+----------------+ +----------------+
| |
| |
skipping to change at page 8, line 32 skipping to change at page 8, line 32
Figure 1 shows Provider A having 5 customers, each with their own Figure 1 shows Provider A having 5 customers, each with their own
independently obtained network addresses. Providers A and B connect independently obtained network addresses. Providers A and B connect
to each other. In order for Provider B to be able to send traffic to to each other. In order for Provider B to be able to send traffic to
Customers1-5, Provider A must announce each of the 5 networks to Customers1-5, Provider A must announce each of the 5 networks to
Provider B. That is, the routers within Provider B must have explicit Provider B. That is, the routers within Provider B must have explicit
routing entries for each of Provider A's customers, 5 separate routes routing entries for each of Provider A's customers, 5 separate routes
in Figure 1. in Figure 1.
Experience has shown that this approach scales very poorly. In the Experience has shown that this approach scales very poorly. In the
Default-Free Zone (DFZ) of the Public Internet, where routers must Default-Free Zone (DFZ) of the Public Internet, where routers must
maintain routing tables for all reachable destinations, the cost of maintain routing entries for all reachable destinations, the cost of
computing forwarding tables quickly becomes unacceptable large. A computing forwarding tables quickly becomes unacceptably large. A
large part of the cost is related to the seemingly redundant large part of the cost is related to the seemingly redundant
computations that must be made for each individual network, even computations that must be made for each individual network, even
though the reality is that many reside at the same end site. though the reality is that many reside in the same topological
location (e.g., the same provider). Looking at Figure 1, the problem
is that provider B performs 5 separate calculations to construct the
routing tables needed to reach each of A's customers.
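
A small sketch shows why Provider B cannot do better with the addresses in Figure 1. It assumes the classful prefix lengths implied by each network number (/8, /16, /24) and uses a hypothetical contiguous block for contrast; Python's standard ipaddress module stands in for a router's aggregation logic.

```python
import ipaddress

# Figure 1 customer networks, with classful prefix lengths assumed.
figure1 = [ipaddress.ip_network(n) for n in (
    "192.2.2.0/24", "128.128.0.0/16", "18.0.0.0/8",
    "193.3.3.0/24", "194.4.4.0/24")]

# collapse_addresses merges only adjacent or overlapping networks.
announced = list(ipaddress.collapse_addresses(figure1))
print(len(announced))   # 5: nothing can be merged

# For contrast, four contiguous networks (hypothetical) merge into one.
contiguous = [ipaddress.ip_network(f"10.0.{i}.0/24") for i in range(4)]
merged = list(ipaddress.collapse_addresses(contiguous))
print(merged)           # [IPv4Network('10.0.0.0/22')]
```

Because the independently obtained addresses share no common prefix, all five entries survive, whereas topologically assigned contiguous addresses collapse to a single entry.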
2.3. CIDR and Provider-Based Addressing 2.3. CIDR and Provider-Based Addressing
Classless Inter-Domain Routing (CIDR) and its associated provider- One of the reasons Classless Inter-Domain Routing (CIDR) and its
assigned address allocation policy were introduced (in part) to help associated provider-assigned address allocation policy were
reduce the size of and cost of computing forwarding tables. In CIDR, introduced was to help reduce the size of and cost of computing
sites that want to connect to the Internet approach a provider to forwarding tables. CIDR reduces the cost of computing forwarding
procure both connectivity and a network address; providers have large tables by aggressively aggregating addresses. Aggregating addresses
blocks of address space and assign pieces of them out to customers means structuring them in such a way that the location of the nodes
such that customers of the same provider have addresses with some having those addresses can be represented by a single routing entry.
number of leading bits in common. Note that CIDR started the use of In CIDR, this means that addresses share a common prefix. The common
the term ``prefix'' to refer to a Classless network. The combination prefix provides location information for all addresses sharing that
of CIDR and provider-based addressing results in the ability for a same prefix.
provider to address many hundreds of sites while introducing just
*one* network address into the DFZ global routing system, i.e., In CIDR, sites that want to connect to the Internet approach a
aggregating all of its customers addresses under one prefix. An provider to procure both connectivity and a network address;
example of such a topology and addressing scheme is shown in Figure individual providers have a large block of address space covered by
2. one prefix and assign pieces of their space to customers.
Consequently, customers of the same provider have addresses that
share the same prefix. Note that CIDR started the use of the term
"prefix" to refer to a Classless network. The combination of CIDR and
provider-based addressing results in the ability for a provider to
address many hundreds of sites while introducing just *one* network
address into the global routing system, i.e., aggregating all of its
customers' addresses under one prefix. An example of such a topology
and addressing scheme is shown in Figure 2.
+----------------+ +----------------+
| |------- Customer1 (204.1.0.0/19) | |------- Customer1 (204.1.0.0/19)
| |------- Customer2 (204.1.32.0/23) | |------- Customer2 (204.1.32.0/23)
| Provider A |------- Customer3 (204.1.34.0/24) | Provider A |------- Customer3 (204.1.34.0/24)
| |------- Customer4 (204.1.35.0/24) | |------- Customer4 (204.1.35.0/24)
| |------- Customer5 (204.1.36.0/23) | |------- Customer5 (204.1.36.0/23)
+----------------+ +----------------+
| |
| | A announces
| | 204.1/16 to B
| |
+----------------+ +----------------+
| Provider B | | Provider B |
+----------------+ +----------------+
Figure 2 Figure 2
In Figure 2, Provider A has been assigned the classless block, or In Figure 2, Provider A has been assigned the classless block, or
``aggregate,'' 204.1.0.0/16 (i.e., a network prefix with 16 bits for "aggregate," 204.1.0.0/16 (i.e., a network prefix with 16 bits for
the network part and 16 bits for local use). Provider A has 5 the network part and 16 bits for local use). Provider A has 5
customers, each of which has been assigned a (longer) prefix customers, each of which has been assigned a prefix subordinate to
subordinate to the aggregate. Note that unlike the pre-CIDR days of the aggregate. In order for Provider B to be able to reach
``classful addressing'' the amount of address space assigned to a Customers1-5, Provider A need only announce a single prefix,
customer no longer needs to limited to the hard byte-boundary of the 204.1.0.0/16, because that prefix covers all of its customers. The
``classful days'' In order for Provider B to be able to forward benefit for Provider B is that its routers need only a single routing
traffic to Customers1-5, Provider A need only announce a single table entry to reach all of Provider A's customers. Note the
prefix, 204.1.0.0/16, because that prefix covers all of its difference between the cases described in Figures 1 and 2. The
customers. The benefit for Provider B is that its routers need only a important difference in the two Figures is that the latter example
single routing table entry to reach all of Provider A's customers. uses fewer slots in the routing table to reach the same number of
Note the difference between the cases described in Figures 1 and 2. destinations.
The important difference in the two Figures is that the latter
example uses fewer routes to reach the same number of destinations.
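The saving from aggregation can be checked mechanically. The following is a minimal sketch, not part of either draft revision, that uses Python's standard ipaddress module and the prefixes from Figure 2. The helper name routes_announced is hypothetical; it assumes a provider announces its aggregate once and announces separately only those customer prefixes the aggregate does not cover.

```python
import ipaddress

# Prefixes taken from Figure 2. The helper name below is hypothetical.
AGGREGATE = ipaddress.ip_network("204.1.0.0/16")
CUSTOMERS = [ipaddress.ip_network(p) for p in (
    "204.1.0.0/19", "204.1.32.0/23", "204.1.34.0/24",
    "204.1.35.0/24", "204.1.36.0/23",
)]

def routes_announced(aggregate, customer_prefixes):
    """Return the prefixes a provider must announce to a peer.

    Customer prefixes covered by the aggregate are represented by the
    aggregate alone; any prefix outside it must be announced separately.
    """
    uncovered = [p for p in customer_prefixes if not p.subnet_of(aggregate)]
    any_covered = len(uncovered) < len(customer_prefixes)
    return ([aggregate] if any_covered else []) + uncovered

if __name__ == "__main__":
    print("without aggregation:", len(CUSTOMERS), "routes")
    print("with aggregation:", routes_announced(AGGREGATE, CUSTOMERS))
```

Under these assumptions the five customer routes of Figure 1 collapse to the single entry 204.1.0.0/16 of Figure 2.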
CIDR was a critical step for the Internet: in the early 1990s the CIDR was a critical step for the Internet: in the early 1990s the
overhead of computing and constructing forwarding tables in the DFZ size of default-free routing tables required to support the Classful
required to support the Classful Internet was almost more than the Internet was almost more than the commercially-available hardware and
commercially-available hardware and software of the day could handle. software of the day could handle. The introduction of BGP4's
The introduction of BGP4's classless routing and provider-based classless routing and provider-based address allocation policies
address allocation policies resulted in an immediate relief. Having resulted in an immediate relief. Having said that, however, there are
said that, however, there are some weaknesses of the system. First, some weaknesses of the system. First, the Internet addressing model
the internet addressing model shifted from one of ``address owning'' shifted from one of "address owning" to "address lending." In pre-
to ``address lending.'' In pre-CIDR days sites acquired addresses CIDR days sites acquired addresses from a central authority
from a central authority independent of who their network provider independent of who their network provider was, and a site could
was, and a site could assume it ``owned'' the address it was given. assume it "owned" the address it was given. Owning addresses meant
Owning addresses meant that once one had been given a set of network that once one had been given a set of network addresses, one could
addresses, one could always use them and assume that no matter where always use them and assume that no matter where a site connected to
a site connected to the Internet, the prefix for that network could the Internet, the prefix for that network could be injected into the
be injected into the public routing system, and in particular, into public routing system. Today, however, it is simply no longer
the DFZ. With CIDR, however, it is simply no longer possible for each possible for each individual site to have its own private prefix
individual site to have its own private network prefix be injected injected into the DFZ; there would simply be too many of them.
into the DFZ; there would simply be too many of them. Consequently, Consequently, if a site decides to change providers, then it needs to
if a site decides to change providers, then it needs to number itself number itself out of space given to it by the new provider and give
out of space given to it by the new provider and give its old address its old address back to the old provider. To understand this,
back to the old provider. To understand this, consider if, from consider if, from Figure 2, Customer3 changes its provider from
Figure 2, Customer3 changes its provider from Provider A to Provider Provider A to Provider C, but does not renumber. The picture would be
C, but does not renumber. The picture would be as follows: as follows:
+----------------+ +----------------+
| |------- Customer1 (204.1.0.0/19) | |---- Customer1 (204.1.0.0/19)
| |------- Customer2 (204.1.32.0/23) | |---- Customer2 (204.1.32.0/23)
| Provider A | | Provider A |
+------| |------- Customer4 (204.1.35.0/24) +---------------| |---- Customer4 (204.1.35.0/24)
| | |------- Customer5 (204.1.36.0/23) | A announces | |---- Customer5 (204.1.36.0/23)
| +----------------+ | 204.1/16 to B +----------------+
| | | |
+----------------+ | +----------------+ |
| Provider B | | | Provider B | |
+----------------+ | +----------------+ |
| | | |
| +----------------+ | C announces |
+------| Provider C |------- Customer3 (204.1.34.0/24) | 204.1.34/24 |
| to B +----------------+
+---------------| Provider C |---- Customer3 (204.1.34.0/24)
+----------------+ +----------------+
Figure 3 Figure 3
In Figure 3, each of Providers A, B and C is directly connected to In Figure 3, each of Providers A, B and C is directly connected to
the other providers. In order for Provider B to reach Customers 1, 2, each other provider. In order for Provider B to reach Customers 1, 2,
4 and 5, Provider A still only announces the 204.1.0.0/16 aggregate. 4 and 5, Provider A still only announces the 204.1.0.0/16 aggregate.
However, in order for Provider B to reach Customer 3, Provider C must However, in order for Provider B to reach Customer 3, Provider C must
also announce the prefix 204.1.34.0/24. Prefix 204.1.34.0/24 is announce the prefix 204.1.34.0/24. Prefix 204.1.34.0/24 is called a
called a ``more-specific'' of 204.1.0.0/16; another term used is that "more-specific" of 204.1.0.0/16; another term used is that Customer3
Customer3 and Provider C have ``punched a hole in'' Provider A's and Provider C have "punched a hole in" Provider A's block. The
block. The result of this is that from Provider B's view, the result of this is that from Provider B's view, the address space
address space underneath 204.1.0.0/16 is no longer cleanly aggregated underneath 204.1.0.0/16 is no longer cleanly aggregated into a single
into a single prefix and instead the aggregation has been broken prefix and instead the aggregation has been broken because the
because the addressing is inconsistent with the topology; in order to addressing is inconsistent with the topology; in order to maintain
maintain reachability to Customer3, Provider B must carry two reachability to Customer3, Provider B must carry two prefixes where
prefixes where it used to have to carry only one prefix. it used to have to carry only one.
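The effect of the more-specific on Provider B's forwarding decision can be illustrated with a small longest-prefix-match lookup. This is a sketch under stated assumptions, not taken from either draft revision: the table contents mirror Figure 3, the function name lookup is hypothetical, and a linear scan stands in for the trie structures real routers use.

```python
import ipaddress

# Provider B's view in Figure 3: the aggregate from Provider A plus the
# more-specific announced by Provider C. Names here are illustrative only.
TABLE_B = [
    (ipaddress.ip_network("204.1.0.0/16"), "Provider A"),
    (ipaddress.ip_network("204.1.34.0/24"), "Provider C"),
]

def lookup(table, destination):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    # The entry with the greatest prefix length wins.
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]

if __name__ == "__main__":
    print(lookup(TABLE_B, "204.1.34.10"))  # an address inside Customer3
    print(lookup(TABLE_B, "204.1.35.10"))  # an address inside Customer4
```

An address inside Customer3's /24 matches both entries and follows the longer one toward Provider C, while every other address under 204.1.0.0/16 still follows the aggregate toward Provider A. The cost is the second table entry.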
The example in Figure 3 explains why sites must renumber if existing The example in Figure 3 explains why sites must renumber if existing
levels of aggregation are to be maintained. While it is certainly levels of aggregation are to be maintained. While it is certainly
clear that one or two ``exceptions'' to the ideal case can be clear that one or two "exceptions" to the ideal case can be
tolerated, the reality in today's Internet is that there are tolerated, the reality in today's Internet is that there are
thousands of providers, many with thousands of individual customers. thousands of providers, many with thousands of individual customers.
Some renumbering of sites would seem essential to maintain sufficient It is generally accepted that some renumbering of sites is essential
aggregation. for maintaining sufficient aggregation.
The empirical cost of renumbering a site in order to maintain The empirical cost of renumbering a site in order to maintain
aggregation has been the subject of much discussion. The practical aggregation has been the subject of much discussion. The practical
reality, however, is that forcing all sites to renumber is difficult reality, however, is that forcing all sites to renumber is difficult
given the size and wealth of companies that now depend on the given the size and wealth of companies that now depend on the
Internet for running their business. Thus, although the technical Internet for running their business. Thus, although the technical
community came to consensus that address lending was necessary in community came to consensus that address lending was necessary in
order for the Internet to continue to operate and grow, the reality order for the Internet to continue to operate and grow, the reality
has been that some of CIDR's benefits have been lost because sites has been that some of CIDR's benefits have been lost because sites
refuse to renumber. refuse to renumber.
One unfortunate characteristic of CIDR at an architectural level is One unfortunate characteristic of CIDR at an architectural level is
that the pieces of the infrastructure which benefit from the that the pieces of the infrastructure which benefit from the
aggregation (i.e., the providers which constitute the default-free aggregation (i.e., the providers whose major headache is managing
part of the routing infrastructure) are not the pieces that incur the routing table growth in the DFZ) are not the pieces that incur the
cost to achieve the aggregation. The logical corollary of this cost (i.e., the end site). The logical corollary of this statement is
statement is that the pieces of the infrastructure which do incur that the pieces of the infrastructure which do incur cost to achieve
cost to achieve aggregation (e.g., sites which renumber when they aggregation (e.g., sites which renumber when they change providers)
change providers) don't directly see the benefit. The word don't directly see the benefit. (The word "directly" is used here
``directly'' is used here because one could claim that the continued because one could claim that the continued operation of the Internet
operation of the Internet is a benefit, though it is an indirect is a benefit, though it is an indirect benefit and requires
benefit and requires selflessness on the part of the site in order to selflessness on the part of the site in order to recognize it.)
recognize it.
2.4. Multihoming and Aggregation 2.4. Multi-Homing and Aggregation
As sites become more dependent on the Internet, they have begun to As sites become more dependent on the Internet, they have begun to
install additional connections to the Internet to improve robustness install additional connections to the Internet to improve robustness
and performance. Such sites are called ``multi-homed.'' and performance. Such sites are called "multi-homed." Unfortunately,
Unfortunately, when a site connects to the Internet at multiple when a site connects to the Internet at multiple places, the impact
places, the impact can be much like a site that switches providers on routing can be much like a site that switches providers but
but refuses to renumber. refuses to renumber.
In the pre-CIDR days, multi-homed sites were typically known by only In the pre-CIDR days, multi-homed sites were typically known by only
one network address. When that site's providers would announce the one network prefix. When that site's providers announced the site's
site's network into the global routing system, a ``shortest path'' network into the global routing system, a "shortest path" type of
type of routing would occur so that pieces of the Internet closest to routing would occur so that pieces of the Internet closest to the
the first provider would use the first provider and other pieces first provider would use the first provider while other pieces of the
would use the second provider. This allowed sites to deal with the Internet might use the second provider. This allowed sites to use the
load on their multiple connections with the routing system itself. routing system itself to load balance traffic across their multiple
connections. This type of multi-homing assumes that a site's prefix
can be propagated throughout the DFZ, an assumption that is no longer
universally true.
With CIDR, if a multi-homed site is known by a single prefix taken With CIDR, issues of addressing and aggregation complicate matters
from one of its providers, then that prefix is aggregable by the significantly. At the highest levels, there are three possible ways
provider which assigned the address but *not* aggregable by the other to deal with multi-homed sites. The first approach is for multi-
providers. homed sites to receive address space directly from a registry,
independent of its providers. The problem with this approach is
that, because the address space is obtained independent of either
provider, it is not aggregatable and therefore has a negative impact
on the scaling of global routing.
One way to prevent entropy from taking over under CIDR is to have The second approach is for a multi-homed site to receive an
multi-homed sites use address space from all of its providers. Though allocation from one of its providers and just use that single prefix.
this in itself is not so difficult, it changes the way load-sharing The site would advertise its prefix to all of the providers to which
is handled, complicating the engineering by requiring much more it connects. There are two problems with this approach. First,
foresight and by introducing complexities like DNS and its caching although the prefix is aggregatable by the provider which made the
system. So, like those sites which refuse to renumber, many multi- allocation, it is not aggregatable by the other providers. To the
homed sites today are known by a single prefix, thus reducing the other providers, the site's prefix poses the same problem as a
efficiency of the global routing system. provider-independent address would. This has a negative impact on
the scaling of global routing. Second, due to CIDR's longest-match
routing rules, it turns out that the site's prefix is not always
aggregable in practice by the provider that made the allocation.
Consider Figure 4. Provider C has two paths for reaching customer 1.
Provider A advertises 204.1/16, which includes customer 1. But
Provider C will also receive an advertisement for prefix 204.1.0/19
from Provider B, and because the prefix match through B is longer, C
will choose that path. In order for Provider C to be able to choose
between the two paths, Provider A would also have to advertise the
longer prefix for 204.1.0/19 in addition to the shorter 204.1/16. At
this point, from the routing perspective, the situation is very
similar to the general problem posed by the use of provider-
independent addresses.
To be clear, there certainly are ways with CIDR for sites to be It should be noted that the above example simplifies a very complex
multi-homed without having a negative impact on the routing issue. For example, consider the example in Figure 4 again. Provider
infrastructure, and there are some sites that do this today. However, A could choose *not* to propagate a route entry for the longer
operational experience to date has shown an unwillingness on the part 204.1.0/19 prefix, advertising only the shorter 204.1/16. In such
of most sites to do the work necessary to multi-home in a way that is cases, provider C would always select Provider B. Internally,
CIDR-friendly. Sites have more experience doing load-sharing under Provider A would continue to route traffic from its other customers
the pre-CIDR type of multi-homing than post-CIDR, so this is one to customer 1 directly. If Provider A had a large enough customer
reason for the reluctance. Another reason, however, is that in its base, effective load sharing would be achieved.
documentation, CIDR presents several options related to multi-homing,
but it does not choose one option and fully flesh out related details +------------+ +------------+
like load-sharing. While the analysis of GSE will end up showing that _____| Provider A |---| Provider C |
it actually does little to improve the support for multi-homing, it / +------------+ +------------+
should be given great credit for giving the topic significant / 204.1/16 /
attention as a distinguished service from the beginning, rather than / /
as an after-thought. Customer 1 --- / B advertises 204.1.0/19 to C
204.1.0.0/19 | /
| +------------+
----- | Provider B |
+------------+
Figure 4
The third approach is for a multi-homed site to receive an allocation
from each of its providers. This approach has advantages from the
perspective of route scaling because both allocations are
aggregatable. Unfortunately, the approach doesn't necessarily meet
the demands of the multi-homed site. A site that has a prefix from
each of its providers has a number of choices about how to use that
address space. Possibilities include:
1) The site can number a distinct set of hosts out of each of the
prefixes. Consider a configuration where a site is connected to
ISP-A and ISP-B. If the link to ISP-A goes down, then unless the
ISP-A prefix is announced to ISP-B (which breaks aggregation),
the hosts numbered out of the ISP-A prefix would be unreachable.
2) The site could assign each host multiple addresses (i.e., one
address for each ISP connection). There are two problems with
this. First, it accelerates the consumption of the address
space. Second, when the connection to ISP-A goes down,
addresses numbered out of ISP-A's space become unreachable.
Remote peers would have to have sufficient intelligence to use
the second address. For example, when initiating a connection to
a host, the DNS would return multiple candidate addresses.
Clients would need to try them all before concluding that a
destination is unreachable (something not all hosts currently
do). In addition, a site's hosts would need a significant amount
of intelligence for choosing the source addresses they use. A
host shouldn't choose a source address that is not reachable
from the Public Internet. At
present, hosts do not have such sophistication.
In summary, how best to achieve multi-homing with IPv4 in the face of
CIDR is an unsolved problem. There is a delicate balance between the
scalability of routing versus the site's requirements of robustness
and load-sharing. At this point in time, no solution has been
discovered that satisfies the competing requirements of route scaling
and robustness/performance. It is worth noting, however, that some
people are beginning to study the issue more closely and propose
novel ideas [BATES].
3. GSE Background 3. GSE Background
This section provides background information about GSE with the This section provides background information about GSE with the
intent of making this document stand-alone with respect to the GSE intent of making this document stand-alone with respect to the GSE
``specification.'' Additional details on GSE can be found in [GSE]. "specification." Additional details on GSE can be found in [GSE].
First the motivations behind GSE will be discussed, then the We begin by reviewing the motivation for GSE. Next we review the
important technical details will be described and finally some salient technical details, and we conclude by listing the explicit
explicit non-goals will be listed. non-goals of the GSE proposal.
3.1. Motivation For GSE 3.1. Motivation For GSE
The primary motivation for GSE is the fact that the chief IPv6 global The primary motivation for GSE is the fact that the chief IPv6 global
unicast address structure, provider-based, is fundamentally the same unicast address structure, provider-based [RFC 2073], is
as IPv4 with CIDR and provider-based aggregation. Many people are not fundamentally the same as IPv4 with CIDR and provider-based
satisfied with the scaling factors achieved with CIDR and provider- aggregation. Provider-based addressing requires that sites renumber
based aggregation and think that better solutions can, and in fact when they switch providers, so that sites are always aggregated
must, be found. The GSE draft asserts that IPv4 with CIDR has not within their provider's prefix. In practice, the cost of renumbering
achieved the aggressive aggregation required for the route (which can only grow as a site grows in size and becomes more
computation functions of the default-free zone of the Internet to dependent on the Internet for day-to-day business) is high enough
scale for IPv4, and that the larger addresses of IPv6 simply that an increasing number of sites refuse to renumber. This cost is
exacerbate the problem. particularly relevant in cases where end-users are asked to renumber
because an upstream provider has changed its transit provider (i.e.,
the end site is asked to renumber for reasons outside of its control
and for which it sees no direct benefit). Consequently, the GSE
draft asserts that IPv4 with CIDR has not achieved the aggressive
aggregation required for the route computation functions of the
default-free zone of the Internet to scale for IPv4, and that the
larger addresses of IPv6 simply exacerbate the problem.
More importantly, a key aspect of provider-based aggregation is its The GSE proposal does not propose to eliminate the need for
requirement that end sites be renumbered in response to topological renumbering. Indeed, it asserts that end sites will have to be
changes (e.g., when an end site switches ISPs). The GSE proposal renumbered more frequently in order to continue scaling the Internet.
asserts that acceptable aggregation can continue only if renumbering However, GSE proposes to make the cost of such a renumbering so
is forced, but the future viability of forced renumbering is unclear small that sites could be renumbered at essentially any time with
given the increasing dependence on the Internet by litigious only minor disruption to the site.
commercial organizations. This fact is particularly relevant in cases
where end-users are asked to renumber because an upstream provider
has changed its transit provider (i.e., the end site is forced to
renumber by forces outside of its control).
Finally, GSE deals significantly with Sites that have multiple Finally, GSE deals significantly with sites that have multiple
Internet connections. In some addressing schemes (e.g., CIDR), this Internet connections. In some addressing schemes (e.g., CIDR), this
``multi-homing'' can create exceptions to the aggregation and result "multi-homing" can create exceptions to the aggregation and result in
in poor scaling. That is, the public routing infrastructure needs to poor scaling. That is, the public routing infrastructure needs to
carry multiple distinct routes for the multi-homed site, one for each carry multiple distinct routes for the multi-homed site, one for each
independent path. GSE recognizes the ``special work done by the independent path. GSE recognizes the "special work done by the global
global Internet infrastructure on behalf of multi-homed sites,'' Internet infrastructure on behalf of multi-homed sites," [GSE] and
[GSE] and proposes a way for multi-homed Sites to gain some benefit proposes a way for multi-homed sites to gain some benefit without
without impacting global scaling. This includes a specific mechanism impacting global scaling. This includes a specific mechanism that
that providers could use to support multi-homed Sites, presumably at providers could use to support multi-homed sites, presumably at a
a cost that the Site would consider when deciding whether or not to cost that the Site would consider when deciding whether or not to
become multi-homed. become multi-homed.
3.2. GSE Address Format 3.2. GSE Address Format
The key departure of GSE from classical IP addressing (both v4 and The key departure of GSE from classical IP addressing (both v4 and
v6) is that rather than over-loading addresses with both locator and v6) is that rather than over-loading addresses with both locator and
identifier purposes, it splits the address into two elements: the identifier purposes, it splits the address into two elements: the
high-order 8 bytes for routing (called ``Routing Stuff'' throughout high-order 8 bytes for routing (called "Routing Stuff" throughout the
the rest of this document) and the low-order eight bytes for unique rest of this document) and the low-order 8 bytes for unique
identification of an end point. The structure of GSE addresses is: identification of an end-point. The structure of GSE addresses is:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| Routing Goop | STP| End System Designator | | Routing Goop | STP| End System Designator |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
6+ bytes ~2 bytes 8 bytes 6+ bytes ~2 bytes 8 bytes
Figure 4 Figure 5
3.3. Routing Stuff (RG and STP) 3.3. Routing Stuff (RG and STP)
The Routing Goop (RG) describes the place in the Internet topology The Routing Goop (RG) identifies the place in the Public Internet
where a Site connects, so it is used to route datagrams to the Site. topology where a Site connects and is used to route datagrams to the
RG is structured as follows: Site. RG is structured as follows:
1 2 3 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| xxx | 13 Bits of LSID | Upper 16 bits of Goop | | xxx | 13 Bits of LSID | Upper 16 bits of Goop |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3 4 3 4
2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Bottom 18 bits of Routing Goop | | Bottom 18 bits of Routing Goop |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5 Figure 6
The RG describes the location of a Site's connection by identifying The RG describes the location of a Site's connection by identifying
smaller and smaller regions of topology until finally the single link smaller and smaller regions of topology until finally it identifies a
is identified. Before interpreting the bits in the RG, it is single link to which the site connects. Before interpreting the bits in the
important to understand that routing with GSE depends on decomposing RG, it is important to understand that routing with GSE depends on
the Internet's topology into a specific graph. At the highest level, decomposing the Internet's topology into a specific graph. At the
the topology is broken into Large Structures (LSs). An LS is highest level, the topology is broken into Large Structures (LSs). An
basically a region that can aggregate significant amounts of LS is basically a region that can aggregate significant amounts of
topology. Examples of potential LSs are large providers and exchange topology. Examples of potential LSs are large providers and exchange
points. Within an LS the topology is further divided into another points. Within an LS the topology is further divided into another
graph of structures, with each LS dividing itself however it sees graph of structures, with each LS dividing itself however it sees
fit. This division of the topology into smaller and smaller fit. This division of the topology into smaller and smaller
structures can recurse for a number of levels, where the trade-off is structures can recurse for a number of levels, where the trade-off is
``between the flat-routing complexity within a region and minimizing "between the flat-routing complexity within a region and minimizing
total depth of the substructure.'' [ESD] total depth of the substructure." [ESD]
Having described the decomposition process, we can now examine the Having described the decomposition process, we can now examine the
bits in the RG. After the 3-bit prefix identifying the address as bits in the RG. After the 3-bit prefix identifying the address as
GSE, the next 13 bits identify the LS. By limiting the field to 13 GSE, the next 13 bits identify the LS. By limiting the field to 13
bits, a ceiling is defined on the complexity of the top-most routing bits, a ceiling is defined on the complexity of the top-most routing
level. In the next 34 bits, a series of subordinate structure(s) are level. In the next 34 bits, a series of subordinate structure(s) are
identified until finally the leaf subordinate structure is identified until finally the leaf subordinate structure is
identified, at which point the remaining bits identify the individual identified, at which point the remaining bits identify the individual
link within that leaf structure. The remaining 14 bits are used for link within that leaf structure. The remaining 14 bits of the Routing
routing structure within a Site, similar to subnetting with IPv4, Stuff comprise the STP and are used for routing structure within a
though these bits are *not* part of the Routing Goop. The distinction Site, similar to subnetting with IPv4, though these bits are *not*
between Routing Stuff and Routing Goop is that RG controls routing in part of the Routing Goop. The distinction between Routing Stuff and
transit networks, while Routing Stuff includes the RG plus the Site Routing Goop is that RG controls routing in the Public Internet,
Topology Partition (STP). The STP is used for routing structure while Routing Stuff includes the RG plus the Site Topology Partition
within a Site. (STP). The STP is used for routing structure within a Site.
The GSE proposal formalizes the ideas of Sites and of public versus The GSE proposal formalizes the ideas of sites and of public versus
private topology. In the first case, a Site is a set of hosts, private topology. In the first case, a Site is a set of hosts,
routers and media which have one or more connections to the Internet. routers and media which have one or more connections to the Internet.
A Site can have an arbitrarily complicated topology, but all of that A Site can have an arbitrarily complicated topology, but all of that
complexity is hidden from everyone outside of the Site. A Site only complexity is hidden from everyone outside of the Site. A Site only
carries packets which originated from, or are destined to, that Site; carries packets which originated from, or are destined to, that Site;
in other words, a Site cannot be a transit network. A Site is private in other words, a Site cannot be a transit network. A Site is private
topology, while the transit networks form the public topology. topology, while the transit networks form the public topology.
[Editorial note: An attempt was made to capitalize ``Site'' when
assuming the GSE model and lower-case ``site'' when referring to the
less formal idea of a site in IPv4.]
A datagram is routed through public topology using just the RG, but A datagram is routed through public topology using just the RG, but
within the destination Site routing is based on the Site Topology within the destination Site routing is based on the Site Topology
Partition (STP) field. Partition (STP) field.
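The bit layout described above can be sketched in code. This is only an illustration under the field widths stated in the text (a 3-bit prefix, a 13-bit LS field and 34 bits of subordinate structure forming a 50-bit RG, then a 14-bit STP and a 64-bit ESD). The function and field names are hypothetical and do not come from the GSE proposal.

```python
# Field widths taken from the description above.
RG_BITS = 3 + 13 + 34    # Routing Goop: prefix + LS + subordinate structure
STP_BITS = 14            # Site Topology Partition
ESD_BITS = 64            # End-System Designator


def join_address(rg: int, stp: int, esd: int) -> int:
    """Pack RG, STP and ESD into a single 128-bit integer."""
    if not 0 <= rg < 1 << RG_BITS:
        raise ValueError("RG must fit in 50 bits")
    if not 0 <= stp < 1 << STP_BITS:
        raise ValueError("STP must fit in 14 bits")
    if not 0 <= esd < 1 << ESD_BITS:
        raise ValueError("ESD must fit in 64 bits")
    return (rg << (STP_BITS + ESD_BITS)) | (stp << ESD_BITS) | esd


def split_address(addr: int) -> dict:
    """Split a 128-bit address into the fields named in the text."""
    if not 0 <= addr < 1 << 128:
        raise ValueError("address must fit in 128 bits")
    esd = addr & ((1 << ESD_BITS) - 1)
    stp = (addr >> ESD_BITS) & ((1 << STP_BITS) - 1)
    rg = addr >> (STP_BITS + ESD_BITS)
    return {
        "prefix": rg >> 47,                    # top 3 bits of the RG
        "ls": (rg >> 34) & 0x1FFF,             # next 13 bits
        "substructure": rg & ((1 << 34) - 1),  # remaining 34 bits of the RG
        "stp": stp,
        "esd": esd,
    }
```

Routing in the public topology would look only at the first three fields, while routing inside the destination Site would look at "stp".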
3.4. End-System Designator 3.4. End-System Designator
The End-System Designator (ESD) is an unstructured 8-byte field that The End-System Designator (ESD) is an unstructured 8-byte field that
uniquely identifies that end-system from all others. The leading uniquely identifies that interface from all others. The most
contender for the role of a 64-bit globally unique ESD is the important feature of the ESD is that it alone identifies an end
recently defined ``EUI-64'' identifier [EUI64]. These identifiers point; the Routing Stuff portion of an address, although used to help
consist of a 24-bit ``company_id'' concatenated with a 40-bit deliver a packet to its destination, is not used to actually identify
``extension.'' (Company_id is just a new name for the an end point. End-points of communication care about the ESD; as
Organizationally Unique Identifier (OUI) that forms the first half of examples, TCP peers could be identified by the source and destination
an 802 MAC address.) Manufacturers are expected to assign locally ESDs alone (together with port numbers), checksums would exclude the
unique values to the extension field, guaranteeing global uniqueness RG (the sender doesn't know its RG, so can't include it in the
for the complete 64-bit identifier. checksum), and on receipt of a datagram only the ESD would be used in
testing whether a packet is intended for local delivery.
The leading contender for the role of a 64-bit globally unique ESD is
the recently defined "EUI-64" identifier [EUI64]. These identifiers
consist of a 24-bit "company_id" concatenated with a 40-bit
"extension." (Company_id is just a new name for the Organizationally
Unique Identifier (OUI) that forms the first half of an 802 MAC
address.) Manufacturers are expected to assign locally unique values
to the extension field, guaranteeing global uniqueness for the
complete 64-bit identifier.
A range of the EUI-64 space is reserved to cover pre-existing 48-bit A range of the EUI-64 space is reserved to cover pre-existing 48-bit
MAC addresses, and a defined mapping insures that an ESD derived from MAC addresses, and a defined mapping insures that an ESD derived from
a MAC address will not duplicate the ESD of a device that has a a MAC address will not duplicate the ESD of a device that has a
built-in EUI-64. The mapping of MAC addresses into EUI-64 built-in EUI-64.
identifiers is as follows: a 48-bit MAC address xx-xx-xx-yy-yy-yy is
mapped into the 64-bit EUI-64 identifier xx-xx-xx-FF-FE-yy-yy-yy.
The existence of the reserved range and defined mapping of 48-bit MAC
addresses makes the EUI-64 a seemingly ideal ESD candidate. ESDs
derived from existing IEEE MAC addresses should be compatible with
future network media that use EUI-64 as station identifiers (e.g.,
FireWire, Futurebus+, SCI).
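As a small illustration, the mapping of a 48-bit MAC address into an EUI-64 quoted above (xx-xx-xx-yy-yy-yy becomes xx-xx-xx-FF-FE-yy-yy-yy) could be written as follows. This is a sketch of that one stated rule only; the function name is hypothetical, and no bits of the original octets are altered.

```python
import re

_MAC_RE = re.compile(r"[0-9A-Fa-f]{2}(?:[-:][0-9A-Fa-f]{2}){5}")


def mac_to_eui64(mac: str) -> str:
    """Insert FF-FE between the company_id and the low three octets."""
    if not _MAC_RE.fullmatch(mac):
        raise ValueError("expected six hex octets separated by '-' or ':'")
    octets = re.split(r"[-:]", mac.upper())
    return "-".join(octets[:3] + ["FF", "FE"] + octets[3:])
```

For example, a MAC address under the IANA company_id mentioned in this section (00005E hex) keeps that company_id as the first three octets of the result.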
In some cases, interfaces may not have access to an appropriate MAC In some cases, interfaces may not have access to an appropriate MAC
address or EUI-64 identifier. A globally unique ESD must then be address or EUI-64 identifier. A globally unique ESD must then be
obtained through some alternate mechanism. Any organization obtained through some alternate mechanism. Several possible
possessing a valid company_id could sell identifiers out of its mechanisms can be imagined (e.g., the IANA could hand out addresses
allocation, though there may be a requirement that EUI-64's be sold from the company_id it has been assigned), but we do not
only in the form of an electronically-readable part. The IANA, to explore them in detail here.
choose one example, could generate identifiers using its company_id
(00005E hex).
Another scheme for an IETF-specific ESD space would be to use one or 3.5. Address Rewriting by Border Routers
both of the two lowest-order bits of the first octet as a flag. In a
company_id, those two bits are reserved for use as the Global/Local
and Individual/Group bits, so if either is set, lack of conflict with
an EUI-64 would seem to be assured.
The most important feature of the ESD is its global uniqueness. End- GSE Site border routers rewrite addresses of the packets they forward
points of communication would only care about the ESD; as examples, across the Site/Public Topology boundary. Within a Site, nodes need
TCP peers could be identified by the ESD alone, checksums would not know the RG associated with their addresses. They simply use a
exclude the RG (the sender doesn't know its RG, so can't include it designated "Site-Local RG" value for internal addresses. When a
in the checksum), and on receipt of a datagram only the ESD would be packet is forwarded to the Public Topology, the border router
used in testing whether a packet is intended for local delivery. replaces the Site-Local RG portion of the packet's source address with an
appropriate value. Likewise, when a packet from the Public Topology
is forwarded into a Site, the border router replaces the RG part of
the destination address with the designated Site-Local RG.
3.5. Address Rewriting by Border Routers To simplify discussion, the following discussion uses the singular
term RG as if a site could have only one RG value (i.e., one
connection to the Public Internet). Of course, a site could have
multiple Internet connections and consequently multiple RGs.
Another fundamental aspect of GSE is that Site border routers rewrite Having border routers rewrite addresses obviates the need to renumber
addresses of the packets they forward across the Site/Internet devices within sites because of changing providers --- GSE's approach
boundary. Within a Site, nodes need not know the RG associated with isn't so much to ease renumbering as to make it transparent to end
their addresses. They simply use a designated ``Site-Local RG'' value sites. To achieve transparency, the RG by which a Site is known is
for internal addresses. When a packet is forwarded to the Public hidden (i.e., kept secret) from hosts or routers within that Site.
Internet, the border router inserts the appropriate RG into the Instead, the RG for the Site would be known only by the exit router,
packet's source address. Likewise, when a packet from the Public either through static configuration or through a dynamic protocol
Internet is forwarded into a Site, the border router replaces the RG with an upstream provider.
part of the destination address with the designated Site-Local RG.
Having border routers rewrite addresses obviates the need for end Because end-hosts don't know their RG, they don't know their entire
Sites to renumber --- GSE's approach isn't so much to ease 16-byte public address, so they can't specify the full address in the
renumbering as to simply make it completely transparent to end Sites. source fields of packets they originate. Consequently, when a
To achieve transparency, the RG(s) by which a Site is known would datagram leaves a Site, the egress border router fills in the high-
*not* be known to hosts or routers within the core of that Site. order portion of the source address with the appropriate RG.
(Note: RG can be plural in the previous sentence because multi-homed
Sites are known by multiple RGs.) Instead, the RG(s) for the Site
would be known only by the exit router(s), either through static
configuration or through a dynamic protocol with the upstream
provider. Because end-hosts don't know their RG(s), they don't know
their entire 16-byte address(es), so they can't specify the full
address in the source fields of packets they originate. Consequently,
when datagram leaves a Site, the egress border router fills in the
high-order portion of the source address with the appropriate RG.
The point of keeping the RG hidden from nodes within the core of a The point of keeping the RG hidden from nodes within the core of a
Site is to ensure the changeability of this value without impacting Site is to insure the changeability of this value without impacting
the Site itself. It is expected that the RG will need to change the Site itself. It is expected that the RG will need to change
relatively frequently (e.g., several times a year) in order to relatively frequently (e.g., several times a year) in order to
support scalable aggregation as the topology of the Public Internet support scalable aggregation as the topology of the Public Internet
changes. A change to a Site's RG would only require a change at the changes. A change to a Site's RG would only require a change at the
Site's egress point (or points, in the case of a multi-homed Site); Site's egress point (or points, in the case of a multi-homed Site);
and it's quite possible that this change would be accomplished through and it's quite possible that this change would be accomplished through
a dynamic protocol with the upstream provider. a dynamic protocol with the upstream provider.
Hiding a Site's RG from its internal nodes does not, however, mean Hiding a Site's RG from its internal nodes does not, however, mean
that changes to RG have no impact on end Sites. Since the full 16- that changes to RG have no impact on end sites. Since the full 16-
byte address of a node isn't a stable value (the RG portion can byte address of a node isn't a stable value (the RG portion can
change), a stored address may contain invalid RG and be unusable if change), a stored address may contain invalid RG and be unusable if
it isn't ``refreshed'' through some other means. For intra-Site it isn't "refreshed" through some other means. For example, opening a
communication, however, it is expected that only the Site-Local RG TCP connection, writing the address of the peer to a file and then
would be used (and stored) which would always work for intra-Site later trying to reestablish a connection to that peer is likely to
communication regardless of changes to the Site's external RG. fail. For intra-Site communication, however, it is expected that
only the Site-Local RG would be used (and stored) which would
continue to work for intra-Site communication regardless of changes
to the Site's external RG. This has the benefit of shielding a site's
internal traffic from the effects of renumbering changes outside of
the site.
In addition to rewriting source addresses upon leaving a Site, In addition to rewriting source addresses upon leaving a Site,
destination addresses are rewritten upon entering a Site. To destination addresses are rewritten upon entering a Site. To
understand the motivation behind this, consider a Site with three understand the motivation behind this, consider a Site with
connections. Because each of those connections has its own RG, each connections to three Internet providers. Because each of those
destination within the Site would be known by three different 16-byte connections has its own RG, each destination within the Site would be
addresses. As a result, intra-Site routers would have to carry a known by three different 16-byte addresses. As a result, intra-Site
routing table three times larger than expected. Instead, GSE proposes routers would have to carry a routing table three times larger than
replacing the RG in inbound packets with the special ``Site-local expected. Instead, GSE proposes replacing the RG in inbound packets
RG'' value to reduce intra-Site routing tables to the minimum with the special "Site-local RG" value to reduce intra-Site routing
necessary. tables to the minimum necessary.
To be clear, when a node initiates a flow to a node in another Site, In summary, when a node initiates a flow to a node in another Site,
the initiating node knows the full 16-byte address for the the initiating node knows the full 16-byte address for the
destination through some mechanism like a DNS query. So this destination through some mechanism like a DNS query. The initiating
initiating node places the full 16-byte address in the destination node places the full 16-byte address in the destination address field
address field of the datagram, and that field stays in tact through of the datagram, and that field stays intact through the first Site
the first Site and through all of the Public Topology; it is only and through all of the Public Topology. When the datagram reaches
when the datagram arrives at the destination Site that the RG portion the exit border router, the router replaces the RG of the packet's
of the destination address is rewritten with the distinguished source address. When the datagram arrives at the entry router at the
``Site-Local RG'' value. When the destination host needs to send destination Site, the router replaces the RG portion of the
return traffic, that host will also know the full 16-byte address for destination address with the distinguished "Site-Local RG" value.
the destination because it appeared in the source address field of When the destination host needs to send return traffic, that host
the arriving packet. knows the full 16-byte address for the destination because it
appeared in the source address field of the arriving packet.
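The rewriting steps summarized above can be sketched as follows. This is a minimal model, not a router implementation: addresses are 128-bit integers with the RG in the top 50 bits (per the field widths given in Section 3.3), packets are plain dictionaries, and the Site-Local RG value is a hypothetical placeholder because the text does not specify one. All function names are assumptions made for illustration.

```python
RG_BITS = 50                   # 3-bit prefix + 13-bit LS + 34 bits
LOW_BITS = 14 + 64             # STP and ESD sit below the RG
LOW_MASK = (1 << LOW_BITS) - 1

# Hypothetical placeholder: the text does not give the Site-Local RG value.
SITE_LOCAL_RG = (1 << RG_BITS) - 1


def replace_rg(addr: int, rg: int) -> int:
    """Return addr with its RG field overwritten; STP and ESD are kept."""
    if not 0 <= rg < 1 << RG_BITS:
        raise ValueError("RG must fit in 50 bits")
    return (rg << LOW_BITS) | (addr & LOW_MASK)


def forward_out_of_site(packet: dict, public_rg: int) -> dict:
    """Egress border router: fill in the RG of the source address."""
    return {**packet, "src": replace_rg(packet["src"], public_rg)}


def forward_into_site(packet: dict) -> dict:
    """Ingress border router: reset the destination RG to Site-Local."""
    return {**packet, "dst": replace_rg(packet["dst"], SITE_LOCAL_RG)}
```

In this model the destination address is untouched on the way out of the first Site, and the source address is untouched on the way into the destination Site, so the receiver can reply to the full 16-byte address it finds in the source field.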
3.6. Renumbering and Rehoming Mid-Level ISPs 3.6. Renumbering and Rehoming Mid-Level ISPs
One of the most difficult-to-solve components of the renumbering One of the most difficult-to-solve components of the renumbering
problem is that of renumbering mid-level service providers. problem is that of renumbering mid-level service providers.
Specifically, if SmallISP1 changes its transit provider from BigISP1 Specifically, if SmallISP1 changes its transit provider from BigISP1
to BigISP2, then all of SmallISP1's customers would have to renumber to BigISP2 (in the CIDR model), then all of SmallISP1's customers
into address space covered by an aggregate of BigISP2 (if the overall would have to renumber into address space covered by an aggregate of
size of routing tables is to stay the same). GSE deals with this BigISP2 (if the overall size of routing tables is to stay the same).
problem by handling the RG in DNS with indirection. Specifically, a GSE deals with this problem by handling the RG in DNS with
Site's DNS server specifies the RG portion of its addresses by indirection. Specifically, a Site's DNS server specifies the RG
referencing the *name* of its immediate provider, which is a portion of its addresses by referencing the *name* of its immediate
resolvable DNS name (this obviously implies a new Resource Record provider, which is a resolvable DNS name (this obviously implies a
type). That provider may define some of the low-order bits of the RG new Resource Record type). That provider may define some of the low-
and then reference its immediate provider. This chain of reference order bits of the RG and then reference its immediate provider. This
allows mid-level service providers to change transit providers, and chain of reference allows mid-level service providers to change
the customers of that mid-level will simply ``inherit'' the change in transit providers, and the customers of that mid-level will simply
RG. "inherit" the change in RG.
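The chain of reference can be illustrated with a toy lookup table. This is a sketch under loud assumptions: the record layout, the bit strings, and the ".example" names are all hypothetical, and the actual Resource Record type is not defined here. Each entry contributes some low-order RG bits (possibly none) and names its immediate provider.

```python
def resolve_rg(name: str, records: dict) -> str:
    """Follow provider references upward and assemble the RG bit string.

    The topmost provider supplies the high-order bits; each level below
    appends the low-order bits it defines.
    """
    parts = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError(f"reference loop at {name}")
        seen.add(name)
        record = records[name]
        parts.append(record["bits"])
        name = record["provider"]
    return "".join(reversed(parts))


# Hypothetical records; the bit patterns carry no real meaning.
RECORDS = {
    "bigisp1.example": {"bits": "0100000000000001", "provider": None},
    "bigisp2.example": {"bits": "0100000000000010", "provider": None},
    "smallisp1.example": {"bits": "0011", "provider": "bigisp1.example"},
    "site.example": {"bits": "", "provider": "smallisp1.example"},
}
```

If SmallISP1 moves from BigISP1 to BigISP2, only the "provider" field of the SmallISP1 entry changes. The Site's own entry is untouched, yet resolving it yields the new RG, which is the inheritance described above.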
3.7. Support for Multihomed Sites 3.7. Support for Multi-Homed Sites
GSE defines a specific mechanism for providers to use to support GSE defines a specific mechanism for providers to use to support
multi-homed customers that gives those customers more reliability multi-homed customers that gives those customers more reliability
than singly-homed Sites, but without a negative impact on the scaling than singly-homed sites, but without a negative impact on the scaling
of global routing. This mechanism is not specific to GSE and could be of global routing. This mechanism is not specific to GSE and could be
applied to any multi-homing scenario where a site is known by applied to any multi-homing scenario where a site is known by
multiple prefixes (including provider-based addressing). Assume the multiple prefixes (including provider-based addressing). Assume the
following topology: following topology:
Provider1 Provider2 Provider1 Provider2
+------+ +------+ +------+ +------+
| | | | | | | |
| PBR1 | | PBR2 | | PBR1 | | PBR2 |
+----x-+ +-x----+ +----x-+ +-x----+
| | | |
RG1 | | RG2 RG1 | | RG2
| | | |
+--x-----------x--+ +--x-----------x--+
| SBR1 SBR2 | | SBR1 SBR2 |
| | | |
+-----------------+ +-----------------+
Site Site
Figure 6 Figure 7
PBR1 is Provider1's border router while PBR2 is Provider2's border PBR1 is Provider1's border router while PBR2 is Provider2's border
router. SBR1 is the Site's border router that connects to Provider1 router. SBR1 is the Site's border router that connects to Provider1
while SBR2 is the Site's border router that connects to Provider2. while SBR2 is the Site's border router that connects to Provider2.
Imagine, for example, that the line between Provider1 and the Site Imagine, for example, that the line between Provider1 and the Site
goes down. Any already existing flows that use a destination address goes down. Any already existing flows that use a destination address
including RG1 would stop working. In addition, any DNS queries that including RG1 would stop working. In addition, any DNS queries that
return addresses including RG1 would not be viable addresses. If PBR1 return addresses including RG1 would not be viable addresses. If PBR1
and PBR2 knew about each other, however, then in this case PBR1 could and PBR2 knew about each other, however, then in this case PBR1 could
tunnel packets destined for RG1-prefixed addresses to PBR2, thus tunnel packets destined for RG1-prefixed addresses to PBR2, thus
keeping the communication working. keeping the communication working. (Note that true tunneling, i.e.,
re-encapsulation, is necessary since routers between PBR1 and PBR2
would forward RG1 addresses towards PBR1.)
3.8. Explicit Non-Goals for GSE 3.8. Explicit Non-Goals for GSE
It is worth noting explicitly that GSE does not attempt to address It is worth noting explicitly that GSE does not attempt to address
the following issues: the following issues:
1) Survival of TCP connections through renumbering events. If a 1) Survival of TCP connections through renumbering events. If a
Site is renumbered, TCP connections using a previous address Site is renumbered, TCP connections using a previous address
will continue to work only as long as the previous address still will continue to work only as long as the previous address still
works (i.e., while it is still "valid" using RFC 1971 works (i.e., while it is still "valid" using RFC 1971
terminology). No attempt is made to have existing connections terminology). No attempt is made to have existing connections
switch to the new address. switch to the new address.
2) It is not known how mobility can be made to work under GSE. 2) It is not known how mobility can be made to work under GSE.
3) It is not known how multicast can be made to work under GSE. 3) It is not known how multicast can be made to work under GSE.
4) It is not known whether the performance cost of having routers 4) The performance impact of having routers rewrite portions of the
rewrite portions of the source and destination address in packet source and destination address in packet headers requires
headers is acceptable. further study.
That GSE doesn't address the above does not mean they cannot be That GSE doesn't address the above does not mean they cannot be
solved. Rather the issues haven't been studied in sufficient depth. solved. Rather the issues haven't been studied in sufficient depth.
4. Analysis of GSE's Advantages and Disadvantages 4. Analysis of GSE's Advantages and Disadvantages
This section contains the bulk of the GSE analysis and the analysis
of the general locator/identifier split.
4.1. End System Designator 4.1. End System Designator
4.1.1. IP Addresses in the IPv4 Internet 4.1.1. Uniqueness Enforcement in the IPv4 Internet
As described earlier, in the IPv4 Public Internet, IP addresses As described earlier, in the IPv4 Public Internet, IP addresses
contain two pieces of information: a unique identifier and a locator. contain two pieces of information: a unique identifier and a locator.
A key aspect of the embedded location information is that it must be Embedding location information within an address has the side-effect
aggregable, so that a single routing table entry can cover many of helping insure that all addresses are globally unique. If
destination addresses. In practice, this means that sites that are interfaces on two different nodes are assigned the same unicast
topologically close to each other must share a common prefix, as
exemplified in provider-based addressing [RFC 2073] and CIDR
[RFC1817]. Without sufficient aggregation, routing in the Public
Internet can not scale [RFC2008].
Note that embedding location information within an address has the
side-effect of helping ensure that all addresses are globally unique.
If interfaces on two different nodes are assigned the same unicast
address, the routing subsystem will (generally) deliver packets to address, the routing subsystem will (generally) deliver packets to
only one of those nodes. The other node will quickly realize that only one of those nodes. The other node will quickly realize that
something is wrong (since communication using the duplicate address something is wrong (since communication using the duplicate address
fails) and take corrective action (e.g., obtain a proper address). fails) and take corrective action (e.g., obtain a proper address).
This is important for two reasons. It helps detect misconfigurations
(use of the wrong address prevents communication from taking place),
and helps thwart intruders.
In IPv4, communication usually fails quickly when addresses are not
unique. There are two cases to consider, depending on whether the two
interfaces assigned duplicate addresses are attached to the same or
to different links.
When two interfaces on the same link use the same address, a node
(host or router) sending traffic to the duplicate address will in
practice send all packets to one of the nodes. On Ethernets, for
example, the sender will use ARP (or Neighbor Discovery in IPv6) to
determine the link layer address corresponding to the destination
address. When multiple ARP replies for the target IP address are
received, the most recently received response replaces whatever is
already in the cache. Consequently, the destinations a node using a
duplicate IP address can communicate with depends on what its
neighboring nodes have in their ARP caches. In most cases, such
communication failures become apparent relatively quickly, since it
is unlikely that communication can proceed correctly on both nodes.
It is also the case that a number of ARP implementations (e.g., BSD-
derived implementations) log warning messages when an ARP request is
received from a node using the same address as the machine receiving
the ARP request.
When two interfaces on different links use the same address, the
routing subsystem will generally deliver packets to only one of the
nodes because only one of the links has the right "prefix" or "subnet
part" corresponding to the IP address. Consequently, the node using
the address on the "wrong" link will generally never receive any
packets sent to it and will be unable to communicate with anyone. For
obvious reasons, this condition is usually detected quickly.
An important observation is that, with classical IP, when different
nodes mistakenly assign the same IP address to different interfaces,
problems become apparent relatively quickly because communication
with several (if not all) destinations fails. In contrast, failure
scenarios differ when globally unique ESDs are assumed, but two nodes
mistakenly select the same one.
Embedding location information within an address also provides some, Embedding location information within an address also provides some,
though not much, protection from forged addresses. Although it is though not much, protection from forged addresses. Although it is
trivial to forge a source address in today's Internet, the routing trivial to forge a source address in today's Internet, the routing
subsystem will in most cases forward any return traffic sent to that subsystem will in most cases forward any return traffic sent to that
address to its proper destination --- not to an arbitrary node address to its proper destination --- not to an arbitrary node
masquerading as someone else. To masquerade as someone else requires masquerading as someone else. To masquerade as someone else requires
subverting the routing subsystem, placing the intruder somewhere on subverting the routing subsystem, placing the intruder somewhere on
the normal routing path between the masqueraded host and its peer, the normal routing path between the masqueraded host and its peer,
etc. Allowing the Routing Stuff and ESD portions of an address to be etc.
changed independent of each other potentially increases the ease with
which packets intended for a particular ESD can be misrouted or
hijacked elsewhere. As discussed in later sections, additional checks
must be made to reduce the threat of hijacking.
4.1.2. Overloading Addresses: Network Layer Issues 4.1.2. Overloading Addresses: Network Layer Issues
Embedding location information within an address has some important At the network layer, a node compares the destination address of
consequences. At the network layer, a node compares the destination received packets against the addresses of its attached interfaces.
address of received packets against the addresses of its attached Only if the addresses of received packets match are packets handed up
interfaces. Only if the addresses of received packets match are to higher layer protocols. In IPv4, the entire address must match.
packets handed up to higher layer protocols. The entire address Otherwise, the packet is assumed to be intended for some other node
(including the Routing Stuff part) must match. Otherwise, the packet and forwarded on (if received by a router) or silently discarded (if
is assumed to be intended for some other node and forwarded on (if received by a host). This has subtle but significant implications:
received by a router) or silently discarded (if received by a host).
This has subtle but significant implications:
1) If a receiving host has multiple interfaces, it has multiple IP 1) If a receiving host has multiple interfaces, it has multiple IP
addresses. When a packet addressed to a multi-homed host is addresses. When a packet addressed to a multi-homed host is
received on an interface other than the one to which a packet is received on an interface other than the one to which a packet is
addressed, the host may reject (i.e., silently discard) the addressed, the host may reject (i.e., silently discard) the
packet, if it implements the ``Strong ES Model'' defined in packet, if it implements the "Strong ES Model" defined in
[RFC1122]. [RFC1122].
2) In IPv6 (and recent IPv4 stacks), an interface may have more 2) In recent IPv4 stacks, an interface may have more than one
than one unicast IP address assigned to it. Indeed, one way to unicast IP address assigned to it. Indeed, one way to renumber
renumber an end site is to phase out an address (i.e., an end site is to phase out an address (i.e., "deprecate" it
"deprecate" it using RFC 1971 termininology) while using RFC 1971 terminology) while simultaneously phasing in a
simultaneously phasing in a new one. Once the deprecated address new one. Once the deprecated address becomes invalid, packets
becomes invalid, packets sent to the invalid address will no sent to the invalid address will no longer be accepted by the
longer be accepted by the node, even though the packet may have node, even though the packet may have intuitively reached its
intuitively reached its intended recipient. Thus, even if a intended recipient. Thus, even if a packet sent to an invalid
packet sent to an invalid address is somehow delivered to the address is somehow delivered to the intended recipient (e.g.,
intended recipient (e.g., via tunneling), the receiver would via tunneling), the receiver would reject the packet because the
reject the packet because the address it was sent to no longer address it was sent to no longer belongs to any of the node's
belongs to any of the node's interfaces. Consequently, any interfaces. Consequently, any communication using the invalid
communication using the invalid address will fail (e.g., new and address will fail (e.g., new and existing TCP connections).
existing TCP connections). Anyone wishing to communicate with Anyone wishing to communicate with the node must learn and
the node must learn and switch to the new address. switch to the new address.
3) Because an address also indicates ``where'' the destination 3) Because an address also indicates "where" the destination
resides within the Internet, a mobile node that moves from one resides within the Internet, a mobile node that moves from one
part of the Internet to another must obtain a new address that part of the Internet to another must obtain a new address that
reflects its new location. Moreover, the routing subsystem will reflects its new location. Moreover, the routing subsystem will
continue to forward packets sent to the mobile node's previous continue to forward packets sent to the mobile node's previous
address to the node's previous point of attachment where they address to the node's previous point of attachment where they
are likely to be discarded. That is, even if a mobile node is are likely to be discarded. That is, even if a mobile node is
willing to continue accepting packets addressed to one of its willing to continue accepting packets addressed to one of its
previous addresses, it is unlikely that they will be received previous addresses, it is unlikely that they will be received
(in the absence of something like Mobile IP [RFC2002]). (in the absence of something like Mobile IP [RFC2002]).
4) A multi-homed host has multiple interfaces, each with its own 4) A multi-homed host has multiple interfaces, each with its own
address(es). If one of its interfaces fails, packets could, in address(es). If one of its interfaces fails, packets could, in
theory, be delivered to one of the host's other interfaces. In theory, be delivered to one of the host's other interfaces. In
practice, however, the routing subsystem has no way of knowing practice, however, the routing subsystem has no way of knowing
that the interface to which a packet is addressed has failed and that the interface to which a packet is addressed has failed and
what alternate interface addresses the packet could be delivered what alternate interface addresses the packet could be delivered
to. Consequently, packets sent to a multi-homed host won't be to. Consequently, packets sent to a failed interface of a
delivered to the intended recipient, even though the node is multi-homed host won't be delivered, even though the node is
reachable (through an alternate address). reachable through alternate interfaces.
Note that the above problems fall into two general categories: Note that the above problems fall into two general categories:
1) Today's routing subsystem is unable to automatically deliver a 1) Today's routing subsystem is unable to automatically deliver a
packet to a host's ``alternate'' addresses (if the host is packet to a host's "alternate" addresses (if the host is multi-
multi-homed) or a new address (if the host moves), should there homed) or a new address (if the host moves), should there be a
be a problem delivering a packet to the destination address problem delivering a packet to the destination address listed in
listed in the packet. It is possible to imagine, however, future the packet. It is possible to imagine, however, future routing
routing advances addressing this problem (e.g., Mobile IP). advances addressing this problem (e.g., Mobile IP).
2) Even if a packet is delivered to its intended destination, the 2) Even if a packet is delivered to its intended destination, the
packet may still be rejected because the packet's destination packet may still be rejected because the packet's destination
address does not match any of the addresses assigned to the address does not match any of the addresses assigned to the
destination's interfaces. This problem does not appear to be destination's interfaces. This problem does not appear to be
insurmountable and could be rectified (for example) by having a insurmountable and could be rectified (for example) by having a
host remember its previous addresses. host remember its previous addresses.
4.1.3. Overloading Addresses: Transport Layer Issues 4.1.3. Overloading Addresses: Transport Layer Issues
Although the problems discussed previously appear to have viable The problems discussed previously create particular complications at
solutions, additional complications occur at the transport level. the transport level. Transport protocols such as TCP and UDP use
Transport protocols such as TCP and UDP use embedded IP addresses to embedded IP addresses to identify the end-points of a transport
identify the end points of a transport connection. Specifically, the connection. Specifically, the communicating end-points of a transport
communicating end points of a transport connection are uniquely connection are uniquely identified by the sender's source IP address
identified by the sender's source IP address and source port number and source port number together with the recipient's destination IP
together with the recipient's destination IP address and port number. address and port number. Once a connection has been established, the
Once a connection has been established, the IP addresses can not IP addresses can not change. In particular, if a mobile host moves to
change. In particular, if a mobile host moves to a new location and a new location and obtains a new address, packets intended for a TCP
obtains a new address, packets intended for a TCP connection created connection created prior to the move cannot use the new address. TCP
prior to the move cannot use the new address. TCP will treat any will treat any packets sent to the new address as belonging to a
packets sent to the new address as belonging to a different TCP different TCP connection.
connection.
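
The demultiplexing behavior described above can be illustrated with a
minimal sketch. It is not taken from the draft: the table layout, the
names ConnKey, connections and demux, and the documentation-prefix
addresses are all hypothetical. It only shows that a lookup keyed on
the full address/port 4-tuple no longer matches once the mobile peer
sends from a new address.

```python
from typing import Dict, NamedTuple, Optional


class ConnKey(NamedTuple):
    """Full 4-tuple that identifies a TCP connection today."""
    local_addr: str
    local_port: int
    remote_addr: str
    remote_port: int


# Hypothetical connection table held by the stationary peer.
connections: Dict[ConnKey, str] = {}


def demux(local_addr: str, local_port: int,
          remote_addr: str, remote_port: int) -> Optional[str]:
    """Return the connection a segment belongs to, or None if no match."""
    key = ConnKey(local_addr, local_port, remote_addr, remote_port)
    return connections.get(key)


# Connection opened while the mobile host still used its old address.
connections[ConnKey("2001:db8:1::10", 80, "2001:db8:a::7", 40000)] = "conn-1"

# Same ports, same host, but the mobile host now has a new address.
before_move = demux("2001:db8:1::10", 80, "2001:db8:a::7", 40000)
after_move = demux("2001:db8:1::10", 80, "2001:db8:b::7", 40000)
```

Here before_move finds the existing connection, while after_move finds
none, so the segment would be treated as belonging to a different
connection.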
It is possible to imagine changes to TCP that might allow connections It is possible to imagine changes to TCP that might allow connections
to change the addresses they are using mid-connection without to change the addresses they are using mid-connection without
breaking the connection. However, some subtle issues arise: breaking the connection. However, some subtle issues arise:
1) Packets intended for a pre-existing connection must be 1) Packets intended for a pre-existing connection must be
demultiplexed to that connection as part of any negotiation to demultiplexed to that connection as part of any negotiation to
change the addresses that identify that transport end point. change the addresses that identify that transport end-point.
However, because the demultiplexing operation uses the transport However, because the demultiplexing operation uses the transport
addresses of the pre-existing TCP connection (which is based on addresses of the pre-existing TCP connection (which is based on
the previous address), TCP packets sent to a new address won't the previous address), TCP packets sent to a new address won't
be delivered to the desired transport end point (which still be delivered to the desired transport end-point (which still
uses the previous address). Consequently, packets would need to uses the previous address). Consequently, packets would need to
be sent to the previous address. However, by the time a mobile be sent to the previous address. However, by the time a mobile
node has moved and knows its new address, packets sent to the node has moved and knows its new address, packets sent to the
previous address may no longer be delivered (i.e., they may not previous address may no longer be delivered (i.e., they may not
be forwarded to the mobile host's new location). be forwarded to the mobile host's new location).
2) When a mobile host moves, it could inform its TCP peers that it 2) When a mobile host moves, it could inform its TCP peers that it
has a new address. However, such a message could not be has a new address. However, such a message could not be
delivered to the remote TCP connection if it was sent using its delivered to the remote TCP connection if it was sent using its
new address for its source address. Just as above, such packets new address for its source address. Just as above, such packets
would not be demultiplexed to the correct TCP connection. On the would not be demultiplexed to the correct TCP connection. On the
other hand, there are difficulties if it attempts to send other hand, it is infeasible to send packets using its previous
packets using its previous address from its new location. address from its new location. Because of the danger of spoofing
Because of the danger of spoofing attacks, routers are now attacks, routers are now encouraged to actively look for, and
encouraged to actively look for, and discard traffic from, a discard traffic from, a source address that does not match known
source address that does not match known addresses for that addresses for that region of the Internet [CERT]. Consequently,
region of the Internet [CERT]. Consequently, such packets cannot such packets cannot be expected to be delivered.
be expected to be delivered.
Although the previous discussion used mobile nodes as an example, the Although the previous discussion used mobile nodes as an example, the
same problem arises in other contexts. For example, if a site is same problem arises in other contexts. For example, if a site is
being renumbered in IPv6, it may have two addresses, a previous being renumbered in IPv6, it may have two addresses, a previous
(i.e., deprecated) one being phased out and a new (i.e., preferred) (i.e., deprecated) one being phased out and a new (i.e., preferred)
one being phased in. At the transport level, the problem of switching one being phased in. At the transport level, the problem of switching
addresses is similar in many respects to the mobility problem. addresses is similar in many respects to the mobility problem.
4.1.4. Benefits of Globally Unique ESDs 4.1.4. Potential Benefits of Globally Unique ESDs
An alternate approach is to break an address into two distinct
portions:
1) An End System Designator (ESD) that uniquely identifies an end
point of communication (independent of the interface through
which that was reached). Such an identifier should be globally
unique so that a node that receives a packet can definitively
determine whether the packet is intended for it by comparing
only the ESD portion of the address.
2) A ``locator'' or Routing Stuff that is used by the routing
subsystem to deliver a packet to the appropriate end system
identified by the ESD.
Having a clear separation between the Routing Stuff and the ESD Having a clear separation between the Routing Stuff and the ESD
portion of an address gives protocols some additional flexibility. At portion of an address gives protocols some additional flexibility. At
the network layer, for example, recipients can examine just the ESD the network layer, for example, recipients can examine just the ESD
portion of the destination addresses when determining whether a portion of the destination addresses when determining whether a
packet is intended for them. This means that if a packet is delivered packet is intended for them. This means that if a packet is delivered
to the correct destination node, the node will accept the packet, to the correct destination node, the node will accept the packet,
regardless of how the packet got there, i.e., without regard to the regardless of how the packet got there, i.e., without regard to the
Routing Stuff of the address, which interface it arrived on, etc. Routing Stuff of the address, which interface it arrived on, etc.
Excluding the Routing Stuff of an address when making address Such packets would then be delivered and accepted by the target host.
comparisons also makes it possible to change the Routing Stuff of an
address to reflect a mobile node's new location, or an alternate
interface on a multi-homed host. Such packets would then be delivered
and accepted by the target host.
The idea of using addresses that cleanly separate the Routing Stuff The idea of using addresses that cleanly separate the Routing Stuff
from an ESD is not new [references XXX]. However, there are several from an ESD is not new [references XXX]. However, there are several
different flavors. In its pure form, a sender would only need to know different flavors. In its pure form, a sender would only need to know
the ESD of an end point in order to send packets to it. When the ESD of an end-point in order to send packets to it. When
presented with a datagram to send, network software would be presented with a datagram to send, network software would be
responsible for finding the Routing Stuff associated with the ESD so responsible for finding the Routing Stuff associated with the ESD so
that the packet can be delivered. A key question, then, is who is that the packet can be delivered. A key question is who is
responsible for finding the Routing Stuff associated with a given responsible for finding the Routing Stuff associated with a given
ESD? There are a number of possibilities: ESD? There are a number of possibilities:
1) The network layer could be responsible for doing the mapping. 1) The network layer could be responsible for doing the mapping.
The advantage of such a system is that an ESD could be stored The advantage of such a system is that an ESD could be stored
essentially forever (e.g., in configuration files), but whenever essentially forever (e.g., in configuration files), but whenever
it is actually used, network layer software could automatically it is actually used, network layer software would automatically
perform the mapping to determine the appropriate Routing Stuff perform the mapping to determine the appropriate Routing Stuff
for the destination. Likewise, should an existing mapping become for the destination. Likewise, should an existing mapping become
invalid, network layer software could dynamically determine the invalid, network layer software could dynamically determine the
updated quantity. Unfortunately, building such a mapping updated quantity. Unfortunately, building such a mapping
mechanism that is scalable is a hard problem. mechanism that is scalable is a hard problem.
2) The transport layer could be responsible for doing the mapping. 2) The transport layer could be responsible for doing the mapping.
It could perform the mapping when a connection is first opened, It could perform the mapping when a connection is first opened,
periodically refreshing the binding for long-running periodically refreshing the binding for long-running
connections. Implementing such a scheme would change the connections. Implementing such a scheme would change the
existing transport layer protocols TCP and UDP significantly. existing transport layer protocols TCP and UDP significantly.
3) Higher-layer software (e.g., the application itself) could be 3) Higher-layer software (e.g., the application itself) could be
responsible for performing the mapping. This potentially responsible for performing the mapping. This potentially
increases the burden on application programmers significantly, increases the burden on application programmers significantly,
especially if long-running connections are required to survive especially if long-running connections are required to survive
renumbering and/or deal with mobile nodes. renumbering and/or deal with mobile nodes.
It should be noted that the GSE proposal [GSE] does not embrace the It should be noted that the GSE proposal does not embrace the general
general model. Indeed, it proposes the last. The network layer (and model. Indeed, it proposes the last. The network layer (and indeed
indeed the transport layer) is always presented both the Routing the transport layer) is always presented both the Routing Stuff (RG +
Stuff (RG + STP) and the ESD together in one IPv6 address. It is not STP) and the ESD together in one IPv6 address. It is not the network
the network (or transport) layer's job to determine the Routing Stuff (or transport) layer's job to determine the Routing Stuff given only
given only the ESD. When an application has data to send, it queries the ESD or to validate that the Routing Stuff is correct. When an
the DNS to obtain the IPv6 AAAA record for a destination. The application has data to send, it queries the DNS to obtain the IPv6
returned AAAA record contains both the Routing Stuff and the ESD of AAAA record for a destination. The returned AAAA record contains both
the specified destination. While such an approach eliminates the need the Routing Stuff and the ESD of the specified destination. While
for the lower layers to be able to map ESDs into corresponding such an approach eliminates the need for the lower layers to be able
Routing Stuff, it also means that when presented with an address to map ESDs into corresponding Routing Stuff, it also means that when
containing an incorrect (i.e., no longer valid) Routing Stuff, the presented with an address containing an incorrect (i.e., no longer
network will be unable to deliver the packet to its correct valid) Routing Stuff, the network is unable to deliver the packet to
destination. It is up to applications themselves to deal with such its correct destination. It is up to applications themselves to deal
failures. Note that addresses containing invalid Routing Stuff will with such failures. Note that addresses containing invalid Routing
result any time cached addresses are used after the Routing Stuff of Stuff will result any time cached addresses are used after the
the address becomes invalid. This may happen if addresses are stored Routing Stuff of the address becomes invalid. This may happen if
in configuration files, or with long-running communication. addresses are stored in configuration files, or with long-running
communication.
4.1.5. ESD: Network Layer Issues 4.1.5. ESD: Network Layer Issues
Along with the flexibility offered by separating the ESD from the Along with the flexibility offered by separating the ESD from the
Routing Stuff come additional issues that must be considered Routing Stuff come additional issues that must be considered
at the network layer: at the network layer:
1) If a receiver observes that recent packets are arriving with a 1) Addresses must have a locator embedded within them. It is not
feasible to route packets solely on an ESD; doing so would make
it impossible to aggregate routing information in a scalable
way. The GSE proposal assumes that the locator part of an
address is filled with an appropriate value by higher layers
(i.e., the transport or application layer).
2) If a receiver observes that recent packets are arriving with a
different Routing Stuff in the source address than before, it different Routing Stuff in the source address than before, it
may want to send return traffic using the new Routing Stuff. may want to send return traffic using the new Routing Stuff.
However, such information should not be accepted without However, such information should not be accepted without
appropriate authentication of the new Routing Stuff, otherwise appropriate authentication of the new Routing Stuff, otherwise
it would be trivial to hijack existing transport connections. it would be trivial to hijack existing transport connections.
Always using the most recently received Routing Stuff of an Always using the most recently received Routing Stuff of an
address to send return traffic without appropriate address to send return traffic without appropriate
authentication leads to a vulnerability that is equivalent in authentication leads to a vulnerability that is equivalent in
potential danger to ``reversing and using an unauthenticated potential danger to "reversing and using an unauthenticated
received source route.'' received source route."
Note also that in the GSE proposal, since a sender does not know Note also that in the GSE proposal, since a sender does not know
their own RG, it is not possible for the sender to compute an its own RG, it is not possible for the sender to compute an
Authentication Header via IPSEC that covers the RG portion of an Authentication Header via IPSec that covers the RG portion of an
address. Thus, a recipient of new RG would need to authenticate address. Thus, a recipient of new RG would need to authenticate
the received information via some alternate (undefined) the received information via some alternate (undefined)
mechanism. mechanism.
Finally, receipt of packets from different Routing Stuff than Finally, receipt of packets from different Routing Stuff than
before does not necessarily indicate a permanent change. In the before does not necessarily indicate a permanent change. In the
GSE proposal, for example, when a Site is multi-homed, some of GSE proposal, for example, when a Site is multi-homed, some of
its packets may exit via one egress router and other packets via its packets may exit via one egress router while other packets
a different egress router. Even packets originated from the same exit via a different egress router. Even packets originated from
source may exit through multiple egress routers. Consequently, a the same source may exit through multiple egress routers.
node may receive traffic from the same sender in which the Consequently, a node may receive traffic from the same sender in
Routing Stuff part changes on every packet. which the Routing Stuff part changes on every packet.
2) In general, whenever an address is embedded within a packet 3) In general, whenever an address is embedded within a packet
(including within data), one must consider whether all the bits (including within data), one must consider whether all the bits
in the address should be used in computations, or whether just in the address should be used in computations, or whether just
the ESD portion should be used. Examples where such decisions the ESD portion should be used. Examples where such decisions
would need to be made include, but are not limited to, Neighbor would need to be made include, but are not limited to, Neighbor
Discovery packets containing Neighbor Solicitations and Discovery packets containing Neighbor Solicitations and
Responses [RFC 1970], IPSEC packets being demultiplexed to their Responses [RFC 1970], IPSec packets being demultiplexed to their
appropriate Security Association, IP deciding whether to accept appropriate Security Association, IP deciding whether to accept
an IP datagram (before reaching the transport level), the an IP datagram (before reaching the transport level), the
reassembly of fragments, transport layer demultiplexing of reassembly of fragments, transport layer demultiplexing of
received packets to end points, etc. received packets to end-points, etc.
4.1.6. ESD: Transport Layer Issues 4.1.6. ESD: Transport Layer Issues
Previous sections have made clear that the embedding of full IPv6 Previous sections have made clear that the embedding of full IPv6
addresses (i.e., Routing Stuff) within transport connection endpoint addresses (i.e., Routing Stuff) within transport connection end-point
identifiers poses problems for mobility and site renumbering. This identifiers poses problems for mobility and site renumbering. This
section discusses an alternate approach, in which transport endpoint section discusses an alternate approach, in which transport end-point
identifiers use ESDs rather than full addresses (with embedded identifiers use ESDs rather than full addresses (with embedded
Routing Stuff). Routing Stuff).
In the following discussion, it should be kept in mind that the IPng In the following discussion, it should be kept in mind that the IPng
Recommendation [RFC 1752] states that a transition to IPv6 cannot Recommendation [RFC 1752] states that a transition to IPv6 cannot
also require deployment of a ``TCPng'', that is, make changes to TCP also require deployment of a "TCPng." In addition, although we focus
that decrease its overall robustness. In addition, although we focus
on TCP, UDP-based protocols also depend on the Routing Stuff in on TCP, UDP-based protocols also depend on the Routing Stuff in
similar ways, e.g., starting with the UDP checksum of the peers' similar ways, e.g., starting with the UDP checksum of the peers'
addresses. Indeed, we believe that TCP is the ``easy'' case to deal addresses. Indeed, we believe that TCP is the "easy" case to deal
with, for two reasons. First, TCP is a stateful protocol in which with, for two reasons. First, TCP is a stateful protocol in which
both ends of the connection can negotiate with each other. Some UDP- both ends of the connection can negotiate with each other. Some UDP-
based protocols are stateless, and remember nothing from one packet based protocols are stateless, and remember nothing from one packet
to the next. Consequently, changing UDP-based protocols may require to the next. Consequently, changing UDP-based protocols may require
the introduction of ``session'' features, perhaps as part of a common the introduction of "session" features, perhaps as part of a common
``library'', for use by applications whose transport protocol is "library", for use by applications whose transport protocol is
relatively stateless. Second, changes to UDP-based protocols in relatively stateless. Second, changes to UDP-based protocols in
practice mean changing individual applications themselves. practice mean changing individual applications themselves, raising
deployability questions.
4.1.6.1. Dumultiplexing Packets to Transport Endpoints 4.1.6.1. Demultiplexing Packets to Transport Endpoints
Connections in GSE are identified by the ESDs rather than full IPv6 Connections in GSE are identified by the ESDs rather than full IPv6
addresses (with embedded Routing Stuff). That is: addresses (with embedded Routing Stuff). That is:
unique TCP connection: srcaddr dstaddr srcport destport
unique IPv4 TCP connection: srcaddr dstaddr srcport destport
unique GSE TCP connection: srcESD dstESD srcport dstport unique GSE TCP connection: srcESD dstESD srcport dstport
Consequently, when demultiplexing incoming packets, TCP would ignore
the Routing Stuff portions of addresses when delivering packets to Consequently, with GSE, when demultiplexing incoming packets, TCP
their proper end point. would ignore the Routing Stuff portions of addresses when delivering
packets to their proper end-point.
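
As a rough sketch of ESD-based demultiplexing, the lookup key can be
built from the ESD portion of each address only. The sketch assumes
that the ESD occupies the low-order 64 bits of the address and that
the Routing Stuff occupies the high-order 64 bits; the names esd_of,
gse_connections and gse_demux, and the example addresses, are
hypothetical.

```python
import ipaddress
from typing import Dict, Optional, Tuple

ESD_BITS = 64                      # assumption: ESD is the low-order 64 bits
ESD_MASK = (1 << ESD_BITS) - 1

# (srcESD, dstESD, srcport, dstport)
GseKey = Tuple[int, int, int, int]
gse_connections: Dict[GseKey, str] = {}


def esd_of(addr: str) -> int:
    """Strip the Routing Stuff and keep only the ESD bits."""
    return int(ipaddress.IPv6Address(addr)) & ESD_MASK


def gse_key(src: str, dst: str, sport: int, dport: int) -> GseKey:
    return (esd_of(src), esd_of(dst), sport, dport)


def gse_demux(src: str, dst: str, sport: int, dport: int) -> Optional[str]:
    """Look up a connection while ignoring the Routing Stuff."""
    return gse_connections.get(gse_key(src, dst, sport, dport))


local = "2001:db8:1:1::10"
peer_via_first_router = "2001:db8:a:1:0200:c0ff:fe12:3456"
peer_via_second_router = "2001:db8:b:1:0200:c0ff:fe12:3456"  # same ESD
other_host = "2001:db8:a:1:0200:c0ff:fe12:9999"              # different ESD

gse_connections[gse_key(peer_via_first_router, local, 40000, 80)] = "conn-1"

via_first = gse_demux(peer_via_first_router, local, 40000, 80)
via_second = gse_demux(peer_via_second_router, local, 40000, 80)
unrelated = gse_demux(other_host, local, 40000, 80)
```

Segments that differ only in their Routing Stuff (via_first and
via_second) reach the same connection, while a segment carrying a
different ESD does not match.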
Although there are potential benefits to this approach (discussed Although there are potential benefits to this approach (discussed
below), demultiplexing of ESDs is in fact a requirement with GSE. If below), demultiplexing on ESDs alone without the RS is, in fact,
a site is multihomed, the packets it sends may exit different egress required with GSE. If a site is multi-homed, the packets it sends may
border routers during the lifetime of a connection. Because each exit different egress border routers during the lifetime of a
border router will place its own RG into the source addresses of such connection. Because each border router will place its own RG into the
packets, the receiving TCP must ignore (at least) the RG portion of source addresses of outgoing packets, the receiving TCP must ignore
addresses when demultiplexing received packets. The alternative would (at least) the RG portion of addresses when demultiplexing received
be to make TCP less robust with respect to changes in routing, i.e., packets. The alternative would be to make TCP less robust with
if the path changed, packets delivered correctly would be discarded respect to changes in routing, i.e., if the path changed, packets
by the receiving TCP rather than processed. delivered correctly would be discarded by the receiving TCP rather
than processed.
4.1.6.2. Pseudo-Header Checksum Calculations 4.1.6.2. Pseudo-Header Checksum Calculations
Having routers rewrite the RG portion of addresses means that TCP Having routers rewrite the RG portion of addresses means that TCP
cannot include the RG in its checksum calculation; the sender does cannot include the RG in its checksum calculation; the sender does
not know its own RG. Consequently, upon receipt of a TCP segment, the not know its own RG. Consequently, upon receipt of a TCP segment, the
receiver has no way of determining whether the RG portion of an receiver has no way of determining whether the RG portion of an
address has been corrupted (or modified) in transit (the implications address has been corrupted (or modified) in transit (the implications
of this are discussed below). of this are discussed below).
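
The consequence can be seen in a small sketch of a pseudo-header
checksum that leaves out the RG. This is an illustration under stated
assumptions, not the computation specified by GSE: the RG is assumed
to be the first 6 bytes of each address and is simply zeroed before
summing, the segment is treated as an opaque byte string, and the
names mask_rg and tcp_checksum_without_rg are hypothetical.

```python
import ipaddress
import struct

RG_BYTES = 6  # assumption: the RG is the first 6 bytes of the address


def mask_rg(addr: str) -> bytes:
    """Return the 16-byte address with its RG bytes replaced by zeros."""
    raw = ipaddress.IPv6Address(addr).packed
    return bytes(RG_BYTES) + raw[RG_BYTES:]


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum_without_rg(src: str, dst: str, segment: bytes) -> int:
    """Checksum over an IPv6-style pseudo-header with the RG masked out."""
    # 32-bit upper-layer length, 3 zero bytes, next header = 6 (TCP).
    pseudo = mask_rg(src) + mask_rg(dst) + struct.pack("!I3xB", len(segment), 6)
    return internet_checksum(pseudo + segment)


segment = b"example tcp segment bytes"
dst = "2001:db8:1:1::10"

as_sent = tcp_checksum_without_rg("2001:db8:a:1::7", dst, segment)
rg_rewritten = tcp_checksum_without_rg("2001:db8:b:1::7", dst, segment)
esd_changed = tcp_checksum_without_rg("2001:db8:a:1::8", dst, segment)
```

Rewriting the RG leaves the checksum unchanged (as_sent equals
rg_rewritten), so a corrupted RG is equally invisible to the
receiver, whereas a change in the ESD is still detected.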
4.1.6.3. RG Selection When Sending Packets 4.1.6.3. RG Selection When Sending Packets
When a host has a packet to send, what RG should it use? There are When a host has a packet to send, there are three cases for deciding
three cases. If the host is performing an ``active open'', it queries what RG to use in the destination. If the host is performing an
the DNS to obtain the destination address, which contains appropriate "active open", it queries the DNS to obtain the destination address,
RG. If the host is responding to an active open from a remote peer, which contains appropriate RG. If the host is responding to an active
the source address of packets from that peer contain usable RG. The open from a remote peer, the source address of packets from that peer
interesting case is when RG changes mid-connection. contains usable RG. Note that assuming that the RG on an incoming TCP
connection is "correct" needs qualification. It is "correct" in the
sense that it corresponds to the site originating the connection.
Whether the ESD paired with the RG is actually located at that site
cannot be assumed. The issue of spoofing is discussed in more detail
later. The last (and most interesting) case is when RG changes mid-
connection. Although the GSE proposal calls for always using the
first RG learned (and then never switching), we explored the
possibility of switching in order to better understand the issues.
4.1.6.4. Mid-Connection RG Changes 4.1.6.4. Mid-Connection RG Changes
During a connection, the RG appearing on subsequent packets is During a connection, the RG appearing on subsequent packets is
susceptible to change through renumbering events, and indeed more susceptible to change through renumbering events, and indeed more
frequently, to change through Site-internal routing changes that frequently, to change through Site-internal routing changes that
cause the egress point for off-Site traffic to change. It is even cause the egress point for off-Site traffic to change. It is even
possible that traffic balancing schemes could result in the use of possible (in the worst case) that traffic-balancing schemes could
two egress routers, with roughly every other packet exiting through a result in the use of two egress routers, with roughly every other
different egress router. Consequently it may be desirable to switch packet exiting through a different egress router. Consequently it may
to the just-received RG, as the old RG may no longer be valid (e.g., be desirable to switch to the just-received RG, as the old RG may no
a border router has failed). However, simply using the most- longer be valid (e.g., a border router has failed), but care must be
recently-received RG makes it trivial to hijack connections. taken not to thrash. Moreover, simply using the most-recently-
received RG makes it trivial for an intruder to hijack connections.
The way TCP packets are demultiplexed under GSE, they will be Because TCP under GSE demultiplexes packets using only ESDs, packets
delivered to the correct endpoint even though TCP may send to its will be delivered to the correct end-point regardless of what source
peer at a deprecated RG or one that is less optimal because the RG is used. However, return traffic will continue to be sent via the
peer's Border Router has changed. It would seem highly desireable "old" RG, even though it may have been deprecated or become less
for TCP connections to be able to survive such events. However, the optimal because the peer's border router has changed. It would seem
completion of renumbering events (so that an earlier RG is now highly desirable for TCP connections to be able to survive such
invalid) and certain topology changes would require TCP to switch events. However, the completion of renumbering events (so that an
sending to a new RG mid-connection. To explore the whole space, we earlier RG is now invalid) and certain topology changes would require
considered ways of allowing this mid-connection RG change. TCP to switch sending to a new RG mid-connection. To explore the
whole space, we considered ways of allowing this mid-connection RG
change to happen.
If TCP connection identifiers are based on ESDs rather than full If TCP connection identifiers are based on ESDs rather than full
addresses, traffic from the same ESD would be viewed as coming from addresses, traffic from the same ESD would be viewed as coming from
the same peer, regardless of its source RG. This makes it trivial for the same peer, regardless of its source RG. This makes it trivial for
any Internet host to impersonate another, and have such traffic be any Internet host to impersonate another, and have such traffic be
accepted by TCP. Because this vulnerability is already present in accepted by TCP. Because this vulnerability is already present in
today's Internet (forging full source addresses is trivial), the mere today's Internet (forging full source addresses is trivial), the mere
delivery of incoming datagrams with the same ESD but a different RG delivery of incoming datagrams with the same ESD but a different RG
does not introduce new vulnerability to TCP. In today's Internet, does not introduce new vulnerability to TCP. In today's Internet,
any node can already originate FINs/RSTs from an arbitrary source any node can already originate FINs/RSTs from an arbitrary source
address and potentially or definitely disrupt the connection. address and potentially or definitely disrupt the connection.
Therefore, changing RG for acceptance, or acceptance of traffic Therefore, changing RG for acceptance, or acceptance of traffic
independent of its source RG, does not significantly worsen existing independent of its source RG, does not appear to significantly worsen
robustness, as far as our analysis has gone. existing robustness.
We also considered allowing TCP to reply to each segment using the RG We also considered allowing TCP to reply to each segment using the RG
of the most recently-received segment. Although, this allows TCP to of the most recently-received segment. Although this allows TCP to
survive some important events (e.g., renumbering), it also makes it survive some important events (e.g., renumbering), it also makes it
trivial to hijack connections, unacceptably weakening robustness trivial to hijack connections, unacceptably weakening robustness
compared with today's Internet. A sender simply needs to guess the compared with today's Internet. A sender simply needs to guess the
sequence numbers in use by a given TCP connection [Bellovin 89] and sequence numbers in use by a given TCP connection [Bellovin 89] and
send traffic with a source RG that redirects responses to the send traffic with a bogus RG to hijack a connection to an intruder
intruder's current location. at an arbitrary location.
Providing protection from hijacking implies that the RG used to send Providing protection from hijacking implies that the RG used to send
packets must be bound to a connection end point (e.g., it is part of packets must be bound to a connection end-point (e.g., it is part of
the connection state). Although it may be reasonable to accept the connection state). Although it may be reasonable to accept
incoming traffic independent of the source RG, the choice of sending incoming traffic independent of the source RG, the choice of sending
RG requires more careful consideration. Indeed, any subsequent change RG requires more careful consideration. Indeed, any subsequent change
in what RG is used for sending traffic must be properly authenticated in what RG is used for sending traffic must be properly authenticated
using cryptographic means. In the GSE proposal, it is not clear how using cryptographic means. In the GSE proposal, it is not clear how
to authenticate such a change, since the remote peer doesn't even to authenticate such a change, since the remote peer doesn't even
know what RG it is using! Consequently, the only reasonable approach know what RG it is using! Consequently, the only reasonable approach
in GSE is to send to the peer at the first RG used by the peer for in GSE is to send to the peer at the first RG used by the peer for
the entire life of a connection. That is, always continue to use the the entire life of a connection. That is, always continue to use the
first RG used. first RG seen.
In summary, changing the RG dynamically in a safe way for a
connection requires that an originator of traffic be able to
authenticate a proposed change in the RG before sending to a
particular ESD via that RG. Such a mechanism would need to be
invented, as the TCP/IP suite has no obvious candidate that operates
at or below the transport layer (using the DNS, an application
protocol that resides above IP, would be problematic due to layering
circularity considerations).
4.1.6.5. Passive Opens
One question that arises is what impact corrupted RG would have on One question that arises is what impact corrupted RG would have on
robustness. Because the RG is not covered by any checksums, it would robustness. Because the RG is not covered by any checksums, it would
be difficult to detect such corruption. Moreover, once a specific RG be difficult to detect such corruption. Moreover, once a specific RG
is in use, it does not change for the duration of a connection. The is in use, it does not change for the duration of a connection. The
interesting case occurs on the passive side of a TCP connection, interesting case occurs on the passive side of a TCP connection,
where a server accepts incoming connections from remote clients. If where a server accepts incoming connections from remote clients. If
the initial SYN from the client includes corrupted RG, the server TCP the initial SYN from the client includes corrupted RG, the server TCP
will create a TCP connection (in the SYN-RECEIVED state) and cache will create a TCP connection (in the SYN-RECEIVED state) and cache
the (corrupted) RG with the connection. The second packet of the 3- the corrupted RG with the connection. The second packet of the 3-way
way handshake, the SYN-ACK packet, would be sent to the wrong RG and handshake, the SYN-ACK packet, would be sent to the wrong RG and
consequently not reach the correct destination. Later, when the consequently not reach the correct destination. Later, when the
client retransmits the (unacknowledged) SYN, the server will continue client retransmits the unacknowledged SYN, the server will continue
to send the SYN-ACK using the bad RG. Eventually the client times to send the SYN-ACK using the bad RG. Eventually the client times
out, and the attempt to open a TCP connection fails. Figure 7 shows out, and the attempt to open a TCP connection fails. Figure 8 shows
the details. the details.
TCP A TCP B TCP A TCP B
1. CLOSED LISTEN 1. CLOSED LISTEN
2. SYN-SENT --> <SRC RG=BITERR><SEQ=100><CTL=SYN> --> SYN-RECEIVED 2. SYN-SENT --> <SRC RG=BITERR><SEQ=100><CTL=SYN> --> SYN-RECEIVED
3. <-- <DST RG=BITERR><SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED 3. <-- <DST RG=BITERR><SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED
4. SYN-SENT --> <SRC RG><SEQ=100><CTL=SYN> --> SYN-RECEIVED 4. SYN-SENT --> <SRC RG><SEQ=100><CTL=SYN> --> SYN-RECEIVED
5. <-- <DST RG=BITERR><SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED 5. <-- <DST RG=BITERR><SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED
... TCP A times out ... TCP A times out
Figure 7 Figure 8
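The failure shown in the figure can be replayed with a minimal sketch. It assumes a server that caches the source RG of the first SYN it sees and never changes it, and a network that delivers a SYN-ACK only when its destination RG equals the client's real RG. The function name, the string RG values, and the returned state labels are illustrative and are not part of the GSE proposal.

```python
def passive_open(syn_source_rgs, client_rg):
    """Replay a client's SYN (re)transmissions against a server that pins
    the first RG it sees; return the client's final state.

    syn_source_rgs: source RG as seen by the server on each SYN, in order.
    client_rg: the RG at which the client is really reachable.
    """
    cached_rg = None
    for src_rg in syn_source_rgs:
        if cached_rg is None:
            # Server enters SYN-RECEIVED and caches the RG, corrupted or not.
            cached_rg = src_rg
        # The SYN-ACK is always sent to the cached RG (assumed policy).
        if cached_rg == client_rg:
            return "ESTABLISHED"
    # Every SYN-ACK went to the wrong RG; the client gives up.
    return "TIMED-OUT"


corrupted = passive_open(["BITERR", "A", "A"], client_rg="A")
clean = passive_open(["A"], client_rg="A")
```

In the corrupted case the later, uncorrupted SYN retransmissions do not help, because the server has already bound the bad RG to the connection.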
We next considered relaxing the restriction on switching RGs in an We next consider relaxing the restriction on switching RGs in an
attempt to avoid the previous failure scenario. The situation is attempt to avoid the previous failure scenario. The situation is
complicated by the fact that the RG on received packets may change complicated by the fact that the RG on received packets may change
for legitimate reasons (e.g., a multi-homed site load-shares traffic for legitimate reasons (e.g., a multi-homed site load-shares traffic
across multiple Border Routers). The key question is how can one across multiple border routers). The key question is how can one
determine which RG is valid and which is not? That is, for each of determine which RG is valid and which is not. That is, for each of
the RGs a sender attempts to use, how can it determine which RG the RGs a sender attempts to use, how can it determine which RG
worked and which did not? Solving this problem is more difficult than worked and which did not? Solving this problem is more difficult than
first appears, since one must cover the cases of delayed segments, first appears, since one must cover the cases of delayed segments,
lost segments, simultaneous opens, etc. If a SYN-ACK is retransmitted lost segments, simultaneous opens, etc. If a SYN-ACK is retransmitted
using different RGs, it is not possible to determine which of those using different RGs, it is not possible to determine which of those
RGs worked correctly. We concluded that the only way TCP could RGs worked correctly. We conclude that the only way TCP could
determine that a particular RG was used to deliver segments was if it determine that a particular RG was used to deliver segments was if it
received an ACK for a specific sequence number in which all received an ACK for a specific sequence number in which all
transmissions of that sequence number used the same RG. transmissions of that sequence number used the same RG (a non-trivial
addition to TCP).
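The condition stated above can be sketched as bookkeeping on the sending side. This is only an illustration under simplifying assumptions: segments are identified by their starting sequence number, an ACK is taken to name exactly one such segment, and the class and method names are hypothetical.

```python
from collections import defaultdict


class RGConfirmer:
    """Track which RGs were used for each transmitted sequence number."""

    def __init__(self):
        self.rgs_by_seq = defaultdict(set)

    def record_send(self, seq, rg):
        # Called for the original transmission and for every retransmission.
        self.rgs_by_seq[seq].add(rg)

    def confirmed_rg(self, acked_seq):
        """Return the RG proven to work by an ACK of acked_seq, else None."""
        rgs = self.rgs_by_seq.get(acked_seq, set())
        if len(rgs) == 1:
            return next(iter(rgs))
        # Unknown segment, or it was sent via more than one RG: ambiguous.
        return None


c = RGConfirmer()
c.record_send(300, "M")
c.record_send(300, "M")   # retransmission via the same RG
c.record_send(400, "M")
c.record_send(400, "N")   # retransmission via a different RG
```

Here an ACK covering sequence number 300 confirms RG M, while an ACK covering 400 confirms nothing, since either transmission could have been the one that arrived.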
We analyzed multiple cases of RG changing within the time of the We analyze multiple cases of RG changing within the time of the
opening handshake. One example is diagrammed in Figure 8, and it and opening handshake. One example is diagrammed in Figure 9, and it and
two others are summarized in Table 1. We observed that RG flap and two others are summarized in Table 1. We observe that RG flap and
large numbers of passive opens may coincide, for instance, when a large numbers of passive opens may coincide, for instance, when a
power failure at a server farm affects both internal routers and power failure at a server farm affects both internal routers and
servers. servers.
time TCP A time TCP B time TCP A time TCP B
t0 --> <SRC RG=M><SEQ=100><SYN> t1 t0 --> <SRC RG=M><SEQ=100><SYN> t1
t3 <-- <DST RG=M><SEQ=300><ACK=101><SYN,ACK> t1 t3 <-- <DST RG=M><SEQ=300><ACK=101><SYN,ACK> t1
TCP B's SYN,ACK is delayed and crosses with retransmit of TCP A's TCP B's SYN,ACK is delayed and crosses with retransmit of TCP A's
SYN on which RG has changed from M to N SYN on which RG has changed from M to N
t2 --> <SRC RG=N><SEQ=100><SYN> t3 t2 --> <SRC RG=N><SEQ=100><SYN> t3
t4 --> <SRC RG=N><SEQ=101><ACK=301> t3 ESTABLISHED t4 --> <SRC RG=N><SEQ=101><ACK=301> t3 ESTABLISHED
TCP B decides to use DST RG=M for TCP A, because it heard from TCP B decides to use DST RG=M for TCP A, because it heard from
RG=M and was ACK'd on a send to RG=M RG=M and was ACK'd on a send to RG=M
Figure 8 Figure 9
SYNFROM SYNACKTO ACKFROM SELECT SYNFROM SYNACKTO ACKFROM SELECT
W W X W W W X W
------------------------------------ ------------------------------------
W W
X W X W X W X W
------------------------------------ ------------------------------------
W W W W
X X Y ?? X X Y ??
Table 1 Table 1
At best, an RG selection algorithm for TCP would be relatively At best, an RG selection algorithm for TCP would be relatively
straightforward but would require new logic in implementations of straightforward but would require new logic in implementations of
TCP's opening handshake --- a significant transition issue. We are TCP's opening handshake --- a significant transition issue. We are
not certain that a valid algorithm is attainable, however. RG changes not certain that a valid algorithm is attainable, however. RG changes
would have to be handled in all cases handled by the opening would have to be handled in all cases handled by the opening
handshake: delayed segments, lost segments, undetected bit errors in handshake: delayed segments, lost segments, undetected bit errors in
RG, simultaneous opens, old segments, and so on. RG, simultaneous opens, old segments and so on.
In the end, we concluded that although the corrupted SYN case of In the end, we conclude that although the corrupted SYN case of
Figure 7 was a potential problem, the changes that would need to be Figure 8 was a potential problem, the changes that would need to be
made to TCP to robustly deal with such corruption would be made to TCP to robustly deal with such corruption would be
significant, if tractable at all. This would result in transition to significant, if tractable at all. This would result in transition to
GSE needing a significant TCPng transition. GSE needing a significant TCPng transition.
Our final conclusion is that transport protocol end points must make Our final conclusion is that transport protocol end-points must make
an early, single choice of the RG to use when sending to a peer and an early, single choice of the RG to use when sending to a peer and
stick with that choice for the duration of the connection. stick with that choice for the duration of the connection.
Specifically: Specifically:
1) The demultiplexing of arriving packets to their transport end 1) The demultiplexing of arriving packets to their transport end
points should use only the ESD, and not the Routing Stuff. points should use only the ESD, and not the Routing Stuff.
2) If the application chooses an RG for the remote peer (i.e., an 2) If the application chooses an RG for the remote peer (i.e., an
active open), use the provided RG for all traffic sent to that active open), use the provided RG for all traffic sent to that
peer, even if alternative RGs are received on subsequent peer, even if alternative RGs are received on subsequent
incoming datagrams from the same ESD. incoming datagrams from the same ESD.
3) For all other cases, use the first RG received with a given ESD 3) For all other cases, use the first RG received with a given ESD
for all sending. We recommend that a means be found for RGs to for all sending. We recommend that a means be found for RGs to
be checksummed if the GSE address structure is used. be checksummed if the GSE address structure is used.
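The three rules above can be sketched as follows. This is a minimal illustration, not an implementation of TCP: the connection key is reduced to the peer's ESD (a real transport would also include port numbers), and the class and method names are assumptions made for the example.

```python
class Endpoint:
    """Pin one sending RG per peer ESD, following rules 1) to 3)."""

    def __init__(self):
        self.send_rg = {}  # peer ESD -> RG pinned for all sending

    def active_open(self, peer_esd, rg_from_application):
        # Rule 2: the application supplied the RG; use it for all sending.
        self.send_rg[peer_esd] = rg_from_application

    def receive(self, src_rg, src_esd):
        """Demultiplex an arriving packet and return the RG used to reply."""
        # Rule 1: the lookup key is the ESD only; src_rg is not part of it.
        # Rule 3: if no RG is pinned yet, pin the first one received.
        self.send_rg.setdefault(src_esd, src_rg)
        return self.send_rg[src_esd]


ep = Endpoint()
ep.active_open("ESD-1", "M")
reply_active = ep.receive("N", "ESD-1")     # later RG N is ignored
reply_first = ep.receive("X", "ESD-2")      # passive side: X is pinned
reply_second = ep.receive("Y", "ESD-2")     # Y is ignored for sending
```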
4.1.6.5. Duplicate ESDs with Differing RGs Consequently, there does not appear to be a straightforward way to
use ESDs in conjunction with mobility or site renumbering (in which
Another interesting case occurs if two different (client) nodes are existing connections survive the renumbering).
using the same ESD, and they attempt to communicate with a common
server. In such cases, the RG (or STP) portions of the address may be
different. However, since only the ESD is used in demultiplexing
packets to their transport end points, traffic from two different
hosts may be delivered and processed by one transport endpoint. Given
the above rules that bind RG to existing connections, only one RG
will be used, and all traffic from the server will be sent to the
same client. It would appear that in most cases, only one connection
would reach ESTABLISHED state, and the others would time out.
One implication of binding RG information to TCP connection state is
that we may be opening the door to additional security threats. One
denial of service attack, for instance, would be for an intruder to
masquerade as another host and ``wedge'' connections in a
SYN-RECEIVED state by sending SYN segments containing an invalid RG in
the source IP address. Subsequent connection attempts to the wedged
host from the legitimate party (if they used the same TCP port
numbers) would then not complete, since return traffic would be sent
to the wrong place.
4.1.6.6. Summary: ESD and RG Not Strictly Independent 4.1.6.6. Summary: ESD and RG Not Strictly Independent
We cannot emphasize enough that the use of an ESD independent of an We cannot emphasize enough that the use of an ESD independent of an
associated RG can be very dangerous. That is, communicating with a associated RG can be very dangerous. That is, communicating with a
peer implies that one is always talking to the same peer for the peer implies that one is always talking to the same peer for the
duration of the communication. But as has been described in previous duration of the communication. But as has been described in previous
sections, such assurance can only take place if there are assurances sections, such assurance can only take place if there are assurances
that only properly authenticated RG is used --- RG authenticated by that only properly authenticated RG is used.
the peer.
Consequently, we conclude that the rules for transport processing We conclude that the rules for transport processing when ESDs are
when ESDs are present differ from classical IP. Specifically: present differ from classical IP. Specifically:
1) The demultiplexing of packets to transport connection end points 1) The demultiplexing of packets to transport connection end-points
should use ESDs, but should not use the Routing Stuff part of should use ESDs, but should not use the Routing Stuff part of
addresses. addresses. This ensures that packets are delivered to their
intended destination independent of RG.
2) Once a packet has been delivered to its transport endpoint, that
packet should not be processed without first examining the
source RG used. Whether (and how) the information in the
received packet is used is dependent on the transport protocol
itself. A protocol could choose to completely ignore the packet,
it could selectively use parts of the packet (e.g., to attempt
out-of-band authentication of the RG), or it could process the
packet in its entirety. It must not, however, use the received
RG to send subsequent return traffic without first
authenticating the RG.
4.1.7. ESD: Application Layer Issues
In this section we define applications as user processes that must
exchange data reliably with remote processes. Such distributed
processes need system support to reliably identify remote processes.
It is desirable (necessary?) for such end identifications to meet the
following requirements:
1) The identifier assigned to each end point should be globally
unique.
2) This uniqueness should be easily enforceable because it is
difficult, or probably impossible, to provide an absolute
guarantee on the uniqueness of these identifiers. Applications
relying on this uniqueness must be prepared for duplicate
detection; at the same time, one must be prepared to detect
maliciously forged identifiers.
The last point is becoming increasingly important as the Internet
continues to grow exponentially, both in the scale of user population
and in the scope of diverse applications that cover all walks of
life.
In the original design of the IPv4 architecture, globally unique IP
addresses are used as the globally unique identifiers for all
interfaces reachable on the Internet. Thus, a user process easily
obtains a globally unique identifier by attaching a local port number
to the address. One fundamental architectural change suggested by GSE
is to split the address into completely separate interface
identifiers and locators (routing stuff). In the rest of this section
we discuss the pros and cons of each of the two approaches.
4.1.7.1. The Impacts of Address and Identifier Overloading In IPv4
Because global IP addresses are unique, using addresses as
identifiers automatically provides the needed uniqueness property of
an identifier. Moreover, duplicates can be easily detected. If two
or more interfaces claim the same unicast address, then due to shortest
path routing, packets destined to that IP address are, generally
speaking, delivered only to the interface closest to the source.
Nodes with duplicate addresses can detect the error by observing the
lack of connectivity to certain locations.
Furthermore, using addresses as identifiers makes forging difficult.
If a malicious node inserts a false source address in its outgoing
packets, although the packets are likely to be delivered to the
destination host, it is almost impossible for the malicious node to
receive any reply data. We can further prevent the packets with
faulty source address from being delivered by making all routers
perform reverse source address checking, that is, checking each
incoming packet to see if it comes from the correct interface as
indicated by the routing table, a practice enforced by all IP
multicast routing protocols. Such source address checking provides a
simple, universal and effective enforcement to correct interface
identifications. Even if some routers are compromised and allow
packets with false source addresses to pass through, delivering
packets with forged source addresses to the destination would require
that all routers along the path be compromised.
The fundamental disadvantage of overloading addresses with
identification information is that changes in addresses lead to
changes in identification, which implies that all hosts will be aware
of all address changes, an issue GSE is designed to resolve. It is
worth noting that keeping TCP connections running across renumbering
is a non-goal of GSE design.
4.1.7.2. The Impact of Separating Locators and Identifiers
GSE uses the upper 8 bytes of IPv6 addresses as locators and the
lower 8 bytes as globally unique identifiers (ESD). The chief
advantage of this complete separation of Routing Stuff and ESD is a
stable identifier per interface, independent from all renumbering
changes in a network with a provider-based addressing architecture.
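The 8+8 split described above can be illustrated with a short sketch using Python's standard ipaddress module. The function name and the sample addresses (taken from the 2001:db8::/32 documentation prefix) are illustrative assumptions.

```python
import ipaddress
from typing import Tuple


def split_gse_address(addr: str) -> Tuple[bytes, bytes]:
    """Return (routing_stuff, esd): the upper and lower 8 bytes."""
    packed = ipaddress.IPv6Address(addr).packed  # 16 bytes, network order
    return packed[:8], packed[8:]


# The same interface before and after a hypothetical renumbering event:
old_rs, old_esd = split_gse_address("2001:db8:1:2:a00:20ff:fe12:3456")
new_rs, new_esd = split_gse_address("2001:db8:9:9:a00:20ff:fe12:3456")
```

The Routing Stuff differs between the two addresses while the ESD is unchanged, which is the stability property the separation is meant to provide.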
One intention of the GSE design is to use the ESD alone in
establishing and maintaining TCP connection state. The discussions at
the interim meeting, however, revealed significant resistance to
doing this. The consensus from the meeting was that it is not
adequate to use an ESD alone for end point identification due to the
ease of hijacking a TCP connection. Incoming packets with a wrong ESD
can easily be detected as coming from an incorrect source; however,
incoming packets with a correct ESD cannot be easily trusted as being
from the correct source.
Various approaches to using RG as part of TCP source verification
were discussed at the meeting. Using ESDs as end point
identifications seems to require two steps of processing. In the
first step, the ESD can be used for PCB lookups. In the second step,
the entire address (including RG) must be considered because one
cannot safely take packets with an arbitrary RG. So the purpose of
the first step is to locate the intended end point of an incoming
packet. The second step then can make a separate decision as to how
to act on the received data (accept, reject or perform out-of-band
authentication on the RG). Due to the conflict between applications'
desire to use RG information for remote end checking and GSE's desire
to hide RG from hosts, however, none of the approaches can satisfy
both desires at the same time.
Thus the disadvantage of the Routing Stuff and ESD separation comes
directly from this separation. We believe that neither the global
uniqueness of ESDs nor their correct use is enforceable, thus easy
detection of wrong ESDs becomes the key. Unfortunately, short of
using IPSEC for every IP packet delivery, using ESD alone loses the
advantage of easy forge detection that comes from the address
overloading in IPv4 design.
4.1.8. When ESDs are Not Unique
In IPv4, communication usually fails quickly when addresses are not
unique. There are two cases to consider, depending on whether the two
interfaces assigned duplicate addresses are attached to the same or
to different links.
When two interfaces on the same link use the same address, a node
(host or router) sending traffic to the duplicate address will in
practice send all packets to one of the nodes. On Ethernets, for
example, the sender will use ARP (or Neighbor Discovery in IPv6) to
determine the link layer address corresponding to the destination
address. When multiple ARP replies for the target IP address are
received, the most recently received response replaces whatever is
already in the cache. Consequently, the destinations a node using a
duplicate IP address can communicate with depend on what its
neighboring nodes have in their ARP caches. In most cases, such
communication failures become apparent relatively quickly, since it
is unlikely that communication can proceed correctly on both nodes.
It is also the case that a number of ARP implementations (e.g., BSD- 2) Once a packet has been delivered to its transport end-point, a
derived implementations) log warning messages when an ARP request is separate (i.e., distinct) decision should be made concerning
received from a node using the same address as the machine receiving whether and how to act upon the received packet. Such a decision
the ARP request. would be transport-protocol specific. A protocol could choose to
completely ignore the packet, it could selectively use parts of
the packet (e.g., to attempt out-of-band authentication of the
RG), or it could process the packet in its entirety. It must
not, however, use the received RG to send subsequent return
traffic without first authenticating the RG.
When two interfaces on different links use the same address, the 4.1.7. On The Uniqueness Of ESDs
routing subsystem will generally deliver packets to only one of the
nodes because only one of the links has the right ``prefix'' or
``subnet part'' corresponding to the IP address. Consequently, the
node using the address on the ``wrong'' link will generally never
receive any packets sent to it and will be unable to communicate with
anyone. For obvious reasons, this condition is usually detected
quickly.
An important observation is that, with classical IP, when different The uniqueness requirements for ESDs depend on what purpose they
nodes mistakenly assign the same IP address to different interfaces, serve. In GSE, ESDs identify end systems, requiring that they be
problems become apparent relatively quickly because communication globally unique. It does not make sense for two different end systems
with several (if not all) destinations fails. In contrast, failure to use the same ESD; every end system must have its own ESD to
scenarios differ when globally unique ESDs are assumed, but two nodes distinguish it from other end systems.
mistakenly select the same one.
At first glance it might appear that two nodes using the same ESD If ESDs are only used to identify session endpoints, the situation
cannot communicate. However, this is not necessarily the case. In the becomes more complex. At first glance it might appear that two nodes
GSE proposal, for example, a node queries the DNS to obtain an IPv6 using the same ESD cannot communicate. However, this is not
address. The returned address includes the Routing Stuff of an necessarily the case. In the GSE proposal, for example, a node
address (the RG+STP portions). Since the sending host transmits queries the DNS to obtain an IPv6 address. The returned address
packets based on the entire destination IPv6 address, the sender may includes the Routing Stuff of an address (the RG+STP portions). Since
well forward the packet to a router that delivers the packet to its the sending host transmits packets based on the entire destination
correct destination (using the information in the Routing Stuff). It IPv6 address, the sender may well forward the packet to a router that
is only on receipt of a packet that a node would extract the ESD delivers the packet to its correct destination (using the information
portion of a datagram's destination address and ask ``is this for in the Routing Stuff). It is only on receipt of a packet that a node
me?'' would extract the ESD portion of a datagram's destination address and
ask "is this for me?"
A more problematic case occurs if two nodes using the same ESD A more problematic case occurs if two nodes using the same ESD
communicate with a third party. To the third party, packets received communicate with a third party. To the third party, packets received
from either machine might appear to be coming from the same machine from either machine might appear to be coming from the same machine
since they are both using the same ESD. Consequently, at the since they are both using the same ESD. Consequently, at the
transport level, if both machines choose the same source and transport level, if both machines choose the same source and
destination port numbers (one of the ports --- a server's well-known destination port numbers (one of the ports --- a server's well-known
port number will likely be the same), packets belonging to two port number will likely be the same), packets belonging to two
distinct transport connections will be demultiplexed to a single distinct transport connections will be demultiplexed to a single
transport end point. transport end-point.
When packets from different sources using the same source ESD are When packets from different sources using the same source ESD are
delivered to the same transport end point, a number of possibilities delivered to the same transport end-point, a number of possibilities
come to mind: come to mind:
1) The transport end point could accept the packet, without regard 1) The transport end-point could accept the packet, without regard
to the Routing Stuff of the source address. This may lead to a to the Routing Stuff of the source address. This may lead to a
number of robustness problems, if data from two different number of robustness problems, if data from two different
sources mistakenly using the same ESD are delivered to the same sources mistakenly using the same ESD are delivered to the same
transport or application end point (which at best will confuse transport or application end-point (which at best will confuse
the application). the application).
2) The transport end point could verify that the Routing Stuff of 2) The transport end-point could verify that the Routing Stuff of
the source address matches one of a set of expected values the source address matches one of a set of expected values
before processing the packet further. If the Routing Stuff before processing the packet further. If the Routing Stuff
doesn't match any expected value, the packet could be dropped. doesn't match any expected value, the packet could be dropped.
This would result in a connection from one host operating This would result in a connection from one host operating
correctly, while a connection from another host (using the same correctly, while a connection from another host (using the same
ESD) would fail. ESD) would fail.
3) When a packet is received with an unexpected Routing Stuff the 3) When a packet is received with an unexpected Routing Stuff the
receiver could invoke special-purpose code to deal with this receiver could invoke special-purpose code to deal with this
case. Possible actions include attempting to verify whether the case. Possible actions include attempting to verify whether the
Routing Stuff is indeed correct (the saved values may have Routing Stuff is indeed correct (the saved values may have
expired) or attempting to verify whether duplicate ESDs are in expired) or attempting to verify whether duplicate ESDs are in
use (e.g., by inventing a protocol that sends packets using both use (e.g., by inventing a protocol that sends packets using both
Routing Stuff and verifies that they go are delivered to the Routing Stuff and verifies that they are delivered to the same
same end point). end-point).
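The three possibilities listed above can be contrasted in a small sketch. The policy names, the "process"/"drop" results, and the optional verify callback (standing in for whatever special-purpose check a protocol might invent) are all hypothetical.

```python
from enum import Enum


class Policy(Enum):
    ACCEPT_ANY = 1       # possibility 1: ignore the source Routing Stuff
    MATCH_EXPECTED = 2   # possibility 2: drop unless it is an expected value
    VERIFY = 3           # possibility 3: run special-purpose verification


def handle_packet(policy, src_routing_stuff, expected, verify=None):
    """Decide what to do with a packet already demultiplexed by ESD."""
    if policy is Policy.ACCEPT_ANY or src_routing_stuff in expected:
        return "process"
    if policy is Policy.MATCH_EXPECTED:
        return "drop"
    # Policy.VERIFY: accept only if the (assumed) out-of-band check succeeds.
    if verify is not None and verify(src_routing_stuff):
        return "process"
    return "drop"


expected = {"RS-A"}
r1 = handle_packet(Policy.ACCEPT_ANY, "RS-B", expected)
r2 = handle_packet(Policy.MATCH_EXPECTED, "RS-B", expected)
r3 = handle_packet(Policy.VERIFY, "RS-B", expected, verify=lambda rs: rs == "RS-B")
```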
4.1.9. DNS PTR Queries 4.1.8. DNS PTR Queries
IPv4 uses the top-level domain ``IN-ADDR.ARPA'' to hold PTR Resource IPv4 uses the domain "IN-ADDR.ARPA" to hold PTR Resource Records. PTR
Records. PTR RRs allow a client to map IP addresses back into the RRs allow a client to map IP addresses back into the domain name
domain name corresponding to that address. IPv4 addresses can be put corresponding to that address. IPv4 addresses can be put into the DNS
into the DNS because they have hierarchical structure -- the same because they have hierarchical structure -- the same hierarchy used
hierarchy used to aggregate routes. to aggregate routes.
The ability to map an IP address into its corresponding DNS name is The ability to map an IP address into its corresponding DNS name is
used in several contexts: used in several contexts:
1) Network packet tracing utilities (e.g., tcpdump) display the 1) Network packet tracing utilities (e.g., tcpdump) display the
contents of packets. Printing out the DNS names appearing in contents of packets. Printing out the DNS names appearing in
those packets (rather than dotted IP addresses) requires access those packets (rather than dotted IP addresses) requires access
to an address-to-name mapping mechanism. to an address-to-name mapping mechanism.
2) Some applications perform ``cheap'' authentication by using the 2) Some applications perform "cheap" authentication by using the
DNS to map a source address of a peer into a DNS name. Then, the DNS to map a source address of a peer into a DNS name. Then, the
client queries the DNS a second time, this time asking for the client queries the DNS a second time, this time asking for the
address(es) corresponding to the peer's DNS name. Only if one of address(es) corresponding to the peer's DNS name. Only if one of
the addresses returned by the DNS matches the peer address of the addresses returned by the DNS matches the peer address of
the TCP connection is the source of the TCP connection accepted the TCP connection is the source of the TCP connection accepted
as being from the indicated DNS name. as being from the indicated DNS name.
It is important to note that although two DNS queries are made It is important to note that although two DNS queries are made
during the above operation, it is the second one --- mapping the during the above operation, it is the second one --- mapping the
peer's DNS name back into an IP address --- that provides the peer's DNS name back into an IP address --- that provides the
authentication property. The first transaction simply obtains authentication property. The first transaction simply obtains
the peer's DNS name, but no assumption is made that the returned the peer's DNS name, but no assumption is made that the returned
DNS name is correct. Thus, the first DNS query could be DNS name is correct. Thus, the first DNS query could be
replaced by an alternate mechanism without weakening the replaced by an alternate mechanism without weakening the already
(already weak) authentication check described above. One weak authentication check described above. One possible
possible alternate mechanism, an ICMP ``Who Are You'' message, alternate mechanism, an ICMP "Who Are You" message, is described
is described in Section 4.1.12. in Section 4.1.11.
3) Applications that log all incoming network connections (e.g., 3) Applications that log all incoming network connections (e.g.,
anonymous FTP servers) may prefer logging recognizable DNS names anonymous FTP servers) may prefer logging recognizable DNS names
to addresses. to addresses.
4) Network administrators examining logs or other trace data 4) Network administrators examining logs or other trace data
containing addresses may wish to determine the DNS name of some containing addresses may wish to determine the DNS name of some
addresses. Note that this may occur sometime after those addresses. Note that this may occur sometime after those
addresses were actually used. addresses were actually used.
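
The "cheap" authentication check in item 2 can be sketched as follows. This is only an illustration: the two lookup functions are passed in as parameters (hypothetical names) rather than tied to any resolver API, which also makes visible that the first step could be replaced by another mechanism without changing the check itself.

```python
def cheap_auth(peer_addr, addr_to_name, name_to_addrs):
    """Return the peer's name if the forward lookup confirms it, else None.

    addr_to_name  -- any mechanism that proposes a name for an address
                     (a PTR query today; the result is not trusted).
    name_to_addrs -- forward lookup returning the addresses for a name;
                     this second step is what provides the (weak) check.
    """
    claimed = addr_to_name(peer_addr)
    if claimed is None:
        return None
    # Accept the name only if the forward lookup lists the peer address.
    if peer_addr in name_to_addrs(claimed):
        return claimed
    return None


# Toy data standing in for the DNS (illustrative values only).
forward = {"host.example.net": ["192.0.2.5"], "www.example.org": ["198.51.100.7"]}
honest_reverse = {"192.0.2.5": "host.example.net"}
lying_reverse = {"192.0.2.5": "www.example.org"}

print(cheap_auth("192.0.2.5", honest_reverse.get, lambda n: forward.get(n, [])))
print(cheap_auth("192.0.2.5", lying_reverse.get, lambda n: forward.get(n, [])))
```

In the second call the reverse mapping claims a name that the forward lookup does not confirm, so the name is rejected.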
Although DNS PTR records have proven useful in several contexts, Although DNS PTR records have proven useful in several contexts,
there is also widespread agreement that, in practice, most of the IP there is also widespread agreement that, in practice, many IP
addresses in use today are not properly registered in the PTR addresses in use today are not properly registered in the IN-
hierarchy. Consequently, more often than not, PTR queries fail to ADDR.ARPA namespace. Consequently, PTR queries frequently fail to
return usable information. Thus, the overall utility of PTR records return usable information. Thus, the overall utility of PTR records
is questionable. is questionable.
It is also worth noting that the primary reason that so few addresses It is also worth noting that the primary reason that so few addresses
are properly registered in the PTR space is the lack of incentive for are properly registered in the PTR space is the absence of incentive
doing so. With no key piece of the Internet infrastructure depending for doing so. With no key piece of the Internet infrastructure
on such mappings being in place (or correct), there is little harm in depending on such mappings being in place or correct, there is little
failing to keep it up-to-date. practical harm in failing to keep it up-to-date.
Finally, it might appear at first glance that secure DNS [RFC2065] Finally, it might appear at first glance that secure DNS [RFC2065]
provides a means for cryptographically signing a PTR record and provides a means for cryptographically signing a PTR record and
thereby providing authentication. Things are not so simple, however. thereby providing authentication. Things are not so simple, however.
The signature on a PTR record indicates that the entity owning an The signature on a PTR record indicates that the entity owning an
address has given it a DNS name. It does not mean that the owner of address has given it a DNS name. It does not mean that the owner of
the address is authorized to use that specific name. For example, the address is authorized to use that specific name. For example,
anyone owning an address can set up a PTR record indicating that the anyone owning an address can set up a PTR record indicating that the
address corresponds to the name ``www.ietf.org''. However, the name address corresponds to the name "www.ietf.org". However, the name
``www.ietf.org'' belongs to only one entity, regardless of how many "www.ietf.org" belongs to only one entity, regardless of how many PTR
PTR records indicate otherwise. records indicate otherwise.
4.1.10. Reverse Mapping of ESDs 4.1.9. Reverse Mapping of ESDs
If an end point is identified via an ESD rather than by its full It is reasonable to ask if it is necessary or desirable to be able to
address, do we need the ability map the ESD alone into some other map an ESD (alone) into some other meaningful quantity, such as a
meaningful quantity, such as a fully qualified domain name? The fully qualified domain name. The benefits of being able to perform
benefits of being able to perform such a mapping are analogous to such a mapping are analogous to those described in the preceding
those described in the preceding section. section.
The primary difficulty with constructing such a mapping is that it The primary difficulty with constructing such a mapping is that it
requires that ESDs have sufficient structure to support the requires that ESDs have sufficient structure to support the
delegating mechanism of a distributed database such as DNS. The sorts delegating mechanism of a distributed database such as DNS. The sorts
of built-in identifiers now found in computing hardware, such as of built-in identifiers now found in computing hardware, such as
``EUI-48'' and ``EUI-64'' addresses [IEEE802, IEEE1212], do not have "EUI-48" and "EUI-64" addresses [IEEE802, IEEE1212], do not have the
the structure required for this delegation. Hence, stateless structure required for this delegation. Hence, stateless
autoconfiguration [RFC1971] cannot create addresses with the autoconfiguration [RFC1971] cannot create addresses with the
necessary hierarchical property. necessary hierarchical property.
Another possibility would be to define ESDs with sufficient structure Another possibility would be to define ESDs with sufficient structure
to permit the construction of a mapping mechanism. However, analysis to permit the construction of a mapping mechanism. However, analysis
performed during the IPng deliberations concluded that close to 48- performed during the IPng deliberations concluded that close to 48-
bits of hierarchy were needed to identify all the possible sites bits of hierarchy were needed to identify all the possible sites
30-40 years from now. That would leave only 2 bytes for host 30-40 years from now. That would leave only 2 bytes for host
numbering at a site, a number clearly incompatible with stateless numbering at a site, a number clearly incompatible with stateless
autoconfiguration [RFC1971]. autoconfiguration [RFC1971].
There are several arguments against requiring a global ESD-lookup There are several arguments against having a global ESD-lookup
capability. Adding sufficient structure to an 8-byte ESD requires capability. Adding sufficient structure to an 8-byte ESD would be
more bits than are compatible with stateless autoconfiguration. In incompatible with stateless autoconfiguration, which already uses 6
addition, experience with the IN-ADDR.ARPA domain suggests that the bytes for its token; two additional bytes for hierarchy are clearly
required databases will be poorly maintained. Finally, imposing a insufficient. In addition, experience with the IN-ADDR.ARPA domain
required hierarchical structure on ESDs would also introduce a new suggests that the required databases will be poorly maintained.
administrative burden and a new or expanded registry system to manage Finally, imposing a required hierarchical structure on ESDs would
ESD space. While the procedures for assigning ESDs, which need only also introduce a new administrative burden and a new or expanded
organizational and not topological significance, would be simpler registry system to manage ESD space. While the procedures for
than the procedures for managing IPv4 addresses (or DNS names), it is assigning ESDs, which need only organizational and not topological
hard to imagine such a process being universally well-received or significance, would be simpler than the procedures for managing IPv4
without controversy; it seems a laudable goal to avoid the problem addresses (or DNS names), it is hard to imagine such a process being
altogether if possible. universally well-received or without controversy; it seems a laudable
goal to avoid the problem altogether if possible.
Finally, there is an argument based on allocation efficiency. One of
the design criteria for IPng was support for 10^9 networks and 10^12
hosts ``and preferably much more'' [RFC1726], and estimates as high
as 10^15 hosts have been published. Since GSE uses 64 bits to
designate the Site and subnet and the same number to designate the
end system, the allocation efficiency for the latter assignment
process must be much greater. In terms of the H ratio [RFC1715], the
RG+STP portion of the address needs to achieve an H ratio of only
0.14, which is as low a value as any of the example numbering schemes
examined in RFC 1715, while the ESD assignment must achieve 0.19 to
0.23, depending on the total host estimate one accepts. These values
are in the middle to high range of the schemes examined, indicating
that they represent a feasible efficiency -- achievable with careful
management.
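
The H ratio figures quoted above can be checked with a few lines of arithmetic. The sketch below uses the definition from RFC 1715, H = log10(number of objects) / available bits, and assumes 64 bits for each half of the address as stated in the text; the function name is illustrative.

```python
import math


def h_ratio(objects: float, bits: int) -> float:
    """H ratio as defined in RFC 1715: log10(objects) / available bits."""
    return math.log10(objects) / bits


# RG+STP half of the address: 10^9 networks in 64 bits.
print(round(h_ratio(1e9, 64), 2))
# ESD half of the address: 10^12 to 10^15 hosts in 64 bits.
print(round(h_ratio(1e12, 64), 2))
print(round(h_ratio(1e15, 64), 2))
```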
4.1.11. Reverse Mapping of Complete GSE Addresses 4.1.10. Reverse Mapping of Complete GSE Addresses
Within a Site, one could imagine maintaining a database keyed on Although it seems infeasible to have a global-scale reverse mapping
unstructured 8-byte ESDs. However, it is a matter of debate whether of ESDs, within a Site, one could imagine maintaining a database
such a database could be kept up-to-date at reasonable cost, without keyed on unstructured 8-byte ESDs. However, it is a matter of debate
making unreasonable assumptions as to how large Sites are going to whether such a database can be kept up-to-date at reasonable cost,
grow, and how frequently ESD registrations will be made or updated. without making unreasonable assumptions as to how large sites are
Note that the issue isn't just the physical database itself, but the going to grow, and how frequently ESD registrations will be made or
operational issues involved in keeping it up-to-date. For the rest of updated. Note that the issue isn't just the physical database itself,
this section, however, let us assume such a database can be built. but the operational issues involved in keeping it up-to-date. For the
rest of this section, however, let us assume such a database can be
built.
A mechanism supporting a lookup keyed on a flat-space ESD from an A mechanism supporting a lookup keyed on a flat-space ESD from an
arbitrary Site requires having sufficient structure to identify the arbitrary Site requires having sufficient structure to identify the
Site that needs to be queried. In practice, an ESD will almost always Site that needs to be queried. In practice, an ESD will almost always
be used in conjunction with Routing Stuff (i.e., a full 16-byte be used in conjunction with Routing Stuff (i.e., a full 16-byte
address). Since the Routing Stuff is organized hierarchically, it address). Since the Routing Stuff is organized hierarchically, it
becomes feasible to maintain a DNS tree that maps full GSE addresses becomes feasible to maintain a DNS tree that maps full GSE addresses
into DNS names, in a fashion analogous to what is done with IPv4 PTR into DNS names, in a fashion analogous to what is done with IPv4 PTR
records today. records today.
skipping to change at page 39, line 41 skipping to change at page 38, line 41
Routing Stuff portion of the address is correctly entered in the DNS Routing Stuff portion of the address is correctly entered in the DNS
tree. Because the RG portion of an address is expected to change over tree. Because the RG portion of an address is expected to change over
time, this assumption will not be valid indefinitely. As a time, this assumption will not be valid indefinitely. As a
consequence, a packet trace recorded in the past might not contain consequence, a packet trace recorded in the past might not contain
enough information to identify the off-Site sources of the packets in enough information to identify the off-Site sources of the packets in
the present. This problem can be addressed by requiring that the the present. This problem can be addressed by requiring that the
database of RG delegations be maintained for some period of time database of RG delegations be maintained for some period of time
after the RG is no longer usable for routing packets. after the RG is no longer usable for routing packets.
Finally, it should be noted that the problem where an address's RG Finally, it should be noted that the problem where an address's RG
``expires'' with the implication that the mapping of ``expired'' "expires" with the implication that the mapping of "expired"
addresses into DNS names may no longer hold is not a problem specific addresses into DNS names may no longer hold is not a problem specific
to the GSE proposal. With provider-based addressing, the same issue to the GSE proposal. With provider-based addressing, the same issue
arises when a site renumbers into a new provider prefix and releases arises when a site renumbers into a new provider prefix and releases
the allocation from a previous block. the allocation from a previous block. The authors are aware of one
such renumbering in IPv4 where a block of returned addresses was
reassigned and reused within 24 hours of the renumbering.
4.1.12. The ICMP ``Who Are You'' Message 4.1.11. The ICMP "Who Are You" Message
Although there is widespread agreement of the utility of being able Although there is widespread agreement on the utility of being able
to determine the DNS name one is communicating with, there is also to determine the DNS name one is communicating with, there is also
widespread concern that repeating the experience of the ``in- widespread concern that repeating the experience of the "IN-
addr.arpa'' domain is undesirable. Consequently, an old proposal to ADDR.ARPA" domain is undesirable. Consequently, an old proposal to
define an ICMP ``Who Are You?'' message was resurrected [RFC1788]. A define an ICMP "Who Are You?" message was resurrected [RFC1788]. A
client would send such a message to a peer, and that peer would client would send such a message to a peer, and that peer would
return an ICMP message containing its DNS name. return an ICMP message containing its DNS name.
Asking a remote host to supply its own name in no way implies that Asking a remote host to supply its own name in no way implies that
the returned information is accurate. However, having a remote peer the returned information is accurate. However, having a remote peer
provide a piece of information that a client can use as input to a provide a piece of information that a client can use as input to a
separate authentication procedure provides a starting point for separate authentication procedure provides a starting point for
performing strong authentication. The actual strength of the performing strong authentication. The actual strength of the
authentication depends on the authentication procedure invoked, authentication depends on the authentication procedure invoked,
rather than the (untrustable) piece of information provided by a rather than the untrustable piece of information provided by a remote
remote peer. peer.
Reconsidering the ``cheap'' authentication procedure described in Reconsidering the "cheap" authentication procedure described in
Section 4.1.9, the ICMP ``Who Are You'' replaces the DNS PTR query Section 4.1.8, the ICMP "Who Are You" replaces the DNS PTR query used
used to obtain the DNS name of a remote peer. The second DNS query, to obtain the DNS name of a remote peer. The second DNS query, to map
to map the DNS name back into a set of addresses, would be performed the DNS name back into a set of addresses, would be performed as
as before. Because the latter DNS query provides the strength of the before. Because the latter DNS query provides the strength of the
authentication, the use of an ICMP ``Who Are You'' message does not authentication, the use of an ICMP "Who Are You" message does not in
in any way weaken the strength of the authentication method. Indeed, any way weaken the strength of the authentication method. Indeed, it
it can only make it more useful in practice, because virtually all can only make it more useful in practice, because virtually all hosts
hosts can be expected to implement the ``Who Are You'' message. can be expected to implement the "Who Are You" message.
The ``Who Are You'' message would contain an identifier for matching The "Who Are You" message could contain an identifier for matching
replies to requests, as well as a nonce value to provide resistance replies to requests, and perhaps a nonce value to provide resistance
to spoofing. In order to minimize the number of WRU packets on the to spoofing. In order to minimize the number of WRU packets on the
Internet, the WRU messages should be sent by DNS servers who would Internet, the WRU messages should be sent by DNS servers who would
then cache the answers. This has the pleasant side-effect of reducing then cache the answers. This has the pleasant side-effect of reducing
the impact on existing applications (i.e., they would continue to the impact on existing applications (i.e., they would continue to
look up addresses using the same API as before). In many cases there look up addresses using the same API as before). In many cases there
is a natural TTL that the target node can provide in its reply: is a natural TTL that the target node can provide in its reply:
either the remaining lifetime of a DHCP lease or the remaining valid either the remaining lifetime of a DHCP lease or the remaining valid
time of a prefix from which the address was derived through stateless time of a prefix from which the address was derived through stateless
autoconfiguration. autoconfiguration.
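
The caching behavior described above might look roughly like the following sketch. Nothing here reflects a defined message format: the class, its method names, the 8-byte nonce length, and the use of a caller-supplied clock value are all assumptions made for illustration.

```python
import secrets


class WruCache:
    """Toy cache of WRU answers, as a DNS server might keep (hypothetical)."""

    def __init__(self):
        self._pending = {}   # request id -> (queried address, nonce)
        self._answers = {}   # address -> (name, expiry time)
        self._next_id = 0

    def make_request(self, address):
        """Return (request_id, nonce) to carry in an outgoing WRU query."""
        self._next_id += 1
        nonce = secrets.token_bytes(8)
        self._pending[self._next_id] = (address, nonce)
        return self._next_id, nonce

    def accept_reply(self, request_id, nonce, name, ttl, now):
        """Cache a reply only if it matches an outstanding request."""
        pending = self._pending.get(request_id)
        if pending is None or not secrets.compare_digest(pending[1], nonce):
            return False  # unknown identifier or wrong nonce: likely spoofed
        address, _ = self._pending.pop(request_id)
        # The TTL is supplied by the target (e.g., remaining lease lifetime).
        self._answers[address] = (name, now + ttl)
        return True

    def lookup(self, address, now):
        """Return the cached name, or None if absent or expired."""
        entry = self._answers.get(address)
        if entry is None or entry[1] <= now:
            self._answers.pop(address, None)
            return None
        return entry[0]
```

A server using such a cache would answer repeated queries for the same address locally until the TTL supplied by the target runs out.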
The ``Who Are You?'' (WRU) message described in [section ref: The "Who Are You?" (WRU) message described in Section 4.1.10 is
``Reverse Mapping of Complete IPv6 Addresses'' under robust against renumbering, since it follows the paths of valid
Analysis/ESD/WRU] is robust against renumbering, since it follows the routable prefixes. Essentially, it uses the Internet routing system
paths of valid routable prefixes. Essentially, it uses the Internet in place of the DNS delegation scheme. It is attractive in the
routing system in place of the DNS delegation scheme. It is context of GSE-style renumbering, since no host or DNS server needs
attractive in the context of GSE-style renumbering, since no host or to be updated after a renumbering event for WRU-based lookups to
DNS server needs to be updated after a renumbering event for WRU- work. It has advantages outside the context of GSE as well, including
based lookups to work. It has advantages outside the context of GSE a more decentralized, and hence more scalable, administration and
as well, including a more decentralized, and hence more scalable, easier upkeep than a DNS reverse-lookup zone. It also has drawbacks:
administration and easier upkeep than a DNS reverse-lookup zone. It it requires the target node to be up and reachable at the time of the
also has drawbacks: it requires the target node to be up and query and to know its fully qualified domain name. It is also not
reachable at the time of the query and to know its fully qualified possible to resolve addresses once those addresses become unroutable.
domain name. It is also not possible to resolve addresses once those In contrast, the DNS PTR mirrors, but is independent of, the routing
addresses become unroutable. In contrast, the DNS PTR mirrors, but is hierarchy. The DNS can maintain mappings long after the routing
independent of, the routing hierarchy. The DNS can maintain mappings subsystem stops delivering packets to certain addresses.
long after the routing subsystem stops delivering packets to certain
addresses.
The requirement that the target node be up and reachable at the time The requirement that the target node be up and reachable at the time
of the query makes it very uncertain that one would be able to take of the query makes it very uncertain that one would be able to take
addresses from a packet log and translate them to correct domain addresses from a packet log and translate them to correct domain
names at a later date. This is a design flaw in the logging system, names at a later date. This is a design flaw in the logging system,
as it violates the architectural principle, ``Avoid any design that as it violates the architectural principle, "Avoid any design that
requires addresses to be ... stored on non-volatile storage.'' requires addresses to be ... stored on non-volatile storage."
[RFC1958] A better-designed system would look up domain names [RFC1958] A better-designed system would look up domain names
promptly from logged addresses. Indeed, one of the authors is pleased promptly from logged addresses. Indeed, one of the authors is pleased
to be able to state that his site has been doing that for some years. to be able to state that his site has been doing that for some years.
(Speculative note: Proxy servers to answer WRU queries are possible. (Speculative note: Proxy servers to answer WRU queries are possible.
If the boundary between the global and site portions of addresses is If the boundary between the global and site portions of addresses is
fixed and/or the boundary between the routing and the end-node fixed and/or the boundary between the routing and the end-node
portions is fixed, then one could define a well-known anycast portions is fixed, then one could define a well-known anycast
address for proxy WRU service per site and/or per subnet. The low- address for proxy WRU service per site and/or per subnet. The low-
order portion of this address would presumably be created from the order portion of this address would presumably be created from the
skipping to change at page 42, line 10 skipping to change at page 41, line 8
(i.e., without coordinating the change with the Site). This would (i.e., without coordinating the change with the Site). This would
make it possible for backbone providers to aggressively renumber the make it possible for backbone providers to aggressively renumber the
Routing Goop part of addresses and achieve a high degree of route Routing Goop part of addresses and achieve a high degree of route
aggregation. On closer examination, frequent (e.g., daily) aggregation. On closer examination, frequent (e.g., daily)
renumbering turns out to be difficult in practice because of a renumbering turns out to be difficult in practice because of a
circular dependency between the DNS and routing. Specifically, if a circular dependency between the DNS and routing. Specifically, if a
Site's Routing Stuff changes, nodes communicating with the Site need Site's Routing Stuff changes, nodes communicating with the Site need
to obtain the new Routing Stuff. In the GSE proposal, one queries the to obtain the new Routing Stuff. In the GSE proposal, one queries the
DNS to obtain this information. However, in order to reach a Site's DNS to obtain this information. However, in order to reach a Site's
DNS servers, the pointers controlling the downward delegation of DNS servers, the pointers controlling the downward delegation of
authoritative DNS servers (i.e., DNS ``glue records'') must use authoritative DNS servers (i.e., DNS "glue records") must use
addresses (with Routing Stuff) that are reachable. That is, in order addresses (with Routing Stuff) that are reachable. That is, in order
to find the address for the web server ``www.foo.bar.com'', DNS to find the address for the web server "www.foo.bar.com", DNS queries
queries might need to be sent to a root DNS server, as well as DNS might need to be sent to a root DNS server, as well as DNS servers
servers for ``bar.com'' and ``foo.bar.com''. Each of these servers for "bar.com" and "foo.bar.com". Each of these servers must be
must be reachable from the querying client. Consequently, there must reachable from the querying client. Consequently, there must be an
be an overlap period during which both the old Routing Stuff and the overlap period during which both the old Routing Stuff and the new
new Routing Stuff can be used simultaneously. During the overlap Routing Stuff can be used simultaneously. During the overlap period,
period, DNS ``glue records'' would need to be updated to use the new DNS glue records would need to be updated to use the new addresses
addresses (including Routing Stuff). Only after all relevant DNS (including Routing Stuff). Only after all relevant DNS servers have
servers have been updated and older cached RRs containing the old been updated and older cached RRs containing the old addresses have
addresses have timed out can the old address be deleted. timed out can the old address be deleted.
An important observation is that the above issue is not specific to An important observation is that the above issue is not specific to
GSE: the same requirement exists with today's provider-based GSE: the same requirement exists with today's provider-based
addressing architecture. When a site is renumbered (e.g., it switches addressing architecture. When a site is renumbered (e.g., it switches
ISPs and obtains a new set of addresses from its new provider), the ISPs and obtains a new set of addresses from its new provider), the
DNS must be updated in a similar fashion. DNS must be updated in a similar fashion.
4.2.2. Efficient DNS support for Site Renumbering 4.2.2. Efficient DNS support for Site Renumbering
When a site renumbers to satisfy its ISP, only the site's routing When a site renumbers to satisfy its ISP, only the site's routing
prefix needs to change. That is, the prefix reflects where within the prefix needs to change. That is, the prefix reflects where within the
Internet the site resides. Although some sites may also change the Internet the site resides. Although some sites may also change the
numbering of their internal topology when switching providers, this numbering of their internal topology when switching providers, this
is not a requirement. Rather, it may be a convenient time to also is not a requirement. Rather, it may be a convenient time to also
perform any desired internal renumbering since one practical view is perform any desired internal renumbering since, in practice, any
that any address renumbering tends to cause disruptions. address renumbering tends to cause disruptions.
In the current Internet, when a site is renumbered, the addresses of In the current Internet, when a site is renumbered, the addresses of
all the site's internal nodes change. This requires a potentially all the site's internal nodes change. This requires a potentially
large update to the RR database for that site. Although Dynamic DNS large update to the RR database for that site. Although Dynamic DNS
[DDNS] could potentially be used, the cost is likely to be large due [DDNS] could potentially be used, the cost is likely to be large due
to the large number of individual records that would need to be to the large number of individual records that would need to be
updated. In addition, when DHCP and DDNS are used together [DHCP/DDNS updated. In addition, when DHCP and DDNS are used together [DHCP-
interaction XXX], it may be the case that individual hosts ``own'' DDNS], it may be the case that individual hosts "own" their own A or
their own A or AAAA records, further complicating the question of who AAAA records, further complicating the question of who is able to
is able to update the contents of DNS RRs. update the contents of DNS RRs.
One change that could reduce the cost of updating the DNS when a site One change that could reduce the cost of updating the DNS when a site
is renumbered is to split addresses into two distinct portions: a is renumbered is to split addresses into two distinct portions: a
Routing Stuff that reflects where a node attaches to the Internet and Routing Goop that reflects where a node attaches to the Internet and
a ``site internal part'' that is the site-specific part of an a "site internal part" that is the site-specific part of an address.
address. During a renumbering, only the Routing Stuff would change;
the ``site internal part'' would remain fixed. Furthermore, the two During a renumbering, only the Routing Goop would change; the "site
parts of the address could be stored in the DNS as separate RRs. That internal part" would remain fixed. Furthermore, the two parts of the
way, renumbering a site would only require that the Routing Stuff RR address could be stored in the DNS as separate RRs. That way,
of a site be updated; the ``site-internal part'' of individual renumbering a site would only require that the Routing Goop RR of a
addresses would not change. site be updated; the "site-internal part" of individual addresses
would not change.
To obtain the address of a node from the DNS, a DNS query for the To obtain the address of a node from the DNS, a DNS query for the
name would return two quantities: the ``site internal part'' and the name would return two quantities: the "site internal part" and the
DNS name of the Routing Stuff for the site. An additional DNS query DNS name of the Routing Stuff for the site. An additional DNS query
would then obtain the specific RR of the site, and the complete would then obtain the specific RR of the site, and the complete
address would be synthesized by concatenating the two pieces of address would be synthesized by concatenating the two pieces of
information. information.
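
The two-step lookup and concatenation described above can be sketched as follows. The record layout, the names, and the 6-byte/10-byte split are assumptions made purely for illustration; the text does not define an RR format, and dictionaries stand in for the DNS.

```python
import ipaddress

# Hypothetical zone data. One RR holds the site's routing part ...
PREFIX_RRS = {"rg.example.net": bytes.fromhex("20010db80001")}  # 6 bytes
# ... and each node RR holds (name of the routing RR, site internal part).
NODE_RRS = {
    "www.example.net": ("rg.example.net",
                        bytes.fromhex("000200000000000000aa")),  # 10 bytes
}


def synthesize(name: str) -> ipaddress.IPv6Address:
    """Build a full 16-byte address from the two separately stored parts."""
    prefix_name, internal = NODE_RRS[name]   # first query
    prefix = PREFIX_RRS[prefix_name]         # additional query
    raw = prefix + internal                  # concatenate the two pieces
    if len(raw) != 16:
        raise ValueError("parts do not add up to a 16-byte address")
    return ipaddress.IPv6Address(raw)


print(synthesize("www.example.net"))

# Renumbering the site touches only the single routing RR.
PREFIX_RRS["rg.example.net"] = bytes.fromhex("20010db8ffff")
print(synthesize("www.example.net"))
```

Only the one routing record changes between the two calls; every node record is left untouched, which is the saving the text describes.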
Implementing these DNS changes increases the practicality of using Implementing these DNS changes increases the practicality of using
Dynamic DNS to update a site's DNS records as it is renumbered. Dynamic DNS to update a site's DNS records as it is renumbered. Only
the site's Routing Goop RRs would need updating.
Finally, it may be useful to divide a node's AAAA RR into the three Finally, it may be useful to divide a node's AAAA RR into the three
logical parts proposed in [GSE], namely RG, STP and ESD. Whether or logical parts of the GSE proposal, namely RG, STP and ESD. Whether or
not it is useful to have separate RRs for the RG and STP portions of not it is useful to have separate RRs for the STP and ESD portions of
an address is an issue that requires further study. an address or a single RR combining both is an issue that requires
further study.
4.2.3. Synthesizing AAAA Records
If AAAA records are comprised of multiple distinct RRs, then one If AAAA records are comprised of multiple distinct RRs, then one
question is who should be responsible for synthesizing the AAAA from question is who should be responsible for synthesizing the AAAA from
its components: the resolver running on the querying client's machine its components: the resolver running on the querying client's machine
or the queried name server? To minimize the impact on client hosts or the queried name server? To minimize the impact on client hosts
and make it easier to deploy future changes, it is recommended that and make it easier to deploy future changes, it is recommended that
the synthesis of AAAA records from their constituent parts be done on the synthesis of AAAA records from their constituent parts be done on
name servers rather than in client resolvers. [this section is really name servers rather than in client resolvers.
weak; what more is needed?]
4.2.4. Two-Faced DNS 4.2.3. Two-Faced DNS
The GSE proposal [GSE] attempts to hide the RG part of addresses from The GSE proposal attempts to hide the RG part of addresses from nodes
nodes within a Site. If the nodes do not know their own RG, then they within a Site. If the nodes do not know their own RG, then they can't
can't store or use them in ways that cause problems should the Site store or use them in ways that cause problems should the Site be
be renumbered and its RG change (i.e., the cached RG become invalid). renumbered and its RG change (i.e., the cached RG become invalid). A
A Site's DNS servers, however, will need to have more information Site's DNS servers, however, will need to have more information about
about the RG its Site uses. Moreover, the responses it returns will the RG its Site uses. Moreover, the responses it returns will depend
depend on who queries the server. A query from a node within the Site on who queries the server. A query from a node within the Site should
should return an address with an RG portion equal to ``Site local,'' return an address with an RG portion equal to "Site local," whereas a
whereas a query for the same name from a client located at a Site query for the same name from a client located at a different Site
elsewhere in the Internet would return the appropriate RG portion. would return the appropriate RG portion. This makes intra-site
Such context-dependent DNS servers are commonly referred to as ``two- communication more resilient to failures outside of the site.
faced DNS servers.'' Such context-dependent DNS servers are commonly referred to as "two-
faced" DNS servers.
Some issues that must be considered in this context: Some issues that must be considered in this context:
1) A DNS server may recursively attempt to resolve a query on 1) A DNS server may recursively attempt to resolve a query on
behalf of a requesting client. Consequently, a DNS query might behalf of a requesting client. Consequently, a DNS query might
be received from a proxy rather from the client that actually be received from a proxy rather than from the client that
seeks the information. Because the proxy may not be located at actually seeks the information. Because the proxy may not be
the same Site as the originating client, a DNS server cannot located at the same Site as the originating client, a DNS server
reliably determine whether a DNS request is coming from the same cannot reliably determine whether a DNS request is coming from
Site or a remote Site. One solution would be to disallow the same Site or a remote Site. One solution would be to
recursive queries for off-Site requesters, though this raises disallow recursive queries for off-Site requesters, though this
additional questions. raises additional questions.
2) Since cached response are, in general, context sensitive, a name
server may be unable to correctly answer a query from its cache,
since the information it has is incomplete. That is, it may have
loaded the information via a query from a local client, and the
information has a Site-local prefix. If a subsequent request
comes in from an off-Site requester, the DNS server cannot
return a correct response (i.e., one containing the correct RG).
[what else needs mentioning?] 2) Since cached responses are, in general, context sensitive, a
name server may be unable to correctly answer a query from its
cache, since the information it has is incomplete. That is, it
may have loaded the information via a query from a local client,
and the information has a Site-local prefix. If a subsequent
request comes in from an off-Site requester, the DNS server
cannot return a correct response (i.e., one containing the
correct RG).
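A minimal sketch of the two-faced behavior follows. It assumes that the server classifies a requester as on-Site purely by source address, that fec0::/10 stands in for the "Site local" RG, and that the RG is 6 bytes wide; all names and values are hypothetical. As issue 1 above notes, a recursive proxy can defeat this source-address test.

```python
import ipaddress

# Hypothetical configuration for one Site.
SITE_LOCAL_NET = ipaddress.ip_network("fec0::/10")   # stand-in for "Site local"
SITE_LOCAL_RG = bytes.fromhex("fec000000000")        # assumed 6-byte RG
PUBLIC_RG = bytes.fromhex("20010db80001")            # assumed 6-byte RG


def answer_for(querier: str, site_internal: bytes) -> ipaddress.IPv6Address:
    """Return the address to hand back, depending on who is asking."""
    source = ipaddress.ip_address(querier)
    # Source-address test only: a query relayed by an off-Site proxy
    # would be misclassified, which is the problem described in issue 1.
    on_site = source in SITE_LOCAL_NET
    rg = SITE_LOCAL_RG if on_site else PUBLIC_RG
    return ipaddress.IPv6Address(rg + site_internal)


node_part = bytes.fromhex("0002" "0200c0fffe123456")
inside_view = answer_for("fec0::2:200:c0ff:fe12:1", node_part)
outside_view = answer_for("2001:db8:ffff::1", node_part)
```

The two views differ only in the RG portion, which is why an answer cached from one kind of requester cannot be served correctly to the other (issue 2).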
4.2.5. Bootstrapping Issues 4.2.4. Bootstrapping Issues
If Routing Stuff information is distributed via the DNS, key DNS If Routing Stuff information is distributed via the DNS, key DNS
servers must always be reachable. In particular, the addresses servers must always be reachable. In particular, the addresses
(including Routing Stuff) of all root DNS servers are, for all (including Routing Stuff) of all root DNS servers are, for all
practical purposes, well-known and assumed to never change. It is not practical purposes, well-known and assumed to never change. It is not
uncommon for the addresses of root servers to be hard-coded into uncommon for the addresses of root servers to be hard-coded into
software distributions. Consequently, the Routing Stuff associated software distributions. Consequently, the Routing Stuff associated
with such addresses must always be usable for reaching root servers. with such addresses must always be usable for reaching root servers.
If it becomes necessary or desirable to change the Routing Stuff of If it becomes necessary or desirable to change the Routing Stuff of
an address at which a root DNS server resides, the routing subsystem an address at which a root DNS server resides, the routing subsystem
will likely need to continue carrying ``exceptions'' for those will likely need to continue carrying "exceptions" for those
addresses. Because the total number of root DNS servers is relatively addresses. Because the total number of root DNS servers is relatively
small, the routing subsystem is expected to be able to handle this small, the routing subsystem is expected to be able to handle this
requirement. requirement.
All other DNS server addresses can be changed, since their addresses All other DNS server addresses can be changed, since their addresses
are typically learned from an upper-level DNS server that has are typically learned from an upper-level DNS server that has
delegated a part of the name space to them. So long as the delegating delegated a part of the name space to them. So long as the delegating
server is configured with the new address, the addresses of other server is configured with the new address, the addresses of other
servers can change. servers can change.
4.2.6. DNS PTR RRs Not Needed 4.2.5. Renumbering and Reverse DNS Lookups
Both IPv4 and IPv6 include a mechanism for mapping addresses into DNS
names (i.e., the PTR RR and IN-ADDR.ARPA and IP6.INT domains
respectively). The need for such a mechanism can be decreased
significantly through the proposed ICMP ``Who Are You'' message (see
Section 4.1.12).
In any case, PTR records can be implemented essentially as today.
Assuming that the Routing Stuff of addresses remains hierarchical,
each ISP would be responsible for delegating its part of the Routing
Stuff to its downstream customers. Eventually, the Routing Stuff
would reach the ISP to which the end site connects. This ISP would
have a pointer to the site's DNS servers which would be responsible for the
rest of the mapping.
4.2.7. Renumbering and Reverse DNS Lookups
It is certain that many sites will from time to time undergo a It is certain that many sites will, from time to time, undergo a
renumbering event, either through the mechanisms proposed for GSE or renumbering event, either through the mechanisms proposed for GSE or
using the facilities already specified for IPv6. It would be useful using the facilities already specified for IPv6. It would be useful
to an outside node corresponding with such a site to be able to to an outside node corresponding with such a site to be able to
distinguish a legitimate renumbering from an attempt to impersonate distinguish a legitimate renumbering from an attempt to impersonate
the site. We claim that the DNS IP6.INT zone, without security the site. We claim that the DNS IP6.INT zone, without security
extensions [RFC2065], is of no use in making this determination and extensions [RFC2065], is of no use in making this determination and
that even a completely secured IP6.INT zone is of little use compared that even a completely secured IP6.INT zone is of little use compared
with the ``forward'' DNS zone. with the "forward" DNS zone.
The first half of the claim is almost self-evident. An impersonator The first half of the claim is almost self-evident. An impersonator
can set up an insecure zone at some point in the IP6.INT hierarchy can set up an insecure zone at some point in the IP6.INT hierarchy
and load it with any desired data. This is the reason that current and load it with any desired data. This is the reason that current
applications doing minimal access control follow a reverse lookup applications doing minimal access control follow a reverse lookup
with a forward lookup. with a forward lookup.
With a secured reverse zone, the problem of verifying an apparent With a secured reverse zone, the problem of verifying an apparent
renumbering of a site can still be quite complex in the general case, renumbering of a site can still be quite complex in the general case,
and will certainly be outside the scope of a transport protocol, if and will certainly be outside the scope of a transport protocol, if
skipping to change at page 46, line 13 skipping to change at page 44, line 45
representing the site. It is then problematic to translate representing the site. It is then problematic to translate
established trust in the old reverse mapping zone into trust in the established trust in the old reverse mapping zone into trust in the
new zone. Certainly it's simpler to rely on the forward zone only. new zone. Certainly it's simpler to rely on the forward zone only.
The only function of the reverse zone, then, is to suggest an entry The only function of the reverse zone, then, is to suggest an entry
point to the forward zone's database. It is this function which we point to the forward zone's database. It is this function which we
propose to achieve by means of a new ICMP message exchange. propose to achieve by means of a new ICMP message exchange.
4.3. Address Rewriting Routers 4.3. Address Rewriting Routers
One of the most novel pieces of GSE is the rewriting of addresses as One of the most novel pieces of GSE is the rewriting of addresses as
datagrams enter and leave Sites. If only a small number of routers datagrams enter and leave sites. If only a small number of routers
know the RG portion of the addresses, then the operational impact of know the RG portion of the addresses, then the operational impact of
renumbering a Site would be small. In fact, assuming that the renumbering a Site would be small. In fact, assuming that the
critical security issues are dealt with, one could imagine a dynamic critical security issues are dealt with, one could imagine a dynamic
protocol that a Site uses with its upstream provider to be told what protocol that a Site uses with its upstream provider to be told what
RG to use, so it might even be possible to renumber a Site RG to use, so it might even be possible to renumber a Site
transparently. transparently.
GSE's ability to ensure that the RG portion of a Site's addresses GSE's ability to insure that the RG portion of a Site's addresses
reflect the actual location of that Site within the Public Internet reflect the actual location of that Site within the Public Internet
means that very aggressive aggregation (i.e., better route scaling) means that very aggressive aggregation (i.e., better route scaling)
can be achieved. Both GSE and other route-scaling approaches that use can be achieved. Both GSE and other route-scaling approaches that use
provider-based addressing depend on aggressive aggregation, but while provider-based addressing depend on aggressive aggregation, but while
other schemes rely largely on operational policies, GSE attempts to other schemes rely largely on operational policies, GSE attempts to
include mechanisms in its core to ensure that aggressive aggregation include mechanisms in its core to insure that aggressive aggregation
happens in practice. happens in practice.
GSE has an advantage over other provider-based addressing schemes GSE has an advantage over other provider-based addressing schemes
like IPv4's CIDR with respect to the ``fair distribution of work.'' like IPv4's CIDR with respect to the "fair distribution of work."
CIDR addresses the scaling of routing in DFZ portions of the CIDR addresses the scaling of routing in DFZ portions of the
Internet, but the cost of carrying out the renumbering to maintain Internet, but the cost of carrying out the renumbering to maintain
the aggregation falls on the shoulders of subscribers who are far the aggregation falls on the shoulders of subscribers who are far
away from the DFZ; in other words, subscribers must do the work of away from the DFZ; in other words, subscribers must do the work of
renumbering so that their provider (or possibly even their provider's renumbering so that their provider (or possibly even their provider's
provider) see better aggregation. With GSE, the majority of the cost provider) sees better aggregation. With GSE, the majority of the cost
required to make the routing scale would be incurred by the parties required to make the routing scale would be incurred by the parties
who reap the benefits. who reap the benefits.
4.3.1. Load Balancing 4.3.1. Load Balancing
While not considered a major advantage, with GSE, multi-homed Sites While not considered a major advantage, with GSE, multi-homed sites
can more easily achieve symmetry with respect to which of their links can more easily achieve symmetry with respect to which of their links
is used for a given flow. With GSE, if HostA in multi-homed Site1 is used for a given flow. With GSE, if HostA in multi-homed Site1
initiates a flow to HostB in Site2, then when the initial packet initiates a flow to HostB in Site2, then when the initial packet
leaves Site1 the source address will be rewritten with an RG that leaves Site1 the source address will be rewritten with an RG that
identifies the outgoing link used. As a result, when HostB needs to identifies the egress link used. As a result, when HostB needs to
send return traffic, it will use the full 16-byte address from the send return traffic, it will use the full 16-byte address from the
arriving packet and this necessarily means that traffic for this flow arriving packet and this necessarily means that traffic for this flow
coming into Site1 will use the same circuit that outgoing traffic for coming into Site1 will use the same circuit that outgoing traffic for
that flow took. that flow took. In contrast, if the source address (i.e., Routing
Stuff) is fixed by the sending host, the same return path is used for
return traffic coming back to a site, regardless of which egress
router packets traverse when leaving that site.
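The rewriting that produces this symmetry can be sketched as follows. The sketch assumes a 6-byte RG in the high-order bytes of the address and uses hypothetical RG values for the two links of Site1; it is an illustration, not a specification of router behavior.

```python
import ipaddress

RG_BYTES = 6  # assumed RG width

# Hypothetical RGs, one per egress link of multi-homed Site1.
RG_LINK1 = bytes.fromhex("20010db80001")
RG_LINK2 = bytes.fromhex("20010db80002")


def rewrite_rg(address: str, egress_rg: bytes) -> ipaddress.IPv6Address:
    """Replace the RG portion of an address, leaving the low-order bytes alone."""
    if len(egress_rg) != RG_BYTES:
        raise ValueError("RG must be %d bytes" % RG_BYTES)
    packed = ipaddress.IPv6Address(address).packed
    return ipaddress.IPv6Address(egress_rg + packed[RG_BYTES:])


# HostA's source address as seen inside Site1 (site-local RG assumed).
host_a = "fec0::2:200:c0ff:fe12:3456"
via_link1 = rewrite_rg(host_a, RG_LINK1)
via_link2 = rewrite_rg(host_a, RG_LINK2)
```

HostB replies to whichever of `via_link1` or `via_link2` it saw as the source, so return traffic enters Site1 over the link that the outgoing packet used.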
4.3.2. End-To-End Argument: Don't Hide RG from Hosts 4.3.2. End-To-End Argument: Don't Hide RG from Hosts
Despite these significant advantages, however, it was felt that Despite these significant advantages, however, the overwhelming
address rewriting by routers should not be pursued as part of the consensus was that address rewriting by routers should not be pursued
current standardization effort. Although hiding RG knowledge from as part of the current standardization effort. Although hiding RG
hosts has advantages in simple scenarios, that lack of knowledge also knowledge from hosts has advantages in some scenarios, that lack of
makes it difficult to solve important problems. knowledge also makes it difficult to solve important problems.
For example, a host in a multi-homed site is known by multiple For example, a host in a multi-homed site is known by multiple
addresses, but without knowing its address the host can play no role addresses, but without knowing its address the host can play no role
in the source address selection; instead, the host relies on the in the source address selection; instead, the host relies on the
routing infrastructure to magically select the right one, i.e., by routing infrastructure to magically select the right one, i.e., by
selecting the egress router the packet uses to leave the site. In selecting the egress router closest to the sender. For many sites,
this particular case, the historically difficult-to-solve problem of this is the desired behavior. For others, this is not the desired
source address selection is made more difficult by moving it from an behavior. In those cases, the historically difficult-to-solve problem
intra-host decision to a distributed one. Now a site's internal of source address selection is made more difficult by moving it from
an intra-host decision to a distributed one. Now a site's internal
routers would have to have sufficient knowledge to decide which routers would have to have sufficient knowledge to decide which
egress router to forward traffic to, perhaps on a source-by-source egress router to forward traffic to, perhaps on a source-by-source
(or worse) basis. Another end-to-end problem resulting from address (or worse) basis.
rewriting has to do with how transport connections should deal with
the RG portion of the address in incoming packets, particularly when Another end-to-end problem resulting from address rewriting has to do
the RG changes. The sections on transport issues deal with the with how transport connections should deal with the RG portion of the
subject in much more detail. address in incoming packets, particularly when authenticating the RG
changes. The sections on transport issues deal with the subject in
much more detail.
Interesting questions arise about address rewriting when dealing with Interesting questions arise about address rewriting when dealing with
tunnels. It was realized that any node that can terminate a tunnel tunnels. Any node that acts as a tunnel for which the other end
whose other end point is in a different Site must be able to behave resides in a different Site must be able to behave as a Site border
as a Site border router and do address rewriting. This means that the router and do address rewriting. This means that the RG may need to
RG may need to be configured in more than just a Site's egress be configured in more than just a Site's egress router, thus making
router, thus making renumbering more problematic. Another problem renumbering more problematic.
related to both performance and ``architectural cleanliness'' has to
do with IPv6's Routing Headers. It may be necessary for addresses
other than just the simple source and destination to be rewritten.
And again, this rewriting would need to be done by both egress
routers and nodes which terminate tunnels that go to other Sites.
4.4. Multi-homing Another problem related to both performance and "architectural
cleanliness" has to do with IPv6's Routing Headers. It may be
necessary for addresses other than just the simple source and
destination to be rewritten. And again, this rewriting would need to
be done by both egress routers and nodes which terminate tunnels that
go to other sites.
Multi-homing can mean many things. In the context of GSE, multi- 4.4. Multi-Homing
Multi-Homing can mean many things. In the context of GSE, multi-
homing refers to a Site having more than one connection to the homing refers to a Site having more than one connection to the
Internet and being known by multiple RGs. In many ways this is close Internet and therefore being known by multiple RGs. In many ways this
to multi-homing with IPv6 provider-based addressing. It is hard to is close to multi-homing with IPv6 provider-based addressing. It is
make comparisons to IPv4 because multi-homing has traditionally been hard to make comparisons to IPv4 because multi-homing has
done in an ad hoc fashion. traditionally been done in an ad hoc fashion.
With GSE, the ability of a Site to control the load-sharing over its With GSE, the ability of a Site to control the load-sharing over its
multiple links is not clear, partially because there is little multiple links is not clear, partially because there is little
operational experience with multi-homed sites known by multiple operational experience with multi-homed sites known by multiple
prefixes (with IPv4 the site is generally only known by a single prefixes (with IPv4 the site is generally only known by a single
prefix). The following analysis is relevant to any scheme where an prefix). The following analysis is relevant to any scheme where an
Internet-connected site is known by multiple prefixes. For flows that Internet-connected site is known by multiple prefixes. For flows that
the multi-homed site initiates, load-sharing is impacted by the the multi-homed site initiates, load-sharing is impacted by the
source address used because that is the address that the remote site source address used because that is the address that the remote site
will use for return traffic. If we assume the model of routers will use for return traffic. If we assume the model of routers
skipping to change at page 48, line 31 skipping to change at page 47, line 20
address selection, and the optimal choice may require knowledge of address selection, and the optimal choice may require knowledge of
the topology. For flows initiated by someone outside of the multi- the topology. For flows initiated by someone outside of the multi-
homed site, the load-sharing is dependent on the destination address homed site, the load-sharing is dependent on the destination address
specified, so the DNS has a large impact on load-sharing. There is specified, so the DNS has a large impact on load-sharing. There is
some amount of operational experience in using DNS to control load on some amount of operational experience in using DNS to control load on
servers (e.g., having a Web server resolve to multiple addresses), servers (e.g., having a Web server resolve to multiple addresses),
though that is load-sharing of a different resource and at a though that is load-sharing of a different resource and at a
different scope and scale. It is also worth noting that the selection different scope and scale. It is also worth noting that the selection
of the optimal outgoing link may well depend on the destination, of the optimal outgoing link may well depend on the destination,
which has particularly interesting results on the DNS understanding which has particularly interesting results on the DNS understanding
topology (and brings up the question of whether the servers or the topology (and brings up the question of whether the DNS servers or
resolvers are responsible for knowing the topology). the resolvers are responsible for knowing the topology).
One advantage that GSE has for multi-homed Sites is symmetry. Because One advantage that GSE has for multi-homed sites is symmetry. Because
the source address is selected based on the outgoing link, and that the source address is selected based on the outgoing link, and that
source address is what determines the return path, flows initiated by source address is what determines the return path, flows initiated by
the Site will be symmetric with respect to which of the Site's links the Site will be symmetric with respect to which of the Site's links
is used. is used.
The multi-homing mechanism described in Section 3.2.4 has some The multi-homing mechanism described in Section 3.7 has some
weaknesses and complexities. First, the mechanism only supports weaknesses and complexities. First, the mechanism only supports
healing a failed link and not a router; in other words, referencing healing a failed link and not a router; in other words, referencing
Figure 6, from Section 3.2.4, if PBR1 were not up at all, then it Figure 7, from Section 3.7, if PBR1 were not up at all, then it could
could not tunnel the packets anywhere. One could imagine ways of not tunnel the packets anywhere. One could imagine ways of
distributing PBR1's knowledge of PBR2 to other routers within distributing PBR1's knowledge of PBR2 to other routers within
Provider1 to add more reliability, though this makes the problem Provider1 to add more reliability, though this makes the problem
distributed rather than point-to-point and therefore more difficult. distributed rather than point-to-point and therefore more difficult.
Second, in the general case static identification of PBR2 to PBR1, Second, in the general case, static identification of PBR2 to PBR1,
and vice-versa, is not adequate. Imagine, for example, that the link and vice-versa, is not adequate. Imagine, for example, that the link
to PBR1 is much faster than the link to PBR2. In this case, it's to PBR1 is much faster than the link to PBR2. In this case, it's
possible that packets whose destination addresses contain RG1 might possible that packets whose destination addresses contain RG1 might
normally transit PBR2 without going directly to the Site. So there normally transit PBR2 without going directly to the Site. So there
seems to be a need for a dynamic protocol between PBR1 and PBR2 to seems to be a need for a dynamic protocol between PBR1 and PBR2 to
notify when PBR2, for example, should forward RG1-prefaced notify when PBR2, for example, should forward RG1-prefaced
destinations directly to the Site as opposed to forwarding it towards destinations directly to the Site as opposed to forwarding it towards
PBR1. PBR1.
Another note about multi-homing is the potential impact of internal Another note about multi-homing is the potential impact of internal
topology changes in the face of address rewriting. Using the topology changes in the face of address rewriting. Using the
previously referenced diagram, if a flow from a host within the Site previously referenced diagram, if a flow from a host within the Site
is leaving via SBR1, but then something happens such that SBR2 is leaving via SBR1, but then something happens such that SBR2
becomes the host's closest exit point, then the remote end point of becomes the host's closest exit point, then the remote end-point of
the flow will begin seeing a different RG. Reasons such as this are why the flow will begin seeing a different RG. Reasons such as this are why
the repercussions on the transport layer are so important (e.g., the repercussions on the transport layer are so important (e.g.,
whether or not transport peers pay attention to the RG). whether or not transport peers pay attention to the RG).
5. Recommendations 5. Results
This section should be viewed as ``proto recommendations'' and not
final recommendations. It is impossible to have final recommendations
until there exists an analysis on which there is consensus.
A straw-man set of recommendations, along with some related open This section summarizes the results of the GSE deliberations on the
questions, is presented below: IPv6 process.
1) Make changes to the IPv6 provider-based addressing document to 1) Make changes to the IPv6 provider-based addressing document to
facilitate aggressive aggregation that is also operationally facilitate aggressive aggregation that is also operationally
realistic. realistic.
2) Create hard boundaries in IPv6 addresses to clearly distinguish 2) Create hard boundaries in IPv6 addresses to clearly distinguish
between the portions used to identify hosts, for routing within between the portions used to identify hosts, for routing within
a site, and for routing within the Public Internet. a site, and for routing within the Public Internet.
3) Designate the low-order 8 bytes of IPv6 addresses to be a 3) Allow an option for the low-order 8 bytes of IPv6 addresses to
globally unique End System Designator (ESD). This change has be designated as a globally unique End System Designator (ESD).
potential benefits to future transport protocols (e.g., TCPng). This change has potential benefits to future transport protocols
A point of discussion on this topic is whether, in the short- (e.g., TCPng).
term, the ESD will be used alone; if it isn't to be used alone,
then how important is the global uniqueness?
4) Make a clear distinction between the ``locator'' part of an 4) Make a clear distinction between the "locator" part of an
address and the ``identifier'' part of the address. The former address and the "identifier" part of the address. The former is
is used to route a packet to its end point, the latter is used used to route a packet to its end-point, the latter is used to
to identify an end point, independent of the path used to identify an end-point, independent of the path used to deliver
deliver the packet. Although this is a potentially revolutionary the packet. Although this is a potentially revolutionary change
change to the IPv6 addressing model, existing transport protocols to the IPv6 addressing model, existing transport protocols such as
such as TCP and UDP will not take advantage of the split. Future TCP and UDP will not take advantage of the split. Future
transport protocols (e.g., TCPng), however, may. transport protocols (e.g., TCPng), however, may.
5) Make changes to the way AAAA records are stored within the DNS, 5) Make changes to the way AAAA records are stored within the DNS,
so that renumbering a site (e.g., when a site changes ISPs) so that renumbering a site (e.g., when a site changes ISPs)
requires few changes to the DNS database in order to effectively requires few changes to the DNS database in order to effectively
change all of a site's address AAAA RRs. change all of a site's address AAAA RRs.
6) Don't hide a node's full address from that node. In a scheme 6) Don't hide a node's full address from that node. In a scheme
where all nodes know their full address, address rewriting where all nodes know their full address, address rewriting
should not be necessary. should not be necessary.
7) Consider multi-homing and its effect on aggregation and route 7) Consider multi-homing and its effect on aggregation and route
scaling from the beginning. Have a goal of architecting a way to scaling from the beginning. Have a goal of architecting a way to
do multi-homing that is both scalable and operationally do multi-homing that is both scalable and operationally
practical, and consider related issues such as load-sharing. practical, and consider related issues such as load-sharing.
8) Consider the issue of subnetting. For example, how are point- 8) Consider the issue of subnetting. For example, how are point-
to-point links numbered? With IPv4, current practice is to to-point links numbered? With IPv4, current practice is to
number point-to-point links out of ``/30'' subnets. However, do number point-to-point links out of "/30" subnets. However, do
network masks longer than 64 bits make sense with the concept of network masks longer than 64 bits make sense with the concept of
the low-order 8 bytes being a globally unique ESD? If not, then the low-order 8 bytes being a globally unique ESD? If not, then
is it acceptable to either leave point-to-point links un- is it acceptable to either leave point-to-point links un-
numbered or to use an entire subnet for each point-to-point numbered or to use an entire subnet for each point-to-point
link? Will there need to be an exception for IPv6 host routes link? Will there need to be an exception for IPv6 host routes
(i.e., /128s) as a work-around for the bootstrapping issue of (i.e., /128s) as a work-around for the bootstrapping issue of
addressing root DNS servers? If /128s are allowed, but not masks addressing root DNS servers? If /128s are allowed, but not masks
between /65 and /127, inclusive, then a possible way to number between /65 and /127, inclusive, then a possible way to number
point-to-point links within a backbone is to dedicate a single point-to-point links within a backbone is to dedicate a single
subnet to them and route them as /128s. subnet to them and route them as /128s.
9) Search for ways to minimize the impact that renumbering can have 9) Search for ways to minimize the impact that renumbering has on
on intra-site communication. Renumbering operations that change intra-site communication. Renumbering operations that change
only the RG portion of addresses should not impact existing only the RG portion of addresses should not impact existing
intra-site communication. One possible approach is to encourage intra-site communication. One possible approach is to encourage
the use of site-local addresses for all intra-site the use of site-local addresses for all intra-site
communication. communication.
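As an illustration of the numbering option raised in item 8, the following is a minimal sketch, not part of the GSE proposal, of dedicating a single /64 subnet to backbone point-to-point links and routing each endpoint as a /128. The function names, the documentation prefix 2001:db8:0:ffff::/64, and the sequential low-order values are all assumptions made for this example; under GSE the low-order 8 bytes would be each node's ESD rather than a counter.

```python
import ipaddress


def is_allowed_prefix_length(prefix_len: int) -> bool:
    """Hypothetical policy from item 8: masks of /64 or shorter and
    host routes (/128) are allowed; /65 through /127 are not."""
    return 0 <= prefix_len <= 64 or prefix_len == 128


def number_p2p_links(subnet: str, link_count: int):
    """Assign two /128 host routes per point-to-point link, all drawn
    from one dedicated /64 subnet.

    The low-order values 1, 2, 3, ... are placeholders standing in for
    real ESDs (an assumption of this sketch).
    """
    net = ipaddress.IPv6Network(subnet)
    if net.prefixlen != 64:
        raise ValueError("the dedicated point-to-point subnet must be a /64")
    links = []
    for i in range(link_count):
        end_a = net.network_address + (2 * i + 1)
        end_b = net.network_address + (2 * i + 2)
        links.append((
            ipaddress.IPv6Network(f"{end_a}/128"),
            ipaddress.IPv6Network(f"{end_b}/128"),
        ))
    return links


if __name__ == "__main__":
    for end_a, end_b in number_p2p_links("2001:db8:0:ffff::/64", 3):
        print(end_a, "<->", end_b)
```

Each link consumes only two host routes instead of an entire subnet, at the cost of carrying those /128s in the backbone's interior routing.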
6. Security Considerations 6. Security Considerations
TBD The primary security consideration with GSE or, more generally, a
network layer with addresses split into locator and identifier parts,
is that of one node impersonating another by copying the
identification without the location.
7. Acknowledgments 7. Acknowledgments
Thanks go to Steve Deering and Bob Hinden (the Chairs of the IPng Thanks go to Steve Deering and Bob Hinden (the Chairs of the IPng
Working Group) as well as Sun Microsystems (the host for the meeting) Working Group) as well as Sun Microsystems (the host for the PAL1
for the planning and execution of the interim meeting. Thanks also meeting) for the planning and execution of the interim meeting.
go to Mike O'Dell for writing the 8+8 and GSE drafts. By publishing Thanks also go to Mike O'Dell for writing the 8+8 and GSE drafts.
these documents and speaking on their behalf, Mike was the catalyst By publishing these documents and speaking on their behalf, Mike was
for some very valuable discussions that are expected to result in the catalyst for some very valuable discussions that are expected to
improved IPv6 addressing. Special thanks to the attendees of the result in improved IPv6 addressing. Special thanks to the attendees
meeting who carried on the high caliber discussions which were the of the meeting who carried on the high caliber discussions which were
source for this document. the source for this document.
8. References 8. References
[DANVERS] Minutes of the IPNG working Group, April 1995. [BATES] Scalable support for multi-homed multi-provider
ftp://ftp.ietf.cnri.reston.va.us/ietf-online-proceedings/ connectivity, Internet Draft, Tony Bates & Yakov Rekhter,
95apr/area.and.wg.reports/ipng/ipngwg/ ipngwg-minutes- draft-bates-multihoming-01.txt.
95apr.txt.
[EUI64] 64-Bit Global Identifier Format Tutorial.
http://standards.ieee.org/db/oui/tutorials/EUI64.html.
Note: ``EUI-64'' is claimed as a trademark by an
organization which also forbids reference to itself in
association with that term in a standards document which is
not their own, unless they have approved that reference.
However, since this document is not standards-track, it
seems safe to name that organization: the IEEE.
[Bellovin 89] ``Security Problems in the TCP/IP Protocol Suite'', [Bellovin 89] "Security Problems in the TCP/IP Protocol Suite",
Bellovin, Steve, Computer Communications Review, Vol. 19, Bellovin, Steve, Computer Communications Review, Vol. 19,
No. 2, pp. 32-48, April 1989. No. 2, pp. 32-48, April 1989.
[CERT] CERT(sm) Advisory CA-96.21 [CERT] CERT(sm) Advisory CA-96.21
(ftp://info.cert.org/pub/cert_advisories) (ftp://info.cert.org/pub/cert_advisories)
[DDNS] ``Dynamic Updates in the Domain Name System (DNS UPDATE)'', [DANVERS] Minutes of the IPNG working Group, April 1995.
ftp://ftp.ietf.cnri.reston.va.us/ietf-online-proceedings/
95apr/area.and.wg.reports/ipng/ipngwg/ ipngwg-minutes-
95apr.txt.
[DHCP-DDNS] Interaction between DHCP and DNS, Internet Draft, Yakov
Rekhter, draft-ietf-dhc-dhcp-dns-04.txt.
[DDNS] "Dynamic Updates in the Domain Name System (DNS UPDATE)",
Paul Vixie (Editor), draft-ietf-dnsind-dynDNS-11.txt, Paul Vixie (Editor), draft-ietf-dnsind-dynDNS-11.txt,
November, 1996. November, 1996.
[GSE] ``GSE - An Alternate Addressing Architecture for IPv6', Mike
[EUI64] 64-Bit Global Identifier Format Tutorial.
http://standards.ieee.org/db/oui/tutorials/EUI64.html.
Note: "EUI-64" is claimed as a trademark by an organization
which also forbids reference to itself in association with
that term in a standards document which is not their own,
unless they have approved that reference. However, since
this document is not standards-track, it seems safe to name
that organization: the IEEE.
[GSE] "GSE - An Alternate Addressing Architecture for IPv6", Mike
O'Dell, draft-ietf-ipngwg-gseaddr-00.txt. O'Dell, draft-ietf-ipngwg-gseaddr-00.txt.
[IEEE802] IEEE Std 802-1990, Local and Metropolitan Area Networks: [IEEE802] IEEE Std 802-1990, Local and Metropolitan Area Networks:
IEEE Standard Overview and Architecture. IEEE Standard Overview and Architecture.
[RFC1122] ``Requirements for Internet hosts - communication
layers'', R. Braden, 10/01/1989.
[IEEE1212] IEEE Std 1212-1994, Information technology-- [IEEE1212] IEEE Std 1212-1994, Information technology--
Microprocessor systems: Control and Status Registers (CSR) Microprocessor systems: Control and Status Registers (CSR)
Architecture for microcomputer buses. Architecture for microcomputer buses.
[RFC1122] "Requirements for Internet hosts - communication layers",
R. Braden, 10/01/1989.
[RFC1715] The H Ratio for Address Assignment Efficiency. C. [RFC1715] The H Ratio for Address Assignment Efficiency. C.
Huitema. Huitema.
[RFC1726] Technical Criteria for Choosing IP:The Next Generation [RFC1726] Technical Criteria for Choosing IP:The Next Generation
(IPng). F. Kastenholz, C. Partridge. (IPng). F. Kastenholz, C. Partridge.
[RFC1752] ``The Recommendation for the IP Next Generation [RFC1752] "The Recommendation for the IP Next Generation Protocol,"
Protocol,'' S. Bradner, A. Mankin, 01/18/1995. S. Bradner, A. Mankin, 01/18/1995.
[RFC1788] ``ICMP Domain Name Messages'', W. Simpson, 04/14/1995 [RFC1788] "ICMP Domain Name Messages", W. Simpson, 04/14/1995
[RFC1958] Architectural Principles of the Internet. B. Carpenter. [RFC1958] Architectural Principles of the Internet. B. Carpenter.
[RFC1971] IPv6 Stateless Address Autoconfiguration. S. Thomson, T. [RFC1971] IPv6 Stateless Address Autoconfiguration. S. Thomson, T.
Narten. Narten.
[RFC2002] ``IP Mobility Support'', 10/22/1996, C. Perkins. [RFC2002] "IP Mobility Support", 10/22/1996, C. Perkins.
[RFC2008] ``Implications of Various Address Allocation Policies for [RFC2008] "Implications of Various Address Allocation Policies for
Internet Routing'', Y. Rekhter, T. Li. Internet Routing", Y. Rekhter, T. Li.
[RFC2065] Domain Name System Security Extensions. D. Eastlake, C. [RFC2065] Domain Name System Security Extensions. D. Eastlake, C.
Kaufman. Kaufman.
[RFC2073] An IPv6 Provider-Based Unicast Address Format. Y. [RFC2073] An IPv6 Provider-Based Unicast Address Format. Y.
Rekhter, P. Lothberg, R. Hinden, S. Deering, J. Postel Rekhter, P. Lothberg, R. Hinden, S. Deering, J. Postel
9. Authors' Addresses 9. Authors' Addresses
Matt Crawford John Stewart Matt Crawford John Stewart
 End of changes. 253 change blocks. 
1130 lines changed or deleted 1084 lines changed or added
