draft-ietf-ltru-matching-02.txt   draft-ietf-ltru-matching-03.txt 
Network Working Group A. Phillips, Ed. Network Working Group A. Phillips, Ed.
Internet-Draft Quest Software Internet-Draft Quest Software
Expires: December 12, 2005 M. Davis, Ed. Expires: December 30, 2005 M. Davis, Ed.
IBM IBM
June 10, 2005 June 28, 2005
Matching Language Identifiers Matching Tags for the Identification of Languages
draft-ietf-ltru-matching-02 draft-ietf-ltru-matching-03
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 35 skipping to change at page 1, line 35
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 12, 2005. This Internet-Draft will expire on December 30, 2005.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2005). Copyright (C) The Internet Society (2005).
Abstract Abstract
This document describes different mechanisms for comparing and This document describes different mechanisms for comparing, matching,
matching the tags for the identification of languages defined by [RFC and evaluating language tags. Possible algorithms for language
3066bis] [1]. Possible algorithms for language negotiation and negotiation and content selection are described.
content selection are described. This document obsoletes portions of
[RFC 3066] [19].
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. The Language Range . . . . . . . . . . . . . . . . . . . . . . 4 2. The Language Range . . . . . . . . . . . . . . . . . . . . . . 4
2.1 Basic Language Range . . . . . . . . . . . . . . . . . . . 4 2.1 Basic Language Range . . . . . . . . . . . . . . . . . . . 4
2.1.1 Matching . . . . . . . . . . . . . . . . . . . . . . . 5 2.1.1 Matching . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.2 Lookup . . . . . . . . . . . . . . . . . . . . . . . . 6 2.1.2 Lookup . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Extended Language Range . . . . . . . . . . . . . . . . . 6 2.2 Extended Language Range . . . . . . . . . . . . . . . . . 6
2.2.1 Extended Range Matching . . . . . . . . . . . . . . . 7 2.2.1 Extended Range Matching . . . . . . . . . . . . . . . 7
skipping to change at page 2, line 28 skipping to change at page 2, line 28
2.5 Considerations for Private Use Subtags . . . . . . . . . . 12 2.5 Considerations for Private Use Subtags . . . . . . . . . . 12
2.6 Length Considerations in Matching . . . . . . . . . . . . 12 2.6 Length Considerations in Matching . . . . . . . . . . . . 12
3. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 3. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14
4. Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 4. Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
5. Security Considerations . . . . . . . . . . . . . . . . . . . 16 5. Security Considerations . . . . . . . . . . . . . . . . . . . 16
6. Character Set Considerations . . . . . . . . . . . . . . . . . 17 6. Character Set Considerations . . . . . . . . . . . . . . . . . 17
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18
7.1 Normative References . . . . . . . . . . . . . . . . . . . 18 7.1 Normative References . . . . . . . . . . . . . . . . . . . 18
7.2 Informative References . . . . . . . . . . . . . . . . . . 19 7.2 Informative References . . . . . . . . . . . . . . . . . . 19
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 19 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 19
A. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 20 A. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 21
Intellectual Property and Copyright Statements . . . . . . . . 21 Intellectual Property and Copyright Statements . . . . . . . . 22
1. Introduction 1. Introduction
Human beings on our planet have, past and present, used a number of Human beings on our planet have, past and present, used a number of
languages. There are many reasons why one would want to identify the languages. There are many reasons why one would want to identify the
language used when presenting or requesting information. language used when presenting or requesting information.
Information about a user's language preferences commonly needs to be Information about a user's language preferences commonly needs to be
identified so that appropriate processing can be applied. For identified so that appropriate processing can be applied. For
example, the user's language preferences in a browser can be used to example, the user's language preferences in a browser can be used to
select web pages appropriately. A choice of language preference can select web pages appropriately. A choice of language preference can
also be used to select among tools (such as dictionaries) to assist also be used to select among tools (such as dictionaries) to assist
in the processing or understanding of content in different languages. in the processing or understanding of content in different languages.
Given a set of language identifiers, such as those defined in Given a set of language identifiers, such as those defined in
RFC3066bis [1], various mechanisms can be envisioned for performing [ID.ietf-ltru-registry], various mechanisms can be envisioned for
language negotiation and tag matching. The suitability of a performing language negotiation and tag matching. The suitability of
particular mechanism to a particular application depends on the needs a particular mechanism to a particular application depends on the
of that application. needs of that application.
This document defines language ranges and syntax for specifying user This document defines language ranges and syntax for specifying user
preferences in a request for language content. It also specifies preferences in a request for language content. It also specifies
various schemes and mechanisms that can be used with language ranges various schemes and mechanisms that can be used with language ranges
when matching or filtering content based on language tags. when matching or filtering content based on language tags.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [5]. document are to be interpreted as described in [RFC2119].
2. The Language Range 2. The Language Range
Language Tags are used to identify the language of some information Language Tags are used to identify the language of some information
item or content. Applications that use language tags are often faced item or content. Applications that use language tags are often faced
with the problem of identifying sets of content that share certain with the problem of identifying sets of content that share certain
language attributes. For example, HTTP 1.1 [10] describes language language attributes. For example, HTTP 1.1 [RFC2616] describes
ranges in its discussion of the Accept-Language header (Section language ranges in its discussion of the Accept-Language header
14.4), which is used for selecting content from servers based on the (Section 14.4), which is used for selecting content from servers
language of that content. based on the language of that content.
When selecting content according to its language, it is useful to When selecting content according to its language, it is useful to
have a mechanism for identifying sets of language tags that share have a mechanism for identifying sets of language tags that share
specific attributes. This allows users to select or filter content specific attributes. This allows users to select or filter content
based on specific requirements. Such an identifier is called a based on specific requirements. Such an identifier is called a
"Language Range". "Language Range".
2.1 Basic Language Range 2.1 Basic Language Range
A basic language range (such as described in RFC 3066 [19] and HTTP A basic language range (such as described in [RFC3066] and HTTP 1.1
1.1 [10]) is a set of languages whose tags all begin with the same [RFC2616]) is a set of languages whose tags all begin with the same
sequence of subtags. A basic language range can be represented by a sequence of subtags. A basic language range can be represented by a
'language-range' tag, by using the definition from HTTP/1.1 [10]: 'language-range' tag, by using the definition from HTTP/1.1 [RFC2616]:
language-range = language-tag / "*" language-range = language-tag / "*"
That is, a language-range has the same syntax as a language-tag or is That is, a language-range has the same syntax as a language-tag or is
the single character "*". This definition of language-range implies the single character "*". This definition of language-range implies
that there is a semantic relationship between tags that share the that there is a semantic relationship between tags that share the
same prefix. same prefix.
In particular, the set of language tags that match a specific In particular, the set of language tags that match a specific
language-range might not all be mutually intelligible. The use of a language-range might not all be mutually intelligible. The use of a
prefix when matching tags to language ranges does not imply that prefix when matching tags to language ranges does not imply that
skipping to change at page 5, line 23 skipping to change at page 5, line 23
on the other end of the protocol would make use of that on the other end of the protocol would make use of that
information. information.
3. Some applications of language tags might want or need to consider 3. Some applications of language tags might want or need to consider
extensions and private-use subtags when matching tags. If extensions and private-use subtags when matching tags. If
extensions and private-use subtags are included in a matching or extensions and private-use subtags are included in a matching or
filtering process that utilizes one of the schemes described filtering process that utilizes one of the schemes described
in this document, then the implementation SHOULD canonicalize the in this document, then the implementation SHOULD canonicalize the
language tags and/or ranges before performing the matching. Note language tags and/or ranges before performing the matching. Note
that language tag processors that claim to be "well-formed" that language tag processors that claim to be "well-formed"
processors as defined in [1] generally fall into this category. processors as defined in [ID.ietf-ltru-registry] generally fall
into this category.
There are two matching schemes that are commonly associated with There are two matching schemes that are commonly associated with
basic language ranges: matching and lookup. basic language ranges: matching and lookup.
2.1.1 Matching 2.1.1 Matching
Language tag matching is used to select all content that matches a Language tag matching is used to select all content that matches a
given prefix. In matching, the language range represents the least given prefix. In matching, the language range represents the least
specific tag which is an acceptable match and every piece of content specific tag which is an acceptable match and every piece of content
that matches is returned. that matches is returned.
For example, if an application is applying a style to all content in For example, if an application is applying a style to all content in
a web page in a particular language, it might use language tag a web page in a particular language, it might use language tag
matching to select the content to which the style is applied. matching to select the content to which the style is applied.
A language-range matches a language-tag if it exactly equals the tag, A language-range matches a language-tag if it exactly equals the tag,
or if it exactly equals a prefix of the tag such that the first or if it exactly equals a prefix of the tag such that the first
character following the prefix is "-". (That is, the language-range character following the prefix is "-". (That is, the language-range
"en-de" matches the language tag "en-DE-boont", but not the language "de-de" matches the language tag "de-DE-1996", but not the language
tag "en-Deva".) tag "de-Deva".)
The special range "*" matches any tag. A protocol which uses The special range "*" matches any tag. A protocol which uses
language ranges MAY specify additional rules about the semantics of language ranges MAY specify additional rules about the semantics of
"*"; for instance, HTTP/1.1 specifies that the range "*" matches only "*"; for instance, HTTP/1.1 specifies that the range "*" matches only
languages not matched by any other range within an "Accept-Language:" languages not matched by any other range within an "Accept-Language:"
header. header.
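The prefix rule in Section 2.1.1 can be sketched in Python as follows. This is a minimal illustration and is not text from either draft revision: the function name "basic_range_matches" is hypothetical, and the case-insensitive comparison is an assumption suggested by the example in which the range "de-de" matches the tag "de-DE-1996". Protocol-specific semantics for "*" (such as those of HTTP/1.1) are not modeled.

```python
def basic_range_matches(language_range: str, language_tag: str) -> bool:
    """Return True if a basic language range matches a language tag.

    Hypothetical helper: a sketch of the prefix rule, not a normative
    implementation.
    """
    # The special range "*" matches any tag.
    if language_range == "*":
        return True
    # Assumption: ranges and tags are compared without regard to case.
    rng = language_range.lower()
    tag = language_tag.lower()
    # Match on exact equality, or on a prefix that is immediately
    # followed by "-" so that a subtag is never matched partially.
    return tag == rng or tag.startswith(rng + "-")
```

Requiring the "-" after the prefix is what keeps "de-de" from matching "de-Deva": the comparison always ends on a subtag boundary.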
2.1.2 Lookup 2.1.2 Lookup
Content lookup is used to select the single information item that Content lookup is used to select the single information item that
skipping to change at page 7, line 8 skipping to change at page 7, line 8
not always the most appropriate way to access the information not always the most appropriate way to access the information
contained in language tags when selecting or filtering content. Some contained in language tags when selecting or filtering content. Some
applications might wish to define a more granular matching scheme and applications might wish to define a more granular matching scheme and
such a matching scheme requires the ability to specify the various such a matching scheme requires the ability to specify the various
attributes of a language tag in the language range. An extended attributes of a language tag in the language range. An extended
language range can be represented by the following ABNF: language range can be represented by the following ABNF:
extended-language-range = grandfathered / privateuse / range extended-language-range = grandfathered / privateuse / range
range = ( lang [ "-" script ] [ "-" region ] *( "-" variant ) range = ( lang [ "-" script ] [ "-" region ] *( "-" variant )
[ "-" privateuse ] ) [ "-" privateuse ] )
lang = ( 2*8ALPHA *[ "-" extlang ] ) / "*" lang = 2*8ALPHA / extlang / "*"
extlang = 3ALPHA / "*" extlang = 2*3ALPHA *2("-" 3ALPHA) ( "-" ( 3ALPHA / "*" ) )
script = 4ALPHA / "*" script = 4ALPHA / "*"
region = 2ALPHA / 3DIGIT / "*" region = 2ALPHA / 3DIGIT / "*"
variant = 5*8alphanum / ( DIGIT 3alphanum ) / "*" variant = 5*8alphanum / ( DIGIT 3alphanum ) / "*"
privateuse = ( "x" / "X" ) 1*( "-" ( 1*8alphanum ) ) privateuse = ( "x" / "X" ) 1*( "-" ( 1*8alphanum ) )
grandfathered = 1*3ALPHA 1*2( "-" ( 2*8alphanum ) ) grandfathered = 1*3ALPHA 1*2( "-" ( 2*8alphanum ) )
alphanum = ( ALPHA / DIGIT ) alphanum = ( ALPHA / DIGIT )
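As an illustration only, the productions that are identical in both revisions (script, region, variant, and privateuse) might be transcribed into regular expressions as shown below. The names "PRODUCTIONS" and "subtag_is" are hypothetical, and the 'lang' and 'extlang' rules are deliberately omitted because they differ between the two revisions. This sketch checks syntax only; it does not implement extended range matching.

```python
import re

# One character of the ABNF 'alphanum' production.
ALPHANUM = r"[A-Za-z0-9]"

# Hypothetical table: each pattern mirrors one ABNF production that is
# the same in both draft revisions.
PRODUCTIONS = {
    # script = 4ALPHA / "*"
    "script": re.compile(r"(?:[A-Za-z]{4}|\*)"),
    # region = 2ALPHA / 3DIGIT / "*"
    "region": re.compile(r"(?:[A-Za-z]{2}|[0-9]{3}|\*)"),
    # variant = 5*8alphanum / ( DIGIT 3alphanum ) / "*"
    "variant": re.compile(rf"(?:{ALPHANUM}{{5,8}}|[0-9]{ALPHANUM}{{3}}|\*)"),
    # privateuse = ( "x" / "X" ) 1*( "-" ( 1*8alphanum ) )
    "privateuse": re.compile(rf"[xX](?:-{ALPHANUM}{{1,8}})+"),
}


def subtag_is(production: str, text: str) -> bool:
    """Return True if 'text' fully matches the named production."""
    return PRODUCTIONS[production].fullmatch(text) is not None
```

For instance, subtag_is("region", "419") and subtag_is("variant", "1996") both hold under this transcription, while subtag_is("script", "Lat") does not.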
In an extended language range, the identifier takes the form of a In an extended language range, the identifier takes the form of a
series of subtags which must consist of well-formed subtags or the series of subtags which must consist of well-formed subtags or the
special subtag "*". For example, the language range "en-*-US" special subtag "*". For example, the language range "en-*-US"
skipping to change at page 12, line 25 skipping to change at page 12, line 25
intended for general use. Private-use subtags are simply useless for intended for general use. Private-use subtags are simply useless for
information exchange without prior arrangement. information exchange without prior arrangement.
The value and semantic meaning of private-use tags and of the subtags The value and semantic meaning of private-use tags and of the subtags
used within such a language tag are not defined. Matching private used within such a language tag are not defined. Matching private
use tags using language ranges or extended language ranges can result use tags using language ranges or extended language ranges can result
in unpredictable content being returned. in unpredictable content being returned.
2.6 Length Considerations in Matching 2.6 Length Considerations in Matching
RFC 3066 [19] did not provide an upper limit on the size of language [RFC3066] did not provide an upper limit on the size of language tags
tags or ranges. RFC 3066 did define the semantics of particular or ranges. RFC 3066 did define the semantics of particular subtags
subtags in such a way that most language tags or ranges consisted of in such a way that most language tags or ranges consisted of language
language and region subtags with a combined total length of up to six and region subtags with a combined total length of up to six
characters. Larger tags and ranges (in terms of both subtags and characters. Larger tags and ranges (in terms of both subtags and
characters) did exist, however. characters) did exist, however.
[1] also does not impose a fixed upper limit on the number of subtags [ID.ietf-ltru-registry] also does not impose a fixed upper limit on
in a language tag or range (and thus an upper bound on the size of the number of subtags in a language tag or range (and thus an upper
either). The syntax in that document suggests that, depending on the bound on the size of either). The syntax in that document suggests
specific language or range of languages, more subtags (and thus that, depending on the specific language or range of languages, more
characters) are sometimes necessary as a result. Length subtags (and thus characters) are sometimes necessary as a result.
considerations and their impact on the selection and processing of Length considerations and their impact on the selection and
tags are described in Section 2.1.1 of that document. processing of tags are described in Section 2.1.1 of that document.
A matching implementation MAY choose to limit the length of the A matching implementation MAY choose to limit the length of the
language tags or ranges used in matching. Any such limitation SHOULD language tags or ranges used in matching. Any such limitation SHOULD
be clearly documented, and such documentation SHOULD include the be clearly documented, and such documentation SHOULD include the
disposition of any longer tags or ranges (for example, whether an disposition of any longer tags or ranges (for example, whether an
error value is generated or the language tag or range is truncated). error value is generated or the language tag or range is truncated).
If truncation is permitted it MUST NOT permit a subtag to be divided, If truncation is permitted it MUST NOT permit a subtag to be divided,
since this changes the semantics of the subtag being matched and can since this changes the semantics of the subtag being matched and can
result in false positives or negatives. result in false positives or negatives.
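One way to honor the rule that truncation never divides a subtag is to drop whole subtags from the end until the tag fits. The sketch below assumes this approach; the function name "truncate_tag" and the choice to raise an error when even the first subtag is too long are illustrative assumptions, not requirements taken from the drafts.

```python
def truncate_tag(tag: str, max_length: int) -> str:
    """Shorten a tag or range to at most max_length characters.

    Hypothetical helper: removes complete trailing subtags so that no
    subtag is ever cut in the middle.
    """
    subtags = tag.split("-")
    # Drop trailing subtags one at a time until the tag fits.
    while len(subtags) > 1 and len("-".join(subtags)) > max_length:
        subtags.pop()
    result = "-".join(subtags)
    # Assumption: if the first subtag alone is too long, report an
    # error rather than split it.
    if len(result) > max_length:
        raise ValueError("first subtag exceeds the length limit")
    return result
```

With a limit of 7 characters, "de-DE-1996" is shortened to "de-DE" rather than to "de-DE-1", which would divide the final subtag and change what is being matched.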
skipping to change at page 13, line 19 skipping to change at page 13, line 19
In practice, most tags do not require additional subtags or In practice, most tags do not require additional subtags or
substantially more characters. Additional subtags sometimes add substantially more characters. Additional subtags sometimes add
useful distinguishing information, but extraneous subtags interfere useful distinguishing information, but extraneous subtags interfere
with the meaning, understanding, and especially matching of language with the meaning, understanding, and especially matching of language
tags. Since language tags or ranges MAY be truncated by an tags. Since language tags or ranges MAY be truncated by an
application or protocol that limits storage, when choosing language application or protocol that limits storage, when choosing language
tags or ranges users and applications SHOULD avoid adding subtags tags or ranges users and applications SHOULD avoid adding subtags
that add no distinguishing value. In particular, users and that add no distinguishing value. In particular, users and
implementations SHOULD follow the 'Prefix' and 'Suppress-Script' implementations SHOULD follow the 'Prefix' and 'Suppress-Script'
fields in the registry (defined in Section 3.6 of [1]): these fields fields in the registry (defined in Section 3.6 of [ID.ietf-ltru-
provide guidance on when specific additional subtags SHOULD (and registry]): these fields provide guidance on when specific additional
SHOULD NOT) be used. subtags SHOULD (and SHOULD NOT) be used.
Implementations MUST support a limit of at least 33 characters. This Implementations MUST support a limit of at least 33 characters. This
limit includes at least one subtag of each non-extension, non-private limit includes at least one subtag of each non-extension, non-private
use type. When choosing a buffer limit, a length of at least 42 use type. When choosing a buffer limit, a length of at least 42
characters is strongly RECOMMENDED. characters is strongly RECOMMENDED.
The practical limit on tags or ranges derived solely from registered The practical limit on tags or ranges derived solely from registered
values is 42 characters. Implementations MUST be able to handle tags values is 42 characters. Implementations MUST be able to handle tags
and ranges of this length. Support for tags and ranges of at least and ranges of this length. Support for tags and ranges of at least
62 characters in length is RECOMMENDED. Implementations MAY support 62 characters in length is RECOMMENDED. Implementations MAY support
skipping to change at page 15, line 9 skipping to change at page 15, line 9
Figure 4: Example of Tag Truncation Figure 4: Example of Tag Truncation
3. IANA Considerations 3. IANA Considerations
This document presents no new or existing considerations for IANA. This document presents no new or existing considerations for IANA.
4. Changes 4. Changes
This is the first version of this document. This is the first version of this document.
The following changes were put into this document since draft-00: The following changes were put into this document since draft-02:
Fixed text in the introduction that is no longer accurate.
Specifically, there no longer is a default matching algorithm.
(A.Phillips)
Fixed text in Section 2.1 which incorrectly discussed the default
fallback mechanism. (A.Phillips)
Minor changes to Section 2.3, in particular, the addition of the
'variant' paragraph and some tidying of the text. (A.Phillips)
Fixed a minor glitch in the ABNF caused by taking the output of
Bill Fenner's parser and not looking too closely at it (M. Patton)
Fixed some minor reference problems. (M.Patton)
Added Section 2.6 on length considerations in matching. Turned on symrefs and replaced all reference IDs to make them
(R.Presuhn) readable (F.Ellermann)
Copied various materials from the length considerations section of Removed all external references from the abstract (R.Presuhn)
the registry draft to keep the two documents in sync.
(A.Phillips)
5. Security Considerations 5. Security Considerations
The only security issue that has been raised with language tags since Language ranges used in content negotiation might be used to infer
the publication of RFC 1766, which stated that "Security issues are
believed to be irrelevant to this memo", is a concern with language
ranges used in content negotiation - that they might be used to infer
the nationality of the sender, and thus identify potential targets the nationality of the sender, and thus identify potential targets
for surveillance. for surveillance. In addition, unique or highly unusual language
ranges or combinations of language ranges might be used to track
a specific individual's activities.
This is a special case of the general problem that anything you send This is a special case of the general problem that anything you send
is visible to the receiving party. It is useful to be aware that is visible to the receiving party. It is useful to be aware that
such concerns can exist in some cases. such concerns can exist in some cases.
The evaluation of the exact magnitude of the threat, and any possible The evaluation of the exact magnitude of the threat, and any possible
countermeasures, is left to each application protocol. countermeasures, is left to each application protocol.
Although the specification of valid subtags for an extension MUST be
available over the Internet, implementations SHOULD NOT mechanically
depend on it being always accessible, to prevent denial-of-service
attacks.
6. Character Set Considerations 6. Character Set Considerations
The syntax in this document requires that language ranges use only The syntax of language tags and language ranges permits only the
the characters A-Z, a-z, 0-9, and HYPHEN-MINUS legal in language characters A-Z, a-z, 0-9, and HYPHEN-MINUS (%x2D). These characters
tags. These characters are present in most character sets, so are present in most character sets, so presentation of language tags
presentation of language tags should not have any character set should not present any character set issues.
issues.
Rendering of characters based on the content of a language tag is not
addressed in this memo. Historically, some languages have relied on
the use of specific character sets or other information in order to
infer how a specific character should be rendered (notably this
applies to language and culture specific variations of Han ideographs
as used in Japanese, Chinese, and Korean). When language tags are
applied to spans of text, rendering engines sometimes use that
information in deciding which font to use in the absence of other
information, particularly where languages with distinct writing
traditions use the same characters.
7. References 7. References
7.1 Normative References 7.1 Normative References
[1] Phillips, A., Ed. and M. Davis, Ed., "Tags for the [ID.ietf-ltru-registry]
Identification of Languages (Internet-Draft)", June 2005, <http Phillips, A., Ed. and M. Davis, Ed., "Tags for the
://www.ietf.org/internet-drafts/ Identification of Languages (Internet-Draft)", June 2005,
draft-ietf-ltru-registry-03.txt>. <http://www.ietf.org/internet-drafts/
draft-ietf-ltru-registry-07.txt>.
[2] Hardcastle-Kille, S., "Mapping between X.400(1988) / ISO 10021 [RFC1327] Hardcastle-Kille, S., "Mapping between X.400(1988) / ISO
and RFC 822", RFC 1327, May 1992. 10021 and RFC 822", RFC 1327, May 1992.
[3] Borenstein, N. and N. Freed, "MIME (Multipurpose Internet Mail [RFC1521] Borenstein, N. and N. Freed, "MIME (Multipurpose Internet
Extensions) Part One: Mechanisms for Specifying and Describing Mail Extensions) Part One: Mechanisms for Specifying and
the Format of Internet Message Bodies", RFC 1521, Describing the Format of Internet Message Bodies",
September 1993. RFC 1521, September 1993.
[4] Hovey, R. and S. Bradner, "The Organizations Involved in the [RFC2028] Hovey, R. and S. Bradner, "The Organizations Involved in
IETF Standards Process", BCP 11, RFC 2028, October 1996. the IETF Standards Process", BCP 11, RFC 2028,
October 1996.
[5] Bradner, S., "Key words for use in RFCs to Indicate Requirement [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[6] Freed, N. and K. Moore, "MIME Parameter Value and Encoded Word [RFC2231] Freed, N. and K. Moore, "MIME Parameter Value and Encoded
Extensions: Character Sets, Languages, and Continuations", Word Extensions: Character Sets, Languages, and
RFC 2231, November 1997. Continuations", RFC 2231, November 1997.
[7] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax [RFC2234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", RFC 2234, November 1997. Specifications: ABNF", RFC 2234, November 1997.
[8] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform [RFC2396] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifiers (URI): Generic Syntax", RFC 2396, Resource Identifiers (URI): Generic Syntax", RFC 2396,
August 1998. August 1998.
[9] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an
Considerations Section in RFCs", BCP 26, RFC 2434, IANA Considerations Section in RFCs", BCP 26, RFC 2434,
October 1998. October 1998.
[10] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -- Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
HTTP/1.1", RFC 2616, June 1999. Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
[11] Carpenter, B., Baker, F., and M. Roberts, "Memorandum of [RFC2860] Carpenter, B., Baker, F., and M. Roberts, "Memorandum of
Understanding Concerning the Technical Work of the Internet Understanding Concerning the Technical Work of the
Assigned Numbers Authority", RFC 2860, June 2000. Internet Assigned Numbers Authority", RFC 2860, June 2000.
[12] Yergeau, F., "UTF-8, a transformation format of ISO 10646", [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
STD 63, RFC 3629, November 2003. 10646", STD 63, RFC 3629, November 2003.
7.2 Informative References 7.2 Informative References
[13] International Organization for Standardization, "ISO 639- [ISO639-1]
1:2002, Codes for the representation of names of languages -- International Organization for Standardization, "ISO 639-
Part 1: Alpha-2 code", ISO Standard 639, 2002. 1:2002, Codes for the representation of names of languages
-- Part 1: Alpha-2 code", ISO Standard 639, 2002.
[14] International Organization for Standardization, "ISO 639-2:1998 [ISO639-2]
- Codes for the representation of names of languages -- Part 2: International Organization for Standardization, "ISO 639-
Alpha-3 code - edition 1", August 1988. 2:1998 - Codes for the representation of names of
languages -- Part 2: Alpha-3 code - edition 1",
August 1988.
[15] ISO TC46/WG3, "ISO 15924:2003 (E/F) - Codes for the [ISO15924]
ISO TC46/WG3, "ISO 15924:2003 (E/F) - Codes for the
representation of names of scripts", January 2004. representation of names of scripts", January 2004.
[16] International Organization for Standardization, "Codes for the [ISO3166] International Organization for Standardization, "Codes for
representation of names of countries, 3rd edition", the representation of names of countries, 3rd edition",
ISO Standard 3166, August 1988. ISO Standard 3166, August 1988.
[17] Statistical Division, United Nations, "Standard Country or Area [UN_M49] Statistical Division, United Nations, "Standard Country or
Codes for Statistical Use", UN Standard Country or Area Codes Area Codes for Statistical Use", UN Standard Country or
for Statistical Use, Revision 4 (United Nations publication, Area Codes for Statistical Use, Revision 4 (United Nations
Sales No. 98.XVII.9, June 1999. publication, Sales No. 98.XVII.9, June 1999.
[18] Alvestrand, H., "Tags for the Identification of Languages", [RFC1766] Alvestrand, H., "Tags for the Identification of
RFC 1766, March 1995. Languages", RFC 1766, March 1995.
[19] Alvestrand, H., "Tags for the Identification of Languages", [RFC3066] Alvestrand, H., "Tags for the Identification of
BCP 47, RFC 3066, January 2001. Languages", BCP 47, RFC 3066, January 2001.
[20] Klyne, G. and C. Newman, "Date and Time on the Internet: [RFC3339] Klyne, G. and C. Newman, "Date and Time on the Internet:
Timestamps", RFC 3339, July 2002. Timestamps", RFC 3339, July 2002.
Authors' Addresses Authors' Addresses
Addison Phillips (editor) Addison Phillips (editor)
Quest Software Quest Software
Email: addison dot phillips at quest dot com Email: addison dot phillips at quest dot com
Mark Davis (editor) Mark Davis (editor)
IBM IBM
Email: mark dot davis at ibm dot com Email: mark dot davis at ibm dot com
Appendix A. Acknowledgements Appendix A. Acknowledgements
Any list of contributors is bound to be incomplete; please regard the Any list of contributors is bound to be incomplete; please regard the
following as only a selection from the group of people who have following as only a selection from the group of people who have
contributed to make this document what it is today. contributed to make this document what it is today.
The contributors to RFC 3066 and RFC 1766, the precursors of this The contributors to [ID.ietf-ltru-registry], [RFC3066] and [RFC1766],
document, made enormous contributions directly or indirectly to this each of which is a precursor to this document, made enormous
document and are generally responsible for the success of language contributions directly or indirectly to this document and are
tags. generally responsible for the success of language tags.
The following people (in alphabetical order) contributed to this The following people (in alphabetical order by family name)
document or to RFCs 1766 and 3066: contributed to this document:
Glenn Adams, Harald Tveit Alvestrand, Tim Berners-Lee, Marc Blanchet, Jeremy Carroll, John Cowan, Frank Ellermann, Doug Ewell, Ira
Nathaniel Borenstein, Eric Brunner, Sean M. Burke, Jeremy Carroll, McDonald, M. Patton, Randy Presuhn and many, many others.
John Clews, Jim Conklin, Peter Constable, John Cowan, Mark Crispin,
Dave Crocker, Martin Duerst, Michael Everson, Doug Ewell, Ned Freed,
Tim Goodwin, Dirk-Willem van Gulik, Marion Gunn, Joel Halpren,
Elliotte Rusty Harold, Paul Hoffman, Richard Ishida, Olle Jarnefors,
Kent Karlsson, John Klensin, Alain LaBonte, Eric Mader, Keith Moore,
Chris Newman, Masataka Ohta, Michael S. Patton, Randy Presuhn, George
Rhoten, Markus Scherer, Keld Jorn Simonsen, Thierry Sourbier, Otto
Stolz, Tex Texin, Andrea Vine, Rhys Weatherley, Misha Wolf, Francois
Yergeau and many, many others.
Very special thanks must go to Harald Tveit Alvestrand, who Very special thanks must go to Harald Tveit Alvestrand, who
originated RFCs 1766 and 3066, and without whom this document would originated RFCs 1766 and 3066, and without whom this document would
not have been possible. Special thanks must go to Michael Everson, not have been possible.
who has served as language tag reviewer for almost the complete
period since the publication of RFC 1766. Special thanks to Doug
Ewell, for his production of the first complete subtag registry, and
his work in producing a test parser for verifying language tags.
For this particular document, John Cowan originated the scheme For this particular document, John Cowan originated the scheme
described in Section 2.2.3. Mark Davis originated the scheme described in Section 2.2.3. Mark Davis originated the scheme
described in the Section 2.1.2. described in the Section 2.1.2.
Intellectual Property Statement Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in pertain to the implementation or use of the technology described in
 End of changes. 

This html diff was produced by rfcdiff 1.24, available from http://www.levkowetz.com/ietf/tools/rfcdiff/