draft-ietf-ltru-registry-05.txt   draft-ietf-ltru-registry-06.txt 
Network Working Group A. Phillips, Ed. Network Working Group A. Phillips, Ed.
Internet-Draft Quest Software Internet-Draft Quest Software
Expires: December 17, 2005 M. Davis, Ed. Expires: December 25, 2005 M. Davis, Ed.
IBM IBM
June 15, 2005 June 23, 2005
Tags for Identifying Languages Tags for Identifying Languages
draft-ietf-ltru-registry-05 draft-ietf-ltru-registry-06
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 35 skipping to change at page 1, line 35
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on December 17, 2005. This Internet-Draft will expire on December 25, 2005.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2005). Copyright (C) The Internet Society (2005).
Abstract Abstract
This document describes the structure, content, construction, and This document describes the structure, content, construction, and
semantics of language tags for use in cases where it is desirable to semantics of language tags for use in cases where it is desirable to
indicate the language used in an information object. It also indicate the language used in an information object. It also
describes how to register values for use in language tags and the describes how to register values for use in language tags and the
creation of user defined extensions for private interchange. This creation of user defined extensions for private interchange. This
document obsoletes RFC 3066 (which replaced RFC 1766). document obsoletes RFC 3066 (which replaced RFC 1766).
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. The Language Tag . . . . . . . . . . . . . . . . . . . . . . . 4 2. The Language Tag . . . . . . . . . . . . . . . . . . . . . . . 4
2.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.1 Length Considerations . . . . . . . . . . . . . . . . 6 2.2 Language Subtag Sources and Interpretation . . . . . . . . 6
2.2 Language Subtag Sources and Interpretation . . . . . . . . 8 2.2.1 Primary Language Subtag . . . . . . . . . . . . . . . 7
2.2.1 Primary Language Subtag . . . . . . . . . . . . . . . 9 2.2.2 Extended Language Subtags . . . . . . . . . . . . . . 9
2.2.2 Extended Language Subtags . . . . . . . . . . . . . . 11 2.2.3 Script Subtag . . . . . . . . . . . . . . . . . . . . 10
2.2.3 Script Subtag . . . . . . . . . . . . . . . . . . . . 12 2.2.4 Region Subtag . . . . . . . . . . . . . . . . . . . . 11
2.2.4 Region Subtag . . . . . . . . . . . . . . . . . . . . 13 2.2.5 Variant Subtags . . . . . . . . . . . . . . . . . . . 12
2.2.5 Variant Subtags . . . . . . . . . . . . . . . . . . . 14 2.2.6 Extension Subtags . . . . . . . . . . . . . . . . . . 13
2.2.6 Extension Subtags . . . . . . . . . . . . . . . . . . 15 2.2.7 Private Use Subtags . . . . . . . . . . . . . . . . . 15
2.2.7 Private Use Subtags . . . . . . . . . . . . . . . . . 17 2.2.8 Pre-Existing RFC 3066 Registrations . . . . . . . . . 15
2.2.8 Pre-Existing RFC 3066 Registrations . . . . . . . . . 17 2.2.9 Classes of Conformance . . . . . . . . . . . . . . . . 15
2.2.9 Classes of Conformance . . . . . . . . . . . . . . . . 17 3. Registry Format and Maintenance . . . . . . . . . . . . . . . 17
3. Registry Format and Maintenance . . . . . . . . . . . . . . . 19 3.1 Format of the IANA Language Subtag Registry . . . . . . . 17
3.1 Format of the IANA Language Subtag Registry . . . . . . . 19 3.2 Maintenance of the Registry . . . . . . . . . . . . . . . 22
3.2 Maintenance of the Registry . . . . . . . . . . . . . . . 24 3.3 Stability of IANA Registry Entries . . . . . . . . . . . . 23
3.3 Stability of IANA Registry Entries . . . . . . . . . . . . 25 3.4 Registration Procedure for Subtags . . . . . . . . . . . . 27
3.4 Registration Procedure for Subtags . . . . . . . . . . . . 29 3.5 Possibilities for Registration . . . . . . . . . . . . . . 30
3.5 Possibilities for Registration . . . . . . . . . . . . . . 32 3.6 Extensions and Extensions Namespace . . . . . . . . . . . 31
3.6 Extensions and Extensions Namespace . . . . . . . . . . . 33 3.7 Initialization of the Registry . . . . . . . . . . . . . . 34
3.7 Initialization of the Registry . . . . . . . . . . . . . . 36 4. Formation and Processing of Language Tags . . . . . . . . . . 38
4. Formation and Processing of Language Tags . . . . . . . . . . 39 4.1 Choice of Language Tag . . . . . . . . . . . . . . . . . . 38
4.1 Choice of Language Tag . . . . . . . . . . . . . . . . . . 39 4.2 Meaning of the Language Tag . . . . . . . . . . . . . . . 40
4.2 Meaning of the Language Tag . . . . . . . . . . . . . . . 41 4.3 Length Considerations . . . . . . . . . . . . . . . . . . 41
4.3 Canonicalization of Language Tags . . . . . . . . . . . . 42 4.3.1 Working with Limited Buffer Sizes . . . . . . . . . . 41
4.4 Considerations for Private Use Subtags . . . . . . . . . . 44 4.3.2 Truncation of Language Tags . . . . . . . . . . . . . 43
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 45 4.4 Canonicalization of Language Tags . . . . . . . . . . . . 43
6. Security Considerations . . . . . . . . . . . . . . . . . . . 46 4.5 Considerations for Private Use Subtags . . . . . . . . . . 45
7. Character Set Considerations . . . . . . . . . . . . . . . . . 47 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 47
8. Changes from RFC 3066 . . . . . . . . . . . . . . . . . . . . 48 6. Security Considerations . . . . . . . . . . . . . . . . . . . 48
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 51 7. Character Set Considerations . . . . . . . . . . . . . . . . . 49
9.1 Normative References . . . . . . . . . . . . . . . . . . . 51 8. Changes from RFC 3066 . . . . . . . . . . . . . . . . . . . . 50
9.2 Informative References . . . . . . . . . . . . . . . . . . 52 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 53 9.1 Normative References . . . . . . . . . . . . . . . . . . . 54
A. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 54 9.2 Informative References . . . . . . . . . . . . . . . . . . 55
B. Examples of Language Tags (Informative) . . . . . . . . . . . 55 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 56
C. Example Registry . . . . . . . . . . . . . . . . . . . . . . . 58 A. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 57
Intellectual Property and Copyright Statements . . . . . . . . 62 B. Examples of Language Tags (Informative) . . . . . . . . . . . 58
C. Example Registry . . . . . . . . . . . . . . . . . . . . . . . 61
Intellectual Property and Copyright Statements . . . . . . . . 64
1. Introduction 1. Introduction
Human beings on our planet have, past and present, used a number of Human beings on our planet have, past and present, used a number of
languages. There are many reasons why one would want to identify the languages. There are many reasons why one would want to identify the
language used when presenting or requesting information. language used when presenting or requesting information.
Information about a user's language preferences commonly needs to be Users' language preferences often need to be identified so that
identified so that appropriate processing can be applied. For appropriate processing can be applied. For example, the user's
example, the user's language preferences in a browser can be used to language preferences in a Web browser can be used to select Web pages
select web pages appropriately. A choice of language preference can appropriately. Language preferences can also be used to select among
also be used to select among tools (such as dictionaries) to assist tools (such as dictionaries) to assist in the processing or
in the processing or understanding of content in different languages. understanding of content in different languages.
In addition, knowledge about the particular language used by some In addition, knowledge about the particular language used by some
piece of information content might be useful or even required by some piece of information content might be useful or even required by some
types of information processing; for example spell-checking, types of processing; for example spell-checking, computer-synthesized
computer-synthesized speech, Braille transcription, or high-quality speech, Braille transcription, or high-quality print renderings.
print renderings.
One means of indicating the language used is by labeling the One means of indicating the language used is by labeling the
information content with a language identifier. These identifiers information content with an identifier or "tag". These tags can be
can also be used to specify user preferences when selecting used to specify user preferences when selecting information content,
information content, or for labeling additional attributes of content or for labeling additional attributes of content and associated
and associated resources. resources.
These identifiers can also be used to indicate additional attributes Tags can also be used to indicate additional language attributes of
of content that are closely related to the language. In particular, content. For example, indicating specific information about the
it is often necessary to indicate specific information about the
dialect, writing system, or orthography used in a document or dialect, writing system, or orthography used in a document or
resource, as these attributes may be important for the user to obtain resource may enable the user to obtain information in a form that
information in a form that they can understand, or important in they can understand, or can be important in processing or rendering the
selecting appropriate processing resources for the given content. given content into an appropriate form or style.
This document specifies an identifier mechanism and a registration This document specifies a particular identifier mechanism (the
function for values to be used with that identifier mechanism. It language tag) and a registration function for values to be used to
also defines a mechanism for private use values and future extension. form tags. It also defines a mechanism for private use values and
future extension.
This document replaces RFC 3066, which replaced RFC 1766. For a list This document replaces RFC 3066, which replaced RFC 1766. For a list
of changes in this document, see Section 8. of changes in this document, see Section 8.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [11]. document are to be interpreted as described in [RFC2119].
2. The Language Tag 2. The Language Tag
The language tag always defines a language as used (which includes
being spoken, written, signed, or otherwise signaled) by human
beings for communication of information to other human beings.
Computer languages such as programming languages are explicitly
excluded.
2.1 Syntax 2.1 Syntax
The language tag is composed of one or more parts: A primary language The language tag is composed of one or more parts or "subtags". Each
subtag and a (possibly empty) series of subsequent subtags. Subtags subtag consists of a sequence of alpha-numeric characters. Subtags
are distinguished by their length, position in the subtag sequence, are distinguished and separated from one another by a hyphen ("-").
and content, so that each type of subtag can be recognized solely by A language tag consists of a "primary language" subtag and a
these features. This makes it possible to construct a parser that (possibly empty) series of subsequent subtags, each of which refines
can extract and assign some semantic information to the subtags, even or narrows the range of language identified by the overall tag.
if specific subtag values are not recognized. Thus a parser need not
have an up-to-date copy of the registered subtag values to perform Each type of subtag is distinguished by length, position in the tag,
and content: subtags can be recognized solely by these features.
This makes it possible to construct a parser that can extract and
assign some semantic information to the subtags, even if the specific
subtag values are not recognized. Thus a parser need not have an up-
to-date copy (or any copy at all) of the subtag registry to perform
most searching and matching operations. most searching and matching operations.
The syntax of this tag in ABNF [7] is: The syntax of the language tag in ABNF [RFC2234bis] is:
Language-Tag = (lang Language-Tag = (lang
*3("-" extlang) *3("-" extlang)
["-" script] ["-" script]
["-" region] ["-" region]
*("-" variant) *("-" variant)
*("-" extension) *("-" extension)
["-" privateuse]) ["-" privateuse])
/ privateuse ; private-use tag / privateuse ; private-use tag
/ grandfathered ; grandfathered registrations / grandfathered ; grandfathered registrations
skipping to change at page 5, line 48 skipping to change at page 5, line 48
The character "-" is HYPHEN-MINUS (ABNF: %x2D). All subtags have a The character "-" is HYPHEN-MINUS (ABNF: %x2D). All subtags have a
maximum length of eight characters. Note that there is a subtlety in maximum length of eight characters. Note that there is a subtlety in
the ABNF for 'variant': variants starting with a digit MAY be four the ABNF for 'variant': variants starting with a digit MAY be four
characters long, while those starting with a letter MUST be at least characters long, while those starting with a letter MUST be at least
five characters long. five characters long.
Whitespace is not permitted in a language tag. For examples of Whitespace is not permitted in a language tag. For examples of
language tags, see Appendix B. language tags, see Appendix B.
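The claim that subtag types can be recognized from length, position, and content alone can be illustrated with a short sketch. This is not part of either draft. The subtag shapes below are assumptions drawn from the lengths stated in this document (three-letter extended language subtags, four-letter script subtags, two-letter or three-digit region subtags, and variants of five to eight characters, or four when starting with a digit). The function name classify_subtags is hypothetical. The sketch does not consult the registry, does not handle grandfathered tags, and does not enforce the limit of three extended language subtags.

```python
import re

# Assumed shapes of the subtags that may follow the primary language
# subtag, listed in the order in which they may appear in a tag.
SHAPES = [
    ("extlang", re.compile(r"[a-z]{3}")),
    ("script", re.compile(r"[a-z]{4}")),
    ("region", re.compile(r"[a-z]{2}|[0-9]{3}")),
    ("variant", re.compile(r"[a-z0-9]{5,8}|[0-9][a-z0-9]{3}")),
]


def classify_subtags(tag):
    """Return (subtag, type) pairs for a tag, using shape and position only.

    Subtags are returned in lower case because tags are case insensitive.
    """
    subtags = tag.lower().split("-")
    if subtags[0] == "x":
        # A tag starting with 'x' consists solely of private-use subtags.
        return [(s, "privateuse") for s in subtags]
    result = [(subtags[0], "language")]
    position = 0    # index into SHAPES of the earliest type still allowed
    section = None  # "extension" or "privateuse" once a singleton is seen
    for s in subtags[1:]:
        if section == "privateuse":
            kind = "privateuse"
        elif len(s) == 1:
            # A single-character subtag opens an extension or, for 'x',
            # the private-use section.
            section = "privateuse" if s == "x" else "extension"
            kind = section
        elif section == "extension":
            kind = "extension"
        else:
            kind = "unknown"
            for i in range(position, len(SHAPES)):
                name, shape = SHAPES[i]
                if shape.fullmatch(s):
                    kind = name
                    # extlang and variant may repeat; script and region may not.
                    position = i if name in ("extlang", "variant") else i + 1
                    break
        result.append((s, kind))
    return result


print(classify_subtags("mn-Cyrl-MN"))
```

Note that no table of registered values is needed: 'cyrl' is labeled a script and 'mn' a region purely because of where they sit and how long they are.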
Note that although [7] refers to octets, the language tags described Note that although [RFC2234bis] refers to octets, the language tags
in this document are sequences of characters from the US-ASCII described in this document are sequences of characters from the US-
repertoire. Language tags MAY be used in documents and applications ASCII repertoire. Language tags MAY be used in documents and
that use other encodings, so long as these encompass the US-ASCII applications that use other encodings, so long as these encompass the
repertoire. An example of this would be an XML document that uses US-ASCII repertoire. An example of this would be an XML document
the UTF-16LE [13] encoding of Unicode [21]. that uses the UTF-16LE [RFC2781] encoding of [Unicode].
The tags and their subtags, including private-use and extensions, are The tags and their subtags, including private-use and extensions, are
to be treated as case insensitive: there exist conventions for the to be treated as case insensitive: there exist conventions for the
capitalization of some of the subtags, but these MUST NOT be taken to capitalization of some of the subtags, but these MUST NOT be taken to
carry meaning. carry meaning.
For example: For example:
o [ISO 639] [1] recommends that language codes be written in lower o [ISO639-1] recommends that language codes be written in lower case
case ('mn' Mongolian). ('mn' Mongolian).
o [ISO 3166] [4] recommends that country codes be capitalized ('MN' o [ISO3166] recommends that country codes be capitalized ('MN'
Mongolia). Mongolia).
o [ISO 15924] [3] recommends that script codes use lower case with o [ISO15924] recommends that script codes use lower case with the
the initial letter capitalized ('Cyrl' Cyrillic). initial letter capitalized ('Cyrl' Cyrillic).
However, in the tags defined by this document, the uppercase US-ASCII However, in the tags defined by this document, the uppercase US-ASCII
letters in the range 'A' through 'Z' are considered equivalent and letters in the range 'A' through 'Z' are considered equivalent and
mapped directly to their US-ASCII lowercase equivalents in the range mapped directly to their US-ASCII lowercase equivalents in the range
'a' through 'z'. Thus the tag "mn-Cyrl-MN" is not distinct from "MN- 'a' through 'z'. Thus the tag "mn-Cyrl-MN" is not distinct from "MN-
cYRL-mn" or "mN-cYrL-Mn" (or any other combination) and each of these cYRL-mn" or "mN-cYrL-Mn" (or any other combination) and each of these
variations conveys the same meaning: Mongolian written in the variations conveys the same meaning: Mongolian written in the
Cyrillic script as used in Mongolia. Cyrillic script as used in Mongolia.
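A minimal sketch of the case folding described above, not taken from either draft. The helper names fold_tag and same_tag are hypothetical. Only the US-ASCII letters 'A' through 'Z' are mapped; a general-purpose lower-casing routine is avoided here on the assumption that it could also alter characters outside that range.

```python
import string

# Map only US-ASCII 'A'-'Z' to 'a'-'z'; every other character is untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def fold_tag(tag):
    """Return the tag with US-ASCII uppercase letters mapped to lowercase."""
    return tag.translate(_ASCII_LOWER)


def same_tag(a, b):
    """Compare two language tags without regard to case."""
    return fold_tag(a) == fold_tag(b)


print(same_tag("mn-Cyrl-MN", "MN-cYRL-mn"))
print(same_tag("mn-Cyrl-MN", "mN-cYrL-Mn"))
```

Both comparisons report that the tags are the same, matching the example in the text.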
2.1.1 Length Considerations
RFC 3066 [24] did not provide an upper limit on the size of language
tags. While RFC 3066 did define the semantics of particular subtags
in such a way that most language tags consisted of language and
region subtags with a combined total length of up to six characters,
larger registered tags were not only possible but were actually
registered.
Neither this document nor the syntax in the ABNF imposes a fixed
upper limit on the number of subtags in a language tag (and thus an
upper bound on the size of a tag). The syntax in this document
suggests that, depending on the specific language, more subtags (and
thus characters) are sometimes necessary to form a complete tag; thus
it is possible to envision long or complex subtag sequences.
Some applications and protocols are forced to allocate fixed buffer
sizes or otherwise limit the length of a language tag in a particular
application. A conformant implementation or specification MAY refuse
to support the storage of language tags which exceed a specified
length. Any such limitation SHOULD be clearly documented, and such
documentation SHOULD include the disposition of any longer tags (for
example, whether an error value is generated or the language tag is
truncated).
In practice, most tags do not require additional subtags or
substantially more characters. Additional subtags sometimes add
useful distinguishing information, but extraneous subtags interfere
with the meaning, understanding, and processing of language tags.
Since language tags MAY be truncated by an application or protocol
that limits tag sizes, when choosing language tags users and
applications SHOULD avoid adding subtags that add no distinguishing
value. In particular, users and implementations SHOULD follow the
'Prefix' and 'Suppress-Script' fields in the registry (defined in
Section 3.1): these fields provide guidance on when specific
additional subtags SHOULD (and SHOULD NOT) be used in a language tag.
(For more information on selecting subtags, see Section 4.1.)
Implementations MUST support a limit of at least 33 characters. This
limit includes at least one subtag of each non-extension, non-private
use type. When choosing a buffer limit, a length of at least 42
characters is strongly RECOMMENDED.
If truncation is permitted it MUST NOT permit a subtag to be divided
or the formation of invalid tags (for example, one ending with the
"-" character). A protocol that allows tags to be truncated at an
arbitrary limit, without giving any indication of what that limit is,
has the potential for causing harm by changing the meaning of tags in
substantial ways.
Some specifications are space constrained but do not have a fixed
length limitation. For example, see [RFC 2231] [23]. This protocol
has no explicit length limitation: the length of the language tag in
this document is limited by the length of other header components
(such as the charset's name) coupled with the 76 character limit in
[RFC 2047] [10]. Thus the "limit" might be 50 or more characters,
but it could potentially be quite small. In these cases,
implementations SHOULD use the longest possible language tag.
Warning the user of truncation, if necessary, is RECOMMENDED, as
truncation can change the semantic meaning of the tag.
The following illustration shows how the 42-character recommendation
was derived. The combination of language and extended language
subtags was chosen for future compatibility. At up to 11 characters,
this combination is longer than the longest possible language subtag
(8 characters):
language = 3 (ISO 639-2; ISO 639-1 requires 2)
extlang1 = 4 (each subsequent subtag includes '-')
extlang2 = 4 (unlikely: needs prefix="language-extlang1")
extlang3 = 4 (extremely unlikely)
script = 5 (if not suppressed: see Section 4.1)
region = 4 (UN M.49; ISO 3166 requires 3)
variant1 = 9 (MUST have language as a prefix)
variant2 = 9 (MUST have language-variant1 as a prefix)
total = 42 characters
Figure 2: Derivation of the Limit on Tag Length
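The arithmetic in Figure 2 can be checked mechanically. The sketch below only restates the figure: the dictionary name FIGURE_2 is hypothetical, and the counts are copied from the figure as given.

```python
# Per-subtag character counts copied from Figure 2. Every subtag after
# the first is counted together with its leading "-".
FIGURE_2 = {
    "language": 3,
    "extlang1": 4,
    "extlang2": 4,
    "extlang3": 4,
    "script": 5,
    "region": 4,
    "variant1": 9,
    "variant2": 9,
}

total = sum(FIGURE_2.values())
print(total)  # 42
```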
Applications or protocols which have to truncate a tag MUST do so by
progressively removing subtags along with their preceding "-" from
the right side of the language tag until the tag is short enough for
the given buffer. If the resulting tag ends with a single-character
subtag, that subtag and its preceding "-" MUST also be removed. For
example:
Tag to truncate: zh-Hant-CN-variant1-a-extend1-x-wadegile-private1
1. zh-Hant-CN-variant1-a-extend1-x-wadegile
2. zh-Hant-CN-variant1-a-extend1
3. zh-Hant-CN-variant1
4. zh-Hant-CN
5. zh-Hant
6. zh
Figure 3: Example of Tag Truncation
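The truncation rule above can be sketched as follows. This is an illustration, not text from either draft. The function name truncate_tag and the choice to raise ValueError when no valid tag fits are assumptions; the draft only says that the disposition of over-long tags should be documented.

```python
def truncate_tag(tag, limit):
    """Shorten a language tag to at most `limit` characters.

    Subtags are removed from the right, each with its preceding "-".
    A trailing single-character subtag is removed as well, so the
    result never ends in a dangling singleton such as "-a" or "-x".
    """
    subtags = tag.split("-")
    while len("-".join(subtags)) > limit:
        subtags.pop()
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
        if not subtags:
            # Assumed behavior: report an error rather than return
            # an empty or invalid tag.
            raise ValueError("no valid tag fits in %d characters" % limit)
    return "-".join(subtags)


tag = "zh-Hant-CN-variant1-a-extend1-x-wadegile-private1"
for limit in (42, 30, 20, 10, 8, 2):
    print(limit, truncate_tag(tag, limit))
```

With the limits shown, the output walks through steps 1 to 6 of Figure 3 in order.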
2.2 Language Subtag Sources and Interpretation 2.2 Language Subtag Sources and Interpretation
The namespace of language tags and their subtags is administered by The namespace of language tags and their subtags is administered by
the Internet Assigned Numbers Authority (IANA) [14] according to the the Internet Assigned Numbers Authority (IANA) [RFC2860] according to
rules in Section 5 of this document. The registry maintained by IANA the rules in Section 5 of this document. The registry maintained by
is the source for valid subtags: other standards referenced in this IANA is the source for valid subtags: other standards referenced in
section provide the source material for that registry. this section provide the source material for that registry.
Terminology in this section: Terminology in this section:
o Tag or tags refers to a complete language tag, such as o Tag or tags refers to a complete language tag, such as
"fr-Latn-CA". Examples of tags in this document are enclosed in "fr-Latn-CA". Examples of tags in this document are enclosed in
double-quotes ("en-US"). double-quotes ("en-US").
o Subtag refers to a specific section of a tag, delimited by hyphen, o Subtag refers to a specific section of a tag, delimited by hyphen,
such as the subtag 'Latn' in "fr-Latn-CA". Examples of subtags in such as the subtag 'Latn' in "fr-Latn-CA". Examples of subtags in
this document are enclosed in single quotes ('Latn'). this document are enclosed in single quotes ('Latn').
o Code or codes refers to values defined in external standards (and o Code or codes refers to values defined in external standards (and
which are used as subtags in this document). For example, 'Latn' which are used as subtags in this document). For example, 'Latn'
is an [ISO 15924] [3] script code which was used to define the is an [ISO15924] script code which was used to define the 'Latn'
'Latn' script subtag for use in a language tag. Examples of codes script subtag for use in a language tag. Examples of codes in
in this document are enclosed in single quotes ('en', 'Latn'). this document are enclosed in single quotes ('en', 'Latn').
The definitions in this section apply to the various subtags within The definitions in this section apply to the various subtags within
the language tags defined by this document, excepting those the language tags defined by this document, excepting those
"grandfathered" tags defined in Section 2.2.8. "grandfathered" tags defined in Section 2.2.8.
Language tags are designed so that each subtag type has unique length Language tags are designed so that each subtag type has unique length
and content restrictions. These make identification of the subtag's and content restrictions. These make identification of the subtag's
type possible, even if the content of the subtag itself is type possible, even if the content of the subtag itself is
unrecognized. This allows tags to be parsed and processed without unrecognized. This allows tags to be parsed and processed without
reference to the latest version of the underlying standards or the reference to the latest version of the underlying standards or the
skipping to change at page 10, line 10 skipping to change at page 8, line 8
2.2.1 Primary Language Subtag 2.2.1 Primary Language Subtag
The primary language subtag is the first subtag in a language tag The primary language subtag is the first subtag in a language tag
(with the exception of private-use and certain grandfathered tags) (with the exception of private-use and certain grandfathered tags)
and cannot be omitted. The following rules apply to the primary and cannot be omitted. The following rules apply to the primary
language subtag: language subtag:
1. All two character language subtags were defined in the IANA 1. All two character language subtags were defined in the IANA
registry according to the assignments found in the standard ISO registry according to the assignments found in the standard ISO
639 Part 1, "ISO 639-1:2002, Codes for the representation of 639 Part 1, "ISO 639-1:2002, Codes for the representation of
names of languages -- Part 1: Alpha-2 code" [ISO 639-1] [1], or names of languages -- Part 1: Alpha-2 code" [ISO639-1], or using
using assignments subsequently made by the ISO 639 Part 1 assignments subsequently made by the ISO 639 Part 1 maintenance
maintenance agency or governing standardization bodies. agency or governing standardization bodies.
2. All three character language subtags were defined in the IANA 2. All three character language subtags were defined in the IANA
registry according to the assignments found in ISO 639 Part 2, registry according to the assignments found in ISO 639 Part 2,
"ISO 639-2:1998 - Codes for the representation of names of "ISO 639-2:1998 - Codes for the representation of names of
languages -- Part 2: Alpha-3 code - edition 1" [ISO 639-2] [2], languages -- Part 2: Alpha-3 code - edition 1" [ISO639-2], or
or assignments subsequently made by the ISO 639 Part 2 assignments subsequently made by the ISO 639 Part 2 maintenance
maintenance agency or governing standardization bodies. agency or governing standardization bodies.
3. The subtags in the range 'qaa' through 'qtz' are reserved for 3. The subtags in the range 'qaa' through 'qtz' are reserved for
private use in language tags. These subtags correspond to codes private use in language tags. These subtags correspond to codes
reserved by ISO 639-2 for private use. These codes MAY be used reserved by ISO 639-2 for private use. These codes MAY be used
for non-registered primary-language subtags (instead of using for non-registered primary-language subtags (instead of using
private-use subtags following 'x-'). Please refer to Section 4.4 private-use subtags following 'x-'). Please refer to Section 4.5
for more information on private use subtags. for more information on private use subtags.
4. All four character language subtags are reserved for possible 4. All four character language subtags are reserved for possible
future standardization. future standardization.
5. All language subtags of 5 to 8 characters in length in the IANA 5. All language subtags of 5 to 8 characters in length in the IANA
registry were defined via the registration process in Section 3.4 registry were defined via the registration process in Section 3.4
and MAY be used to form the primary language subtag. At the time and MAY be used to form the primary language subtag. At the time
this document was created, there were no examples of this kind of this document was created, there were no examples of this kind of
subtag and future registrations of this type will be discouraged: subtag and future registrations of this type will be discouraged:
primary languages are strongly RECOMMENDED for registration with primary languages are strongly RECOMMENDED for registration with
ISO 639 and proposals rejected by ISO 639/RA will be closely ISO 639 and proposals rejected by ISO 639/RA will be closely
scrutinized before they are registered with IANA. scrutinized before they are registered with IANA.
6. The single character subtag 'x' as the primary subtag indicates 6. The single character subtag 'x' as the primary subtag indicates
that the language tag consists solely of subtags whose meaning is that the language tag consists solely of subtags whose meaning is
defined by private agreement. For example, in the tag "x-fr-CH", defined by private agreement. For example, in the tag "x-fr-CH",
the subtags 'fr' and 'CH' SHOULD NOT be taken to represent the the subtags 'fr' and 'CH' SHOULD NOT be taken to represent the
French language or the country of Switzerland (or any other value French language or the country of Switzerland (or any other value
in the IANA registry) unless there is a private agreement in in the IANA registry) unless there is a private agreement in
place to do so. See Section 4.4. place to do so. See Section 4.5.
7. The single character subtag 'i' is used by some grandfathered 7. The single character subtag 'i' is used by some grandfathered
tags (see Section 2.2.8) such as "i-klingon" and "i-bnn". (Other tags (see Section 2.2.8) such as "i-klingon" and "i-bnn". (Other
grandfathered tags have a primary language subtag in their first grandfathered tags have a primary language subtag in their first
position.) position.)
8. Other values MUST NOT be assigned to the primary subtag except by 8. Other values MUST NOT be assigned to the primary subtag except by
revision or update of this document. revision or update of this document.
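The rules above sort primary language subtags mostly by length. The sketch below restates them as a lookup. It is not part of either draft: the function name primary_subtag_source and the returned labels are hypothetical, and the assumption that subtags of two to eight characters consist of letters only is not stated in the list itself. The sketch reports where a subtag of a given shape would come from; it does not check that the subtag is actually present in the IANA registry.

```python
import re


def primary_subtag_source(subtag):
    """Describe the origin of a primary language subtag from its shape."""
    s = subtag.lower()
    if s == "x":
        return "private use"                 # rule 6
    if s == "i":
        return "grandfathered"               # rule 7
    if not re.fullmatch(r"[a-z]{2,8}", s):
        return "not permitted"               # rule 8
    if len(s) == 2:
        return "ISO 639-1"                   # rule 1
    if len(s) == 3:
        if "qaa" <= s <= "qtz":
            return "private use"             # rule 3
        return "ISO 639-2"                   # rule 2
    if len(s) == 4:
        return "reserved"                    # rule 4
    return "IANA registration"               # rule 5


for example in ("mn", "QAA", "x", "i", "abcd"):
    print(example, "->", primary_subtag_source(example))
```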
Note: For languages that have both an ISO 639-1 two character code Note: For languages that have both an ISO 639-1 two character code
and an ISO 639-2 three character code, only the ISO 639-1 two and an ISO 639-2 three character code, only the ISO 639-1 two
character code is defined in the IANA registry. character code is defined in the IANA registry.
Note: For languages that have no ISO 639-1 two character code and for Note: For languages that have no ISO 639-1 two character code and for
which the ISO 639-2/T (Terminology) code and the ISO 639-2/B which the ISO 639-2/T (Terminology) code and the ISO 639-2/B
(Bibliographic) codes differ, only the Terminology code is defined in (Bibliographic) codes differ, only the Terminology code is defined in
skipping to change at page 11, line 23 skipping to change at page 9, line 21
the IANA registry. At the time this document was created, all the IANA registry. At the time this document was created, all
languages that had both kinds of three character code were also languages that had both kinds of three character code were also
assigned a two character code; it is not expected that future assigned a two character code; it is not expected that future
assignments of this nature will occur. assignments of this nature will occur.
Note: To avoid problems with versioning and subtag choice as Note: To avoid problems with versioning and subtag choice as
experienced during the transition between RFC 1766 and RFC 3066, as experienced during the transition between RFC 1766 and RFC 3066, as
well as the canonical nature of subtags defined by this document, the well as the canonical nature of subtags defined by this document, the
ISO 639 Registration Authority Joint Advisory Committee (ISO 639/ ISO 639 Registration Authority Joint Advisory Committee (ISO 639/
RA-JAC) has included the following statement in [17]: RA-JAC) has included the following statement in [iso639.principles]:
"A language code already in ISO 639-2 at the point of freezing ISO "A language code already in ISO 639-2 at the point of freezing ISO
639-1 shall not later be added to ISO 639-1. This is to ensure 639-1 shall not later be added to ISO 639-1. This is to ensure
consistency in usage over time, since users are directed in Internet consistency in usage over time, since users are directed in Internet
applications to employ the alpha-3 code when an alpha-2 code for that applications to employ the alpha-3 code when an alpha-2 code for that
language is not available." language is not available."
In order to avoid instability of the canonical form of tags, if a two In order to avoid instability of the canonical form of tags, if a two
character code is added to ISO 639-1 for a language for which a three character code is added to ISO 639-1 for a language for which a three
character code was already included in ISO 639-2, the two character character code was already included in ISO 639-2, the two character
skipping to change at page 12, line 14 skipping to change at page 10, line 10
1. Three letter subtags immediately following the primary subtag are 1. Three letter subtags immediately following the primary subtag are
reserved for future standardization, anticipating work that is reserved for future standardization, anticipating work that is
currently under way on ISO 639. currently under way on ISO 639.
2. Extended language subtags MUST follow the primary subtag and 2. Extended language subtags MUST follow the primary subtag and
precede any other subtags. precede any other subtags.
3. There MAY be up to three extended language subtags. 3. There MAY be up to three extended language subtags.
4. Extended language subtags will not be registered except by 4. Extended language subtags MUST NOT be registered or used to form
revision of this document. language tags. Their syntax is described here so that
implementations can be compatible with any future revision of
5. Extended language subtags MUST NOT be used to form language tags this document which does provide for their registration.
except by revision of this document.
Extended language subtag records, once they appear in the registry, Extended language subtag records, once they appear in the registry,
MUST include exactly one 'Prefix' field indicating an appropriate MUST include exactly one 'Prefix' field indicating an appropriate
language subtag or sequence of subtags that MUST always appear as a language subtag or sequence of subtags that MUST always appear as a
prefix to the extended language subtag. prefix to the extended language subtag.
Example: In a future revision or update of this document, the tag Example: In a future revision or update of this document, the tag
"zh-gan" (registered under RFC 3066) might become a valid non- "zh-gan" (registered under RFC 3066) might become a valid non-
grandfathered (that is, redundant) tag in which the subtag 'gan' grandfathered (that is, redundant) tag in which the subtag 'gan'
might represent the Chinese dialect 'Gan'. might represent the Chinese dialect 'Gan'.
2.2.3 Script Subtag 2.2.3 Script Subtag
The following rules apply to the script subtags: Script subtags are used to indicate the script or writing system
variations that distinguish the written forms of a language or its
dialects. The following rules apply to the script subtags:
1. All four character subtags were defined according to ISO 15924 1. All four character subtags were defined according to
[3]--"Codes for the representation of the names of scripts": [ISO15924]--"Codes for the representation of the names of
alpha-4 script codes, or subsequently assigned by the ISO 15924 scripts": alpha-4 script codes, or subsequently assigned by the
maintenance agency or governing standardization bodies, denoting ISO 15924 maintenance agency or governing standardization bodies,
the script or writing system used in conjunction with this denoting the script or writing system used in conjunction with
language. this language.
2. Script subtags MUST immediately follow the primary language 2. Script subtags MUST immediately follow the primary language
subtag and all extended language subtags and MUST occur before subtag and all extended language subtags and MUST occur before
any other type of subtag described below. any other type of subtag described below.
3. The script subtags 'Qaaa' through 'Qabx' are reserved for private 3. The script subtags 'Qaaa' through 'Qabx' are reserved for private
use in language tags. These subtags correspond to codes reserved use in language tags. These subtags correspond to codes reserved
by ISO 15924 for private use. These codes MAY be used for non- by ISO 15924 for private use. These codes MAY be used for non-
registered script values. Please refer to Section 4.4 for more registered script values. Please refer to Section 4.5 for more
information on private-use subtags. information on private-use subtags.
4. Script subtags cannot be registered using the process in 4. Script subtags cannot be registered using the process in
Section 3.4 of this document. Variant subtags MAY be considered Section 3.4 of this document. Variant subtags MAY be considered
for registration for that purpose. for registration for that purpose.
5. There MUST be at most one script subtag in a language tag and the 5. There MUST be at most one script subtag in a language tag and the
script subtag SHOULD be omitted when it adds no distinguishing script subtag SHOULD be omitted when it adds no distinguishing
value to the tag or when the primary language subtag's record value to the tag or when the primary language subtag's record
includes a Suppress-Script field listing the applicable script includes a Suppress-Script field listing the applicable script
subtag. subtag.
Example: "sr-Latn" represents Serbian written using the Latin script. Example: "sr-Latn" represents Serbian written using the Latin script.
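The private-use script range in rule 3 can be checked mechanically.
The following is a minimal sketch in Python, not part of either draft.
The function name is hypothetical, and the case-insensitive comparison
is an assumption made for illustration.

```python
def is_private_use_script(subtag: str) -> bool:
    """Return True if subtag lies in the private-use range Qaaa..Qabx."""
    s = subtag.lower()  # assumption: compare without regard to case
    # Script subtags are exactly four ASCII letters.
    if len(s) != 4 or not all("a" <= c <= "z" for c in s):
        return False
    # Same-length lowercase strings compare alphabetically.
    return "qaaa" <= s <= "qabx"
```

With this sketch, 'Qaaa' and 'Qabx' are reported as private use, while
'Qaby' and 'Latn' are not.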
2.2.4 Region Subtag 2.2.4 Region Subtag
The following rules apply to the region subtags: Region subtags are used to indicate regional or geographical
variations that define a language or its dialects. The following
rules apply to the region subtags:
1. The region subtag defines language variations used in a specific 1. The region subtag defines language variations used in a specific
region, geographic, or political area. Region subtags MUST region, geographic, or political area. Region subtags MUST
follow any language, extended language, or script subtags and follow any language, extended language, or script subtags and
MUST precede all other subtags. MUST precede all other subtags.
2. All two character subtags following the primary subtag were 2. All two character subtags following the primary subtag were
defined in the IANA registry according to the assignments found defined in the IANA registry according to the assignments found
in ISO 3166 [4]--"Codes for the representation of names of in [ISO3166]--"Codes for the representation of names of countries
countries and their subdivisions - Part 1: Country and their subdivisions - Part 1: Country codes"--alpha-2 country
codes"--alpha-2 country codes or assignments subsequently made by codes or assignments subsequently made by the ISO 3166
the ISO 3166 maintenance agency or governing standardization maintenance agency or governing standardization bodies.
bodies.
3. All three character subtags consisting of digit (numeric) 3. All three character subtags consisting of digit (numeric)
characters following the primary subtag were defined in the IANA characters following the primary subtag were defined in the IANA
registry according to the assignments found in UN Standard registry according to the assignments found in UN Standard
Country or Area Codes for Statistical Use [5] (UN M.49) or Country or Area Codes for Statistical Use [UN_M.49] or
assignments subsequently made by the governing standards body. assignments subsequently made by the governing standards body.
Note that not all of the UN M.49 codes are defined in the IANA Note that not all of the UN M.49 codes are defined in the IANA
registry. The following rules define which codes are entered registry. The following rules define which codes are entered
into the registry as valid subtags: into the registry as valid subtags:
A. UN numeric codes assigned to 'macro-geographical A. UN numeric codes assigned to 'macro-geographical
(continental)' or sub-regions MUST be registered in the (continental)' or sub-regions MUST be registered in the
registry. These codes are not associated with an assigned registry. These codes are not associated with an assigned
ISO 3166 alpha-2 code and represent supra-national areas, ISO 3166 alpha-2 code and represent supra-national areas,
usually covering more than one nation, state, province, or usually covering more than one nation, state, province, or
skipping to change at page 14, line 38 skipping to change at page 12, line 35
5. There MUST be at most one region subtag in a language tag and the 5. There MUST be at most one region subtag in a language tag and the
region subtag MAY be omitted, as when it adds no distinguishing region subtag MAY be omitted, as when it adds no distinguishing
value to the tag. value to the tag.
6. The region subtags 'AA', 'QM'-'QZ', 'XA'-'XZ', and 'ZZ' are 6. The region subtags 'AA', 'QM'-'QZ', 'XA'-'XZ', and 'ZZ' are
reserved for private use in language tags. These subtags reserved for private use in language tags. These subtags
correspond to codes reserved by ISO 3166 for private use. These correspond to codes reserved by ISO 3166 for private use. These
codes MAY be used for private use region subtags (instead of codes MAY be used for private use region subtags (instead of
using a private-use subtag sequence). Please refer to using a private-use subtag sequence). Please refer to
Section 4.4 for more information on private use subtags. Section 4.5 for more information on private use subtags.
"de-CH" represents German ('de') as used in Switzerland ('CH'). "de-CH" represents German ('de') as used in Switzerland ('CH').
"sr-Latn-CS" represents Serbian ('sr') written using Latin script "sr-Latn-CS" represents Serbian ('sr') written using Latin script
('Latn') as used in Serbia and Montenegro ('CS'). ('Latn') as used in Serbia and Montenegro ('CS').
"es-419" represents Spanish ('es') as used in the UN-defined Latin "es-419" represents Spanish ('es') as used in the UN-defined Latin
America and Caribbean region ('419'). America and Caribbean region ('419').
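The private-use region codes listed in rule 6 can be tested in the same
way. This is a sketch only: the function name is hypothetical and the
case-insensitive comparison is an assumption.

```python
def is_private_use_region(subtag: str) -> bool:
    """Return True for 'AA', 'QM'-'QZ', 'XA'-'XZ', and 'ZZ'."""
    s = subtag.upper()  # assumption: compare without regard to case
    # Only two-letter alphabetic region subtags can be private use.
    if len(s) != 2 or not all("A" <= c <= "Z" for c in s):
        return False
    return s in ("AA", "ZZ") or "QM" <= s <= "QZ" or "XA" <= s <= "XZ"
```

Under these assumptions 'QM' and 'XZ' are private use, while 'CH',
'QL', and the numeric code '419' are not.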
2.2.5 Variant Subtags 2.2.5 Variant Subtags
The following rules apply to the variant subtags: Variant subtags are used to indicate additional, well-recognized
variations that define a language or its dialects which are not
covered by other available subtags. The following rules apply to the
variant subtags:
1. Variant subtags are not associated with any external standard. 1. Variant subtags are not associated with any external standard.
Variant subtags and their meanings are defined by the Variant subtags and their meanings are defined by the
registration process defined in Section 3.4. registration process defined in Section 3.4.
2. Variant subtags MUST follow all of the other defined subtags, but 2. Variant subtags MUST follow all of the other defined subtags, but
precede any extension or private-use subtag sequences. precede any extension or private-use subtag sequences.
3. More than one variant MAY be used to form the language tag. 3. More than one variant MAY be used to form the language tag.
skipping to change at page 15, line 30 skipping to change at page 13, line 30
1. Variant subtags that begin with a letter (a-z, A-Z) MUST be 1. Variant subtags that begin with a letter (a-z, A-Z) MUST be
at least five characters long. at least five characters long.
2. Variant subtags that begin with a digit (0-9) MUST be at 2. Variant subtags that begin with a digit (0-9) MUST be at
least four characters long. least four characters long.
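The two minimum-length rules above can be sketched as a small check.
This is an illustration in Python, not text from either draft. The
function name is hypothetical, and only the lower bounds stated in this
section are tested; any upper bound imposed by the ABNF is not checked.

```python
def variant_length_ok(subtag: str) -> bool:
    """Apply the minimum-length rules for variant subtags."""
    # Subtags consist of ASCII letters and digits only.
    if not subtag or not all(c.isascii() and c.isalnum() for c in subtag):
        return False
    if subtag[0].isdigit():
        return len(subtag) >= 4   # begins with a digit: at least four
    return len(subtag) >= 5       # begins with a letter: at least five
```

For example, 'nedis' and '1996' pass, while 'abcd' and '199' do not.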
Variant subtag records in the language subtag registry MAY include Variant subtag records in the language subtag registry MAY include
one or more 'Prefix' fields, which indicate the language tag or tags one or more 'Prefix' fields, which indicate the language tag or tags
that would make a suitable prefix (with other subtags, as that would make a suitable prefix (with other subtags, as
appropriate) in forming a language tag with the variant. For appropriate) in forming a language tag with the variant. For
example, the subtag 'scouse' has a Prefix of "en", making it suitable example, the subtag 'nedis' has a Prefix of "sl", making it suitable
to form language tags such as "en-scouse" and "en-GB-scouse", but not to form language tags such as "sl-nedis" and "sl-IT-nedis", but not
suitable for use in a tag such as "zh-scouse" or "it-GB-scouse". suitable for use in a tag such as "zh-nedis" or "it-IT-nedis".
"en-scouse" represents the Scouse dialect of English. "sl-nedis" represents the Natisone or Nadiza dialect of Slovenian.
"de-CH-1996" represents German as used in Switzerland and as written "de-CH-1996" represents German as used in Switzerland and as written
using the spelling reform beginning in the year 1996 C.E. using the spelling reform beginning in the year 1996 C.E.
Most variants that share a prefix are mutually exclusive. For Most variants that share a prefix are mutually exclusive. For
example, the German orthographic variations '1996' and '1901' SHOULD example, the German orthographic variations '1996' and '1901' SHOULD
NOT be used in the same tag, as they represent the dates of different NOT be used in the same tag, as they represent the dates of different
spelling reforms. A variant that can meaningfully be used in spelling reforms. A variant that can meaningfully be used in
combination with another variant SHOULD include a 'Prefix' field in combination with another variant SHOULD include a 'Prefix' field in
its registry record that lists that other variant. For example, if its registry record that lists that other variant. For example, if
another German variant 'example' were created that made sense to use another German variant 'example' were created that made sense to use
with '1996', then 'example' should include two Prefix fields: "de" with '1996', then 'example' should include two Prefix fields: "de"
and "de-1996". and "de-1996".
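One possible way to test a tag against the 'Prefix' fields of a variant
record is to compare whole subtags from the start of the tag. The
sketch below is illustrative only. The function name is hypothetical,
and the matching strategy (subtag-wise, case-insensitive, with at least
one subtag left over for the variant itself) is an assumption, not a
procedure defined by this document.

```python
def matches_prefix(tag, prefixes):
    """Return True if tag begins with any prefix, subtag by subtag."""
    subtags = tag.lower().split("-")
    for prefix in prefixes:
        p = prefix.lower().split("-")
        # The tag must extend the prefix by at least one subtag.
        if len(subtags) > len(p) and subtags[:len(p)] == p:
            return True
    return False
```

With a Prefix of "sl", the tags "sl-nedis" and "sl-IT-nedis" match,
while "zh-nedis" and "it-IT-nedis" do not. A variant that lists both
"de" and "de-1996" would pass both prefixes as the second argument.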
2.2.6 Extension Subtags 2.2.6 Extension Subtags
The following rules apply to extensions: Extensions provide a mechanism for extending language tags for use in
various applications. See: Section 3.6. The following rules apply
to extensions:
1. Extension subtags are separated from the other subtags defined 1. Extension subtags are separated from the other subtags defined
in this document by a single-letter subtag ("singleton"). The in this document by a single-letter subtag ("singleton"). The
singleton MUST be one allocated to a registration authority via singleton MUST be one allocated to a registration authority via
the mechanism described in Section 3.6 and cannot be the letter the mechanism described in Section 3.6 and cannot be the letter
'x', which is reserved for private-use subtag sequences. 'x', which is reserved for private-use subtag sequences.
2. Note: Private-use subtag sequences starting with the singleton 2. Note: Private-use subtag sequences starting with the singleton
subtag 'x' are described below. subtag 'x' are described below.
skipping to change at page 17, line 7 skipping to change at page 15, line 7
script, region and variant subtags in a tag. script, region and variant subtags in a tag.
10. All subtags following the singleton and before another singleton 10. All subtags following the singleton and before another singleton
are part of the extension. Example: In the tag "fr-a-Latn", the are part of the extension. Example: In the tag "fr-a-Latn", the
subtag 'Latn' does not represent the script subtag 'Latn' subtag 'Latn' does not represent the script subtag 'Latn'
defined in the IANA Language Subtag Registry. Its meaning is defined in the IANA Language Subtag Registry. Its meaning is
defined by the extension 'a'. defined by the extension 'a'.
11. In the event that more than one extension appears in a single 11. In the event that more than one extension appears in a single
tag, the tag SHOULD be canonicalized as described in tag, the tag SHOULD be canonicalized as described in
Section 4.3. Section 4.4.
For example, if the prefix singleton 'r' and the shown subtags were For example, if the prefix singleton 'r' and the shown subtags were
defined, then the following tag would be a valid example: "en-Latn- defined, then the following tag would be a valid example: "en-Latn-
GB-boont-r-extended-sequence-x-private" GB-boont-r-extended-sequence-x-private"
2.2.7 Private Use Subtags 2.2.7 Private Use Subtags
The following rules apply to private-use subtags: Private use subtags are used to indicate distinctions in language
important in a given context by private agreement. The following
rules apply to private-use subtags:
1. Private-use subtags are separated from the other subtags defined 1. Private-use subtags are separated from the other subtags defined
in this document by the reserved single-character subtag 'x'. in this document by the reserved single-character subtag 'x'.
2. Private-use subtags MUST follow all language, extended language, 2. Private-use subtags MUST follow all language, extended language,
script, region, variant, and extension subtags in the tag. script, region, variant, and extension subtags in the tag.
Another way of saying this is that all subtags following the Another way of saying this is that all subtags following the
singleton 'x' MUST be considered private use. Example: The singleton 'x' MUST be considered private use. Example: The
subtag 'US' in the tag "en-x-US" is a private use subtag. subtag 'US' in the tag "en-x-US" is a private use subtag.
skipping to change at page 18, line 20 skipping to change at page 16, line 21
o Check that the tag and all of its subtags, including extension and o Check that the tag and all of its subtags, including extension and
private-use subtags, conform to the ABNF or that the tag is on the private-use subtags, conform to the ABNF or that the tag is on the
list of grandfathered tags. list of grandfathered tags.
o Check that singleton subtags that identify extensions do not o Check that singleton subtags that identify extensions do not
repeat. For example, the tag "en-a-xx-b-yy-a-zz" is not well- repeat. For example, the tag "en-a-xx-b-yy-a-zz" is not well-
formed. formed.
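The check for repeated extension singletons can be sketched as follows.
This is a hedged illustration in Python. The function name is
hypothetical, and the handling of the primary subtag and of the
private-use singleton 'x' reflects one reading of the rules in
Section 2.2.

```python
def has_repeated_singleton(tag: str) -> bool:
    """Return True if an extension singleton appears more than once."""
    subtags = tag.lower().split("-")
    if subtags[0] == "x":
        return False  # the whole tag is private use
    seen = set()
    # Skip the primary subtag, which may itself be one character ('i').
    for subtag in subtags[1:]:
        if subtag == "x":
            break  # everything after 'x' is private use
        if len(subtag) == 1:
            if subtag in seen:
                return True
            seen.add(subtag)
    return False
```

The tag "en-a-xx-b-yy-a-zz" is reported as repeating the singleton 'a',
while "en-a-xx-b-yy" is not.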
Well-formed processors are strongly encouraged to implement the Well-formed processors are strongly encouraged to implement the
canonicalization rules contained in Section 4.3. canonicalization rules contained in Section 4.4.
An implementation that claims to be validating MUST: An implementation that claims to be validating MUST:
o Check that the tag is well-formed. o Check that the tag is well-formed.
o Specify the particular registry date for which the implementation o Specify the particular registry date for which the implementation
performs validation of subtags. performs validation of subtags.
o Check that either the tag is a grandfathered tag, or that all o Check that either the tag is a grandfathered tag, or that all
language, script, region, and variant subtags consist of valid language, script, region, and variant subtags consist of valid
skipping to change at page 19, line 32 skipping to change at page 17, line 32
3.1 Format of the IANA Language Subtag Registry 3.1 Format of the IANA Language Subtag Registry
The IANA Language Subtag Registry ("the registry") will consist of a The IANA Language Subtag Registry ("the registry") will consist of a
text file that is machine readable in the format described in this text file that is machine readable in the format described in this
section, plus copies of the registration forms approved by the section, plus copies of the registration forms approved by the
Language Subtag Reviewer in accordance with the process described in Language Subtag Reviewer in accordance with the process described in
Section 3.4. With the exception of the registration forms for Section 3.4. With the exception of the registration forms for
grandfathered and redundant tags, no registration records will be grandfathered and redundant tags, no registration records will be
maintained for the initial set of subtags. maintained for the initial set of subtags.
The registry will be in a modified record-jar format text file [18]. The registry will be in a modified record-jar format text file
Lines are limited to 72 characters, including all whitespace. [record-jar]. Lines are limited to 72 characters, including all
whitespace.
Records are separated by lines containing only the sequence "%%" Records are separated by lines containing only the sequence "%%"
(%x25.25). (%x25.25).
Each field can be viewed as a single, logical line of ASCII Each field can be viewed as a single, logical line of ASCII
characters, comprising a field-name and a field-body separated by a characters, comprising a field-name and a field-body separated by a
COLON character (%x3A). For convenience, the field-body portion of COLON character (%x3A). For convenience, the field-body portion of
this conceptual entity can be split into a multiple-line this conceptual entity can be split into a multiple-line
representation; this is called "folding". The format of the registry representation; this is called "folding". The format of the registry
is described by the following ABNF (per [7]): is described by the following ABNF (per [RFC2234bis]):
registry = record *("%%" CRLF record) registry = record *("%%" CRLF record)
record = 1*( field-name *SP ":" *SP field-body CRLF ) record = 1*( field-name *SP ":" *SP field-body CRLF )
field-name = *(ALPHA / DIGIT / "-") field-name = *(ALPHA / DIGIT / "-")
field-body = *(ASCCHAR/LWSP) field-body = *(ASCCHAR/LWSP)
ASCCHAR = %x21-25 / %x27-7E / UNICHAR ; Note: AMPERSAND is %x26 ASCCHAR = %x21-25 / %x27-7E / UNICHAR ; Note: AMPERSAND is %x26
UNICHAR = "&#x" 2*6HEXDIG ";" UNICHAR = "&#x" 2*6HEXDIG ";"
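A reader for this format can be quite small. The following Python
sketch is illustrative only and is not a complete implementation of the
ABNF. The function name is hypothetical, and it assumes that a folded
continuation line begins with whitespace, as in the record-jar format.
Fields are kept as ordered pairs because a field-name can appear more
than once in a record.

```python
def parse_registry(text):
    """Split registry text into records of (field-name, field-body)."""
    records = []
    current = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if line == "%%":
            # Record separator: close the record collected so far.
            records.append(current)
            current = []
        elif line[:1] in (" ", "\t") and current:
            # Assumed folding rule: unfold onto the previous field.
            name, body = current[-1]
            current[-1] = (name, body + " " + line.strip())
        elif line:
            name, sep, body = line.partition(":")
            if not sep:
                raise ValueError("missing COLON in line: %r" % raw)
            current.append((name.strip(), body.strip()))
    if current:
        records.append(current)
    return records
```

Applied to a file whose first record is "File-Date: 2004-06-28", the
first element of the result is [("File-Date", "2004-06-28")].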
The sequence '..' (%x2E.2E) in a field-body denotes a range of The sequence '..' (%x2E.2E) in a field-body denotes a range of
values. Such a range represents all subtags of the same length that values. Such a range represents all subtags of the same length that
are alphabetically within that range, including the values explicitly are alphabetically within that range, including the values explicitly
mentioned. For example 'a..c' denotes the values 'a', 'b', and 'c'. mentioned. For example 'a..c' denotes the values 'a', 'b', and 'c'.
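Expanding such a range can be sketched as below. This Python example is
an illustration, not part of either draft. The function name is
hypothetical, and it assumes that both endpoints consist of ASCII
letters only and that the letter case of the first endpoint is applied
to every generated value.

```python
import string

def expand_range(field_body):
    """Expand 'start..end' into every same-length value in the range."""
    start, sep, end = field_body.partition("..")
    if not sep:
        return [field_body]  # not a range
    letters = set(string.ascii_letters)
    if len(start) != len(end) or not (set(start) | set(end)) <= letters:
        raise ValueError("endpoints must be letters of equal length")

    def to_int(s):
        # Read the subtag as a base-26 number (a=0 ... z=25).
        n = 0
        for ch in s.lower():
            n = n * 26 + (ord(ch) - ord("a"))
        return n

    def to_str(n, template):
        # Convert back, copying the case pattern of the template.
        chars = []
        for t in reversed(template):
            n, r = divmod(n, 26)
            ch = chr(ord("a") + r)
            chars.append(ch.upper() if t.isupper() else ch)
        return "".join(reversed(chars))

    lo, hi = to_int(start), to_int(end)
    if lo > hi:
        raise ValueError("range start follows range end")
    return [to_str(n, start) for n in range(lo, hi + 1)]
```

Here expand_range("a..c") yields the three values 'a', 'b', and 'c'.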
Characters from outside the US-ASCII repertoire, as well as the Characters from outside the US-ASCII repertoire, as well as the
AMPERSAND character ("&", %x26) when it occurs in a field-body are AMPERSAND character ("&", %x26) when it occurs in a field-body are
represented by a "Numeric Character Reference" using hexadecimal represented by a "Numeric Character Reference" using hexadecimal
notation in the style used by XML 1.0 [19] (see notation in the style used by [XML10] (see
<http://www.w3.org/TR/REC-xml/#dt-charref>). This consists of the <http://www.w3.org/TR/REC-xml/#dt-charref>). This consists of the
sequence "&#x" (%x26.23.78) followed by a hexadecimal representation sequence "&#x" (%x26.23.78) followed by a hexadecimal representation
of the character's code point in ISO/IEC 10646 [6] followed by a of the character's code point in [ISO10646] followed by a closing
closing semicolon (%x3B). For example, the EURO SIGN, U+20AC, would semicolon (%x3B). For example, the EURO SIGN, U+20AC, would be
be represented by the sequence "&#x20AC;". Note that the hexadecimal represented by the sequence "&#x20AC;". Note that the hexadecimal
notation MAY have between two and six digits. notation MAY have between two and six digits.
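The escaping convention can be sketched in Python as follows. The
function names are hypothetical. The sketch assumes the input to the
decoder contains only code points that are valid in ISO/IEC 10646; a
reference above U+10FFFF would raise ValueError.

```python
import re

# Two to six hexadecimal digits between "&#x" and ";".
_NCR = re.compile(r"&#x([0-9A-Fa-f]{2,6});")

def decode_field_body(body: str) -> str:
    """Replace each Numeric Character Reference with its character."""
    return _NCR.sub(lambda m: chr(int(m.group(1), 16)), body)

def encode_field_body(text: str) -> str:
    """Escape AMPERSAND and every character outside US-ASCII."""
    out = []
    for ch in text:
        if ch == "&" or ord(ch) > 0x7E:
            out.append("&#x%X;" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)
```

Decoding "&#x20AC;" gives the EURO SIGN, and encoding "AT&T" gives
"AT&#x26;T".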
All fields whose field-body contains a date value use the "full-date" All fields whose field-body contains a date value use the "full-date"
format specified in RFC 3339 [15]. For example: "2004-06-28" format specified in [RFC3339]. For example: "2004-06-28" represents
represents June 28, 2004 in the Gregorian calendar. June 28, 2004 in the Gregorian calendar.
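A date field-body can be checked and compared with the Python standard
library. This is a sketch under stated assumptions: the function name
is hypothetical, and the explicit pattern is there because some
versions of date.fromisoformat accept forms other than the
"YYYY-MM-DD" shape of "full-date".

```python
import re
from datetime import date

# "full-date" shape: four-digit year, two-digit month, two-digit day.
_FULL_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")

def parse_full_date(body: str) -> date:
    """Parse a date field-body, rejecting anything but YYYY-MM-DD."""
    if not _FULL_DATE.fullmatch(body):
        raise ValueError("not a full-date: %r" % body)
    # Raises ValueError for impossible dates such as 2004-02-30.
    return date.fromisoformat(body)
```

Because the result is a date object, two values can be compared
directly, for example to decide which of two records is older.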
The first record in the file contains the single field whose field- The first record in the file contains the single field whose field-
name is "File-Date". The field-body of this record contains the last name is "File-Date". The field-body of this record contains the last
modification date of this copy of the registry, making it possible to modification date of this copy of the registry, making it possible to
compare different versions of the registry. The registry on the IANA compare different versions of the registry. The registry on the IANA
website is the most current. Versions with an older date than that website is the most current. Versions with an older date than that
one are not up-to-date. one are not up-to-date.
File-Date: 2004-06-28 File-Date: 2004-06-28
%% %%
skipping to change at page 22, line 19 skipping to change at page 20, line 19
* Deprecated's field-value contains the date the record was * Deprecated's field-value contains the date the record was
deprecated. deprecated.
o Prefix o Prefix
* Prefix's field-value contains a language tag with which this * Prefix's field-value contains a language tag with which this
subtag MAY be used to form a new language tag, perhaps with subtag MAY be used to form a new language tag, perhaps with
other subtags as well. This field MUST only appear in records other subtags as well. This field MUST only appear in records
whose 'Type' field-value is 'variant' or 'extlang'. For whose 'Type' field-value is 'variant' or 'extlang'. For
example, the 'Prefix' for the variant 'scouse' is 'en', meaning example, the 'Prefix' for the variant 'nedis' is 'sl', meaning
that the tags "en-scouse" and "en-GB-scouse" might be that the tags "sl-nedis" and "sl-IT-nedis" might be appropriate
appropriate while the tag "is-scouse" is not. while the tag "is-nedis" is not.
o Comments o Comments
* Comments contains additional information about the subtag, as * Comments contains additional information about the subtag, as
deemed appropriate for understanding the registry and deemed appropriate for understanding the registry and
implementing language tags using the subtag or tag. implementing language tags using the subtag or tag.
o Suppress-Script o Suppress-Script
* Suppress-Script contains a script subtag that SHOULD NOT be * Suppress-Script contains a script subtag that SHOULD NOT be
skipping to change at page 24, line 7 skipping to change at page 22, line 7
'nn'. 'nn'.
Records of type 'variant' MAY have more than one field of type Records of type 'variant' MAY have more than one field of type
'Prefix'. Additional fields of this type MAY be added to a 'variant' 'Prefix'. Additional fields of this type MAY be added to a 'variant'
record via the registration process. record via the registration process.
Records of type 'extlang' MUST have _exactly_ one 'Prefix' field. Records of type 'extlang' MUST have _exactly_ one 'Prefix' field.
The field-value of the 'Prefix' field consists of a language tag The field-value of the 'Prefix' field consists of a language tag
whose subtags are appropriate to use with this subtag. For example, whose subtags are appropriate to use with this subtag. For example,
the variant subtag 'scouse' has a Prefix field of "en". This means the variant subtag '1996' has a Prefix field of "de". This means
that tags starting with the sequence "en-" are most appropriate with that tags starting with the sequence "de-" are appropriate with this
this subtag, so "en-Latn-scouse" and "en-GB-scouse" are both subtag, so "de-Latg-1996" and "de-CH-1996" are both acceptable, while
acceptable, while the tag "fr-scouse" is an inappropriate choice. the tag "fr-1996" is an inappropriate choice.
The field of type 'Prefix' MUST NOT be removed from any record. The The field of type 'Prefix' MUST NOT be removed from any record. The
field-value for this type of field MUST NOT be modified. field-value for this type of field MUST NOT be modified.
The field 'Comments' MAY appear more than once per record. This The field 'Comments' MAY appear more than once per record. This
field MAY be inserted or changed via the registration process and no field MAY be inserted or changed via the registration process and no
guarantee of stability is provided. The content of this field is not guarantee of stability is provided. The content of this field is not
restricted, except by the need to register the information, the restricted, except by the need to register the information, the
suitability of the request, and by reasonable practical size suitability of the request, and by reasonable practical size
limitations. Long screeds about a particular subtag are frowned limitations. Long screeds about a particular subtag are frowned
skipping to change at page 24, line 48 skipping to change at page 22, line 48
Maintenance of the registry requires that as codes are assigned or Maintenance of the registry requires that as codes are assigned or
withdrawn by ISO 639, ISO 15924, ISO 3166, and UN M.49, the Language withdrawn by ISO 639, ISO 15924, ISO 3166, and UN M.49, the Language
Subtag Reviewer will evaluate each change, determine whether it Subtag Reviewer will evaluate each change, determine whether it
conflicts with existing registry entries, and submit the information conflicts with existing registry entries, and submit the information
to IANA for inclusion in the registry. If a change takes place and to IANA for inclusion in the registry. If a change takes place and
the Language Subtag Reviewer does not do this in a timely manner, the Language Subtag Reviewer does not do this in a timely manner,
then any interested party MAY use the procedure in Section 3.4 to then any interested party MAY use the procedure in Section 3.4 to
register the appropriate update. register the appropriate update.
Note: The redundant and grandfathered entries together are the Note: The redundant and grandfathered entries together are the
complete list of tags registered under RFC 3066 [24]. The redundant complete list of tags registered under [RFC3066]. The redundant tags
tags are those that can now be formed using the subtags defined in are those that can now be formed using the subtags defined in the
the registry together with the rules of Section 2.2. The registry together with the rules of Section 2.2. The grandfathered
grandfathered entries are those that can never be legal under those entries are those that can never be legal under those same
same provisions. provisions.
The set of redundant and grandfathered tags is permanent and stable: The set of redundant and grandfathered tags is permanent and stable:
no new entries will be added and none of the entries will be removed. no new entries will be added and none of the entries will be removed.
Records of type 'grandfathered' MAY have their type converted to Records of type 'grandfathered' MAY have their type converted to
'redundant': see Section 3.7 for more information. 'redundant': see Section 3.7 for more information.
RFC 3066 tags that were deprecated prior to the adoption of this RFC 3066 tags that were deprecated prior to the adoption of this
document are part of the list of grandfathered tags and their document are part of the list of grandfathered tags and their
component subtags were not included as registered variants (although component subtags were not included as registered variants (although
they remain eligible for registration). For example, the tag "art- they remain eligible for registration). For example, the tag "art-
skipping to change at page 25, line 40 skipping to change at page 23, line 40
Type: variant Type: variant
Subtag: nedis Subtag: nedis
Description: Natisone dialect Description: Natisone dialect
Description: Nadiza dialect Description: Nadiza dialect
Added: 2003-10-09 Added: 2003-10-09
Prefix: sl Prefix: sl
Comments: This is a comment shown Comments: This is a comment shown
as an example. as an example.
%% %%
Figure 6 Figure 4
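As an illustration only, the record layout shown in the figure above can be read with a few lines of Python. This is a minimal sketch, not part of either draft. It assumes that records are separated by "%%" lines, that each field is a "name: value" line, and that a line beginning with whitespace continues the previous field. The name parse_records and the SAMPLE text are hypothetical.

```python
def parse_records(text):
    """Split registry text into records; each record is a list of (name, value)."""
    records, current = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.strip() == "%%":
            # "%%" closes the current record (assumed separator)
            if current:
                records.append(current)
            current = []
        elif line[:1] in (" ", "\t") and current:
            # assumed folding rule: indented line continues the previous field
            name, value = current[-1]
            current[-1] = (name, value + " " + line.strip())
        elif ":" in line:
            name, _, value = line.partition(":")
            current.append((name.strip(), value.strip()))
    if current:
        records.append(current)
    return records


# Hypothetical sample mirroring the 'nedis' record in the figure.
SAMPLE = """\
Type: variant
Subtag: nedis
Description: Natisone dialect
Description: Nadiza dialect
Added: 2003-10-09
Prefix: sl
Comments: This is a comment shown
  as an example.
%%
"""

record = parse_records(SAMPLE)[0]
# A field name can repeat, so collect values rather than using a dict.
descriptions = [value for name, value in record if name == "Description"]
```

Fields are kept as a list of pairs because a record can carry the same field more than once, as the two 'Description' lines show.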
Whenever an entry is created or modified in the registry, the 'File- Whenever an entry is created or modified in the registry, the 'File-
Date' record at the start of the registry is updated to reflect the Date' record at the start of the registry is updated to reflect the
most recent modification date in the RFC 3339 [15] "full-date" most recent modification date in the [RFC3339] "full-date" format.
format.
Values in the 'Subtag' field MUST be lowercase except as provided for Values in the 'Subtag' field MUST be lowercase except as provided for
in Section 3.1. in Section 3.1.
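A consumer of the registry might want to check that a 'File-Date' or 'Added' value really is in the "full-date" form (four-digit year, two-digit month, two-digit day). The following is a sketch under that assumption; parse_full_date is a hypothetical helper name, and the strict regular expression is a design choice made here, not something the draft prescribes.

```python
import datetime
import re

# full-date: YYYY-MM-DD with ASCII digits only
FULL_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_full_date(value):
    """Return a datetime.date for a full-date string, or raise ValueError."""
    match = FULL_DATE.fullmatch(value)
    if match is None:
        raise ValueError(f"not in full-date form: {value!r}")
    year, month, day = (int(part) for part in match.groups())
    # datetime.date rejects impossible dates such as February 30
    return datetime.date(year, month, day)
```

The explicit pattern is used because looser parsers can accept unpadded values such as "2003-10-9", which are not in full-date form.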
3.3 Stability of IANA Registry Entries 3.3 Stability of IANA Registry Entries
The stability of entries and their meaning in the registry is The stability of entries and their meaning in the registry is
critical to the long term stability of language tags. The rules in critical to the long term stability of language tags. The rules in
this section guarantee that a specific language tag's meaning is this section guarantee that a specific language tag's meaning is
stable over time and will not change. stable over time and will not change.
skipping to change at page 30, line 5 skipping to change at page 27, line 49
Preferred-Value: Preferred-Value:
Deprecated: Deprecated:
Suppress-Script: Suppress-Script:
Comments: Comments:
4. Intended meaning of the subtag: 4. Intended meaning of the subtag:
5. Reference to published description 5. Reference to published description
of the language (book or article): of the language (book or article):
6. Any other relevant information: 6. Any other relevant information:
Figure 7 Figure 5
The subtag registration form MUST be sent to The subtag registration form MUST be sent to
<ietf-languages@iana.org> for a two week review period before it can <ietf-languages@iana.org> for a two week review period before it can
be submitted to IANA. (This is an open list and can be joined by be submitted to IANA. (This is an open list and can be joined by
sending a request to <ietf-languages-request@iana.org>.) sending a request to <ietf-languages-request@iana.org>.)
Variant and extlang subtags are always registered for use with a Variant and extlang subtags are always registered for use with a
particular range of language tags. For example, the subtag 'scouse' particular range of language tags. For example, the subtag 'rozaj'
is intended for use with language tags that start with the primary is intended for use with language tags that start with the primary
language subtag "en", since Scouse is a dialect of English. Thus the language subtag "sl", since Resian is a dialect of Slovenian. Thus
subtag 'scouse' could be included in tags such as "en-Latn-scouse" or the subtag 'rozaj' could be included in tags such as "sl-Latn-rozaj"
"en-GB-scouse". This information is stored in the "Prefix" field in or "sl-IT-rozaj". This information is stored in the "Prefix" field
the registry. Variant registration requests are REQUIRED to include in the registry. Variant registration requests are REQUIRED to
at least one "Prefix" field in the registration form. include at least one "Prefix" field in the registration form.
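The role of the "Prefix" field can be sketched in Python as a check that a tag begins with one of the prefixes recorded for a variant. This is an illustrative sketch only: the helper names and the PREFIXES table are hypothetical, the table holds just the 'rozaj' example from the newer draft text, and the comparison is assumed to be case-insensitive and made on whole subtags.

```python
def has_prefix(tag, prefix):
    """True if the subtags of 'prefix' are the leading subtags of 'tag'."""
    tag_parts = tag.lower().split("-")
    prefix_parts = prefix.lower().split("-")
    return tag_parts[:len(prefix_parts)] == prefix_parts


# Assumed sample data; a real table would be built from the registry.
PREFIXES = {"rozaj": ["sl"]}


def variant_fits(tag, variant, prefixes=PREFIXES):
    """True if 'tag' starts with any prefix recorded for 'variant'."""
    return any(has_prefix(tag, p) for p in prefixes.get(variant.lower(), []))
```

With this table, variant_fits("sl-Latn-rozaj", "rozaj") and variant_fits("sl-IT-rozaj", "rozaj") are true, while variant_fits("de-rozaj", "rozaj") is false. Comparing whole subtags rather than raw strings keeps a hypothetical tag such as "slx-rozaj" from matching the prefix "sl".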
The 'Prefix' field for a given registered subtag will be maintained The 'Prefix' field for a given registered subtag will be maintained
in the IANA registry as a guide to usage. Additional prefixes MAY be in the IANA registry as a guide to usage. Additional prefixes MAY be
added by filing an additional registration form. In that form, the added by filing an additional registration form. In that form, the
"Any other relevant information:" field MUST indicate that it is the "Any other relevant information:" field MUST indicate that it is the
addition of a prefix. addition of a prefix.
Requests to add a prefix to a variant subtag that imply a different Requests to add a prefix to a variant subtag that imply a different
semantic meaning will probably be rejected. For example, a request semantic meaning will probably be rejected. For example, a request
to add the prefix "de" to the subtag 'nedis' so that the tag "de- to add the prefix "de" to the subtag 'nedis' so that the tag "de-
skipping to change at page 31, line 24 skipping to change at page 29, line 22
Note that the reviewer can raise objections on the list if he or she Note that the reviewer can raise objections on the list if he or she
so desires. The important thing is that the objection MUST be made so desires. The important thing is that the objection MUST be made
publicly. publicly.
The applicant is free to modify a rejected application with The applicant is free to modify a rejected application with
additional information and submit it again; this restarts the two additional information and submit it again; this restarts the two
week comment period. week comment period.
Decisions made by the reviewer MAY be appealed to the IESG [RFC 2028] Decisions made by the reviewer MAY be appealed to the IESG [RFC 2028]
[9] under the same rules as other IETF decisions [RFC 2026] [8]. under the same rules as other IETF decisions [RFC2026].
All approved registration forms are available online in the directory All approved registration forms are available online in the directory
http://www.iana.org/numbers.html under "languages". http://www.iana.org/numbers.html under "languages".
Updates or changes to existing records, including previous Updates or changes to existing records, including previous
registrations, follow the same procedure as new registrations. The registrations, follow the same procedure as new registrations. The
Language Subtag Reviewer decides whether there is consensus to update Language Subtag Reviewer decides whether there is consensus to update
the registration following the two week review period; normally the registration following the two week review period; normally
objections by the original registrant will carry extra weight in objections by the original registrant will carry extra weight in
forming such a consensus. forming such a consensus.
skipping to change at page 32, line 29 skipping to change at page 30, line 29
with those authorities. If ISO 639 has previously rejected a with those authorities. If ISO 639 has previously rejected a
language for registration, it is reasonable to assume that there language for registration, it is reasonable to assume that there
must be additional very compelling evidence of need before it will must be additional very compelling evidence of need before it will
be registered in the IANA registry (to the extent that it is very be registered in the IANA registry (to the extent that it is very
unlikely that any subtags will be registered of this type). unlikely that any subtags will be registered of this type).
o Dialect or other divisions or variations within a language, its o Dialect or other divisions or variations within a language, its
orthography, writing system, regional or historical usage, orthography, writing system, regional or historical usage,
transliteration or other transformation, or distinguishing transliteration or other transformation, or distinguishing
variation MAY be registered as variant subtags. An example is the variation MAY be registered as variant subtags. An example is the
'scouse' subtag (the Scouse dialect of English). 'rozaj' subtag (the Resian dialect of Slovenian).
o The addition or maintenance of fields (generally of an o The addition or maintenance of fields (generally of an
informational nature) in Tag or Subtag records as described in informational nature) in Tag or Subtag records as described in
Section 3.1 and subject to the stability provisions in Section 3.1 and subject to the stability provisions in
Section 3.3. This includes descriptions; comments; deprecation Section 3.3. This includes descriptions; comments; deprecation
and preferred values for obsolete or withdrawn codes; or the and preferred values for obsolete or withdrawn codes; or the
addition of script or extlang information to primary language addition of script or extlang information to primary language
subtags. subtags.
o The addition of records and related field value changes necessary o The addition of records and related field value changes necessary
skipping to change at page 35, line 19 skipping to change at page 33, line 19
registration form to iesg@ietf.org, who will forward the request to registration form to iesg@ietf.org, who will forward the request to
iana@iana.org. The maintaining authority of the extension MUST iana@iana.org. The maintaining authority of the extension MUST
maintain the accuracy of the record by sending an updated full copy maintain the accuracy of the record by sending an updated full copy
of the record to iana@iana.org with the subject line "LANGUAGE TAG of the record to iana@iana.org with the subject line "LANGUAGE TAG
EXTENSION UPDATE" whenever content changes. Only the 'Comments', EXTENSION UPDATE" whenever content changes. Only the 'Comments',
'Contact_Email', 'Mailing_List', and 'URL' fields MAY be modified in 'Contact_Email', 'Mailing_List', and 'URL' fields MAY be modified in
these updates. these updates.
Failure to maintain this record, the corresponding registry, or meet Failure to maintain this record, the corresponding registry, or meet
other conditions imposed by this section of this document MAY be other conditions imposed by this section of this document MAY be
appealed to the IESG [RFC 2028] [9] under the same rules as other appealed to the IESG [RFC2028] under the same rules as other IETF
IETF decisions (see [8]) and MAY result in the authority to maintain decisions (see [RFC2026]) and MAY result in the authority to maintain
the extension being withdrawn or reassigned by the IESG. the extension being withdrawn or reassigned by the IESG.
%% %%
Identifier: Identifier:
Description: Description:
Comments: Comments:
Added: Added:
RFC: RFC:
Authority: Authority:
Contact_Email: Contact_Email:
Mailing_List: Mailing_List:
URL: URL:
%% %%
Figure 8: Format of Records in the Language Tag Extensions Registry Figure 6: Format of Records in the Language Tag Extensions Registry
'Identifier' contains the single letter subtag (singleton) assigned 'Identifier' contains the single letter subtag (singleton) assigned
to the extension. The Internet-Draft submitted to define the to the extension. The Internet-Draft submitted to define the
extension SHOULD specify which letter to use, although the IESG MAY extension SHOULD specify which letter to use, although the IESG MAY
change the assignment when approving the RFC. change the assignment when approving the RFC.
'Description' contains the name and description of the extension. 'Description' contains the name and description of the extension.
'Comments' is an OPTIONAL field and MAY contain a broader description 'Comments' is an OPTIONAL field and MAY contain a broader description
of the extension. of the extension.
'Added' contains the date the RFC was published in the "full-date" 'Added' contains the date the RFC was published in the "full-date"
format specified in RFC 3339 [15]. For example: 2004-06-28 format specified in [RFC3339]. For example: 2004-06-28 represents
represents June 28, 2004, in the Gregorian calendar. June 28, 2004, in the Gregorian calendar.
'RFC' contains the RFC number assigned to the extension. 'RFC' contains the RFC number assigned to the extension.
'Authority' contains the name of the maintaining authority for the 'Authority' contains the name of the maintaining authority for the
extension. extension.
'Contact_Email' contains the email address used to contact the 'Contact_Email' contains the email address used to contact the
maintaining authority. maintaining authority.
'Mailing_List' contains the URL or subscription email address of the 'Mailing_List' contains the URL or subscription email address of the
skipping to change at page 37, line 10 skipping to change at page 35, line 10
initial set of records represents no impact on IANA, since the work initial set of records represents no impact on IANA, since the work
to create it will be performed externally (as defined in this to create it will be performed externally (as defined in this
section). Future work will be limited to inserting or replacing section). Future work will be limited to inserting or replacing
whole records preformatted for IANA by the Language Subtag Reviewer. whole records preformatted for IANA by the Language Subtag Reviewer.
The initial registry will be created by the LTRU working group. The initial registry will be created by the LTRU working group.
Using the instructions in this document, the working group will Using the instructions in this document, the working group will
prepare an Informational RFC by creating a series of Internet-Drafts prepare an Informational RFC by creating a series of Internet-Drafts
containing the prototype registry according to the rules in Sections containing the prototype registry according to the rules in Sections
4.2.2 and 4.2.3 and subject to IESG review as described in Section 4.2.2 and 4.2.3 and subject to IESG review as described in Section
6.1.1 of RFC 2026 [8]. 6.1.1 of [RFC2026].
When the Internet-Draft containing the prototype registry has been When the Internet-Draft containing the prototype registry has been
approved by the IESG for publication as an RFC, the document will be approved by the IESG for publication as an RFC, the document will be
forwarded to IANA, which will post the contents of the new registry forwarded to IANA, which will post the contents of the new registry
on-line. on-line.
Tags in the RFC 3066 registry that are not deprecated that consist Tags in the RFC 3066 registry that are not deprecated that consist
entirely of subtags that are valid under this document and which have entirely of subtags that are defined by this document and which have
the correct form and format for tags defined by this document are the correct form and format for tags defined by this document are
superseded by this document. Such tags are placed in records of type superseded by this document. Such tags MUST be placed in records of
'redundant' in the registry. For example, "zh-Hant" is now defined type 'redundant' in the registry. For example, "zh-Hant" is now
by this document. defined by this document because 'zh' is an ISO 639-1 code and 'Hant'
is an ISO 15924 code and both are defined in the registry.
All other tags in the RFC 3066 registry that are deprecated will be Tags in the RFC 3066 registry that contain one or more subtags that
maintained as grandfathered entries. The record for the do not match the valid registration pattern or which are not
grandfathered entry will contain a 'Deprecated' field with the most otherwise defined by this document MUST have records of type
appropriate date that can be determined for when the record was 'grandfathered' created in the registry. These records cannot become
deprecated. The 'Comments' field will contain the reason for the type 'redundant' except by revision of this document, but MAY have a
deprecation. The 'Preferred-Value' field will contain the tag that 'Deprecated' and 'Preferred-Value' field added to them if a subtag
replaces the value. For example, the tag "art-lojban" is deprecated assignment or combination of assignments renders the tag obsolete.
and will be placed in the grandfathered section. Its 'Deprecated'
field will contain the deprecation date (in this case "2003-09-02")
and the 'Preferred-Value' field the value "jbo".
Tags that are not deprecated and which contain subtags which are Tags in the RFC 3066 registry that have a notation that they are
consistent with registration under the guidelines in this document deprecated MUST be maintained as grandfathered entries. The record
will not automatically have a new subtag registration created for for the grandfathered entry MUST contain a 'Deprecated' field with
each eligible subtag. Interested parties MAY use the registration the most appropriate date that can be determined for when the RFC
process in Section 3.4 to register these subtags. If all of the 3066 record was deprecated. The 'Comments' field SHOULD contain the
subtags in the original tag become fully defined by the resulting reason for the deprecation. The 'Preferred-Value' field MAY contain
registrations, then the original tag is superseded by this document. a tag that replaces the value. For example, the tag "art-lojban" is
Such tags will have their record changed from type 'grandfathered' to deprecated and will be placed in the grandfathered section. Its
type 'redundant' in the registry. For example, the subtag 'boont' 'Deprecated' field will contain the deprecation date (in this case
could be registered, resulting in the change of the grandfathered tag "2003-09-02") and the 'Preferred-Value' field the value "jbo".
"en-boont" to type redundant in the registry.
Tags that contain one or more subtags that do not match the valid The remaining tags in the RFC 3066 registry are not deprecated, have
registration pattern and which are not otherwise defined by this a format consistent with language tags as defined by this document,
document will have records of type 'grandfathered' created in the but contain subtags which are not defined by ISO 639, ISO 15924, or
registry. These records cannot become type 'redundant', but MAY have ISO 3166. These subtags are consistent with registration as
a 'Deprecated' and 'Preferred-Value' field added to them if a subtag variants. The initial registry SHALL contain appropriate variant
assignment or combination of assignments renders the tag obsolete. records for the following subtags, and registered RFC 3066 tags
containing these subtags MUST be entered into the initial registry as
type 'redundant':
1901 (use with Prefix: de)
1996 (use with Prefix: de)
nedis (use with Prefix: sl)
rozaj (use with Prefix: sl)
All remaining RFC 3066 registered tags MUST be entered into the
initial registry in records of type 'grandfathered'. Interested
parties MAY use the registration process in Section 3.4 in an attempt
to register the variant subtags not already present in the registry.
If all of the subtags in the original tag become fully defined by the
resulting registrations, then the original tag is superseded by this
document. Such tags MUST have their record changed from type
'grandfathered' to type 'redundant' in the registry. Note that
previous approval of a tag under RFC 3066 is no guarantee of approval
of a variant subtag under this document. The existing RFC 3066 tag
maintains its validity, but the original reason for its registration
might have become obsolete. For example, the subtag 'boont' could be
registered, resulting in the change of the grandfathered tag "en-
boont" to type redundant in the registry.
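The conversion rules above can be approximated by a small decision function. The sketch below is deliberately simplified and is not the procedure the working group would run: it assumes the caller already knows whether a tag is marked deprecated and which subtags are defined in the registry, and it ignores the syntax and subtag-position checks that the real rules also require. The function name and the DEFINED set are hypothetical sample data.

```python
def classify(tag, deprecated, defined_subtags):
    """Return the assumed record type for an RFC 3066 registered tag."""
    if deprecated:
        # deprecated tags are kept as grandfathered entries
        return "grandfathered"
    subtags = tag.lower().split("-")
    if all(subtag in defined_subtags for subtag in subtags):
        # every subtag is defined, so the tag can now be formed from subtags
        return "redundant"
    return "grandfathered"


# Assumed subset of defined subtags, lowercased for comparison.
DEFINED = {"zh", "hant", "en", "de", "1901"}
```

Under these assumptions, "zh-Hant" and "de-1901" classify as 'redundant', "art-lojban" (deprecated) as 'grandfathered', and "en-boont" as 'grandfathered' until 'boont' is added to the defined set, at which point it becomes 'redundant'.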
There MUST be a reasonable period in which the community can comment There MUST be a reasonable period in which the community can comment
on the proposed list entries, which SHALL be no less than four weeks on the proposed list entries, which SHALL be no less than four weeks
in length. At the completion of this period, the chair(s) will in length. At the completion of this period, the chair(s) will
notify iana@iana.org and the ltru and ietf-languages mail lists that notify iana@iana.org and the ltru and ietf-languages mail lists that
the task is complete and forward the necessary materials to IANA for the task is complete and forward the necessary materials to IANA for
publication. publication.
Registrations that are in process under the rules defined in RFC 3066 Registrations that are in process under the rules defined in RFC 3066
MAY be completed under the former rules, at the discretion of the MAY be completed under the former rules, at the discretion of the
language tag reviewer. Any new registrations submitted after the language tag reviewer. Any new registrations submitted after the
request for conversion of the registry MUST be rejected. request for conversion of the registry MUST be rejected. New
registrations completed under RFC 3066 SHALL be entered into the
initial registry using the rules defined just above.
All existing RFC 3066 language tag registrations will be maintained All existing RFC 3066 language tag registrations will be maintained
in perpetuity. in perpetuity.
Users of tags that are grandfathered SHOULD consider registering Users of tags that are grandfathered SHOULD consider registering
appropriate subtags in the IANA subtag registry (but are NOT REQUIRED appropriate subtags in the IANA subtag registry (but are NOT REQUIRED
to). to).
UN numeric codes assigned to 'macro-geographical (continental)' MUST UN numeric codes assigned to 'macro-geographical (continental)' MUST
be defined in the IANA registry and made valid for use in language be defined in the IANA registry and made valid for use in language
skipping to change at page 39, line 7 skipping to change at page 38, line 7
not removed. Changes in meaning or assignment of a subtag are not removed. Changes in meaning or assignment of a subtag are
permitted during this process (for example, the ISO 3166 code 'CS' permitted during this process (for example, the ISO 3166 code 'CS'
was originally assigned to 'Czechoslovakia' and is now assigned to was originally assigned to 'Czechoslovakia' and is now assigned to
'Serbia and Montenegro'). This continues up to the date that this 'Serbia and Montenegro'). This continues up to the date that this
document was adopted. The resulting set of records is added to the document was adopted. The resulting set of records is added to the
registry. Future changes or additions to this portion of the registry. Future changes or additions to this portion of the
registry are governed by the provisions of this document. registry are governed by the provisions of this document.
4. Formation and Processing of Language Tags 4. Formation and Processing of Language Tags
This section addresses how to use the registry with the language tag This section addresses how to use the information in the registry
format to choose, form and process language tags. with the tag syntax to choose, form and process language tags.
4.1 Choice of Language Tag 4.1 Choice of Language Tag
One is sometimes faced with the choice between several possible tags One is sometimes faced with the choice between several possible tags
for the same body of text. for the same body of text.
Interoperability is best served when all users use the same language Interoperability is best served when all users use the same language
tag in order to represent the same language. If an application has tag in order to represent the same language. If an application has
requirements that make the rules here inapplicable, then that requirements that make the rules here inapplicable, then that
application risks damaging interoperability. It is strongly application risks damaging interoperability. It is strongly
RECOMMENDED that users not define their own rules for language tag RECOMMENDED that users not define their own rules for language tag
choice. choice.
Subtags SHOULD only be used where they add useful distinguishing
information; extraneous subtags interfere with the meaning,
understanding, and processing of language tags. In particular, users
and implementations SHOULD follow the 'Prefix' and 'Suppress-Script'
fields in the registry (defined in Section 3.1): these fields provide
guidance on when specific additional subtags SHOULD (and SHOULD NOT)
be used in a language tag.
Of particular note, many applications can benefit from the use of Of particular note, many applications can benefit from the use of
script subtags in language tags, as long as the use is consistent for script subtags in language tags, as long as the use is consistent for
a given context. Script subtags were not formally defined in RFC a given context. Script subtags were not formally defined in RFC
3066 and their use can affect matching and subtag identification by 3066 and their use can affect matching and subtag identification by
implementations of RFC 3066, as these subtags appear between the implementations of RFC 3066, as these subtags appear between the
primary language and region subtags. For example, if a user requests primary language and region subtags. For example, if a user requests
content in an implementation of Section 2.5 of RFC 3066 [24] using content in an implementation of Section 2.5 of [RFC3066] using the
the language range "en-US", content labeled "en-Latn-US" will not language range "en-US", content labeled "en-Latn-US" will not match
match the request. Therefore it is important to know when script the request. Therefore it is important to know when script subtags
subtags will customarily be used and when they ought not be used. In will customarily be used and when they ought not be used. In the
the registry, the Suppress-Script field helps ensure greater registry, the Suppress-Script field helps ensure greater
compatibility between the language tags generated according to the compatibility between the language tags generated according to the
rules in this document and language tags and tag processors or rules in this document and language tags and tag processors or
consumers based on RFC 3066 by defining when users SHOULD NOT include consumers based on RFC 3066 by defining when users SHOULD NOT include
a script subtag with a particular primary language subtag. a script subtag with a particular primary language subtag.
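The "en-US" example can be reproduced with a sketch of the RFC 3066 Section 2.5 matching rule: a language range matches a tag if it equals the tag, or if it equals a prefix of the tag and the next character of the tag is "-". The function name is hypothetical and the code is an illustration of that rule, not a complete implementation of any matching scheme.

```python
def rfc3066_matches(language_range, tag):
    """Prefix matching in the style of RFC 3066 Section 2.5 (case-insensitive)."""
    if language_range == "*":
        return True  # the special range "*" matches any tag
    wanted = language_range.lower()
    candidate = tag.lower()
    return candidate == wanted or candidate.startswith(wanted + "-")
```

Here rfc3066_matches("en-US", "en-Latn-US") is false because the script subtag sits between the language and region subtags, whereas rfc3066_matches("en", "en-Latn-US") is true.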
Extended language subtags (type 'extlang' in the registry, see Extended language subtags (type 'extlang' in the registry, see
Section 3.1) also appear between the primary language and region Section 3.1) also appear between the primary language and region
subtags and are reserved for future standardization. Applications subtags and are reserved for future standardization. Applications
might benefit from their judicious use in forming language tags in might benefit from their judicious use in forming language tags in
the future. Similar recommendations are expected to apply to their the future. Similar recommendations are expected to apply to their
skipping to change at page 41, line 5 skipping to change at page 40, line 13
whenever the protocol allows the separate tags for multiple whenever the protocol allows the separate tags for multiple
languages, as is the case for the Content-Language header in languages, as is the case for the Content-Language header in
HTTP. The 'mul' subtag conveys little useful information: HTTP. The 'mul' subtag conveys little useful information:
content in multiple languages SHOULD individually tag the content in multiple languages SHOULD individually tag the
languages where they appear or otherwise indicate the actual languages where they appear or otherwise indicate the actual
language in preference to the 'mul' subtag. language in preference to the 'mul' subtag.
6. The same variant subtag SHOULD NOT be used more than once within 6. The same variant subtag SHOULD NOT be used more than once within
a language tag. a language tag.
* For example, do not use "en-GB-scouse-scouse". * For example, do not use "de-DE-1901-1901".
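A checker for the last rule above might look like the following sketch. It assumes the caller supplies the set of registered variant subtags (lowercased), since telling variants apart from other subtags requires the registry; the function name and the sample set are hypothetical.

```python
from collections import Counter


def repeated_variants(tag, known_variants):
    """Return the variant subtags that occur more than once in 'tag'."""
    subtags = tag.lower().split("-")
    counts = Counter(s for s in subtags if s in known_variants)
    return sorted(s for s, n in counts.items() if n > 1)


# Assumed sample data: two variants named in the newer draft text.
VARIANTS = {"1901", "1996"}
```

For example, repeated_variants("de-DE-1901-1901", VARIANTS) reports ['1901'], while a tag that uses each variant once yields an empty list.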
To ensure consistent backward compatibility, this document contains To ensure consistent backward compatibility, this document contains
several provisions to account for potential instability in the several provisions to account for potential instability in the
standards used to define the subtags that make up language tags. standards used to define the subtags that make up language tags.
These provisions mean that no language tag created under the rules in These provisions mean that no language tag created under the rules in
this document will become obsolete. this document will become obsolete.
4.2 Meaning of the Language Tag 4.2 Meaning of the Language Tag
The language tag always defines a language as spoken (or written,
signed or otherwise signaled) by human beings for communication of
information to other human beings. Computer languages such as
programming languages are explicitly excluded.
If a language tag B contains language tag A as a prefix, then B is
typically "narrower" or "more specific" than A. For example, "zh-
Hant-TW" is more specific than "zh-Hant".
This relationship is not guaranteed in all cases: specifically,
languages that begin with the same sequence of subtags are NOT
guaranteed to be mutually intelligible, although they might be. For
example, the tag "az" shares a prefix with both "az-Latn"
(Azerbaijani written using the Latin script) and "az-Cyrl"
(Azerbaijani written using the Cyrillic script). A person fluent in
one script might not be able to read the other, even though the text
might be identical. Content tagged as "az" most probably is written
in just one script and thus might not be intelligible to a reader
familiar with the other script.
The relationship between the tag and the information it relates to is The relationship between the tag and the information it relates to is
defined by the standard describing the context in which it appears. defined by the context in which the tag appears. Accordingly,
Accordingly, this section can only give possible examples of its this section can only give possible examples of its usage.
usage.
o For a single information object, the associated language tags o For a single information object, the associated language tags
might be interpreted as the set of languages that is necessary for might be interpreted as the set of languages that is necessary for
a complete comprehension of the complete object. Example: Plain a complete comprehension of the complete object. Example: Plain
text documents. text documents.
o For an aggregation of information objects, the associated language o For an aggregation of information objects, the associated language
tags could be taken as the set of languages used inside components tags could be taken as the set of languages used inside components
of that aggregation. Examples: Document stores and libraries. of that aggregation. Examples: Document stores and libraries.
skipping to change at page 42, line 21 skipping to change at page 41, line 9
structure (including the whole document itself). For example, one structure (including the whole document itself). For example, one
could write <span lang="fr">C'est la vie.</span> inside a could write <span lang="fr">C'est la vie.</span> inside a
Norwegian document; the Norwegian-speaking user could then access Norwegian document; the Norwegian-speaking user could then access
a French-Norwegian dictionary to find out what the marked section a French-Norwegian dictionary to find out what the marked section
meant. If the user were listening to that document through a meant. If the user were listening to that document through a
speech synthesis interface, this formation could be used to signal speech synthesis interface, this formation could be used to signal
the synthesizer to appropriately apply French text-to-speech the synthesizer to appropriately apply French text-to-speech
pronunciation rules to that span of text, instead of applying the pronunciation rules to that span of text, instead of applying the
inappropriate Norwegian rules. inappropriate Norwegian rules.
4.3 Canonicalization of Language Tags Language tags are related when they contain a similar sequence of
subtags. For example, if a language tag B contains language tag A as
a prefix, then B is typically "narrower" or "more specific" than A.
Thus "zh-Hant-TW" is more specific than "zh-Hant".
This relationship is not guaranteed in all cases: specifically,
languages that begin with the same sequence of subtags are NOT
guaranteed to be mutually intelligible, although they might be. For
example, the tag "az" shares a prefix with both "az-Latn"
(Azerbaijani written using the Latin script) and "az-Cyrl"
(Azerbaijani written using the Cyrillic script). A person fluent in
one script might not be able to read the other, even though the text
might be identical. Content tagged as "az" most probably is written
in just one script and thus might not be intelligible to a reader
familiar with the other script.
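
The prefix relationship described above can be sketched as a comparison of whole subtags. This is a minimal illustration rather than part of the specification: the function name `is_more_specific` is hypothetical, and the sketch assumes only that tags are compared case-insensitively and split on "-". As the text cautions, a positive result says nothing about mutual intelligibility.

```python
def is_more_specific(tag_b: str, tag_a: str) -> bool:
    """Return True if tag_a is a proper subtag-wise prefix of tag_b.

    Hypothetical helper for illustration only. Comparison is done on
    whole subtags, so "zh" is not treated as a prefix of "zhx".
    """
    a = tag_a.lower().split("-")
    b = tag_b.lower().split("-")
    # tag_b is narrower only if it has extra subtags after the shared prefix.
    return len(b) > len(a) and b[:len(a)] == a


print(is_more_specific("zh-Hant-TW", "zh-Hant"))  # narrower tag
print(is_more_specific("az-Latn", "az"))          # shares the "az" prefix
print(is_more_specific("az-Latn", "az-Cyrl"))     # siblings, not prefix-related
```

Splitting on "-" before comparing avoids false matches that a plain string-prefix test would produce when one subtag merely begins with the characters of another.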
4.3 Length Considerations
[RFC3066] did not provide an upper limit on the size of language
tags. While RFC 3066 did define the semantics of particular subtags
in such a way that most language tags consisted of language and
region subtags with a combined total length of up to six characters,
larger registered tags were not only possible but were actually
registered.
Neither the language tag syntax nor other requirements in this
document impose a fixed upper limit on the number of subtags in a
language tag (and thus an upper bound on the size of a tag). The
language tag syntax suggests that, depending on the specific
language, more subtags (and thus a longer tag) are sometimes
necessary to completely identify the language for certain
applications; thus it is possible to envision long or complex subtag
sequences.
4.3.1 Working with Limited Buffer Sizes
Some applications and protocols are forced to allocate fixed buffer
sizes or otherwise limit the length of a language tag. A conformant
implementation or specification MAY refuse to support the storage of
language tags which exceed a specified length. Any such limitation
SHOULD be clearly documented, and such documentation SHOULD include
what happens to longer tags (for example, whether an error value is
generated or the language tag is truncated). A protocol that allows
tags to be truncated at an arbitrary limit, without giving any
indication of what that limit is, has the potential for causing harm
by changing the meaning of tags in substantial ways.
In practice, most language tags do not require more than a few
subtags and will not approach reasonably sized buffer limitations:
see Section 4.1.
Some specifications or protocols have limits on tag length but do not
have a fixed length limitation. For example, [RFC2231] has no
explicit length limitation: the length available for the language tag
is constrained by the length of other header components (such as the
charset's name) coupled with the 76 character limit in [RFC2047].
Thus the "limit" might be 50 or more characters, but it could
potentially be quite small.
The considerations for assigning a buffer limit are:
Implementations SHOULD NOT truncate language tags unless the
meaning of the tag is purposefully being changed, or unless the
tag does not fit into a limited buffer size specified by a
protocol for storage or transmission.
Implementations SHOULD warn the user when a tag is truncated since
truncation changes the semantic meaning of the tag.
Implementations of protocols or specifications that are space
constrained but do not have a fixed limit SHOULD use the longest
possible tag in preference to truncation.
Protocols or specifications that specify limited buffer sizes for
language tags MUST allow for language tags of up to 33 characters.
Protocols or specifications that specify limited buffer sizes for
language tags SHOULD allow for language tags of at least 42
characters.
The following illustration shows how the 42-character recommendation
was derived. The combination of language and extended language
subtags was chosen for future compatibility. At up to 15 characters,
this combination is longer than the longest possible primary language
subtag (8 characters):
language = 3 (ISO 639-2; ISO 639-1 requires 2)
extlang1 = 4 (each subsequent subtag includes '-')
extlang2 = 4 (unlikely: needs prefix="language-extlang1")
extlang3 = 4 (extremely unlikely)
script = 5 (if not suppressed: see Section 4.1)
region = 4 (UN M.49; ISO 3166 requires 3)
variant1 = 9 (MUST have language as a prefix)
variant2 = 9 (MUST have language-variant1 as a prefix)
total = 42 characters
Figure 7: Derivation of the Limit on Tag Length
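
The arithmetic in Figure 7 can be checked with a short sketch. The per-subtag lengths are taken directly from the figure; the sample tag is a made-up placeholder of maximal shape (its subtags are not registered values) and is included only to show that such a tag occupies exactly 42 characters.

```python
# Maximum characters contributed by each subtag, as listed in Figure 7.
# Every entry after "language" includes the leading "-".
PARTS = [
    ("language", 3),
    ("extlang1", 4),
    ("extlang2", 4),
    ("extlang3", 4),
    ("script", 5),
    ("region", 4),
    ("variant1", 9),
    ("variant2", 9),
]

total = sum(length for _, length in PARTS)
lang_plus_extlang = sum(length for name, length in PARTS
                        if name.startswith(("language", "extlang")))

# Placeholder tag with the same shape; the subtags are hypothetical.
sample = "abc-def-ghi-jkl-Latn-001-variant1-variant2"

print(total, lang_plus_extlang, len(sample))
```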
4.3.2 Truncation of Language Tags
Truncation of a language tag alters the meaning of the tag, and thus
SHOULD be avoided. However, truncation of language tags is sometimes
necessary due to limited buffer sizes. Such truncation MUST NOT
permit a subtag to be chopped off in the middle or the formation of
invalid tags (for example, one ending with the "-" character).
This means that applications or protocols which truncate tags MUST do
so by progressively removing subtags along with their preceding "-"
from the right side of the language tag until the tag is short enough
for the given buffer. If the resulting tag ends with a single-
character subtag, that subtag and its preceding "-" MUST also be
removed. For example:
Tag to truncate: zh-Latn-CN-variant1-a-extend1-x-wadegile-private1
1. zh-Latn-CN-variant1-a-extend1-x-wadegile
2. zh-Latn-CN-variant1-a-extend1
3. zh-Latn-CN-variant1
4. zh-Latn-CN
5. zh-Latn
6. zh
Figure 8: Example of Tag Truncation
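
The truncation rule above can be sketched as follows. This is one possible reading of the rule, not a normative implementation: the function name `truncate_tag` and the `max_len` parameter are hypothetical, and the sketch assumes its input is already a well-formed tag. It never removes the first subtag, so a caller with a buffer smaller than the primary subtag would still need to handle that case separately (for example, by reporting an error).

```python
def truncate_tag(tag: str, max_len: int) -> str:
    """Shorten `tag` to at most `max_len` characters by dropping whole
    subtags from the right (illustrative sketch only)."""
    subtags = tag.split("-")
    while len(subtags) > 1 and len("-".join(subtags)) > max_len:
        # Remove the rightmost subtag together with its preceding "-".
        subtags.pop()
        # Do not leave a dangling single-character subtag (such as an
        # extension singleton or "x") at the end of the tag.
        while len(subtags) > 1 and len(subtags[-1]) == 1:
            subtags.pop()
    return "-".join(subtags)


tag = "zh-Latn-CN-variant1-a-extend1-x-wadegile-private1"
for limit in (40, 39, 20, 10, 7, 6):
    print(limit, truncate_tag(tag, limit))
```

Working on a list of subtags rather than slicing the string guarantees that no subtag is cut in the middle and that the result never ends with "-".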
4.4 Canonicalization of Language Tags
Since a particular language tag is sometimes used by many processes, Since a particular language tag is sometimes used by many processes,
language tags SHOULD always be created or generated in a canonical language tags SHOULD always be created or generated in a canonical
form. form.
A language tag is in canonical form when: A language tag is in canonical form when:
1. The tag is well-formed according to the rules in Section 2.1 and 1. The tag is well-formed according to the rules in Section 2.1 and
Section 2.2. Section 2.2.
skipping to change at page 44, line 12 skipping to change at page 45, line 36
define how the order of the extension's subtags are interpreted. For define how the order of the extension's subtags are interpreted. For
example, an extension could define that its subtags are in canonical example, an extension could define that its subtags are in canonical
order when the subtags are placed into ASCII order: that is, "en-a- order when the subtags are placed into ASCII order: that is, "en-a-
aaa-bbb-ccc" instead of "en-a-ccc-bbb-aaa". Another extension might aaa-bbb-ccc" instead of "en-a-ccc-bbb-aaa". Another extension might
define that the order of the subtags influences their semantic define that the order of the subtags influences their semantic
meaning (so that "en-b-ccc-bbb-aaa" has a different value from "en-b- meaning (so that "en-b-ccc-bbb-aaa" has a different value from "en-b-
aaa-bbb-ccc"). However, extension specifications SHOULD be designed aaa-bbb-ccc"). However, extension specifications SHOULD be designed
so that they are tolerant of the typical processes described in so that they are tolerant of the typical processes described in
Section 3.6. Section 3.6.
4.4 Considerations for Private Use Subtags 4.5 Considerations for Private Use Subtags
Private-use subtags require private agreement between the parties Private-use subtags require private agreement between the parties
that intend to use or exchange language tags that use them and great that intend to use or exchange language tags that use them and great
caution SHOULD be used in employing them in content or protocols caution SHOULD be used in employing them in content or protocols
intended for general use. Private-use subtags are simply useless for intended for general use. Private-use subtags are simply useless for
information exchange without prior arrangement. information exchange without prior arrangement.
The value and semantic meaning of private-use tags and of the subtags The value and semantic meaning of private-use tags and of the subtags
used within such a language tag are not defined by this document. used within such a language tag are not defined by this document.
skipping to change at page 45, line 10 skipping to change at page 47, line 10
interact with other systems in a different and possibly unsuitable interact with other systems in a different and possibly unsuitable
manner compared to tags that use opaque, privately defined subtags, manner compared to tags that use opaque, privately defined subtags,
so the choice of the best approach sometimes depends on the so the choice of the best approach sometimes depends on the
particular domain in question. particular domain in question.
5. IANA Considerations 5. IANA Considerations
This section deals with the processes and requirements necessary for This section deals with the processes and requirements necessary for
IANA to undertake to maintain the subtag and extension registries as IANA to undertake to maintain the subtag and extension registries as
defined by this document and in accordance with the requirements of defined by this document and in accordance with the requirements of
RFC 2434 [12]. [RFC2434].
The impact on the IANA maintainers of the two registries defined by The impact on the IANA maintainers of the two registries defined by
this document will be a small increase in the frequency of new this document will be a small increase in the frequency of new
entries or updates. entries or updates.
Upon adoption of this document, the process described in Section 3.7 Upon adoption of this document, the process described in Section 3.7
will be used to generate the initial Language Subtag Registry. The will be used to generate the initial Language Subtag Registry. The
initial set of records represents no impact on IANA, since the work initial set of records represents no impact on IANA, since the work
to create it will be performed externally (as defined in that to create it will be performed externally (as defined in that
section). The new registry will be listed under "Language Tags" at section). The new registry will be listed under "Language Tags" at
skipping to change at page 46, line 17 skipping to change at page 48, line 17
Language tags used in content negotiation, like any other information Language tags used in content negotiation, like any other information
exchanged on the Internet, might be a source of concern because they exchanged on the Internet, might be a source of concern because they
might be used to infer the nationality of the sender, and thus might be used to infer the nationality of the sender, and thus
identify potential targets for surveillance. identify potential targets for surveillance.
This is a special case of the general problem that anything sent is This is a special case of the general problem that anything sent is
visible to the receiving party and possibly to third parties as well. visible to the receiving party and possibly to third parties as well.
It is useful to be aware that such concerns can exist in some cases. It is useful to be aware that such concerns can exist in some cases.
The evaluation of the exact magnitude of the threat, and any possible The evaluation of the exact magnitude of the threat, and any possible
countermeasures, is left to each application protocol (see BCP 72, countermeasures, is left to each application protocol (see BCP 72
RFC 3552 [16] for best current practice guidance on security threats [RFC3552] for best current practice guidance on security threats and
and defenses). defenses).
The language tag associated with a particular information item is of The language tag associated with a particular information item is of
no consequence whatsoever in determining whether that content might no consequence whatsoever in determining whether that content might
contain possible homographs. The fact that a text is tagged as being contain possible homographs. The fact that a text is tagged as being
in one language or using a particular script subtag provides no in one language or using a particular script subtag provides no
assurance whatsoever that it does not contain characters from scripts assurance whatsoever that it does not contain characters from scripts
other than the one(s) associated with or specified by that language other than the one(s) associated with or specified by that language
tag. tag.
Since there is no limit to the number of variant, private use, and Since there is no limit to the number of variant, private use, and
extension subtags, and consequently no limit on the possible length extension subtags, and consequently no limit on the possible length
of a tag, implementations need to guard against buffer overflow of a tag, implementations need to guard against buffer overflow
attacks. See Section 2.1.1 for details on language tag truncation, attacks. See Section 4.3 for details on language tag truncation,
which can occur as a consequence of defenses against buffer overflow. which can occur as a consequence of defenses against buffer overflow.
Although the specification of valid subtags for an extension (see: Although the specification of valid subtags for an extension (see:
Section 3.6) MUST be available over the Internet, implementations Section 3.6) MUST be available over the Internet, implementations
SHOULD NOT mechanically depend on it being always accessible, to SHOULD NOT mechanically depend on it being always accessible, to
prevent denial-of-service attacks. prevent denial-of-service attacks.
7. Character Set Considerations 7. Character Set Considerations
The syntax in this document requires that language tags use only the The syntax in this document requires that language tags use only the
skipping to change at page 48, line 14 skipping to change at page 50, line 14
8. Changes from RFC 3066 8. Changes from RFC 3066
The main goals for this revision of language tags were the following: The main goals for this revision of language tags were the following:
*Compatibility.* All valid RFC 3066 language tags (including those *Compatibility.* All valid RFC 3066 language tags (including those
in the IANA registry) remain valid in this specification. Thus in the IANA registry) remain valid in this specification. Thus
there is complete backward compatibility of this specification with there is complete backward compatibility of this specification with
existing content. In addition, this document defines language tags existing content. In addition, this document defines language tags
in such a way as to ensure future compatibility, and processors in such a way as to ensure future compatibility, and processors
based solely on the RFC 3066 ABNF (such as those described in XML based solely on the RFC 3066 ABNF (such as those described in
Schema version 1.0 [20]) will be able to process tags described by [XMLSchema]) will be able to process tags described by this document.
this document.
*Stability.* Because of the changes in underlying ISO standards, a *Stability.* Because of the changes in underlying ISO standards, a
valid RFC 3066 language tag may become invalid (or have its meaning valid RFC 3066 language tag may become invalid (or have its meaning
change) at a later date. With so much of the world's computing change) at a later date. With so much of the world's computing
infrastructure dependent on language tags, this is simply infrastructure dependent on language tags, this is simply
unacceptable: it invalidates content that may have an extensive unacceptable: it invalidates content that may have an extensive
shelf-life. In this specification, once a language tag is valid, it shelf-life. In this specification, once a language tag is valid, it
remains valid forever. Previously, there was no way to determine remains valid forever. Previously, there was no way to determine
when two tags were equivalent. This specification provides a stable when two tags were equivalent. This specification provides a stable
mechanism for doing so, through the use of canonical forms. These mechanism for doing so, through the use of canonical forms. These
skipping to change at page 50, line 24 skipping to change at page 52, line 21
region subtags respectively. region subtags respectively.
o Adds a well-defined extension mechanism. o Adds a well-defined extension mechanism.
o Defines an extended language subtag, possibly for use with certain o Defines an extended language subtag, possibly for use with certain
anticipated features of ISO 639-3. anticipated features of ISO 639-3.
Ed Note: The following items are provided for the convenience of Ed Note: The following items are provided for the convenience of
reviewers and will be removed from the final document. reviewers and will be removed from the final document.
Changes between draft-ietf-ltru-registry-04 and this version are: Changes between draft-ietf-ltru-registry-05 and this version are:
o Changes to Section 2.1.1. Incorporated Frank Ellermann's text o Changes to the initial population rules to pre-register four
about RFC 2231 and modified some conformance criteria. (#944) subtags. This included changing all the variant examples to use
just those four subtags (nedis, rozaj, 1996, and 1901) in
appropriate ways. It also includes substantial wordsmithing of
the rules on handling RFC 3066 grandfathered/redundant
registrations (A.Phillips)
o Changed Section 2.2.4 and added UN M.49 to the list of standards o Rewrote the introduction to use "tag" instead of many (long,
monitored for changes in Section 3.4, plus added some additional convoluted) synonyms and to generally simplify the text. (thread
squirms to Section 3.3 to ensure that ISO-3166-less UN M.49 codes of #944) (M.Duerst, A.Phillips)
are not registered automagically but may be registered by
individuals given inaction on the part of ISO 3166 for 180 days.
Also made the assignments of UN M.49 codes in Section 2.2.4
normative (MUST instead of 'are'). Finally, the initial rules
were modified to reflect the foregoing in Section 3.7. (#1026)
(D.Ewell, P.Constable, A.Phillips)
o Added text to Section 3.5 allowing new entries and other changes o Added an introduction to Section 2 (moved from Section 4.2).
per the rules in Section 3.3 (A.Phillips) (M.Duerst)
o Added text to Section 2.2.4 and Section 3.3 forbidding the o Reorganized the resulting Section 4.2.
registration of UN M.49 country or area codes not assigned an ISO
3166 code. (#1026) (A.Phillips)
o Harmonized the rules pertaining to position and number of script o Divided Section 4.3 by adding two subsections, moving paragraphs to
and region subtags (basically now they say that they MUST occur fit into the proper sub-section. Made the actual requirements
only once and MAY be omitted) (A.Phillips) into a list so that they would be very visible. (I.McDonald)
o Added the homograph paragraph to Section 6. (#967)(R.Presuhn) o Added the processing instruction symrefs='yes' (F.Ellermann)
o Moved Length Considerations from Section 2.1 to Section 4.3. Some
text was moved or reorganized as a result and a small change was
made in Section 4.1 (Choice) to ensure that no information was
lost. (A.Phillips)
o Added a small description of each subtag type to the sub-section
on each subtag in Section 2.1. (F.Charles)
o Modified the restriction on using extended language subtags in
Section 2.2.2 so that it is clearer. (J.Cowan)
9. References 9. References
9.1 Normative References 9.1 Normative References
[1] International Organization for Standardization, "ISO 639- [ISO639-1]
1:2002, Codes for the representation of names of languages -- International Organization for Standardization, "ISO 639-
Part 1: Alpha-2 code", ISO Standard 639, 2002. 1:2002, Codes for the representation of names of languages
-- Part 1: Alpha-2 code", ISO Standard 639, 2002, <ISO
639-1>.
[2] International Organization for Standardization, "ISO 639-2:1998 [ISO639-2]
- Codes for the representation of names of languages -- Part 2: International Organization for Standardization, "ISO 639-
Alpha-3 code - edition 1", August 1988. 2:1998 - Codes for the representation of names of
languages -- Part 2: Alpha-3 code - edition 1",
August 1988, <ISO 639-2>.
[3] ISO TC46/WG3, "ISO 15924:2003 (E/F) - Codes for the [ISO15924]
representation of names of scripts", January 2004. ISO TC46/WG3, "ISO 15924:2003 (E/F) - Codes for the
representation of names of scripts", January 2004, <ISO
15924>.
[4] International Organization for Standardization, "Codes for the [ISO3166] International Organization for Standardization, "Codes for
representation of names of countries, 3rd edition", the representation of names of countries, 3rd edition",
ISO Standard 3166, August 1988. ISO Standard 3166, August 1988, <ISO 3166>.
[5] Statistical Division, United Nations, "Standard Country or Area [UN_M.49] Statistical Division, United Nations, "Standard Country or
Codes for Statistical Use", UN Standard Country or Area Codes Area Codes for Statistical Use", UN Standard Country or
for Statistical Use, Revision 4 (United Nations publication, Area Codes for Statistical Use, Revision 4 (United Nations
Sales No. 98.XVII.9, June 1999. publication, Sales No. 98.XVII.9, June 1999, <UN M.49>.
[6] International Organization for Standardization, "ISO/IEC 10646- [ISO10646]
1:2000. Information technology -- Universal Multiple-Octet International Organization for Standardization, "ISO/IEC
Coded Character Set (UCS) -- Part 1: Architecture and Basic 10646-1:2000. Information technology -- Universal
Multilingual Plane and ISO/IEC 10646-2:2001. Information Multiple-Octet Coded Character Set (UCS) -- Part 1:
technology -- Universal Multiple-Octet Coded Character Set Architecture and Basic Multilingual Plane and ISO/IEC
(UCS) -- Part 2: Supplementary Planes, as, from time to time, 10646-2:2001. Information technology -- Universal
amended, replaced by a new edition or expanded by the addition Multiple-Octet Coded Character Set (UCS) -- Part 2:
of new parts", 2000. Supplementary Planes, as, from time to time, amended,
replaced by a new edition or expanded by the addition of
new parts", 2000, <ISO/IEC 10646>.
[7] Crocker, D. and P. Overell, "Augmented BNF for Syntax [RFC2234bis]
Specifications: ABNF", draft-crocker-abnf-rfc2234bis-00 (work Crocker, D. and P. Overell, "Augmented BNF for Syntax
in progress), March 2005. Specifications: ABNF", draft-crocker-abnf-rfc2234bis-00
(work in progress), March 2005.
[8] Bradner, S., "The Internet Standards Process -- Revision 3", [RFC2026] Bradner, S., "The Internet Standards Process -- Revision
BCP 9, RFC 2026, October 1996. 3", BCP 9, RFC 2026, October 1996.
[9] Hovey, R. and S. Bradner, "The Organizations Involved in the [RFC2028] Hovey, R. and S. Bradner, "The Organizations Involved in
IETF Standards Process", BCP 11, RFC 2028, October 1996. the IETF Standards Process", BCP 11, RFC 2028,
October 1996.
[10] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part [RFC2047] Moore, K., "MIME (Multipurpose Internet Mail Extensions)
Three: Message Header Extensions for Non-ASCII Text", RFC 2047, Part Three: Message Header Extensions for Non-ASCII Text",
November 1996. RFC 2047, November 1996.
[11] Bradner, S., "Key words for use in RFCs to Indicate Requirement [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[12] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an
Considerations Section in RFCs", BCP 26, RFC 2434, IANA Considerations Section in RFCs", BCP 26, RFC 2434,
October 1998. October 1998.
[13] Hoffman, P. and F. Yergeau, "UTF-16, an encoding of ISO 10646", [RFC2781] Hoffman, P. and F. Yergeau, "UTF-16, an encoding of ISO
RFC 2781, February 2000. 10646", RFC 2781, February 2000.
[14] Carpenter, B., Baker, F., and M. Roberts, "Memorandum of [RFC2860] Carpenter, B., Baker, F., and M. Roberts, "Memorandum of
Understanding Concerning the Technical Work of the Internet Understanding Concerning the Technical Work of the
Assigned Numbers Authority", RFC 2860, June 2000. Internet Assigned Numbers Authority", RFC 2860, June 2000.
[15] Klyne, G. and C. Newman, "Date and Time on the Internet: [RFC3339] Klyne, G. and C. Newman, "Date and Time on the Internet:
Timestamps", RFC 3339, July 2002. Timestamps", RFC 3339, July 2002.
[16] Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
Security Considerations", BCP 72, RFC 3552, July 2003. Text on Security Considerations", BCP 72, RFC 3552,
July 2003.
9.2 Informative References 9.2 Informative References
[17] ISO 639 Joint Advisory Committee, "ISO 639 Joint Advisory [iso639.principles]
ISO 639 Joint Advisory Committee, "ISO 639 Joint Advisory
Committee: Working principles for ISO 639 maintenance", Committee: Working principles for ISO 639 maintenance",
March 2000, March 2000,
<http://www.loc.gov/standards/iso639-2/iso639jac_n3r.html>. <http://www.loc.gov/standards/iso639-2/
iso639jac_n3r.html>.
[18] Raymond, E., "The Art of Unix Programming", 2003. [record-jar]
Raymond, E., "The Art of Unix Programming", 2003.
[19] Bray (et al), T., "Extensible Markup Language (XML) 1.0", [XML10] Bray (et al), T., "Extensible Markup Language (XML) 1.0",
02 2004. 02 2004.
[20] Biron, P., Ed. and A. Malhotra, Ed., "XML Schema Part 2: [XMLSchema]
Biron, P., Ed. and A. Malhotra, Ed., "XML Schema Part 2:
Datatypes Second Edition", 10 2004, < Datatypes Second Edition", 10 2004, <
http://www.w3.org/TR/xmlschema-2/>. http://www.w3.org/TR/xmlschema-2/>.
[21] Unicode Consortium, "The Unicode Consortium. The Unicode [Unicode] Unicode Consortium, "The Unicode Consortium. The Unicode
Standard, Version 4.1.0, defined by: The Unicode Standard, Standard, Version 4.1.0, defined by: The Unicode Standard,
Version 4.0 (Boston, MA, Addison-Wesley, 2003. ISBN 0-321- Version 4.0 (Boston, MA, Addison-Wesley, 2003. ISBN 0-321-
18578-1), as amended by Unicode 4.0.1 18578-1), as amended by Unicode 4.0.1
(http://www.unicode.org/versions/Unicode4.0.1) and by Unicode (http://www.unicode.org/versions/Unicode4.0.1) and by
4.1.0 (http://www.unicode.org/versions/Unicode4.1.0).", Unicode 4.1.0
(http://www.unicode.org/versions/Unicode4.1.0).",
March 2005. March 2005.
[22] Alvestrand, H., "Tags for the Identification of Languages", [RFC1766] Alvestrand, H., "Tags for the Identification of
RFC 1766, March 1995. Languages", RFC 1766, March 1995.
[23] Freed, N. and K. Moore, "MIME Parameter Value and Encoded Word [RFC2231] Freed, N. and K. Moore, "MIME Parameter Value and Encoded
Extensions: Character Sets, Languages, and Continuations", Word Extensions: Character Sets, Languages, and
RFC 2231, November 1997. Continuations", RFC 2231, November 1997.
[24] Alvestrand, H., "Tags for the Identification of Languages", [RFC3066] Alvestrand, H., "Tags for the Identification of
BCP 47, RFC 3066, January 2001. Languages", BCP 47, RFC 3066, January 2001.
Authors' Addresses Authors' Addresses
Addison Phillips (editor) Addison Phillips (editor)
Quest Software Quest Software
Email: addison.phillips@quest.com Email: addison.phillips@quest.com
Mark Davis (editor) Mark Davis (editor)
IBM IBM
skipping to change at page 55, line 37 skipping to change at page 58, line 37
Language-Script-Region: Language-Script-Region:
zh-Hans-CN (Chinese written using the Simplified script as used in zh-Hans-CN (Chinese written using the Simplified script as used in
mainland China) mainland China)
sr-Latn-CS (Serbian written using the Latin script as used in sr-Latn-CS (Serbian written using the Latin script as used in
Serbia and Montenegro) Serbia and Montenegro)
Language-Variant: Language-Variant:
en-boont (Boontling dialect of English) sl-rozaj (Resian dialect of Slovenian)
en-scouse (Scouse dialect of English) sl-nedis (Nadiza dialect of Slovenian)
Language-Region-Variant: Language-Region-Variant:
en-GB-scouse (Scouse dialect of English as used in the UK) de-CH-1901 (German as used in Switzerland using the 1901 variant
[orthography])
sl-IT-nedis (Slovenian as used in Italy, Nadiza dialect)
Language-Script-Region-Variant: Language-Script-Region-Variant:
sl-Latn-IT-nedis (Nadiza dialect of Slovenian written using the sl-Latn-IT-nedis (Nadiza dialect of Slovenian written using the
Latin script as used in Italy. Note that this tag is NOT Latin script as used in Italy. Note that this tag is NOT
RECOMMENDED because subtag 'sl' has a Suppress-Script value of RECOMMENDED because subtag 'sl' has a Suppress-Script value of
'Latn') 'Latn')
Language-Region: Language-Region:
skipping to change at page 60, line 10 skipping to change at page 63, line 10
%% %%
Type: variant Type: variant
Subtag: 1901 Subtag: 1901
Description: Traditional German Description: Traditional German
orthography orthography
Added: 2004-09-09 Added: 2004-09-09
Prefix: de Prefix: de
Comment: <shows continuation> Comment: <shows continuation>
%% %%
Type: variant Type: variant
Subtag: 1996 Subtag: nedis
Description: German orthography of 1996 Description: Nadiza dialect
Added: 2004-09-09 Description: Natisone dialect
Prefix: de Added: 2003-10-09
%% Prefix: sl
Type: variant
Subtag: boont
Description: Boontling
Added: 2003-02-14
Prefix: en
%%
Type: variant
Subtag: gaulish
Description: Gaulish
Added: 2001-05-25
Prefix: cel
%% %%
Type: grandfathered Type: grandfathered
Tag: art-lojban Tag: art-lojban
Description: Lojban Description: Lojban
Added: 2001-11-11 Added: 2001-11-11
Canonical: jbo Canonical: jbo
Deprecated: 2003-09-02 Deprecated: 2003-09-02
%% %%
Type: grandfathered Type: grandfathered
Tag: en-GB-oed Tag: en-GB-oed
 End of changes. 
