draft-ietf-ltru-registry-12.txt   draft-ietf-ltru-registry-13.txt 
Network Working Group A. Phillips, Ed. Network Working Group A. Phillips, Ed.
Internet-Draft Quest Software Internet-Draft Quest Software
Expires: February 18, 2006 M. Davis, Ed. Obsoletes: 3066 (if approved) M. Davis, Ed.
IBM Expires: March 26, 2006 IBM
August 17, 2005 September 22, 2005
Tags for Identifying Languages Tags for Identifying Languages
draft-ietf-ltru-registry-12 draft-ietf-ltru-registry-13
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 35 skipping to change at page 1, line 35
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on February 18, 2006. This Internet-Draft will expire on March 26, 2006.
Copyright Notice Copyright Notice
Copyright (C) The Internet Society (2005). Copyright (C) The Internet Society (2005).
Abstract Abstract
This document describes the structure, content, construction, and This document describes the structure, content, construction, and
semantics of language tags for use in cases where it is desirable to semantics of language tags for use in cases where it is desirable to
indicate the language used in an information object. It also indicate the language used in an information object. It also
skipping to change at page 2, line 22 skipping to change at page 2, line 22
2.2.2 Extended Language Subtags . . . . . . . . . . . . . . 10 2.2.2 Extended Language Subtags . . . . . . . . . . . . . . 10
2.2.3 Script Subtag . . . . . . . . . . . . . . . . . . . . 10 2.2.3 Script Subtag . . . . . . . . . . . . . . . . . . . . 10
2.2.4 Region Subtag . . . . . . . . . . . . . . . . . . . . 11 2.2.4 Region Subtag . . . . . . . . . . . . . . . . . . . . 11
2.2.5 Variant Subtags . . . . . . . . . . . . . . . . . . . 13 2.2.5 Variant Subtags . . . . . . . . . . . . . . . . . . . 13
2.2.6 Extension Subtags . . . . . . . . . . . . . . . . . . 14 2.2.6 Extension Subtags . . . . . . . . . . . . . . . . . . 14
2.2.7 Private Use Subtags . . . . . . . . . . . . . . . . . 15 2.2.7 Private Use Subtags . . . . . . . . . . . . . . . . . 15
2.2.8 Pre-Existing RFC 3066 Registrations . . . . . . . . . 16 2.2.8 Pre-Existing RFC 3066 Registrations . . . . . . . . . 16
2.2.9 Classes of Conformance . . . . . . . . . . . . . . . . 16 2.2.9 Classes of Conformance . . . . . . . . . . . . . . . . 16
3. Registry Format and Maintenance . . . . . . . . . . . . . . . 18 3. Registry Format and Maintenance . . . . . . . . . . . . . . . 18
3.1 Format of the IANA Language Subtag Registry . . . . . . . 18 3.1 Format of the IANA Language Subtag Registry . . . . . . . 18
3.2 Maintenance of the Registry . . . . . . . . . . . . . . . 23 3.2 Language Subtag Reviewer . . . . . . . . . . . . . . . . . 23
3.3 Stability of IANA Registry Entries . . . . . . . . . . . . 24 3.3 Maintenance of the Registry . . . . . . . . . . . . . . . 24
3.4 Registration Procedure for Subtags . . . . . . . . . . . . 28 3.4 Stability of IANA Registry Entries . . . . . . . . . . . . 25
3.5 Possibilities for Registration . . . . . . . . . . . . . . 31 3.5 Registration Procedure for Subtags . . . . . . . . . . . . 28
3.6 Extensions and Extensions Registry . . . . . . . . . . . . 33 3.6 Possibilities for Registration . . . . . . . . . . . . . . 31
3.7 Initialization of the Registries . . . . . . . . . . . . . 36 3.7 Extensions and Extensions Registry . . . . . . . . . . . . 33
4. Formation and Processing of Language Tags . . . . . . . . . . 37 3.8 Initialization of the Registries . . . . . . . . . . . . . 36
4.1 Choice of Language Tag . . . . . . . . . . . . . . . . . . 37 4. Formation and Processing of Language Tags . . . . . . . . . . 38
4.2 Meaning of the Language Tag . . . . . . . . . . . . . . . 39 4.1 Choice of Language Tag . . . . . . . . . . . . . . . . . . 38
4.3 Length Considerations . . . . . . . . . . . . . . . . . . 40 4.2 Meaning of the Language Tag . . . . . . . . . . . . . . . 40
4.3.1 Working with Limited Buffer Sizes . . . . . . . . . . 40 4.3 Length Considerations . . . . . . . . . . . . . . . . . . 41
4.3.2 Truncation of Language Tags . . . . . . . . . . . . . 42 4.3.1 Working with Limited Buffer Sizes . . . . . . . . . . 41
4.4 Canonicalization of Language Tags . . . . . . . . . . . . 42 4.3.2 Truncation of Language Tags . . . . . . . . . . . . . 43
4.5 Considerations for Private Use Subtags . . . . . . . . . . 44 4.4 Canonicalization of Language Tags . . . . . . . . . . . . 43
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 46 4.5 Considerations for Private Use Subtags . . . . . . . . . . 45
5.1 Language Subtag Registry . . . . . . . . . . . . . . . . . 46 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 47
5.2 Extensions Registry . . . . . . . . . . . . . . . . . . . 47 5.1 Language Subtag Registry . . . . . . . . . . . . . . . . . 47
6. Security Considerations . . . . . . . . . . . . . . . . . . . 48 5.2 Extensions Registry . . . . . . . . . . . . . . . . . . . 48
7. Character Set Considerations . . . . . . . . . . . . . . . . . 49 6. Security Considerations . . . . . . . . . . . . . . . . . . . 49
8. Changes from RFC 3066 . . . . . . . . . . . . . . . . . . . . 50 7. Character Set Considerations . . . . . . . . . . . . . . . . . 50
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 53 8. Changes from RFC 3066 . . . . . . . . . . . . . . . . . . . . 51
9.1 Normative References . . . . . . . . . . . . . . . . . . . 53 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 54
9.2 Informative References . . . . . . . . . . . . . . . . . . 54 9.1 Normative References . . . . . . . . . . . . . . . . . . . 54
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 55 9.2 Informative References . . . . . . . . . . . . . . . . . . 55
A. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 56 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 56
B. Examples of Language Tags (Informative) . . . . . . . . . . . 57 A. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 57
Intellectual Property and Copyright Statements . . . . . . . . 60 B. Examples of Language Tags (Informative) . . . . . . . . . . . 58
Intellectual Property and Copyright Statements . . . . . . . . 61
1. Introduction 1. Introduction
Human beings on our planet have, past and present, used a number of Human beings on our planet have, past and present, used a number of
languages. There are many reasons why one would want to identify the languages. There are many reasons why one would want to identify the
language used when presenting or requesting information. language used when presenting or requesting information.
User's language preferences often need to be identified so that A user's language preferences often need to be identified so that
appropriate processing can be applied. For example, the user's appropriate processing can be applied. For example, the user's
language preferences in a Web browser can be used to select Web pages language preferences in a Web browser can be used to select Web pages
appropriately. Language preferences can also be used to select among appropriately. Language preferences can also be used to select among
tools (such as dictionaries) to assist in the processing or tools (such as dictionaries) to assist in the processing or
understanding of content in different languages. understanding of content in different languages.
In addition, knowledge about the particular language used by some In addition, knowledge about the particular language used by some
piece of information content might be useful or even required by some piece of information content might be useful or even required by some
types of processing; for example, spell-checking, computer-synthesized types of processing; for example, spell-checking, computer-synthesized
speech, Braille transcription, or high-quality print renderings. speech, Braille transcription, or high-quality print renderings.
skipping to change at page 4, line 23 skipping to change at page 4, line 23
2.1 Syntax 2.1 Syntax
The language tag is composed of one or more parts or "subtags". Each The language tag is composed of one or more parts or "subtags". Each
subtag consists of a sequence of alpha-numeric characters. Subtags subtag consists of a sequence of alpha-numeric characters. Subtags
are distinguished and separated from one another by a hyphen ("-", are distinguished and separated from one another by a hyphen ("-",
ABNF [RFC2234bis] %x2D). A language tag consists of a "primary ABNF [RFC2234bis] %x2D). A language tag consists of a "primary
language" subtag and a (possibly empty) series of subsequent subtags, language" subtag and a (possibly empty) series of subsequent subtags,
each of which refines or narrows the range of language identified by each of which refines or narrows the range of language identified by
the overall tag. the overall tag.
Each type of subtag is distinguished by length, position in the tag, Usually, each type of subtag is distinguished by length, position in
and content: subtags can be recognized solely by these features. the tag, and content: subtags can be recognized solely by these
This makes it possible to construct a parser that can extract and features. The only exception to this is a fixed list of
assign some semantic information to the subtags, even if the specific grandfathered tags registered under RFC 3066 [RFC3066]. This makes
subtag values are not recognized. Thus a parser need not have an up- it possible to construct a parser that can extract and assign some
to-date copy (or any copy at all) of the subtag registry to perform semantic information to the subtags, even if the specific subtag
most searching and matching operations. values are not recognized. Thus a parser need not have an up-to-date
copy (or any copy at all) of the subtag registry to perform most
searching and matching operations.
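
The registry-free recognition described above can be illustrated with a short Python sketch. It is not part of either draft. The function names are hypothetical, and the shapes it tests are assumptions taken from the subtag descriptions in Section 2.2: two letters or three digits for a region, three letters for an extended language subtag, four letters for a script, and four characters starting with a digit or five to eight characters for a variant. It does not check subtag order, and it does not handle the grandfathered tags (such as "i-enochian") that the text names as the exception.

```python
import re

# Shape patterns for non-initial subtags (assumed from Section 2.2).
_SHAPES = [
    ("region", re.compile(r"[A-Za-z]{2}|[0-9]{3}")),
    ("extlang", re.compile(r"[A-Za-z]{3}")),
    ("script", re.compile(r"[A-Za-z]{4}")),
    ("variant", re.compile(r"[0-9][0-9A-Za-z]{3}|[0-9A-Za-z]{5,8}")),
]


def classify_subtag(subtag):
    """Guess the type of a non-initial subtag from its length and content."""
    for name, pattern in _SHAPES:
        if pattern.fullmatch(subtag):
            return name
    return "unknown"


def label_subtags(tag):
    """Label every subtag of a tag without consulting any registry."""
    subtags = tag.split("-")
    first = subtags[0]
    if first.lower() == "x":
        # The whole tag is private use.
        labels = [("singleton", first)]
        mode = "privateuse"
    else:
        labels = [("language", first)]
        mode = None
    for subtag in subtags[1:]:
        if mode == "privateuse":
            # Everything after 'x' is defined by private agreement.
            labels.append(("privateuse", subtag))
        elif len(subtag) == 1:
            # A singleton opens an extension or a private use sequence.
            mode = "privateuse" if subtag.lower() == "x" else "extension"
            labels.append(("singleton", subtag))
        elif mode == "extension":
            labels.append(("extension", subtag))
        else:
            labels.append((classify_subtag(subtag), subtag))
    return labels
```

For instance, label_subtags("az-Arab-x-AZE-derbend") labels 'az' as the language, 'Arab' as a script, 'x' as a singleton, and both 'AZE' and 'derbend' as private use subtags, without knowing what any of the values mean.
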
The syntax of the language tag in ABNF [RFC2234bis] is: The syntax of the language tag in ABNF [RFC2234bis] is:
Language-Tag = langtag Language-Tag = langtag
/ privateuse ; private use tag / privateuse ; private use tag
/ grandfathered ; grandfathered registrations / grandfathered ; grandfathered registrations
langtag = (language langtag = (language
["-" script] ["-" script]
["-" region] ["-" region]
skipping to change at page 6, line 40 skipping to change at page 6, line 40
initial letter capitalized ('Cyrl' Cyrillic). initial letter capitalized ('Cyrl' Cyrillic).
However, in the tags defined by this document, the uppercase US-ASCII However, in the tags defined by this document, the uppercase US-ASCII
letters in the range 'A' through 'Z' are considered equivalent and letters in the range 'A' through 'Z' are considered equivalent and
mapped directly to their US-ASCII lowercase equivalents in the range mapped directly to their US-ASCII lowercase equivalents in the range
'a' through 'z'. Thus the tag "mn-Cyrl-MN" is not distinct from "MN- 'a' through 'z'. Thus the tag "mn-Cyrl-MN" is not distinct from "MN-
cYRL-mn" or "mN-cYrL-Mn" (or any other combination) and each of these cYRL-mn" or "mN-cYrL-Mn" (or any other combination) and each of these
variations conveys the same meaning: Mongolian written in the variations conveys the same meaning: Mongolian written in the
Cyrillic script as used in Mongolia. Cyrillic script as used in Mongolia.
Although case distinctions do not carry meaning in language tags,
consistent formatting and presentation of the tags will aid users.
The format of the tags and subtags in the registry is RECOMMENDED.
In this format, all non-initial two-letter subtags are uppercase, all
non-initial four-letter subtags are titlecase, and all other subtags
are lowercase.
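
The case rules above can be sketched in Python as follows. This is an illustrative sketch rather than text from either draft. The function names are hypothetical, the input is assumed to be a US-ASCII tag, and the formatting rule is applied literally by length and position, with no special handling of private use sequences.

```python
def same_tag(a, b):
    """Compare two tags without regard to case (tags are assumed US-ASCII)."""
    return a.lower() == b.lower()


def format_tag(tag):
    """Apply the recommended presentation format to a tag."""
    subtags = tag.split("-")
    # The initial subtag is always lowercase.
    formatted = [subtags[0].lower()]
    for subtag in subtags[1:]:
        if len(subtag) == 2:
            # Non-initial two-letter subtags are uppercase.
            formatted.append(subtag.upper())
        elif len(subtag) == 4:
            # Non-initial four-letter subtags are titlecase.
            formatted.append(subtag.capitalize())
        else:
            # All other subtags are lowercase.
            formatted.append(subtag.lower())
    return "-".join(formatted)
```

With these definitions, same_tag("mn-Cyrl-MN", "MN-cYRL-mn") is true, and format_tag("MN-cYRL-mn") returns "mn-Cyrl-MN".
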
2.2 Language Subtag Sources and Interpretation 2.2 Language Subtag Sources and Interpretation
The namespace of language tags and their subtags is administered by The namespace of language tags and their subtags is administered by
the Internet Assigned Numbers Authority (IANA) [RFC2860] according to the Internet Assigned Numbers Authority (IANA) [RFC2860] according to
the rules in Section 5 of this document. The Language Subtag the rules in Section 5 of this document. The Language Subtag
Registry maintained by IANA is the source for valid subtags: other Registry maintained by IANA is the source for valid subtags: other
standards referenced in this section provide the source material for standards referenced in this section provide the source material for
that registry. that registry.
Terminology in this section: Terminology in this section:
skipping to change at page 7, line 51 skipping to change at page 8, line 7
use. These include the following current uses: use. These include the following current uses:
o The single letter subtag 'x' is reserved to introduce a sequence o The single letter subtag 'x' is reserved to introduce a sequence
of private use subtags. The interpretation of any private use of private use subtags. The interpretation of any private use
subtags is defined solely by private agreement and is not defined subtags is defined solely by private agreement and is not defined
by the rules in this section or in any standard or registry by the rules in this section or in any standard or registry
defined in this document. defined in this document.
o All other single letter subtags are reserved to introduce o All other single letter subtags are reserved to introduce
standardized extension subtag sequences as described in standardized extension subtag sequences as described in
Section 3.6. Section 3.7.
The single letter subtag 'i' is used by some grandfathered tags, such The single letter subtag 'i' is used by some grandfathered tags, such
as "i-enochian", where it always appears in the first position and as "i-enochian", where it always appears in the first position and
cannot be confused with an extension. cannot be confused with an extension.
2.2.1 Primary Language Subtag 2.2.1 Primary Language Subtag
The primary language subtag is the first subtag in a language tag The primary language subtag is the first subtag in a language tag
(with the exception of private use and certain grandfathered tags) (with the exception of private use and certain grandfathered tags)
and cannot be omitted. The following rules apply to the primary and cannot be omitted. The following rules apply to the primary
skipping to change at page 8, line 41 skipping to change at page 8, line 45
private use in language tags. These subtags correspond to codes private use in language tags. These subtags correspond to codes
reserved by ISO 639-2 for private use. These codes MAY be used reserved by ISO 639-2 for private use. These codes MAY be used
for non-registered primary-language subtags (instead of using for non-registered primary-language subtags (instead of using
private use subtags following 'x-'). Please refer to Section 4.5 private use subtags following 'x-'). Please refer to Section 4.5
for more information on private use subtags. for more information on private use subtags.
4. All four character language subtags are reserved for possible 4. All four character language subtags are reserved for possible
future standardization. future standardization.
5. All language subtags of 5 to 8 characters in length in the IANA 5. All language subtags of 5 to 8 characters in length in the IANA
registry were defined via the registration process in Section 3.4 registry were defined via the registration process in Section 3.5
and MAY be used to form the primary language subtag. At the time and MAY be used to form the primary language subtag. At the time
this document was created, there were no examples of this kind of this document was created, there were no examples of this kind of
subtag and future registrations of this type will be discouraged: subtag and future registrations of this type will be discouraged:
primary languages are strongly RECOMMENDED for registration with primary languages are strongly RECOMMENDED for registration with
ISO 639 and proposals rejected by ISO 639/RA will be closely ISO 639 and proposals rejected by ISO 639/RA will be closely
scrutinized before they are registered with IANA. scrutinized before they are registered with IANA.
6. The single character subtag 'x' as the primary subtag indicates 6. The single character subtag 'x' as the primary subtag indicates
that the language tag consists solely of subtags whose meaning is that the language tag consists solely of subtags whose meaning is
defined by private agreement. For example, in the tag "x-fr-CH", defined by private agreement. For example, in the tag "x-fr-CH",
skipping to change at page 9, line 43 skipping to change at page 9, line 48
"A language code already in ISO 639-2 at the point of freezing ISO "A language code already in ISO 639-2 at the point of freezing ISO
639-1 shall not later be added to ISO 639-1. This is to ensure 639-1 shall not later be added to ISO 639-1. This is to ensure
consistency in usage over time, since users are directed in Internet consistency in usage over time, since users are directed in Internet
applications to employ the alpha-3 code when an alpha-2 code for that applications to employ the alpha-3 code when an alpha-2 code for that
language is not available." language is not available."
In order to avoid instability in the canonical form of tags, if a two In order to avoid instability in the canonical form of tags, if a two
character code is added to ISO 639-1 for a language for which a three character code is added to ISO 639-1 for a language for which a three
character code was already included in ISO 639-2, the two character character code was already included in ISO 639-2, the two character
code MUST NOT be registered. See Section 3.3. code MUST NOT be registered. See Section 3.4.
For example, if some content were tagged with 'haw' (Hawaiian), which For example, if some content were tagged with 'haw' (Hawaiian), which
currently has no two character code, the tag would not be invalidated currently has no two character code, the tag would not be invalidated
if ISO 639-1 were to assign a two character code to the Hawaiian if ISO 639-1 were to assign a two character code to the Hawaiian
language at a later date. language at a later date.
For example, one of the grandfathered IANA registrations is For example, one of the grandfathered IANA registrations is
"i-enochian". The subtag 'enochian' could be registered in the IANA "i-enochian". The subtag 'enochian' could be registered in the IANA
registry as a primary language subtag (assuming that ISO 639 does not registry as a primary language subtag (assuming that ISO 639 does not
register this language first), making tags such as "enochian-AQ" and register this language first), making tags such as "enochian-AQ" and
skipping to change at page 11, line 12 skipping to change at page 11, line 16
subtag and all extended language subtags and MUST occur before subtag and all extended language subtags and MUST occur before
any other type of subtag described below. any other type of subtag described below.
3. The script subtags 'Qaaa' through 'Qabx' are reserved for private 3. The script subtags 'Qaaa' through 'Qabx' are reserved for private
use in language tags. These subtags correspond to codes reserved use in language tags. These subtags correspond to codes reserved
by ISO 15924 for private use. These codes MAY be used for non- by ISO 15924 for private use. These codes MAY be used for non-
registered script values. Please refer to Section 4.5 for more registered script values. Please refer to Section 4.5 for more
information on private use subtags. information on private use subtags.
4. Script subtags MUST NOT be registered using the process in 4. Script subtags MUST NOT be registered using the process in
Section 3.4 of this document. Variant subtags MAY be considered Section 3.5 of this document. Variant subtags MAY be considered
for registration for that purpose. for registration for that purpose.
5. There MUST be at most one script subtag in a language tag and the 5. There MUST be at most one script subtag in a language tag and the
script subtag SHOULD be omitted when it adds no distinguishing script subtag SHOULD be omitted when it adds no distinguishing
value to the tag or when the primary language subtag's record value to the tag or when the primary language subtag's record
includes a Suppress-Script field listing the applicable script includes a Suppress-Script field listing the applicable script
subtag. subtag.
Example: "sr-Latn" represents Serbian written using the Latin script. Example: "sr-Latn" represents Serbian written using the Latin script.
skipping to change at page 12, line 19 skipping to change at page 12, line 23
ISO 3166 alpha-2 code and represent supra-national areas, ISO 3166 alpha-2 code and represent supra-national areas,
usually covering more than one nation, state, province, or usually covering more than one nation, state, province, or
territory. territory.
B. UN numeric codes for 'economic groupings' or 'other B. UN numeric codes for 'economic groupings' or 'other
groupings' MUST NOT be registered in the IANA registry and groupings' MUST NOT be registered in the IANA registry and
MUST NOT be used to form language tags. MUST NOT be used to form language tags.
C. UN numeric codes for countries or areas with ambiguous ISO C. UN numeric codes for countries or areas with ambiguous ISO
3166 alpha-2 codes, when entered into the registry, MUST be 3166 alpha-2 codes, when entered into the registry, MUST be
defined according to the rules in Section 3.3 and MUST be defined according to the rules in Section 3.4 and MUST be
used to form language tags that represent the country or used to form language tags that represent the country or
region for which they are defined. region for which they are defined.
D. UN numeric codes for countries or areas for which there is an D. UN numeric codes for countries or areas for which there is an
associated ISO 3166 alpha-2 code in the registry MUST NOT be associated ISO 3166 alpha-2 code in the registry MUST NOT be
entered into the registry and MUST NOT be used to form entered into the registry and MUST NOT be used to form
language tags. Note that the ISO 3166-based subtag in the language tags. Note that the ISO 3166-based subtag in the
registry MUST actually be associated with the UN M.49 code in registry MUST actually be associated with the UN M.49 code in
question. question.
E. UN numeric codes and ISO 3166 alpha-2 codes for countries or E. UN numeric codes and ISO 3166 alpha-2 codes for countries or
areas listed as eligible for registration in [initial- areas listed as eligible for registration in [initial-
registry] but not presently registered MAY be entered into registry] but not presently registered MAY be entered into
the IANA registry via the process described in Section 3.4. the IANA registry via the process described in Section 3.5.
Once registered, these codes MAY be used to form language Once registered, these codes MAY be used to form language
tags. tags.
F. All other UN numeric codes for countries or areas which do F. All other UN numeric codes for countries or areas which do
not have an associated ISO 3166 alpha-2 code MUST NOT be not have an associated ISO 3166 alpha-2 code MUST NOT be
entered into the registry and MUST NOT be used to form entered into the registry and MUST NOT be used to form
language tags. For more information about these codes, see language tags. For more information about these codes, see
Section 3.3. Section 3.4.
4. Note: The alphanumeric codes in Appendix X of the UN document 4. Note: The alphanumeric codes in Appendix X of the UN document
MUST NOT be entered into the registry and MUST NOT be used to MUST NOT be entered into the registry and MUST NOT be used to
form language tags. (At the time this document was created these form language tags. (At the time this document was created these
values match the ISO 3166 alpha-2 codes.) values match the ISO 3166 alpha-2 codes.)
5. There MUST be at most one region subtag in a language tag and the 5. There MUST be at most one region subtag in a language tag and the
region subtag MAY be omitted, as when it adds no distinguishing region subtag MAY be omitted, as when it adds no distinguishing
value to the tag. value to the tag.
6. The region subtags 'AA', 'QM'-'QZ', 'XA'-'XZ', and 'ZZ' are 6. The region subtags 'AA', 'QM'-'QZ', 'XA'-'XZ', and 'ZZ' are
reserved for private use in language tags. These subtags reserved for private use in language tags. These subtags
correspond to codes reserved by ISO 3166 for private use. These correspond to codes reserved by ISO 3166 for private use. These
codes MAY be used for private use region subtags (instead of codes MAY be used for private use region subtags (instead of
using a private use subtag sequence). Please refer to using a private use subtag sequence). Please refer to
Section 4.5 for more information on private use subtags. Section 4.5 for more information on private use subtags.
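
The private use ranges listed above for regions, together with the script range 'Qaaa' through 'Qabx' given in Section 2.2.3, can be tested with a small Python sketch. It is offered only as an illustration. The function names are hypothetical, and the comparison relies on the assumption that the ranges are read as contiguous alphabetical runs of equal-length codes.

```python
def is_private_use_region(subtag):
    """True for 'AA', 'QM'-'QZ', 'XA'-'XZ', and 'ZZ', ignoring case."""
    code = subtag.upper()
    if len(code) != 2 or not code.isascii() or not code.isalpha():
        return False
    return (
        code in ("AA", "ZZ")
        or "QM" <= code <= "QZ"
        or "XA" <= code <= "XZ"
    )


def is_private_use_script(subtag):
    """True for 'Qaaa' through 'Qabx', ignoring case."""
    code = subtag.lower()
    if len(code) != 4 or not code.isascii() or not code.isalpha():
        return False
    # Equal-length lowercase strings compare in alphabetical order.
    return "qaaa" <= code <= "qabx"
```

A caller might use these checks to avoid looking up a private use subtag in the registry, since its meaning is defined only by private agreement.
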
skipping to change at page 13, line 29 skipping to change at page 13, line 32
2.2.5 Variant Subtags 2.2.5 Variant Subtags
Variant subtags are used to indicate additional, well-recognized Variant subtags are used to indicate additional, well-recognized
variations that define a language or its dialects which are not variations that define a language or its dialects which are not
covered by other available subtags. The following rules apply to the covered by other available subtags. The following rules apply to the
variant subtags: variant subtags:
1. Variant subtags are not associated with any external standard. 1. Variant subtags are not associated with any external standard.
Variant subtags and their meanings are defined by the Variant subtags and their meanings are defined by the
registration process defined in Section 3.4. registration process defined in Section 3.5.
2. Variant subtags MUST follow all of the other defined subtags, but 2. Variant subtags MUST follow all of the other defined subtags, but
precede any extension or private use subtag sequences. precede any extension or private use subtag sequences.
3. More than one variant MAY be used to form the language tag. 3. More than one variant MAY be used to form the language tag.
4. Variant subtags MUST be registered with IANA according to the 4. Variant subtags MUST be registered with IANA according to the
rules in Section 3.4 of this document before being used to form rules in Section 3.5 of this document before being used to form
language tags. In order to distinguish variants from other types language tags. In order to distinguish variants from other types
of subtags, registrations MUST meet the following length and of subtags, registrations MUST meet the following length and
content restrictions: content restrictions:
1. Variant subtags that begin with a letter (a-z, A-Z) MUST be 1. Variant subtags that begin with a letter (a-z, A-Z) MUST be
at least five characters long. at least five characters long.
2. Variant subtags that begin with a digit (0-9) MUST be at 2. Variant subtags that begin with a digit (0-9) MUST be at
least four characters long. least four characters long.
skipping to change at page 14, line 25 skipping to change at page 14, line 29
spelling reforms. A variant that can meaningfully be used in spelling reforms. A variant that can meaningfully be used in
combination with another variant SHOULD include a 'Prefix' field in combination with another variant SHOULD include a 'Prefix' field in
its registry record that lists that other variant. For example, if its registry record that lists that other variant. For example, if
another German variant 'example' were created that made sense to use another German variant 'example' were created that made sense to use
with '1996', then 'example' should include two Prefix fields: "de" with '1996', then 'example' should include two Prefix fields: "de"
and "de-1996". and "de-1996".
2.2.6 Extension Subtags 2.2.6 Extension Subtags
Extensions provide a mechanism for extending language tags for use in Extensions provide a mechanism for extending language tags for use in
various applications. See: Section 3.6. The following rules apply various applications. See: Section 3.7. The following rules apply
to extensions: to extensions:
1. Extension subtags are separated from the other subtags defined 1. Extension subtags are separated from the other subtags defined
in this document by a single character subtag ("singleton"). in this document by a single character subtag ("singleton").
The singleton MUST be one allocated to a registration authority The singleton MUST be one allocated to a registration authority
via the mechanism described in Section 3.6 and MUST NOT be the via the mechanism described in Section 3.7 and MUST NOT be the
letter 'x', which is reserved for private use subtag sequences. letter 'x', which is reserved for private use subtag sequences.
2. Note: Private use subtag sequences starting with the singleton 2. Note: Private use subtag sequences starting with the singleton
subtag 'x' are described below. subtag 'x' are described in Section 2.2.7 below.
3. An extension MUST follow at least a primary language subtag. 3. An extension MUST follow at least a primary language subtag.
That is, a language tag cannot begin with an extension. That is, a language tag cannot begin with an extension.
Extensions extend language tags; they do not override or replace Extensions extend language tags; they do not override or replace
them. For example, "a-value" is not a well-formed language tag, them. For example, "a-value" is not a well-formed language tag,
while "de-a-value" is. while "de-a-value" is.
4. Each singleton subtag MUST appear at most one time in each tag 4. Each singleton subtag MUST appear at most one time in each tag
(other than as a private use subtag). That is, singleton (other than as a private use subtag). That is, singleton
subtags MUST NOT be repeated. For example, the tag "en-a-bbb-a- subtags MUST NOT be repeated. For example, the tag "en-a-bbb-a-
skipping to change at page 16, line 25 skipping to change at page 16, line 31
publication of SIL International for language identification might publication of SIL International for language identification might
agree to exchange tags such as "az-Arab-x-AZE-derbend". This example agree to exchange tags such as "az-Arab-x-AZE-derbend". This example
contains two private use subtags. The first is 'AZE' and the second contains two private use subtags. The first is 'AZE' and the second
is 'derbend'. is 'derbend'.
2.2.8 Pre-Existing RFC 3066 Registrations 2.2.8 Pre-Existing RFC 3066 Registrations
Existing IANA-registered language tags from RFC 1766 and/or RFC 3066 Existing IANA-registered language tags from RFC 1766 and/or RFC 3066
maintain their validity. These tags will be maintained in the maintain their validity. These tags will be maintained in the
registry in records of either the "grandfathered" or "redundant" registry in records of either the "grandfathered" or "redundant"
type. For more information see Section 3.7. type. Grandfathered tags contain one or more subtags that are not
defined in the Language Subtag Registry (see Section 3). Redundant
tags consist entirely of subtags defined above and whose independent
registration is superseded by this document. For more information
see Section 3.8.
It is important to note that all language tags formed under the It is important to note that all language tags formed under the
guidelines in this document were either legal, well-formed tags or guidelines in this document were either legal, well-formed tags or
could have been registered under RFC 3066. could have been registered under RFC 3066.
2.2.9 Classes of Conformance 2.2.9 Classes of Conformance
Implementations sometimes need to describe their capabilities with Implementations sometimes need to describe their capabilities with
regard to the rules and practices described in this document. There regard to the rules and practices described in this document. There
are two classes of conforming implementations described by this are two classes of conforming implementations described by this
skipping to change at page 17, line 17 skipping to change at page 17, line 28
o Check that the tag is well-formed. o Check that the tag is well-formed.
o Specify the particular registry date for which the implementation o Specify the particular registry date for which the implementation
performs validation of subtags. performs validation of subtags.
o Check that either the tag is a grandfathered tag, or that all o Check that either the tag is a grandfathered tag, or that all
language, script, region, and variant subtags consist of valid language, script, region, and variant subtags consist of valid
codes for use in language tags according to the IANA registry as codes for use in language tags according to the IANA registry as
of the particular date specified by the implementation. of the particular date specified by the implementation.
o Specify which, if any, extension RFCs as defined in Section 3.6 o Specify which, if any, extension RFCs as defined in Section 3.7
are supported, including version, revision, and date. are supported, including version, revision, and date.
o For any such extensions supported, check that all subtags used in o For any such extensions supported, check that all subtags used in
that extension are valid. that extension are valid.
o For variant and extended language subtags, if the registry o For variant and extended language subtags, if the registry
contains one or more 'Prefix' fields for that subtag, check that contains one or more 'Prefix' fields for that subtag, check that
the tag matches at least one prefix. The tag matches if all the the tag matches at least one prefix. The tag matches if all the
subtags in the 'Prefix' also appear in the tag. For example, the subtags in the 'Prefix' also appear in the tag. For example, the
prefix "es-CO" matches the tag "es-Latn-CO-x-private" because both prefix "es-CO" matches the tag "es-Latn-CO-x-private" because both
the 'es' language subtag and 'CO' region subtag appear in the tag. the 'es' language subtag and 'CO' region subtag appear in the tag.
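The 'Prefix' check in the last item can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function name is hypothetical, the comparison is done in lowercase, and subtags after a private use singleton 'x' are ignored on the assumption that they carry no registered meaning.

```python
def prefix_matches(prefix: str, tag: str) -> bool:
    """Return True if every subtag of `prefix` also appears in `tag`."""
    # Compare in lowercase so that 'CO' and 'co' are treated alike.
    tag_subtags = tag.lower().split("-")
    # Assumption: anything after a private use singleton 'x' is ignored.
    if "x" in tag_subtags:
        tag_subtags = tag_subtags[: tag_subtags.index("x")]
    return all(s in tag_subtags for s in prefix.lower().split("-"))


print(prefix_matches("es-CO", "es-Latn-CO-x-private"))
```

The example call uses the prefix and tag from the text above; both 'es' and 'CO' appear before the private use sequence, so the prefix matches.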
3. Registry Format and Maintenance 3. Registry Format and Maintenance
This section defines the Language Subtag Registry and the maintenance This section defines the Language Subtag Registry and the maintenance
and update procedures associated with it, as well as a registry for and update procedures associated with it, as well as a registry for
extensions to language tags (Section 3.6). extensions to language tags (Section 3.7).
The Language Subtag Registry contains a comprehensive list of all of The Language Subtag Registry contains a comprehensive list of all of
the subtags valid in language tags. This allows implementers a the subtags valid in language tags. This allows implementers a
straightforward and reliable way to validate language tags. The straightforward and reliable way to validate language tags. The
Language Subtag Registry will be maintained so that, except for Language Subtag Registry will be maintained so that, except for
extension subtags, it is possible to validate all of the subtags that extension subtags, it is possible to validate all of the subtags that
appear in a language tag under the provisions of this document or its appear in a language tag under the provisions of this document or its
revisions or successors. In addition, the meaning of the various revisions or successors. In addition, the meaning of the various
subtags will be unambiguous and stable over time. (The meaning of subtags will be unambiguous and stable over time. (The meaning of
private use subtags, of course, is not defined by the IANA registry.) private use subtags, of course, is not defined by the IANA registry.)
3.1 Format of the IANA Language Subtag Registry 3.1 Format of the IANA Language Subtag Registry
The IANA Language Subtag Registry ("the registry") consists of a text The IANA Language Subtag Registry ("the registry") consists of a text
file that is machine readable in the format described in this file that is machine readable in the format described in this
section, plus copies of the registration forms approved by the section, plus copies of the registration forms approved in accordance
Language Subtag Reviewer in accordance with the process described in with the process described in Section 3.5. The existing registration
Section 3.4. With the exception of the registration forms for forms for grandfathered and redundant tags taken from RFC 3066 will
grandfathered and redundant tags, no registration records will be be maintained as part of the obsolete RFC 3066 registry. The
maintained for the initial set of subtags. remaining set of initial subtags will not have registration forms
created for them.
The registry is in a modified record-jar format text file [record- The registry is in the text format described below. This format was
jar]. Lines are limited to 72 characters, including all whitespace. based on the record-jar format described in [record-jar].
Records are separated by lines containing only the sequence "%%" Each line of text is limited to 72 characters, including all
(%x25.25). whitespace. Records are separated by lines containing only the
sequence "%%" (%x25.25).
Each field can be viewed as a single, logical line of ASCII Each field can be viewed as a single, logical line of ASCII
characters, comprising a field-name and a field-body separated by a characters, comprising a field-name and a field-body separated by a
COLON character (%x3A). For convenience, the field-body portion of COLON character (%x3A). For convenience, the field-body portion of
this conceptual entity can be split into a multiple-line this conceptual entity can be split into a multiple-line
representation; this is called "folding". The format of the registry representation; this is called "folding". The format of the registry
is described by the following ABNF (per [RFC2234bis]): is described by the following ABNF (per [RFC2234bis]):
registry = record *("%%" CRLF record) registry = record *("%%" CRLF record)
record = 1*( field-name *SP ":" *SP field-body CRLF ) record = 1*( field-name *SP ":" *SP field-body CRLF )
field-name = (ALPHA / DIGIT)[*(ALPHA / DIGIT / "-") (ALPHA / DIGIT)] field-name = (ALPHA / DIGIT)[*(ALPHA / DIGIT / "-") (ALPHA / DIGIT)]
field-body = *(ASCCHAR/LWSP) field-body = *(ASCCHAR/LWSP)
ASCCHAR = %x21-25 / %x27-7E / UNICHAR ; Note: AMPERSAND is %x26 ASCCHAR = %x21-25 / %x27-7E / UNICHAR ; Note: AMPERSAND is %x26
UNICHAR = "&#x" 2*6HEXDIG ";" UNICHAR = "&#x" 2*6HEXDIG ";"
Figure 2: record-jar ABNF
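A reader for this format could be sketched as below. The function name and the choice to store each field as a list of values are assumptions made for illustration. The sketch treats a line that begins with whitespace as a folded continuation of the previous field-body, splits records on lines containing only "%%", and decodes UNICHAR sequences; it assumes well-formed input and does no error reporting.

```python
import re

# UNICHAR = "&#x" 2*6HEXDIG ";"
UNICHAR = re.compile(r"&#x([0-9A-Fa-f]{2,6});")


def parse_registry(text: str) -> list[dict[str, list[str]]]:
    """Split registry text into records mapping field-name to field-bodies."""
    records: list[dict[str, list[str]]] = []
    current: dict[str, list[str]] = {}
    last_values = None  # values of the most recently seen field
    for line in text.splitlines():
        if line.strip() == "%%":
            # Record separator: close the current record.
            records.append(current)
            current, last_values = {}, None
        elif line[:1] in (" ", "\t") and last_values is not None:
            # Folded line: append to the previous field-body.
            last_values[-1] += " " + line.strip()
        elif ":" in line:
            name, _, body = line.partition(":")
            last_values = current.setdefault(name.strip(), [])
            last_values.append(body.strip())
    if current:
        records.append(current)

    def decode(value: str) -> str:
        # Replace each numeric character reference with its character.
        return UNICHAR.sub(lambda m: chr(int(m.group(1), 16)), value)

    return [{k: [decode(v) for v in vals] for k, vals in rec.items()}
            for rec in records]
```

Keeping every field as a list means that fields permitted to repeat need no special handling; a caller that expects a single value can check the length of the list.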
The sequence '..' (%x2E.2E) in a field-body denotes a range of The sequence '..' (%x2E.2E) in a field-body denotes a range of
values. Such a range represents all subtags of the same length that values. Such a range represents all subtags of the same length that
are alphabetically within that range, including the values explicitly are in alphabetic or numeric order within that range, including the
mentioned. For example 'a..c' denotes the values 'a', 'b', and 'c'. values explicitly mentioned. For example 'a..c' denotes the values
'a', 'b', and 'c' and '11..13' denotes the values '11', '12', and
'13'.
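One possible way to expand such a range is sketched below. The function name is hypothetical, and the sketch assumes that both endpoints are either all digits or all letters; alphabetic values are returned in lowercase regardless of the case used in the registry.

```python
import string
from itertools import product


def expand_range(field_body: str) -> list[str]:
    """Expand 'first..last' into every value of the same length in range."""
    if ".." not in field_body:
        return [field_body]
    first, last = field_body.split("..")
    if len(first) != len(last):
        raise ValueError("range endpoints must have the same length")
    if first.isdigit() and last.isdigit():
        # Numeric order, keeping leading zeros.
        width = len(first)
        return [str(n).zfill(width) for n in range(int(first), int(last) + 1)]
    # Alphabetic order over all letter strings of that length.
    lo, hi = first.lower(), last.lower()
    candidates = ("".join(p)
                  for p in product(string.ascii_lowercase, repeat=len(lo)))
    return [v for v in candidates if lo <= v <= hi]


print(expand_range("a..c"))
print(expand_range("11..13"))
```

Enumerating every candidate string is wasteful for long subtags, but it keeps the sketch short and is adequate for values of two to four characters.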
Characters from outside the US-ASCII[ISO646] repertoire, as well as Characters from outside the US-ASCII[ISO646] repertoire, as well as
the AMPERSAND character ("&", %x26) when it occurs in a field-body the AMPERSAND character ("&", %x26) when it occurs in a field-body
are represented by a "Numeric Character Reference" using hexadecimal are represented by a "Numeric Character Reference" using hexadecimal
notation in the style used by [XML10] (see notation in the style used by [XML10] (see
<http://www.w3.org/TR/REC-xml/#dt-charref>). This consists of the <http://www.w3.org/TR/REC-xml/#dt-charref>). This consists of the
sequence "&#x" (%x26.23.78) followed by a hexadecimal representation sequence "&#x" (%x26.23.78) followed by a hexadecimal representation
of the character's code point in [ISO10646] followed by a closing of the character's code point in [ISO10646] followed by a closing
semicolon (%x3B). For example, the EURO SIGN, U+20AC, would be semicolon (%x3B). For example, the EURO SIGN, U+20AC, would be
represented by the sequence "&#x20AC;". Note that the hexadecimal represented by the sequence "&#x20AC;". Note that the hexadecimal
skipping to change at page 19, line 32 skipping to change at page 19, line 38
The first record in the file contains the single field whose field- The first record in the file contains the single field whose field-
name is "File-Date". The field-body of this record contains the last name is "File-Date". The field-body of this record contains the last
modification date of this copy of the registry, making it possible to modification date of this copy of the registry, making it possible to
compare different versions of the registry. The registry on the IANA compare different versions of the registry. The registry on the IANA
website is the most current. Versions with an older date than that website is the most current. Versions with an older date than that
one are not up-to-date. one are not up-to-date.
File-Date: 2004-06-28 File-Date: 2004-06-28
%% %%
Figure 3: Example of the File-Date Record
Subsequent records represent subtags in the registry. Each of the Subsequent records represent subtags in the registry. Each of the
fields in each record MUST occur no more than once, unless otherwise fields in each record MUST occur no more than once, unless otherwise
noted below. Each record MUST contain the following fields: noted below. Each record MUST contain the following fields:
o 'Type' o 'Type'
* Type's field-value MUST consist of one of the following * Type's field-value MUST consist of one of the following
strings: "language", "extlang", "script", "region", "variant", strings: "language", "extlang", "script", "region", "variant",
"grandfathered", and "redundant" and denotes the type of tag or "grandfathered", and "redundant" and denotes the type of tag or
subtag. subtag.
o Either 'Subtag' or 'Tag' o Either 'Subtag' or 'Tag'
* Subtag's field-value contains the subtag being defined. This * Subtag's field-value contains the subtag being defined. This
field MUST only appear in records whose 'Type' has one of field MUST only appear in records whose 'Type' has one of
these values: "language", "extlang", "script", "region", or these values: "language", "extlang", "script", "region", or
"variant". "variant".
* Tag's field-value contains a complete language tag. This field * Tag's field-value contains a complete language tag. This field
MUST only appear in records whose 'Type' has one of these MUST only appear in records whose 'Type' has one of these
values: "grandfathered" or "redundant". values: "grandfathered" or "redundant". Note that the field-
value will always follow the 'grandfathered' production in the
ABNF in Section 2.1.
o Description o Description
* Description's field-value contains a non-normative description * Description's field-value contains a non-normative description
of the subtag or tag. of the subtag or tag.
o Added o Added
* Added's field-value contains the date the record was added to * Added's field-value contains the date the record was added to
the registry. the registry.
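The required fields listed above could be checked along the following lines. This is a sketch only: the function name and the messages are hypothetical, and a record is assumed to be a mapping from field-name to the list of values found for that field.

```python
SUBTAG_TYPES = {"language", "extlang", "script", "region", "variant"}
TAG_TYPES = {"grandfathered", "redundant"}


def record_problems(record: dict[str, list[str]]) -> list[str]:
    """Return a list of problems with the required fields of one record."""
    problems = []
    types = record.get("Type", [])
    if len(types) != 1 or types[0] not in SUBTAG_TYPES | TAG_TYPES:
        return ["Type must appear once with a recognized value"]
    # 'Subtag' belongs to subtag records, 'Tag' to whole-tag records.
    if types[0] in SUBTAG_TYPES:
        wanted, unwanted = "Subtag", "Tag"
    else:
        wanted, unwanted = "Tag", "Subtag"
    if len(record.get(wanted, [])) != 1:
        problems.append(f"{wanted} must appear exactly once")
    if unwanted in record:
        problems.append(f"{unwanted} must not appear in a '{types[0]}' record")
    # 'Description' is required; repeats are not rejected here.
    if not record.get("Description"):
        problems.append("Description is required")
    if len(record.get("Added", [])) != 1:
        problems.append("Added must appear exactly once")
    return problems
```

An empty result means only that the required fields are present and consistent with the record's type; it says nothing about the validity of the values themselves.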
The 'Subtag' or 'Tag' field MUST use lowercase letters to form the The 'Subtag' or 'Tag' field MUST use lowercase letters to form the
subtag or tag, with two exceptions. Subtags whose 'Type' field is subtag or tag, with two exceptions. Subtags whose 'Type' field is
'script' (in other words, subtags defined by ISO 15924) MUST use 'script' (in other words, subtags defined by ISO 15924) MUST use
titlecase. Subtags whose 'Type' field is 'region' (in other words, titlecase. Subtags whose 'Type' field is 'region' (in other words,
subtags defined by ISO 3166) MUST use uppercase. These exceptions subtags defined by ISO 3166) MUST use uppercase. These exceptions
mirror the use of case in the underlying standards. mirror the use of case in the underlying standards.
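The case rule can be expressed compactly. In this sketch the function name is hypothetical; it takes the value of the 'Type' field and the value of the 'Subtag' or 'Tag' field and returns the value in the case used by the registry.

```python
def registry_case(record_type: str, value: str) -> str:
    """Return `value` in the case the registry uses for `record_type`."""
    if record_type == "script":
        # Titlecase, as in ISO 15924: first letter upper, rest lower.
        return value.capitalize()
    if record_type == "region":
        # Uppercase, as in ISO 3166; digits are unaffected.
        return value.upper()
    # Every other subtag or tag is lowercase.
    return value.lower()


print(registry_case("script", "latn"), registry_case("region", "co"))
```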
The field 'Description' MAY appear more than one time. At least one The field 'Description' MAY appear more than one time and contains a
of the 'Description' fields MUST contain a description of the tag description of the tag or subtag in the record. At least one of the
being registered written or transcribed into the Latin script; the 'Description' fields MUST be written or transcribed into the Latin
same or additional fields MAY also include a description in a non- script; the same or additional fields MAY also include a description
Latin script. The 'Description' field is used for identification in a non-Latin script. The 'Description' field is used for
purposes and SHOULD NOT be taken to represent the actual native name identification purposes and SHOULD NOT be taken to represent the
of the language or variation or to be in any particular language. actual native name of the language or variation or to be in any
Most descriptions are taken directly from source standards such as particular language. Most descriptions are taken directly from
ISO 639 or ISO 3166. source standards such as ISO 639 or ISO 3166.
Note: Descriptions in registry entries that correspond to ISO 639, Note: Descriptions in registry entries that correspond to ISO 639,
ISO 15924, ISO 3166 or UN M.49 codes are intended only to indicate ISO 15924, ISO 3166 or UN M.49 codes are intended only to indicate
the meaning of that identifier as defined in the source standard at the meaning of that identifier as defined in the source standard at
the time it was added to the registry. The description does not the time it was added to the registry. The description does not
replace the content of the source standard itself. The descriptions replace the content of the source standard itself. The descriptions
are not intended to be the English localized names for the subtags. are not intended to be the English localized names for the subtags.
Localization or translation of language tag and subtag descriptions Localization or translation of language tag and subtag descriptions
is out of scope of this document. is out of scope of this document.
skipping to change at page 21, line 34 skipping to change at page 21, line 45
implementing language tags using the subtag or tag. implementing language tags using the subtag or tag.
o Suppress-Script o Suppress-Script
* Suppress-Script contains a script subtag that SHOULD NOT be * Suppress-Script contains a script subtag that SHOULD NOT be
used to form language tags with the associated primary language used to form language tags with the associated primary language
subtag. This field MUST only appear in records whose 'Type' subtag. This field MUST only appear in records whose 'Type'
field-value is 'language'. See Section 4.1. field-value is 'language'. See Section 4.1.
The field 'Deprecated' MAY be added to any record via the maintenance The field 'Deprecated' MAY be added to any record via the maintenance
process described in Section 3.2 or via the registration process process described in Section 3.3 or via the registration process
described in Section 3.4. Usually the addition of a 'Deprecated' described in Section 3.5. Usually the addition of a 'Deprecated'
field is due to the action of one of the standards bodies, such as field is due to the action of one of the standards bodies, such as
ISO 3166, withdrawing a code. In some historical cases it might not ISO 3166, withdrawing a code. In some historical cases it might not
have been possible to reconstruct the original deprecation date. For have been possible to reconstruct the original deprecation date. For
these cases, an approximate date appears in the registry. Although these cases, an approximate date appears in the registry. Although
valid in language tags, subtags and tags with a 'Deprecated' field valid in language tags, subtags and tags with a 'Deprecated' field
are deprecated and validating processors SHOULD NOT generate these are deprecated and validating processors SHOULD NOT generate these
subtags. Note that a record that contains a 'Deprecated' field and subtags. Note that a record that contains a 'Deprecated' field and
no corresponding 'Preferred-Value' field has no replacement mapping. no corresponding 'Preferred-Value' field has no replacement mapping.
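A processor might consult these two fields roughly as follows. The sketch assumes that a record is a simple mapping from field-name to a single value and that the 'Deprecated' field uses the same "full-date" form as 'File-Date'; the function names are hypothetical, and the dates in the sample record are illustrative rather than taken from the registry.

```python
from datetime import date
from typing import Optional


def is_deprecated(record: dict[str, str], as_of: date) -> bool:
    """True if the record carries a 'Deprecated' date on or before `as_of`."""
    deprecated = record.get("Deprecated")
    return deprecated is not None and date.fromisoformat(deprecated) <= as_of


def replacement(record: dict[str, str]) -> Optional[str]:
    """Return the 'Preferred-Value' of a deprecated record, if there is one."""
    if "Deprecated" not in record:
        return None
    # A deprecated record without 'Preferred-Value' has no replacement.
    return record.get("Preferred-Value")


# Illustrative record; the 'Added' and 'Deprecated' dates are made up.
sample = {"Type": "grandfathered", "Tag": "art-lojban",
          "Added": "2001-11-11", "Deprecated": "2003-09-02",
          "Preferred-Value": "jbo"}
print(is_deprecated(sample, date(2005, 9, 22)), replacement(sample))
```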
The field 'Preferred-Value' contains a mapping between the record in The field 'Preferred-Value' contains a mapping between the record in
which it appears and a tag or subtag which SHOULD be preferred when which it appears and another tag or subtag. The value in this field
selected language tags. These values form three groups: is STRONGLY RECOMMENDED as the best choice to represent the value of
this record when selecting a language tag. These values form three
groups:
ISO 639 language codes which were later withdrawn in favor of 1. ISO 639 language codes which were later withdrawn in favor of
other codes. These values are mostly a historical curiosity. other codes. These values are mostly a historical curiosity.
ISO 3166 region codes which have been withdrawn in favor of a new 2. ISO 3166 region codes which have been withdrawn in favor of a new
code. This sometimes happens when a country changes its name or code. This sometimes happens when a country changes its name or
administration in such a way that warrants a new region code. administration in such a way that warrants a new region code.
Tags grandfathered from RFC 3066. In many cases these tags have 3. Tags grandfathered from RFC 3066. In many cases these tags have
become obsolete because the values they represent were later become obsolete because the values they represent were later
encoded by ISO 639. encoded by ISO 639.
Records that contain a 'Preferred-Value' field MUST also have a Records that contain a 'Preferred-Value' field MUST also have a
'Deprecated' field. This field contains a date of deprecation. Thus 'Deprecated' field. This field contains a date of deprecation. Thus
a language tag processor can use the registry to construct the valid, a language tag processor can use the registry to construct the valid,
non-deprecated set of subtags for a given date. In addition, for any non-deprecated set of subtags for a given date. In addition, for any
given tag, a processor can construct the set of valid language tags given tag, a processor can construct the set of valid language tags
that correspond to that tag for all dates up to the date of the that correspond to that tag for all dates up to the date of the
registry. The ability to do these mappings MAY be beneficial to registry. The ability to do these mappings MAY be beneficial to
skipping to change at page 22, line 34 skipping to change at page 22, line 45
sometimes do not represent exactly the same meaning as the original sometimes do not represent exactly the same meaning as the original
value. There are many reasons for a country code to be changed and value. There are many reasons for a country code to be changed and
the effect this has on the formation of language tags will depend on the effect this has on the formation of language tags will depend on
the nature of the change in question. the nature of the change in question.
In particular, the 'Preferred-Value' field does not imply retagging In particular, the 'Preferred-Value' field does not imply retagging
content that uses the affected subtag. content that uses the affected subtag.
The field 'Preferred-Value' MUST NOT be modified once created in the The field 'Preferred-Value' MUST NOT be modified once created in the
registry. The field MAY be added to records of type "grandfathered" registry. The field MAY be added to records of type "grandfathered"
and "region" according to the rules in Section 3.2. Otherwise the and "region" according to the rules in Section 3.3. Otherwise the
field MUST NOT be added to any record already in the registry. field MUST NOT be added to any record already in the registry.
The 'Preferred-Value' field in records of type "grandfathered" and The 'Preferred-Value' field in records of type "grandfathered" and
"redundant" contains whole language tags that are strongly "redundant" contains whole language tags that are strongly
RECOMMENDED for use in place of the record's value. In many cases RECOMMENDED for use in place of the record's value. In many cases
the mappings were created by deprecation of the tags during the the mappings were created by deprecation of the tags during the
period before this document was adopted. For example, the tag "no- period before this document was adopted. For example, the tag "no-
nyn" was deprecated in favor of the ISO 639-1 defined language code nyn" was deprecated in favor of the ISO 639-1 defined language code
'nn'. 'nn'.
skipping to change at page 23, line 17 skipping to change at page 23, line 29
the tag "fr-1996" is an inappropriate choice. the tag "fr-1996" is an inappropriate choice.
The field of type 'Prefix' MUST NOT be removed from any record. The The field of type 'Prefix' MUST NOT be removed from any record. The
field-value for this type of field MUST NOT be modified. field-value for this type of field MUST NOT be modified.
The field 'Comments' MAY appear more than once per record. This The field 'Comments' MAY appear more than once per record. This
field MAY be inserted or changed via the registration process and no field MAY be inserted or changed via the registration process and no
guarantee of stability is provided. The content of this field is not guarantee of stability is provided. The content of this field is not
restricted, except by the need to register the information, the restricted, except by the need to register the information, the
suitability of the request, and by reasonable practical size suitability of the request, and by reasonable practical size
limitations. Long texts about a particular subtag are frowned upon. limitations.
The field 'Suppress-Script' MUST only appear in records whose 'Type' The field 'Suppress-Script' MUST only appear in records whose 'Type'
field-value is 'language'. This field MUST NOT appear more than one field-value is 'language'. This field MUST NOT appear more than one
time in a record. This field indicates a script used to write the time in a record. This field indicates a script used to write the
overwhelming majority of documents for the given language and which overwhelming majority of documents for the given language and which
therefore adds no distinguishing information to a language tag. It therefore adds no distinguishing information to a language tag. It
helps ensure greater compatibility between the language tags helps ensure greater compatibility between the language tags
generated according to the rules in this document and language tags generated according to the rules in this document and language tags
and tag processors or consumers based on RFC 3066. For example, and tag processors or consumers based on RFC 3066. For example,
virtually all Icelandic documents are written in the Latin script, virtually all Icelandic documents are written in the Latin script,
making the subtag 'Latn' redundant in the tag "is-Latn". making the subtag 'Latn' redundant in the tag "is-Latn".
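As an illustration of how a tag generator might apply this field, the sketch below removes a script subtag that matches the 'Suppress-Script' value for the tag's primary language. The function name and the `suppress` mapping are hypothetical. The sketch assumes the tag is well-formed and identifies the script subtag as the first four-letter alphabetic subtag that appears before any singleton.

```python
def drop_suppressed_script(tag: str, suppress: dict[str, str]) -> str:
    """Remove the script subtag if the registry suppresses it."""
    subtags = tag.split("-")
    suppressed = suppress.get(subtags[0].lower())
    if suppressed is None:
        return tag
    for i, subtag in enumerate(subtags[1:], start=1):
        if len(subtag) == 1:
            # A singleton starts an extension or private use sequence.
            break
        if len(subtag) == 4 and subtag.isalpha():
            # Assumption: the first four-letter subtag is the script.
            if subtag.lower() == suppressed.lower():
                return "-".join(subtags[:i] + subtags[i + 1:])
            break
    return tag


# Hypothetical mapping built from 'Suppress-Script' fields.
print(drop_suppressed_script("is-Latn", {"is": "Latn"}))
```

Tags whose script differs from the suppressed value, or whose language has no 'Suppress-Script' field, are returned unchanged.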
3.2 Maintenance of the Registry 3.2 Language Subtag Reviewer
The Language Subtag Reviewer is appointed by the IESG for an
indefinite term, subject to removal or replacement at the IESG's
discretion. The Language Subtag Reviewer moderates the ietf-
languages mailing list, responds to requests for registration, and
performs the other registry maintenance duties described in
Section 3.3. Only the Language Subtag Reviewer is permitted to
request IANA to change, update or add records to the Language Subtag
Registry.
The performance or decisions of the Language Subtag Reviewer MAY be
appealed to the IESG under the same rules as other IETF decisions
(see [RFC2026]). The IESG can reverse or overturn the decision of
the Language Subtag Reviewer, provide guidance, or take other
appropriate actions.
3.3 Maintenance of the Registry
Maintenance of the registry requires that as codes are assigned or Maintenance of the registry requires that as codes are assigned or
withdrawn by ISO 639, ISO 15924, ISO 3166, and UN M.49, the Language withdrawn by ISO 639, ISO 15924, ISO 3166, and UN M.49, the Language
Subtag Reviewer MUST evaluate each change, determine whether it Subtag Reviewer MUST evaluate each change, determine whether it
conflicts with existing registry entries, and submit the information conflicts with existing registry entries, and submit the information
to IANA for inclusion in the registry. If an change takes place and to IANA for inclusion in the registry. If a change takes place and
the Language Subtag Reviewer does not do this in a timely manner, the Language Subtag Reviewer does not do this in a timely manner,
then any interested party MAY use the procedure in Section 3.4 to then any interested party MAY use the procedure in Section 3.5 to
register the appropriate update. register the appropriate update.
Note: The redundant and grandfathered entries together are the Note: The redundant and grandfathered entries together are the
complete list of tags registered under [RFC3066]. The redundant tags complete list of tags registered under [RFC3066]. The redundant tags
are those that can now be formed using the subtags defined in the are those that can now be formed using the subtags defined in the
registry together with the rules of Section 2.2. The grandfathered registry together with the rules of Section 2.2. The grandfathered
entries are those that can never be legal under those same entries include those that can never be legal under those same
provisions. provisions.
The set of redundant and grandfathered tags is permanent and stable: The set of redundant and grandfathered tags is permanent and stable:
new entries in this section MUST NOT be added and existing entries new entries in this section MUST NOT be added and existing entries
MUST NOT be removed. Records of type 'grandfathered' MAY have their MUST NOT be removed. Records of type 'grandfathered' MAY have their
type converted to 'redundant': see Section 3.7 for more information. type converted to 'redundant': see item 12 in Section 3.6 for more
information. The decision making process about which tags were
initially grandfathered and which were made redundant is described in
[initial-registry].
RFC 3066 tags that were deprecated prior to the adoption of this RFC 3066 tags that were deprecated prior to the adoption of this
document are part of the list of grandfathered tags and their document are part of the list of grandfathered tags and their
component subtags were not included as registered variants (although component subtags were not included as registered variants (although
they remain eligible for registration). For example, the tag "art- they remain eligible for registration). For example, the tag "art-
lojban" was deprecated in favor of the language subtag 'jbo'. lojban" was deprecated in favor of the language subtag 'jbo'.
The Language Subtag Reviewer MUST ensure that new subtags meet the The Language Subtag Reviewer MUST ensure that new subtags meet the
requirements in Section 4.1 or submit an appropriate alternate subtag requirements in Section 4.1 or submit an appropriate alternate subtag
as described in that section. When either a change or addition to as described in that section. When either a change or addition to
the registry is needed, the Language Subtag Reviewer MUST prepare the the registry is needed, the Language Subtag Reviewer MUST prepare the
complete record, including all fields, and forward it to IANA for complete record, including all fields, and forward it to IANA for
insertion into the registry. insertion into the registry. Each record being modified or inserted
MUST be forwarded in a separate message.
If record represents a new subtag that does not currently exist in If a record represents a new subtag that does not currently exist in
the registry, then the message's subject line MUST include the word the registry, then the message's subject line MUST include the word
"INSERT". If the record represents a change to an existing subtag, "INSERT". If the record represents a change to an existing subtag,
then the subject line of the message MUST include the word "MODIFY". then the subject line of the message MUST include the word "MODIFY".
The message MUST contain both the record for the subtag being The message MUST contain both the record for the subtag being
inserted or modified and the new File-Date record. Here is an inserted or modified and the new File-Date record. Here is an
example of what the body of the message might contain: example of what the body of the message might contain:
LANGUAGE SUBTAG MODIFICATION LANGUAGE SUBTAG MODIFICATION
File-Date: 2005-01-02 File-Date: 2005-01-02
%% %%
Type: variant Type: variant
Subtag: nedis Subtag: nedis
Description: Natisone dialect Description: Natisone dialect
Description: Nadiza dialect Description: Nadiza dialect
Added: 2003-10-09 Added: 2003-10-09
Prefix: sl Prefix: sl
Comments: This is a comment shown Comments: This is a comment shown
as an example. as an example.
%% %%
Figure 4 Figure 4: Example of a Language Subtag Modification Form
Whenever an entry is created or modified in the registry, the 'File- Whenever an entry is created or modified in the registry, the 'File-
Date' record at the start of the registry is updated to reflect the Date' record at the start of the registry is updated to reflect the
most recent modification date in the [RFC3339] "full-date" format. most recent modification date in the [RFC3339] "full-date" format.
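The records carried in such a message could be assembled as sketched below. The function name and argument layout are hypothetical, and the sketch produces only the subject keyword and the two records; any heading line, such as the one shown in the example above, is left to the caller. Long field-bodies are folded so that no line exceeds 72 characters.

```python
import textwrap
from datetime import date


def build_message(fields: list[tuple[str, str]], is_new: bool,
                  file_date: date) -> tuple[str, str]:
    """Return the subject keyword and the record text for one change."""
    subject = "INSERT" if is_new else "MODIFY"
    # The new File-Date record comes first, in "full-date" form.
    lines = [f"File-Date: {file_date.isoformat()}", "%%"]
    for name, body in fields:
        # Fold long fields; continuation lines begin with two spaces.
        lines.append(textwrap.fill(f"{name}: {body}", width=72,
                                   subsequent_indent="  "))
    lines.append("%%")
    return subject, "\n".join(lines)


subject, body = build_message(
    [("Type", "variant"), ("Subtag", "nedis"),
     ("Description", "Natisone dialect"), ("Added", "2003-10-09"),
     ("Prefix", "sl")],
    is_new=False, file_date=date(2005, 1, 2))
print(subject)
print(body)
```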
Values in the 'Subtag' field MUST be lowercase except as provided for Before forwarding a new registration to IANA, the Language Subtag
in Section 3.1. Reviewer MUST ensure that values in the 'Subtag' field match case
according to the description in Section 3.1.
3.3 Stability of IANA Registry Entries 3.4 Stability of IANA Registry Entries
The stability of entries and their meaning in the registry is The stability of entries and their meaning in the registry is
critical to the long term stability of language tags. The rules in critical to the long term stability of language tags. The rules in
this section guarantee that a specific language tag's meaning is this section guarantee that a specific language tag's meaning is
stable over time and will not change. stable over time and will not change.
These rules specifically deal with how changes to codes (including These rules specifically deal with how changes to codes (including
withdrawal and deprecation of codes) maintained by ISO 639, ISO withdrawal and deprecation of codes) maintained by ISO 639, ISO
15924, ISO 3166, and UN M.49 are reflected in the IANA Language 15924, ISO 3166, and UN M.49 are reflected in the IANA Language
Subtag Registry. Assignments to the IANA Language Subtag Registry Subtag Registry. Assignments to the IANA Language Subtag Registry
skipping to change at page 26, line 24 skipping to change at page 27, line 13
('2004-07-06'). ('2004-07-06').
10. Codes assigned by ISO 639, ISO 15924, or ISO 3166 that conflict 10. Codes assigned by ISO 639, ISO 15924, or ISO 3166 that conflict
with existing subtags of the associated type, including subtags with existing subtags of the associated type, including subtags
that are deprecated, MUST NOT be entered into the registry. The that are deprecated, MUST NOT be entered into the registry. The
following additional considerations apply to subtag values that following additional considerations apply to subtag values that
are reassigned: are reassigned:
A. For ISO 639 codes, if the newly assigned code's meaning is A. For ISO 639 codes, if the newly assigned code's meaning is
not represented by a subtag in the IANA registry, the not represented by a subtag in the IANA registry, the
Language Subtag Reviewer, as described in Section 3.4, SHALL Language Subtag Reviewer, as described in Section 3.5, SHALL
prepare a proposal for entering in the IANA registry as soon prepare a proposal for entering in the IANA registry as soon
as practical a registered language subtag as an alternate as practical a registered language subtag as an alternate
value for the new code. The form of the registered language value for the new code. The form of the registered language
subtag will be at the discretion of the Language Subtag subtag will be at the discretion of the Language Subtag
Reviewer and MUST conform to other restrictions on language Reviewer and MUST conform to other restrictions on language
subtags in this document. subtags in this document.
B. For all subtags whose meaning is derived from an external B. For all subtags whose meaning is derived from an external
standard (i.e. ISO 639, ISO 15924, ISO 3166, or UN M.49), standard (i.e. ISO 639, ISO 15924, ISO 3166, or UN M.49),
if a new meaning is assigned to an existing code and the new if a new meaning is assigned to an existing code and the new
meaning broadens the meaning of that code, then the meaning meaning broadens the meaning of that code, then the meaning
for the associated subtag MAY be changed to match. The for the associated subtag MAY be changed to match. The
meaning of a subtag MUST NOT be narrowed, however, as this meaning of a subtag MUST NOT be narrowed, however, as this
can result in an unknown proportion of the existing uses of can result in an unknown proportion of the existing uses of
a subtag becoming invalid. Note: ISO 639 MA/RA has adopted a subtag becoming invalid. Note: ISO 639 MA/RA has adopted
a similar stability policy. a similar stability policy.
C. For ISO 15924 codes, if the newly assigned code's meaning is C. For ISO 15924 codes, if the newly assigned code's meaning is
not represented by a subtag in the IANA registry, the not represented by a subtag in the IANA registry, the
Language Subtag Reviewer, as described in Section 3.4, SHALL Language Subtag Reviewer, as described in Section 3.5, SHALL
prepare a proposal for entering in the IANA registry as soon prepare a proposal for entering in the IANA registry as soon
as practical a registered variant subtag as an alternate as practical a registered variant subtag as an alternate
value for the new code. The form of the registered variant value for the new code. The form of the registered variant
subtag will be at the discretion of the Language Subtag subtag will be at the discretion of the Language Subtag
Reviewer and MUST conform to other restrictions on variant Reviewer and MUST conform to other restrictions on variant
subtags in this document. subtags in this document.
D. For ISO 3166 codes, if the newly assigned code's meaning is D. For ISO 3166 codes, if the newly assigned code's meaning is
associated with the same UN M.49 code as another 'region' associated with the same UN M.49 code as another 'region'
subtag, then the existing region subtag remains as the subtag, then the existing region subtag remains as the
preferred value for that region and no new entry is created. preferred value for that region and no new entry is created.
A comment MAY be added to the existing region subtag A comment MAY be added to the existing region subtag
indicating the relationship to the new ISO 3166 code. indicating the relationship to the new ISO 3166 code.
E. For ISO 3166 codes, if the newly assigned code's meaning is E. For ISO 3166 codes, if the newly assigned code's meaning is
associated with a UN M.49 code that is not represented by an associated with a UN M.49 code that is not represented by an
existing region subtag, then the Language Subtag Reviewer, existing region subtag, then the Language Subtag Reviewer,
as described in Section 3.4, SHALL prepare a proposal for as described in Section 3.5, SHALL prepare a proposal for
entering the appropriate UN M.49 country code as an entry in entering the appropriate UN M.49 country code as an entry in
the IANA registry. the IANA registry.
F. For ISO 3166 codes, if there is no associated UN numeric F. For ISO 3166 codes, if there is no associated UN numeric
code, then the Language Subtag Reviewer SHALL petition the code, then the Language Subtag Reviewer SHALL petition the
UN to create one. If there is no response from the UN UN to create one. If there is no response from the UN
within ninety days of the request being sent, the Language within ninety days of the request being sent, the Language
Subtag Reviewer SHALL prepare a proposal for entering in the Subtag Reviewer SHALL prepare a proposal for entering in the
IANA registry as soon as practical a registered variant IANA registry as soon as practical a registered variant
subtag as an alternate value for the new code. The form of subtag as an alternate value for the new code. The form of
skipping to change at page 27, line 41 skipping to change at page 28, line 28
11. UN M.49 has codes for both countries and areas (such as '276' 11. UN M.49 has codes for both countries and areas (such as '276'
for Germany) and geographical regions and sub-regions (such as for Germany) and geographical regions and sub-regions (such as
'150' for Europe). UN M.49 country or area codes for which '150' for Europe). UN M.49 country or area codes for which
there is no corresponding ISO 3166 code SHOULD NOT be there is no corresponding ISO 3166 code SHOULD NOT be
registered, except as a surrogate for an ISO 3166 code that is registered, except as a surrogate for an ISO 3166 code that is
blocked from registration by an existing subtag. If such a code blocked from registration by an existing subtag. If such a code
becomes necessary, then the registration authority for ISO 3166 becomes necessary, then the registration authority for ISO 3166
SHOULD first be petitioned to assign a code to the region. If SHOULD first be petitioned to assign a code to the region. If
the petition for a code assignment by ISO 3166 is refused or not the petition for a code assignment by ISO 3166 is refused or not
acted on in a timely manner, the registration process described acted on in a timely manner, the registration process described
in Section 3.4 MAY then be used to register the corresponding UN in Section 3.5 MAY then be used to register the corresponding UN
M.49 code. At the time this document was written, there were M.49 code. At the time this document was written, there were
only four such codes: 830 (Channel Islands), 831 (Guernsey), 832 only four such codes: 830 (Channel Islands), 831 (Guernsey), 832
(Jersey), and 833 (Isle of Man). This way UN M.49 codes remain (Jersey), and 833 (Isle of Man). This way UN M.49 codes remain
available as the value of last resort in cases where ISO 3166 available as the value of last resort in cases where ISO 3166
reassigns a deprecated value in the registry. reassigns a deprecated value in the registry.
12. Stability provisions apply to grandfathered tags with this 12. Stability provisions apply to grandfathered tags with this
exception: should all of the subtags in a grandfathered tag exception: should all of the subtags in a grandfathered tag
become valid subtags in the IANA registry, then the field 'Type' become valid subtags in the IANA registry, then the field 'Type'
in that record is changed from 'grandfathered' to 'redundant'. in that record is changed from 'grandfathered' to 'redundant'.
Note that this will not affect language tags that match the Note that this will not affect language tags that match the
grandfathered tag, since these tags will now match valid grandfathered tag, since these tags will now match valid
generative subtag sequences. For example, if the subtag 'gan' generative subtag sequences. For example, if the subtag 'gan'
in the language tag "zh-gan" were to be registered as an in the language tag "zh-gan" were to be registered as an
extended language subtag, then the grandfathered tag "zh-gan" extended language subtag, then the grandfathered tag "zh-gan"
would be deprecated (but existing content or implementations would be deprecated (but existing content or implementations
that use "zh-gan" would remain valid). that use "zh-gan" would remain valid).
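The exception in item 12 can be illustrated with a short sketch. It is a simplification under stated assumptions: the registry is modelled as a plain set of lowercase subtags, and the check only tests membership, not the position or type rules a real validator would apply. The function name `record_type_for` is hypothetical, and the registration of 'gan' is the same hypothetical used in the text above.

```python
def record_type_for(tag, registered_subtags):
    """Return the 'Type' value a grandfathered tag's record would carry.

    The record becomes 'redundant' once every subtag in the tag is
    itself present in the (simplified) registry.
    """
    subtags = tag.lower().split("-")
    if all(subtag in registered_subtags for subtag in subtags):
        return "redundant"
    return "grandfathered"


# Before the hypothetical registration of 'gan', only 'zh' is present.
registry_before = {"zh"}
# After it, both subtags of "zh-gan" are present.
registry_after = {"zh", "gan"}
```

In both states the tag "zh-gan" itself stays usable; only the 'Type' field of its record changes.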
3.4 Registration Procedure for Subtags 3.5 Registration Procedure for Subtags
The procedure given here MUST be used by anyone who wants to use a The procedure given here MUST be used by anyone who wants to use a
subtag not currently in the IANA Language Subtag Registry. subtag not currently in the IANA Language Subtag Registry.
Only subtags of type 'language' and 'variant' will be considered for Only subtags of type 'language' and 'variant' will be considered for
independent registration of new subtags. Handling of subtags needed independent registration of new subtags. Handling of subtags needed
for stability and subtags necessary to keep the registry synchronized for stability and subtags necessary to keep the registry synchronized
with ISO 639, ISO 15924, ISO 3166, and UN M.49 within the limits with ISO 639, ISO 15924, ISO 3166, and UN M.49 within the limits
defined by this document are described in Section 3.2. Stability defined by this document are described in Section 3.3. Stability
provisions are described in Section 3.3. provisions are described in Section 3.4.
This procedure MAY also be used to register or alter the information This procedure MAY also be used to register or alter the information
for the "Description", "Comments", "Deprecated", or "Prefix" fields for the "Description", "Comments", "Deprecated", or "Prefix" fields
in a subtag's record as described in Section 3.3. Changes to all in a subtag's record as described in Section 3.4. Changes to all
other fields in the IANA registry are NOT permitted. other fields in the IANA registry are NOT permitted.
Registering a new subtag or requesting modifications to an existing Registering a new subtag or requesting modifications to an existing
tag or subtag starts with the requester filling out the registration tag or subtag starts with the requester filling out the registration
form reproduced below. Note that each response is not limited in form reproduced below. Note that each response is not limited in
size so that the request can adequately describe the registration. size so that the request can adequately describe the registration.
The fields in the "Record Requested" section SHOULD follow the The fields in the "Record Requested" section SHOULD follow the
requirements in Section 3.1. requirements in Section 3.1.
LANGUAGE SUBTAG REGISTRATION FORM LANGUAGE SUBTAG REGISTRATION FORM
skipping to change at page 29, line 24 skipping to change at page 29, line 43
Preferred-Value: Preferred-Value:
Deprecated: Deprecated:
Suppress-Script: Suppress-Script:
Comments: Comments:
4. Intended meaning of the subtag: 4. Intended meaning of the subtag:
5. Reference to published description 5. Reference to published description
of the language (book or article): of the language (book or article):
6. Any other relevant information: 6. Any other relevant information:
Figure 5 Figure 5: The Language Subtag Registration Form
The subtag registration form MUST be sent to The subtag registration form MUST be sent to
<ietf-languages@iana.org> for a two week review period before it can <ietf-languages@iana.org> for a two week review period before it can
be submitted to IANA. (This is an open list and can be joined by be submitted to IANA. (This is an open list and can be joined by
sending a request to <ietf-languages-request@iana.org>.) sending a request to <ietf-languages-request@iana.org>.)
Variant and extlang subtags are always registered for use with a Variant subtags are usually registered for use with a particular
particular range of language tags. For example, the subtag 'rozaj' range of language tags. For example, the subtag 'rozaj' is intended
is intended for use with language tags that start with the primary for use with language tags that start with the primary language
language subtag "sl", since Resian is a dialect of Slovenian. Thus subtag "sl", since Resian is a dialect of Slovenian. Thus the subtag
the subtag 'rozaj' could be included in tags such as "sl-Latn-rozaj" 'rozaj' would be appropriate in tags such as "sl-Latn-rozaj" or "sl-
or "sl-IT-rozaj". This information is stored in the "Prefix" field IT-rozaj". This information is stored in the "Prefix" field in the
in the registry. Variant registration requests are REQUIRED to registry. Variant registration requests SHOULD include at least one
include at least one "Prefix" field in the registration form. "Prefix" field in the registration form.
Extended language subtags are reserved for future standardization.
These subtags will be REQUIRED to include exactly one "Prefix" field
once they are allowed for registration.
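How a 'Prefix' field might be consulted can be sketched as follows. This is an illustration only: the function name `fits_prefix` is hypothetical, matching is assumed to be case-insensitive and to fall on subtag boundaries, and the result is advisory, since the text treats 'Prefix' as a guide to usage.

```python
def fits_prefix(tag, prefixes):
    """Return True if tag equals, or starts with, one of the prefixes.

    A prefix matches only on a subtag boundary, so "sl" matches
    "sl-IT-rozaj" but not "slv-rozaj".
    """
    tag = tag.lower()
    for prefix in prefixes:
        prefix = prefix.lower()
        if tag == prefix or tag.startswith(prefix + "-"):
            return True
    return False


# 'Prefix' value recorded for the variant subtag 'rozaj' in the example above.
ROZAJ_PREFIXES = ["sl"]
```

With these values, "sl-Latn-rozaj" and "sl-IT-rozaj" fit the recorded prefix, while a tag such as "en-rozaj" does not.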
The 'Prefix' field for a given registered subtag exists in the IANA The 'Prefix' field for a given registered subtag exists in the IANA
registry as a guide to usage. Additional prefixes MAY be added by registry as a guide to usage. Additional prefixes MAY be added by
filing an additional registration form. In that form, the "Any other filing an additional registration form. In that form, the "Any other
relevant information:" field MUST indicate that it is the addition of relevant information:" field MUST indicate that it is the addition of
a prefix. a prefix.
Requests to add a prefix to a variant subtag that imply a different Requests to add a prefix to a variant subtag that imply a different
semantic meaning will probably be rejected. For example, a request semantic meaning will probably be rejected. For example, a request
to add the prefix "de" to the subtag 'nedis' so that the tag "de- to add the prefix "de" to the subtag 'nedis' so that the tag "de-
skipping to change at page 30, line 18 skipping to change at page 30, line 41
MUST be escaped using the syntax described in Section 3.1. The MUST be escaped using the syntax described in Section 3.1. The
'Description' field is used for identification purposes and doesn't 'Description' field is used for identification purposes and doesn't
necessarily represent the actual native name of the language or necessarily represent the actual native name of the language or
variation, nor does it need to be in any particular language. variation, nor does it need to be in any particular language.
While the 'Description' field itself is not guaranteed to be stable While the 'Description' field itself is not guaranteed to be stable
and errata corrections MAY be undertaken from time to time, attempts and errata corrections MAY be undertaken from time to time, attempts
to provide translations or transcriptions of entries in the registry to provide translations or transcriptions of entries in the registry
itself will probably be frowned upon by the community or rejected itself will probably be frowned upon by the community or rejected
outright, as changes of this nature have an impact on the provisions outright, as changes of this nature have an impact on the provisions
in Section 3.3. in Section 3.4.
The Language Subtag Reviewer is responsible for responding to
requests for the registration of subtags through the registration
process and is appointed by the IESG.
When the two week period has passed the Language Subtag Reviewer When the two week period has passed the Language Subtag Reviewer
either forwards the record to be inserted or modified to either forwards the record to be inserted or modified to
iana@iana.org according to the procedure described in Section 3.2, or iana@iana.org according to the procedure described in Section 3.3, or
rejects the request because of significant objections raised on the rejects the request because of significant objections raised on the
list or due to problems with constraints in this document (which MUST list or due to problems with constraints in this document (which MUST
be explicitly cited). The reviewer MAY also extend the review period be explicitly cited). The Language Subtag Reviewer MAY also extend
in two week increments to permit further discussion. The reviewer the review period in two week increments to permit further
MUST indicate on the list whether the registration has been accepted, discussion. The Language Subtag Reviewer MUST indicate on the list
rejected, or extended following each two week period. whether the registration has been accepted, rejected, or extended
following each two week period.
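The timing of the review can be sketched as simple date arithmetic. The sketch assumes the period is counted in calendar days from the date the form reaches the list, which the text does not state explicitly; `review_deadline` and the example date are hypothetical.

```python
from datetime import date, timedelta

# Length of the initial review period and of each extension.
REVIEW_PERIOD = timedelta(weeks=2)

# Outcomes the reviewer announces on the list after each period.
DECISIONS = ("accepted", "rejected", "extended")


def review_deadline(submitted, extensions=0):
    """Return the date the current review period ends.

    extensions counts how many two week extensions have been granted.
    """
    return submitted + REVIEW_PERIOD * (1 + extensions)


# Hypothetical submission date used only for illustration.
submitted = date(2005, 9, 22)
first_deadline = review_deadline(submitted)
extended_deadline = review_deadline(submitted, extensions=1)
```

A rejected request that is resubmitted would start again from a new submission date, since resubmission restarts the comment period.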
Note that the reviewer MAY raise objections on the list if he or she Note that the Language Subtag Reviewer MAY raise objections on the
so desires. The important thing is that the objection MUST be made list if he or she so desires. The important thing is that the
publicly. objection MUST be made publicly.
The applicant is free to modify a rejected application with The applicant is free to modify a rejected application with
additional information and submit it again; this restarts the two additional information and submit it again; this restarts the two
week comment period. week comment period.
Decisions made by the reviewer MAY be appealed to the IESG [RFC2028] Decisions made by the Language Subtag Reviewer MAY be appealed to the
under the same rules as other IETF decisions [RFC2026]. IESG [RFC2028] under the same rules as other IETF decisions
[RFC2026].
All approved registration forms are available online in the directory All approved registration forms are available online in the directory
http://www.iana.org/numbers.html under "languages". http://www.iana.org/numbers.html under "languages".
Updates or changes to existing records follow the same procedure as Updates or changes to existing records follow the same procedure as
new registrations. The Language Subtag Reviewer decides whether new registrations. The Language Subtag Reviewer decides whether
there is consensus to update the registration following the two week there is consensus to update the registration following the two week
review period; normally objections by the original registrant will review period; normally objections by the original registrant will
carry extra weight in forming such a consensus. carry extra weight in forming such a consensus.
Registrations are permanent and stable. Once registered, subtags Registrations are permanent and stable. Once registered, subtags
will not be removed from the registry and will remain a valid way in will not be removed from the registry and will remain a valid way in
which to specify a specific language or variant. which to specify a specific language or variant.
Note: The "Description" in the registration form is Note: The "Description" in the registration form is
intended as an aid to people trying to verify whether a language is intended as an aid to people trying to verify whether a language is
registered or what language or language variation a particular subtag registered or what language or language variation a particular subtag
refers to. In most cases, reference to an authoritative grammar or refers to. In most cases, reference to an authoritative grammar or
dictionary of that language will be useful; in cases where no such dictionary of that language will be useful; in cases where no such
work exists, other well known works describing that language or in work exists, other well known works describing that language or in
that language MAY be appropriate. The subtag reviewer decides what that language MAY be appropriate. The Language Subtag Reviewer
constitutes "good enough" reference material. This requirement is decides what constitutes "good enough" reference material. This
not intended to exclude particular languages or dialects due to the requirement is not intended to exclude particular languages or
size of the speaker population or lack of a standardized orthography. dialects due to the size of the speaker population or lack of a
Minority languages will be considered equally on their own merits. standardized orthography. Minority languages will be considered
equally on their own merits.
3.5 Possibilities for Registration 3.6 Possibilities for Registration
Possibilities for registration of subtags or information about Possibilities for registration of subtags or information about
subtags include: subtags include:
o Primary language subtags for languages not listed in ISO 639 that o Primary language subtags for languages not listed in ISO 639 that
are not variants of any listed or registered language MAY be are not variants of any listed or registered language MAY be
registered. At the time this document was created there were no registered. At the time this document was created there were no
examples of this form of subtag. Before attempting to register a examples of this form of subtag. Before attempting to register a
language subtag, there MUST be an attempt to register the language language subtag, there MUST be an attempt to register the language
with ISO 639. Subtags MUST NOT be registered for codes that exist with ISO 639. Subtags MUST NOT be registered for codes that exist
skipping to change at page 31, line 50 skipping to change at page 32, line 25
o Dialect or other divisions or variations within a language, its o Dialect or other divisions or variations within a language, its
orthography, writing system, regional or historical usage, orthography, writing system, regional or historical usage,
transliteration or other transformation, or distinguishing transliteration or other transformation, or distinguishing
variation MAY be registered as variant subtags. An example is the variation MAY be registered as variant subtags. An example is the
'rozaj' subtag (the Resian dialect of Slovenian). 'rozaj' subtag (the Resian dialect of Slovenian).
o The addition or maintenance of fields (generally of an o The addition or maintenance of fields (generally of an
informational nature) in Tag or Subtag records as described in informational nature) in Tag or Subtag records as described in
Section 3.1 and subject to the stability provisions in Section 3.1 and subject to the stability provisions in
Section 3.3. This includes descriptions; comments; deprecation Section 3.4. This includes descriptions; comments; deprecation
and preferred values for obsolete or withdrawn codes; or the and preferred values for obsolete or withdrawn codes; or the
addition of script or extlang information to primary language addition of script or extlang information to primary language
subtags. subtags.
o The addition of records and related field value changes necessary o The addition of records and related field value changes necessary
to reflect assignments made by ISO 639, ISO 15924, ISO 3166, and to reflect assignments made by ISO 639, ISO 15924, ISO 3166, and
UN M.49 as described in Section 3.3. UN M.49 as described in Section 3.4.
Subtags proposed for registration that would cause all or part of a
grandfathered tag to become redundant but whose meaning conflicts
with or alters the meaning of the grandfathered tag MUST be rejected.
This document leaves the decision on what subtags or changes to This document leaves the decision on what subtags or changes to
subtags are appropriate (or not) to the registration process subtags are appropriate (or not) to the registration process
described in Section 3.4. described in Section 3.5.
Note: four character primary language subtags are reserved to allow Note: four character primary language subtags are reserved to allow
for the possibility of alpha4 codes in some future addition to the for the possibility of alpha4 codes in some future addition to the
ISO 639 family of standards. ISO 639 family of standards.
ISO 639 defines a maintenance agency for additions to and changes in ISO 639 defines a maintenance agency for additions to and changes in
the list of languages in ISO 639. This agency is: the list of languages in ISO 639. This agency is:
International Information Centre for Terminology (Infoterm) International Information Centre for Terminology (Infoterm)
Aichholzgasse 6/12, AT-1120 Aichholzgasse 6/12, AT-1120
skipping to change at page 33, line 15 skipping to change at page 33, line 43
Statistical Services Branch Statistical Services Branch
Statistics Division Statistics Division
United Nations, Room DC2-1620 United Nations, Room DC2-1620
New York, NY 10017, USA New York, NY 10017, USA
Fax: +1-212-963-0623 Fax: +1-212-963-0623
E-mail: statistics@un.org E-mail: statistics@un.org
URL: http://unstats.un.org/unsd/methods/m49/m49alpha.htm URL: http://unstats.un.org/unsd/methods/m49/m49alpha.htm
3.6 Extensions and Extensions Registry 3.7 Extensions and Extensions Registry
Extension subtags are those introduced by single character subtags Extension subtags are those introduced by single character subtags
("singletons") other than 'x'. They are reserved for the generation ("singletons") other than 'x'. They are reserved for the generation
of identifiers which contain a language component, and are compatible of identifiers which contain a language component, and are compatible
with applications that understand language tags. with applications that understand language tags.
The structure and form of extensions are defined by this document so The structure and form of extensions are defined by this document so
that implementations can be created that are forward compatible with that implementations can be created that are forward compatible with
applications that might be created using singletons in the future. applications that might be created using singletons in the future.
In addition, defining a mechanism for maintaining singletons will In addition, defining a mechanism for maintaining singletons will
skipping to change at page 36, line 20 skipping to change at page 36, line 36
that the most significant information be in the most significant that the most significant information be in the most significant
(left-most) subtags, and that the specification gracefully handle (left-most) subtags, and that the specification gracefully handle
truncated subtags. truncated subtags.
When a language tag is to be used in a specific, known, protocol, it When a language tag is to be used in a specific, known, protocol, it
is RECOMMENDED that the language tag not contain extensions not is RECOMMENDED that the language tag not contain extensions not
supported by that protocol. In addition, note that some protocols supported by that protocol. In addition, note that some protocols
MAY impose upper limits on the length of the strings used to store or MAY impose upper limits on the length of the strings used to store or
transport the language tag. transport the language tag.
3.7 Initialization of the Registries 3.8 Initialization of the Registries
Upon adoption of this document an initial version of the Language Upon adoption of this document an initial version of the Language
Subtag Registry containing the various subtags initially valid in a Subtag Registry containing the various subtags initially valid in a
language tag is necessary. This collection of subtags, along with a language tag is necessary. This collection of subtags, along with a
description of the process used to create it, is described by description of the process used to create it, is described by
[initial-registry]. [initial-registry]. IANA SHALL publish the initial version of the
registry described by this document from the content of [initial-
registry]. Once published by IANA, the maintenance procedures, rules
and registration processes described in this document will be
available for new registrations or updates.
Registrations that are in process under the rules defined in Registrations that are in process under the rules defined in
[RFC3066] when this document is adopted MAY be completed under the [RFC3066] when this document is adopted MAY be completed under the
former rules, at the discretion of the language tag reviewer. Any former rules, at the discretion of the Language Tag Reviewer (as
new registrations submitted after the adoption of this document MUST described in [RFC3066]). Until the IESG officially appoints a
be rejected. Language Subtag Reviewer, the existing Language Tag Reviewer SHALL
serve as the Language Subtag Reviewer.
Any new registrations submitted using the RFC 3066 forms or format
after the adoption of this document and publication of the registry
by IANA MUST be rejected.
An initial version of the Language Extension Registry described in An initial version of the Language Extension Registry described in
Section 3.6 is also needed. The Language Extension Registry SHALL be Section 3.7 is also needed. The Language Extension Registry SHALL be
initialized with a single record containing a single field of type initialized with a single record containing a single field of type
"File-Date" as a placeholder for future assignments. "File-Date" as a placeholder for future assignments.
4. Formation and Processing of Language Tags 4. Formation and Processing of Language Tags
This section addresses how to use the information in the registry This section addresses how to use the information in the registry
with the tag syntax to choose, form and process language tags. with the tag syntax to choose, form and process language tags.
4.1 Choice of Language Tag 4.1 Choice of Language Tag
skipping to change at page 44, line 34 skipping to change at page 45, line 34
various subtags in the extension and thus MAY define an alternate various subtags in the extension and thus MAY define an alternate
canonicalization scheme for the extension's subtags. Extensions MAY canonicalization scheme for the extension's subtags. Extensions MAY
define how the order of the extension's subtags are interpreted. For define how the order of the extension's subtags are interpreted. For
example, an extension could define that its subtags are in canonical example, an extension could define that its subtags are in canonical
order when the subtags are placed into ASCII order: that is, "en-a- order when the subtags are placed into ASCII order: that is, "en-a-
aaa-bbb-ccc" instead of "en-a-ccc-bbb-aaa". Another extension might aaa-bbb-ccc" instead of "en-a-ccc-bbb-aaa". Another extension might
define that the order of the subtags influences their semantic define that the order of the subtags influences their semantic
meaning (so that "en-b-ccc-bbb-aaa" has a different value from "en-b- meaning (so that "en-b-ccc-bbb-aaa" has a different value from "en-b-
aaa-bbb-ccc"). However, extension specifications SHOULD be designed aaa-bbb-ccc"). However, extension specifications SHOULD be designed
so that they are tolerant of the typical processes described in so that they are tolerant of the typical processes described in
Section 3.6. Section 3.7.
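As an illustration of the first example above, the following is a minimal sketch of an extension-defined canonicalization that places an extension's subtags into ASCII order. The function name and the rule itself are hypothetical; the sketch assumes that an extension runs from its singleton up to the next single-character subtag, and that everything after an "x" singleton is private use and is left untouched.

```python
def canonicalize_extension(tag: str, singleton: str) -> str:
    """Sort the subtags of one extension into ASCII order.

    Hypothetical rule for an imaginary extension; a real extension
    would define its own canonicalization in its specification.
    """
    subtags = tag.split("-")
    start = None
    # Skip the primary language subtag; stop at "x" (private use follows).
    for i, subtag in enumerate(subtags[1:], start=1):
        if subtag.lower() == "x":
            break
        if subtag.lower() == singleton.lower():
            start = i
            break
    if start is None:
        return tag  # the extension is not present in this tag

    # The extension's subtags run until the next single-character subtag.
    end = start + 1
    while end < len(subtags) and len(subtags[end]) > 1:
        end += 1

    subtags[start + 1:end] = sorted(subtags[start + 1:end], key=str.lower)
    return "-".join(subtags)
```

An order-sensitive extension, such as the hypothetical "b" extension in the second example, would not apply this step, because reordering would change the meaning of the tag.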
4.5 Considerations for Private Use Subtags 4.5 Considerations for Private Use Subtags
Private use subtags, like all other subtags, MUST conform to the Private use subtags, like all other subtags, MUST conform to the
format and content constraints in the ABNF. Private use subtags have format and content constraints in the ABNF. Private use subtags have
no meaning outside the private agreement between the parties that no meaning outside the private agreement between the parties that
intend to use or exchange language tags that employ them. The same intend to use or exchange language tags that employ them. The same
subtags MAY be used with a different meaning under a separate private subtags MAY be used with a different meaning under a separate private
agreement. They SHOULD NOT be used where alternatives exist and agreement. They SHOULD NOT be used where alternatives exist and
SHOULD NOT be used in content or protocols intended for general use. SHOULD NOT be used in content or protocols intended for general use.
skipping to change at page 46, line 32 skipping to change at page 47, line 32
work to create it will be performed externally. work to create it will be performed externally.
The new registry MUST be listed under "Language Tags" at The new registry MUST be listed under "Language Tags" at
<http://www.iana.org/numbers.html>, replacing the existing <http://www.iana.org/numbers.html>, replacing the existing
registrations defined by [RFC3066]. The existing set of registration registrations defined by [RFC3066]. The existing set of registration
forms and RFC 3066 registrations MUST be relabeled as "Language Tags forms and RFC 3066 registrations MUST be relabeled as "Language Tags
(Obsolete)" and maintained (but not added to or modified). (Obsolete)" and maintained (but not added to or modified).
Future work on the Language Subtag Registry SHALL be limited to Future work on the Language Subtag Registry SHALL be limited to
inserting or replacing whole records preformatted for IANA by the inserting or replacing whole records preformatted for IANA by the
Language Subtag Reviewer as described in Section 3.2 of this Language Subtag Reviewer as described in Section 3.3 of this document
document. This simplifies IANA's work by limiting it to placing the and archiving the forwarded registration form.
text in the appropriate location in the registry.
Each record MUST be sent to iana@iana.org with a subject line Each record MUST be sent to iana@iana.org with a subject line
indicating whether the enclosed record is an insertion of a new indicating whether the enclosed record is an insertion of a new
record (indicated by the word "INSERT" in the subject line) or a record (indicated by the word "INSERT" in the subject line) or a
replacement of an existing record (indicated by the word "MODIFY" in replacement of an existing record (indicated by the word "MODIFY" in
the subject line). Records MUST NOT be deleted from the registry. the subject line). Records MUST NOT be deleted from the registry.
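A minimal sketch of how a receiving script might classify such a request from its subject line. The function name, the case-insensitive matching, and the rejection of subject lines containing both words or neither are assumptions made for illustration; the text above only requires that the word "INSERT" or "MODIFY" appear in the subject line.

```python
import re


def request_kind(subject: str) -> str:
    """Return "INSERT" or "MODIFY" for a registry request subject line.

    Hypothetical helper: matching is case-insensitive and ambiguous
    subject lines are rejected, neither of which the draft specifies.
    """
    words = {word.upper() for word in re.findall(r"[A-Za-z]+", subject)}
    has_insert = "INSERT" in words
    has_modify = "MODIFY" in words
    if has_insert == has_modify:
        raise ValueError("subject must contain exactly one of INSERT or MODIFY")
    return "INSERT" if has_insert else "MODIFY"
```

There is deliberately no "DELETE" case, since records are never deleted from the registry.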
IANA MUST place any inserted or modified records into the appropriate IANA MUST place any inserted or modified records into the appropriate
section of the language subtag registry, grouping the records by section of the language subtag registry, grouping the records by
their 'Type' field. Inserted records MAY be placed anywhere in the their 'Type' field. Inserted records MAY be placed anywhere in the
appropriate section; there is no guarantee of the order of the appropriate section; there is no guarantee of the order of the
skipping to change at page 47, line 8 skipping to change at page 48, line 8
Included in any request to insert or modify records MUST be a new Included in any request to insert or modify records MUST be a new
File-Date record. This record MUST be placed first in the registry. File-Date record. This record MUST be placed first in the registry.
In the event that the File-Date record present in the registry has a In the event that the File-Date record present in the registry has a
later date than the record being inserted or modified, the existing later date than the record being inserted or modified, the existing
record MUST be preserved. record MUST be preserved.
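The File-Date rule can be sketched as a simple date comparison. This assumes that File-Date values are written as YYYY-MM-DD; the function name is hypothetical.

```python
from datetime import date


def select_file_date(existing: str, incoming: str) -> str:
    """Pick the File-Date value to keep in the registry.

    Assumes both values are YYYY-MM-DD strings. The existing value is
    preserved only when it is strictly later than the incoming one.
    """
    if date.fromisoformat(existing) > date.fromisoformat(incoming):
        return existing
    return incoming
```

For example, if the registry already carries 2005-09-22 and a request arrives dated 2005-08-17, the registry keeps 2005-09-22.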
5.2 Extensions Registry 5.2 Extensions Registry
The Language Tag Extensions registry will also be generated and sent The Language Tag Extensions registry will also be generated and sent
to IANA as described in Section 3.6. This registry can contain at to IANA as described in Section 3.7. This registry can contain at
most 35 records and thus changes to this registry are expected to be most 35 records and thus changes to this registry are expected to be
very infrequent. very infrequent.
Future work by IANA on the Language Tag Extensions Registry is Future work by IANA on the Language Tag Extensions Registry is
limited to two cases. First, the IESG MAY request that new records limited to two cases. First, the IESG MAY request that new records
be inserted into this registry from time to time. These requests be inserted into this registry from time to time. These requests
MUST include the record to insert in the exact format described in MUST include the record to insert in the exact format described in
Section 3.6. In addition, there MAY be occasional requests from the Section 3.7. In addition, there MAY be occasional requests from the
maintaining authority for a specific extension to update the contact maintaining authority for a specific extension to update the contact
information or URLs in the record. These requests MUST include the information or URLs in the record. These requests MUST include the
complete, updated record. IANA is not responsible for validating the complete, updated record. IANA is not responsible for validating the
information provided, only that it is properly formatted. It should information provided, only that it is properly formatted. It should
reasonably be seen to come from the maintaining authority named in reasonably be seen to come from the maintaining authority named in
the record present in the registry. the record present in the registry.
6. Security Considerations 6. Security Considerations
Language tags used in content negotiation, like any other information Language tags used in content negotiation, like any other information
skipping to change at page 48, line 36 skipping to change at page 49, line 36
other than the one(s) associated with or specified by that language other than the one(s) associated with or specified by that language
tag. tag.
Since there is no limit to the number of variant, private use, and Since there is no limit to the number of variant, private use, and
extension subtags, and consequently no limit on the possible length extension subtags, and consequently no limit on the possible length
of a tag, implementations need to guard against buffer overflow of a tag, implementations need to guard against buffer overflow
attacks. See Section 4.3 for details on language tag truncation, attacks. See Section 4.3 for details on language tag truncation,
which can occur as a consequence of defenses against buffer overflow. which can occur as a consequence of defenses against buffer overflow.
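One possible defense is to enforce a fixed maximum length and truncate longer tags on subtag boundaries. The sketch below is an assumption-laden illustration, not the procedure of Section 4.3: it assumes truncation removes whole subtags from the right, and that a truncated tag should not end in a single-character subtag. The function name and the max_len parameter are hypothetical.

```python
def truncate_tag(tag: str, max_len: int) -> str:
    """Shorten a language tag to at most max_len characters.

    Hypothetical sketch: whole subtags are dropped from the right, then
    any trailing single-character subtag is dropped as well, since a
    singleton with nothing after it would be malformed.
    """
    if len(tag) <= max_len:
        return tag
    subtags = tag.split("-")
    while subtags and len("-".join(subtags)) > max_len:
        subtags.pop()
    while subtags and len(subtags[-1]) == 1:
        subtags.pop()
    if not subtags:
        raise ValueError("tag cannot be truncated to fit max_len")
    return "-".join(subtags)
```

Checking the length before copying a tag into a fixed-size buffer, rather than after, is what actually prevents the overflow; the truncation only decides what to keep.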
Although the specification of valid subtags for an extension (see: Although the specification of valid subtags for an extension (see:
Section 3.6) MUST be available over the Internet, implementations Section 3.7) MUST be available over the Internet, implementations
SHOULD NOT mechanically depend on it being always accessible, to SHOULD NOT mechanically depend on it being always accessible, to
prevent denial-of-service attacks. prevent denial-of-service attacks.
7. Character Set Considerations 7. Character Set Considerations
The syntax in this document requires that language tags use only the The syntax in this document requires that language tags use only the
characters A-Z, a-z, 0-9, and HYPHEN-MINUS, which are present in most characters A-Z, a-z, 0-9, and HYPHEN-MINUS, which are present in most
character sets, so the composition of language tags should not have character sets, so the composition of language tags should not have
any character set issues. any character set issues.
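The character repertoire stated above can be checked with a short sketch like the following. It tests only which characters appear, not whether the tag is well-formed under the ABNF; the function name is hypothetical.

```python
import re

# A-Z, a-z, 0-9 and HYPHEN-MINUS only. An explicit ASCII class is used
# because str.isalnum() would also accept non-ASCII letters and digits.
_TAG_CHARACTERS = re.compile(r"[A-Za-z0-9-]+")


def uses_only_tag_characters(tag: str) -> bool:
    """Return True if tag is non-empty and uses only the permitted characters."""
    return _TAG_CHARACTERS.fullmatch(tag) is not None
```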
skipping to change at page 55, line 26 skipping to change at page 56, line 26
draft-ietf-ltru-initial-registry-00.txt>. draft-ietf-ltru-initial-registry-00.txt>.
[iso639.principles] [iso639.principles]
ISO 639 Joint Advisory Committee, "ISO 639 Joint Advisory ISO 639 Joint Advisory Committee, "ISO 639 Joint Advisory
Committee: Working principles for ISO 639 maintenance", Committee: Working principles for ISO 639 maintenance",
March 2000, March 2000,
<http://www.loc.gov/standards/iso639-2/ <http://www.loc.gov/standards/iso639-2/
iso639jac_n3r.html>. iso639jac_n3r.html>.
[record-jar] [record-jar]
Raymond, E., "The Art of Unix Programming", 2003. Raymond, E., "The Art of Unix Programming", 2003,
<urn:isbn:0-13-142901-9>.
Authors' Addresses Authors' Addresses
Addison Phillips (editor) Addison Phillips (editor)
Quest Software Quest Software
Email: addison.phillips@quest.com Email: addison.phillips@quest.com
URI: http://www.inter-locale.com URI: http://www.inter-locale.com
Mark Davis (editor) Mark Davis (editor)
skipping to change at page 56, line 22 skipping to change at page 57, line 22
document, made enormous contributions directly or indirectly to this document, made enormous contributions directly or indirectly to this
document and are generally responsible for the success of language document and are generally responsible for the success of language
tags. tags.
The following people (in alphabetical order) contributed to this The following people (in alphabetical order) contributed to this
document or to RFCs 1766 and 3066: document or to RFCs 1766 and 3066:
Glenn Adams, Harald Tveit Alvestrand, Tim Berners-Lee, Marc Blanchet, Glenn Adams, Harald Tveit Alvestrand, Tim Berners-Lee, Marc Blanchet,
Nathaniel Borenstein, Karen Broome, Eric Brunner, Sean M. Burke, M.T. Nathaniel Borenstein, Karen Broome, Eric Brunner, Sean M. Burke, M.T.
Carrasco Benitez, Jeremy Carroll, John Clews, Jim Conklin, Peter Carrasco Benitez, Jeremy Carroll, John Clews, Jim Conklin, Peter
Constable, John Cowan, Mark Crispin, Dave Crocker, Martin Duerst, Constable, John Cowan, Mark Crispin, Dave Crocker, Elwyn Davies,
Frank Ellerman, Michael Everson, Doug Ewell, Ned Freed, Tim Goodwin, Martin Duerst, Frank Ellerman, Michael Everson, Doug Ewell, Ned
Dirk-Willem van Gulik, Marion Gunn, Joel Halpren, Elliotte Rusty Freed, Tim Goodwin, Dirk-Willem van Gulik, Marion Gunn, Joel Halpren,
Harold, Paul Hoffman, Scott Hollenbeck, Richard Ishida, Olle Elliotte Rusty Harold, Paul Hoffman, Scott Hollenbeck, Richard
Jarnefors, Kent Karlsson, John Klensin, Erkki Kolehmainen, Alain Ishida, Olle Jarnefors, Kent Karlsson, John Klensin, Erkki
LaBonte, Eric Mader, Ira McDonald, Keith Moore, Chris Newman, Kolehmainen, Alain LaBonte, Eric Mader, Ira McDonald, Keith Moore,
Masataka Ohta, Dylan Pierce, Randy Presuhn, George Rhoten, Felix Chris Newman, Masataka Ohta, Dylan Pierce, Randy Presuhn, George
Sasaki, Markus Scherer, Keld Jorn Simonsen, Thierry Sourbier, Otto Rhoten, Felix Sasaki, Markus Scherer, Keld Jorn Simonsen, Thierry
Stolz, Tex Texin, Andrea Vine, Rhys Weatherley, Misha Wolf, Francois Sourbier, Otto Stolz, Tex Texin, Andrea Vine, Rhys Weatherley, Misha
Yergeau and many, many others. Wolf, Francois Yergeau and many, many others.
Very special thanks must go to Harald Tveit Alvestrand, who Very special thanks must go to Harald Tveit Alvestrand, who
originated RFCs 1766 and 3066, and without whom this document would originated RFCs 1766 and 3066, and without whom this document would
not have been possible. Special thanks must go to Michael Everson, not have been possible. Special thanks must go to Michael Everson,
who has served as language tag reviewer for almost the complete who has served as Language Tag Reviewer for almost the complete
period since the publication of RFC 1766. Special thanks to Doug period since the publication of RFC 1766. Special thanks to Doug
Ewell, for his production of the first complete subtag registry, and Ewell, for his production of the first complete subtag registry, and
his work in producing a test parser for verifying language tags. his work in producing a test parser for verifying language tags.
Appendix B. Examples of Language Tags (Informative) Appendix B. Examples of Language Tags (Informative)
Simple language subtag: Simple language subtag:
de (German) de (German)
 End of changes. 80 change blocks. 
164 lines changed or deleted 226 lines changed or added