draft-ietf-nfsv4-internationalization-00.txt   draft-ietf-nfsv4-internationalization-01.txt 
NFSv4 D. Noveck NFSv4 D. Noveck
Internet-Draft NetApp Internet-Draft NetApp
Updates: 8881, 7530 (if approved) March 26, 2021 Updates: 8881, 7530 (if approved) September 26, 2021
Intended status: Standards Track Intended status: Standards Track
Expires: September 27, 2021 Expires: March 30, 2022
Internationalization for the NFSv4 Protocols Internationalization for the NFSv4 Protocols
draft-ietf-nfsv4-internationalization-00 draft-ietf-nfsv4-internationalization-01
Abstract Abstract
This document describes the handling of internationalization for all This document describes the handling of internationalization for all
NFSv4 protocols, including NFSv4.0, NFSv4.1, NFSv4.2 and extensions NFSv4 protocols, including NFSv4.0, NFSv4.1, NFSv4.2 and extensions
thereof, and future minor versions. thereof, and future minor versions.
It updates RFC7530 and RFC8881. It updates RFC7530 and RFC8881.
Status of This Memo Status of This Memo
skipping to change at page 1, line 35 skipping to change at page 1, line 35
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 27, 2021. This Internet-Draft will expire on March 30, 2022.
Copyright Notice Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 4, line 5 skipping to change at page 4, line 5
internationalization here supersedes that in RFC7530 [3], the internationalization here supersedes that in RFC7530 [3], the
treatments are intended to be essentially the same, in order to treatments are intended to be essentially the same, in order to
eliminate interoperability issues. eliminate interoperability issues.
Because of a change in the handling of Internationalized domain Because of a change in the handling of Internationalized domain
names, there are some differences from the handling in RFC7530 names, there are some differences from the handling in RFC7530
[3], as discussed in Appendix A. For a discussion of those [3], as discussed in Appendix A. For a discussion of those
differences and potential compatibility issues, see Sections 12.1 differences and potential compatibility issues, see Sections 12.1
and 12.2. and 12.2.
o With regard to NFSv4.1 as defined RFC5661 [21], the situation is o With regard to NFSv4.1 as defined by RFC8881 [9], the situation is
quite different. The approach to internationalization specified quite different. The approach to internationalization specified
in that document, based in large part on that in RFC3530 was never in that document, based in large part on that in RFC3530 was never
implemented, and implementers were either unaware of the implemented, and implementers were either unaware of the
troublesome implications of that approach or chose to ignore the troublesome implications of that approach or chose to ignore the
existing specification as essentially unimplementable. An existing specification as essentially unimplementable. An
internationalization approach compatible with that specified in internationalization approach compatible with that specified in
RFC7530 [3] tended to be followed, despite the fact that, in other RFC7530 [3] tended to be followed, despite the fact that, in other
respects, NFSv4.1 was considered to be a separate protocol. respects, NFSv4.1 was considered to be a separate protocol.
If there were NFSv4 servers who obeyed the internationalization If there were NFSv4 servers who obeyed the internationalization
skipping to change at page 6, line 6 skipping to change at page 6, line 6
is known -- but where there remains some uncertainty as to details is known -- but where there remains some uncertainty as to details
-- is described using "should". Such cases primarily concern -- is described using "should". Such cases primarily concern
details of error returns. New implementations should follow details of error returns. New implementations should follow
existing practice even though such situations generally do not existing practice even though such situations generally do not
affect interoperability. affect interoperability.
There are also cases in which certain server behaviors, while not There are also cases in which certain server behaviors, while not
known to exist, cannot be reliably determined not to exist. In part, known to exist, cannot be reliably determined not to exist. In part,
this is a consequence of the long period of time that has elapsed this is a consequence of the long period of time that has elapsed
since the publication of the defining specifications, resulting in a since the publication of the defining specifications, resulting in a
situation in which those involved in t implementation work may no situation in which those involved in the implementation work may no
longer be involved in or aware of working group activities. longer be involved in or be aware of working group activities.
In the case of possible server behavior that is neither known to In the case of possible server behavior that is neither known to
exist nor known not to exist, we use "SHOULD NOT" and "MUST NOT" as exist nor known not to exist, we use "SHOULD NOT" and "MUST NOT" as
follows, and similarly for "SHOULD" and "MUST". follows, and similarly for "SHOULD" and "MUST".
o In some cases, the potential behavior is not known to exist but is o In some cases, the potential behavior is not known to exist but is
of such a nature that, if it were in fact implemented, of such a nature that, if it were in fact implemented,
interoperability difficulties would be expected and reported, interoperability difficulties would be expected and reported,
giving us cause to conclude that the potential behavior is not giving us cause to conclude that the potential behavior is not
implemented. For such behavior, we use "MUST NOT". Similarly, we implemented. For such behavior, we use "MUST NOT". Similarly, we
skipping to change at page 7, line 25 skipping to change at page 7, line 25
potential future minor versions and protocol extensions are potential future minor versions and protocol extensions are
addressed in Section 15. addressed in Section 15.
o Some changes motivated by the shift from IDNA2003 to IDNA2008 have o Some changes motivated by the shift from IDNA2003 to IDNA2008 have
been made. The intention is to maintain compatibility with all been made. The intention is to maintain compatibility with all
existing NFSv4 minor versions. Potential compatibility issues existing NFSv4 minor versions. Potential compatibility issues
with regard to the IDNA shift are discussed in Section 12.2. with regard to the IDNA shift are discussed in Section 12.2.
o There is more detailed discussion of case-insensitive handling of o There is more detailed discussion of case-insensitive handling of
file names, with particular attention to the complexities that can file names, with particular attention to the complexities that can
arise when multiple language convention in these matters need to arise when multiple language conventions in these matters need to
be accommodated. The discussion in Section 10 applies to both be accommodated. The discussion in Section 10 applies to both
client and server, although issues relating to the client's client and server, although issues relating to the client's
knowledge are dealt with in Section 11. knowledge are dealt with in Section 11.
o There is additional material, dealing with the implications of o There is additional material, dealing with the implications of
server-side internationalization-related file name processing for server-side internationalization-related file name processing for
clients that cache the results of READDIR's. This includes a clients that cache the results of READDIR's. This includes a
discussion of options to deal with the current lack of detailed discussion of options to deal with the current lack of detailed
information about the server (in Section 11.2), and options for information about the server (in Section 11.2), and options for
handling when more detailed information is available (in handling when more detailed information is available (in
skipping to change at page 10, line 13 skipping to change at page 10, line 13
discussed in Section 7.2. discussed in Section 7.2.
7.1. The Attribute Fs_charset_cap in Published NFSv4.1 Specifications 7.1. The Attribute Fs_charset_cap in Published NFSv4.1 Specifications
We reproduce Section 14.4 of [9] below, with comments interspersed We reproduce Section 14.4 of [9] below, with comments interspersed
trying to make sense of what is there, in order to arrive at an trying to make sense of what is there, in order to arrive at an
appropriate replacement, to be presented in Section 7.2. In that appropriate replacement, to be presented in Section 7.2. In that
connection, we need to understand better a few issues: connection, we need to understand better a few issues:
o The use of two bits while one is clearly adequate, given the o The use of two bits while one is clearly adequate, given the
subject matter actually mentioned subject matter actually mentioned.
o The mention of possible "capabilities" which could not possibly be o The mention of possible "capabilities" which could not possibly be
realized. realized.
o The use of the RFC2119 keyword "SHOULD" in contexts in which this o The use of the RFC2119 keyword "SHOULD" in contexts in which this
term is clearly inappropriate. term is clearly inappropriate.
Issues related to the confusion caused by mention of "UTF-8 Issues related to the confusion caused by mention of "UTF-8
characters" and the lack of mention of Unicode will be addressed in characters" and the lack of mention of Unicode will be addressed in
the revision in Section 7.2 but will not be further discussed here. the revision in Section 7.2 but will not be further discussed here.
skipping to change at page 10, line 38 skipping to change at page 10, line 38
typedef uint32_t fs_charset_cap4; typedef uint32_t fs_charset_cap4;
While it is made clear that two separate bits are to be provided, While it is made clear that two separate bits are to be provided,
their names seem to indicate that they should be complements of one their names seem to indicate that they should be complements of one
another. As a way of understanding why two bits were specified, it another. As a way of understanding why two bits were specified, it
is helpful to consider a possible boolean attribute as a potential is helpful to consider a possible boolean attribute as a potential
replacement. That attribute would clearly govern whether names that replacement. That attribute would clearly govern whether names that
do not conform to the rules of UTF-8 are to be rejected, which was a do not conform to the rules of UTF-8 are to be rejected, which was a
"MUST" in RFC3530 [20]. Although conveying this information is "MUST" in RFC3530 [20]. Although conveying this information is
clearly part of the motivation, stating so clearly might have been clearly part of the motivation, stating so clearly might have been
judged by the authors as too provocative, given the role of IESG in judged by the authors as unnecessarily provocative, given the role of
arriving at the internationalization approach specified in RFC3530. IESG in arriving at the internationalization approach specified in
RFC3530.
Because some operating environments and file systems do not Because some operating environments and file systems do not
enforce character set encodings, enforce character set encodings,
It is clear that the ability of operating environments to enforce use It is clear that the ability of operating environments to enforce use
of UTF-8 encoding is not an issue, since RFC3530 made this the of UTF-8 encoding is not an issue, since RFC3530 made this the
responsibility of the server implementation. That mandate was never responsibility of the server implementation. That mandate was never
followed because implementers chose not to follow it, and not because followed because implementers chose not to follow it, and not because
they were unable to do so. The apparently confused statement above they were unable to do so. The apparently confused statement above
is best understood if one notes that its essential job is to state is best understood if one notes that its essential job is to state
that the "MUST" in RFC3530 referred to above is not reasonable. that the "MUST" in RFC3530 referred to above is not reasonable.
However, the authors might well feel unable to say so clearly, in However, the authors might well have felt unable to say so clearly,
light of the potential IESG reaction. in light of the potential IESG reaction.
NFSv4.1 supports the fs_charset_cap attribute (Section 5.8.2.11) NFSv4.1 supports the fs_charset_cap attribute (Section 5.8.2.11)
that indicates to the client a file system's UTF-8 capabilities. that indicates to the client a file system's UTF-8 capabilities.
The problem with the mention of (plural) capabilities is that the The problem with the mention of (plural) capabilities is that the
only capability mentioned which servers could implement is to accept only capability mentioned which servers could implement is to accept
strings which are not valid UTF-8. There are other potential strings which are not valid UTF-8. There are other potential
capabilities having to do with the implementation of canonical capabilities having to do with the implementation of canonical
equivalence, but since they were not mentioned, they will not be equivalence, but since they were not mentioned, they will not be
discussed further here. discussed further here.
skipping to change at page 13, line 8 skipping to change at page 13, line 8
into account the issues noted in Section 7.1. Given there was a into account the issues noted in Section 7.1. Given there was a
working group consensus to adopt the confusing language discussed working group consensus to adopt the confusing language discussed
there, we must now adopt, by consensus, a clearer replacement that there, we must now adopt, by consensus, a clearer replacement that
reflects the working group's intentions. Given the passage of time reflects the working group's intentions. Given the passage of time
and the changed context, it might not be possible to determine those and the changed context, it might not be possible to determine those
intentions. In any case, we will have to be aware of how this intentions. In any case, we will have to be aware of how this
attribute was implemented and used, particularly with regard to the attribute was implemented and used, particularly with regard to the
first flag, whose meaning remains obscure. first flag, whose meaning remains obscure.
The following treatment is proposed as a basis for discussion, with The following treatment is proposed as a basis for discussion, with
the understanding that it would need to be changed, if it raises the understanding that it would need to be changed, if it could raise
interoperability issues. interoperability issues.
const FSCHARSET_CAP4_CONTAINS_NON_UTF8 = 0x1; const FSCHARSET_CAP4_CONTAINS_NON_UTF8 = 0x1;
const FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 = 0x2; const FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 = 0x2;
typedef uint32_t fs_charset_cap4; typedef uint32_t fs_charset_cap4;
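As an illustration only, the following sketch shows how a client might consult the proposed flag values before sending a file name. It assumes that a set FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 bit means the file system rejects names that are not valid UTF-8; the helper names is_valid_utf8 and may_send_name are hypothetical and do not appear in either draft. The first flag is deliberately left uninterpreted, since its meaning is described as obscure.

```python
# Flag values copied from the proposed XDR definition.
FSCHARSET_CAP4_CONTAINS_NON_UTF8 = 0x1
FSCHARSET_CAP4_ALLOWS_ONLY_UTF8 = 0x2


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def may_send_name(cap: int, name: bytes) -> bool:
    """Hypothetical client-side check (an assumption, not draft text).

    If the file system reports itself as UTF-8-only, only valid UTF-8
    names are expected to be accepted. Otherwise no assumption is made
    and the name is passed through unchanged.
    """
    if cap & FSCHARSET_CAP4_ALLOWS_ONLY_UTF8:
        return is_valid_utf8(name)
    return True
```

A client using such a check would still need to handle an error return from the server, since the check only predicts server behavior under the stated assumption.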
This attribute provides a simple way of determining whether a This attribute provides a simple way of determining whether a
particular file system behaves as a UTF-8-only server and rejects particular file system behaves as a UTF-8-only server and rejects
file names which are not valid UTF-8 strings. When this attribute file names which are not valid UTF-8 strings. When this attribute
skipping to change at page 16, line 35 skipping to change at page 16, line 35
to emulate the server's name handling, would need information about to emulate the server's name handling, would need information about
how certain cases are to be dealt with. In cases in which that how certain cases are to be dealt with. In cases in which that
information is unavailable, the client needs to avoid making information is unavailable, the client needs to avoid making
assumptions about the server's handling, since it will be unaware of assumptions about the server's handling, since it will be unaware of
the Unicode version implemented by the server, or many of the details the Unicode version implemented by the server, or many of the details
of specific issues that might need to be addressed differently by of specific issues that might need to be addressed differently by
different server file systems in implementing case-insensitive name different server file systems in implementing case-insensitive name
handling. handling.
Many of the problematic issues with regard to the case-insensitive Many of the problematic issues with regard to the case-insensitive
handling of name are discussed in Section 5.18 of the Unicode handling of names are discussed in Section 5.18 of the Unicode
Standard [12] which deals with case mapping. While we need to Standard [12] which deals with case mapping. While we need to
address all of these issues as well, our approach will not be exactly address all of these issues as well, our approach will not be exactly
the same. the same.
o Since the client will be doing case-insensitive comparisons, o Since the client will be doing case-insensitive comparisons,
issues that apply only to uppercasing or lowercasing do not have issues that apply only to uppercasing or lowercasing do not have
the same significance. the same significance.
o Many clients will have to operate correctly even in the absence of o Many clients will have to operate correctly even in the absence of
detailed information about the specifics of server case-mapping or detailed information about the specifics of server case-mapping or
the version on Unicode implemented by the server. the version of Unicode implemented by the server.
o Clients will have to accommodate server behaviors not anticipated o Clients will have to accommodate server behaviors not anticipated
by the Unicode Specification since the neither the server nor the by the Unicode Specification since it might be that neither the
client might have any locale knowledge when file names are server nor the client would have any relevant locale knowledge
processed. when file names are processed.
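The points in the list above can be illustrated with a minimal sketch of a locale-independent, case-insensitive comparison on the client. It uses Python's built-in str.casefold, which applies Unicode case folding from whatever Unicode version the local interpreter ships with; that version may differ from the server's, which is exactly the uncertainty described. The function name names_match_caseless is hypothetical, and the sketch does not attempt normalization or any locale-specific rules.

```python
import unicodedata

# Unicode version known to this client; the server's version is
# generally unknown and may differ.
CLIENT_UNICODE_VERSION = unicodedata.unidata_version


def names_match_caseless(a: str, b: str) -> bool:
    """Compare two names using locale-independent case folding.

    This is only an approximation of what a case-insensitive server
    file system might do; it is not taken from the draft.
    """
    return a.casefold() == b.casefold()
```

Because the comparison folds both names rather than uppercasing or lowercasing one of them, "Straße" and "STRASSE" compare equal here even though their lowercased forms differ. A client relying on such a comparison would need to treat a match as a hint rather than a guarantee when server details are unavailable.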
Another source of information about case-folding, and indirectly Another source of information about case-folding, and indirectly
about case-insensitive comparisons, is the case-folding text file about case-insensitive comparisons, is the case-folding text file
which is part of the Unicode Standard [13]. This file contains, for which is part of the Unicode Standard [13]. This file contains, for
each Unicode character that can be uppercased or lowercased, a single each Unicode character that can be uppercased or lowercased, a single
character, or, in some cases a string of characters of the other character, or, in some cases a string of characters of the other
case. For characters in capital case, the lowercase counterpart is case. For characters in capital case, the lowercase counterpart is
given. Each of the mappings is characterized as one of four given. Each of the mappings is characterized as one of four
types: types:
skipping to change at page 34, line 18 skipping to change at page 34, line 18
particular item, but the full implications must be understood and particular item, but the full implications must be understood and
carefully weighed before choosing a different course". To fully carefully weighed before choosing a different course". To fully
understand a particular "SHOULD", there needs to be enough context understand a particular "SHOULD", there needs to be enough context
to determine whether particular reasons for ignoring the item are to determine whether particular reasons for ignoring the item are
in fact valid, and sufficient guidance to understand the in fact valid, and sufficient guidance to understand the
implication of ignoring the item. In the absence of such implication of ignoring the item. In the absence of such
information, the relevant fact is that the peer needs to deal with information, the relevant fact is that the peer needs to deal with
the item being ignored, making the implications of a "SHOULD" hard the item being ignored, making the implications of a "SHOULD" hard
to distinguish from those of "MAY". to distinguish from those of "MAY".
o While the document states. "the general rules for handling all of o While the document states, "the general rules for handling all of
these domain-related strings are similar and independent of the these domain-related strings are similar and independent of the
role of the sender or receiver as client or server", all of the role of the sender or receiver as client or server", all of the
following text is explicitly about the server's options, choices following text is explicitly about the server's options, choices
and responsibilities, leaving the client case unclear. and responsibilities, leaving the client case unclear.
o In a number of places within the paragraph describing server o In a number of places within the paragraph describing server
approach #1, the word "can" is used as in the text "the server can approach #1, the word "can" is used as in the text "the server can
use the ToUnicode function", leaving it unclear whether the server use the ToUnicode function", leaving it unclear whether the server
can choose to do anything else and if so what. can choose to do anything else and if so what.
skipping to change at page 35, line 31 skipping to change at page 35, line 31
principally because the physical file systems used assume that principally because the physical file systems used assume that
user and group identifiers fit in 32 bits each and the vnode user and group identifiers fit in 32 bits each and the vnode
interfaces used by server implementations make similar interfaces used by server implementations make similar
assumptions. assumptions.
Given these restrictions, the typical implementation pattern is Given these restrictions, the typical implementation pattern is
for servers to accept only a single domain, specified as part of for servers to accept only a single domain, specified as part of
the server configuration, together with information necessary to the server configuration, together with information necessary to
effect the appropriate name-to-id mappings. effect the appropriate name-to-id mappings.
o The other uses of domain names in NFSv4, to represent hostnames in o The other uses of domain names in NFSv4, to represent host names
location attributes, the values are generated by the server and in location attributes, the values are generated by the server and
will normally include only hostnames within DNS-registered will normally include only host names within DNS-
domains. registered domains.
Keeping the above in mind, we can see that interoperability issues, Keeping the above in mind, we can see that interoperability issues,
while they might exist, are unlikely to raise major challenges, as while they might exist, are unlikely to raise major challenges, as
looking at the following specific cases shows: looking at the following specific cases shows:
o When an internationalized domain name is used as part of a user or o When an internationalized domain name is used as part of a user or
group, it would need to be configured as such, with the domain group, it would need to be configured as such, with the domain
string known to both client and server. string known to both client and server.
While it is theoretically possible that a client might work with While it is theoretically possible that a client might work with
skipping to change at page 39, line 43 skipping to change at page 39, line 43
16. IANA Considerations 16. IANA Considerations
The current document does not require any actions by IANA. The current document does not require any actions by IANA.
17. Security Considerations 17. Security Considerations
Unicode in the form of UTF-8 is generally used for file component Unicode in the form of UTF-8 is generally used for file component
names (i.e., both directory and file components). However, other names (i.e., both directory and file components). However, other
character sets may also be allowed for these names. For the owner character sets may also be allowed for these names. For the owner
and owner_group attributes and other sorts of strings whose form is and owner_group attributes and other sorts of strings whose form is
affected by standard outside NFSv4 (see Section 12.) are always affected by standards outside NFSv4 (see Section 12.) are always
encoded as UTF-8. String processing (e.g., Unicode normalization) encoded as UTF-8. String processing (e.g., Unicode normalization)
raises security concerns for string comparison. See Sections 12 and raises security concerns for string comparison. See Sections 12 and
9 as well as the respective Sections 5.9 of RFC7530 [3] and RFC5661 9 as well as the respective Sections 5.9 of RFC7530 [3] and RFC5661
[21] for further discussion. See [23] for related identifier [21] for further discussion. See [23] for related identifier
comparison security considerations. File component names are comparison security considerations. File component names are
identifiers with respect to the identifier comparison discussion in identifiers with respect to the identifier comparison discussion in
[23] because they are used to identify the objects to which ACLs are [23] because they are used to identify the objects to which ACLs are
applied (See the respective Sections 6 of RFC7530 [3] and RFC5661 applied (See the respective Sections 6 of RFC7530 [3] and RFC5661
[21]). [21]).
skipping to change at page 42, line 24 skipping to change at page 42, line 24
[22] Hoffman, P. and J. Klensin, "Terminology Used in [22] Hoffman, P. and J. Klensin, "Terminology Used in
Internationalization in the IETF", BCP 166, RFC 6365, Internationalization in the IETF", BCP 166, RFC 6365,
DOI 10.17487/RFC6365, September 2011, DOI 10.17487/RFC6365, September 2011,
<https://www.rfc-editor.org/info/rfc6365>. <https://www.rfc-editor.org/info/rfc6365>.
[23] Thaler, D., Ed., "Issues in Identifier Comparison for [23] Thaler, D., Ed., "Issues in Identifier Comparison for
Security Purposes", RFC 6943, DOI 10.17487/RFC6943, May Security Purposes", RFC 6943, DOI 10.17487/RFC6943, May
2013, <https://www.rfc-editor.org/info/rfc6943>. 2013, <https://www.rfc-editor.org/info/rfc6943>.
[24] Shepler, S., "NFS version 4 Protocol", draft-ietf- [24] Beame, C., Thurlow, R., Callaghan, B., Robinson, D.,
nfsv4-rfc3010bis-04 (work in progress), October 2002. Noveck, D., Eisler, M., and S. Shepler, "Network File
System (NFS) version 4 Protocol", draft-ietf-
nfsv4-rfc3010bis-05 (work in progress), November 2002.
[25] Williams, N., "Internationalization Considerations for [25] Williams, N., "Internationalization Considerations for
Filesystems and Filesystem Protocols", draft-williams- Filesystems and Filesystem Protocols", draft-williams-
filesystem-18n-00 (work in progress), July 2020. filesystem-18n-00 (work in progress), July 2020.
Appendix A. History Appendix A. History
This section describes the history of internationalization within This section describes the history of internationalization within
NFSv4. Despite the fact that NFSv4.0 and subsequent minor versions NFSv4. Despite the fact that NFSv4.0 and subsequent minor versions
have differed in many ways, the actual implementations of have differed in many ways, the actual implementations of
 End of changes. 18 change blocks. 
27 lines changed or deleted 30 lines changed or added
