--- 1/draft-ietf-mmusic-sdp-new-07.txt 2006-02-05 00:28:38.000000000 +0100 +++ 2/draft-ietf-mmusic-sdp-new-08.txt 2006-02-05 00:28:38.000000000 +0100 @@ -1,16 +1,16 @@ Internet Engineering Task Force MMUSIC WG INTERNET-DRAFT Mark Handley/ACIRI -draft-ietf-mmusic-sdp-new-07.txt Van Jacobson/Packet Design +draft-ietf-mmusic-sdp-new-08.txt Van Jacobson/Packet Design Colin Perkins/ISI - 26 March 2002 - Expires: September 2002 + 19 April 2002 + Expires: October 2002 SDP: Session Description Protocol Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups @@ -39,27 +39,27 @@ purposes of session announcement, session invitation, and other forms of multimedia session initiation. 1. Introduction On the Internet multicast backbone (Mbone), a session directory tool is used to advertise multimedia conferences and communicate the conference addresses and conference tool-specific information necessary for participation. This document defines a session description protocol for this purpose, and for general real-time multimedia session description -purposes. This draft does not describe multicast address allocation or +purposes. This memo does not describe multicast address allocation or the distribution of SDP messages. These are described in accompanying drafts. SDP is not intended for negotiation of media encodings. 2. Background -The Mbone is the part of the internet that supports IP multicast, and +The Mbone is the part of the Internet that supports IP multicast, and thus permits efficient many-to-many communication. It is used extensively for multimedia conferencing. 
Such conferences usually have the property that tight coordination of conference membership is not necessary; to receive a conference, a user at an Mbone site only has to know the conference's multicast group address and the UDP ports for the conference data streams. Session directories assist the advertisement of conference sessions and communicate the relevant conference setup information to prospective participants. SDP is designed to convey such information to recipients. @@ -103,54 +103,54 @@ discover and participate in a multimedia session. 3.1. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [13]. 4. Examples of SDP Usage -4.1. Session Initiation +4.1. Multicast Announcement + +In order to assist the advertisement of multicast multimedia conferences +and other multicast sessions, and to communicate the relevant session +setup information to prospective participants, a distributed session +directory may be used. An instance of such a session directory +periodically sends packets containing a description of the session to a +well known multicast group. These advertisements are received by other +session directories such that potential remote participants can use the +session description to start the tools required to participate in the +session. + +One protocol commonly used to implement such a distributed directory is +the Session Announcement Protocol, SAP [4]. SDP provides the recommended +session description format for such announcements. + +4.2. Session Initiation The Session Initiation Protocol, SIP [11] is an application-layer control protocol for creating, modifying and terminating sessions such as Internet multimedia conferences, Internet telephone calls and multimedia distribution. 
The SIP messages used to create sessions carry session descriptions which allow participants to agree on a set of compatible media types. These session descriptions are commonly formatted using SDP. -4.2. Streaming media +4.3. Streaming media The Real Time Streaming Protocol, RTSP [12], is an application-level protocol for control over the delivery of data with real-time properties. RTSP provides an extensible framework to enable controlled, on-demand delivery of real-time data, such as audio and video. It is necessary for RTSP to convey a description of the session to be -controlled: SDP is often used for this purpose. - -4.3. Multicast Announcement - -In order to assist the advertisement of multicast multimedia conferences -and other multicast sessions, and to communicate the relevant session -setup information to prospective participants, a distributed session -directory may be used. An instance of such a session directory -periodically sends packets containing a description of the session to a -well known multicast group. These advertisements are received by other -session directories such that potential remote participants can use the -session description to start the tools required to participate in the -session. - -One protocol commonly used to implement such a distributed directory is -the Session Announcement Protocol, SAP [4]. SDP provides the recommended -session description format for such announcements. +controlled. SDP is often used for this purpose. 4.4. Email and the World Wide Web Alternative means of conveying session descriptions include electronic mail and the World Wide Web. For both email and WWW distribution, the use of the MIME content type ``application/sdp'' MUST be used. This enables the automatic launching of applications for participation in the session from the WWW client or mail reader in a standard manner. 
Note that announcements of multicast sessions made only via email or the @@ -285,56 +286,52 @@ The SDP specification recommends the use of the ISO 10646 character sets in the UTF-8 encoding (RFC 2044) to allow many different languages to be represented. However, to assist in compact representations, SDP also allows other character sets such as ISO 8859-1 to be used when desired. Internationalization only applies to free-text fields (session name and background information), and not to SDP as a whole. 6. SDP Specification SDP session descriptions are entirely textual using the ISO 10646 -character set in UTF-8 encoding. SDP field names and attributes names +character set in UTF-8 encoding. SDP field names and attribute names use only the US-ASCII subset of UTF-8, but textual fields and attribute -values may use the full ISO 10646 character set. The textual form, as +values MAY use the full ISO 10646 character set. The textual form, as opposed to a binary encoding such as ASN/1 or XDR, was chosen to enhance portability, to enable a variety of transports to be used (e.g, session description in a MIME email message) and to allow flexible, text-based toolkits (e.g., Tcl/Tk ) to be used to generate and to process session descriptions. However, since SDP may be used in environments where the maximum permissable size of a session description is limited (e.g. SAP announcements; SIP transported in UDP), the encoding is deliberately compact. Also, since announcements may be transported via very -unreliable means (e.g., email) or damaged by an intermediate caching -server, the encoding was designed with strict order and formatting rules -so that most errors would result in malformed announcements which could -be detected easily and discarded. 
This also allows rapid discarding of +unreliable means or damaged by an intermediate caching server, the +encoding was designed with strict order and formatting rules so that +most errors would result in malformed announcements which could be +detected easily and discarded. This also allows rapid discarding of + encrypted announcements for which a receiver does not have the correct key. An SDP session description consists of a number of lines of text of the form <type>=<value> <type> is always exactly one character and is case-significant. <value> is a structured text string whose format depends on <type>. It also will be case-significant unless a specific field defines otherwise. Whitespace MUST NOT be used either side of the `=' sign. In general <value> is either a number of fields delimited by a single space character or a free format string. -A session description consists of a session-level description (details -that apply to the whole session and all media streams) and optionally -several media-level descriptions (details that apply only to a single -media stream). - -An announcement consists of a session-level section followed by zero or -more media-level sections. The session-level part starts with a `v=' -line and continues to the first media-level section. The media +A session description consists of a session-level section followed by +zero or more media-level sections. The session-level part starts with a +`v=' line and continues to the first media-level section. The media description starts with an `m=' line and continues to the next media description or end of the whole session description. In general, session-level values are the default for all media unless overridden by an equivalent media-level value. Some lines in each description are REQUIRED and some are OPTIONAL but all MUST appear in exactly the order given here (the fixed order greatly enhances error detection and allows for a simple parser). OPTIONAL items are marked with a `*'. 
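Editorial illustration, not part of either draft: the line format and section layout described in the hunk above (one-character type, `=' with no surrounding whitespace, a session-level section opened by `v=', then zero or more media-level sections each opened by `m=') can be sketched in Python roughly as follows. The function name parse_sdp and the SAMPLE text are assumptions made for this sketch. SAMPLE is abbreviated from the example shown in the diff and omits lines such as o= and s=, so it is not a complete session description. The sketch checks neither field order nor which fields are REQUIRED.

```python
def parse_sdp(text):
    """Split an SDP description into (session, media).

    session: list of (type, value) pairs up to the first m= line.
    media:   list of sections, each a list of (type, value) pairs
             whose first pair is the m= line that opens it.
    """
    session = []
    media = []
    current = session
    # CRLF ends a record; a bare newline is tolerated as well.
    for raw in text.replace("\r\n", "\n").split("\n"):
        if raw == "":
            continue
        # <type> is exactly one character, followed directly by '=',
        # with no whitespace on either side of the '=' sign.
        if (len(raw) < 2 or raw[1] != "="
                or raw[0].isspace() or raw[2:3].isspace()):
            raise ValueError("malformed SDP line: %r" % raw)
        kind, value = raw[0], raw[2:]
        # The session-level section must start with a v= line.
        if current is session and not session and kind != "v":
            raise ValueError("description must start with v=")
        # Each m= line opens a new media-level section.
        if kind == "m":
            current = []
            media.append(current)
        current.append((kind, value))
    return session, media


# Abbreviated sample (assumption: trimmed for illustration).
SAMPLE = (
    "v=0\r\n"
    "c=IN IP4 224.2.17.12/127\r\n"
    "t=2873397496 2873404696\r\n"
    "a=recvonly\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
    "m=video 51372 RTP/AVP 31\r\n"
    "m=application 32416 udp wb\r\n"
    "a=orient:portrait\r\n"
)

session, media = parse_sdp(SAMPLE)
```

Here the a=recvonly line lands in the session-level list, so it acts as the default for all three media sections, while a=orient:portrait belongs only to the m=application section it follows.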
@@ -390,27 +387,27 @@ u=http://www.cs.ucl.ac.uk/staff/M.Handley/sdp.03.ps e=mjh@isi.edu (Mark Handley) c=IN IP4 224.2.17.12/127 t=2873397496 2873404696 a=recvonly m=audio 49170 RTP/AVP 0 m=video 51372 RTP/AVP 31 m=application 32416 udp wb a=orient:portrait -Text records such as the session name and information are bytes strings -which may contain any byte with the exceptions of 0x00 (Nul), 0x0a +Text records such as the session name and information are octet strings +which may contain any octet with the exceptions of 0x00 (Nul), 0x0a (ASCII newline) and 0x0d (ASCII carriage return). The sequence CRLF -(0x0d0a) is used to end a record, although parsers should be tolerant +(0x0d0a) is used to end a record, although parsers SHOULD be tolerant and also accept records terminated with a single newline character. By default these byte strings contain ISO-10646 characters in UTF-8 -encoding, but this default may be changed using the `charset' attribute. +encoding, but this default MAY be changed using the `charset' attribute. Protocol Version v=0 The ``v='' field gives the version of the Session Description Protocol. There is no minor version number. Origin @@ -481,74 +478,67 @@ media streams. As such, they are most likely to be useful when a single session has more than one distinct media stream of the same media type. An example would be two different whiteboards, one for slides and one for feedback and questions. URI u= -o A URI is a Universal Resource Identifier as used by WWW clients - -o The URI should be a pointer to additional information about the - conference - -o This field is OPTIONAL, but if it is present it MUST be specified - before the first media field - -o No more than one URI field is allowed per session description +A URI is a Universal Resource Identifier as used by WWW clients. The URI +should be a pointer to additional information about the conference. This +field is OPTIONAL, but if it is present it MUST be specified before the +first media field. 
No more than one URI field is allowed per session +description Email Address and Phone Number e= p= -o These specify contact information for the person responsible for the - conference. This is not necessarily the same person that created - the conference announcement. - -o Inclusion of an email address or phone number is OPTIONAL. Note - that the previous version of SDP specified that either an email - field or a phone field MUST be specified, but this was widely - ignored. The change brings the specification into line with common - usage. +These specify contact information for the person responsible for the +conference. This is not necessarily the same person that created the +conference announcement. -o If these are present, they should be specified before the first - media field. +Inclusion of an email address or phone number is OPTIONAL. Note that the +previous version of SDP specified that either an email field or a phone +field MUST be specified, but this was widely ignored. The change brings +the specification into line with common usage. -o More than one email or phone field can be given for a session +If these are present, they MUST be specified before the first media +field. More than one email or phone field can be given for a session description. -o Phone numbers should be given in the conventional international - format - preceded by a ``+'' and the international country code. - There must be a space or a hyphen (``-'') between the country code - and the rest of the phone number. Spaces and hyphens may be used to - split up a phone field to aid readability if desired. For example: +Phone numbers should be given in the conventional international format - +preceded by a ``+'' and the international country code. There must be a +space or a hyphen (``-'') between the country code and the rest of the +phone number. Spaces and hyphens may be used to split up a phone field +to aid readability if desired. 
For example: p=+44-171-380-7777 or p=+1 617 253 6011 -o Both email addresses and phone numbers can have an optional free - text string associated with them, normally giving the name of the - person who may be contacted. This should be enclosed in parenthesis - if it is present. For example: +Both email addresses and phone numbers can have an optional free text +string associated with them, normally giving the name of the person who +may be contacted. This should be enclosed in parenthesis if it is +present. For example: e=mjh@isi.edu (Mark Handley) - The alternative RFC822 name quoting convention is also allowed for - both email addresses and phone numbers. For example, +The alternative RFC822 name quoting convention is also allowed for both +email addresses and phone numbers. For example, e=Mark Handley - The free text string should be in the ISO-10646 character set with - UTF-8 encoding, or alternatively in ISO-8859-1 or other encodings if - the appropriate charset session-level attribute is set. +The free text string SHOULD be in the ISO-10646 character set with UTF-8 +encoding, or alternatively in ISO-8859-1 or other encodings if the +appropriate charset session-level attribute is set. Connection Data c=
The ``c='' field contains connection data. A session announcement MUST contain either one ``c='' field in each media description (see below) or a ``c='' field at the session-level. It MAY contain a session-level ``c='' field and one additional ``c='' @@ -748,27 +738,28 @@ To make description more compact, times may also be given in units of days, hours or minutes. The syntax for these is a number immediately followed by a single case-sensitive character. Fractional units are not allowed - a smaller unit should be used instead. The following unit specification characters are allowed: d - days (86400 seconds) h - hours (3600 seconds) m - minutes (60 seconds) s - seconds (allowed for completeness but not recommended) + Thus, the above announcement could also have been written: r=7d 1h 0 25h - Monthly and yearly repeats cannot currently be directly specified - with a single SDP repeat time - instead separate "t" fields should - be used to explicitly list the session times. + Monthly and yearly repeats cannot be directly specified with a + single SDP repeat time - instead separate "t" fields should be used + to explicitly list the session times. z= .... o To schedule a repeated session which spans a change from daylight- saving time to standard time or vice-versa, it is necessary to specify offsets from the base repeat times. This is required because different time zones change time at different times of day, different countries change to or from daylight time on different dates, and some countries do not have daylight saving time at all. @@ -869,21 +859,21 @@ ``a=:''. An example might be that a whiteboard could have the value attribute ``a=orient:landscape'' Attribute interpretation depends on the media tool being invoked. Thus receivers of session descriptions should be configurable in their interpretation of announcements in general and of attributes in particular. Attribute names MUST be in the US-ASCII subset of ISO-10646/UTF-8. 
-Attribute values are byte strings, and MAY use any byte value except +Attribute values are octet strings, and MAY use any octet value except 0x00 (Nul), 0x0A (LF), and 0x0D (CR). By default, attribute values are to be interpreted as in ISO-10646 character set with UTF-8 encoding. Unlike other text fields, attribute values are NOT normally affected by the `charset' attribute as this would make comparisons against known values problematic. However, when an attribute is defined, it can be defined to be charset-dependent, in which case it's value should be interpreted in the session charset rather than in ISO-10646. Attributes that will be commonly used can be registered with IANA (see Appendix B). Unregistered attributes should begin with "X-" to prevent @@ -1039,21 +1029,21 @@ attributes. Up to one rtpmap attribute can be defined for each media format specified. Thus we might have: m=audio 49230 RTP/AVP 96 97 98 a=rtpmap:96 L8/8000 a=rtpmap:97 L16/8000 a=rtpmap:98 L16/11025/2 - RTP profiles that specify the use of dynamic payload types must + RTP profiles that specify the use of dynamic payload types MUST define the set of valid encoding names and/or a means to register encoding names if that profile is to be used with SDP. Experimental encoding formats can also be specified using rtpmap. RTP formats that are not registered as standard format names must be preceded by ``X-''. Thus a new experimental redundant audio stream called GSMLPC using dynamic payload type 99 could be specified as: m=audio 49232 RTP/AVP 99 a=rtpmap:99 X-GSMLPC/8000 @@ -1098,35 +1088,36 @@ charset specified for the session description if one is specified, or by default in ISO 10646/UTF-8. a=tool: This gives the name and version number of the tool used to create the session description. It is a session-level attribute, and is not dependent on charset. a=ptime: This gives the length of time in milliseconds represented by the - media in a packet. This is probably only meaningful for audio data. 
- It should not be necessary to know ptime to decode RTP or vat audio, - and it is intended as a recommendation for the - encoding/packetisation of audio. It is a media attribute, and is - not dependent on charset. + media in a packet. This is probably only meaningful for audio data, + but may be used with other media types if it makes sense. It should + not be necessary to know ptime to decode RTP or vat audio, and it is + intended as a recommendation for the encoding/packetisation of + audio. It is a media attribute, and is not dependent on charset. a=maxptime: The maximum amount of media which can be encapsulated in each - packet, expressed as time in milliseconds. The time shall be + packet, expressed as time in milliseconds. The time SHALL be calculated as the sum of the time the media present in the packet - represents. The time SHOULD be a multiple of the frame size. This is - probably only meaningful for audio data. It is a media attribute, - and is not dependent on charset. Note that this attribute was - introduced after RFC 2327, and non updated implementations will - ignore this attribute. + represents. The time SHOULD be a multiple of the frame size. This + attribute is probably only meaningful for audio data, but may be + used with other media types if it makes sense. It is a media + attribute, and is not dependent on charset. Note that this + attribute was introduced after RFC 2327, and non updated + implementations will ignore this attribute. a=rtpmap: /[/] See the section on Media Announcements (the ``m='' field). This may be either a session or media attribute. a=recvonly This specifies that the tools should be started in receive-only mode where applicable. It can be either a session or media attribute, and is not dependent on charset. 
Note that recvonly applies to the media @@ -1192,21 +1183,21 @@ specified with the following SDP attribute: a=charset:ISO-8859-1 This is a session-level attribute; if this attribute is present, it must be before the first media field. The charset specified MUST be one of those registered with IANA, such as ISO-8859-1. The character set identifier is a US-ASCII string and MUST be compared against the IANA identifiers using a case-insensitive comparison. If the identifier is not recognised or not supported, all strings - that are affected by it SHOULD be regarded as byte strings. + that are affected by it SHOULD be regarded as octet strings. Note that a character set specified MUST still prohibit the use of bytes 0x00 (Nul), 0x0A (LF) and 0x0d (CR). Character sets requiring the use of these characters MUST define a quoting mechanism that prevents these bytes appearing within text fields. a=sdplang: This can be a session level attribute or a media level attribute. As a session level attribute, it specifies the language for the session description. As a media level attribute, it specifies the @@ -1337,49 +1328,49 @@ One transport that will frequently be used to distribute session descriptions is the Session Announcement Protocol (SAP). SAP provides both encryption and authentication mechanisms but due to the nature of session announcements it is likely that there are many occasions where the originator of a session announcement cannot be authenticated because they are previously unknown to the receiver of the announcement and because no common public key infrastructure is available. On receiving a session description over an unauthenticated transport mechanism or from an untrusted party, software parsing the session -should take a few precautions. Session description contain information +should take a few precautions. Session descriptions contain information required to start software on the receivers system. 
Software that -parses a session description MUST not be able to start other software +parses a session description MUST NOT be able to start other software except that which is specifically configured as appropriate software to participate in multimedia sessions. It is normally considered -INAPPROPRIATE for software parsing a session description to start, on a +inappropriate for software parsing a session description to start, on a user's system, software that is appropriate to participate in multimedia sessions, without the user first being informed that such software will be started and giving their consent. Thus a session description -arriving by session announcement, email, sessioR multimedia,session page -SHOULD NOT deliver the user into an interactive +arriving by session announcement, email, session invitation, or WWW page +SHOULD NOT deliver the user into an interactive multimedia session without the user being aware that this will happen. As it is not always simple to tell whether a session is interactive or not, applications that are unsure should assume sessions are interactive. In this specification, there are no attributes which would allow the recipient of a session description to be informed to start multimedia tools in a mode where they default to transmitting. Under some circumstances it might be appropriate to define such attributes. If this is done an application parsing a session description containing such attributes SHOULD either ignore them, or inform the user that joining this session will result in the automatic transmission of multimedia data. The default behaviour for an unknown attribute is to ignore it. Session descriptions may be parsed at intermediate systems such as firewalls for the purposes of opening a hole in the firewall to allow the participation in multimedia sessions. 
It is considered -INAPPROPRIATE for a firewall to open such holes for unicast data streams +inappropriate for a firewall to open such holes for unicast data streams unless the session description comes in a request from inside the firewall. For multicast sessions, it is likely that local administrators will apply their own policies, but the exclusive use of "local" or "site-local" administrative scope within the firewall and the refusal of the firewall to open a hole for such scopes will provide separation of global multicast sessions from local ones. Appendix A: SDP Grammar @@ -1620,21 +1611,21 @@ ; generic sub-rules: primitives alpha-numeric = ALPHA / DIGIT POS-DIGIT = %x31-39 ; 1 - 9 ; external references: ; ALPHA, DIGIT, CRLF, SP, VCHAR: from RFC 2234 ; URI-reference: from RFC1630 and RFC2732 ; addr-spec: from RFC 2822 -Appendix B: Guidelines for registering SDP names with IANA +Appendix B: IANA Considerations There are seven field names that may be registered with IANA. Using the terminology in the SDP specification BNF, they are "media", "proto", "fmt", "att-field", "bwtype", "nettype" and "addrtype". "media" (eg, audio, video, application, data). The set of media is intended to be small and not to be extended except under rare circumstances. The same rules should apply for media names as for top-level MIME content types, and where possible @@ -1685,22 +1676,22 @@ For non-RTP formats, any unregistered format name may be registered. If there is a suitable mapping from a MIME subtype to the format, then the MIME subtype name should be registered. If there is no suitable mapping from a MIME subtype, a new name should be registered. In either case, unless there are strong reasons not to do so, a standards-track RFC SHOULD be produced describing the format and this RFC SHOULD be referenced in the registration. 
"att-field" (Attribute names) - Attribute field names MAY be registered with IANA, although this is - not compulsory, and unknown attributes are simply ignored. + Attribute field names SHOULD be registered with IANA, although this + is not compulsory, and unknown attributes are simply ignored. When an attribute is registered, it must be accompanied by a brief specification stating the following: o contact name, email address and telephone number o attribute-name (as it will appear in SDP) o long-form attribute name in English @@ -1734,24 +1725,24 @@ New bandwidth specifiers may be registered with IANA. The submission MUST reference a standards-track RFC specifying the semantics of the bandwidth specifier precisely, and indicating when it should be used, and why the existing registered bandwidth specifiers do not suffice. "nettype" (Network Type) New network types may be registered with IANA if SDP needs to be - used in the context of non-internet environments. Whilst these are + used in the context of non-Internet environments. Whilst these are not normally the preserve of IANA, there may be circumstances when - an Internet application needs to interoperate with a non-internet - application, such as when gatewaying an internet telephony call + an Internet application needs to interoperate with a non-Internet + application, such as when gatewaying an Internet telephony call into the PSTN. The number of network types should be small and should be rarely extended. A new network type cannot be registered without registering at least one address type to be used with that network type. A new network type registration MUST reference an RFC which gives details of the network type and address type and specifies how and when they would be used. Such an RFC MAY be Informational. "addrtype" (Address Type)