draft-ietf-mmusic-fid-02.txt | draft-ietf-mmusic-fid-03.txt | |||
---|---|---|---|---|
Internet Engineering Task Force Gonzalo Camarillo | Internet Engineering Task Force Gonzalo Camarillo | |||
Internet draft Jan Holler | Internet draft Jan Holler | |||
Goran AP Eriksson | Goran AP Eriksson | |||
Ericsson | Ericsson | |||
June 2001 | July 2001 | |||
Expires December 2001 | Expires January 2002 | |||
<draft-ietf-mmusic-fid-02.txt> | <draft-ietf-mmusic-fid-03.txt> | |||
Grouping of m lines in SDP | Grouping of media lines in SDP | |||
Status of this Memo | Status of this Memo | |||
This document is an Internet-Draft and is in full conformance with | This document is an Internet-Draft and is in full conformance with | |||
all provisions of Section 10 of RFC2026. | all provisions of Section 10 of RFC2026. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
Drafts. Internet-Drafts are draft documents valid for a maximum of | Drafts. Internet-Drafts are draft documents valid for a maximum of | |||
skipping to change at line 33 | skipping to change at line 33 | |||
as reference material or to cite them other than as "work in | as reference material or to cite them other than as "work in | |||
progress." | progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
Abstract | Abstract | |||
This document defines two SDP attributes: "groupe" and "mid". They | This document defines two SDP attributes: "group" and "mid". They | |||
allow several "m" lines to be grouped together for two different | allow several "m" lines to be grouped together for two different | |||
purposes: for lip synchronization and for receiving media from a | purposes: for lip synchronization and for receiving media from a | |||
single flow (several media streams), encoded in different formats | single flow (several media streams), encoded in different formats | |||
during a particular session, on different ports and host interfaces. | during a particular session, on different ports and host interfaces. | |||
Camarillo/Holler/Eriksson 1 | Camarillo/Holler/Eriksson 1 | |||
Grouping of m lines in SDP | Grouping of media lines in SDP | |||
TABLE OF CONTENTS | TABLE OF CONTENTS | |||
1 Media stream identification attribute........................2 | 1 Terminology................................................2 | |||
2 Groupe attribute.............................................2 | 2 Media stream identification attribute......................2 | |||
3 Lip Synchronization (LS).....................................3 | 3 Group attribute............................................2 | |||
4 Flow Identification (FID)....................................3 | 4 Lip Synchronization (LS)...................................3 | |||
4.1 SIP and cellular access......................................3 | 5 Flow Identification (FID)..................................3 | |||
4.2 DTMF tones...................................................4 | 5.1 SIP and cellular access....................................4 | |||
5 Media flow definition........................................4 | 5.2 DTMF tones.................................................4 | |||
6 FID semantics................................................4 | 5.3 Media flow definition......................................5 | |||
7 Interactions of "groupe" with other media level attributes...5 | 5.4 FID semantics..............................................5 | |||
8 Usage of the "groupe" attribute in SIP.......................6 | 5.4.1 Interactions of "group" with other media level attributes..6 | |||
8.1 Backward compatibility.......................................6 | 5.4.2 Media in parallel..........................................7 | |||
8.2 Caller does not support fid..................................6 | 5.4.3 DTMF tones encoded as telephony events.....................8 | |||
8.3 Callee does not support fid..................................6 | 6 Usage of the "group" attribute in SIP......................8 | |||
9 Acknowledgements.............................................7 | 6.1 Media alignment............................................9 | |||
10 References..................................................7 | 6.2 Mid value in responses.....................................9 | |||
11 Authors' Addresses..........................................7 | 6.3 Group value in responses...................................9 | |||
6.4 Backward compatibility....................................10 | ||||
6.4.1 Client does not support "group"...........................11 | ||||
6.4.2 Server does not support "group"...........................11 | ||||
7 Acknowledgements..........................................11 | ||||
8 References................................................11 | ||||
9 Authors' Addresses........................................12 | ||||
1. Media stream identification attribute | 1 Terminology | |||
In this document, the key words "MUST", "MUST NOT", "REQUIRED", | ||||
"SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", | ||||
and "OPTIONAL" are to be interpreted as described in RFC 2119 [1] | ||||
and indicate requirement levels for compliant implementations. | ||||
2. Media stream identification attribute | ||||
A new "media stream identification" media attribute is defined. It | A new "media stream identification" media attribute is defined. It | |||
is used for identifying media streams within a session description. | is used for identifying media streams within a session description. | |||
Its formatting in SDP is described by the following BNF: | Its formatting in SDP [2] is described by the following BNF: | |||
mid-attribute = "a=mid:" identification-tag | mid-attribute = "a=mid:" identification-tag | |||
identification-tag = token | identification-tag = token | |||
The identification tag is unique within the SDP session description. | The identification tag MUST be unique within the SDP session | |||
description. | ||||
2. Group attribute | 3. Group attribute | |||
A new "group" session level attribute is defined. It is used for | A new "group" session level attribute is defined. It is used for | |||
grouping together different media streams. Its formatting in SDP is | grouping together different media streams. Its formatting in SDP is | |||
described by the following BNF: | described by the following BNF: | |||
groupe-attribute = "a=groupe:" semantics space | group-attribute = "a=group:" semantics | |||
2*(space identification-tag) | 2*(space identification-tag) | |||
semantics = "LS" | "FID" | semantics = "LS" | "FID" | |||
This document defines two standard semantics: LS (Lip | This document defines two standard semantics: LS (Lip | |||
Synchronization) and FID (Flow Identification). If further semantics | Synchronization) and FID (Flow Identification). If further semantics | |||
need to be standardized in the future, they would need to be | need to be standardized in the future, they would need to be | |||
defined in a standards track document. However, defining new | defined in a standards track document. However, defining new | |||
semantics apart from LS and FID is discouraged. Instead, it is | semantics apart from LS and FID is discouraged. Instead, it is | |||
RECOMMENDED to use other session description mechanisms such as | RECOMMENDED to use other session description mechanisms such as | |||
SDPng [1]. | SDPng [3]. | |||
There might be several "a=groupe" lines in a session description. | There MAY be several "a=group" lines in a session description. | |||
"a=groupe" lines that contain identification-tags that are not | "a=group" lines that contain identification-tags that are not | |||
present in the session description are simply ignored. The | present in the session description MUST be simply ignored. The | |||
application acts as if the "a=groupe" line did not exist. | application acts as if the "a=group" line did not exist. | |||
3. Lip Synchronization (LS) | 4. Lip Synchronization (LS) | |||
The play out of media streams that are grouped together using LS | The play out of media streams that are grouped together using LS | |||
semantics has to be synchronized. Synchronization is typically | semantics MUST be synchronized. Synchronization is typically | |||
performed using RTCP, which provides enough information to map time | performed using RTCP, which provides enough information to map time | |||
stamps from the different streams into a wall clock. | stamps from the different streams into a wall clock. | |||
The following example shows a session description where the audio | The following example shows a session description where the audio | |||
and the video stream have to be synchronized. | and the video stream have to be synchronized. | |||
v=0 | v=0 | |||
o=Laura 289083124 289083124 IN IP4 first.example.com | o=Laura 289083124 289083124 IN IP4 first.example.com | |||
t=0 0 | t=0 0 | |||
c=IN IP4 131.160.1.112 | c=IN IP4 131.160.1.112 | |||
a=groupe:LS 1 2 | a=group:LS 1 2 | |||
m=audio 30000 RTP/AVP 0 | m=audio 30000 RTP/AVP 0 | |||
a=mid:1 | a=mid:1 | |||
m=video 30002 RTP/AVP 31 | m=video 30002 RTP/AVP 31 | |||
a=mid:2 | a=mid:2 | |||
m=audio 30004 RTP/AVP 0 | ||||
a=mid:3 | ||||
4. Flow Identification (FID) | Note that although the third media stream is not present in the | |||
group line it still contains an mid attribute (mid:3). All the "m" | ||||
lines of a session description that uses "group" MUST be identified | ||||
with an "mid" attribute regardless of whether they appear or not in | ||||
the group line(s). | ||||
The RTSP RFC [2] defines a media stream as "a single media instance, | 5. Flow Identification (FID) | |||
e.g., an audio stream or a video stream as well as a single | ||||
whiteboard or shared application group. When using RTP, a stream | An "m" line in an SDP session description defines a media stream. | |||
consists of all RTP and RTCP packets created by a source within an | However, SDP does not define what a media stream is. To find the | |||
RTP session". | ||||
definition of a media stream we have to go to the RTSP | ||||
specification. The RTSP RFC [4] defines a media stream as "a single | ||||
media instance, e.g., an audio stream or a video stream as well as a | ||||
single whiteboard or shared application group. When using RTP, a | ||||
stream consists of all RTP and RTCP packets created by a source | ||||
within an RTP session". | ||||
This definition assumes that a single audio (or video) stream maps | This definition assumes that a single audio (or video) stream maps | |||
into an RTP session. The RTP RFC [3] defines an RTP session as | into an RTP session. To find the definition of an RTP session we go | |||
to the RTP specification. The RTP RFC [5] defines an RTP session as | ||||
follows: "For each participant, the session is defined by a | follows: "For each participant, the session is defined by a | |||
particular pair of destination transport addresses (one network | particular pair of destination transport addresses (one network | |||
address plus a port pair for RTP and RTCP)". | address plus a port pair for RTP and RTCP)". | |||
However, there are situations where a single media instance, (e.g., | While the previous definitions cover the most common cases, there | |||
an audio stream or a video stream) is sent using more than one RTP | are situations where a single media instance, (e.g., an audio stream | |||
session. Two examples (among many others) of this kind of situation | or a video stream) is sent using more than one RTP session. Two | |||
are cellular systems using SIP [4] and systems receiving DTMF tones | examples (among many others) of this kind of situation are cellular | |||
on a different host than the voice. | systems using SIP [6] and systems receiving DTMF tones on a | |||
different host than the voice. | ||||
4.1 SIP and cellular access | 5.1 SIP and cellular access | |||
Systems using a cellular access and SIP as a signalling protocol | Systems using a cellular access and SIP as a signalling protocol | |||
need to receive media over the air. During a session the media can | need to receive media over the air. During a session the media can | |||
be encoded using different codecs. The encoded media has to traverse | be encoded using different codecs. The encoded media has to traverse | |||
the radio interface. The radio interface is generally characterized | the radio interface. The radio interface is generally characterized | |||
by being bit error prone and associated with relatively high packet | by being bit error prone and associated with relatively high packet | |||
transfer delays. In addition, radio interface resources in a | transfer delays. In addition, radio interface resources in a | |||
cellular environment are scarce and thus expensive, which calls for | cellular environment are scarce and thus expensive, which calls for | |||
special measures in providing a highly efficient transport [5]. In | special measures in providing a highly efficient transport [7]. In | |||
order to get an appropriate speech quality in combination with an | order to get an appropriate speech quality in combination with an | |||
efficient transport, precise knowledge of codec properties is | efficient transport, precise knowledge of codec properties is | |||
required so that a proper radio bearer for the RTP session can be | required so that a proper radio bearer for the RTP session can be | |||
configured before transferring the media. These radio bearers are | configured before transferring the media. These radio bearers are | |||
dedicated bearers per media type, i.e. codec. | dedicated bearers per media type, i.e. codec. | |||
Cellular systems typically configure different radio bearers on | Cellular systems typically configure different radio bearers on | |||
different port numbers. Therefore, incoming media has to have | different port numbers. Therefore, incoming media has to have | |||
different destination port numbers for the different possible codecs | different destination port numbers for the different possible codecs | |||
in order to be routed properly to the correct radio bearer. Thus, | in order to be routed properly to the correct radio bearer. Thus, | |||
this is an example in which several RTP sessions are used to carry a | this is an example in which several RTP sessions are used to carry a | |||
single media instance (the encoded speech from the sender). | single media instance (the encoded speech from the sender). | |||
4.2 DTMF tones | 5.2 DTMF tones | |||
Some voice sessions include DTMF tones. Sometimes the voice handling | Some voice sessions include DTMF tones. Sometimes the voice handling | |||
is performed by a different host than the DTMF handling. [6] | is performed by a different host than the DTMF handling. [8] | |||
contains several examples of how application servers in the network | contains several examples of how application servers in the network | |||
gather DTMF tones for the user while the user receives the encoded | gather DTMF tones for the user while the user receives the encoded | |||
speech on his user agent. In these situations it is necessary to | speech on his user agent. In these situations it is necessary to | |||
establish two RTP sessions: one for the voice and the other for the | establish two RTP sessions: one for the voice and the other for the | |||
DTMF tones. Both RTP sessions are logically part of the same media | DTMF tones. Both RTP sessions are logically part of the same media | |||
instance. | instance. | |||
5. Media flow definition | 5.3 Media flow definition | |||
The previous examples show that the definition of a media stream in | The previous examples show that the definition of a media stream in | |||
[2] has to be updated. It cannot be assumed that a single media | [4] does not cover some scenarios. It cannot be assumed that a single | |||
instance maps into a single RTP session. Therefore, we introduce the | media instance maps into a single RTP session. Therefore, we | |||
definition of a media flow: | introduce the definition of a media flow: | |||
Media flow consists of a single media instance, e.g., an audio | Media flow consists of a single media instance, e.g., an audio | |||
stream or a video stream as well as a single whiteboard or shared | stream or a video stream as well as a single whiteboard or shared | |||
application group. When using RTP, a media flow comprises one or | application group. When using RTP, a media flow comprises one or | |||
more RTP sessions. | more RTP sessions. | |||
For instance, in a two party call where the voice exchanged can be | For instance, in a two party call where the voice exchanged can be | |||
encoded using GSM or PCM, the receiver wants to receive GSM on a | encoded using GSM or PCM, the receiver wants to receive GSM on a | |||
port number and PCM on a different port number. Two RTP sessions | port number and PCM on a different port number. Two RTP sessions | |||
will be established, one carrying GSM and the other carrying PCM. | will be established, one carrying GSM and the other carrying PCM. | |||
At any particular moment just one codec is in use. Therefore, at any | At any particular moment just one codec is in use. Therefore, at any | |||
moment one of the RTP sessions will not transport any voice. Here | moment one of the RTP sessions will not transport any voice. Here | |||
the systems are dealing with a single media flow, but two RTP | the systems are dealing with a single media flow, but two RTP | |||
sessions. | sessions. | |||
6. FID semantics | 5.4 FID semantics | |||
Several "m" lines grouped together using FID semantics form a media | Several "m" lines grouped together using FID semantics form a media | |||
flow. A media agent handling a media flow that comprises several "m" | flow. A media agent handling a media flow that comprises several "m" | |||
lines sends media to different destinations (IP address/port number) | lines sends media to different destinations (IP address/port number) | |||
depending on the codec used at any moment. If several "m" lines | depending on the codec used at any moment. | |||
contain the codec used media is sent to different destinations in | ||||
parallel. | ||||
For instance, a SIP user agent receives an INVITE with the following | For instance, a SIP user agent receives an INVITE with the following | |||
body: | body: | |||
v=0 | v=0 | |||
o=Laura 289083124 289083124 IN IP4 second.example.com | o=Laura 289083124 289083124 IN IP4 second.example.com | |||
t=0 0 | t=0 0 | |||
c=IN IP4 131.160.1.112 | c=IN IP4 131.160.1.112 | |||
a=groupe:FID 1 2 3 | a=group:FID 1 2 | |||
m=audio 30000 RTP/AVP 0 | m=audio 30000 RTP/AVP 3 | |||
a=rtpmap:3 GSM/8000 | ||||
a=mid:1 | a=mid:1 | |||
m=audio 30002 RTP/AVP 8 | m=audio 30002 RTP/AVP 97 | |||
a=rtpmap:97 AMR/8000 | ||||
a=fmtp:97 mode-set=0,2,5,7; mode-change-period=2; mode-change-neighbor; maxframes=1 | ||||
a=mid:2 | a=mid:2 | |||
m=audio 30004 RTP/AVP 0 8 | ||||
a=mid:3 | ||||
At a particular point of time, if the media agent is sending PCM u- | This would be the SDP sent by a terminal using a cellular access. | |||
law (payload 0) it sends RTP packets to ports 30000 and 30004 (first | The terminal supports GSM on port 30000 and AMR on port 30002. When | |||
and third "m" lines). If it is sending PCM A-law (payload 8) it | the remote party sends GSM it will send RTP packets to port number | |||
sends RTP packets to ports 30002 and 30004 (second and third "m" | 30000. When AMR is the codec chosen, packets will be sent to port | |||
lines). | ||||
Note that if several "m" lines with the same fid value contain the | ||||
same codec the media agent MUST send media over several RTP sessions | ||||
at the same time. | ||||
7 Interactions of "groupe" with other media level attributes | 30002. Note that the remote party can switch between both codecs | |||
dynamically in the middle of the session. | ||||
In the previous example a system receives media on the same IP | ||||
address on different port numbers. The following example shows how a | ||||
system can receive different codecs on different IP addresses. | ||||
v=0 | ||||
o=Laura 289083124 289083124 IN IP4 third.example.com | ||||
t=0 0 | ||||
c=IN IP4 131.160.1.112 | ||||
a=group:FID 1 2 | ||||
m=audio 20000 RTP/AVP 0 | ||||
c=IN IP4 131.160.1.111 | ||||
a=rtpmap:0 PCMU/8000 | ||||
a=mid:1 | ||||
m=audio 30002 RTP/AVP 97 | ||||
a=rtpmap:97 AMR/8000 | ||||
a=fmtp:97 mode-set=0,2,5,7; mode-change-period=2; mode-change-neighbor; maxframes=1 | ||||
a=mid:2 | ||||
The cellular terminal of this example only supports the AMR codec. | ||||
However, many current IP phones only support PCM (payload 0). In | ||||
order to be able to interoperate with them, the cellular terminal | ||||
uses a transcoder whose IP address is 131.160.1.111. The cellular | ||||
terminal includes in its SDP support for PCM at that IP address. | ||||
Remote systems will send AMR directly to the terminal but PCM will | ||||
be sent to the transcoder. The transcoder will be configured (using | ||||
whatever method) to convert the incoming PCM audio to AMR and send | ||||
it to the terminal. | ||||
5.4.1 Interactions of "group" with other media level attributes | ||||
Media level attributes affect a media stream defined by an "m" line. | Media level attributes affect a media stream defined by an "m" line. | |||
The presence of "groupe" does not modify this behavior. | The presence of "group" does not modify this behavior. | |||
For instance, a SIP user agent receives an INVITE with the following | This property can be used for different purposes. The example below | |||
body: | shows one possible use of this. A SIP user agent receives an INVITE | |||
with the following body: | ||||
v=0 | v=0 | |||
o=Laura 289083124 289083124 IN IP4 third.example.com | o=Laura 289083124 289083124 IN IP4 forth.example.com | |||
t=0 0 | t=0 0 | |||
c=IN IP4 131.160.1.112 | c=IN IP4 131.160.1.112 | |||
a=groupe:FID 1 2 | a=group:FID 1 2 | |||
m=audio 30000 RTP/AVP 0 | m=audio 30000 RTP/AVP 0 | |||
a=mid:1 | a=mid:1 | |||
m=audio 30002 RTP/AVP 8 | m=audio 30002 RTP/AVP 8 | |||
a=recvonly | a=recvonly | |||
a=mid:2 | a=mid:2 | |||
The media agent knows that at a certain moment it can send either | The media agent knows that at a certain moment it can send either | |||
PCM u-law to port number 30000 or PCM A-law to port number 30002. | PCM u-law to port number 30000 or PCM A-law to port number 30002. | |||
However, the media agent also knows that the other end will only | However, the media agent also knows that the other end will only | |||
send PCM u-law (payload 0). | send PCM u-law (payload 0). | |||
Note that the "groupe" attribute used with FID semantics makes it | Note that the "group" attribute used with FID semantics makes it | |||
possible to express uni-directional codecs for a bi-directional | possible to express uni-directional codecs for a bi-directional | |||
media flow, as shown in the example above. | media flow, as shown in the example above. | |||
8. Usage of the "groupe" attribute in SIP | 5.4.2 Media in parallel | |||
SIP [4] is an application layer protocol for establishing, | It can happen that different "m" lines grouped together using FID | |||
semantics contain the same codec. The SDP below shows one example of | ||||
this situation: | ||||
v=0 | ||||
o=Laura 289083124 289083124 IN IP4 fifth.example.com | ||||
t=0 0 | ||||
c=IN IP4 131.160.1.112 | ||||
a=group:FID 1 2 3 | ||||
m=audio 30000 RTP/AVP 0 | ||||
a=mid:1 | ||||
m=audio 30002 RTP/AVP 8 | ||||
a=mid:2 | ||||
m=audio 20000 RTP/AVP 0 8 | ||||
c=IN IP4 131.160.1.111 | ||||
a=recvonly | ||||
a=mid:3 | ||||
If several "m" lines contain the codec used at a certain point of | ||||
time, media MUST be sent to different destinations in parallel. | ||||
At a particular point of time, if the media agent is sending PCM u- | ||||
law (payload 0) it sends RTP packets to 131.160.1.112 on port 30000 | ||||
and to 131.160.1.111 on port 20000 (first and third "m" lines). If | ||||
it is sending PCM A-law (payload 8) it sends RTP packets to | ||||
131.160.1.112 on port 30002 and to 131.160.1.111 on port 20000 | ||||
(second and third "m" lines). | ||||
The system that generated the SDP above supports PCM u-law on port | ||||
30000 and PCM A-law on port 30002. In addition, it uses an application | ||||
server whose IP address is 131.160.1.111 that records the whole | ||||
conversation. That is why the application server always receives a | ||||
copy of the audio stream regardless of the codec being used at any | ||||
given moment (it receives both u-law and A-law). | ||||
Note that if several "m" lines grouped together using FID semantics | ||||
contain the same codec the media agent MUST send media over several | ||||
RTP sessions at the same time. | ||||
5.4.3 DTMF tones encoded as telephony events | ||||
DTMF tones can be transmitted using a regular voice codec or can be | ||||
transmitted as telephony events. The RTP payload for DTMF tones | ||||
treated as telephony events is described in RFC 2833 [9]. Below | ||||
is an example of an SDP session description using FID | ||||
semantics and this payload type. | ||||
v=0 | ||||
o=Laura 289083124 289083124 IN IP4 sixth.example.com | ||||
t=0 0 | ||||
c=IN IP4 131.160.1.112 | ||||
a=group:FID 1 2 | ||||
m=audio 30000 RTP/AVP 0 | ||||
a=mid:1 | ||||
m=audio 20000 RTP/AVP 97 | ||||
c=IN IP4 131.160.1.111 | ||||
a=rtpmap:97 telephone-event/8000 | ||||
a=mid:2 | ||||
The remote party would send PCM encoded voice (payload 0) to | ||||
131.160.1.112 and DTMF tones encoded as telephony events to | ||||
131.160.1.111. Note that only voice or DTMF is sent at a particular | ||||
point of time. When DTMF tones are sent the first media stream does | ||||
not carry any data and when voice is sent there is no data in the | ||||
second media stream. FID semantics provide different destinations | ||||
for alternative codecs. | ||||
Some systems implement the RTP payload defined in RFC 2833, but when | ||||
they send DTMF tones they do not mute the voice channel. Therefore, | ||||
effectively they are sending two copies of the same DTMF tone: | ||||
encoded as voice and encoded as a telephony event. When the receiver | ||||
gets both copies it typically uses the telephony event rather than | ||||
the tone encoded as voice. FID semantics MUST NOT be used in this | ||||
context to group both media streams since such a system is not using | ||||
alternative codecs but rather different parallel encodings for the | ||||
same information. | ||||
6. Usage of the "group" attribute in SIP | ||||
SDP descriptions are used by several different protocols, SIP among | ||||
them. We include a section about SIP because the "group" attribute | ||||
will most likely be used mainly by SIP systems. | ||||
SIP [6] is an application layer protocol for establishing, | ||||
terminating and modifying multimedia sessions. SIP carries session | terminating and modifying multimedia sessions. SIP carries session | |||
descriptions in the bodies of the SIP messages but is independent | descriptions in the bodies of the SIP messages but is independent | |||
from the protocol used for describing sessions. SDP [7] is one of | from the protocol used for describing sessions. SDP [2] is one of | |||
the protocols that can be used for this purpose. | the protocols that can be used for this purpose. | |||
6.1 Media alignment | ||||
Appendix B of [4] describes the usage of SDP in relation to SIP. It | Appendix B of [6] describes the usage of SDP in relation to SIP. It | |||
states: "The caller and callee align their media description so that | states: "The caller and callee align their media description so that | |||
the nth media stream ("m=" line) in the caller's session description | the nth media stream ("m=" line) in the caller's session description | |||
corresponds to the nth media stream in the callee's description." | corresponds to the nth media stream in the callee's description." | |||
The presence of the "groupe" attribute in an SDP session description | The presence of the "group" attribute in an SDP session description | |||
does not modify this behavior. | does not modify this behavior. | |||
8.1 Backward compatibility | Since the "mid" attribute provides a means to label "m" lines it | |||
would be possible to perform media alignment using "mid" labels | ||||
rather than matching nth "m" lines. However this would not bring any | ||||
gain and would add complexity to implementations. Therefore SIP | ||||
systems MUST perform media alignment matching nth lines regardless | ||||
of the presence of the "group" or "mid" attributes. | ||||
6.2 Mid value in responses | ||||
The "mid" attribute is an identifier for a particular media stream. | ||||
Therefore, the "mid" value in the response MUST be the same as the | ||||
"mid" value in the request. In addition, subsequent requests such as | ||||
re-INVITEs MUST use the same "mid" value for the already existing media | ||||
streams. | ||||
6.3 Group value in responses | ||||
The "group" attribute in a response will typically be the same as | ||||
the one received in the request. However, there are situations when | ||||
both are different. In these situations the "group" value to be used | ||||
in the session is the one present in the response. | ||||
Note the "group value in the response" really refers to the | ||||
"group" value in the last SDP exchanged between both parties. | ||||
That is, if in the establishment of a particular session | ||||
(INVITE-200 OK-ACK) SDPs are present in the 200 OK and in the | ||||
ACK (not in the INVITE), the "group" value to be used during | ||||
the session will be the one in the ACK. | ||||
The example below shows how the callee refuses a media stream | ||||
offered by the caller by setting its port number to zero. The "mid" | ||||
value corresponding to that media stream is removed from the "group" | ||||
value in the response. | ||||
SDP in the INVITE from caller to callee: | ||||
v=0 | ||||
o=Laura 289083124 289083124 IN IP4 seventh.example.com | ||||
t=0 0 | ||||
c=IN IP4 131.160.1.112 | ||||
a=group:FID 1 2 3 | ||||
m=audio 30000 RTP/AVP 0 | ||||
a=mid:1 | ||||
m=audio 30002 RTP/AVP 8 | ||||
a=mid:2 | ||||
m=audio 30004 RTP/AVP 3 | ||||
a=mid:3 | ||||
SDP in the 200 OK from callee to caller: | ||||
v=0 | ||||
o=Bob 289083125 289083125 IN IP4 fifth.example.com | ||||
t=0 0 | ||||
c=IN IP4 131.160.1.113 | ||||
a=group:FID 1 3 | ||||
m=audio 20000 RTP/AVP 0 | ||||
a=mid:1 | ||||
m=audio 0 RTP/AVP 8 | ||||
a=mid:2 | ||||
m=audio 20002 RTP/AVP 3 | ||||
a=mid:3 | ||||
Note that although the media stream was refused the "mid" value was | ||||
still included. | ||||
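
The derivation of the "group" value in the response can be sketched as
follows. This is an illustration only, under the assumption that the
SDP is handled as plain text; the helper names find_refused_mids and
answer_group are hypothetical and do not come from the draft.

```python
def find_refused_mids(response_sdp: str) -> set:
    """Collect the "mid" values of "m" lines whose port is zero."""
    refused, port_is_zero = set(), False
    for line in response_sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # "m=<media> <port> <proto> <fmt>": the port is the 2nd field.
            fields = line.split()
            port_is_zero = len(fields) > 1 and fields[1] == "0"
        elif line.startswith("a=mid:") and port_is_zero:
            refused.add(line[len("a=mid:"):].strip())
    return refused


def answer_group(offer_group: str, refused: set) -> str:
    """Drop refused "mid" values from an offered "a=group:" line."""
    prefix = "a=group:"
    if not offer_group.startswith(prefix):
        raise ValueError("not a group attribute line")
    semantics, *mids = offer_group[len(prefix):].split()
    kept = [m for m in mids if m not in refused]
    return prefix + " ".join([semantics, *kept])
```

Applied to the example above, the refused stream has "mid" value 2, so
"a=group:FID 1 2 3" from the request becomes "a=group:FID 1 3" in the
response, while the "a=mid:2" line itself is kept.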
6.4 Backward compatibility | ||||
An application that wants to be compliant with this specification MUST | ||||
support both "group" and "mid". Supporting just one of them would be | ||||
useless. | ||||
A SIP entity that receives a request containing "group" and "mid" | ||||
attributes, understands them, and is willing to use the grouping | ||||
semantics offered returns a response that also contains "group" and | ||||
"mid" attributes. This way, the client that issued the request knows | ||||
that the server understood this extension. | ||||
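
On the client side, this check might look like the sketch below. It is
an assumption-laden illustration rather than normative behavior: the
function name server_understood_grouping is hypothetical, and the test
is purely syntactic (presence of both attributes in the response SDP).

```python
def server_understood_grouping(response_sdp: str) -> bool:
    """True if the response SDP carries both "group" and "mid" attributes."""
    lines = [line.strip() for line in response_sdp.splitlines()]
    has_group = any(line.startswith("a=group:") for line in lines)
    has_mid = any(line.startswith("a=mid:") for line in lines)
    # Supporting just one of the two attributes is not enough.
    return has_group and has_mid
```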
Note that grouping of m lines is always requested by the issuer of | ||||
the request (the client), never by the issuer of the response (the | ||||
server). Since there is no response to a response in SIP, a server | ||||
that requested grouping in a response would not know whether the | ||||
"group" attribute was accepted by the client or not. A server that | ||||
wants to group media lines should issue another request after having | ||||
responded to the first one (a re-INVITE for instance). | ||||
This document does not define any SIP "Require" header. Therefore, | This document does not define any SIP "Require" header. Therefore, | |||
if one of the SIP user agents does not understand the "groupe" | if one of the SIP user agents does not understand the "group" | |||
attribute the standard SDP fall back mechanism is used. | attribute the standard SDP fall back mechanism is used. | |||
A system that understands the "groupe" attribute MUST add an "mid" | A client that does not want to perform grouping of media lines in a | |||
attribute to every "m" line in any SDP session description that it | session SHOULD NOT add "mid" lines either. The presence of "mid" | |||
generates. | lines would not be of any use for the server. Even if the server can | |||
see that the client supported "mid" (and obviously "group" also) it | ||||
would be impossible to know which particular semantics are supported | ||||
(LS or/and FID). | ||||
8.2 Caller does not support "groupe" | Camarillo/Holler/Eriksson 10 | |||
Grouping of media lines in SDP | ||||
This situation does not represent a problem. The SDP in the INVITE | 6.4.1 Client does not support "group" | |||
will not contain any "mid" attribute. The callee knows that the | ||||
caller does not support "groupe". | ||||
8.3 Callee does not support "groupe" | This situation does not represent a problem because grouping | |||
requests is always performed by clients, not by servers. If the | ||||
client does not support "group" this attribute will just not be | ||||
used. | ||||
The callee will ignore the "groupe" attribute, since it does not | 6.4.2 Server does not support "group" | |||
understand it. For LS semantics, the callee might decide to perform | ||||
or to not perform synchronization between media streams. | ||||
For FID semantics, the callee will consider that the session | The server will ignore the "group" attribute, since it does not | |||
understand it (it will also ignore the "mid" attribute). For LS | ||||
semantics, the server might decide to perform or to not perform | ||||
synchronization between media streams. | ||||
For FID semantics, the server will consider that the session | ||||
comprises several media streams. | comprises several media streams. | |||
Different implementations would behave in different ways. | Different implementations would behave in different ways. | |||
In the case of audio and different "m" lines for different codecs an | In the case of audio and different "m" lines for different codecs an | |||
implementation might decide to act as a mixer with the different | implementation might decide to act as a mixer with the different | |||
incoming RTP sessions, which is the correct behavior. | incoming RTP sessions, which is the correct behavior. | |||
An implementation might also decide to refuse the request (e.g. 488 | An implementation might also decide to refuse the request (e.g. 488 | |||
Not Acceptable Here or 606 Not Acceptable) because it contains | Not Acceptable Here or 606 Not Acceptable) because it contains | |||
several "m" lines. In this case, the callee does not support the | several "m" lines. In this case, the server does not support the | |||
type of session that the caller wanted to establish. In case the | type of session that the caller wanted to establish. In case the | |||
caller is willing to establish a simpler session anyway, he should | client is willing to establish a simpler session anyway, he should | |||
re-try the request without "groupe" attribute and only one "m" line | re-try the request without "group" attribute and only one "m" line | |||
per flow. | per flow. | |||
9. Acknowledgments | 7. Acknowledgments | |||
The authors would like to thank Jonathan Rosenberg, Adam Roach and | The authors would like to thank Jonathan Rosenberg, Adam Roach and | |||
Orit Levin for their feedback on this document. | Orit Levin for their feedback on this document. | |||
10. References | 8. References | |||
[1] D. Kutscher/J. Ott/C. Bormann, "Session Description and | [1] S. Bradner, "Key words for use in RFCs to Indicate Requirement | |||
Levels", RFC 2119, IETF; March 1997. | ||||
[2] M. Handley/V. Jacobson, "SDP: Session Description Protocol", RFC | ||||
2327, IETF; April 1998. | ||||
[3] D. Kutscher/J. Ott/C. Bormann, "Session Description and | ||||
Capability Negotiation", draft-ietf-mmusic-sdpng-00.txt, IETF; April | Capability Negotiation", draft-ietf-mmusic-sdpng-00.txt, IETF; April | |||
2001. Work in progress. | 2001. Work in progress. | |||
[2] H. Schulzrinne/A. Rao/R. Lanphier, "Real Time Streaming Protocol | [4] H. Schulzrinne/A. Rao/R. Lanphier, "Real Time Streaming Protocol | |||
(RTSP)", RFC 2326, IETF; April 1998. | (RTSP)", RFC 2326, IETF; April 1998. | |||
[3] H. Schulzrinne/S. Casner/R. Frederick/V. Jacobson, "RTP: A | Camarillo/Holler/Eriksson 11 | |||
Grouping of media lines in SDP | ||||
[5] H. Schulzrinne/S. Casner/R. Frederick/V. Jacobson, "RTP: A | ||||
Transport Protocol for Real-Time Applications", RFC 1889, IETF; | Transport Protocol for Real-Time Applications", RFC 1889, IETF; | |||
January 1996. | January 1996. | |||
[4] M. Handley/H. Schulzrinne/E. Schooler/J. Rosenberg, "SIP: | [6] M. Handley/H. Schulzrinne/E. Schooler/J. Rosenberg, "SIP: | |||
Session Initiation Protocol", RFC 2543, IETF; March 1999. | Session Initiation Protocol", RFC 2543, IETF; March 1999. | |||
[5] L. Westberg/M. Lindqvist, "Realtime Traffic over Cellular Access | [7] L. Westberg/M. Lindqvist, "Realtime Traffic over Cellular Access | |||
Networks", draft-westberg-realtime-cellular-03.txt, IETF; November | Networks", draft-westberg-realtime-cellular-04.txt, IETF; June 2001. | |||
2000. Work in progress. | Work in progress. | |||
[6] J. Rosemberg/P.Mataga/H.Schulzrinne, "An Applcation Server | [8] J. Rosenberg/P.Mataga/H.Schulzrinne, "An Application Server | |||
Component Architecture for SIP", draft-rosenberg-sip-app-components- | Component Architecture for SIP", draft-rosenberg-sip-app-components- | |||
00.txt, IETF; November 2000. Work in progress. | 00.txt, IETF; November 2000. Work in progress. | |||
[7] M. Handley/V. Jacobson, "SDP: Session Description Protocol", RFC | [9] H. Schulzrinne/S. Petrack, "RTP Payload for DTMF Digits, | |||
2327, IETF; April 1998. | Telephony Tones and Telephony Signals", RFC 2833, IETF; May 2000. | |||
11. Authors' Addresses | 9. Authors' Addresses | |||
Gonzalo Camarillo | Gonzalo Camarillo | |||
Ericsson | Ericsson | |||
Advanced Signalling Research Lab. | Advanced Signalling Research Lab. | |||
FIN-02420 Jorvas | FIN-02420 Jorvas | |||
Finland | Finland | |||
Phone: +358 9 299 3371 | Phone: +358 9 299 3371 | |||
Fax: +358 9 299 3052 | Fax: +358 9 299 3052 | |||
Email: Gonzalo.Camarillo@ericsson.com | Email: Gonzalo.Camarillo@ericsson.com | |||
Jan Holler | Jan Holler | |||
Ericsson Research | Ericsson Research | |||
S-16480 Stockholm | S-16480 Stockholm | |||
Sweden | Sweden | |||
Phone: +46 8 58532845 | Phone: +46 8 58532845 | |||
Fax: +46 8 4047020 | Fax: +46 8 4047020 | |||
Email: Jan.Holler@era.ericsson.se | Email: Jan.Holler@era.ericsson.se | |||
Goran AP Eriksson | Goran AP Eriksson | |||
Ericsson Research | Ericsson Research | |||
S-16480 Stockholm | S-16480 Stockholm | |||
Sweden | Sweden | |||
Phone: +46 8 58531762 | Phone: +46 8 58531762 | |||
Fax: +46 8 4047020 | Fax: +46 8 4047020 | |||
Email: Goran.AP.Eriksson@era.ericsson.se | Email: Goran.AP.Eriksson@era.ericsson.se | |||
Camarillo/Holler/Eriksson 8 | Camarillo/Holler/Eriksson 12 | |||
End of changes. | ||||
This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/ |