draft-ietf-avt-rtp-ipmr-02.txt   draft-ietf-avt-rtp-ipmr-03.txt 
Audio/Video Transport Working Group SPIRIT DSP
Audio/Video Transport Working Group Intended status: Informational
Internet Draft SPIRIT DSP
Intended status: Informational February 25, 2009
RTP Payload Format for IP-MR Speech Codec draft-ietf-avt-rtp-ipmr-02.txt RTP Payload Format for IP-MR Speech Codec draft-ietf-avt-rtp-ipmr-03.txt
Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.

Copyright (c) 2009 IETF Trust and the persons identified as the document
authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions
Relating to IETF Documents in effect on the date of publication of this
document (http://trustee.ietf.org/license-info). Please review these
documents carefully, as they describe your rights and restrictions with
respect to this document.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This Internet-Draft will expire on August 25, 2009. This Internet-Draft will expire on October 14, 2009.
Abstract

This document specifies the payload format for packetization of SPIRIT
IP-MR encoded speech signals into the Real-time Transport Protocol
(RTP). The payload format supports transmission of multiple frames per
payload and in-band redundancy for robustness against packet loss.
Table of Contents

1. Introduction
2. IP-MR Codec Description
3. Payload Format
   3.1. Payload Format Structure
   3.2. Payload Header
   3.3. Speech Table of Contents
   3.4. Speech Data
   3.5. Redundancy Header
   3.6. Redundancy Table of Contents
   3.7. Redundancy Data
4. Payload Examples
   4.1. Payload Carrying a Single Frame
   4.2. Payload Carrying Multiple Frames with Redundancy
5. Media Type Registration
   5.1. Registration of media subtype audio/ip-mr_v2.5
   5.2. Mapping Media Type Parameters into SDP
6. Security Considerations
7. IANA Considerations
8. Normative References
9. Author's Information
10. Disclaimer
11. Legal Terms
1. Introduction

This document specifies the payload format for packetization of SPIRIT
IP-MR encoded speech signals into the Real-time Transport Protocol
(RTP). The payload format supports transmission of multiple frames per
payload and in-band redundancy for robustness against packet loss.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC 2119].
2. IP-MR Codec Description

The IP-MR codec is a scalable, adaptive multi-rate wideband speech codec
designed by SPIRIT for use in IP-based networks. The codec is suitable
for real-time communications such as telephony and videoconferencing.

The codec operates on 20 ms frames at a 16 kHz sampling rate and has an
algorithmic delay of 25 ms.

IP-MR supports six wideband speech coding modes with bit rates ranging
from about 7.7 to about 34.2 kbps. The coding mode can be changed at any
20 ms frame boundary, which makes it possible to adjust the speech
encoding rate dynamically during a session to adapt to varying
transmission conditions.

A coded frame consists of multiple coding layers: a base (or core) layer
and several enhancement layers, which are coded independently. The
enhancement layers can be omitted, and the remaining base layer can
still be meaningfully decoded without artifacts. This makes the bit
stream scalable and allows the bit rate to be reduced during
transmission without re-encoding.
This memo specifies an optional form of redundancy coding within RTP for
protection against packet loss. It is based on the commonly known scheme
in which previously transmitted frames are aggregated together with new
ones. Each frame is retransmitted once in the following RTP payload
packet. In the diagram below, f(n-2)...f(n+4) denote a sequence of
speech frames, and p(n-1)...p(n+4) a sequence of payload packets:

   --+--------+--------+--------+--------+--------+--------+--------+--
     | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
   --+--------+--------+--------+--------+--------+--------+--------+--
     <---- p(n-1) ---->
              <----- p(n) ----->
                       <---- p(n+1) ---->
                                <---- p(n+2) ---->
                                         <---- p(n+3) ---->
                                                  <---- p(n+4) ---->
Because of the scalable nature of the IP-MR codec, however, there is no
need to duplicate a whole previous frame; it is enough to retransmit
only the core layer. This reduces redundancy overhead while keeping
efficiency. Moreover, the speech bits encoded in the core layer are
divided into six classes (A to F) according to their perceptual
sensitivity to errors. Selecting which of these classes are sent as
redundancy makes it possible to adjust the trade-off between overhead
and robustness against packet loss.
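The aggregation scheme described above can be sketched as follows. This is only an illustration under simplifying assumptions: frames are represented by opaque labels, the redundant copy stands in for the core-layer bits of the previous frame, and the names packetize_with_redundancy and recover are hypothetical, not part of the payload format.

```python
from typing import List, Optional, Tuple

# (redundant copy of the previous frame, new frame)
Packet = Tuple[Optional[str], str]


def packetize_with_redundancy(frames: List[str]) -> List[Packet]:
    """Pair each new frame with a redundant copy of the frame before it."""
    packets: List[Packet] = []
    previous: Optional[str] = None
    for frame in frames:
        packets.append((previous, frame))
        previous = frame
    return packets


def recover(packets: List[Optional[Packet]]) -> List[Optional[str]]:
    """Rebuild the frame sequence; a None entry models a lost packet."""
    frames: List[Optional[str]] = [None] * len(packets)
    for i, pkt in enumerate(packets):
        if pkt is None:
            continue
        redundant, new = pkt
        frames[i] = new
        # Fill the gap left by a lost predecessor from the redundant copy.
        if i > 0 and frames[i - 1] is None and redundant is not None:
            frames[i - 1] = redundant
    return frames
```

With this scheme a single lost packet is covered by the packet that follows it; two consecutive losses leave the earlier frame unrecoverable in this one-packet-back sketch.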
The mechanism described does not require signaling at session setup. The
sender is responsible for selecting an appropriate amount of redundancy
based on feedback about the channel conditions.
The main codec characteristics can be summarized as follows:

   o Wideband, 16 kHz, speech codec
   o Adaptive multi-rate with six modes from about 7.7 to about
     34.2 kbps
   o Bit rate scalable
   o Variable bit rate changing in accordance with actual speech
     content
   o Discontinuous Transmission (DTX), silence suppression and
     comfort noise generation
   o In-band redundancy scheme for protection against packet loss
3. Payload Format

3.1. Payload Format Structure

The IP-MR payload format consists of a payload header with general
information about the packet, a speech table of contents (TOC), and
speech data. An optional redundancy section follows the speech data.
The redundancy section consists of a redundancy header, a redundancy
TOC and a redundancy data payload.

The following diagram shows the standard payload format layout:

   +---------+--------+--------+- - - - - - +- - - - - - +- - - - - - +
   | payload | speech | speech | redundancy | redundancy | redundancy |
   | header  |  TOC   |  data  |   header   |    TOC     |    data    |
   +---------+--------+--------+- - - - - - +- - - - - - +- - - - - - +
3.2. Payload Header

The payload header has the following format:

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+
   |T| CR  | BR  |D|A|GR |R|
   +-+-+-+-+-+-+-+-+-+-+-+-+
   o T (1 bit): Reserved for compatibility with future extensions.
     SHOULD be set to 0.

   o CR (3 bits): coding rate of the frame(s) in this packet, as per
     the following table:

   +-------+--------------+
   |  CR   | avg. bitrate |
   +-------+--------------+
   |   0   |   7.7 kbps   |
   |   1   |   9.8 kbps   |
   |   2   |  14.3 kbps   |
   |   3   |  20.8 kbps   |
   |   4   |  27.9 kbps   |
   |   5   |  34.2 kbps   |
   |   6   |  (reserved)  |
   |   7   |   NO_DATA    |
   +-------+--------------+

     The CR value 7 (NO_DATA) indicates that there is no speech data
     (and accordingly no speech TOC) in the payload. This MAY be used
     to transmit redundancy data only. The value 6 is reserved. If
     this value is received, the packet SHOULD be discarded.
   o BR (3 bits): base rate for the core layer of the frame(s) in
     this packet. Values in the range 0-5 indicate the bit rate of
     the core layer, using the same table as CR. Values 6 and 7 are
     reserved. If one of these values is received, the packet SHOULD
     be discarded. The base rate is the lowest rate for scalability,
     so the speech payload cannot be scaled down below the BR value.
     If a received packet has BR > CR, then during decoding it will
     be assumed that BR = CR.

   o D (1 bit): indicates whether the DTX mode is allowed.

   o A (1 bit): byte-aligned payload. If A=1 then all speech frames
     MUST be byte-aligned. This mode speeds up speech data access.
     The A=0 value specifies bandwidth-efficient mode with no byte
     alignment (including the end of the header).

   o GR (2 bits): number of frames in the packet (grouping size). The
     actual grouping size is GR + 1, thus the maximum grouping
     supported is 4.

   o R (1 bit): redundancy presence bit. If R=1 then the packet
     contains redundancy information for recovery of lost packets.
     In this case the redundancy section is present after the speech
     data.
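The field layout above can be unpacked with a few shifts and masks. The following is a minimal sketch, assuming the 12 header bits are transmitted most-significant bit first at the start of the payload (the usual convention for RFC bit diagrams); the names PayloadHeader and parse_payload_header are illustrative only, and raising an exception is just one way to model "the packet SHOULD be discarded".

```python
from dataclasses import dataclass

CR_RESERVED = 6   # reserved coding rate value
CR_NO_DATA = 7    # no speech data in the payload
MAX_BR = 5        # BR values 6 and 7 are reserved


@dataclass
class PayloadHeader:
    t: int               # reserved bit
    cr: int              # coding rate index
    br: int              # base rate index (clamped to CR)
    dtx: bool            # D bit
    aligned: bool        # A bit
    group_size: int      # GR + 1
    redundancy: bool     # R bit


def parse_payload_header(payload: bytes) -> PayloadHeader:
    """Decode the 12-bit IP-MR payload header from the first two bytes."""
    if len(payload) < 2:
        raise ValueError("payload too short for the 12-bit header")
    bits = ((payload[0] << 8) | payload[1]) >> 4  # keep the top 12 bits
    t = (bits >> 11) & 0x1
    cr = (bits >> 8) & 0x7
    br = (bits >> 5) & 0x7
    d = (bits >> 4) & 0x1
    a = (bits >> 3) & 0x1
    gr = (bits >> 1) & 0x3
    r = bits & 0x1
    if cr == CR_RESERVED or br > MAX_BR:
        raise ValueError("reserved CR or BR value, packet should be discarded")
    # A packet with BR > CR is decoded as if BR = CR.
    br = min(br, cr)
    return PayloadHeader(t, cr, br, bool(d), bool(a), gr + 1, bool(r))
```

For instance, the header bits of the example in Section 4.1 (0, CR=1, BR=0, 0, 0, 00, 0) followed by the TOC bit 1 pack into the bytes 0x10 0x08 when the remaining bits are zero.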
3.3. Speech Table of Contents

The speech TOC contains one entry for each frame in the packet (grouping
size entries in total). Each entry contains a single field:

    0
   +-+
   |E|
   +-+

   o E (1 bit): frame existence indicator. If set to 0, this indicates
     that the corresponding frame is absent and the receiver should
     set the special LOST_FRAME flag for the decoder. This can be
     followed by the lost frame itself or by empty frames generated by
     the encoder during silence intervals in DTX mode.

Note that if the CR field of the payload header is 7 (NO_DATA) then the
speech TOC is empty.
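As an illustration, the TOC entries can be read bit by bit directly after the header. This sketch assumes MSB-first bit order and that the TOC starts at bit offset 12, immediately after the payload header, as in the examples of Section 4; read_bit and parse_speech_toc are hypothetical helper names.

```python
from typing import List

HEADER_BITS = 12   # size of the payload header in bits
CR_NO_DATA = 7     # CR value meaning "no speech data"


def read_bit(payload: bytes, index: int) -> int:
    """Return bit number `index` of the payload, counting MSB-first."""
    return (payload[index // 8] >> (7 - index % 8)) & 1


def parse_speech_toc(payload: bytes, cr: int, group_size: int) -> List[bool]:
    """Return one E flag per frame; the TOC is empty when CR is NO_DATA."""
    if cr == CR_NO_DATA:
        return []
    return [bool(read_bit(payload, HEADER_BITS + i)) for i in range(group_size)]
```

A False entry marks a frame that is absent from the speech data, so the receiver would flag it as LOST_FRAME for the decoder.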
3.4. Speech Data

The speech data of a payload contains one or more speech frames or
comfort noise frames, as specified in the speech TOC of the payload.

Each speech frame represents 20 ms of speech encoded with the rate
indicated in the CR field and the base rate indicated in the BR field of
the payload header. The length of a speech frame is variable due to the
nature of the codec and can be calculated after decoding.
3.5. Redundancy Header

If a packet contains redundancy (the R field of the payload header is 1)
the speech data is followed by the redundancy header:

    0 1 2 3 4 5
   +-+-+-+-+-+-+
   | CL1 | CL2 |
   +-+-+-+-+-+-+

The redundancy header consists of two fields. Each field contains a
class specifier for the amount of redundancy taken from the preceding
packet (CL1) and the pre-preceding packet (CL2), i.e. the packets
distant from the current packet by 1 and 2 packets respectively. The
values are listed in the table below:

   +-------+----------------------+
   |  CL   | amount of redundancy |
   +-------+----------------------+
   |   0   |         NONE         |
   |   1   |       CLASS A        |
   |   2   |       CLASS B        |
   |   3   |       CLASS C        |
   |   4   |       CLASS D        |
   |   5   |       CLASS E        |
   |   6   |       CLASS F        |
   |   7   |      (reserved)      |
   +-------+----------------------+

Each specifier takes 3 bits, thus the total redundancy header size is 6
bits.
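A small sketch of decoding the two class specifiers follows. It assumes the six header bits have already been extracted into an integer with CL1 in the upper three bits; the function name is illustrative, and rejecting the reserved value 7 with an exception is a choice made for this sketch, since the text does not say how a receiver should handle it.

```python
from typing import Tuple

REDUNDANCY_CLASSES = {
    0: "NONE",
    1: "CLASS A",
    2: "CLASS B",
    3: "CLASS C",
    4: "CLASS D",
    5: "CLASS E",
    6: "CLASS F",
}


def parse_redundancy_header(six_bits: int) -> Tuple[str, str]:
    """Split a 6-bit redundancy header into its (CL1, CL2) class names."""
    if not 0 <= six_bits < 64:
        raise ValueError("redundancy header must fit in 6 bits")
    cl1 = (six_bits >> 3) & 0x7   # redundancy from the preceding packet
    cl2 = six_bits & 0x7          # redundancy from the pre-preceding packet
    if cl1 == 7 or cl2 == 7:
        # Assumption of this sketch: treat the reserved value as an error.
        raise ValueError("reserved CL value")
    return REDUNDANCY_CLASSES[cl1], REDUNDANCY_CLASSES[cl2]
```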
3.6. Redundancy Table of Contents

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Pkt1 Entries| Pkt2 Entries|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The redundancy TOC contains entries for redundancy frames from the
preceding and pre-preceding packets. Each entry takes 1 bit, like a
speech TOC entry (3.3):

    0
   +-+
   |E|
   +-+

   o E (1 bit): frame existence indicator. If set to 0, this indicates
     that the corresponding frame is absent.

   o For each of the preceding and pre-preceding packets the number of
     entries is equal to the grouping size of the current packet. Thus
     the maximum number of entries is 4*2 = 8.

   o If a class specifier in the redundancy header is CL=0 (NONE) then
     there are no entries for the corresponding packet redundancy.
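The entry-count rule above amounts to a one-line computation. The following sketch uses a hypothetical helper name and takes the already decoded grouping size (GR + 1) and the two class specifiers as inputs.

```python
def redundancy_toc_entries(group_size: int, cl1: int, cl2: int) -> int:
    """Number of 1-bit entries in the redundancy TOC.

    Each of the two delayed packets contributes `group_size` entries,
    unless its class specifier is 0 (NONE), in which case it has none.
    """
    if not 1 <= group_size <= 4:
        raise ValueError("grouping size must be between 1 and 4")
    return sum(group_size for cl in (cl1, cl2) if cl != 0)
```

For a packet with three frames and both specifiers non-zero this gives six entries, which matches the six redundancy frames (three per delayed packet) of the example in Section 4.2.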
3.7. Redundancy Data

The redundancy data of a payload contains redundancy information for one
or more speech frames or comfort noise frames that may be lost during
transmission, as specified in the redundancy TOC of the payload. The
redundancy is the most important part of each preceding frame, which
represents 20 ms of speech. This data MAY be used for partial
reconstruction of lost frames. The amount of available redundancy is
specified by the CL field in the redundancy header (3.5). This value
SHOULD be passed to the decoder. The length of a redundancy frame is
variable and can be calculated after decoding.
4. Payload Examples

A few examples to highlight the payload format follow.

4.1. Payload Carrying a Single Frame

The following diagram shows a standard IP-MR payload carrying a single
speech frame without redundancy:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|CR=1 |BR=0 |0|0|0 0|0|1|sp(0)                                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      sp(193)|P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

In this payload the speech frame is not damaged at the IP origin (E=1),
the coding rate is 9.8 kbps (CR=1), the base rate is 7.7 kbps (BR=0),
and the DTX mode is off. There is no byte alignment (A=0) and no
redundancy (R=0). The encoded speech bits, sp(0) to sp(193), are placed
immediately after the TOC. Finally, one zero bit is added at the end as
padding to make the payload byte aligned.
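The single padding bit in this example follows from simple bit counting: 12 header bits, 1 TOC bit and 194 speech bits give 207 bits, one short of 26 full bytes. The sketch below reproduces that arithmetic; the function name is hypothetical.

```python
from typing import Tuple


def padded_size(header_bits: int, toc_bits: int, data_bits: int) -> Tuple[int, int]:
    """Return (total bits after padding, number of padding bits)."""
    total = header_bits + toc_bits + data_bits
    padding = (-total) % 8   # zero bits needed to reach a byte boundary
    return total + padding, padding


# Example 4.1: 12-bit header, one TOC entry, speech bits sp(0)..sp(193).
total_bits, pad_bits = padded_size(12, 1, 194)
payload_bytes = total_bits // 8
```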
4.2. Payload Carrying Multiple Frames with Redundancy

The following diagram shows a payload that contains three frames, one of
them with no speech data. The coding rate is 7.7 kbps (CR=0), the base
rate is 7.7 kbps (BR=0), and the DTX mode is on. The speech frames are
byte aligned (A=1), so 1 zero bit is added at the end of the header.
Besides the speech frames the payload contains six redundancy frames
(three for each delayed packet).

The first speech frame consists of bits sp1(0) to sp1(92). After that 3
bits are added for byte alignment. The second frame does not contain any
speech information, which is represented in the payload by its TOC
entry. The third frame consists of bits sp3(0) to sp3(171).

The redundancy header follows the speech data. The one-packet-delayed
redundancy contains class A+B bits (CL1=2), and the two-packet-delayed
redundancy contains class A bits (CL2=1). The one-packet-delayed
redundancy contains three frames with 20, 39 and 35 bits respectively.
The first frame of the two-packet-delayed redundancy is absent, as
indicated by its TOC entry, and the two other frames have sizes of 15
and 19 bits.

Note that all speech frames are padded with zero bits for byte
alignment.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|CR=0 |BR=0 |1|1|1 0|1|1 0 1|P|sp1(0)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
skipping to change at line 368 skipping to change at page 9, line 39
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   red1_2(38)|red1_3(0)                                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         red1_3(34)|red2_2(0)          red2_2(14)|red2_3(0)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             red2_3(18)|P|P|P|P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
5. Media Type Registration

This section describes the media types and names associated with this
payload format.

5.1. Registration of media subtype audio/ip-mr_v2.5

Type name: audio

Subtype name: ip-mr_v2.5

Required parameters: none

Optional parameters:

   * ptime: Gives the length of time in milliseconds represented by the
     media in a packet. Allowed values are: 20, 40, 60 and 80.

Encoding considerations: This media type is framed binary data (see RFC
4288, Section 4.8).

Security considerations: See RFC 3550 [RFC 3550]

Interoperability considerations: none

Published specification: RFC XXXX

Applications that use this media type: Real-time audio applications like
voice over IP and teleconference, and multi-media streaming.

Additional information: none

Person & email address to contact for further information:
Elena Berlizova <berlizova@spiritdsp.com>

Intended usage: COMMON

Restrictions on usage: This media type depends on RTP framing, and hence
is only defined for transfer via RTP [RFC 3550].

Author: Sergey Ikonin <ikonin@spiritdsp.com>

Change controller: IETF Audio/Video Transport working group delegated
from the IESG.
5.2. Mapping Media Type Parameters into SDP 5.2. Mapping Media Type Parameters into SDP
The information carried in the media type specification has a specific mapping to fields in the The information carried in the media type specification has a specific
Session Description Protocol (SDP) [RFC4566], which is commonly used to describe RTP mapping to fields in the Session Description Protocol (SDP) [RFC 4566],
sessions. When SDP is used to specify sessions employing the IP-MR codec, the mapping is as which is commonly used to describe RTP sessions. When SDP is used to
follows: specify sessions employing the IP-MR codec, the mapping is as follows:
* The media type ("audio") goes in SDP "m=" as the media name. o The media type ("audio") goes in SDP "m=" as the media name.
* The media subtype (payload format name) goes in SDP "a=rtpmap" as the encoding name. The o The media subtype (payload format name) goes in SDP "a=rtpmap"
RTP clock rate in "a=rtpmap" MUST be 16000. as the encoding name. The RTP clock rate in "a=rtpmap" MUST be 16000.
* The parameter "ptime" goes in the SDP "a=ptime" attribute. o The parameter "ptime" goes in the SDP "a=ptime" attribute.
Any remaining parameters go in the SDP "a=fmtp" attribute by copying them directly from the Any remaining parameters go in the SDP "a=fmtp" attribute by copying
media type parameter string as a semicolon-separated list of parameter=value pairs. them directly from the media type parameter string as a semicolon-
separated list of parameter=value pairs.
Note that the payload format (encoding) names are commonly shown in upper case. Media Note that the payload format (encoding) names are commonly shown in
subtypes are commonly shown in lower case. These names are case-insensitive in both places. upper case. Media subtypes are commonly shown in lower case. These
names are case-insensitive in both places.
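
As a minimal sketch of the mapping described in Section 5.2, the following Python helper assembles the SDP lines for one IP-MR audio stream. The function name build_sdp_media_lines, the port 49120, the dynamic payload type 97, the encoding name "IP-MR" and the fmtp parameter "example-param" are illustrative assumptions and are not taken from this specification. Only the "audio" media name, the 16000 clock rate and the allowed ptime values (20, 40, 60, 80) come from the text.

```python
ALLOWED_PTIME_MS = (20, 40, 60, 80)  # allowed "ptime" values from the registration
CLOCK_RATE_HZ = 16000                # clock rate required in "a=rtpmap"


def build_sdp_media_lines(port, payload_type, encoding_name, ptime=None, fmtp=None):
    """Return the SDP lines for one audio stream (hypothetical helper).

    port, payload_type and encoding_name are supplied by the caller;
    fmtp is an optional dict of any remaining media type parameters.
    """
    if ptime is not None and ptime not in ALLOWED_PTIME_MS:
        raise ValueError(f"ptime must be one of {ALLOWED_PTIME_MS}, got {ptime}")

    lines = [
        # Media type "audio" goes in "m=" as the media name.
        f"m=audio {port} RTP/AVP {payload_type}",
        # Media subtype goes in "a=rtpmap" as the encoding name, with the clock rate.
        f"a=rtpmap:{payload_type} {encoding_name}/{CLOCK_RATE_HZ}",
    ]
    if fmtp:
        # Remaining parameters are copied as a semicolon-separated list.
        params = ";".join(f"{name}={value}" for name, value in fmtp.items())
        lines.append(f"a=fmtp:{payload_type} {params}")
    if ptime is not None:
        # "ptime" has its own SDP attribute and is not placed in "a=fmtp".
        lines.append(f"a=ptime:{ptime}")
    return lines


if __name__ == "__main__":
    # All argument values below are assumptions chosen for illustration.
    for line in build_sdp_media_lines(49120, 97, "IP-MR", ptime=20,
                                      fmtp={"example-param": 1}):
        print(line)
```

The helper rejects a ptime outside the allowed set rather than rounding it, since the registration lists the permitted values explicitly.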
6. Security Considerations 6. Security Considerations
RTP packets using the payload format defined in this specification are subject to the security RTP packets using the payload format defined in this specification are
considerations discussed in the RTP specification [RFC3550], and any appropriate RTP profile. subject to the security considerations discussed in the RTP
This implies that confidentiality of the media streams is achieved by encryption. Encryption specification [RFC 3550], and any appropriate RTP profile. This implies
may be performed after compression so there is no conflict between the two operations. that confidentiality of the media streams is achieved by encryption.
Encryption may be performed after compression so there is no conflict
between the two operations.
This payload format does not exhibit any significant non-uniformity in the receiver side This payload format does not exhibit any significant non-uniformity in
computational complexity for packet processing, and thus is unlikely to pose a denial-of-service the receiver side computational complexity for packet processing, and
threat due to the receipt of pathological data. thus is unlikely to pose a denial-of-service threat due to the receipt
of pathological data.
7. IANA Considerations 7. IANA Considerations
One media type has been defined and needs registration in the media types registry. One media type has been defined and needs registration in the media
types registry.
8. Normative References 8. Normative References
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[2] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
"RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.
[3] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC
4566, July 2006.
9. Author's Information [RFC 3550] Schulzrinne, H., Casner, S., Frederick, R., and
V. Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003.
Sergey Ikonin [RFC 4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.
Russia 9. Author(s) Information
109004 B.Kommunisticheskaya st. 27
Sergey Ikonin email: ikonin@spiritdsp.com
Russia 109004
Building 27, A. Solgenizyn street
Tel: +7 495 661-2178 Tel: +7 495 661-2178
Fax: +7 495 912-6786 Fax: +7 495 912-6786
Email: ikonin@spiritdsp.com
10. Expiration date 10. Disclaimer
This Internet-Draft will expire on August 25, 2009. This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November 10,
2008. The person(s) controlling the copyright in some of this material
may not have granted the IETF Trust the right to allow modifications of
such material outside the IETF Standards Process. Without obtaining an
adequate license from the person(s) controlling the copyright in such
materials, this document may not be modified outside the IETF Standards
Process, and derivative works of it may not be created outside the IETF
Standards Process, except to format it for publication as an RFC or to
translate it into languages other than English.
11. Legal Terms 11. Legal Terms
All IETF Documents and the information contained therein are provided on an "AS IS" basis and
THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED All IETF Documents and the information contained therein are provided on
BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION THEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED INFORMATION THEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
The IETF Trust takes no position regarding the validity or scope of any Intellectual Property The IETF Trust takes no position regarding the validity or scope of any
Rights or other rights that might be claimed to pertain to the implementation or use of the Intellectual Property Rights or other rights that might be claimed to
technology described in any IETF Document or the extent to which any license under such rights pertain to the implementation or use of the technology described in any
might or might not be available; nor does it represent that it has made any independent effort to IETF Document or the extent to which any license under such rights might
identify any such rights. or might not be available; nor does it represent that it has made any
independent effort to identify any such rights.
Copies of Intellectual Property disclosures made to the IETF Secretariat and any assurances of Copies of Intellectual Property disclosures made to the IETF Secretariat
licenses to be made available, or the result of an attempt made to obtain a general license or and any assurances of licenses to be made available, or the result of an
permission for the use of such proprietary rights by implementers or users of this specification attempt made to obtain a general license or permission for the use of
can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. such proprietary rights by implementers or users of this specification
can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent The IETF invites any interested party to bring to its attention any
applications, or other proprietary rights that may cover technology that may be required to copyrights, patents or patent applications, or other proprietary rights
implement any standard or specification contained in an IETF Document. Please address the that may cover technology that may be required to implement any standard
or specification contained in an IETF Document. Please address the
information to the IETF at ietf-ipr@ietf.org. information to the IETF at ietf-ipr@ietf.org.
The definitive version of an IETF Document is that published by, or under the auspices of, the The definitive version of an IETF Document is that published by, or
IETF. Versions of IETF Documents that are published by third parties, including those that are under the auspices of, the IETF. Versions of IETF Documents that are
translated into other languages, should not be considered to be definitive versions of IETF published by third parties, including those that are translated into
Documents. The definitive version of these Legal Provisions is that published by, or under the other languages, should not be considered to be definitive versions of
auspices of, the IETF. Versions of these Legal Provisions that are published by third parties, IETF Documents. The definitive version of these Legal Provisions is that
including those that are translated into other languages, should not be considered to be definitive published by, or under the auspices of, the IETF. Versions of these
versions of these Legal Provisions. Legal Provisions that are published by third parties, including those
that are translated into other languages, should not be considered to be
definitive versions of these Legal Provisions.
For the avoidance of doubt, each Contributor to the IETF Standards Process licenses each For the avoidance of doubt, each Contributor to the IETF Standards
Contribution that he or she makes as part of the IETF Standards Process to the IETF Trust Process licenses each Contribution that he or she makes as part of the
pursuant to the provisions of RFC 5378. No language to the contrary, or terms, conditions or IETF Standards Process to the IETF Trust pursuant to the provisions of
rights that differ from or are inconsistent with the rights and licenses granted under RFC 5378, RFC 5378. No language to the contrary, or terms, conditions or rights
shall have any effect and shall be null and void, whether published or posted by such that differ from or are inconsistent with the rights and licenses
Contributor, or included with or in such Contribution. granted under RFC 5378, shall have any effect and shall be null and
void, whether published or posted by such Contributor, or included with
or in such Contribution.
 End of changes. 89 change blocks. 
233 lines changed or deleted 293 lines changed or added

This html diff was produced by rfcdiff 1.35. The latest version is available from http://tools.ietf.org/tools/rfcdiff/