draft-ietf-avt-rtp-ipmr-09.txt   draft-ietf-avt-rtp-ipmr-10.txt 
Audio/Video Transport Working Group S. Ikonin Audio/Video Transport Working Group S. Ikonin
Internet Draft SPIRIT DSP Internet Draft SPIRIT DSP
Intended status: Informational September 28, 2009 Intended status: Informational October 05, 2009
RTP Payload Format for IP-MR Speech Codec draft-ietf-avt-rtp-ipmr-09.txt RTP Payload Format for IP-MR Speech Codec draft-ietf-avt-rtp-ipmr-10.txt
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Copyright (c) 2009 IETF Trust and the persons identified as the document Copyright (c) 2009 IETF Trust and the persons identified as the document
authors. All rights reserved. authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions This document is subject to BCP 78 and the IETF Trust's Legal Provisions
skipping to change at page 1, line 40 skipping to change at page 1, line 40
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress." or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on March 28, 2010. This Internet-Draft will expire on April 05, 2010.
Abstract Abstract
This document specifies the payload format for packetization of SPIRIT This document specifies the payload format for packetization of SPIRIT
IP-MR encoded speech signals into the Real-time Transport Protocol IP-MR encoded speech signals into the Real-time Transport Protocol
(RTP). The payload format supports transmission of multiple frames per (RTP). The payload format supports transmission of multiple frames per
payload and introduced redundancy for robustness against packet loss. payload and introduced redundancy for robustness against packet loss.
Table of Contents Table of Contents
skipping to change at page 6, line 30 skipping to change at page 6, line 30
redundancy data only. The value 6 is reserved. If receiving this value redundancy data only. The value 6 is reserved. If receiving this value
the packet SHOULD be discarded. the packet SHOULD be discarded.
o BR (3 bits): base rate for core layer of frame(s) in this packet o BR (3 bits): base rate for core layer of frame(s) in this packet
using the table for CR. Values in the range 0-5 indicate bitrates using the table for CR. Values in the range 0-5 indicate bitrates
for core layer, same as for packet SHOULD be discarded. The base for core layer, same as for packet SHOULD be discarded. The base
rate is the lowest rate for scalability, so speech payload can rate is the lowest rate for scalability, so speech payload can
be scaled down not lower than BR value. If a received packet has be scaled down not lower than BR value. If a received packet has
BR > CR then during decoding it will be assumed that BR = CR. BR > CR then during decoding it will be assumed that BR = CR.
o D (1 bit): indicates if the DTX mode is active or not. This o D (1 bit): reserved. Must be always set to 1.
parameter is retained for backward interoperability with previous Previously, this bit indicated DTX mode availability, but in fact
codec releases and required for payload parsing. The payload duplicates this information.
decoder implementation MUST always include DTX mode
support and update internal states properly. The decoder cannot
assume that DTX will be constantly inactive during a session.
o A (1 bit): reserved. Must be always set to 1. o A (1 bit): reserved. Must be always set to 1.
Previously, this bit indicated aligned mode, but this mode has
never been used and was always set to 1.
o GR (2 bits): number of frames in packet (grouping size). Actual o GR (2 bits): number of frames in packet (grouping size). Actual
grouping size is GR + 1, thus maximum grouping supported is 4. grouping size is GR + 1, thus maximum grouping supported is 4.
o R (1 bit): redundancy presence bit. If R=1 then the packet o R (1 bit): redundancy presence bit. If R=1 then the packet
contains redundancy information for lost packets recovery. contains redundancy information for lost packets recovery.
In this case after speech data the redundancy section is present. In this case after speech data the redundancy section is present.
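The field rules listed above (reserved CR value, BR clamped to CR, grouping size GR + 1, redundancy flag R) can be illustrated with a minimal C sketch. It is not taken from either draft revision: the struct and function names are hypothetical, and the fields are assumed to have been extracted from the payload header already, since the bit layout is not shown in this excerpt. CR values other than the reserved value 6 are passed through unchanged; any further handling they need is outside this sketch.

```c
#include <stdbool.h>

/* Hypothetical container for header fields that were already
 * extracted from the payload header (bit positions not assumed). */
typedef struct {
    unsigned cr;  /* coding rate index                  */
    unsigned br;  /* base (core layer) rate index       */
    unsigned gr;  /* grouping field, 2 bits (0..3)      */
    unsigned r;   /* redundancy presence bit (0 or 1)   */
} IpmrHeaderFields;

/* Hypothetical result of interpreting the fields. */
typedef struct {
    unsigned base_rate;      /* effective base rate after clamping */
    unsigned frames;         /* actual grouping size, GR + 1       */
    bool     has_redundancy; /* redundancy section follows speech  */
} IpmrHeaderInfo;

/* Returns false when the packet should be discarded. */
bool ipmr_interpret_header(const IpmrHeaderFields *f, IpmrHeaderInfo *out)
{
    /* The value 6 is reserved; such a packet SHOULD be discarded. */
    if (f->cr == 6) {
        return false;
    }

    /* If BR > CR, decoding assumes BR = CR. */
    out->base_rate = (f->br > f->cr) ? f->cr : f->br;

    /* Actual grouping size is GR + 1, so at most 4 frames. */
    out->frames = (f->gr & 0x3u) + 1u;

    /* R = 1 means a redundancy section follows the speech data. */
    out->has_redundancy = (f->r & 0x1u) != 0u;

    return true;
}
```

For example, fields CR = 3, BR = 5, GR = 1, R = 1 would be interpreted as base rate 3, two frames per packet, with a redundancy section present.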
3.4. Speech Table of Contents 3.4. Speech Table of Contents
skipping to change at page 17, line 26 skipping to change at page 17, line 26
get_frame_info.c get_frame_info.c
Retrieving frame information for IP-MR Speech Codec Retrieving frame information for IP-MR Speech Codec
******************************************************************/ ******************************************************************/
#define RATES_NUM 6 // number of codec rates #define RATES_NUM 6 // number of codec rates
#define SENSE_CLASSES 6 // number of sensitivity classes (A..F) #define SENSE_CLASSES 6 // number of sensitivity classes (A..F)
// frame types // frame types
#define FT_DTX_SPEECH 0 // active speech in DTX mode #define FT_SPEECH 0 // active speech
#define FT_DTX_SID 1 // silence insertion descriptor #define FT_DTX_SID 1 // silence insertion descriptor
#define FT_NO_DTX 2 // no DTX frame
// get specified bit from coded data // get specified bit from coded data
int GetBit(unsigned char *data, int curBit) int GetBit(unsigned char *data, int curBit)
{ {
return ((data[curBit >> 3] >> (curBit % 8)) & 1); return ((data[curBit >> 3] >> (curBit % 8)) & 1);
} }
// retrieve frame information // retrieve frame information
int GetFrameInfo( // o: frame size in bits int GetFrameInfo( // o: frame size in bits
short rate, // i: encoding rate (0..5) short rate, // i: encoding rate (0..5)
short base_rate, // i: base (core) layer rate, short base_rate, // i: base (core) layer rate,
// if base_rate > rate, then assumed // if base_rate > rate, then assumed
// that base_rate = rate. // that base_rate = rate.
short allow_DTX, // i: flag of DTX mode
unsigned char *pCoded, // i: coded bit frame unsigned char *pCoded, // i: coded bit frame
short pLayerBits // o: number of bits in layers short pLayerBits // o: number of bits in layers
[RATES_NUM], [RATES_NUM],
short pSenseBits // o: number of bits in sensitivity classes short pSenseBits // o: number of bits in sensitivity classes
[SENSE_CLASSES], [SENSE_CLASSES],
short *nLayers // o: number of layers short *nLayers // o: number of layers
) )
{ {
static const short Bits_1[4] = {0, 9, 9, 15}; static const short Bits_1[4] = {0, 9, 9, 15};
static const short Bits_2[16] = { 43,50,36,31,46,48,40,44,47,43,44, static const short Bits_2[16] = { 43,50,36,31,46,48,40,44,47,43,44,
skipping to change at page 18, line 21 skipping to change at page 18, line 21
if (rate < 0 || rate > 5) { if (rate < 0 || rate > 5) {
return 0; // incorrect stream return 0; // incorrect stream
} }
for(i = 0; i < SENSE_CLASSES; i++) { for(i = 0; i < SENSE_CLASSES; i++) {
pSenseBits[i] = 0; pSenseBits[i] = 0;
} }
nBits = 0; nBits = 0;
// extract frame type bit if required // extract frame type bit if required
if (allow_DTX) { FrType = GetBit(pCoded, nBits++) ? FT_SPEECH : FT_DTX_SID;
FrType = GetBit(pCoded, nBits++) ? FT_DTX_SPEECH : FT_DTX_SID;
} else {
FrType = FT_NO_DTX;
}
{ {
int cw_0; int cw_0;
int b[14]; int b[14];
// extract meaning bits // extract meaning bits
for(i = 0 ; i < 14; i++) { for(i = 0 ; i < 14; i++) {
b[i] = GetBit(pCoded, nBits++); b[i] = GetBit(pCoded, nBits++);
} }
// parse // parse
skipping to change at page 19, line 11 skipping to change at page 19, line 11
cw_2 = (cw_2 << 1) | b[1]; cw_2 = (cw_2 << 1) | b[1];
cw_2 = (cw_2 << 1) | b[3]; cw_2 = (cw_2 << 1) | b[3];
cw_2 = (cw_2 << 1) | b[5]; cw_2 = (cw_2 << 1) | b[5];
cw_2 = (cw_2 << 1) | b[7]; cw_2 = (cw_2 << 1) | b[7];
cw_0 = (b[10]<<0)|(b[11]<<1)|(b[12]<<2)|(b[13]<<3); cw_0 = (b[10]<<0)|(b[11]<<1)|(b[12]<<2)|(b[13]<<3);
if (base_rate < 0) base_rate = 0; if (base_rate < 0) base_rate = 0;
if (base_rate > rate) base_rate = rate; if (base_rate > rate) base_rate = rate;
idx = base_rate == 0 ? 0 : 1; idx = base_rate == 0 ? 0 : 1;
pSenseBits[0] = (FrType == FT_DTX_SPEECH ? 1:0)+14+Bits_2[cw_0]; pSenseBits[0] = 15+Bits_2[cw_0];
pSenseBits[1] = Bits_1[(cw_1 >> 0)&0x3] + Bits_1[(cw_1>>2)&0x3]; pSenseBits[1] = Bits_1[(cw_1 >> 0)&0x3] + Bits_1[(cw_1>>2)&0x3];
pSenseBits[2] = nFlag_1*5; pSenseBits[2] = nFlag_1*5;
pSenseBits[3] = nFlag_2*30; pSenseBits[3] = nFlag_2*30;
pSenseBits[5] = (4 - nFlag_2)*(Bits_3[idx][0]); pSenseBits[5] = (4 - nFlag_2)*(Bits_3[idx][0]);
for (i = 1; i < rate+1; i++) { for (i = 1; i < rate+1; i++) {
pLayerBits[i] = 4*(Bits_3[idx][i]); pLayerBits[i] = 4*(Bits_3[idx][i]);
} }
} }
 End of changes. 10 change blocks. 
18 lines changed or deleted 12 lines changed or added
