draft-ietf-avt-rtp-g718-03.txt   draft-ietf-avt-rtp-g718-04.txt 
Audio/Video Transport WG Ari Lakaniemi
Internet Draft Nokia
Intended status: Standards track Ye-Kui Wang
Expires: October 2010 Huawei Technologies
April 22, 2010
RTP payload format for G.718 speech/audio Network Working Group G. Zorn, Ed.
draft-ietf-avt-rtp-g718-03.txt Internet-Draft Network Zen
Intended status: Standards Track Y. Wang
Expires: June 11, 2011 Huawei Technologies
A. Lakaniemi
Nokia
December 8, 2010
Status of this Memo RTP Payload Format for G.718 Speech/audio
draft-ietf-avt-rtp-g718-04.txt
This Internet-Draft is submitted to IETF in full conformance with the Abstract
This document specifies the Real-Time Transport Protocol (RTP)
payload format for the Embedded Variable Bit-Rate (EV-VBR) speech/
audio codec, specified in ITU-T G.718. A media type registration for
this RTP payload format is also included.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF). Note that other groups may also distribute
other groups may also distribute working documents as Internet-Drafts. working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at This Internet-Draft will expire on June 11, 2011.
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on October 22, 2010.
Copyright Notice Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the BSD License. described in the Simplified BSD License.
Abstract
This document specifies the Real-Time Transport Protocol (RTP)
payload format for the Embedded Variable Bit-Rate (EV-VBR)
speech/audio codec, specified in ITU-T G.718. A media type
registration for this RTP payload format is also included.
Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
Table of Contents Table of Contents
1. Introduction...................................................3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Background.....................................................3 2. Requirements Language . . . . . . . . . . . . . . . . . . . . 3
2.1. The G.718 codec...........................................3 3. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2. Benefits of layered design................................5 3.1. The G.718 Codec . . . . . . . . . . . . . . . . . . . . . 3
2.3. Transmitting layered data.................................5 3.2. Benefits of Layered Design . . . . . . . . . . . . . . . . 5
2.4. Scaling scenarios & rate control..........................6 3.3. Transmitting Layered Data . . . . . . . . . . . . . . . . 5
3. G.718 RTP payload format.......................................7 3.4. Scaling Scenarios and Rate Control . . . . . . . . . . . . 6
3.1. Payload Structure.........................................7 4. G.718 RTP Payload Format . . . . . . . . . . . . . . . . . . . 7
3.1.1. Payload Header.......................................7 4.1. Payload Structure . . . . . . . . . . . . . . . . . . . . 7
3.1.2. G.718 transport blocks...............................8 4.1.1. Payload Header . . . . . . . . . . . . . . . . . . . . 7
3.2. Handling the Encoded data................................11 4.1.2. G.718 Transport Blocks . . . . . . . . . . . . . . . . 7
3.3. G.718 scaling............................................13 4.2. Handling The Encoded Data . . . . . . . . . . . . . . . . 10
3.4. CRC verification.........................................14 4.3. G.718 Scaling . . . . . . . . . . . . . . . . . . . . . . 12
3.5. G.718 session............................................14 4.4. CRC Verification . . . . . . . . . . . . . . . . . . . . . 12
3.6. Cross-stream/cross-layer timing synchronization..........14 4.5. G.718 Session . . . . . . . . . . . . . . . . . . . . . . 13
3.7. RTP Header usage.........................................15 4.6. Cross-stream/Cross-layer Timing Synchronization . . . . . 13
4. Payload Format Parameters.....................................15 4.7. RTP Header Usage . . . . . . . . . . . . . . . . . . . . . 13
4.1. Media Type Registration..................................15 5. Payload Format Parameters . . . . . . . . . . . . . . . . . . 14
4.2. Mapping to SDP Parameters................................17 5.1. Media Type Registration . . . . . . . . . . . . . . . . . 14
4.3. Offer/answer considerations..............................18 5.2. Mapping to SDP Parameters . . . . . . . . . . . . . . . . 16
4.4. Declarative usage of SDP.................................18 5.3. Offer/Answer Considerations . . . . . . . . . . . . . . . 16
4.5. SDP examples.............................................18 5.4. Declarative Usage of SDP . . . . . . . . . . . . . . . . . 17
5. Security Considerations.......................................20 5.5. SDP Examples . . . . . . . . . . . . . . . . . . . . . . . 17
6. Congestion control............................................21 5.5.1. Example 1 . . . . . . . . . . . . . . . . . . . . . . 17
7. IANA Considerations...........................................22 5.5.2. Example 2 . . . . . . . . . . . . . . . . . . . . . . 17
APPENDIX A: Payload examples.....................................23 5.5.3. Example 3 . . . . . . . . . . . . . . . . . . . . . . 18
A.1. Simple payload examples..................................23 6. Congestion Control . . . . . . . . . . . . . . . . . . . . . . 19
A.1.1. All the layers in the same payload..................23 7. Security Considerations . . . . . . . . . . . . . . . . . . . 19
A.1.2. Layers in separate RTP streams......................24 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20
A.2. Advanced examples........................................25 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 20
A.2.1. Different update rate for subset of layers..........25 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20
A.2.2. Redundant frames with limited set of layers.........26 10.1. Normative References . . . . . . . . . . . . . . . . . . . 20
8. References....................................................28 10.2. Informative References . . . . . . . . . . . . . . . . . . 21
8.1. Normative References.....................................28 Appendix A. Payload Examples . . . . . . . . . . . . . . . . . . 22
8.2. Informative References...................................29 A.1. Simple Payload Examples . . . . . . . . . . . . . . . . . 22
Author's Addresses...............................................30 A.1.1. All The Layers in The Same Payload . . . . . . . . . . 22
Acknowledgment...................................................30 A.1.2. Layers in Separate RTP Streams . . . . . . . . . . . . 23
9. Open Issues...................................................30 A.2. Advanced Examples . . . . . . . . . . . . . . . . . . . . 24
10. Changes Log..................................................31 A.2.1. Different Update Rate for Subset of Layers . . . . . . 24
A.2.2. Redundant Frames With Limited Set of Layers . . . . . 25
1. Introduction 1. Introduction
The International Telecommunication Union (ITU-T) Recommendation The International Telecommunication Union (ITU-T) Recommendation
G.718 [G.718] specifies the Embedded Variable Bit Rate (EV-VBR) G.718 [ITU.G718.2008] specifies the Embedded Variable Bit Rate (EV-
speech/audio codec. This document specifies the Real-time Transport VBR) speech/audio codec. This document specifies the Real-time
Protocol (RTP) [RFC3550] payload format for this codec. Transport Protocol (RTP) [RFC3550] payload format for this codec.
2. Background 2. Requirements Language
2.1. The G.718 codec The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
3. Background
3.1. The G.718 Codec
G.718 is an embedded variable rate speech codec having a layered G.718 is an embedded variable rate speech codec having a layered
design. The bitstream of the G.718 core codec consists of a core design. The bitstream of the G.718 core codec consists of a core
layer, denoted as L1, and four enhancement layers, denoted as L2-L5. layer, denoted as L1, and four enhancement layers, denoted as L2-L5.
The bit-rates of the G.718 core codec range from 8 kbit/s (core layer The bit-rates of the G.718 core codec range from 8 kbit/s (core layer
only) to 32 kbit/s (with all layers up to L5). Furthermore, the G.718 only) to 32 kbit/s (with all layers up to L5). Furthermore, the
codec supports also discontinuous transmission (DTX) and comfort G.718 codec also supports discontinuous transmission (DTX) and
noise generation (CNG) by sending Silence Descriptor (SID) frames comfort noise generation (CNG) by sending Silence Descriptor (SID)
during periods of non-active input signal, resulting in a reduced frames during periods of non-active input signal, resulting in a
bit-rate. The sampling frequency of the core codec is 16 kHz and the reduced bit-rate. The sampling frequency of the core codec is 16 kHz
codec operates on 20 ms frames. The G.718 codec is also capable of and the codec operates on 20 ms frames. The G.718 codec is also
narrowband operation with audio input and/or output at 8 kHz sampling capable of narrowband operation with audio input and/or output at 8
frequency. kHz sampling frequency.
While transmitting/receiving the core layer L1 is enough for While transmitting/receiving the core layer L1 is enough for
successful decoding of the audio content, each of the enhancement successful decoding of the audio content, each of the enhancement
layers Ln (n being 2 to 5, inclusive) provides an improvement to layers Ln (n being 2 to 5, inclusive) provides an improvement to
reconstructed audio quality. Thus, the core layer ensures the basic reconstructed audio quality. Thus, the core layer ensures the basic
communication while the enhancement layers can be used to improve the communication while the enhancement layers can be used to improve the
perceptual quality. Furthermore, enhancement layers are dependent on perceptual quality. Furthermore, enhancement layers are dependent on
all the lower layers in the sense that successful decoding of layer Ln all the lower layers in the sense that successful decoding of layer Ln
also requires all the layers Lm with m<n to be available. also requires all the layers Lm with m<n to be available.
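The dependency rule above can be illustrated with a minimal sketch. The function name and the representation of received layers as a set of integers (1 for L1 through 5 for L5) are assumptions made for illustration only; they are not part of this specification.

```python
def highest_decodable_layer(received):
    """Return the largest n such that L1..Ln were all received.

    `received` is a set of layer numbers (hypothetical representation).
    Returns 0 when the core layer L1 is missing, since no enhancement
    layer can be decoded without every layer below it.
    """
    n = 0
    while n < 5 and (n + 1) in received:
        n += 1
    return n


# L4 arrived but L3 did not, so only L1 and L2 are usable.
print(highest_decodable_layer({1, 2, 4}))
```

In this sketch a received L4 is simply ignored when L3 is absent, which mirrors the requirement that all layers Lm with m<n be available.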
The sizes, sampling rates and possible outputs of the G.718 core The sizes, sampling rates and possible outputs of the G.718 core
codec layers L1-L5 are summarized in Table 1 below, where "Bytes" codec layers L1-L5 are summarized in Table 1 below, where the "Bytes"
column indicates the number of bytes per encoded data unit for a column indicates the number of bytes per encoded data unit for a
layer, NB and WB denotes narrowband and wideband, respectively. The layer. NB and WB denote narrowband and wideband, respectively. The
"Bytes" column in other tables has the same meaning. Note that for "Bytes" column in other tables has the same meaning. Note that for
layers L1 and L2, the corresponding output may either be NB or WB, layers L1 and L2, the corresponding output may either be NB or WB,
depending on the rendering device and the application requirement, depending on the rendering device and the application requirement,
regardless of the sampling rate of the encoded data. regardless of the sampling rate of the encoded data.
Table 1: G.718 layers Table 1: G.718 Layers
Layer Bytes Cumulative bit-rate Sampling rate Output Layer Bytes Cumulative bit-rate Sampling rate Output
-----------------------------------------------------------------_ ----------------------------------------------------------------
L1 20 8 kbit/s 8 or 16 kHz NB or WB L1' 32 12.8 kbit/s 16 kHz WB
L2 10 12 kbit/s 8 or 16 kHz NB or WB L3' 9 16.4 kbit/s 16 kHz WB
L3 10 16 kbit/s 16 kHz WB L4 20 24.4 kbit/s 16 kHz WB
L4 20 24 kbit/s 16 kHz WB L5 20 32.4 kbit/s 16 kHz WB
L5 20 32 kbit/s 16 kHz WB
The G.718 codec includes also an operating mode that is compatible The G.718 codec also includes an operating mode that is compatible
with the Adaptive Multi-Rate Wideband (AMR-WB) codec [AMR-WB], for with the Adaptive Multi-Rate Wideband (AMR-WB) codec [AMR-WB], for
which the RTP payload format is specified in [RFC4867]. In this AMR- which the RTP payload format is specified in [RFC4867]. In this AMR-
WB interoperable mode, layers L1, L2 are replaced by L1' consisting WB interoperable mode, layers L1 and L2 are replaced by L1'
of AMR-WB encoded data. Furthermore, together with L1' modified L3' consisting of AMR-WB encoded data. Furthermore, together with L1',
is used instead of L3. The usage of layers L4 and L5 is not affected modified L3' is used instead of L3. The usage of layers L4 and L5 is
by transmitting AMR-WB data in the lower layers. If layer L3' is not affected by transmitting AMR-WB data in the lower layers. If
present in the encoded bit-stream, the base layer L1' must use the layer L3' is present in the encoded bit-stream, the base layer L1'
AMR-WB mode 2 with the bit-rate of 12.65 kbit/s. Otherwise (the must use the AMR-WB mode 2 with a bit-rate of 12.65 kbit/s.
encoded bit-stream contains only the L1' layer), any of the 9 AMR-WB Otherwise (the encoded bit-stream contains only the L1' layer), any
coding modes 0, 1, 2, 3, 4, 5, 6, 7, and 8, corresponding to the bit- of the 9 AMR-WB coding modes 0, 1, 2, 3, 4, 5, 6, 7, and 8, corresponding
rates of 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, and to the bit-rates of 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85,
23.85 kbit/s, respectively, may be in use. Table 2 summarizes the 23.05, and 23.85 kbit/s, respectively, may be in use. Table 2
AMR-WB interoperable mode when more than one layer may be present. summarizes the AMR-WB interoperable mode when more than one layer may
be present.
Table 2: G.718 layers in the AMR-WB interoperable mode Table 2: G.718 layers in the AMR-WB interoperable mode
Layer Bytes Cumulative bit-rate Sampling rate Output Layer Bytes Cumulative bit-rate Sampling rate Output
-----------------------------------------------------------------_ ----------------------------------------------------------------
L1' 32 12.8 kbit/s 16 kHz WB L1' 32 12.8 kbit/s 16 kHz WB
L3' 9 16.4 kbit/s 16 kHz WB L3' 9 16.4 kbit/s 16 kHz WB
L4 20 24.4 kbit/s 16 kHz WB L4 20 24.4 kbit/s 16 kHz WB
L5 20 32.4 kbit/s 16 kHz WB L5 20 32.4 kbit/s 16 kHz WB
Note that the bit-rate for the raw bit-stream of AMR-WB mode 2 is Note that the bit-rate for the raw bit-stream of AMR-WB mode 2 is
12.65 kbit/s. However, after counting the padding bits to make each 12.65 kbit/s. However, after counting the padding bits to make each
encoded data unit byte-aligned, as in the octet-aligned mode encoded data unit byte-aligned, as in the octet-aligned mode
specified in [RFC4867], the resulting bit-rate is 12.8 kbit/s. specified in [RFC4867], the resulting bit-rate is 12.8 kbit/s.
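The cumulative bit-rates in Table 1 and Table 2 follow directly from the per-layer byte counts and the 20 ms frame length. The sketch below reproduces that arithmetic. The names used are illustrative assumptions, and the figure of 253 bits per AMR-WB mode 2 frame is derived here from 12.65 kbit/s over 20 ms rather than quoted from this document.

```python
FRAME_MS = 20  # G.718 operates on 20 ms frames


def cumulative_kbps(layer_bytes):
    """Cumulative bit-rate in kbit/s after each layer.

    `layer_bytes` lists the bytes per encoded data unit, lowest layer first.
    Bits per millisecond is numerically equal to kbit/s.
    """
    rates, total = [], 0
    for size in layer_bytes:
        total += size
        rates.append(total * 8 / FRAME_MS)
    return rates


TABLE_1 = [20, 10, 10, 20, 20]  # L1, L2, L3, L4, L5
TABLE_2 = [32, 9, 20, 20]       # L1', L3', L4, L5 (AMR-WB interoperable mode)

print(cumulative_kbps(TABLE_1))
print(cumulative_kbps(TABLE_2))

# Padding of AMR-WB mode 2: 253 raw bits round up to whole bytes.
raw_bits = 253                     # derived: 12.65 kbit/s * 20 ms
padded_bytes = -(-raw_bits // 8)   # ceiling division
print(padded_bytes, padded_bytes * 8 / FRAME_MS)
```

The last line shows why L1' is listed with 32 bytes and 12.8 kbit/s even though the raw AMR-WB mode 2 stream runs at 12.65 kbit/s.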
In the AMR-WB interoperable mode, when the base layer L1' is In the AMR-WB interoperable mode, when the base layer L1' is
transported in its own RTP packet stream, the packetisation specified transported in its own RTP packet stream, the packetisation specified
in [RFC4867] MUST be used, to enable legacy RFC4867 receivers to in [RFC4867] MUST be used, to enable legacy RFC4867 receivers to
receive the base layer L1'. receive the base layer L1'.
ITU-T SG16 is currently working on a set of extension layers in order ITU-T SG16 is currently working on a set of extension layers in order
to provide a so-called super-wideband (SWB) audio and stereophonic to provide so-called super-wideband (SWB) audio and stereophonic
encoding extensions on top of the G.718 core codec. Further details encoding extensions on top of the G.718 core codec. Further details
and the usage of these layers are TBD. and the usage of these layers are undetermined at this time.
The main application of the G.718 codec is telephony. Other expected The main application of the G.718 codec is telephony. Other expected
applications include audio/video conferencing and streaming. applications include audio/video conferencing and streaming.
2.2. Benefits of layered design 3.2. Benefits of Layered Design
The layered design enables simple scalability of the transmitted Layered design enables simple scalability of the transmitted stream
stream simply by conveying a suitable number of layers. The number of simply by conveying a suitable number of layers. The number of
layers used in a session may be selected for example based on the layers used in a session may be selected for example based on the
capacity of the transmission channel, current transmission conditions, capacity of the transmission channel, current transmission
characteristics of the source signal or available processing capacity. conditions, characteristics of the source signal or available
processing capacity.
Another obvious benefit of the layered codec design is the Another obvious benefit of the layered codec design is the
possibility to exploit the scalability to support congestion control possibility to exploit the scalability to support congestion control
by transmitting/dropping some of the (higher) enhancement layers in by transmitting/dropping some of the (higher) enhancement layers in
order to alleviate congestion in the network. See more detailed order to alleviate congestion in the network. See more detailed
discussion on the congestion control in section 6. discussion on the congestion control in section 6.
Furthermore, the layered design also implicitly provides the possibility Furthermore, the layered design also implicitly provides the possibility
for unequal error detection/protection by employing different levels for unequal error detection/protection by employing different levels
of protection on the core layer and enhancement layers. of protection on the core layer and enhancement layers.
2.3. Transmitting layered data 3.3. Transmitting Layered Data
In principle there are two basic approaches to carry the data from a In principle there are two basic approaches to carry the data from a
layered encoder: layered encoder:
1. All the layers are carried within a single RTP session. 1. All the layers are carried within a single RTP session
2. The encoded data is divided over multiple RTP sessions, each 2. The encoded data is divided over multiple RTP sessions, each
session carrying a subset of layers. This is also referred to as session carrying a subset of layers. This is also referred to as
Multi-Session Transmission (MST). Multi-Session Transmission (MST)
The first choice is the most efficient in terms of exploitation of The first choice is the most efficient in terms of exploitation of
transmission bandwidth. Furthermore, using only one packet to carry transmission bandwidth. Furthermore, using only one packet to carry
all encoded data layers of a frame requires less resources also from all encoded data layers of a frame requires less resources also from
the end-systems (and intermediate systems) since the number of the end-systems (and intermediate systems) since the number of
packets is kept at minimum and only single RTP packet stream needs to packets is kept at minimum and only single RTP packet stream needs to
be handled. However, this option requires any intermediate network be handled. However, this option requires any intermediate network
element performing the scaling operation to be fully media-aware element performing the scaling operation to be fully media-aware
since removing encoded layers requires modification of the payload. since removing encoded layers requires modification of the payload.
Furthermore, the intermediate network element needs to be within the Furthermore, the intermediate network element needs to be within the
security context to enable the meaningful manipulation of the payload, security context to enable the meaningful manipulation of the
in case secure transport is employed. This might not be feasible in payload, in case secure transport is employed. This might not be
all systems/scenarios, but some special-purpose devices such as feasible in all systems/scenarios, but some special-purpose devices
media gateways in cellular telephone systems may be able to implement such as media gateways in cellular telephone systems may be able
this kind of media-aware functionality. to implement this kind of media-aware functionality.
The second alternative transmitting selected subsets of layers in The second alternative, transmitting selected subsets of layers in
separate RTP sessions facilitates simple scalability in intermediate separate RTP sessions, facilitates simple scalability in intermediate
network elements without the requirement of being fully media-aware. network elements without the requirement of being fully media-aware.
One use case of this alternative is layered multicast [McCanne]. On One use case of this alternative is layered multicast [McCanne]. On
the other hand, this approach introduces separate packet header the other hand, this approach introduces separate packet header
overhead for each subset of layers for those low-delay application overhead for each subset of layers for those low-delay application
scenarios wherein aggregation of data from multiple frames is not scenarios wherein aggregation of data from multiple frames is not
ideal. In this case, when the size of the encoded data block per ideal. In this case, when the size of the encoded data block per
single layer is in the range of 10 to 20 bytes, the packetisation may single layer is in the range of 10 to 20 bytes, the packetisation may
result in relatively high amount of protocol overhead, which might be result in relatively high amount of protocol overhead, which might be
an expensive solution on bandwidth-limited links. Another drawback of an expensive solution on bandwidth-limited links. Another drawback
this approach is somewhat more complex session setup and the of this approach is somewhat more complex session setup and the
additional complexity associated with handling of several concurrent additional complexity associated with handling of several concurrent
RTP sessions. However, this is a trade-off that enables simple RTP sessions. However, this is a trade-off that enables simple
scalability also by intermediate network elements that are not aware scalability also by intermediate network elements that are not aware
of the details of the transmitted media. of the details of the transmitted media.
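The header overhead trade-off described above can be made concrete with a small calculation. This is a rough sketch under stated assumptions: IPv4 (20 bytes), UDP (8 bytes) and a fixed RTP header (12 bytes) with no header compression, and the G.718 payload header and transport block header bytes are ignored for simplicity. The function name is hypothetical.

```python
# Assumption: IPv4 + UDP + RTP fixed header, no compression or extensions.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def overhead_ratio(payload_bytes, packets=1):
    """Fraction of transmitted bytes spent on IP/UDP/RTP headers."""
    headers = IP_UDP_RTP_HEADER_BYTES * packets
    return headers / (headers + payload_bytes)


layers = [20, 10, 10]  # L1, L2, L3 sizes in bytes per 20 ms frame

# All three layers of one frame in a single packet.
print(overhead_ratio(sum(layers)))

# Each layer in its own RTP session, one packet per layer per frame.
print(overhead_ratio(sum(layers), packets=len(layers)))
```

Under these assumptions, splitting L1-L3 over three sessions raises the header share of the transmitted bytes from one half to three quarters, which illustrates why multi-session transmission can be costly on bandwidth-limited links.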
2.4. Scaling scenarios & rate control 3.4. Scaling Scenarios and Rate Control
In principle there are three different ways to make use of the In principle there are three different ways to make use of the
layered design to control the bandwidth usage: layered design to control the bandwidth usage:
1. A sender decides to change the number of layers it is transmitting 1. A sender decides to change the number of layers it is
(for example due to congestion control constrains) transmitting (for example due to congestion control constraints)
2. A receiver or an intermediate network element instructs a sender 2. A receiver or an intermediate network element instructs a sender
to change the number of layers it is transmitting to change the number of layers it is transmitting
3. An intermediate network element passes forward only a subset of 3. An intermediate network element passes through only a subset of
layers it receives layers it receives
The most appropriate mechanism depends on the application and the The most appropriate mechanism depends on the application and the
employed network topology. For example point-to-point conversational employed network topology. For example point-to-point conversational
audio connection can easily introduce rate control by changing the audio connection can easily introduce rate control by changing the
number of transmitted layers, while in centralized audio/video number of transmitted layers, while in centralized audio/video
conferencing scenario the conference server is a more appropriate conferencing scenario the conference server is a more appropriate
point to implement the rate control instead of transmitting end-point. point to implement the rate control instead of transmitting end-
Please refer to [RFC5117] for extensive discussion on the different point. Please refer to RFC 5117 for extensive discussion on the
topologies and their implications to the transmission. different topologies and their implications to the transmission.
However, the fundamental difference between these choices is that However, the fundamental difference between these choices is that
method 1 does not necessarily need any feedback from the receiver(s), method 1 does not necessarily need any feedback from the receiver(s),
while methods 2 and 3 require a signaling mechanism to support rate while methods 2 and 3 require a signaling mechanism to support rate
control. control.
3. G.718 RTP payload format 4. G.718 RTP Payload Format
The basic G.718 source data unit is one layer of an encoded frame. The basic G.718 source data unit is one layer of an encoded frame.
Since the term layer generally refers to a time series of data Since the term layer generally refers to a time series of data
representing a certain encoding layer, this specification uses the representing a certain encoding layer, this specification uses the
term Encoded Data Unit (EDU) to refer to a single layer of data from term Encoded Data Unit (EDU) to refer to a single layer of data from
a single encoded frame. Thus, each EDU has a (conceptual) frame number a single encoded frame. Thus, each EDU has a (conceptual) frame number
indicating its location in encoding/decoding order and a layer number indicating its location in encoding/decoding order and a layer number
indicating the encoding layer the EDU represents. indicating the encoding layer the EDU represents.
3.1. Payload Structure 4.1. Payload Structure
The G.718 payload format consists of a payload header, followed by The G.718 payload format consists of a payload header, followed by
one or more transport blocks (TB) forming the actual payload data. one or more transport blocks (TB) forming the actual payload data.
+-----------------+----------+----------+- /// -+----------+ +-----------------+----------+----------+- /// -+----------+
| Payload header | TB(1) | TB(2) | TB(n) | | Payload header | TB(1) | TB(2) | TB(n) |
+-----------------+----------+----------+- /// -+----------+ +-----------------+----------+----------+- /// -+----------+
3.1.1. Payload Header 4.1.1. Payload Header
The payload header consists of an 8-bit payload CRC checksum: The payload header consists of an 8-bit payload CRC checksum:
+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+
| CRC | | CRC |
+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+
In the transmitting end the payload checksum is computed over the On the transmitting end the payload checksum is computed over the
primary transport block (specified in section 3.1.2. ) of the payload primary transport block (specified in Section 4.1.2) of the payload
using the generator polynomial using the generator polynomial
C(z) = z^8 + z^4 + z^3 + z^2 + 1. C(z) = z^8 + z^4 + z^3 + z^2 + 1
Subsequent transport blocks are prepared in such a way that the Subsequent transport blocks are prepared in such a way that the
payload checksum is valid for any integer number of contiguous payload checksum is valid for any integer number of contiguous
transport blocks within one RTP packet starting from the beginning of transport blocks within one RTP packet starting from the beginning of
the primary transport block. the primary transport block.
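A bitwise CRC-8 using the generator polynomial above might look like the following sketch. The text here fixes only the polynomial. The initial register value of zero, the most-significant-bit-first processing order, and the absence of a final XOR are assumptions made for illustration and should be checked against the normative CRC description before use.

```python
# z^8 + z^4 + z^3 + z^2 + 1 with the implicit z^8 term dropped: 0b00011101
POLY = 0x1D


def crc8(data, crc=0x00):
    """Bitwise CRC-8 over `data` (bytes), MSB first.

    Assumptions (not stated in the text above): initial value 0x00,
    no bit reflection, no final XOR.
    """
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ POLY) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


block = bytes([0x15]) + bytes(20)  # hypothetical transport block contents
checksum = crc8(block)

# With these parameters, appending the checksum drives the register to zero.
print(crc8(block + bytes([checksum])) == 0)
```

The zero-residue property shown in the last line is the kind of behaviour that allows a receiver to validate a run of contiguous bytes against a single checksum value.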
In the receiving end the payload CRC checksum can be used to verify On the receiving end the payload CRC checksum can be used to verify
the correct reception of any contiguous subset of transport blocks the correct reception of any contiguous subset of transport blocks
within one RTP packet starting from the beginning of the primary within one RTP packet starting from the beginning of the primary
transport block (see section 3.4. for detailed description). transport block (see Section 4.4 for a detailed description).
3.1.2. G.718 transport blocks 4.1.2. G.718 Transport Blocks
The basic building block of the G.718 RTP payload data is a G.718 The basic building block of the G.718 RTP payload data is a G.718
transport block (TB). There are two types of transport blocks: transport block (TB). There are two types of transport blocks:
primary transport block and secondary transport block. primary and secondary.
The structure of the primary transport block is depicted below. The structure of the primary transport block is depicted below.
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+----------------------------+ +-+-+-+-+-+-+-+-+----------------------------+
| L-ID |NF | Encoded data | | L-ID |NF | Encoded data |
+-+-+-+-+-+-+-+-+----------------------------+ +-+-+-+-+-+-+-+-+----------------------------+
The structure of the secondary transport block is depicted below. The structure of the secondary transport block is depicted below.
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+----------------------------+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+----------------------------+-+-+-+-+-+-+-+-+
| L-ID |NF | Encoded data | Tail | | L-ID |NF | Encoded Data | Tail |
+-+-+-+-+-+-+-+-+----------------------------+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+----------------------------+-+-+-+-+-+-+-+-+
The layer ID (L-ID) and the NF fields form the transport block
header. The L-ID field is used to identify the layer structure of
the encoded data carried in this G.718 transport block, and the NF
field indicates the number of encoded frames with this layer
structure carried in the Encoded data part following the transport
block header. The Tail field of the secondary transport block
carries a modified 8-bit CRC checksum computed over the transport
block, as specified below.
Author's note: For streaming or other applications that
allow for relatively long end-to-end delay, sometimes it
would be beneficial to aggregate more than 4 frames in one
TB. Should the length of NF be larger?
A G.718 RTP packet payload SHALL include exactly one primary
transport block, which MAY be followed by one or more secondary
transport blocks. The data fields of both transport block types are
described below.
L-ID (6 bits)
   Identification of the encoded data carried in this transport
   block. Table 3 below specifies the mapping between L-ID and the
   encoded data. Note that L-ID is treated as an unsigned integer.
Table 3: Layer identification (L-ID) values
L-ID Encoded data
--------------------------------------
0 Empty frame
1 L1
2 L1-L2
3 L1-L3
4 L1-L4
5 L1-L5
6 L2
7 L2-L3
8 L2-L4
9 L2-L5
10 L3
11 L3-L4
12 L3-L5
13 L4
14 L4-L5
15 L5
16 L1'
17 L1', L3'
18 L1', L3', L4
19 L1', L3', L4-L5
20 G.718 SID
21 AMR-WB SID
22-63 Reserved
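The first fifteen non-empty entries of Table 3 follow a regular pattern: they enumerate every contiguous layer range from L1-L5. As a minimal sketch of that pattern (the function and variable names are hypothetical and not part of the payload format), the table could be generated as follows:

```python
def build_lid_table():
    """Return a dict mapping L-ID values 0..21 to the layer description of Table 3."""
    table = {0: "Empty frame"}
    l_id = 1
    # L-ID 1..15 enumerate every contiguous layer range Lx-Ly with x <= y <= 5.
    for low in range(1, 6):
        for high in range(low, 6):
            table[l_id] = f"L{low}" if low == high else f"L{low}-L{high}"
            l_id += 1
    # L-ID 16..21 do not follow the pattern and are copied from Table 3.
    table.update({
        16: "L1'",
        17: "L1', L3'",
        18: "L1', L3', L4",
        19: "L1', L3', L4-L5",
        20: "G.718 SID",
        21: "AMR-WB SID",
    })
    return table


# Values 22..63 are reserved and therefore absent from the dict.
LID_TABLE = build_lid_table()
```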
Author's note: The current approach provides maximum
flexibility in terms of layer configuration. However,
limiting choices would be one way to leave more bits for
stereo & SWB layer configurations.
Author's note: One suggested way to make sure we do not
run out of L-ID values with the extension modes has been
to make the mapping between L-ID and layer configuration
dynamic (to be specified using SDP in session set-up).
While this would provide effective usage of L-ID bits, it
would require all elements processing the payload to be
signaling-aware. A compromise solution would be to provide
static mapping for selected layer configurations and leave
'more exotic' cases to be dynamically mapped on session
basis. The usage of this type of approach is FFS.
Author's note: Yet another possible way is to follow the
approach of the SVC RTP payload format draft, i.e. to signal
the bitrate etc. parameters for an operation point, and
signal dependency between sessions using the MMUSIC decoding
dependency draft. This way should be generic enough and
applicable to future versions of scalable codecs. However,
the above methods (using detailed layer configuration) may
provide more useful information, as the bitrate etc. of
each layer is fixed and not as flexible as in SVC.
Author's note: Yet another approach is to allocate L-ID
values according to the mode. For example, the mode with L1
being present and the AMR-WB compatible mode (with L1'
being present) use different value spaces of L-ID.
NF (2 bits)
   Number of frames in this transport block, decreased by one. The
   number of frames is equal to the value of NF incremented by one.
   For example, NF=0 indicates that the transport block carries one
   frame, and NF=3 indicates that the transport block carries four
   frames. If the sender wants to encapsulate more than four frames
   per payload, several transport blocks need to be used.
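The header layout described above (6-bit L-ID followed by 2-bit NF) can be sketched in Python. This is an illustration only: it assumes the usual IETF diagram convention that bit 0 is the most significant bit of the octet, and the function names are hypothetical.

```python
from typing import Tuple


def parse_tb_header(octet: int) -> Tuple[int, int]:
    """Split a transport block header octet into (L-ID, number of frames)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("header must be a single octet")
    l_id = octet >> 2      # upper six bits (bits 0-5 in the diagram)
    nf = octet & 0x03      # lower two bits (bits 6-7 in the diagram)
    return l_id, nf + 1    # NF carries the frame count decreased by one


def build_tb_header(l_id: int, frames: int) -> int:
    """Build a header octet from an L-ID (0..63) and a frame count (1..4)."""
    if not 0 <= l_id <= 63:
        raise ValueError("L-ID is a 6-bit unsigned integer")
    if not 1 <= frames <= 4:
        raise ValueError("one transport block carries one to four frames")
    return (l_id << 2) | (frames - 1)
```

For instance, a header with L-ID=3 and NF=1 (two frames of layers L1-L3) is the octet 0x0D under this bit-order assumption.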
Encoded Data (variable length)
   Encoded data consists of EDUs as specified by the values of the
   L-ID and NF fields, arranged according to the rules given in
   Section 4.2. When L-ID is equal to 0 (empty frame), the encoded
   data field is not present (i.e. it consists of zero octets).
Tail (8 bits)
   The Tail field of the secondary transport block carries a bit
   field that is needed to modify the partial CRC checksum over the
   payload data up to the end of this TB to match the payload CRC
   field value carried in the payload header.
   In the transmitter, the Tail bits for a secondary TB(n) are
   computed by first computing the CRC checksum CRC(n) over the
   payload data from the beginning of the primary TB up to the end of
   TB(n) using the generator polynomial C(z) given above. The bits
   of the Tail field of TB(n) are set to zero for the CRC
   computation. The transmitted value of the Tail field in TB(n) is
   obtained by a bitwise XOR operation between the payload CRC field
   value carried in the payload header and the CRC(n) computed for
   TB(n).
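The transmitter-side Tail computation can be sketched as follows. Several things here are assumptions rather than text from this document: the generator polynomial is a placeholder (C(z) is defined outside this section), the CRC is taken to be the plain remainder of the message polynomial modulo C(z) with a zero initial state, and all names are hypothetical.

```python
PLACEHOLDER_POLY = 0x107  # x^8 + x^2 + x + 1; a stand-in, NOT the specified C(z)


def crc8_remainder(data: bytes, poly: int = PLACEHOLDER_POLY) -> int:
    """Remainder of the message polynomial modulo `poly`, MSB first (assumed CRC)."""
    reg = 0
    for byte in data:
        for shift in range(7, -1, -1):
            reg = (reg << 1) | ((byte >> shift) & 1)
            if reg & 0x100:      # degree-8 term present: reduce by the generator
                reg ^= poly
    return reg


def tail_for_secondary_tb(payload_up_to_tb: bytes, payload_crc: int) -> int:
    """Compute the Tail octet of TB(n).

    `payload_up_to_tb` holds the octets from the start of the primary TB to
    the end of TB(n); its last octet is the Tail position of TB(n).
    """
    zeroed = payload_up_to_tb[:-1] + b"\x00"       # Tail bits set to zero
    return payload_crc ^ crc8_remainder(zeroed)    # payload CRC XOR CRC(n)
```

Under the assumed CRC definition, the remainder computed over the transmitted octets, Tail included, equals the payload CRC at the end of TB(n), which is what lets a receiver check each prefix of the payload against the single CRC value in the payload header.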
4.2. Handling the Encoded Data
In order to provide unique mapping of EDUs to encoded frames, the
following rules on sequence of frames and sequence of layers need to
be followed when creating a payload:
o The frames within a payload MUST form a set of contiguous frames
in decoding order, i.e. if a payload carries frames n and n+N, all
frames between n and n+N in decoding order MUST also be present in
the payload.
skipping to change at page 11, line 39 skipping to change at page 10, line 49
Explicit timing information for the transport blocks is not needed,
since the ordering of EDUs in the payload and their mapping to
transport blocks can be used to implicitly carry this information.
The following rules apply:
o If the highest layer carried in transport block k is n, and the
lowest layer carried by transport block k+1 is n+1, then the EDUs
of transport blocks k and k+1 belong to the same encoded frame.
Furthermore, if transport blocks k and k+1 carry EDUs belonging to
the same encoded frame(s), these transport blocks MUST include the
same number of EDUs.
o If the highest layer carried in transport block k is n, and the
lowest layer carried by transport block k+1 is smaller than or
equal to n, the EDUs of transport blocks k and k+1 belong to
two separate encoded frames, which are contiguous in decoding
order.
o Multiple copies of an EDU MUST NOT be included in the payload.
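The first two rules compare the highest layer of one transport block with the lowest layer of the next. A minimal sketch of that pairwise comparison is given below; the function name and return values are hypothetical, and the case where a layer is skipped is reported as undetermined because the rules above do not address it.

```python
def relation_to_previous(prev_highest: int, next_lowest: int):
    """Classify transport block k+1 relative to transport block k.

    prev_highest: highest layer number carried in transport block k.
    next_lowest:  lowest layer number carried in transport block k+1.
    """
    if next_lowest == prev_highest + 1:
        return "same"    # EDUs belong to the same encoded frame(s)
    if next_lowest <= prev_highest:
        return "next"    # separate frames, contiguous in decoding order
    return None          # layer gap: not covered by the two rules
```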
A set of EDUs can be allocated to transport blocks in several ways.
For example, each EDU can be encapsulated in its own transport block,
all EDUs can be carried in a single transport block, EDUs belonging
to the same encoded frame can be encapsulated in a dedicated
transport block, or EDUs representing the same layer can be carried
in their own transport blocks. Three examples of this with two
frames with layers L1-L3 are given below. The first example
illustrates the case using a single transport block for the whole
payload, while the second payload example introduces separate
transport blocks for each of the EDUs. The third example shows an
approach where all layers are carried in dedicated transport blocks.
The notation Fx-Ly is used to denote layer y of frame x.
Example 1: All EDUs in a single transport block
+---------+-----+-------+-------+-------+-------+-------+--------+
| L-ID=3  |NF=1 | F1-L1 | F2-L1 | F1-L2 | F2-L2 | F1-L3 | F2-L3  |
+---------+-----+-------+-------+-------+-------+-------+--------+
Author's note: Currently, it is mandated that lower layer
EDUs of later frames go before higher layer EDUs of
earlier frames. This way is friendlier to adaptation
(dropping of higher layers). However, if all layers are
received, then the depacketizer needs to reorder the EDUs
to their decoding order before feeding them to the decoder.
Therefore, the other way around (i.e. lower layer EDUs of
later frames go after higher layer EDUs of earlier frames,
or EDUs in transport blocks are placed in decoding order)
is more friendly to the depacketizer. Another benefit of
the latter is that it does not introduce any end-to-end
delay. Which way to be specified (or both allowed if
needed) is FFS.
Example 2: All EDUs in separate transport blocks
+---------+-----+-------+---------+-----+-------+
| L-ID=1  |NF=0 | F1-L1 | L-ID=1  |NF=0 | F2-L1 |
+---------+-----+-------+---------+-----+-------+
| L-ID=6  |NF=0 | F1-L2 | L-ID=6  |NF=0 | F2-L2 |
+---------+-----+-------+---------+-----+-------+
| L-ID=10 |NF=0 | F1-L3 | L-ID=10 |NF=0 | F2-L3 |
+---------+-----+-------+---------+-----+-------+
Example 3: Dedicated transport block for the EDUs of each layer
+---------+-----+-------+-------+---------+-----+-------+-------+
| L-ID=1  |NF=1 | F1-L1 | F2-L1 | L-ID=6  |NF=1 | F1-L2 | F2-L2 |
+---------+-----+-------+-------+---------+-----+-------+-------+
| L-ID=10 |NF=1 | F1-L3 | F2-L3 |
+---------+-----+-------+-------+
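The EDU ordering used in Example 1 and the grouping used in Example 3 can be reproduced with a short sketch. The helper names are hypothetical, and the Fx-Ly labels stand in for the actual EDU octets.

```python
def edu_order_single_tb(num_frames, layers):
    """EDU order inside one transport block: all frames of a layer, then the next layer."""
    return [f"F{frame}-L{layer}"
            for layer in layers
            for frame in range(1, num_frames + 1)]


def tbs_per_layer(num_frames, layers):
    """One transport block per layer, each carrying that layer's EDU of every frame."""
    return [[f"F{frame}-L{layer}" for frame in range(1, num_frames + 1)]
            for layer in layers]
```

With two frames and layers 1 to 3, the first helper yields the sequence shown in Example 1 and the second yields the three transport blocks of Example 3.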
While the first example, carrying data from all layers in the same
transport block, obviously consumes less bandwidth, the second
example, using a separate transport block for each EDU, and the third
example, using dedicated transport blocks for each layer, provide a
simple scaling possibility: while in the first case the removal of
e.g. layer L3 (from each frame in the payload) would require changing
the value of the L-ID in addition to removing the corresponding
EDU(s), in the second and third options it is enough to just remove
all transport blocks carrying L3 data, and the remaining part of the
payload can be left untouched (however, the packet size information
in higher-layer protocol headers needs to change).
4.3. G.718 Scaling
Some Media-Aware Network Elements (MANEs) MAY modify the G.718
bitstream by dropping some of the layers in case congestion control
or e.g. access link bandwidth requires such scaling to take place.
Such MANEs are RTP translators (with the topology Topo-Translator as
described in [RFC5117]), for which the rules for RTP translators
specified in [RFC3550] apply.
A payload can be either completely dropped or some of the transport
blocks it carries can be discarded. In case full payloads are
dropped to implement scaling, a packet containing the core layer L1
SHOULD NOT be discarded, since the decoding of higher layers of the
same encoded frame is not possible without the core layer data being
available. This means that payloads with L-ID values equal to 1 to
5, inclusive, and 16 to 19, inclusive, SHOULD NOT be completely
discarded.
Author's note: To be checked whether the case of dropping a subset
of the transport blocks in one packet also strictly follows the
topology Topo-Translator.
In case the payload is forwarded with modified content, at least the
primary transport block MUST be preserved in the payload, while some
of the secondary transport blocks at the end of the payload MAY be
discarded.
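The two scaling constraints in this section can be sketched as follows: the L-ID values whose payloads should not be dropped entirely, and the truncation of trailing secondary transport blocks. All names are hypothetical, and transport blocks are represented as opaque list items rather than parsed octets.

```python
# L-ID values 1..5 and 16..19 carry the core layer (L1 or L1').
CORE_LAYER_LIDS = set(range(1, 6)) | set(range(16, 20))


def carries_core_layer(l_id: int) -> bool:
    """True if a payload with this L-ID should not be completely discarded."""
    return l_id in CORE_LAYER_LIDS


def drop_trailing_blocks(transport_blocks, keep):
    """Keep the first `keep` transport blocks, never fewer than the primary one."""
    if not transport_blocks:
        raise ValueError("a payload contains exactly one primary transport block")
    keep = max(1, min(keep, len(transport_blocks)))
    return transport_blocks[:keep]
```

Only blocks at the end of the payload are removed in this sketch, so the remaining blocks still form a contiguous prefix starting at the primary transport block.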
4.4. CRC Verification
Both UDP-Lite [RFC3828] and DCCP [RFC4340] provide partial checksum
options, in which partially damaged payloads can be delivered to the
application layer. In cases wherein such a transport layer operation
is in use, and the partial checksum service by the transport layer
protects up to the RTP header and the payload header, the CRC
checksum provided in the payload header can be used to verify whether
an RTP packet payload contains corrupt transport blocks.
On the receiving end, the CRC verification is made in such a way that
the CRC computation is started from the beginning of the primary TB,
i.e. from the MSB of the first octet of TB(1), and the
computation is continued until the end of the payload data or until
an erroneous TB is encountered. At the end of each TB a check MAY be
performed: if the CRC value at the end of TB(n) matches the payload
CRC value received in the payload header, the verification is
successful and the data up to TB(n) is valid. If the CRC value at
the end of TB(n) does not match the payload CRC value received in the
payload header, there is an error in TB(n) and it MUST be
discarded as corrupted. Furthermore, if the verification indicates
a corrupted TB(n), all subsequent transport blocks TB(m) with m>n
MUST also be discarded.
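The receiver-side procedure can be sketched as a running CRC that is compared with the payload CRC at each transport block boundary. As assumptions not taken from this document: the generator polynomial is a placeholder, the CRC is the plain polynomial remainder with zero initial state, and the block boundaries (`tb_ends`) are supplied by a hypothetical parser.

```python
PLACEHOLDER_POLY = 0x107  # stand-in generator; the real C(z) is defined elsewhere


def crc8_update(reg: int, data: bytes, poly: int = PLACEHOLDER_POLY) -> int:
    """Continue the (assumed) remainder computation over further octets."""
    for byte in data:
        for shift in range(7, -1, -1):
            reg = (reg << 1) | ((byte >> shift) & 1)
            if reg & 0x100:
                reg ^= poly
    return reg


def count_valid_tbs(payload_data: bytes, tb_ends, payload_crc: int) -> int:
    """Return how many leading transport blocks pass the CRC check.

    tb_ends lists the end offset (exclusive) of each TB within payload_data,
    starting with the primary TB. TB(n) and everything after it is treated
    as corrupted as soon as the check at the end of TB(n) fails.
    """
    reg, pos, valid = 0, 0, 0
    for end in tb_ends:
        reg = crc8_update(reg, payload_data[pos:end])
        pos = end
        if reg != payload_crc:
            break
        valid += 1
    return valid
```

The sketch performs the optional check at every boundary; an implementation could equally check only at selected boundaries.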
4.5. G.718 Session
A G.718 session consists of one or several RTP sessions carrying
G.718 data encoded according to the payload format specified in
Section 4.1.
4.6. Cross-stream/Cross-layer Timing Synchronization
In the case where a G.718 session consists of multiple RTP sessions,
the RTP packets transmitted on separate RTP sessions need to be
synchronized in order to enable reconstruction of the frames at the
receiving end. Since each of the RTP sessions uses its own random
initial value for the RTP timestamp, there is also a random offset
between the RTP timestamp values of packets carrying the EDUs
belonging to the same encoded frame in different RTP sessions.
The receiver MUST use the traditional RTCP-based mechanism to
synchronize streams by using the RTP and NTP timestamps of the RTCP
Sender Reports (SR) it receives.
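As a rough illustration of the RTCP-based mechanism, each RTP session's timestamps can be mapped onto the sender's common wallclock using the (RTP timestamp, NTP timestamp) pair from the latest Sender Report of that session. The sketch below assumes the NTP timestamp has already been converted to seconds as a float, uses the 32 kHz timestamp clock of this payload format, and uses hypothetical names.

```python
CLOCK_RATE = 32000  # RTP timestamp clock frequency for G.718, in Hz


def rtp_to_wallclock(rtp_ts: int, sr_rtp_ts: int, sr_ntp_seconds: float,
                     clock_rate: int = CLOCK_RATE) -> float:
    """Map an RTP timestamp to sender wallclock time using one Sender Report."""
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF   # 32-bit modular difference
    if diff >= 0x80000000:
        diff -= 0x100000000                    # packet precedes the SR reference
    return sr_ntp_seconds + diff / clock_rate
```

Packets from different RTP sessions whose timestamps map to the same wallclock value carry EDUs of the same encoded frame, even though their raw RTP timestamps differ by a random offset.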
4.7. RTP Header Usage
This section specifies the usage of some fields of the RTP header
(specified in Section 5 of [RFC3550]) with the G.718 RTP payload
format. The settings for other RTP header fields are as specified in
[RFC3550].
The RTP timestamp corresponds to the sampling instant of the first
encoded sample of the earliest frame in the payload. The timestamp
clock frequency is 32 kHz.
The marker bit (M) of each of the RTP streams of the session SHALL be
set to value 1 if the payload carries an EDU belonging to the first
frame after an inactive period, i.e. an EDU from the first frame of a
talkspurt. For all other packets the marker bit is set to value 0.
5. Payload Format Parameters
This section defines the parameters that may be used to configure
optional features in the G.718 RTP transmission.
The parameters are defined here as part of the media subtype
registration for the G.718 codec. Mapping of the parameters into the
Session Description Protocol (SDP) [RFC4566] is also provided for
those applications that use SDP. In control protocols that do not
use MIME or SDP, the media type parameters MUST be mapped to the
format used with that control protocol.
5.1. Media Type Registration
This registration is done using the template defined in RFC 4288
[RFC4288] and following RFC 4855 [RFC4855].
Type name: audio
Subtype name: G718
Required parameters: none
Optional parameters:
   mode:     This parameter MAY be used to indicate whether the mode
             with layer L1 being present or the AMR-WB compatible
             mode (with layer L1' being present) is in use. If this
             parameter is not present or the value of this parameter
             is equal to 0, the mode with layer L1 being present is
             in use. Otherwise, the AMR-WB compatible mode is in
             use. When this parameter is present, the value MUST be
             either 0 or 1.
             NOTE: When the upcoming stereo and SWB options are
             present, the semantics of this parameter may change.
   layers:   The numbers of the layers (in the range from 1 to 5,
             denoting layers from L1 to L5, respectively)
             transmitted in this session, expressed as a comma-
             separated list of layer numbers. If the parameter is
             present, at least layer L1 or L1' MUST be included in
             the list of layers in one of the RTP sessions included
             in the G.718 session. If the parameter is not present,
             all layers up to layer L5 MAY be used in the session.
             NOTE: Why not use semantics similar to those of L-ID?
   ptime:    The recommended length of time (in milliseconds)
             represented by the media in a packet. See Section 6 of
             [RFC4566].
   maxptime: The maximum length of time (in milliseconds) that can
             be encapsulated in a packet. See Section 6 of
             [RFC4566].
   NOTE: Some further study is needed to see if separate parameters
   for sending and receiving capabilities/preferences are needed --
   especially for upcoming stereo and SWB options.
   NOTE: Support for upcoming SWB and stereo options needs to be
   taken into account. Basically we can either 1) extend the
   parameter "layers" to cover also this aspect, or 2) define
   separate parameter(s) for these new options when more details on
   the stereo/SWB support are available.
Encoding considerations:
   This media type is framed and contains binary data; see Section
   4.8 of [RFC4288].
Security considerations: See Section 7 of RFC XXXX.
   [RFC Editor: Upon publication as an RFC, please replace "XXXX"
   with the number assigned to this document and remove this note.]
Interoperability considerations: None.
Published specification: RFC XXXX.
   [RFC Editor: Upon publication as an RFC, please replace "XXXX"
   with the number assigned to this document and remove this note.]
Applications which use this media type:
   For example: Voice over IP, audio and video conferencing, audio
   streaming and voice messaging.
Additional information: None.
Person & email address to contact for further information:
   Ari Lakaniemi, ari.lakaniemi@nokia.com
Intended usage: COMMON
Restrictions on usage:
   This media type depends on RTP framing, and hence is only defined
   for transfer via RTP [RFC3550].
Author: Ari Lakaniemi, ari.lakaniemi@nokia.com
Change controller:
   IETF Audio/Video Transport Working Group delegated from the IESG.
5.2. Mapping to SDP Parameters
The information carried in the media type specification has a
specific mapping to fields of the SDP [RFC4566], which is commonly
used to describe RTP sessions. When SDP is used to specify sessions
employing the G.718 codec, the mapping is as follows:
o The media type ("audio") goes in SDP "m=" as the media name.
o The media subtype ("G718") goes in SDP "a=rtpmap" as the encoding
name. The RTP clock rate in "a=rtpmap" MUST be 32000 for G.718.
NOTE: The current choice for the RTP clock rate is a
'placeholder'. The clock rate needs to be set according to SWB
sampling rate, which is still T.B.D. Since the core codec
employs 16000 Hz sampling rate, an integer multiple of 16000 Hz
seems to be a preferable choice.
o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
"a=maxptime" attributes, respectively.
o Any remaining parameters go in the SDP "a=fmtp" attribute by
copying them directly from the media type string as a semicolon
separated list of parameter=value pairs.
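The mapping rules above can be illustrated with a small sketch that turns a set of media type parameters into SDP lines. The port number, the dynamic payload type, the RTP/AVP profile and the function name are assumptions made for illustration only.

```python
def g718_sdp_lines(port, payload_type, params):
    """Build SDP lines for one G.718 RTP session from media type parameters."""
    lines = [
        f"m=audio {port} RTP/AVP {payload_type}",     # media type "audio"
        f"a=rtpmap:{payload_type} G718/32000",        # subtype and clock rate
    ]
    fmtp = []
    for name, value in params.items():
        if name in ("ptime", "maxptime"):
            lines.append(f"a={name}:{value}")         # dedicated SDP attributes
        else:
            fmtp.append(f"{name}={value}")            # everything else: a=fmtp
    if fmtp:
        lines.insert(2, f"a=fmtp:{payload_type} " + "; ".join(fmtp))
    return lines
```

For example, the parameters mode=0, layers=1,2,3 and ptime=20 with a hypothetical payload type 97 produce an "a=fmtp:97 mode=0; layers=1,2,3" line and a separate "a=ptime:20" line.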
4.3. Offer/answer considerations 5.3. Offer/Answer Considerations
The following considerations apply when using the SDP offer/answer The following considerations apply when using the SDP offer/answer
[RFC3264] mechanism to negotiate the G.718 transport. The parameter [RFC3264] mechanism to negotiate the G.718 transport. The parameter
"layers" MAY be used to indicate the layer configuration for each "layers" MAY be used to indicate the layer configuration for each
RTP session belonging to the current G.718 session that the end-point making RTP session belonging to the current G.718 session that the end-point making
the offer is ready to transmit and wishes to receive. the offer is ready to transmit and wishes to receive.
o In case the G.718 session consists of a single RTP session, it is o In case the G.718 session consists of a single RTP session, it is
RECOMMENDED not to impose any layer restrictions for the session RECOMMENDED not to impose any layer restrictions for the session
but to use the rate control functionality to set possible but to use the rate control functionality to set possible
restrictions on usage of the higher or highest layers. If the restrictions on usage of the higher or highest layers. If the
offer includes a layer configuration parameter, the answer MAY use offer includes a layer configuration parameter, the answer MAY use
a different configuration, but the highest layer in the answer MUST a different configuration, but the highest layer in the answer MUST
NOT be higher than the highest layer of the offered configuration. NOT be higher than the highest layer of the offered configuration.
Author's note: Support for answer modifying the layer NOTE: Support for answer modifying the layer configuration is
configuration is FFS. FFS.
In case the G.718 session consists of multiple RTP sessions, the o In case the G.718 session consists of multiple RTP sessions, the
answer MUST use the layer configurations provided in the offer for answer MUST use the layer configurations provided in the offer for
the sessions it accepts. the sessions it accepts.
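The two rules above can be expressed as a small check, sketched here in Python. The sketch is illustrative only. The helper names parse_layers and answer_layers_ok are hypothetical, and treating a missing "layers" parameter in the offer as "no restriction" (and a missing one in the answer to a restricted offer as unacceptable) is an assumption, since this document does not define that case.

```python
def parse_layers(fmtp_params):
    """Return sorted layer numbers from e.g. 'layers=1,2,3', or None if absent."""
    for item in fmtp_params.split(";"):
        name, _, value = item.strip().partition("=")
        if name.strip() == "layers" and value.strip():
            return sorted(int(x) for x in value.split(","))
    return None


def answer_layers_ok(offer, answer, multiple_rtp_sessions=False):
    """Check an answered layer set against the offered one.

    offer and answer are lists of layer numbers, or None when the
    "layers" parameter was not present (assumption: no restriction).
    """
    if offer is None:
        return True
    if answer is None:
        return False
    if multiple_rtp_sessions:
        # The answer MUST use the offered configuration for an accepted session.
        return sorted(answer) == sorted(offer)
    # Single RTP session: the highest answered layer MUST NOT exceed
    # the highest offered layer.
    return max(answer) <= max(offer)
```

Under this reading of the rule, an answer of "layers=1,2,3,4,5" to an offer of "layers=1,2" in a single RTP session would not pass the check; whether an answer may modify the layer configuration is marked FFS in this document.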
4.4. Declarative usage of SDP 5.4. Declarative Usage of SDP
In declarative usage, such as SDP in RTSP [RFC2326] or SAP [RFC2974], In declarative usage, such as SDP in RTSP [RFC2326] or SAP [RFC2974],
the parameter "layers" SHALL be interpreted to provide a set of the parameter "layers" SHALL be interpreted to provide a set of
layers that the sender may use in the session. layers that the sender MAY use in the session.
4.5. SDP examples 5.5. SDP Examples
Some example SDP session descriptions utilizing G.718 encodings are Some example SDP session descriptions utilizing G.718 encodings are
provided below. provided below.
The first example illustrates the simple case where the G.718 session 5.5.1. Example 1
The first example illustrates the simple case with the G.718 session
employing a single RTP session and the AVPF profile is offered, and employing a single RTP session and the AVPF profile is offered, and
the answer accepts the offer without any changes. the answer accepts the offer without any changes.
Offer: Offer:
m=audio 49120 RTP/AVPF 97 m=audio 49120 RTP/AVPF 97
a=rtpmap:97 G718/32000/1 a=rtpmap:97 G718/32000/1
Answer: Answer:
m=audio 49120 RTP/AVPF 97 m=audio 49120 RTP/AVPF 97
a=rtpmap:97 G718/32000/1 a=rtpmap:97 G718/32000/1
The second example shows a bit more complex case where the G.718 5.5.2. Example 2
session using a single RTP session and the AVPF profile is offered
with restriction to send/receive only with layers L1 and L2. The This example shows a bit more complex case where the G.718 session
answer indicates that the other end-point is happy to receive (and using a single RTP session and the AVPF profile is offered with the
send) layers up to L5. restriction to send/receive only with layers L1 and L2. The answer
indicates that the other end-point is happy to receive (and send)
layers up to L5.
Offer: Offer:
m=audio 49120 RTP/AVPF 97 m=audio 49120 RTP/AVPF 97
a=rtpmap:97 G718/32000/1 a=rtpmap:97 G718/32000/1
a=fmtp:97 layers=1,2 a=fmtp:97 layers=1,2
Answer: Answer:
m=audio 49120 RTP/AVPF 97 m=audio 49120 RTP/AVPF 97
a=rtpmap:97 G718/32000/1 a=rtpmap:97 G718/32000/1
a=fmtp:97 layers=1,2,3,4,5 a=fmtp:97 layers=1,2,3,4,5
5.5.3. Example 3
The third example shows a G.718 session using multiple RTP sessions The third example shows a G.718 session using multiple RTP sessions
with the AVPF profile. The answerer wishes to use only layers up to with the AVPF profile. The answerer wishes to use only layers up to
L3. L3.
Offer: Offer:
m=audio 49120 RTP/AVPF 97 m=audio 49120 RTP/AVPF 97
a=rtpmap:97 G718/32000/1 a=rtpmap:97 G718/32000/1
a=fmtp:97 layers=1,2 a=fmtp:97 layers=1,2
a=mid:1 a=mid:1
m=audio 49122 RTP/AVPF 98 m=audio 49122 RTP/AVPF 98
a=rtpmap:98 G718/32000/1 a=rtpmap:98 G718/32000/1
a=fmtp:98 layers=3 a=fmtp:98 layers=3
a=mid:2 a=mid:2
a=depend:lay 1 a=depend:lay 1
m=audio 49124 RTP/AVPF 99 m=audio 49124 RTP/AVPF 99
a=rtpmap:99 G718/32000/1 a=rtpmap:99 G718/32000/1
a=fmtp:99 layers=4,5 a=fmtp:99 layers=4,5
a=mid:3 a=mid:3
a=depend:lay 1 2 a=depend:lay 1 2
Answer: Answer:
m=audio 49120 RTP/AVPF 97 m=audio 49120 RTP/AVPF 97
a=rtpmap:97 G718/32000/1 a=rtpmap:97 G718/32000/1
a=fmtp:97 layers=1,2 a=fmtp:97 layers=1,2
a=mid:1 a=mid:1
m=audio 49120 RTP/AVPF 98 m=audio 49120 RTP/AVPF 98
a=rtpmap:98 G718/32000/1 a=rtpmap:98 G718/32000/1
a=fmtp:98 layers=3 a=fmtp:98 layers=3
a=mid:2 a=mid:2
a=depend:lay 1 a=depend:lay 1
Note that the dependency signaling according to [smd-sdp] is used in Note that the dependency signaling described in [RFC5583] is used in
the third example above to indicate the relationship between the the third example above to indicate the relationship between the
layers distributed into separate RTP sessions. layers distributed into separate RTP sessions.
5. Security Considerations 6. Congestion Control
As a scalable codec, G.718 implicitly provides means for congestion
control by providing a possibility for 'thinning' the bitstream. The
RTP payload format according to this specification provides several
different means for reducing the G.718 session bandwidth. The most
appropriate mechanism (in terms of impact to the user experience)
depends on the employed payload structure and also on the employed
session configuration (single RTP session or multiple RTP sessions).
The following means (in no particular order) can be used to assist
congestion control procedures -- either by the sender or by the
intermediate node.
o The transport blocks carrying the EDUs representing the highest
layers within the payload may be dropped
o The payloads carrying the EDUs representing the highest layers in
a G.718 session are dropped
o Transport blocks or payloads carrying EDUs belonging to redundant
frames included in the payload are dropped
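The thinning operations listed above can be sketched in Python as a filter over the transport blocks of one payload. This is an illustrative sketch under stated assumptions: the representation of a transport block as a dictionary with a "layers" list and an optional "redundant" flag is hypothetical and is not the wire format, and a block is dropped only when all of its layers lie above the chosen limit, because removing individual EDUs from inside a block would also require rewriting its header fields.

```python
def thin_payload(blocks, max_layer, drop_redundant=False):
    """Return the transport blocks kept after thinning (illustrative sketch).

    blocks: list of dicts such as {"layers": [2, 3], "redundant": False}.
    max_layer: highest layer number allowed to pass.
    drop_redundant: when True, blocks marked as redundant are dropped too.
    """
    kept = []
    for block in blocks:
        if drop_redundant and block.get("redundant", False):
            continue  # drop blocks carrying redundant frames
        if min(block["layers"]) > max_layer:
            continue  # every layer in this block is above the limit
        kept.append(block)
    return kept


if __name__ == "__main__":
    payload = [
        {"layers": [1]},
        {"layers": [2, 3]},
        {"layers": [1], "redundant": True},
    ]
    print(thin_payload(payload, max_layer=1))
```

Dropping complete payloads in a session that uses multiple RTP sessions follows the same pattern, applied per packet rather than per transport block.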
7. Security Considerations
RTP packets using the payload format defined in this specification RTP packets using the payload format defined in this specification
are subject to the security considerations discussed in the RTP are subject to the security considerations discussed in the RTP
specification [RFC3550], and in any appropriate RTP profile (for specification [RFC3550], and in any appropriate RTP profile (for
example [RFC3551] or [RFC4585]). This implies that confidentiality example [RFC3551] or [RFC4585]). This implies that confidentiality of
of the media streams is achieved by encryption; for example, through the media streams is achieved by encryption; for example, through the
the application of SRTP [RFC3711]. Because the data compression used application of SRTP [RFC3711]. Because the data compression used
with this payload format is applied end-to-end, any encryption needs with this payload format is applied end-to-end, any encryption needs
to be performed after compression. to be performed after compression.
A potential denial-of-service threat exists for data encodings using A potential denial-of-service threat exists for data encodings using
compression techniques that have non-uniform receiver-end compression techniques that have non-uniform receiver-end
computational load. The attacker can inject pathological datagrams computational load. The attacker can inject pathological datagrams
into the stream that will increase the processing load of the decoder into the stream that will increase the processing load of the decoder
and may cause the receiver to be overloaded. For example inserting and may cause the receiver to be overloaded. For example inserting
additional EDUs representing the higher enhancement layers on top of additional EDUs representing the higher enhancement layers on top of
the ones actually transmitted may increase the decoder load. However, the ones actually transmitted may increase the decoder load.
the G.718 codec is not particularly vulnerable to such an attack, However, the G.718 codec is not particularly vulnerable to such an
since the majority of the computational load in a G.718 session is attack, since the majority of the computational load in a G.718
associated with the encoder. Another form of possible attack might be session is associated with the encoder. Another form of possible
forging of codec bit-rate control messages, which may result in the attack might be forging of codec bit-rate control messages, which may
encoder employing a higher number of enhancement layers than result in the encoder employing a higher number of enhancement
originally intended and thereby requiring a larger amount of layers than originally intended and thereby requiring a larger amount
computation resources. Therefore, the usage of data origin of computation resources. Therefore, the usage of data origin
authentication and data integrity protection of at least the RTP authentication and data integrity protection of at least the RTP
packet is RECOMMENDED; for example, with SRTP [RFC3711]. packet is RECOMMENDED; for example, with SRTP [RFC3711].
Note that the appropriate mechanism to ensure confidentiality and Note that the appropriate mechanism to ensure confidentiality and
integrity of RTP packets and their payloads is very dependent on the integrity of RTP packets and their payloads is very dependent on the
application and on the transport and signaling protocols employed. application and on the transport and signaling protocols employed.
Thus, although SRTP is given as an example above, other possible Thus, although SRTP is given as an example above, other possible
choices exist. choices exist.
Note that end-to-end security with either authentication, integrity Note that end-to-end security with either authentication, integrity
or confidentiality protection will prevent a network element not or confidentiality protection will prevent a network element not
within the security context from performing media-aware operations within the security context from performing media-aware operations
other than discarding complete packets. To allow any (media-aware) other than discarding complete packets. To allow any (media-aware)
intermediate network element to perform its operations, it is intermediate network element to perform its operations, it is
required to be a trusted entity which is included in the security required to be a trusted entity which is included in the security
context establishment. context establishment.
6. Congestion control 8. IANA Considerations
As scalable codec G.718 implicitly provides means for congestion IANA is kindly requested to register a media type for the G.718 codec
control by providing a possibility for 'thinning' the bitstream. The for RTP transport, as specified in Section 5.1 of this document.
RTP payload format according to this specification provides several
different means for reducing the G.718 session bandwidth. The most
appropriate mechanism (in terms of impact to the user experience)
depends on the employed payload structure and also on the employed
session configuration (single RTP session or multiple RTP sessions).
The following means (in no particular order) can be used to assist
congestion control procedures -- either by the sender or by the
intermediate node.
o The transport blocks carrying the EDUs representing the highest 9. Acknowledgements
layers within the payload may be dropped.
o The payloads carrying the EDUs representing the highest layers in Thanks to Qin Wu for useful review and commentary.
a G.718 session are dropped.
o Transport blocks or payloads carrying EDUs belonging to redundant 10. References
frames included in the payload are dropped.
7. IANA Considerations 10.1. Normative References
IANA is kindly requested to register a media type for the G.718 codec [AMR-WB] 3GPP, "Speech codec speech processing functions;
for RTP transport, as specified in section 4.1. of this document. Adaptive Multi-Rate - Wideband (AMR-WB) speech
codec; General description", 3GPP TS 26.171 5.0.0,
April 2001.
APPENDIX A: Payload examples [ITU.G718.2008] International Telecommunications Union, "Frame Error
Robust Narrowband and Wideband Embedded Variable
Bit-Rate Coding of Speech and Audio from 8-32
Kbit/s", ITU-T Recommendation G.718, May 2008.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer
Model with Session Description Protocol (SDP)",
RFC 3264, June 2002.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for
Audio and Video Conferences with Minimal Control",
STD 65, RFC 3551, July 2003.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E.,
and K. Norrman, "The Secure Real-time Transport
Protocol (SRTP)", RFC 3711, March 2004.
[RFC4288] Freed, N. and J. Klensin, "Media Type Specifications
and Registration Procedures", BCP 13, RFC 4288,
December 2005.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP:
Session Description Protocol", RFC 4566, July 2006.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and
J. Rey, "Extended RTP Profile for Real-time
Transport Control Protocol (RTCP)-Based Feedback
(RTP/AVPF)", RFC 4585, July 2006.
[RFC4855] Casner, S., "Media Type Registration of RTP Payload
Formats", RFC 4855, February 2007.
[RFC4867] Sjoberg, J., Westerlund, M., Lakaniemi, A., and Q.
Xie, "RTP Payload Format and File Storage Format for
the Adaptive Multi-Rate (AMR) and Adaptive Multi-
Rate Wideband (AMR-WB) Audio Codecs", RFC 4867,
April 2007.
[RFC5583] Schierl, T. and S. Wenger, "Signaling Media Decoding
Dependency in the Session Description Protocol
(SDP)", RFC 5583, July 2009.
10.2. Informative References
[McCanne] McCanne, S., Jacobson, V., and M. Vetterli,
"Receiver-driven layered multicast", ACM SIGCOMM
Computer Communication Review Volume 26 Issue 4,
October 1996.
[RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real
Time Streaming Protocol (RTSP)", RFC 2326,
April 1998.
[RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session
Announcement Protocol", RFC 2974, October 2000.
[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson,
L-E., and G. Fairhurst, "The Lightweight User
Datagram Protocol (UDP-Lite)", RFC 3828, July 2004.
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
Congestion Control Protocol (DCCP)", RFC 4340,
March 2006.
[RFC5117] Westerlund, M. and S. Wenger, "RTP Topologies",
RFC 5117, January 2008.
Appendix A. Payload Examples
The G.718 payload structure enables flexible transport either by The G.718 payload structure enables flexible transport either by
carrying all layers in the same payload or separating the layers into carrying all layers in the same payload or separating the layers into
separate payloads. The following subsections illustrate different separate payloads. The following subsections illustrate different
possibilities for transport by simple examples. Note that examples do possibilities for transport by simple examples. Note that examples
not show the full payload structure to keep the illustration simple. do not show the full payload structure to keep the illustration
simple.
A.1. Simple payload examples A.1. Simple Payload Examples
A.1.1. All the layers in the same payload A.1.1. All The Layers in The Same Payload
The illustration below shows layers L1-L3 from two encoded frames The illustration below shows layers L1-L3 from two encoded frames
encapsulated into separate payloads using a single transport block. encapsulated into separate payloads using a single transport block.
+-------+--------+-----+------+------+------+ +-------+--------+-----+------+------+------+
| RTP1 | L-ID=3 |NF=0 |F1-L1 |F1-L2 |F1-L3 | | RTP1 | L-ID=3 |NF=0 |F1-L1 |F1-L2 |F1-L3 |
+-------+--------+-----+------+------+------+ +-------+--------+-----+------+------+------+
+-------+--------+-----+------+------+------+ +-------+--------+-----+------+------+------+
| RTP2 | L-ID=3 |NF=0 |F2-L1 |F2-L2 |F2-L3 | | RTP2 | L-ID=3 |NF=0 |F2-L1 |F2-L2 |F2-L3 |
+-------+--------+-----+------+------+------+ +-------+--------+-----+------+------+------+
In case the same layers from two input frames are encapsulated into In the case where the same layers from two input frames are
one payload using a single transport block, the structure is as shown encapsulated into one payload using a single transport block, the
below. structure is as shown below.
+-------+--------+-----+------+------+------+------+------+------+ +-------+--------+-----+------+------+------+------+------+------+
| RTP1 | L-ID=3 |NF=1 |F1-L1 |F2-L1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 | | RTP1 | L-ID=3 |NF=1 |F1-L1 |F2-L1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
+-------+--------+-----+------+------+------+------+------+------+ +-------+--------+-----+------+------+------+------+------+------+
The third example illustrates the case where the layers L1-L3 from The third example illustrates the case where the layers L1-L3 from
two input frames are encapsulated into one payload using two separate two input frames are encapsulated into one payload using two separate
transport blocks, the first one carrying L1 and the other one transport blocks, the first one carrying L1 and the other one
containing L2 and L3. containing L2 and L3.
+-------+--------+-----+------+------+ +-------+--------+-----+------+------+
| RTP1 | L-ID=1 |NF=1 |F1-L1 |F2-L1 | | RTP1 | L-ID=1 |NF=1 |F1-L1 |F2-L1 |
+-------+--------+-----+------+------+------+------+ +-------+--------+-----+------+------+------+------+
| L-ID=7 |NF=1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 | | L-ID=7 |NF=1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
+--------+-----+------+------+------+------+ +--------+-----+------+------+------+------+
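The EDU ordering used inside a transport block in the examples above (all frames of the lowest layer first, then all frames of the next layer, and so on) can be sketched in Python as follows. The sketch is illustrative only; the function name edu_order and the "F<frame>-L<layer>" labels are hypothetical and simply mirror the notation of the figures. The examples also suggest that the NF field carries the number of frames minus one, which is treated here as an assumption.

```python
def edu_order(layers, num_frames):
    """List EDU labels in the order shown in the figures (illustrative sketch).

    Lower-layer EDUs of later frames come before higher-layer EDUs of
    earlier frames: the outer loop runs over layers, the inner over frames.
    """
    return [
        f"F{frame}-L{layer}"
        for layer in layers
        for frame in range(1, num_frames + 1)
    ]


def nf_field(num_frames):
    """NF value as inferred from the examples: number of frames minus one."""
    return num_frames - 1


if __name__ == "__main__":
    print(nf_field(2), edu_order([1, 2, 3], 2))
```

With this ordering, an intermediate node can drop the highest layers by truncating the tail of the block, at the cost of the depacketizer having to reorder EDUs into decoding order.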
A.1.2. Layers in separate RTP streams A.1.2. Layers in Separate RTP Streams
In this case the data for each layer is transmitted in its own In this case the data for each layer is transmitted in its own
payload. payload.
In the first example each transport block including a single EDU is In the first example each transport block including a single EDU is
carried in its own RTP payload. carried in its own RTP payload.
+-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+
| RTP1a | L-ID=1 |NF=0 |F1-L1| | RTP1b | L-ID=6 |NF=0 |F1-L2| | RTP1a | L-ID=1 |NF=0 |F1-L1| | RTP1b | L-ID=6 |NF=0 |F1-L2|
+-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+
skipping to change at page 25, line 5 skipping to change at page 23, line 39
| RTP1c |L-ID=10 |NF=0 |F1-L3| | RTP2a | L-ID=1 |NF=0 |F2-L1| | RTP1c |L-ID=10 |NF=0 |F1-L3| | RTP2a | L-ID=1 |NF=0 |F2-L1|
+-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+
+-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+
| RTP2b | L-ID=6 |NF=0 |F2-L2| | RTP2c |L-ID=10 |NF=0 |F2-L3| | RTP2b | L-ID=6 |NF=0 |F2-L2| | RTP2c |L-ID=10 |NF=0 |F2-L3|
+-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+ +-------+--------+-----+-----+
If the payloads carry data from two consecutive input frames, the If the payloads carry data from two consecutive input frames, the
same encoded data as in the previous example is arranged as follows. same encoded data as in the previous example is arranged as follows.
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
| RTP1a | L-ID=1 |NF=1 |F1-L1|F2-L1| | RTP1a | L-ID=1 |NF=1 |F1-L1|F2-L1|
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
| RTP1b | L-ID=6 |NF=1 |F1-L2|F2-L2| | RTP1b | L-ID=6 |NF=1 |F1-L2|F2-L2|
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
| RTP1c |L-ID=10 |NF=1 |F1-L3|F2-L3| | RTP1c |L-ID=10 |NF=1 |F1-L3|F2-L3|
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
A.2. Advanced examples A.2. Advanced Examples
A.2.1. Different update rate for subset of layers A.2.1. Different Update Rate for Subset of Layers
An example employing different update rates (i.e. different number of An example employing different update rates (i.e. different number of
frames per packet) for selected subsets of layers. In these examples frames per packet) for selected subsets of layers. In these examples
all core codec layers L1-L5 are shown. all core codec layers L1-L5 are shown.
+-------+--------+-----+-----+-----+-----+-----+ +-------+--------+-----+-----+-----+-----+-----+
| RTP1 | L-ID=1 |NF=3 |F1-L1|F2-L1|F3-L1|F4-L1| | RTP1 | L-ID=1 |NF=3 |F1-L1|F2-L1|F3-L1|F4-L1|
+-------+--------+-----+-----+-----+-----+-----+ +-------+--------+-----+-----+-----+-----+-----+
+-------+--------+-----+-----+-----+-----+-----+ +-------+--------+-----+-----+-----+-----+-----+
| RTP2a | L-ID=7 |NF=1 |F1-L2|F2-L2|F1-L3|F2-L3| | RTP2a | L-ID=7 |NF=1 |F1-L2|F2-L2|F1-L3|F2-L3|
+-------+--------+-----+-----+-----+-----+-----+ +-------+--------+-----+-----+-----+-----+-----+
skipping to change at page 26, line 33 skipping to change at page 25, line 5
+-------+--------+-----+-----+-----+-----+-----+ +-------+--------+-----+-----+-----+-----+-----+
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
| RTP3c |L-ID=14 |NF=0 |F3-L4|F3-L5| | RTP3c |L-ID=14 |NF=0 |F3-L4|F3-L5|
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
| RTP3d |L-ID=14 |NF=0 |F4-L4|F4-L5| | RTP3d |L-ID=14 |NF=0 |F4-L4|F4-L5|
+-------+--------+-----+-----+-----+ +-------+--------+-----+-----+-----+
A.2.2. Redundant frames with limited set of layers A.2.2. Redundant Frames With Limited Set of Layers
An example transmitting layers L1-L3 as primary data and L1 (of the An example transmitting layers L1-L3 as primary data and L1 (of the
previous frame) as redundant data is shown below. Each payload previous frame) as redundant data is shown below. Each payload
carries one primary (i.e. new) frame in one transport block and one carries one primary (i.e. new) frame in one transport block and one
redundant frame, which in this example is the frame preceding the redundant frame, which in this example is the frame preceding the
primary frame, in another transport block. primary frame, in another transport block.
+-------+--------+-----+-----+--------+-----+-----+-----+-----+ +-------+--------+-----+-----+--------+-----+-----+-----+-----+
| RTP1 | L-ID=1 |NF=0 |F0-L1| L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3| | RTP1 | L-ID=1 |NF=0 |F0-L1| L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3|
+-------+--------+-----+-----+--------+-----+-----+-----+-----+ +-------+--------+-----+-----+--------+-----+-----+-----+-----+
+-------+--------+-----+-----+--------+-----+-----+-----+-----+ +-------+--------+-----+-----+--------+-----+-----+-----+-----+
| RTP2 | L-ID=1 |NF=0 |F1-L1| L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3| | RTP2 | L-ID=1 |NF=0 |F1-L1| L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3|
skipping to change at page 27, line 34 skipping to change at page 25, line 42
+-------+--------+-----+-----+-----+-----+--------+-----+-----+ +-------+--------+-----+-----+-----+-----+--------+-----+-----+
| RTP2 | L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3| L-ID=1 |NF=0 |F2-L1| | RTP2 | L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3| L-ID=1 |NF=0 |F2-L1|
+-------+--------+-----+-----+-----+-----+--------+-----+-----+ +-------+--------+-----+-----+-----+-----+--------+-----+-----+
+-------+--------+-----+-----+-----+-----+--------+-----+-----+ +-------+--------+-----+-----+-----+-----+--------+-----+-----+
| RTP3 | L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3| L-ID=1 |NF=0 |F3-L1| | RTP3 | L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3| L-ID=1 |NF=0 |F3-L1|
+-------+--------+-----+-----+-----+-----+--------+-----+-----+ +-------+--------+-----+-----+-----+-----+--------+-----+-----+
Now the first transport block carries the primary data and the second Now the first transport block carries the primary data and the second
transport block carries the redundant data, which in this case covers transport block carries the redundant data, which in this case covers
the frame following the primary frame. The benefit of this approach the frame following the primary frame. The benefit of this approach
is that the redundant data is included in the last (secondary) is that the redundant data is included in the last (secondary)
transport block of the payload, which might be beneficial for transport block of the payload, which might be beneficial for
possible payload scaling operation within the network. possible payload scaling operation within the network.
8. References Authors' Addresses
8.1. Normative References
[AMR-WB] 3GPP TS 26.171, "Adaptive Multi-Rate Wideband (AMR-WB)
speech codec; General description (Release 7)", v7.0.0,
September 2006.
[G.718] ITU-T Recommendation G.718, "Frame Error Robust Narrowband
and Wideband Embedded Variable Bit-Rate Coding of Speech
and Audio from 8-32 Kbit/s", (consented) May 2008.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3264] Rosenberg, J., Schulzrinne, H., "An Offer/Answer Model with
Session Description Protocol (SDP)", RFC 3264, June 2002.
[RFC3550]Schulzrinne, H., Casner, S., Frederick, R. and Jacobson, V.,
"RTP: A Transport Protocol for Real-Time Applications", STD
64, RFC 3550, July 2003.
[RFC3551] Schulzrinne, H., Casner, S., "RTP Profile for Audio and
Video Conferences with Minimal Control", STD 65, RFC 3551,
July 2003.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., Norrman,
K., "The Secure Real-Time Transport Protocol (SRTP)", RFC
3711, March 2004.
[RFC4288] Freed, N., Klensin, J., "Media Type Specifications and
Registration Procedures", BCP 13, RFC 4288, December 2005.
[RFC4566] Handley, M., Jacobson, V. and Perkins, C., "SDP: Session
Description Protocol", RFC 4566, July 2006.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J.,
"Extended RTP Profile for Real-Time Transport Control
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
2006.
[RFC4855] Casner, S., "Media Type Registration of RTP Payload
Formats", RFC 4855, February 2007.
[RFC4867] Sjoberg, J., Westerlund, M., Lakaniemi, A., Xie, Q., "RTP
Payload Format and File Storage Format for the Adaptive
Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB)
Audio Codecs", RFC 4867, April 2007.
[RFC5104] Wenger, S., Chandra, U., Westerlund, M., Burman, B., "Codec
Control Messages in the RTP Audio-Visual Profile with
Feedback (AVPF)", RFC 5104, February 2008.
[smd-sdp] Schierl, T., Wenger, S., "Signaling media decoding
dependency in Session Description Protocol (SDP)", draft-
schierl-mmusic-layered-codec-04 (work in progress), June
2007.
8.2. Informative References
[McCanne] McCanne, S., Jacobson, V., and Vetterli, M., "Receiver-
driven layered multicast", in Proc. of ACM SIGCOMM'96,
pages 117--130, Stanford, CA, August 1996.
[RFC2326] Schulzrinne, H., Rao, A., Lanphier, R., "Real Time
Streaming Protocol (RTSP)", RFC 2326, April 1998.
[RFC2974] Handley, M., Perkins, C., Whelan, E., "Session Announcement
Protocol", RFC 2974, October 2000.
[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Glen Zorn (editor)
Fairhurst, G., "The Lightweight User Datagram Protocol Network Zen
(UDP-Lite)", RFC 3828, July 2004. 227/358 Thanon Sanphawut
Bang Na, Bangkok 10260
Thailand
[RFC4340] Kohler, E., Handley, M., Floyd, S., "Datagram Congestion Phone: +66 (0) 87-040-4617
Control Protocol (DCCP)", RFC 4340, March 2006. EMail: gwz@net-zen.net
[RFC5117] Westerlund, M., Wenger, S., "RTP Topologies", RFC 5117, Ye-Kui Wang
January 2008. Huawei Technologies
400 Somerset Corp Blvd.
Suite 402
Bridgewater, NJ 08807
USA
Author's Addresses Phone: +1 (908) 541-3518
EMail: yekuiwang@huawei.com
Ari Lakaniemi Ari Lakaniemi
Nokia Nokia
P.O.Box 407 P.O.Box 407
FIN-00045 Nokia Group, FINLAND FIN-00045 Nokia Group
Finland
Phone: +358-71-8008000 Phone: +358-71-8008000
Email: ari.lakaniemi@nokia.com EMail: ari.lakaniemi@nokia.com
Ye-Kui Wang
Huawei Technologies
400 Somerset Corp Blvd, Suite 602
Bridgewater, NJ 08807, USA
Phone: +1-908-541-3518
EMail: yekuiwang@huawei.com
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.
9. Open Issues
1) Support of super-wideband (SWB) audio and stereophonic encoding
extensions to ITU-T G.718 currently being worked on by ITU-T is to
be specified after ITU-T completes the work in that regard.
a. Some further study is needed to see if separate parameters
for sending and receiving capabilities/preferences are needed
-- especially for upcoming stereo and SWB options.
b. The support for upcoming SWB and stereo options needs to be
taken into account. Basically we can either 1) extend the
parameter "layers" to cover also this aspect, or 2) define
separate parameter(s) for these new options when more details
on the stereo/SWB support are available.
2) For streaming or other applications that allow for relatively long
end-to-end delay, sometimes it would be beneficial to aggregate
more than 4 frames in one Transport Block (TB). Should the length
of the NF field be larger?
3) On layer structure and configuration signalling. Currently, a
unique layer ID is assigned for any possible layer combinations.
See the editing notes below Table 3 for other possible approaches.
One of the alternative ways may be chosen in the final draft.
4) Currently, it is mandated that lower layer EDUs of later frames go
before higher layer EDUs of earlier frames in a transport block.
This way is friendlier to adaptation (dropping of higher layers).
However, if all layers are received, then the depacketizer needs
to reorder the EDUs to their decoding order before feeding them to
the decoder. Therefore, the other way around (i.e. lower layer
EDUs of later frames go after higher layer EDUs of earlier frames,
or EDUs in transport blocks are placed in decoding order) is more
friendly to the depacketizer. Another benefit of the latter is
that it does not introduce any end-to-end delay. Which way to be
specified (or both allowed if needed) is FFS.
5) MANEs dropping RTP packets are RTP translators. But are those
MANEs dropping a subset of the transport blocks in one packet also
RTP translators?
6) The RTCP based cross-session synchronization is not possible until
the first RTCP SRs are received in all sessions. This implies that
decoding only a subset of layers may be possible until RTCP SRs in
all sessions have been received. This may impose higher end-to-
end delay or higher bandwidth for RTCP data, and the approach may
not work perfectly for some multicast topologies. There is a study
ongoing by some AVT members. Once there is an acceptable solution
found, the draft documenting that solution may be referenced in this
draft.
7) It might be better to change the semantics of the media type
parameter 'layers' to be similar to that for L-ID.
8) Offer/answer with answer being capable of modifying the layer
configuration is FFS.
9) Some references need to be updated in the final draft.
10. Changes Log
From draft-ietf-avt-rtp-g718-00 to draft-ietf-avt-rtp-g718-01
- Updated the boiler template.
- Changed Ye-Kui Wang's affiliation and address.
From draft-ietf-avt-rtp-g718-01 to draft-ietf-avt-rtp-g718-02
- Updated the boiler template (added the last sentence in Copyright
Notice).
 End of changes. 176 change blocks. 
602 lines changed or deleted 587 lines changed or added
