draft-ietf-avt-mpeg4-simple-03.txt   draft-ietf-avt-mpeg4-simple-04.txt 
skipping to change at page 1, line 13 skipping to change at page 1, line 13
Internet Draft Philips Electronics Internet Draft Philips Electronics
D. Mackie D. Mackie
Cisco Systems Inc. Cisco Systems Inc.
V. Swaminathan V. Swaminathan
Sun Microsystems Inc. Sun Microsystems Inc.
D. Singer D. Singer
Apple Computer Apple Computer
P. Gentric P. Gentric
Philips Electronics Philips Electronics
June 2002 July 2002
Expires December 2002 Expires January 2003
Document: draft-ietf-avt-mpeg4-simple-03.txt Document: draft-ietf-avt-mpeg4-simple-04.txt
Transport of MPEG-4 Elementary Streams Transport of MPEG-4 Elementary Streams
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 2, line 15 skipping to change at page 2, line 15
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Carriage of MPEG-4 elementary streams over RTP . . . . . . . 4 2. Carriage of MPEG-4 elementary streams over RTP . . . . . . . 4
2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 4 2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 4
2.2. MPEG Access Units . . . . . . . . . . . . . . . . . . . . 4 2.2. MPEG Access Units . . . . . . . . . . . . . . . . . . . . 4
2.3. Concatenation of Access Units . . . . . . . . . . . . . . 4 2.3. Concatenation of Access Units . . . . . . . . . . . . . . 4
2.4. Fragmentation of Access Units . . . . . . . . . . . . . . 5 2.4. Fragmentation of Access Units . . . . . . . . . . . . . . 5
2.5. Interleaving . . . . . . . . . . . . . . . . . . . . . . . 5 2.5. Interleaving . . . . . . . . . . . . . . . . . . . . . . . 5
2.6. Time stamp information . . . . . . . . . . . . . . . . . . 6 2.6. Time stamp information . . . . . . . . . . . . . . . . . . 6
2.7. Random Access Indication . . . . . . . . . . . . . . . . . 6 2.7. State indication of MPEG-4 system streams . . . . . . . . 6
2.8. State indication of MPEG-4 system streams . . . . . . . . 6 2.8. Random Access Indication . . . . . . . . . . . . . . . . . 6
2.9. Carriage of auxiliary information . . . . . . . . . . . . 7 2.9. Carriage of auxiliary information . . . . . . . . . . . . 7
2.10. MIME format parameters and configuring conditional field . 7 2.10. MIME format parameters and configuring conditional field . 7
2.11. Global structure of payload format . . . . . . . . . . . . 7 2.11. Global structure of payload format . . . . . . . . . . . . 7
2.12. Modes to transport MPEG-4 streams . . . . . . . . . . . . 8 2.12. Modes to transport MPEG-4 streams . . . . . . . . . . . . 8
2.13. Alignment with RFC 3016 . . . . . . . . . . . . . . . . . 8 2.13. Alignment with RFC 3016 . . . . . . . . . . . . . . . . . 8
3. Payload format . . . . . . . . . . . . . . . . . . . . . . . 9 3. Payload format . . . . . . . . . . . . . . . . . . . . . . . 9
3.1. Usage of RTP header fields and RTCP . . . . . . . . . . . 9 3.1. Usage of RTP header fields and RTCP . . . . . . . . . . . 9
3.2. RTP payload structure . . . . . . . . . . . . . . . . . . 10 3.2. RTP payload structure . . . . . . . . . . . . . . . . . . 10
3.2.1. The AU Header Section . . . . . . . . . . . . . . . . . 10 3.2.1. The AU Header Section . . . . . . . . . . . . . . . . . 10
3.2.1.1. The AU-header . . . . . . . . . . . . . . . . . . . . 10 3.2.1.1. The AU-header . . . . . . . . . . . . . . . . . . . . 10
3.2.2. The Auxiliary Section . . . . . . . . . . . . . . . . . 13 3.2.2. The Auxiliary Section . . . . . . . . . . . . . . . . . 12
3.2.3. The Access Unit Data Section . . . . . . . . . . . . . . 13 3.2.3. The Access Unit Data Section . . . . . . . . . . . . . . 13
3.2.3.1. Fragmentation . . . . . . . . . . . . . . . . . . . . 14 3.2.3.1. Fragmentation . . . . . . . . . . . . . . . . . . . . 14
3.2.3.2. Interleaving . . . . . . . . . . . . . . . . . . . . . 14 3.2.3.2. Interleaving . . . . . . . . . . . . . . . . . . . . . 14
3.2.3.3. Constraints for interleaving . . . . . . . . . . . . . 15 3.2.3.3. Constraints for interleaving . . . . . . . . . . . . . 15
3.3. Usage of this specification . . . . . . . . . . . . . . . 16 3.2.3.4. Crucial and non-crucial AUs with MPEG-4 System data . 16
3.3.1. General . . . . . . . . . . . . . . . . . . . . . . . . 16 3.3. Usage of this specification . . . . . . . . . . . . . . . 17
3.3.2. The generic mode . . . . . . . . . . . . . . . . . . . . 16 3.3.1. General . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.3. Constant bit rate CELP . . . . . . . . . . . . . . . . . 17 3.3.2. The generic mode . . . . . . . . . . . . . . . . . . . . 17
3.3.3. Constant bit rate CELP . . . . . . . . . . . . . . . . . 18
3.3.4. Variable bit rate CELP . . . . . . . . . . . . . . . . . 18 3.3.4. Variable bit rate CELP . . . . . . . . . . . . . . . . . 18
3.3.5. Low bit rate AAC . . . . . . . . . . . . . . . . . . . . 19 3.3.5. Low bit rate AAC . . . . . . . . . . . . . . . . . . . . 19
3.3.6. High bit rate AAC . . . . . . . . . . . . . . . . . . . 19 3.3.6. High bit rate AAC . . . . . . . . . . . . . . . . . . . 20
3.3.7. Additional modes . . . . . . . . . . . . . . . . . . . . 20 3.3.7. Additional modes . . . . . . . . . . . . . . . . . . . . 21
4. IANA considerations . . . . . . . . . . . . . . . . . . . . 21 4. IANA considerations . . . . . . . . . . . . . . . . . . . . 22
4.1. MIME type registration . . . . . . . . . . . . . . . . . . 21 4.1. MIME type registration . . . . . . . . . . . . . . . . . . 22
4.2. Concatenation of parameters . . . . . . . . . . . . . . . 26 4.2. Registration of mode definitions with IANA . . . . . . . . 27
4.3. Usage of SDP . . . . . . . . . . . . . . . . . . . . . . . 26 4.3. Concatenation of parameters . . . . . . . . . . . . . . . 27
4.3.1. The a=fmtp keyword . . . . . . . . . . . . . . . . . . . 26 4.4. Usage of SDP . . . . . . . . . . . . . . . . . . . . . . . 28
5. Security considerations . . . . . . . . . . . . . . . . . . 27 4.4.1. The a=fmtp keyword . . . . . . . . . . . . . . . . . . . 28
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 28 5. Security considerations . . . . . . . . . . . . . . . . . . 28
7. References . . . . . . . . . . . . . . . . . . . . . . . . . 28 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 29
8. Author addresses . . . . . . . . . . . . . . . . . . . . . . 29 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 29
APPENDIX: Usage of this payload format . . . . . . . . . . . 30 8. Author addresses . . . . . . . . . . . . . . . . . . . . . . 30
A. Examples of delay analysis with interleave . . . . . . . 30 APPENDIX: Usage of this payload format . . . . . . . . . . . 31
A.1 Group interleave . . . . . . . . . . . . . . . . . . . . 30 A. Examples of delay analysis with interleave . . . . . . . 31
A.2 Continuous interleave . . . . . . . . . . . . . . . . . 31 A.1 Group interleave . . . . . . . . . . . . . . . . . . . . 31
A.2 Continuous interleave . . . . . . . . . . . . . . . . . 32
1. Introduction 1. Introduction
The MPEG Committee is Working Group 11 (WG11) in ISO/IEC JTC1 SC29 The MPEG Committee is Working Group 11 (WG11) in ISO/IEC JTC1 SC29
that specified the MPEG-1, MPEG-2 and, more recently, the MPEG-4 that specified the MPEG-1, MPEG-2 and, more recently, the MPEG-4
standards [1]. The MPEG-4 standard specifies compression of standards [1]. The MPEG-4 standard specifies compression of
audio-visual data into for example an audio or video elementary audio-visual data into for example an audio or video elementary
stream. In the MPEG-4 standard, these streams take the form of stream. In the MPEG-4 standard, these streams take the form of
audiovisual objects that may be arranged into an audio-visual scene audiovisual objects that may be arranged into an audio-visual scene
by means of a scene description. Each MPEG-4 elementary stream by means of a scene description. Each MPEG-4 elementary stream
skipping to change at page 3, line 29 skipping to change at page 3, line 29
MPEG-4 audio (including speech) streams, MPEG-4 video streams and MPEG-4 audio (including speech) streams, MPEG-4 video streams and
also MPEG-4 systems streams, such as BIFS (BInary Format for also MPEG-4 systems streams, such as BIFS (BInary Format for
Scenes), OCI (Object Content Information), OD (Object Descriptor) Scenes), OCI (Object Content Information), OD (Object Descriptor)
and IPMP (Intellectual Property Management and Protection) streams. and IPMP (Intellectual Property Management and Protection) streams.
The RTP payload defined in this document is simple to implement and The RTP payload defined in this document is simple to implement and
reasonably efficient. It allows for optional interleaving of Access reasonably efficient. It allows for optional interleaving of Access
Units (such as audio frames) to increase error resiliency in packet Units (such as audio frames) to increase error resiliency in packet
loss. loss.
Though the RTP payload format defined in this document is capable Though the RTP payload format defined in this document is capable
to transport any MPEG-4 stream, more dedicated formats may exist, of transporting any MPEG-4 stream, other, more specific, formats
such as RFC 3016 for transport of MPEG-4 video (part 2). may exist, such as RFC 3016 for transport of MPEG-4 video (part 2).
Configuration of the payload is provided to accommodate transport Configuration of the payload is provided to accommodate transport
of any MPEG-4 stream at any possible bit rate. However, for a of any MPEG-4 stream at any possible bit rate. However, for a
specific MPEG-4 elementary stream typically only very few specific MPEG-4 elementary stream typically only very few
configurations are needed. So as to allow for the design of configurations are needed. So as to allow for the design of
simplified, but dedicated receivers, this specification requires simplified, but dedicated receivers, this specification requires
that specific modes are defined for transport of MPEG-4 streams. that specific modes are defined for transport of MPEG-4 streams.
This document defines modes for MPEG-4 CELP and AAC streams, as This document defines modes for MPEG-4 CELP and AAC streams, as
well as a generic mode that can be used to transport any MPEG-4 well as a generic mode that can be used to transport any MPEG-4
stream. In the future new RFCs are expected to specify additional stream. In the future new RFCs are expected to specify additional
skipping to change at page 4, line 12 skipping to change at page 4, line 12
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in RFC 2119 [3]. this document are to be interpreted as described in RFC 2119 [3].
2. Carriage of MPEG-4 elementary streams over RTP 2. Carriage of MPEG-4 elementary streams over RTP
2.1 Introduction 2.1 Introduction
With this payload format a single MPEG-4 elementary stream can be With this payload format a single MPEG-4 elementary stream can be
transported. Information on the type of MPEG-4 stream carried in transported. Information on the type of MPEG-4 stream carried in
the payload is conveyed by MIME format parameters, for example in the payload is conveyed by MIME format parameters, for example in
an SDP [6] message or by other means. These MIME format parameters an SDP [6] message or by other means (see section 4). These MIME
specify the configuration of the payload. To allow for simplified format parameters specify the configuration of the payload. To
and dedicated receivers, a MIME format parameter is available allow for simplified and dedicated receivers, a MIME format
to signal a specific mode of using this payload. A mode definition parameter is available to signal a specific mode of using this
MAY include the type of MPEG-4 elementary stream as well as the payload. A mode definition MAY include the type of MPEG-4
applied configuration, so as to avoid the need in receivers elementary stream as well as the applied configuration, so as to
to parse all MIME format parameters. The applied mode MUST be avoid the need in receivers to parse all MIME format parameters.
signalled. The applied mode MUST be signaled.
2.2 MPEG Access Units 2.2 MPEG Access Units
For carriage of compressed audio-visual data MPEG defines Access For carriage of compressed audio-visual data MPEG defines Access
Units. An MPEG Access Unit (AU) is the smallest data entity to Units. An MPEG Access Unit (AU) is the smallest data entity to
which timing information is attributed. In case of audio an Access which timing information is attributed. In case of audio an Access
Unit may represent an audio frame and in case of video a picture. Unit may represent an audio frame and in case of video a picture.
MPEG Access Units are by definition byte aligned. If for example an MPEG Access Units are by definition octet aligned. If for example
audio frame is not byte aligned, up to 7 zero-padding bits MUST be an audio frame is not octet aligned, up to 7 zero-padding bits MUST
inserted at the end of the frame to achieve a byte-aligned Access be inserted at the end of the frame to achieve the octet-aligned
Unit. MPEG-4 decoders MUST be able to decode AUs in which such Access Units, as required by the MPEG-4 specification. MPEG-4
padding is applied. decoders MUST be able to decode AUs in which such padding is
applied.
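The padding rule above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a frame is held as a string of '0'/'1' characters; the function name is hypothetical and is not part of either draft.

```python
def pad_to_octet_boundary(bits: str) -> str:
    """Append zero bits until the bit count is a multiple of 8.

    'bits' is a string of '0'/'1' characters (an illustrative
    representation only, not something the draft defines).
    """
    padding = (-len(bits)) % 8      # always in the range 0..7
    return bits + "0" * padding


frame = "1011001110"                # a 10-bit frame, not octet aligned
padded = pad_to_octet_boundary(frame)
```

A frame that is already octet aligned is returned unchanged, so at most 7 padding bits are ever added.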
Consistent with the MPEG-4 specification, this document requires Consistent with the MPEG-4 specification, this document requires
that each MPEG-4 part 2 video Access Unit includes all the coded that each MPEG-4 part 2 video Access Unit includes all the coded
data of a picture, any video stream headers that may precede the data of a picture, any video stream headers that may precede the
coded picture data, and any video stream stuffing that may follow coded picture data, and any video stream stuffing that may follow
it, up to, but not including the startcode indicating the start of it, up to, but not including the startcode indicating the start of
a new video stream or the next Access Unit. a new video stream or the next Access Unit.
2.3 Concatenation of Access Units 2.3 Concatenation of Access Units
skipping to change at page 5, line 23 skipping to change at page 5, line 23
integral. integral.
2.4 Fragmentation of Access Units 2.4 Fragmentation of Access Units
MPEG allows for very large Access Units. Since most IP networks MPEG allows for very large Access Units. Since most IP networks
have significantly smaller MTU sizes, this payload format allows have significantly smaller MTU sizes, this payload format allows
for the fragmentation of an Access Unit over multiple RTP packets for the fragmentation of an Access Unit over multiple RTP packets
so as to avoid IP layer fragmentation. To simplify the so as to avoid IP layer fragmentation. To simplify the
implementation of RTP receivers, an RTP packet SHALL either carry implementation of RTP receivers, an RTP packet SHALL either carry
one or more complete Access Units or a single fragment of one one or more complete Access Units or a single fragment of one
Access Unit. Access Unit (i.e. packets MUST NOT contain fragments of multiple
Access Units).
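As a rough sketch of the packetization rule in section 2.4, the helper below splits one large Access Unit into fragments, one per RTP packet, and a second helper checks whether a packet's contents respect the "complete AUs or a single fragment" constraint. Both function names and the max_payload parameter are hypothetical; header overhead is ignored for brevity.

```python
def fragment_access_unit(au: bytes, max_payload: int) -> list[bytes]:
    """Split one Access Unit into fragments of at most max_payload octets.

    Each returned fragment is meant to travel alone in its own RTP packet.
    """
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    if not au:
        raise ValueError("an Access Unit must not be empty")
    return [au[i:i + max_payload] for i in range(0, len(au), max_payload)]


def is_valid_packet(complete_aus: int, fragments: int) -> bool:
    """True if a packet carries one or more complete AUs, or exactly one
    fragment, and never a mix of the two."""
    return (complete_aus >= 1 and fragments == 0) or \
           (complete_aus == 0 and fragments == 1)


fragments = fragment_access_unit(bytes(3000), max_payload=1400)
```

With a 3000-octet Access Unit and a 1400-octet limit, this yields three fragments of 1400, 1400 and 200 octets.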
2.5 Interleaving 2.5 Interleaving
When an RTP packet carries a contiguous sequence of Access Units, When an RTP packet carries a contiguous sequence of Access Units,
the loss of such a packet can result in a "decoding gap" for the the loss of such a packet can result in a "decoding gap" for the
user. One method to alleviate this problem is to allow for the user. One method to alleviate this problem is to allow for the
Access Units to be interleaved in the RTP packets. For a modest Access Units to be interleaved in the RTP packets. For a modest
cost in latency and implementation complexity, significant error cost in latency and implementation complexity, significant error
resiliency to packet loss can be achieved. resiliency to packet loss can be achieved.
skipping to change at page 6, line 19 skipping to change at page 6, line 19
an RTP packet, the time stamps of subsequent AUs can be calculated an RTP packet, the time stamps of subsequent AUs can be calculated
if the frame period of each AU is known. For audio and video this if the frame period of each AU is known. For audio and video this
is possible if the frame rate is constant. However, in some cases is possible if the frame rate is constant. However, in some cases
it is not possible to make such calculation, for example for it is not possible to make such calculation, for example for
variable frame rate video and for MPEG-4 BIFS streams carrying variable frame rate video and for MPEG-4 BIFS streams carrying
composition information. To support such cases, this payload format composition information. To support such cases, this payload format
can be configured to carry a time stamp in the RTP payload for each can be configured to carry a time stamp in the RTP payload for each
contained Access Unit. A time stamp MAY be conveyed in the RTP contained Access Unit. A time stamp MAY be conveyed in the RTP
payload only for non-first AUs in the RTP packet, and SHALL NOT be payload only for non-first AUs in the RTP packet, and SHALL NOT be
conveyed for the first AU (fragment), as the time stamp for the conveyed for the first AU (fragment), as the time stamp for the
latter is carried by the RTP time stamp. first AU in the RTP packet is carried by the RTP time stamp.
MPEG-4 defines two types of time stamps, the composition time stamp MPEG-4 defines two types of time stamps, the composition time stamp
(CTS) and the decoding time stamp (DTS). The CTS represents the (CTS) and the decoding time stamp (DTS). The CTS represents the
sampling instance of an AU, and hence the CTS is equivalent to the sampling instance of an AU, and hence the CTS is equivalent to the
RTP time stamp. The DTS may be used only in MPEG-4 video streams RTP time stamp. The DTS may be used only in MPEG-4 video streams
that use bi-directional coding, i.e. when pictures are predicted in that use bi-directional coding, i.e. when pictures are predicted in
both forward and backward direction by using either a reference both forward and backward direction by using either a reference
picture in the past, or a reference picture in the future. The DTS picture in the past, or a reference picture in the future. The DTS
cannot be carried in the RTP header. In some cases the DTS can be cannot be carried in the RTP header. In some cases the DTS can be
derived from the RTP time stamp using frame rate information; this derived from the RTP time stamp using frame rate information; this
skipping to change at page 6, line 41 skipping to change at page 6, line 41
objectionable. But if the video frame rate is variable, the required objectionable. But if the video frame rate is variable, the required
information may not even be present in the video stream. For both information may not even be present in the video stream. For both
reasons, the capability has been defined to optionally carry the reasons, the capability has been defined to optionally carry the
DTS in the RTP payload for each contained Access Unit. DTS in the RTP payload for each contained Access Unit.
Since RTP time stamps may be re-stamped by RTP devices, each time Since RTP time stamps may be re-stamped by RTP devices, each time
stamp contained in the RTP payload is coded differentially, the CTS stamp contained in the RTP payload is coded differentially, the CTS
from the RTP time stamp, and the DTS from the CTS, so as to avoid from the RTP time stamp, and the DTS from the CTS, so as to avoid
extensive parsing by re-stamping devices. extensive parsing by re-stamping devices.
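The benefit of differential coding can be shown with a small sketch. The sign conventions used here (CTS delta = CTS minus RTP time stamp, DTS delta = CTS minus DTS) and the function names are assumptions made for illustration; the exact field definitions belong to section 3.2.1.1.

```python
MODULUS = 1 << 32  # RTP time stamps are 32-bit values


def encode_deltas(rtp_ts: int, cts: int, dts: int) -> tuple[int, int]:
    """Code the CTS relative to the RTP time stamp and the DTS relative
    to the CTS (assumed sign conventions, see lead-in)."""
    return cts - rtp_ts, cts - dts


def decode_deltas(rtp_ts: int, cts_delta: int, dts_delta: int) -> tuple[int, int]:
    """Recover absolute CTS and DTS from the packet's RTP time stamp."""
    cts = (rtp_ts + cts_delta) % MODULUS
    dts = (cts - dts_delta) % MODULUS
    return cts, dts


# Second AU in a packet whose RTP time stamp is 90000.
cts_delta, dts_delta = encode_deltas(rtp_ts=90000, cts=93600, dts=90000)

# A re-stamping device shifts only the RTP header time stamp by 500000;
# the deltas in the payload are left untouched.
restamped = decode_deltas(90000 + 500000, cts_delta, dts_delta)
```

After re-stamping, both recovered time stamps are shifted by the same offset as the RTP time stamp, so the device never needs to parse or rewrite the payload.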
2.7 Random access indication 2.7 State indication of MPEG-4 system streams
ISO/IEC 14496-1 defines states for MPEG-4 system streams. So as to
convey state information when transporting MPEG-4 system streams,
this payload format allows for the optional carriage in the RTP
payload of the stream state for each contained Access Unit. Stream
states are used to signal "crucial" AUs that carry information whose
loss cannot be tolerated and are also useful when repeating AUs
according to the carousel mechanism defined in ISO/IEC 14496-1.
2.8 Random access indication
Random access to the content of MPEG-4 elementary streams may be Random access to the content of MPEG-4 elementary streams may be
possible at some but not all Access Units. To signal Access Units possible at some but not all Access Units. To signal Access Units
where random access is possible, a random access point flag can where random access is possible, a random access point flag can
optionally be carried in the RTP payload for each contained Access optionally be carried in the RTP payload for each contained Access
Unit. Unit. Carriage of random access points is particularly useful for
MPEG-4 system streams in combination with the stream state.
2.8 State indication of MPEG-4 system streams
ISO/IEC 14496-1 defines states for MPEG-4 system streams. So as to
convey state information when transporting MPEG-4 system streams,
this payload format allows for the optional carriage in the RTP
payload of the stream state for each contained Access Unit. The
indication of stream states is particularly useful when repeating
AUs according to the carousel mechanism defined in ISO/IEC 14496-1.
2.9 Carriage of auxiliary information 2.9 Carriage of auxiliary information
This payload format defines a specific field to carry auxiliary This payload format defines a specific field to carry auxiliary
data. The auxiliary data field is preceded by a field that specifies data. The auxiliary data field is preceded by a field that specifies
the length of the auxiliary data, so as to facilitate skipping of the length of the auxiliary data, so as to facilitate skipping of
the data without parsing it. The coding of the auxiliary data is not the data without parsing it. The coding of the auxiliary data is not
defined in this document, but is left to the discretion of defined in this document, but is left to the discretion of
applications. Receivers that have knowledge of the auxiliary data applications. Receivers that have knowledge of the auxiliary data
MAY decode the auxiliary data, but receivers without knowledge of MAY decode the auxiliary data, but receivers without knowledge of
skipping to change at page 7, line 33 skipping to change at page 7, line 35
efficient in either case, the fields to support these features are efficient in either case, the fields to support these features are
configurable by means of MIME format parameters. In general, a MIME configurable by means of MIME format parameters. In general, a MIME
format parameter defines the presence and length of the associated format parameter defines the presence and length of the associated
field. A length of zero indicates absence of the field. As a field. A length of zero indicates absence of the field. As a
consequence, parsing of the payload requires knowledge of MIME consequence, parsing of the payload requires knowledge of MIME
format parameters. The MIME format parameters are conveyed to the format parameters. The MIME format parameters are conveyed to the
receiver via SDP [6] messages or through other means. receiver via SDP [6] messages or through other means.
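The "length of zero indicates absence" convention can be sketched as follows. The parameter names and the "name=value; name=value" syntax are hypothetical placeholders chosen for this example; the actual parameters and their SDP encoding are defined in section 4.

```python
def parse_format_parameters(fmtp: str) -> dict[str, int]:
    """Parse a 'name=value; name=value' string into integer lengths.

    The syntax is an assumption for illustration only.
    """
    params: dict[str, int] = {}
    for item in fmtp.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        params[name.strip().lower()] = int(value)
    return params


def present_fields(params: dict[str, int]) -> dict[str, int]:
    """Keep only fields with a non-zero length; zero means 'absent'."""
    return {name: length for name, length in params.items() if length > 0}


# Hypothetical parameter names, not taken from the draft.
configured = parse_format_parameters("example-a-length=13; example-b-length=0")
fields = present_fields(configured)
```

A receiver built this way knows, before parsing any payload, which fields to expect and how many bits each one occupies.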
2.11 Global structure of payload format 2.11 Global structure of payload format
The RTP payload following the RTP header contains three byte The RTP payload following the RTP header contains three octet
aligned data sections, of which the first two MAY be empty. See aligned data sections, of which the first two MAY be empty. See
figure 1. figure 1.
+---------+-----------+-----------+---------------+ +---------+-----------+-----------+---------------+
| RTP | AU Header | Auxiliary | Access Unit | | RTP | AU Header | Auxiliary | Access Unit |
| Header | Section | Section | Data Section | | Header | Section | Section | Data Section |
+---------+-----------+-----------+---------------+ +---------+-----------+-----------+---------------+
<----------RTP Packet Payload-----------> <----------RTP Packet Payload----------->
Figure 1: Data sections within an RTP packet Figure 1: Data sections within an RTP packet
The first data section is the AU (Access Unit) Header Section, that The first data section is the AU (Access Unit) Header Section, that
contains one or more AU-headers; however, each AU-header MAY be contains one or more AU-headers; however, each AU-header MAY be
empty, in which case the entire AU Header Section is empty. The empty, in which case the entire AU Header Section is empty. The
second section is the Auxiliary Section, containing auxiliary data; second section is the Auxiliary Section, containing auxiliary data;
this section MAY also be configured empty. The third section is the this section MAY also be configured empty. The third section is the
Access Unit Data Section, containing either a single fragment of Access Unit Data Section, containing either a single fragment of
one Access Unit or one or more complete Access Units. The Access one Access Unit or one or more complete Access Units. The Access
Unit Data Section is never empty. Unit Data Section MUST NOT be empty.
2.12 Modes to transport MPEG-4 streams 2.12 Modes to transport MPEG-4 streams
While it is possible to build fully configurable receivers capable While it is possible to build fully configurable receivers capable
of receiving any MPEG-4 stream, this specification also allows for of receiving any MPEG-4 stream, this specification also allows for
the design of simplified, but dedicated receivers, that are capable the design of simplified, but dedicated receivers, that are capable
for example of receiving only one type of MPEG-4 stream. This for example of receiving only one type of MPEG-4 stream. This
is achieved by requiring that specific modes be defined for using is achieved by requiring that specific modes be defined for using
this specification. Each mode may define constraints for transport this specification. Each mode may define constraints for transport
of one or more types of MPEG-4 streams, for instance on the payload of one or more types of MPEG-4 streams, for instance on the payload
configuration. configuration.
The applied mode MUST be signalled. Signalling the mode is The applied mode MUST be signaled. Signaling the mode is
particularly important for receivers that are only capable of particularly important for receivers that are only capable of
decoding one or more specific modes. Such receivers need to decoding one or more specific modes. Such receivers need to
determine whether the applied mode is supported, so as to avoid determine whether the applied mode is supported, so as to avoid
problems with processing of payloads that are beyond the problems with processing of payloads that are beyond the
capabilities of the receiver. capabilities of the receiver.
In this document several modes are defined for transport of MPEG-4 In this document several modes are defined for transport of MPEG-4
CELP and AAC streams, as well as a generic mode that can be used CELP and AAC streams, as well as a generic mode that can be used
for any MPEG-4 stream. In future, new RFCs are expected to specify for any MPEG-4 stream. In future, new RFCs are expected to specify
additional modes of using this specification. New modes can be additional modes of using this specification. New modes can be
skipping to change at page 8, line 49 skipping to change at page 8, line 49
Conversely, receivers that comply with the specification in this Conversely, receivers that comply with the specification in this
document SHOULD be able to decode payloads, names and parameters document SHOULD be able to decode payloads, names and parameters
defined for MPEG-4 video in RFC 3016. In this respect it is defined for MPEG-4 video in RFC 3016. In this respect it is
strongly recommended to implement the ability to ignore "in band" strongly recommended to implement the ability to ignore "in band"
video decoder configuration packets in the RFC 3016 payload. video decoder configuration packets in the RFC 3016 payload.
Note the "out of band" availability of the video decoder Note the "out of band" availability of the video decoder
configuration is optional in RFC 3016. To achieve maximum configuration is optional in RFC 3016. To achieve maximum
interoperability with the RTP payload format defined in this interoperability with the RTP payload format defined in this
document, applications that use RFC 3016 to transport MPEG-4 video document, applications that use RFC 3016 to transport MPEG-4 video
(part 2) are recommended to make the video decoder configuration (part 2) are RECOMMENDED to make the video decoder configuration
available as a MIME parameter. available as a MIME parameter.
3. Payload Format 3. Payload Format
3.1 Usage of RTP Header Fields and RTCP 3.1 Usage of RTP Header Fields and RTCP
Payload Type (PT): The assignment of an RTP payload type for this Payload Type (PT): The assignment of an RTP payload type for this
RTP packet format is outside the scope of this document, and will RTP packet format is outside the scope of this document, and will
not be specified here. It is expected that the RTP profile for a not be specified here. It is expected that the RTP profile for a
particular class of applications will assign a payload type for particular class of applications will assign a payload type for
this encoding, or if that is not done, then a payload type in the this encoding, or if that is not done, then a payload type in the
dynamic range shall be chosen. dynamic range shall be chosen.
Marker (M) bit: The M bit is set to 1 to indicate that the RTP Marker (M) bit: The M bit is set to 1 to indicate that the RTP
packet payload includes the end of each Access Unit of which data packet payload includes the end of each Access Unit of which data
is contained in this RTP packet. As the payload either carries one is contained in this RTP packet. As the payload either carries one
or more complete Access Units or a single fragment of an Access or more complete Access Units or a single fragment of an Access
Unit, the M bit is always set to 1, except when the packet carries Unit, the M bit is usually set to 1, except when the packet carries
a single fragment of an Access Unit that is not the last one. a single fragment of an Access Unit that is not the last one.
Extension (X) bit: Defined by the RTP profile used. Extension (X) bit: Defined by the RTP profile used.
Sequence Number: The RTP sequence number SHOULD be generated by Sequence Number: The RTP sequence number SHOULD be generated by the
the sender with a constant random offset. sender in the usual manner with a constant random offset.
Timestamp: Indicates the sampling instance of the first AU Timestamp: Indicates the sampling instance of the first AU
contained in the RTP payload. This sampling instance is equivalent contained in the RTP payload. This sampling instance is equivalent
to the CTS in the MPEG-4 time domain. When using SDP the clock rate to the CTS in the MPEG-4 time domain. When using SDP the clock rate
of the RTP time stamp MUST be expressed using the "rtpmap" of the RTP time stamp MUST be expressed using the "rtpmap"
attribute. If an MPEG-4 audio stream is transported, the rate SHOULD attribute. If an MPEG-4 audio stream is transported, the rate SHOULD
be set to the same value as the sampling rate of the audio stream. be set to the same value as the sampling rate of the audio stream.
If an MPEG-4 video stream is transported, it is RECOMMENDED to set If an MPEG-4 video stream is transported, it is RECOMMENDED to set
the rate to 90 kHz. the rate to 90 kHz.
In all cases, the sender SHALL make sure that RTP time stamps In all cases, the sender SHALL make sure that RTP time stamps
are identical only if the RTP time stamp refers to fragments of the are identical only if the RTP time stamp refers to fragments of the
same Access Unit. same Access Unit.
According to RFC 1889 [2] (section 5.1), RTP time stamps are According to RFC 1889 [2] (section 5.1), RTP time stamps are
recommended to start at a random value for security reasons. This RECOMMENDED to start at a random value for security reasons. This
is not an issue for synchronization of multiple RTP streams. is not an issue for synchronization of multiple RTP streams. When,
However, in applications where streams from multiple sources are to however, streams from multiple sources are to be synchronized (for
be synchronized (for example one stream from local storage, another example one stream from local storage, another from an RTP streaming
from a RTP streaming server), synchronization may become impossible. server), synchronization may become impossible if the receiver only
To also enable synchronization in such cases, it may be necessary to knows the original time stamp relationships. Synchronization in such
provide the required relationship between time stamps for obtaining cases may require providing the correct relationship between time
synchronization by out of band means. The format of such information stamps for obtaining synchronization by out of band means. The
as well as methods to convey such information are beyond the scope format of such information as well as methods to convey such
of this specification. information are beyond the scope of this specification.
SSRC: set as described in RFC1889 [2]. SSRC: set as described in RFC1889 [2].
CC and CSRC fields are used as described in RFC 1889 [2]. CC and CSRC fields are used as described in RFC 1889 [2].
RTCP SHOULD be used as defined in RFC 1889 [2]. RTCP SHOULD be used as defined in RFC 1889 [2].
3.2 RTP Payload Structure 3.2 RTP Payload Structure
3.2.1 The AU Header Section 3.2.1 The AU Header Section
skipping to change at page 10, line 21 skipping to change at page 10, line 21
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- .. -+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- .. -+-+-+-+-+-+-+-+-+-+
|AU-headers-length|AU-header|AU-header| |AU-header|padding| |AU-headers-length|AU-header|AU-header| |AU-header|padding|
| | (1) | (2) | | (n) | bits | | | (1) | (2) | | (n) | bits |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- .. -+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- .. -+-+-+-+-+-+-+-+-+-+
Figure 2: The AU Header Section Figure 2: The AU Header Section
The AU-headers are configured using MIME format parameters and MAY The AU-headers are configured using MIME format parameters and MAY
be empty. If the AU-header is configured empty, the be empty. If the AU-header is configured empty, the
AU-headers-length field SHALL not be present and consequently the AU-headers-length field SHALL NOT be present and consequently the
AU Header Section is empty. If the AU-header is not configured AU Header Section is empty. If the AU-header is not configured
empty, then the AU-headers-length is a two octet field that empty, then the AU-headers-length is a two octet field that
specifies the length in bits of the immediately following specifies the length in bits of the immediately following
AU-headers, excluding the padding bits. AU-headers, excluding the padding bits.
Each AU-header is associated with a single Access Unit (fragment) Each AU-header is associated with a single Access Unit (fragment)
contained in the Access Unit Data Section in the same RTP packet. contained in the Access Unit Data Section in the same RTP packet.
For each contained Access Unit (fragment) there is exactly one For each contained Access Unit (fragment) there is exactly one
AU-header. Within the AU Header Section, the AU-headers are AU-header. Within the AU Header Section, the AU-headers are
bit-wise concatenated in the order in which the Access Units are bit-wise concatenated in the order in which the Access Units are
contained in the Access Unit Data Section. Hence, the n-th contained in the Access Unit Data Section. Hence, the n-th
AU-header refers to the n-th AU (fragment). If the concatenated AU-header refers to the n-th AU (fragment). If the concatenated
AU-headers consume a non-integer number of octets, up to 7 AU-headers consume a non-integer number of octets, up to 7
zero-padding bits MUST be inserted at the end in order to achieve zero-padding bits MUST be inserted at the end in order to achieve
byte-alignment of the AU Header Section. octet-alignment of the AU Header Section.
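The length rule above can be sketched in code. This is a minimal illustration, not part of the specification: the function name is hypothetical, and it assumes the two-octet AU-headers-length field is read in network byte order, as is usual for RTP payload fields.

```python
def au_header_section_octets(payload: bytes) -> int:
    """Return the size in octets of the whole AU Header Section.

    Assumes the AU-header is configured non-empty, so the two-octet
    AU-headers-length field is present at the start of the payload.
    """
    if len(payload) < 2:
        raise ValueError("payload too short for AU-headers-length")
    # Length in bits of the concatenated AU-headers, padding excluded.
    headers_bits = int.from_bytes(payload[:2], "big")
    # Round up to whole octets: this accounts for the 0..7 padding bits.
    padded_octets = (headers_bits + 7) // 8
    return 2 + padded_octets
```

A receiver could use the returned value as the offset at which the next section of the payload starts.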
3.2.1.1 The AU-header 3.2.1.1 The AU-header
The AU-header contains the fields given in figure 3. The length in Each AU-header may contain the fields given in figure 3. The length
bits of the above fields with the exception of the CTS-flag, the in bits of the above fields with the exception of the CTS-flag, the
DTS-flag and the RAP-flag fields is defined by MIME format DTS-flag and the RAP-flag fields is defined by MIME format
parameters; see section 4.1. If a MIME format parameter has the parameters; see section 4.1. If a MIME format parameter has the
default value of zero, then the associated field is not present. default value of zero, then the associated field is not present.
If present, the fields MUST occur in the mutual order given in
figure 3. In the general case a receiver can only discover the size
of an AU-header by parsing it since the presence of the CTS-delta
and DTS-delta fields is signaled by the value of the CTS-flag and
DTS-flag, respectively.
+---------------------------------------+ +---------------------------------------+
| AU-size | | AU-size |
+---------------------------------------+ +---------------------------------------+
| AU-Index / AU-Index-delta | | AU-Index / AU-Index-delta |
+---------------------------------------+ +---------------------------------------+
| CTS-flag | | CTS-flag |
+---------------------------------------+ +---------------------------------------+
| CTS-delta | | CTS-delta |
+---------------------------------------+ +---------------------------------------+
| DTS-flag | | DTS-flag |
skipping to change at page 12, line 14 skipping to change at page 12, line 14
present in any subsequent (non-first) AU-header. When the present in any subsequent (non-first) AU-header. When the
AU-Index-delta is coded with the value 0, it indicates that AU-Index-delta is coded with the value 0, it indicates that
the Access Units are consecutive in decoding order. An the Access Units are consecutive in decoding order. An
AU-Index-delta value larger than 0 signals that interleaving AU-Index-delta value larger than 0 signals that interleaving
is applied. is applied.
CTS-flag: Indicates whether the CTS-delta field is present. CTS-flag: Indicates whether the CTS-delta field is present.
A value of 1 indicates that the field is present, a value A value of 1 indicates that the field is present, a value
of 0 that it is not present. of 0 that it is not present.
The CTS-flag field MUST be present in each AU-header if the The CTS-flag field MUST be present in each AU-header if the
length of the CTS-delta field is signalled to be larger than length of the CTS-delta field is signaled to be larger than
zero. In that case, the CTS-flag field MUST have the value 0 zero. In that case, the CTS-flag field MUST have the value 0
in the first AU-header and MAY have the value 1 in all in the first AU-header and MAY have the value 1 in all
non-first AU-headers. The CTS-flag field SHOULD be 0 for non-first AU-headers. The CTS-flag field SHOULD be 0 for
any non-first fragment of an Access Unit. any non-first fragment of an Access Unit.
CTS-delta: Encodes the CTS by specifying the value of CTS as a 2's CTS-delta: Encodes the CTS by specifying the value of CTS as a 2's
complement offset (delta) from the time stamp in the RTP complement offset (delta) from the time stamp in the RTP
header of this RTP packet. The CTS MUST use the same clock header of this RTP packet. The CTS MUST use the same clock
rate as the time stamp in the RTP header. rate as the time stamp in the RTP header.
DTS-flag: Indicates whether the DTS-delta field is present. A value DTS-flag: Indicates whether the DTS-delta field is present. A value
of 1 indicates that DTS-delta is present, a value of 0 that of 1 indicates that DTS-delta is present, a value of 0 that
it is not present. it is not present.
The DTS-flag field MUST be present in each AU-header if the The DTS-flag field MUST be present in each AU-header if the
length of the DTS-delta field is signalled to be larger than length of the DTS-delta field is signaled to be larger than
zero. The DTS-flag field SHOULD be 0 for any non-first zero. The DTS-flag field SHOULD be 0 for any non-first
fragment of an Access Unit. fragment of an Access Unit.
DTS-delta: Specifies the value of the DTS as a 2's complement DTS-delta: Specifies the value of the DTS as a 2's complement
offset (delta) from the CTS. The DTS MUST use the offset (delta) from the CTS. The DTS MUST use the
same clock rate as the time stamp in the RTP header. same clock rate as the time stamp in the RTP header.
RAP-flag: Indicates when set to 1 that the associated Access Unit RAP-flag: Indicates when set to 1 that the associated Access Unit
provides a random access point to the content of the stream. provides a random access point to the content of the stream.
If an Access Unit is fragmented, the RAP flag, if present, If an Access Unit is fragmented, the RAP flag, if present,
MUST be set to 0 for each non-first fragment of the AU. MUST be set to 0 for each non-first fragment of the AU.
Stream-state: Specifies the state of the stream for the AU of an Stream-state: Specifies the state of the stream for an AU of an
MPEG-4 system stream. For states of MPEG-4 system streams see MPEG-4 system stream; each state is identified by a value of
ISO/IEC 14496-1. The stream state is set either to 0 or to 1. a modulo counter. In ISO/IEC 14496-1, MPEG-4 system streams
A change of the stream state value (either from 1 to 0 or from use the AU_SequenceNumber to signal stream states. When the
0 to 1) indicates another state of the stream. At an AU that stream state changes, the value of stream-state MUST be
provides a random access point, as signalled by the RAP-flag, incremented by one.
a change in the stream state MUST occur, unless the AU is a
repeated random access point. Hence, receivers MAY ignore AUs
with the RAP-flag set to 1 if the stream state does not
change. Receivers that don't ignore a repeated random access
point SHOULD take care that such processing does not disrupt
the decoding process.
Note: no relation is required between stream-states of Note: no relation is required between stream-states of
different streams. different streams.
If present, the fields MUST occur in the mutual order given in
figure 3. In the general case a receiver can only discover the size
of an AU-header by parsing it since the presence of the CTS-delta
and DTS-delta fields is signalled by the value of the CTS-flag and
DTS-flag, respectively.
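Because the flag values decide which fields follow, a receiver has to walk each AU-header bit by bit. The sketch below illustrates this under stated assumptions: the class and function names are hypothetical, bits are read most significant bit first, and the fields after the DTS-flag are taken in the order in which they are described above (DTS-delta, RAP-flag, Stream-state). The length arguments stand for the values configured by the MIME format parameters.

```python
class BitReader:
    """Reads unsigned integers MSB-first from a byte string."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def to_signed(value: int, nbits: int) -> int:
    """Interpret an nbits-wide unsigned value as 2's complement."""
    if nbits and value >= 1 << (nbits - 1):
        value -= 1 << nbits
    return value


def parse_au_header(reader, size_len, index_len, cts_len, dts_len,
                    rap_present, state_len):
    """Parse one AU-header; a length of zero means the field is absent."""
    hdr = {}
    if size_len:
        hdr["size"] = reader.read(size_len)
    if index_len:
        # AU-Index in the first header, AU-Index-delta in later ones.
        hdr["index"] = reader.read(index_len)
    if cts_len and reader.read(1):      # CTS-flag, then optional CTS-delta
        hdr["cts_delta"] = to_signed(reader.read(cts_len), cts_len)
    if dts_len and reader.read(1):      # DTS-flag, then optional DTS-delta
        hdr["dts_delta"] = to_signed(reader.read(dts_len), dts_len)
    if rap_present:
        hdr["rap"] = reader.read(1)
    if state_len:
        hdr["stream_state"] = reader.read(state_len)
    return hdr
```

After parsing, `reader.pos` gives the number of bits consumed, which is the only way to learn the size of that AU-header in the general case.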
3.2.2 The Auxiliary Section 3.2.2 The Auxiliary Section
The Auxiliary Section consists of the auxiliary-data-size field The Auxiliary Section consists of the auxiliary-data-size field
followed by the auxiliary-data field. Receivers MAY (but are not followed by the auxiliary-data field. Receivers MAY (but are not
required to) parse the auxiliary-data field; to facilitate skipping required to) parse the auxiliary-data field; to facilitate skipping
of the auxiliary-data field by receivers, the auxiliary-data-size of the auxiliary-data field by receivers, the auxiliary-data-size
field indicates the length in bits of the auxiliary-data. If the field indicates the length in bits of the auxiliary-data. If the
concatenation of the auxiliary-data-size and the auxiliary-data concatenation of the auxiliary-data-size and the auxiliary-data
fields consume a non-integer number of octets, up to 7 zero padding fields consume a non-integer number of octets, up to 7 zero padding
bits MUST be inserted immediately after the auxiliary data in order bits MUST be inserted immediately after the auxiliary data in order
to achieve byte-alignment. See figure 4. to achieve octet-alignment. See figure 4.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- .. -+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- .. -+-+-+-+-+-+-+-+-+
| auxiliary-data-size | auxiliary-data |padding bits | | auxiliary-data-size | auxiliary-data |padding bits |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- .. -+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- .. -+-+-+-+-+-+-+-+-+
Figure 4: The fields in the Auxiliary Section Figure 4: The fields in the Auxiliary Section
The length in bits of the auxiliary-data-size field is configurable The length in bits of the auxiliary-data-size field is configurable
by a MIME format parameter; see section 4.1. The default length of by a MIME format parameter; see section 4.1. The default length of
zero indicates that the entire Auxiliary Section is absent. zero indicates that the entire Auxiliary Section is absent.
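A receiver that does not parse the auxiliary data only needs the number of octets to skip. The following sketch shows one way to compute it; the function and argument names are hypothetical, and it assumes the Auxiliary Section starts on an octet boundary and that the auxiliary-data-size field is read most significant bit first.

```python
def auxiliary_section_octets(payload: bytes, offset: int,
                             aux_size_length: int) -> int:
    """Return the octets occupied by the Auxiliary Section at `offset`.

    `aux_size_length` is the configured length in bits of the
    auxiliary-data-size field; zero means the section is absent.
    """
    if aux_size_length == 0:
        return 0
    needed = (aux_size_length + 7) // 8
    chunk = payload[offset:offset + needed]
    if len(chunk) < needed:
        raise ValueError("payload too short for auxiliary-data-size")
    # Keep only the leading aux_size_length bits of the chunk.
    raw = int.from_bytes(chunk, "big")
    aux_bits = raw >> (needed * 8 - aux_size_length)
    # Size field plus data, rounded up for the 0..7 zero padding bits.
    total_bits = aux_size_length + aux_bits
    return (total_bits + 7) // 8
```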
skipping to change at page 14, line 7 skipping to change at page 13, line 43
of octets. See figure 5. The AUs inside the Access Unit Data of octets. See figure 5. The AUs inside the Access Unit Data
Section MUST be in decoding order. Section MUST be in decoding order.
The size and number of Access Units SHOULD be adjusted such that The size and number of Access Units SHOULD be adjusted such that
the resulting RTP packet is not larger than the path MTU. To handle the resulting RTP packet is not larger than the path MTU. To handle
larger packets, this payload format relies on lower layers for larger packets, this payload format relies on lower layers for
fragmentation, which may not be desirable. fragmentation, which may not be desirable.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|AU(1) | |AU(1) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- | + |
| | | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |AU(2) | | |AU(2) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ |
| | | |
| -+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | AU(n) | | | AU(n) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | |AU(n) continued|
|-+-+-+-+-+-+-+-+ |-+-+-+-+-+-+-+-+
Figure 5: Access Unit Data Section; each AU is byte aligned. Figure 5: Access Unit Data Section; each AU is octet aligned.
When multiple Access Units are carried, the size of each AU MUST be When multiple Access Units are carried, the size of each AU MUST be
made available to the receiver. If the AU size is variable then the made available to the receiver. If the AU size is variable then the
size of each AU MUST be indicated in the AU-size field of the size of each AU MUST be indicated in the AU-size field of the
corresponding AU-header. However, if the AU size is constant for a corresponding AU-header. However, if the AU size is constant for a
stream, this mechanism SHOULD NOT be used, but instead the fixed stream, this mechanism SHOULD NOT be used, but instead the fixed
size SHOULD be signalled by the MIME format parameter size SHOULD be signaled by the MIME format parameter
"ConstantSize", see section 4.1. "ConstantSize", see section 4.1.
The absence of both AU-size in the AU-header and the ConstantSize The absence of both AU-size in the AU-header and the ConstantSize
MIME format parameter indicates carriage of a single AU (fragment), MIME format parameter indicates carriage of a single AU (fragment),
i.e. that a single Access Unit (fragment) is transported in each i.e. that a single Access Unit (fragment) is transported in each
RTP packet for that stream. RTP packet for that stream.
3.2.3.1 Fragmentation 3.2.3.1 Fragmentation
A packet SHALL carry either one or more Access Units, or a single A packet SHALL carry either one or more Access Units, or a single
skipping to change at page 15, line 24 skipping to change at page 15, line 4
packet with RTP time stamp T is calculated as follows: packet with RTP time stamp T is calculated as follows:
Timestamp[0] = T Timestamp[0] = T
Timestamp[i, i > 0] = T +(Sum(for k=1 to i of (AU-Index-delta[k] Timestamp[i, i > 0] = T +(Sum(for k=1 to i of (AU-Index-delta[k]
+ 1))) * access-unit-duration + 1))) * access-unit-duration
When AU-Index-delta is always 0, this reduces to T + i * (access- When AU-Index-delta is always 0, this reduces to T + i * (access-
unit-duration). This is the non-interleaved case, where the frames unit-duration). This is the non-interleaved case, where the frames
are consecutive in decoding order. Note that the AU-Index field are consecutive in decoding order. Note that the AU-Index field
(present for the first Access Unit) is not needed in this (present for the first Access Unit) is not needed in this
calculation. Hence in cases where the Access-unit-duration has a calculation. Hence in cases where the access-unit-duration has a
fixed and known value, the AU-Index does not need to provide index fixed and known value, the AU-Index does not need to provide index
information and can be coded with the value 0. See also the information and can be coded with the value 0. See also the
semantics of the AU-Index field in 3.2.1.1. semantics of the AU-Index field in 3.2.1.1.
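The time stamp formula above can be written out as follows. This is an illustrative sketch with hypothetical names; it assumes a constant access-unit-duration expressed in RTP clock ticks and wraps results to 32 bits, the width of the RTP time stamp field.

```python
def au_timestamps(rtp_timestamp: int, index_deltas: list,
                  access_unit_duration: int) -> list:
    """Compute the time stamp of every AU in one RTP packet.

    index_deltas[k - 1] is the AU-Index-delta of the k-th non-first AU,
    so a packet with n AUs supplies n - 1 deltas.
    """
    stamps = [rtp_timestamp]            # Timestamp[0] = T
    offset = 0
    for delta in index_deltas:
        offset += delta + 1             # running sum of (AU-Index-delta + 1)
        stamp = rtp_timestamp + offset * access_unit_duration
        stamps.append(stamp % 2**32)    # RTP time stamps are 32 bits wide
    return stamps
```

For example, with T = 1000, a duration of 1024 ticks and all deltas equal to 0, the AUs are spaced exactly one duration apart; with all deltas equal to 2, each AU is three durations after the previous one in the packet.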
If the Access Units are not fixed duration, the AU-Index is not
redundant, and MUST provide the index information required for
re-ordering. The number of bits of the AU-Index field MUST be chosen
so that valid index information is provided at the applied
interleaving scheme, without causing problems due to roll-over of
the AU-Index field. Note that the CTS-delta may be required to
compute the correct time stamp for each AU.
When an RTP packet arrives (after any reordering has been done), When an RTP packet arrives (after any reordering has been done),
receivers may 'flush' all Access Units from the interleave buffer receivers may 'flush' all Access Units from the interleave buffer
which have a time stamp strictly less than the time stamp of the if the time stamp of each Access Unit in the interleave buffer is
arriving packet. Similarly the first Access Unit of every arriving strictly less than the time stamp of the arriving packet. Access
packet can always be flushed (as no following packet can provide Units should also be flushed in time to be played; this can be
an earlier Access Unit), and any Access Units which are consecutive important if there is loss before end-of-stream, before a silence
with it which have already been received. Access Units should also interval, or before a large drop-out.
be flushed in time to be played; this can be important if there is
loss before end-of-stream, before a silence interval, or before a
large drop-out.
3.2.3.3 Constraints for interleaving 3.2.3.3 Constraints for interleaving
The size of the packets should be suitably chosen to be appropriate The size of the packets should be suitably chosen to be appropriate
to both the path MTU and the duration and capacity of the receiver's to both the path MTU and the duration and capacity of the receiver's
de-interleave buffer. The maximum packet size for a session should de-interleave buffer. The maximum packet size for a session SHOULD
be chosen not to exceed the path MTU. be chosen not to exceed the path MTU.
In order to control receiver latency and mitigate the effects of In order to control receiver latency and mitigate the effects of
loss, there are profile-based limits on the size of the packet. loss, there are profile-based limits on the size of the packet.
This is expressed as a duration: it is calculated from the duration This is expressed as a duration: it is calculated from the duration
of the Access Units contained within a packet. Note that this of the Access Units contained within a packet. Note that this
duration is NOT the difference between the time stamps of the first duration is NOT the difference between the time stamps of the first
and last Access Unit in a packet. and last Access Unit in a packet.
No matter what interleaving scheme is used, the scheme must be No matter what interleaving scheme is used, the scheme must be
analyzed to calculate the minimum number of frames a receiver has analyzed to calculate the minimum number of frames a receiver has
to buffer in order to de-interleave. to buffer in order to de-interleave.
Three profiles are defined to constrain the latency when Three profiles are defined to constrain the latency when
interleaving. The applied profile is signalled by the MIME format interleaving. The applied profile is signaled by the MIME format
parameter "Profile", indicating the decimal number of the profile. parameter "Profile", indicating the decimal number of the profile.
The maximum de-interleave buffer required at the receiver can be The maximum de-interleave buffer required at the receiver can be
determined if the maximum packet duration is known. The maximum determined if the maximum packet duration is known. The maximum
packet duration in milliseconds for the three profiles, shall not packet duration in milliseconds for the three profiles, SHALL NOT
exceed: exceed:
Profile 0 -- 200 milliseconds Profile 0 -- 200 milliseconds
Profile 1 -- 500 milliseconds Profile 1 -- 500 milliseconds
Profile 2 -- 1500 milliseconds Profile 2 -- 1500 milliseconds
When interleaving is applied, the applied profile MUST be signaled
When interleaving is applied, the applied profile MUST be signalled
by the MIME format parameter "Profile"; see section 4.1. by the MIME format parameter "Profile"; see section 4.1.
Note that for low bit-rate material, this duration limit may make Note that for low bit-rate material, this duration limit may make
packets shorter than the MTU size. packets shorter than the MTU size.
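A sender can check a candidate packet against the profile limits before transmitting it. The sketch below is illustrative only; the names are hypothetical, and it assumes that the packet duration is the sum of the durations of the contained Access Units, each given in RTP clock ticks.

```python
# Maximum packet duration in milliseconds for each profile.
MAX_PACKET_DURATION_MS = {0: 200, 1: 500, 2: 1500}


def packet_within_profile(profile: int, au_durations: list,
                          clock_rate: int) -> bool:
    """Return True if the packet duration respects the profile limit.

    The duration is the sum of the AU durations, not the difference
    between the first and last AU time stamps.
    """
    total_ticks = sum(au_durations)
    limit_ms = MAX_PACKET_DURATION_MS[profile]
    # Compare in integer arithmetic: ticks / rate <= limit_ms / 1000.
    return total_ticks * 1000 <= limit_ms * clock_rate
```

With a 44100 Hz clock and Access Units of 1024 ticks, eight AUs (about 186 ms) fit in Profile 0, while nine AUs (about 209 ms) require Profile 1 or 2.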
3.2.3.4. Crucial and non-crucial AUs with MPEG-4 System data
Some Access Units with MPEG-4 system data, called "crucial" AUs,
carry information whose loss cannot be tolerated, either in the
presentation or in the decoder. At each crucial AU in an MPEG-4
system stream, the stream state changes. The stream-state MAY
remain constant at non-crucial AUs. In ISO/IEC 14496-1, MPEG-4
system streams use the AU_SequenceNumber to signal stream states.
Example: Given three AUs, AU1 = "Insertion of node X", AU2 = "Set
position of node X", AU3 = "Set position of node X". AU1 is crucial,
since if it is lost, AU2 cannot be executed. However, AU2 is not
crucial, since AU3 can be executed even if AU2 is lost.
When a crucial AU is (possibly) lost, the stream is corrupted. For
example, when an AU is lost and the stream state has changed at the
next received AU, then it is possible that the lost AU was crucial.
Once corrupted, the stream remains corrupted until the next random
access point. Note that loss of non-crucial AUs does not corrupt the
stream. When a decoder starts receiving a stream, the decoder MUST
consider the stream corrupted until an AU is received that provides
a random access point.
An AU that provides a random access point, as signaled by the
RAP-flag, may be crucial or not. Non-crucial RAP AUs provide a
"repeated" random access point for use by decoders that recently
joined the stream or that need to re-start decoding after a stream
corruption. Non-crucial RAP AUs MUST include all updates since the
last crucial RAP AU.
Upon receiving AUs, decoders are to react as follows:
a) if the RAP-flag is set to 1 and the stream-state changes, then
the AU is a crucial RAP AU, and the AU MUST be decoded.
b) if the RAP-flag is set to 1 and the stream state does not change,
then the AU is a non-crucial RAP AU, and the receiver SHOULD
decode it if the stream is corrupted. Otherwise, the decoder MUST
ignore the AU.
c) if the RAP-flag is set to 0, then the AU MUST be decoded, unless
the stream is corrupted, in which case the AU MUST be ignored.
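Rules a) to c) can be sketched as a small state machine. This is one possible reading, not a normative algorithm: the class and method names are hypothetical, loss detection itself (for example from RTP sequence numbers) is assumed to happen elsewhere, and a stream is marked corrupted when a loss is followed by a changed stream-state, since the lost AU may then have been crucial.

```python
class StreamStateTracker:
    """Decides per AU whether a decoder should decode or ignore it."""

    def __init__(self):
        self.corrupted = True      # corrupted until a random access point
        self.last_state = None     # stream-state of the previous AU
        self.loss_pending = False  # an AU was lost since the previous AU

    def on_loss(self):
        """Call when one or more AUs are known to be lost."""
        self.loss_pending = True

    def on_au(self, rap_flag: int, stream_state: int) -> str:
        changed = stream_state != self.last_state
        self.last_state = stream_state
        if self.loss_pending and changed:
            self.corrupted = True  # the lost AU was possibly crucial
        self.loss_pending = False
        if rap_flag:
            if changed or self.corrupted:
                # Rule a): crucial RAP AU, or rule b): repeated RAP AU
                # used to recover from a corrupted stream.
                self.corrupted = False
                return "decode"
            return "ignore"        # rule b): repeated RAP, stream intact
        # Rule c): non-RAP AUs are decoded only on an intact stream.
        return "ignore" if self.corrupted else "decode"
```

A receiver would create one tracker per stream, since no relation is required between the stream-states of different streams.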
3.3 Usage of this specification 3.3 Usage of this specification
3.3.1 General 3.3.1 General
Usage of this specification requires definition of a mode. A mode Usage of this specification requires definition of a mode. A mode
defines how to use this specification, as deemed appropriate. defines how to use this specification, as deemed appropriate.
Senders MUST signal the applied mode via the MIME format parameter Senders MUST signal the applied mode via the MIME format parameter
"Mode". This specification defines a generic mode that can be used "Mode". This specification defines a generic mode that can be used
for any MPEG-4 stream, as well as specific modes for transport of for any MPEG-4 stream, as well as specific modes for transport of
MPEG-4 CELP and MPEG-4 AAC streams, defined in ISO/IEC 14496-3. MPEG-4 CELP and MPEG-4 AAC streams, defined in ISO/IEC 14496-3.
skipping to change at page 16, line 49 skipping to change at page 17, line 31
For audio streams, <encoding parameters> specifies the number of For audio streams, <encoding parameters> specifies the number of
audio channels: 2 for stereo material (see RFC 2327) and 1 for audio channels: 2 for stereo material (see RFC 2327) and 1 for
mono. Provided no additional parameters are needed, this parameter mono. Provided no additional parameters are needed, this parameter
may be omitted for mono material, hence its default value is 1. may be omitted for mono material, hence its default value is 1.
3.3.2 The generic mode 3.3.2 The generic mode
The generic mode can be used for any MPEG-4 stream. In this mode The generic mode can be used for any MPEG-4 stream. In this mode
no mode-specific constraints are applied; hence, in the generic no mode-specific constraints are applied; hence, in the generic
mode the full flexibility of this specification can be exploited. mode the full flexibility of this specification can be exploited.
The generic mode is signalled by mode=generic. The generic mode is signaled by mode=generic.
An example is given below for transport of a BIFS stream. In this An example is given below for transport of a BIFS stream. In this
example carriage of multiple BIFS Access Units is allowed in one example carriage of multiple BIFS Access Units is allowed in one
RTP packet. The AU-header contains the AU-size field, the CTS-flag RTP packet. The AU-header contains the AU-size field, the CTS-flag
and, if the CTS flag is set to 1, the CTS-delta field. The number and, if the CTS flag is set to 1, the CTS-delta field. The number
of bits of the AU-size and the CTS-delta fields is 14 and 15, of bits of the AU-size and the CTS-delta fields is 10 and 16,
respectively. The AU-header also contains the RAP-flag and the respectively. The AU-header also contains the RAP-flag and the
Stream-state, both of 1 bits. This results in an AU-header with a Stream-state of 4 bits. This results in an AU-header with a
Total size of two or four octets per BIFS AU. The RTP time stamp total size of two or four octets per BIFS AU. The RTP time stamp
uses a 1 kHz clock. Note that the media type name is video, uses a 1 kHz clock. Note that the media type name is video,
because the BIFS stream is part of an audiovisual presentation. For because the BIFS stream is part of an audiovisual presentation. For
conventions on media type names see section 4.1. conventions on media type names see section 4.1.
In detail: In detail:
m=video 49230 RTP/AVP 96 m=video 49230 RTP/AVP 96
a=rtpmap:96 mpeg4-generic/1000 a=rtpmap:96 mpeg4-generic/1000
a=fmtp:96 streamtype=3; profile-level-id=257; mode=generic; a=fmtp:96 streamtype=3; profile-level-id=257; mode=generic;
ObjectType=2; config=BIFSConfiguration(); SizeLength=15; ObjectType=2; config=BIFSConfiguration(); SizeLength=10;
CTSDeltaLength=16; RandomAccessIndication=1; CTSDeltaLength=16; RandomAccessIndication=1;
StreamStateIndication=1 StreamStateIndication=4
Note that BIFSConfiguration() is defined in ISO/IEC 14496-1; for Note: The a=fmtp line has been wrapped to fit the page, it comprises
the description of MIME parameters see section 4.1. a single line in the SDP file.
BIFSConfiguration() is the hexadecimal string as defined in ISO/IEC
14496-1; for the description of MIME parameters see section 4.1.
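The AU-header size stated for this example can be checked from the fmtp parameters. The sketch below uses the values of the -04 revision (SizeLength=10, StreamStateIndication=4). The function names are hypothetical, and lower-casing the parameter names is a convenience of this sketch, not a rule taken from the specification.

```python
def parse_fmtp(params: str) -> dict:
    """Split 'name=value; name=value' pairs into a dictionary."""
    result = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        result[name.strip().lower()] = value.strip()
    return result


def au_header_bits(params: dict, cts_flag: int) -> int:
    """Size in bits of one AU-header for the given CTS-flag value."""
    bits = int(params.get("sizelength", 0))
    cts_len = int(params.get("ctsdeltalength", 0))
    if cts_len:
        bits += 1                  # CTS-flag
        if cts_flag:
            bits += cts_len        # CTS-delta
    if int(params.get("randomaccessindication", 0)):
        bits += 1                  # RAP-flag
    bits += int(params.get("streamstateindication", 0))
    return bits


FMTP = ("streamtype=3; profile-level-id=257; mode=generic; "
        "ObjectType=2; config=BIFSConfiguration(); SizeLength=10; "
        "CTSDeltaLength=16; RandomAccessIndication=1; "
        "StreamStateIndication=4")
```

With these values the header takes 10 + 1 + 1 + 4 = 16 bits when the CTS-flag is 0 and 16 more bits when it is 1, which matches the two or four octets given in the text.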
3.3.3 Constant bit-rate CELP 3.3.3 Constant bit-rate CELP
This mode is signalled by mode=CELP-cbr. In this mode one or more This mode is signaled by mode=CELP-cbr. In this mode one or more
fixed size CELP frames can be transported in one RTP packet; there fixed size CELP frames can be transported in one RTP packet; there
is no support for interleaving. The RTP payload consists of one or is no support for interleaving. The RTP payload consists of one or
more concatenated CELP frames, each of the same size. Both the AU more concatenated CELP frames, each of the same size. Both the AU
Header Section and the Auxiliary Section are empty. Header Section and the Auxiliary Section MUST be empty.
The MIME format parameter ConstantSize MUST be provided to specify The MIME format parameter ConstantSize MUST be provided to specify
the length of each CELP frame. the length of each CELP frame.
For example: For example:
m=audio 49230 RTP/AVP 96 m=audio 49230 RTP/AVP 96
a=rtpmap:96 mpeg4-generic/44100/2 a=rtpmap:96 mpeg4-generic/44100/2
a=fmtp:96 streamtype=5; profile-level-id=15; mode=CELP-cbr; config= a=fmtp:96 streamtype=5; profile-level-id=15; mode=CELP-cbr; config=
AudioSpecificConfig(); ConstantSize=xxx; AudioSpecificConfig(); ConstantSize=xxx;
The AudioSpecificConfig(), defined in ISO/IEC 14496-3, specifies Note: The a=fmtp line has been wrapped to fit the page, it comprises
that the audio stream type is CELP. For the description of MIME a single line in the SDP file.
parameters see section 4.1.
AudioSpecificConfig() is the hexadecimal string as defined in
ISO/IEC 14496-3. AudioSpecificConfig() specifies that the audio
stream type is CELP. For the description of MIME parameters see
section 4.1.
3.3.4 Variable bit-rate CELP 3.3.4 Variable bit-rate CELP
This mode is signalled by mode=CELP-vbr. With this mode one or This mode is signaled by mode=CELP-vbr. With this mode one or
more variable size CELP frames can be transported in one RTP packet more variable size CELP frames can be transported in one RTP packet
with optional interleaving. As the largest possible frame size in with optional interleaving. As the largest possible frame size in
this mode is greater than the maximum CELP frame size, there is no this mode is greater than the maximum CELP frame size, there is no
support for fragmentation of CELP frames. support for fragmentation of CELP frames.
In this mode the RTP payload consists of the AU Header Section, In this mode the RTP payload consists of the AU Header Section,
followed by one or more concatenated CELP frames. The Auxiliary followed by one or more concatenated CELP frames. The Auxiliary
Section is empty. For each CELP frame contained in the payload Section MUST be empty. For each CELP frame contained in the payload
there is a one octet AU-header in the AU Header Section to there MUST be a one octet AU-header in the AU Header Section to
provide: provide:
(a) the size of each CELP frame in the payload and (a) the size of each CELP frame in the payload and
(b) index information for computing the sequence (and hence timing) (b) index information for computing the sequence (and hence timing)
of each CELP frame. of each CELP frame.
Transport of CELP frames requires that the AU-size field is coded Transport of CELP frames requires that the AU-size field is coded
with 6 bits. In this mode therefore 6 bits are allocated to the with 6 bits. In this mode therefore 6 bits are allocated to the
AU-size field, and 2 bits to the AU-Index(-delta) field. Each AU-size field, and 2 bits to the AU-Index(-delta) field. Each
AU-Index field MUST be coded with the value 0. In the AU Header AU-Index field MUST be coded with the value 0. In the AU Header
Section, the concatenated AU-headers are preceded by the 16-bit Section, the concatenated AU-headers are preceded by the 16-bit
AU-headers-length field, as specified in 3.2.1. AU-headers-length field, as specified in section 3.2.1.
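The one-octet AU-header layout described above can be sketched as follows. This is a minimal illustration, not a normative parser: the function name is hypothetical, and it assumes that the AU-headers-length field counts the AU-headers in bits and that the AU-size bits precede the AU-Index(-delta) bits in each AU-header (section 3.2.1 is not reproduced in this excerpt).

```python
def parse_au_header_section(payload: bytes, size_length: int = 6,
                            index_length: int = 2):
    """Split an RTP payload into AU-headers and the offset of the first frame.

    Assumptions (hypothetical sketch): AU-headers-length is a 16-bit count
    of header bits, and each AU-header is AU-size followed by
    AU-Index(-delta).
    """
    if len(payload) < 2:
        raise ValueError("payload too short for AU-headers-length")
    headers_bits = int.from_bytes(payload[:2], "big")
    header_bits = size_length + index_length
    if headers_bits % header_bits:
        raise ValueError("AU-headers-length is not a whole number of headers")
    headers_octets = (headers_bits + 7) // 8  # round up to whole octets
    if len(payload) < 2 + headers_octets:
        raise ValueError("payload shorter than its AU Header Section")
    bits = int.from_bytes(payload[2:2 + headers_octets], "big")
    total_bits = headers_octets * 8
    headers = []
    for i in range(headers_bits // header_bits):
        shift = total_bits - (i + 1) * header_bits
        field = (bits >> shift) & ((1 << header_bits) - 1)
        au_size = field >> index_length                 # upper bits
        au_index = field & ((1 << index_length) - 1)    # lower bits
        headers.append((au_size, au_index))
    return headers, 2 + headers_octets


# Two CELP frames of 20 and 30 octets, both index fields coded as 0.
example = bytes([0x00, 0x10, 0x50, 0x78]) + bytes(50)
parsed_headers, first_frame_offset = parse_au_header_section(example)
```

With the defaults of 6 and 2 bits each AU-header occupies exactly one octet, so the frames start at octet 2 + N for N frames.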
In addition to the required MIME format parameters, the following In addition to the required MIME format parameters, the following
parameters MUST be present: SizeLength, IndexLength, and parameters MUST be present: SizeLength, IndexLength, and
IndexDeltaLength. IndexDeltaLength.
When interleaving is applied (AU-Index-delta coded with a value When interleaving is applied (AU-Index-delta coded with a value
larger than 0), the parameter Profile MUST also be present. larger than 0), the parameter Profile MUST also be present.
For example: For example:
m=audio 49230 RTP/AVP 96 m=audio 49230 RTP/AVP 96
a=rtpmap:96 mpeg4-generic/44100/2 a=rtpmap:96 mpeg4-generic/44100/2
a=fmtp:96 streamtype=5; profile-level-id=15; mode=CELP-vbr; config= a=fmtp:96 streamtype=5; profile-level-id=15; mode=CELP-vbr; config=
AudioSpecificConfig(); SizeLength=6; IndexLength=2; AudioSpecificConfig(); SizeLength=6; IndexLength=2;
IndexDeltaLength=2; Profile=1 IndexDeltaLength=2; Profile=1
The AudioSpecificConfig(), defined in ISO/IEC 14496-3, specifies Note: The a=fmtp line has been wrapped to fit the page; it comprises
that the audio stream type is CELP. For the description of MIME a single line in the SDP file.
parameters see section 4.1.
AudioSpecificConfig() is the hexadecimal string as defined in
ISO/IEC 14496-3. AudioSpecificConfig() specifies that the audio
stream type is CELP. For the description of MIME parameters see
section 4.1.
3.3.5 Low bit-rate AAC 3.3.5 Low bit-rate AAC
This mode is signalled by mode=AAC-lbr. This mode supports transport This mode is signaled by mode=AAC-lbr. This mode supports transport
of one or more variable size AAC frames with optional support for of one or more variable size AAC frames with optional support for
interleaving and fragmenting. The maximum size of an AAC frame interleaving and fragmenting. The maximum size of an AAC frame
(fragment) in this mode is 63 octets. (fragment) in this mode is 63 octets.
The payload configuration in this mode is the same as in the The payload configuration in this mode is the same as in the
variable bit-rate CELP mode as defined in 3.3.4. The RTP payload variable bit-rate CELP mode as defined in 3.3.4. The RTP payload
consists of the AU Header Section, followed by concatenated AAC consists of the AU Header Section, followed by concatenated AAC
frames. The Auxiliary Section is empty. For each AAC frame contained frames. The Auxiliary Section MUST be empty. For each AAC frame
in the payload the one octet AU-header provides: contained in the payload the one octet AU-header MUST provide:
(a) the size of each AAC frame in the payload and (a) the size of each AAC frame in the payload and
(b) index information for computing the sequence (and hence timing) (b) index information for computing the sequence (and hence timing)
of each AAC frame. of each AAC frame.
In the AU-header, the AU-size is coded with 6 bits and the In the AU-header, the AU-size MUST be coded with 6 bits and the
AU-Index(-delta) with 2 bits; the AU-Index field MUST have the AU-Index(-delta) with 2 bits; the AU-Index field MUST have the
value 0 in each AU-header. value 0 in each AU-header.
In the AU-header Section, the concatenated AU-headers are preceded In the AU-header Section, the concatenated AU-headers MUST be
by the 16-bit AU-headers-length field, as specified in 3.2.1. preceded by the 16-bit AU-headers-length field, as specified in
section 3.2.1.
In addition to the required MIME format parameters, the following In addition to the required MIME format parameters, the following
parameters MUST be present: SizeLength, IndexLength, and parameters MUST be present: SizeLength, IndexLength, and
IndexDeltaLength. IndexDeltaLength.
When interleaving is applied (AU-Index-delta coded with a value When interleaving is applied (AU-Index-delta coded with a value
larger than 0), the parameter Profile MUST also be present. larger than 0), the parameter Profile MUST also be present.
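The parameter requirements stated above lend themselves to a simple presence check. The sketch below is illustrative only: the function and constant names are hypothetical, and the list of generally required parameters (StreamType, Profile-level-id, Config, Mode) is taken from section 4.1. Names are compared without regard to case, since MIME format parameters are not case dependent.

```python
# Parameters required for every stream (section 4.1), lower-cased.
REQUIRED = ("streamtype", "profile-level-id", "config", "mode")
# Parameters this mode additionally requires.
MODE_REQUIRED = ("sizelength", "indexlength", "indexdeltalength")


def missing_parameters(params, interleaved):
    """Return the names of required parameters absent from `params`.

    `params` maps parameter names to values; `interleaved` is True when
    AU-Index-delta is coded with a value larger than 0.
    """
    present = {name.lower() for name in params}
    needed = list(REQUIRED) + list(MODE_REQUIRED)
    if interleaved:
        needed.append("profile")  # Profile is required with interleaving
    return [name for name in needed if name not in present]


example_params = {
    "streamtype": "5", "profile-level-id": "15", "mode": "AAC-lbr",
    "config": "", "SizeLength": "6", "IndexLength": "2",
    "IndexDeltaLength": "2",
}
```

A receiver could reject or ignore a session description for which this list is not empty; which of the two is appropriate is outside the scope of this sketch.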
For example: For example:
m=audio 49230 RTP/AVP 96 m=audio 49230 RTP/AVP 96
a=rtpmap:96 mpeg4-generic/44100/2 a=rtpmap:96 mpeg4-generic/44100/2
a=fmtp:96 streamtype=5; profile-level-id=15; mode=AAC-lbr; config= a=fmtp:96 streamtype=5; profile-level-id=15; mode=AAC-lbr; config=
AudioSpecificConfig(); SizeLength=6; IndexLength=2; AudioSpecificConfig(); SizeLength=6; IndexLength=2;
IndexDeltaLength=2; Profile=1 IndexDeltaLength=2; Profile=1
The AudioSpecificConfig(), defined in ISO/IEC 14496-3, specifies Note: The a=fmtp line has been wrapped to fit the page; it comprises
that the audio stream type is AAC. For the description of MIME a single line in the SDP file.
parameters see section 4.1.
AudioSpecificConfig() is the hexadecimal string as defined in ISO/IEC
14496-3. AudioSpecificConfig() specifies that the audio
stream type is AAC. For the description of MIME parameters see
section 4.1.
3.3.6 High bit-rate AAC 3.3.6 High bit-rate AAC
This mode is signalled by mode=AAC-hbr. This mode supports transport This mode is signaled by mode=AAC-hbr. This mode supports transport
of one or more large variable size AAC frames in one RTP packet with of one or more large variable size AAC frames in one RTP packet with
optional support for interleaving and fragmenting. The maximum size optional support for interleaving and fragmenting. The maximum size
of an AAC frame (fragment) in this mode is 8191 octets. of an AAC frame (fragment) in this mode is 8191 octets.
In this mode the RTP payload consists of the AU Header Section, In this mode the RTP payload consists of the AU Header Section,
followed by one or more concatenated AAC frames. The Auxiliary followed by one or more concatenated AAC frames. The Auxiliary
Section is empty. For each AAC frame contained in the payload there Section MUST be empty. For each AAC frame contained in the payload
is an AU-header in the AU Header Section to provide: there MUST be an AU-header in the AU Header Section to provide:
(a) the size of each AAC frame in the payload and (a) the size of each AAC frame in the payload and
(b) index information for computing the sequence (and hence timing) (b) index information for computing the sequence (and hence timing)
of each AAC frame. of each AAC frame.
To code the maximum size of an AAC frame requires 13 bits. Therefore To code the maximum size of an AAC frame requires 13 bits. Therefore
in this configuration 13 bits are allocated to the AU-size, and in this configuration 13 bits are allocated to the AU-size, and
3 bits to the AU-Index(-delta) field. Thus each AU-header has a size 3 bits to the AU-Index(-delta) field. Thus each AU-header has a size
of 2 octets. Each AU-Index field MUST be coded with the value 0. In of 2 octets. Each AU-Index field MUST be coded with the value 0. In
the AU Header Section, the concatenated AU-headers are preceded by the AU Header Section, the concatenated AU-headers MUST be preceded
the 16-bit AU-headers-length field, as specified in 3.2.1. by the 16-bit AU-headers-length field, as specified in section 3.2.1.
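For the sender side, the two-octet AU-headers of this mode can be assembled as in the sketch below. The function name is hypothetical, and the sketch assumes that AU-headers-length is expressed in bits and that the 13 AU-size bits precede the 3 AU-Index(-delta) bits; neither detail is spelled out in this excerpt.

```python
import struct


def build_aac_hbr_header_section(frame_sizes, index_deltas=None):
    """Build the AU Header Section for AAC-hbr (13-bit size, 3-bit index).

    The first entry of `index_deltas` is the AU-Index and must be 0; the
    remaining entries are AU-Index-delta values.
    """
    if index_deltas is None:
        index_deltas = [0] * len(frame_sizes)
    if len(index_deltas) != len(frame_sizes):
        raise ValueError("one index value is needed per frame")
    # 16-bit AU-headers-length, assumed to be in bits: 16 bits per header.
    section = bytearray(struct.pack("!H", 16 * len(frame_sizes)))
    for position, (size, delta) in enumerate(zip(frame_sizes, index_deltas)):
        if not 0 <= size <= 8191:
            raise ValueError("AAC frame (fragment) exceeds 8191 octets")
        if not 0 <= delta <= 7:
            raise ValueError("index value does not fit in 3 bits")
        if position == 0 and delta != 0:
            raise ValueError("AU-Index MUST be coded with the value 0")
        section += struct.pack("!H", (size << 3) | delta)
    return bytes(section)


header_section = build_aac_hbr_header_section([300, 8191])
```

The AAC frames themselves would be appended directly after the returned octets, since the Auxiliary Section is empty in this mode.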
In addition to the required MIME format parameters, the following In addition to the required MIME format parameters, the following
parameters MUST be present: SizeLength, IndexLength, and parameters MUST be present: SizeLength, IndexLength, and
IndexDeltaLength. IndexDeltaLength.
When interleaving is applied (AU-Index-delta coded with a value When interleaving is applied (AU-Index-delta coded with a value
larger than 0), the parameter Profile MUST also be present. larger than 0), the parameter Profile MUST also be present.
For example: For example:
m=audio 49230 RTP/AVP 96 m=audio 49230 RTP/AVP 96
a=rtpmap:96 mpeg4-generic/44100/2 a=rtpmap:96 mpeg4-generic/44100/2
a=fmtp:96 streamtype=5; profile-level-id=15; mode=AAC-hbr; a=fmtp:96 streamtype=5; profile-level-id=15; mode=AAC-hbr;
config=AudioSpecificConfig(); SizeLength=13; IndexLength=3; config=AudioSpecificConfig(); SizeLength=13; IndexLength=3;
IndexDeltaLength=3; Profile=1 IndexDeltaLength=3; Profile=1
Note: The a=fmtp line has been wrapped to fit the page; it comprises
a single line in the SDP file.
The AudioSpecificConfig(), defined in ISO/IEC 14496-3, specifies AudioSpecificConfig() is the hexadecimal string as defined in
that the audio stream type is AAC. For the description of MIME ISO/IEC 14496-3. AudioSpecificConfig() specifies that the audio
parameters see section 4.1. stream type is AAC. For the description of MIME parameters see
section 4.1.
3.3.7 Additional modes 3.3.7 Additional modes
This specification only defines the modes specified in sections This specification only defines the modes specified in sections
3.3.2 up to 3.3.6. Additional modes are expected to be defined in 3.3.2 up to 3.3.6. Additional modes are expected to be defined in
future RFCs. Each additional mode MUST be in full compliance with future RFCs. Each additional mode MUST be in full compliance with
this specification. this specification.
When defining a new mode care MUST be taken that an implementation When defining a new mode care MUST be taken that an implementation
of all features of this specification can decode the payload format of all features of this specification can decode the payload format
skipping to change at page 22, line 14 skipping to change at page 23, line 14
Required parameters: Required parameters:
MIME format parameters are not case dependent; however for clarity MIME format parameters are not case dependent; however for clarity
both upper and lower case are used in the names of the parameters both upper and lower case are used in the names of the parameters
described in this specification. described in this specification.
StreamType: StreamType:
The integer value that indicates the type of MPEG-4 stream that The integer value that indicates the type of MPEG-4 stream that
is carried; its coding corresponds to the values of the is carried; its coding corresponds to the values of the
streamType as defined in Table 9 (objectTypeIndication Values) streamType as defined in Table 9 (objectTypeIndication Values)
in ISO/IEC 14496-1. Note that the StreamType allows signalling of in ISO/IEC 14496-1. Note that the StreamType allows signaling of
an MPEG-7 stream; this RTP payload format is not designed to an MPEG-7 stream; this RTP payload format is not designed to
carry an MPEG-7 stream, and may not be suitable for transport of carry an MPEG-7 stream, and may not be suitable for transport of
MPEG-7 streams. MPEG-7 streams.
Profile-level-id: Profile-level-id:
A decimal representation of the MPEG-4 Profile Level indication. A decimal representation of the MPEG-4 Profile Level indication.
This parameter MUST be used in the capability exchange or This parameter MUST be used in the capability exchange or
session set-up procedure to indicate the MPEG-4 Profile and Level session set-up procedure to indicate the MPEG-4 Profile and Level
combination of which the relevant MPEG-4 media codec is capable. combination of which the relevant MPEG-4 media codec is capable.
skipping to change at page 23, line 11 skipping to change at page 24, line 11
For Clock Reference streams and Object Content Info streams, this For Clock Reference streams and Object Content Info streams, this
parameter has the decimal value zero, indicating that profile parameter has the decimal value zero, indicating that profile
and level information is conveyed through the OD framework. and level information is conveyed through the OD framework.
Config: Config:
A hexadecimal representation of an octet string that expresses A hexadecimal representation of an octet string that expresses
the media payload configuration. Configuration data is mapped the media payload configuration. Configuration data is mapped
onto the hexadecimal octet string on an MSB-first basis. The onto the hexadecimal octet string on an MSB-first basis. The
first bit of the configuration data SHALL be located at the MSB first bit of the configuration data SHALL be located at the MSB
of the first octet. In the last octet, if necessary to achieve of the first octet. In the last octet, if necessary to achieve
byte alignment, up to 7 zero-valued padding bits shall follow octet alignment, up to 7 zero-valued padding bits shall follow
the configuration data. the configuration data.
For MPEG-4 Audio streams, config is the audio object type For MPEG-4 Audio streams, config is the audio object type
specific decoder configuration data AudioSpecificConfig() as specific decoder configuration data AudioSpecificConfig() as
defined in ISO/IEC 14496-3. For Structured Audio, the defined in ISO/IEC 14496-3. For Structured Audio, the
AudioSpecificConfig() may be conveyed by other means, not AudioSpecificConfig() may be conveyed by other means, not
defined by this specification. If the AudioSpecificConfig() defined by this specification. If the AudioSpecificConfig()
is conveyed by other means for Structured Audio, then the is conveyed by other means for Structured Audio, then the
config MUST be a quoted empty hexadecimal octet string, as config MUST be a quoted empty hexadecimal octet string, as
follows: config="". follows: config="".
Note that a future mode of using this RTP payload format for Note that a future mode of using this RTP payload format for
skipping to change at page 23, line 50 skipping to change at page 24, line 50
defined in a future MPEG-4 IPMP specification. defined in a future MPEG-4 IPMP specification.
For Object Content Info (OCI) streams, this is the For Object Content Info (OCI) streams, this is the
OCIDecoderConfiguration() information of the OCI stream, as OCIDecoderConfiguration() information of the OCI stream, as
defined in section 8.4.2.4 in ISO/IEC 14496-1. defined in section 8.4.2.4 in ISO/IEC 14496-1.
For OD streams, Clock Reference streams and MPEG-J streams, this For OD streams, Clock Reference streams and MPEG-J streams, this
is a quoted empty hexadecimal octet string (config=""), as is a quoted empty hexadecimal octet string (config=""), as
no information on the decoder configuration is required. no information on the decoder configuration is required.
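The MSB-first mapping and zero padding described for the Config parameter can be illustrated with a short sketch. The function name is hypothetical, the configuration data is represented as a string of '0' and '1' characters purely for readability, and the choice of lower-case hexadecimal digits is an assumption of this example.

```python
def config_bits_to_hex(bits: str) -> str:
    """Map configuration bits onto a hexadecimal octet string, MSB first.

    The first bit lands in the MSB of the first octet; up to 7 zero-valued
    padding bits are appended to complete the last octet.
    """
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    padded = bits + "0" * (-len(bits) % 8)  # pad to an octet boundary
    return "".join(
        f"{int(padded[i:i + 8], 2):02x}" for i in range(0, len(padded), 8)
    )


thirteen_bit_example = config_bits_to_hex("1010101010101")
```

An empty bit string yields an empty octet string, which corresponds to the quoted empty value config="" used when no decoder configuration is conveyed.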
Mode: Mode:
The mode in which this specification is used. The following modes The mode in which this specification is used. The following modes
can be signalled: can be signaled:
mode=generic, mode=generic,
mode=CELP-cbr, mode=CELP-cbr,
mode=CELP-vbr, mode=CELP-vbr,
mode=AAC-lbr and mode=AAC-lbr and
mode=AAC-hbr. mode=AAC-hbr.
Other modes are expected to be defined in future RFCs. See also Other modes are expected to be defined in future RFCs. See also
section 3.3.7. sections 3.3.7 and 4.2 of RFC xxxx.
Optional general parameters: Optional general parameters:
ObjectType: ObjectType:
The decimal value from Table 8 in ISO/IEC 14496-1, indicating The decimal value from Table 8 in ISO/IEC 14496-1, indicating
the value of the objectTypeIndication of the transported stream. the value of the objectTypeIndication of the transported stream.
For BIFS streams this parameter MUST be present to signal the For BIFS streams this parameter MUST be present to signal the
version of BIFSConfiguration(). Note that the ObjectType MAY version of BIFSConfiguration(). Note that the ObjectType MAY
signal a non-MPEG-4 stream, and that the RTP payload format signal a non-MPEG-4 stream, and that the RTP payload format
defined in this document may not be suitable to carry a stream defined in this document may not be suitable to carry a stream
that is not defined by MPEG-4. that is not defined by MPEG-4.
ConstantSize: ConstantSize:
The constant size in octets of each Access Unit for this stream. The constant size in octets of each Access Unit for this stream.
Simultaneous presence of ConstantSize and the SizeLength Simultaneous presence of ConstantSize and the SizeLength
parameters is not permitted. parameters is not permitted.
Profile: Profile:
The decimal representation of the applied profile to constrain The decimal representation of the applied profile to constrain
the latency when interleaving; see section 3.2.3.3. Absence of the latency when interleaving; see section 3.2.3.3. Absence of
this parameter signals that the profile is not specified. this parameter signals that the profile is not specified. This
parameter MUST be present when interleaving is applied.
Optional configuration parameters: Optional configuration parameters:
SizeLength: SizeLength:
The number of bits on which the AU-size field is encoded in the The number of bits on which the AU-size field is encoded in the
AU-header. Simultaneous presence of SizeLength and the AU-header. Simultaneous presence of SizeLength and the
ConstantSize parameter is not permitted. ConstantSize parameter is not permitted.
IndexLength: IndexLength:
The number of bits on which the AU-Index is encoded in the first The number of bits on which the AU-Index is encoded in the first
skipping to change at page 25, line 6 skipping to change at page 26, line 6
DTSDeltaLength: DTSDeltaLength:
The number of bits on which the DTS-delta field is encoded in The number of bits on which the DTS-delta field is encoded in
the AU-header. the AU-header.
RandomAccessIndication: RandomAccessIndication:
A decimal value of zero or one, indicating whether the RAP-flag A decimal value of zero or one, indicating whether the RAP-flag
is present in the AU-header. The decimal value of one indicates is present in the AU-header. The decimal value of one indicates
presence of the RAP-flag, the default value zero its absence. presence of the RAP-flag, the default value zero its absence.
StreamStateIndication: StreamStateIndication:
A decimal value of zero or one, indicating whether the The number of bits on which the Stream-state field is encoded in
Stream-state field is present in the AU-header. The decimal the AU-header. This parameter MAY be present when transporting
value of one indicates presence of the Stream-state field, the MPEG-4 system streams, and SHALL NOT be present for MPEG-4 audio and
default value zero its absence. MPEG-4 video streams.
AuxiliaryDataSizeLength: AuxiliaryDataSizeLength:
The number of bits that is used to encode the auxiliary-data-size The number of bits that is used to encode the auxiliary-data-size
field. field.
Applications MAY use more parameters, in addition to those defined Applications MAY use more parameters, in addition to those defined
above. Receivers MUST tolerate the presence of such additional above. Each additional parameter MUST be registered with IANA, to
parameters, but these parameters SHALL not impact the decoding of ensure that there is no clash of names. Each additional parameter
receivers that comply with this specification. MUST be accompanied by a specification in the form of an RFC, MPEG
standard, or other permanent and readily available reference (the
"Specification Required" policy defined in RFC 2434). Receivers MUST
tolerate the presence of such additional parameters, but these
parameters SHALL NOT impact the decoding of receivers that comply with
this specification.
Encoding considerations: Encoding considerations:
System bitstreams MUST be generated according to MPEG-4 Systems System bitstreams MUST be generated according to MPEG-4 Systems
specifications (ISO/IEC 14496-1). Video bitstreams MUST be generated specifications (ISO/IEC 14496-1). Video bitstreams MUST be generated
according to MPEG-4 Visual specifications (ISO/IEC 14496-2). Audio according to MPEG-4 Visual specifications (ISO/IEC 14496-2). Audio
bitstreams MUST be generated according to MPEG-4 Audio bitstreams MUST be generated according to MPEG-4 Audio
specifications (ISO/IEC 14496-3). The RTP packets MUST be packetized specifications (ISO/IEC 14496-3). The RTP packets MUST be packetized
according to the RTP payload format defined in RFC xxxx. according to the RTP payload format defined in RFC xxxx.
Security considerations: Security considerations:
skipping to change at page 26, line 28 skipping to change at page 27, line 33
Macintosh File Type Code(s): none Macintosh File Type Code(s): none
Person & email address to contact for further information: Person & email address to contact for further information:
Authors of RFC xxxx, IETF Audio/Video Transport working group. Authors of RFC xxxx, IETF Audio/Video Transport working group.
Intended usage: COMMON Intended usage: COMMON
Author/Change controller: Author/Change controller:
Authors of RFC xxxx, IETF Audio/Video Transport working group. Authors of RFC xxxx, IETF Audio/Video Transport working group.
4.2 Concatenation of parameters 4.2 Registration of mode definitions with IANA
This specification can be used in a number of modes. The mode of
operation is signaled using the "Mode" MIME parameter, with the
initial set of values specified in Section 4.1. New modes may be
defined at any time, as described in Section 3.3.7. These modes
MUST be registered with IANA, to ensure that there is no clash
of names.
A new mode registration MUST be accompanied by a specification in
the form of an RFC, MPEG standard, or other permanent and readily
available reference (the "Specification Required" policy defined
in RFC 2434).
4.3 Concatenation of parameters
Multiple parameters SHOULD be expressed as a MIME media type string, Multiple parameters SHOULD be expressed as a MIME media type string,
in the form of a semicolon-separated list of parameter=value pairs in the form of a semicolon-separated list of parameter=value pairs
(for parameter usage examples see sections 3.3.2 up to 3.3.6). (for parameter usage examples see sections 3.3.2 up to 3.3.6).
4.3 Usage of SDP 4.4 Usage of SDP
4.3.1 The a=fmtp keyword 4.4.1 The a=fmtp keyword
It is assumed that one typical way to transport the above-described It is assumed that one typical way to transport the above-described
parameters associated with this payload format is via an SDP message parameters associated with this payload format is via an SDP message
[6] for example transported to the client in reply to an RTSP [6] for example transported to the client in reply to an RTSP
DESCRIBE or via SAP. In that case the (a=fmtp) keyword MUST be used DESCRIBE or via SAP. In that case the (a=fmtp) keyword MUST be used
as described in RFC 2327 [6], section 6, the syntax being then: as described in RFC 2327 [6], section 6, the syntax being then:
a=fmtp:<format> <parameter name>=<value>[; <parameter name>=<value>] a=fmtp:<format> <parameter name>=<value>[; <parameter name>=<value>]
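A receiver-side reading of this syntax might look like the sketch below. The function name is hypothetical, and the sketch assumes the a=fmtp line has already been unwrapped into a single line. Parameter names are folded to lower case because MIME format parameters are not case dependent; surrounding double quotes are stripped so that config="" yields an empty string.

```python
def parse_fmtp(line: str):
    """Parse 'a=fmtp:<format> name=value; name=value' into (format, dict)."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    fmt, _, rest = line[len(prefix):].partition(" ")
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"parameter without a value: {item!r}")
        params[name.strip().lower()] = value.strip().strip('"')
    return fmt, params


payload_type, fmtp_params = parse_fmtp(
    'a=fmtp:96 streamtype=5; profile-level-id=15; mode=AAC-hbr; '
    'config=""; SizeLength=13; IndexLength=3; IndexDeltaLength=3; Profile=1'
)
```

Unknown parameter names are kept in the dictionary rather than rejected, in line with the requirement that receivers tolerate additional parameters.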
5. Security Considerations 5. Security Considerations
skipping to change at page 31, line 12 skipping to change at page 32, line 12
(which has a time-stamp beyond the times of all the frames seen so (which has a time-stamp beyond the times of all the frames seen so
far), that we can finish dealing with the loss, even though the far), that we can finish dealing with the loss, even though the
first group has, in fact, ended. (This is in contrast to schemes first group has, in fact, ended. (This is in contrast to schemes
which signal the group size explicitly; if the receiver knows that which signal the group size explicitly; if the receiver knows that
this is packet 3 of 3, then even if 2 of 3 is missing, it can this is packet 3 of 3, then even if 2 of 3 is missing, it can
de-interleave this group without waiting for the next one to start). de-interleave this group without waiting for the next one to start).
In the above example the AU-Index is coded with the value 0, as In the above example the AU-Index is coded with the value 0, as
required for the modes defined in this document. To reconstruct the required for the modes defined in this document. To reconstruct the
original order, the RTP time stamp and the AU-Index-delta are used. original order, the RTP time stamp and the AU-Index-delta are used.
See also 3.2.3.2. See also section 3.2.3.2.
Another example of forming packets with group interleave is given Another example of forming packets with group interleave is given
below. In this example the packets are formed such that the loss of below. In this example the packets are formed such that the loss of
two subsequent RTP packets does not cause the loss of two subsequent two subsequent RTP packets does not cause the loss of two subsequent
audio frames. Note that in this example the RTP time stamps of audio frames. Note that in this example the RTP time stamps of
packets 3 and 4 are earlier than the RTP time stamps of packets 1 packets 3 and 4 are earlier than the RTP time stamps of packets 1
and 2. and 2, respectively.
Packet Time stamp Frame Numbers AU-Index, AU-Index-delta Packet Time stamp Frame Numbers AU-Index, AU-Index-delta
0 T[0] 0, 5, 10, 15 0, 5, 5, 5 0 T[0] 0, 5, 10, 15 0, 5, 5, 5
1 T[2] 2, 7, 12, 17 0, 5, 5, 5 1 T[2] 2, 7, 12, 17 0, 5, 5, 5
2 T[4] 4, 9, 14, 19 0, 5, 5, 5 2 T[4] 4, 9, 14, 19 0, 5, 5, 5
3 T[1] 1, 6, 11, 16 0, 5, 5, 5 3 T[1] 1, 6, 11, 16 0, 5, 5, 5
4 T[3] 3, 8, 13, 18 0, 5, 5, 5 4 T[3] 3, 8, 13, 18 0, 5, 5, 5
5 T[20] 20, 25, 30, 35 0, 5, 5, 5 5 T[20] 20, 25, 30, 35 0, 5, 5, 5
and so on .. and so on ..
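The loss property claimed for this packetization can be checked mechanically. The sketch below reads the table literally: the first frame number of a packet is taken from the index k of its time stamp T[k], and each following frame number is the previous one plus the listed AU-Index-delta. That reading is an assumption made for this illustration; the normative rule is in section 3.2.3.2, which is not reproduced here. All names are hypothetical.

```python
from itertools import accumulate

# (first frame number taken from T[k], listed AU-Index / AU-Index-delta)
PACKETS = [
    (0, [0, 5, 5, 5]),
    (2, [0, 5, 5, 5]),
    (4, [0, 5, 5, 5]),
    (1, [0, 5, 5, 5]),
    (3, [0, 5, 5, 5]),
    (20, [0, 5, 5, 5]),
]


def frames_in_packet(first_frame, index_fields):
    """Frame numbers carried by one packet, per the table's convention."""
    return [first_frame + offset for offset in accumulate(index_fields)]


def loses_adjacent_frames(packets, lost_a, lost_b):
    """True if dropping both packets removes two consecutive frames."""
    lost = sorted(
        frames_in_packet(*packets[lost_a]) + frames_in_packet(*packets[lost_b])
    )
    return any(later - earlier == 1 for earlier, later in zip(lost, lost[1:]))


# Check every pair of packets that are adjacent in transmission order.
ALL_PAIRS_SAFE = not any(
    loses_adjacent_frames(PACKETS, i, i + 1) for i in range(len(PACKETS) - 1)
)
```

For the six packets listed, no pair of packets that are consecutive in transmission order carries two consecutive audio frames, which is the property the example is constructed to show.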
 End of changes. 
