draft-ietf-avt-mpeg4-multisl-00.txt   draft-ietf-avt-mpeg4-multisl-01.txt 
skipping to change at line 12 skipping to change at line 12
Internet Engineering Task Force Avaro-France Telecom Internet Engineering Task Force Avaro-France Telecom
Internet Draft Basso-AT&T Internet Draft Basso-AT&T
Casner-Packet Design Casner-Packet Design
Civanlar-AT&T Civanlar-AT&T
Gentric-Philips Gentric-Philips
Herpel-Thomson Herpel-Thomson
Lifshitz-Optibase Lifshitz-Optibase
Lim-mp4cast Lim-mp4cast
Perkins-ISI Perkins-ISI
van der Meer-Philips van der Meer-Philips
June 2001 July 2001
Expires Dec. 2001 Expires Jan. 2002
Document: draft-ietf-avt-mpeg4-multisl-00.txt Document: draft-ietf-avt-mpeg4-multisl-01.txt
RTP Payload Format for MPEG-4 Streams RTP Payload Format for MPEG-4 Streams
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Internet-Drafts are draft documents valid for a maximum of Drafts. Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by other six months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in as reference material or to cite them other than as "work in
progress." progress."
This specification is a product of the Audio/Video Transport working This specification is a product of the Audio/Video Transport working
group within the Internet Engineering Task Force and ISO/IEC MPEG-4 group within the Internet Engineering Task Force and ISO/IEC MPEG-4
ad hoc group on MPEG-4 over Internet. Comments are solicited and ad hoc group on MPEG-4 over Internet. Comments are solicited and
should be addressed to the working group's mailing list at rem- should be addressed to the working group's mailing list at
conf@es.net and/or the authors. avt@ietf.org and/or the authors.
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This document contains a MIME type registration form that is This document contains a MIME type registration form that is
intended to be taken as-is and therefore makes reference to this intended to be taken as-is and therefore makes reference to this
document, using the temporary placeholder: <self-reference-to-this>. document, using the temporary placeholder: <self-reference-to-this>.
Abstract Abstract
This document describes a payload format for transporting MPEG-4 This document describes a payload format for transporting MPEG-4
encoded data using RTP. MPEG-4 is a recent standard from ISO/IEC for encoded data using RTP. MPEG-4 is a recent standard from ISO/IEC for
the coding of natural and synthetic audio-visual data. Several the coding of natural and synthetic audio-visual data. Several
services provided by RTP are beneficial for MPEG-4 encoded data services provided by RTP are beneficial for MPEG-4 encoded data
Gentric et al. Expires December 2001 1 Gentric et al. Expires January 2002 1
RTP Payload Format for MPEG-4 Streams June 2001 RTP Payload Format for MPEG-4 Streams July 2001
transport over the Internet. Additionally, the use of RTP makes it transport over the Internet. Additionally, the use of RTP makes it
possible to synchronize MPEG-4 data with other real-time data types. possible to synchronize MPEG-4 data with other real-time data types.
1. Introduction 1. Introduction
MPEG-4 is a recent standard from ISO/IEC for the coding of natural MPEG-4 is a recent standard from ISO/IEC for the coding of natural
and synthetic audio-visual data in the form of audiovisual objects and synthetic audio-visual data in the form of audiovisual objects
that are arranged into an audiovisual scene by means of a scene that are arranged into an audiovisual scene by means of a scene
description [1][2][3][4]. This draft specifies an RTP [5] payload description [1][2][3][4]. This draft specifies an RTP [5] payload
skipping to change at line 111 skipping to change at line 111
technologies that are different than the one it is specifically technologies that are different than the one it is specifically
designed to operate with. designed to operate with.
The hierarchical relations, location and properties of ESs in a The hierarchical relations, location and properties of ESs in a
presentation are described by a dynamic set of Object Descriptors presentation are described by a dynamic set of Object Descriptors
(ODs). Each OD groups one or more ES Descriptors referring to a (ODs). Each OD groups one or more ES Descriptors referring to a
single content item (audio-visual object). Hence, multiple single content item (audio-visual object). Hence, multiple
alternative or hierarchical representations of each content item are alternative or hierarchical representations of each content item are
possible. possible.
ODs are themselves conveyed through one or more ESs. A complete set ODs are themselves conveyed through one or more ESs. A complete set
of ODs can be seen as an MPEG-4 resource or session description at a of ODs can be seen as an MPEG-4 resource or session description at a
stream level. The resource description may itself be hierarchical, stream level. The resource description may itself be hierarchical,
i.e. an ES conveying an OD may describe other ESs conveying other i.e. an ES conveying an OD may describe other ESs conveying other
ODs. ODs.
The session description is accompanied by a dynamic scene The session description is accompanied by a dynamic scene
description, Binary Format for Scene (BIFS), again conveyed through description, Binary Format for Scene (BIFS), again conveyed through
one or more ESs. At this level, content is identified in terms of one or more ESs. At this level, content is identified in terms of
skipping to change at line 167 skipping to change at line 167
then encapsulated in SL packets and in the following we will then encapsulated in SL packets and in the following we will
describe this payload format as transporting SL packets, although in describe this payload format as transporting SL packets, although in
many cases SL packet payloads are actually (entire) Access Units many cases SL packet payloads are actually (entire) Access Units
payloads i.e. encoded media frames. All consecutive data from one payloads i.e. encoded media frames. All consecutive data from one
stream is called an SL-packetized stream at this layer. The stream is called an SL-packetized stream at this layer. The
interface between the compression layer and the SL is called the interface between the compression layer and the SL is called the
Elementary Stream Interface (ESI). The ESI is informative i.e. it is Elementary Stream Interface (ESI). The ESI is informative i.e. it is
extremely useful in order to define concepts and mechanisms but does extremely useful in order to define concepts and mechanisms but does
not have to be implemented. For the same reason this draft describes not have to be implemented. For the same reason this draft describes
the transport of SL packets i.e. Access Units or fragments thereof. the transport of SL packets i.e. Access Units or fragments thereof.
It is important to note, however, that an SL stream can be configured It is important to note, however, that an SL stream can be configured
so that SL packets are reduced to the media (compressed) data and in so that SL packets are reduced to the media (compressed) data and in
that case implementations do not need to be aware of the SL at all. that case implementations do not need to be aware of the SL at all.
The Delivery Layer in MPEG-4 consists of the Delivery Multimedia The Delivery Layer in MPEG-4 consists of the Delivery Multimedia
Integration Framework defined in ISO/IEC 14496-6 [4]. This layer is Integration Framework defined in ISO/IEC 14496-6 [4]. This layer is
media unaware but delivery technology aware. It provides transparent media unaware but delivery technology aware. It provides transparent
access to and delivery of content irrespective of the technologies access to and delivery of content irrespective of the technologies
skipping to change at line 221 skipping to change at line 221
+-------------------------------------------+ +-------------------------------------------+
Figure 1: Conceptual MPEG-4 terminal architecture Figure 1: Conceptual MPEG-4 terminal architecture
1.2 MPEG-4 Elementary Stream Data Packetization 1.2 MPEG-4 Elementary Stream Data Packetization
The ESs from the encoders are fed into the SL with indications of AU The ESs from the encoders are fed into the SL with indications of AU
boundaries, random access points, desired composition time and the boundaries, random access points, desired composition time and the
current time. current time.
The Sync Layer fragments the ESs into SL packets, each containing a The Sync Layer fragments the ESs into SL packets, each containing a
header that encodes information conveyed through the ESI. If the AU header that encodes information conveyed through the ESI. If the AU
is larger than a SL packet, subsequent packets containing remaining is larger than a SL packet, subsequent packets containing remaining
parts of the AU are generated with subset headers until the complete parts of the AU are generated with subset headers until the complete
AU is packetized. AU is packetized.
The syntax of the Sync Layer is configurable and can be adapted to The syntax of the Sync Layer is configurable and can be adapted to
the needs of the stream to be transported. This includes the the needs of the stream to be transported. This includes the
possibility to select the presence or absence of individual syntax possibility to select the presence or absence of individual syntax
skipping to change at line 277 skipping to change at line 277
when transporting MPEG-4 systems, it is desirable to remove the when transporting MPEG-4 systems, it is desirable to remove the
redundancy between the SL packet header and the RTP packet header. redundancy between the SL packet header and the RTP packet header.
To be independent of the use of MPEG-4 systems, synchronization can To be independent of the use of MPEG-4 systems, synchronization can
rely on the parameters provided in the RTP header. rely on the parameters provided in the RTP header.
In case SL headers are used, the redundant fields are removed from In case SL headers are used, the redundant fields are removed from
the SL header, producing "reduced SL headers". the SL header, producing "reduced SL headers".
The remaining information from the SL header, if any, is contained The remaining information from the SL header, if any, is contained
inside the RTP packet payload, together with the SL packet payload. inside the RTP packet payload, together with the SL packet payload.
The combination of RTP packet headers and reduced SL packet headers The combination of RTP packet headers and reduced SL packet headers
can be used to logically map the RTP packets to complete SL packets. can be used to logically map the RTP packets to complete SL packets.
Some of the information contained in the reduced SL headers is also Some of the information contained in the reduced SL headers is also
useful for transport over RTP when MPEG-4 systems is not used. useful for transport over RTP when MPEG-4 systems is not used.
For that reason the information in the "reduced" SL headers is split For that reason the information in the "reduced" SL headers is split
into "general useful information" and "MPEG-4 systems only into "general useful information" and "MPEG-4 systems only
information". information".
skipping to change at line 329 skipping to change at line 329
<----RTP Packet Payload-------------------> <----RTP Packet Payload------------------->
Figure 2: Mapping of SL Packet into RTP packet Figure 2: Mapping of SL Packet into RTP packet
When the configuration is such that SL packet headers map directly When the configuration is such that SL packet headers map directly
to RTP headers this process of mapping SL packet headers is purely to RTP headers this process of mapping SL packet headers is purely
conceptual. For example this RTP payload format has been designed so conceptual. For example this RTP payload format has been designed so
that it is by default configured to be identical to RFC 3016 for the that it is by default configured to be identical to RFC 3016 for the
recommended MPEG-4 video configurations (see section 5.5). Hence recommended MPEG-4 video configurations (see section 5.5). Hence
receivers that comply with this payload specification can decode receivers that comply with this payload specification can decode
such RTP payload without knowledge about the Synch Layer (see also such RTP payload without knowledge about the Synch Layer (see the
the example in Appendix.2). In a similar fashion MPEG-4 audio (see example in Appendix.1). In a similar fashion MPEG-4 audio (see
Appendix.3 and Appendix.4) can be transported without explicit use Appendix for examples) can be transported without explicit use of
of the Synch Layer. the Synch Layer.
3. Payload Format 3. Payload Format
The RTP Payload corresponds to an integer number of SL packets. The RTP Payload corresponds to an integer number of SL packets.
If multiple SL packets are transported in each RTP packet, they MUST If multiple SL packets are transported in each RTP packet, they MUST
be in decoding order, i.e.: be in decoding order, i.e.:
i) decodingTimeStamp order, if present i) decodingTimeStamp order, if present
ii) packetSequenceNumber order, if present ii) packetSequenceNumber order, if present
iii) Implicit decoding order in all other cases. iii) Implicit decoding order in all other cases.
skipping to change at line 389 skipping to change at line 389
new packet format is outside the scope of this document, and will new packet format is outside the scope of this document, and will
not be specified here. It is expected that the RTP profile for a not be specified here. It is expected that the RTP profile for a
particular class of applications will assign a payload type for this particular class of applications will assign a payload type for this
encoding, or if that is not done then a payload type in the dynamic encoding, or if that is not done then a payload type in the dynamic
range shall be chosen. range shall be chosen.
Marker (M) bit: The M bit is set to 1 when all SL packets in the RTP Marker (M) bit: The M bit is set to 1 when all SL packets in the RTP
packet are Access Unit ends, i.e. the M bit maps to the Synch Layer packet are Access Unit ends, i.e. the M bit maps to the Synch Layer
accessUnitEndFlag. accessUnitEndFlag.
Specifically the M bit is set to 0 when the RTP packet contains one Specifically the M bit is set to 0 when the RTP packet contains one
or more Access Unit fragments that are not Access Unit ends, and the or more Access Unit fragments that are not Access Unit ends, and the
M bit is set to 1 for RTP packets that contain either: M bit is set to 1 for RTP packets that contain either:
. A single complete Access Unit . A single complete Access Unit
. The last fragment of an Access Unit . The last fragment of an Access Unit
. Several complete Access Units . Several complete Access Units
. Several last fragments of Access Units . Several last fragments of Access Units
. A mix of complete Access Units and last fragments of Access Units . A mix of complete Access Units and last fragments of Access Units
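The marker bit rule above can be sketched as follows. This is a minimal
illustration, not part of either draft revision: the SLPacket class and
the marker_bit function are hypothetical names, and the model assumes
each SL packet exposes its accessUnitEndFlag as a boolean.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SLPacket:
    # Hypothetical model of one SL packet carried in an RTP payload.
    payload: bytes
    access_unit_end: bool  # mirrors the Sync Layer accessUnitEndFlag


def marker_bit(sl_packets: List[SLPacket]) -> int:
    """Return 1 only if every SL packet in the RTP packet ends an Access Unit."""
    if not sl_packets:
        raise ValueError("an RTP packet carries at least one SL packet")
    return 1 if all(p.access_unit_end for p in sl_packets) else 0
```

A complete Access Unit and a last fragment both have access_unit_end set,
so any mix of the two yields M=1, while a single non-final fragment in the
packet forces M=0.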
skipping to change at line 444 skipping to change at line 444
at the time the RTP packet is created. at the time the RTP packet is created.
According to RFC1889 [5, Section 5.1] timestamps are recommended to According to RFC1889 [5, Section 5.1] timestamps are recommended to
start at a random value for security reasons. However, in that case a start at a random value for security reasons. However, in that case a
receiver is generally unable to reconstruct the original receiver is generally unable to reconstruct the original
MPEG-4 Time Stamps (CTS, DTS, OCR), which can be of use for MPEG-4 Time Stamps (CTS, DTS, OCR), which can be of use for
applications where streams from multiple sources are to be applications where streams from multiple sources are to be
synchronized. Therefore the usage of such a random offset SHOULD be synchronized. Therefore the usage of such a random offset SHOULD be
avoided. avoided.
Note that since RTP devices may re-stamp the stream, all time stamps Note that since RTP devices may re-stamp the stream, all time stamps
inside of the RTP payload (CTS and DTS in MSLH, OCR in RSLH) MUST be inside of the RTP payload (CTS and DTS in MSLH, OCR in RSLH) MUST be
expressed as difference to the RTP time stamp. Since this expressed as difference to the RTP time stamp. Since this
subtraction may lead to negative values, the offset MUST be encoded subtraction may lead to negative values, the offset MUST be encoded
as a two's complement signed integer in network byte order. Note that as a two's complement signed integer in network byte order. Note that
these offsets (deltas) typically require far fewer bits to encode these offsets (deltas) typically require far fewer bits to encode
than the original time stamps, which is another justification. than the original time stamps, which is another justification.
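As an illustration of the offset encoding described above, the sketch
below converts a time stamp to an N-bit two's complement field relative
to the RTP time stamp, and back. The function names and the length_bits
parameter are assumptions made for this example, and wrap-around of the
32-bit RTP time stamp is deliberately ignored.

```python
def encode_delta(timestamp: int, rtp_timestamp: int, length_bits: int) -> int:
    """Encode (timestamp - rtp_timestamp) as an unsigned field value
    holding a two's complement signed integer of length_bits bits."""
    delta = timestamp - rtp_timestamp
    lowest = -(1 << (length_bits - 1))
    highest = (1 << (length_bits - 1)) - 1
    if not lowest <= delta <= highest:
        raise ValueError("offset does not fit in the configured field length")
    # Masking maps negative values onto their two's complement bit pattern.
    return delta & ((1 << length_bits) - 1)


def decode_delta(field: int, rtp_timestamp: int, length_bits: int) -> int:
    """Recover the original time stamp from the field and the RTP time stamp."""
    if field & (1 << (length_bits - 1)):  # sign bit set: value is negative
        field -= 1 << length_bits
    return rtp_timestamp + field
```

With an 8-bit field, a time stamp 100 ticks before the RTP time stamp is
carried as the bit pattern for -100, and the receiver recovers it using
whatever RTP time stamp it actually sees, even after re-stamping.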
When startCompositionTimeStamp is signaled in the SLConfigDescriptor When startCompositionTimeStamp is signaled in the SLConfigDescriptor
skipping to change at line 500 skipping to change at line 500
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X| CC |M| PT | sequence number | |V=2|P|X| CC |M| PT | sequence number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| timestamp | | timestamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| synchronization source (SSRC) identifier | | synchronization source (SSRC) identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: contributing source (CSRC) identifiers : : contributing source (CSRC) identifiers :
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| | | |
| MSLHSection (byte aligned) | | MSLHSection (byte aligned) |
| | | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
skipping to change at line 555 skipping to change at line 555
This size field is absent in the Single-SL mode not because it is This size field is absent in the Single-SL mode not because it is
not needed (which would be a minor gain) but for compatibility with not needed (which would be a minor gain) but for compatibility with
RFC 3016. RFC 3016.
This size field is also absent when the value would always be zero This size field is also absent when the value would always be zero
because the MSLH is always empty, which may happen when a constant because the MSLH is always empty, which may happen when a constant
size is signaled using ConstantSize. size is signaled using ConstantSize.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MSLH section size in bits | MSLH | etc | | MSLH section size in bits | MSLH | etc |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| as many bit-wise concatenated MSLHs | | as many bit-wise concatenated MSLHs |
| as SL packets in this RTP packet | | as SL packets in this RTP packet |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| : padding bits| | : padding bits|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
skipping to change at line 606 skipping to change at line 606
by parsing it since for example the presence of CTSDelta is signaled by parsing it since for example the presence of CTSDelta is signaled
by the value of CTSFlag. by the value of CTSFlag.
3.4.1 Fields of MSLH 3.4.1 Fields of MSLH
PayloadSize: Indicates the size in bytes of the associated SL Packet PayloadSize: Indicates the size in bytes of the associated SL Packet
Payload, which can be found in the SLPPSection of the RTP packet. Payload, which can be found in the SLPPSection of the RTP packet.
The length in bits of this field is signaled by the SizeLength The length in bits of this field is signaled by the SizeLength
parameter (see section 4.1). parameter (see section 4.1).
IndexDelta: Encodes the packetSequenceNumber (serial number) of the There is an exception to that: when the RTP packet contains a single
SL Packet. When making streams specifically for transport with this SL packet the PayloadSize field SHALL contain the size of the entire
payload format this is useful for interleaving. Since a mapping to corresponding Access Unit, for two reasons, firstly the size of the
RTP sequence number is not possible in the Multiple-SL mode there is fragment is not needed when there is only one fragment, secondly
no requirement for a correspondence.
Index is found only for the first SL packet of a RTP packet. this is useful in order to detect that a full Access Unit has been
IndexDelta is optional and -if present- appears for subsequent (non- received after the loss of a packet carrying M bit set to 1.
first) SL packets in a RTP packet.
Index, IndexDelta: Encodes the packetSequenceNumber (serial number)
of the SL Packet. When making streams specifically for transport
with this payload format IndexDelta is useful for interleaving (see
section 3.8). Since a mapping of packetSequenceNumber to RTP
sequence number is not possible in the Multiple-SL mode there is no
requirement for a correspondence.
Index is optional and -if present- appears for the first SL packet
in a RTP packet.
The length in bits of the Index field is defined by the IndexLength The length in bits of the Index field is defined by the IndexLength
parameter (see section 4.1). parameter (see section 4.1).
IndexDelta is optional and -if present- appears for subsequent (non-
first) SL packets in a RTP packet.
The length in bits of the IndexDelta field is defined by the The length in bits of the IndexDelta field is defined by the
IndexDeltaLength parameter (see section 4.1). IndexDeltaLength parameter (see section 4.1).
If the parameter IndexDeltaLength is defined, non-first SL packets If the parameter IndexDeltaLength is defined, non-first SL packets
inside a RTP packet have their packetSequenceNumber encoded as a inside a RTP packet have their packetSequenceNumber encoded as a
difference named IndexDelta. This difference is relative to the difference (thus the name IndexDelta). This difference is relative
previous SL packet in the RTP packet according to (with i>=0): to the previous SL packet in the RTP packet according to (with
i>=0):
packetSequenceNumber(0) = Index(0) packetSequenceNumber(0) = Index(0)
packetSequenceNumber(i+1) = packetSequenceNumber(i) + packetSequenceNumber(i+1) = packetSequenceNumber(i) +
IndexDelta(i+1) + 1 IndexDelta(i+1) + 1
If the parameter IndexDeltaLength is not defined the default value If the parameter IndexDeltaLength is not defined the default value
is zero and then the IndexDelta field is not present for non-first is zero and then the IndexDelta field is not present for non-first
SL packets. Nevertheless receivers SHALL then apply the above SL packets. Nevertheless receivers SHALL then apply the above
formula with IndexDelta equal to zero. In other words by default formula with IndexDelta equal to zero. In other words by default
packetSequenceNumber is incremented by 1 for each SL packet in one packetSequenceNumber is incremented by 1 for each SL packet in one
RTP packet. RTP packet.
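The reconstruction rule above can be sketched as follows. The function
name is hypothetical, and wrap-around of packetSequenceNumber at its
configured field length is not modelled. When IndexDeltaLength is not
defined, the caller passes a zero for every non-first SL packet.

```python
from typing import List


def packet_sequence_numbers(index: int, index_deltas: List[int]) -> List[int]:
    """Rebuild packetSequenceNumber for every SL packet in one RTP packet.

    index        -- Index field of the first SL packet
    index_deltas -- IndexDelta of each non-first SL packet, in order
                    (all zeros when IndexDeltaLength is not defined)
    """
    numbers = [index]
    for delta in index_deltas:
        # packetSequenceNumber(i+1) = packetSequenceNumber(i) + IndexDelta(i+1) + 1
        numbers.append(numbers[-1] + delta + 1)
    return numbers
```

For example, an Index of 5 followed by three zero deltas gives 5, 6, 7, 8,
while deltas of 2 describe an interleaved stream in which every third SL
packet is carried in this RTP packet.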
skipping to change at line 654 skipping to change at line 666
If CTSDeltaLength is not zero, CTSFlag is present in all MSLH If CTSDeltaLength is not zero, CTSFlag is present in all MSLH
regardless of whether the SL packet is an Access Unit start or not. regardless of whether the SL packet is an Access Unit start or not.
CTSDelta (CTSDeltaLength bits): Specifies the value of the CTS as a CTSDelta (CTSDeltaLength bits): Specifies the value of the CTS as a
two's complement offset (delta) from the timestamp in the RTP header of two's complement offset (delta) from the timestamp in the RTP header of
the RTP packet. The length in bits of each CTSDelta field is the RTP packet. The length in bits of each CTSDelta field is
specified by the CTSDeltaLength parameter (see section 4.1). specified by the CTSDeltaLength parameter (see section 4.1).
The CTSDelta field is present if CTSFlag is 1. The CTSDelta field is present if CTSFlag is 1.
For the first MSLH of each RTP packet CTSFlag is always 0, since the For the first MSLH of each RTP packet CTSFlag is always 0, since the
composition time stamp of the first SL packet in the RTP packet is composition time stamp of the first SL packet in the RTP packet is
mapped to the RTP time stamp. In all cases the sender MUST remove mapped to the RTP time stamp. In all cases the sender MUST remove
the compositionTimeStamp from the RSLH. the compositionTimeStamp from the RSLH.
DTSFlag (1 bit): Indicates whether the DTSDelta field is present. A DTSFlag (1 bit): Indicates whether the DTSDelta field is present. A
value of 1 indicates that DTSDelta is present, a value of 0 that it value of 1 indicates that DTSDelta is present, a value of 0 that it
is not present. is not present.
If DTSDeltaLength is not zero, DTSFlag is present in all MSLH If DTSDeltaLength is not zero, DTSFlag is present in all MSLH
regardless of whether the SL packet is an Access Unit start or not; regardless of whether the SL packet is an Access Unit start or not;
the receiver needs this flag in order to reconstruct the the receiver needs this flag in order to reconstruct the
decodingTimeStampFlag of SL Headers. decodingTimeStampFlag of SL Headers.
DTSDelta (DTSDeltaLength bits): encodes (compositionTimeStamp - DTSDelta (DTSDeltaLength bits): encodes (compositionTimeStamp -
decodingTimeStamp) for the same SL packet (always positive). The decodingTimeStamp) for the same SL packet (always positive). The
length in bits of each DTSDelta field is specified by the length in bits of each DTSDelta field is specified by the
DTSDeltaLength parameter (see section 4.1). DTSDeltaLength parameter (see section 4.1).
The DTSDelta field appears when DTSFlag is 1. The sender MUST always The DTSDelta field appears when DTSFlag is 1. The sender MUST always
remove the decodingTimeStamp from the RSLH. remove the decodingTimeStamp from the RSLH.
If DTSDelta is zero i.e. if decodingTimeStamp equals If DTSDelta is zero i.e. if decodingTimeStamp equals
compositionTimeStamp then DTSFlag MUST be set to 0 and no DTSDelta compositionTimeStamp then DTSFlag MUST be set to 0 and no DTSDelta
skipping to change at line 710 skipping to change at line 722
+---------------------------+---------------------------------+ +---------------------------+---------------------------------+
| DTSFlag | 1 If (DTSDeltaLength > 0) | | DTSFlag | 1 If (DTSDeltaLength > 0) |
+---------------------------+---------------------------------+ +---------------------------+---------------------------------+
| DTSDelta | DTSDeltaLength If (DTSFlag==1) | | DTSDelta | DTSDeltaLength If (DTSFlag==1) |
+---------------------------+---------------------------------+ +---------------------------+---------------------------------+
Table 1: Relationship between MSLH field size and parameters Table 1: Relationship between MSLH field size and parameters
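The conditional presence of the time stamp fields can be illustrated with
a small bit-level parser. This is a sketch under stated assumptions: the
BitReader class and read_timestamp_fields function are hypothetical, only
the CTS/DTS portion of an MSLH is parsed, and the CTS fields are assumed
to precede the DTS fields. CTSDelta is read as a two's complement value
and DTSDelta as an unsigned value, following the field descriptions.

```python
class BitReader:
    """Reads big-endian (most significant bit first) fields from bytes."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def read_timestamp_fields(reader, cts_delta_length, dts_delta_length):
    """Parse CTSFlag/CTSDelta/DTSFlag/DTSDelta of one MSLH (assumed order)."""
    fields = {"cts_delta": None, "dts_delta": None}
    # CTSFlag is present only when CTSDeltaLength is not zero.
    if cts_delta_length > 0 and reader.read(1) == 1:
        raw = reader.read(cts_delta_length)
        if raw & (1 << (cts_delta_length - 1)):  # negative two's complement
            raw -= 1 << cts_delta_length
        fields["cts_delta"] = raw
    # DTSFlag is present only when DTSDeltaLength is not zero.
    if dts_delta_length > 0 and reader.read(1) == 1:
        fields["dts_delta"] = reader.read(dts_delta_length)
    return fields
```

When both length parameters are zero no bits are consumed, which matches
the case where the time stamp fields are absent from every MSLH.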
3.5 RSLHSection structure 3.5 RSLHSection structure
This section consists of a field (RSLHSectionSize) giving the size This section consists of a field (RSLHSectionSize) giving the size
in bits of the following block of bit-wise concatenated RSLHs. in bits of the following block of bit-wise concatenated RSLHs.
If the section consumes a non-integer number of bytes, up to 7 zero If the section consumes a non-integer number of bytes, up to 7 zero
padding bits MUST be inserted at the end in order to achieve byte- padding bits MUST be inserted at the end in order to achieve byte-
alignment. alignment.
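The amount of padding can be computed as in the sketch below. The
function name is hypothetical, and the example assumes that the section
length counted for alignment is the RSLHSectionSize field itself plus the
block of concatenated RSLHs whose size it announces.

```python
def rslh_padding_bits(size_field_length: int, rslh_block_bits: int) -> int:
    """Number of zero padding bits needed to byte-align the RSLHSection.

    size_field_length -- RSLHSectionSizeLength, in bits
    rslh_block_bits   -- value of RSLHSectionSize (bits of concatenated RSLHs)
    """
    section_bits = size_field_length + rslh_block_bits
    return (-section_bits) % 8
```

The result is always between 0 and 7, and is 0 when the section already
ends on a byte boundary.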
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RSLHSectionSize (RSLHSectionSizeLength bits)| RSLH (variable | | RSLHSectionSize (RSLHSectionSizeLength bits)| RSLH (variable |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| number of bits) | | number of bits) |
| | | |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | RSLH (variable number of bits) | | | RSLH (variable number of bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| etc | | etc |
| as many bit-wise concatenated RSLHs | | as many bit-wise concatenated RSLHs |
| as SL Packets in this RTP packet | | as SL Packets in this RTP packet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RSLH (variable number of bits) | | RSLH (variable number of bits) |
| +-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+
| : padding bits| | : padding bits|
|-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7: RSLHSection structure Figure 7: RSLHSection structure
The length in bits of the RSLHSectionSize field is The length in bits of the RSLHSectionSize field is
RSLHSectionSizeLength and is specified with a default value of zero RSLHSectionSizeLength and is specified with a default value of zero
indicating that the whole RSLHSection is absent. indicating that the whole RSLHSection is absent. Compatibility with
RFC 3016 requires that the RSLHSection should be empty, including
the RSLHSectionSize field. This is the reason why there is such a
variable length with a default value indicating absence of the
RSLHSectionSize field.
+=================================+===============================+ +=================================+===============================+
| Fields of RSLHSection | Number of bits | | Fields of RSLHSection | Number of bits |
+=================================+===============================+ +=================================+===============================+
| RSLHSectionSize | RSLHSectionSizeLength | | RSLHSectionSize | RSLHSectionSizeLength |
+---------------------------------+-------------------------------+ +---------------------------------+-------------------------------+
| all bit-wise concatenated RSLHs | RSLHSectionSize | | all bit-wise concatenated RSLHs | RSLHSectionSize |
+---------------------------------+-------------------------------+ +---------------------------------+-------------------------------+
Table 2: Sizes in bits inside RSLHSection Table 2: Sizes in bits inside RSLHSection
Parsing of the bit-wise concatenated RSLHs requires MPEG-4 system Parsing of the bit-wise concatenated RSLHs requires MPEG-4 system
awareness; specifically, it requires understanding the MPEG-4 awareness; specifically, it requires understanding the MPEG-4
Synchronization Layer (SL) syntax and the modifications to this Synchronization Layer (SL) syntax and the modifications to this
syntax described in the next section. syntax described in the next section.
However, thanks to the RSLHSectionSize field, non-MPEG-4-system However, thanks to the RSLHSectionSize field, non-MPEG-4-system
receivers MAY skip this part by rounding up RSLHSectionSize/8 to the receivers MAY skip this part by rounding up RSLHSectionSize/8 to the
next integer number of bytes. next integer number of bytes.
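The skip rule can be sketched as follows. This is an illustration only: the function name rslh_section_bytes is hypothetical, and it assumes that the RSLHSection starts on a byte boundary, that RSLHSectionSize is read MSB first, and that the padding covers the size field together with the concatenated RSLHs. Whether the bits of the size field count toward the rounded total is an interpretation of section 3.5, not something the text states outright.

```python
def rslh_section_bytes(payload: bytes, offset: int, size_length: int) -> int:
    """Number of bytes occupied by the RSLHSection that starts at `offset`.

    size_length is the RSLHSectionSizeLength parameter, in bits.
    """
    if size_length == 0:
        return 0  # default value: the whole RSLHSection is absent
    # Read RSLHSectionSize from the first size_length bits (MSB first, assumed).
    size_bits = 0
    for i in range(size_length):
        byte = payload[offset + i // 8]
        size_bits = (size_bits << 1) | ((byte >> (7 - i % 8)) & 1)
    # Assumption: the size field and the RSLH block are padded together.
    total_bits = size_length + size_bits
    return (total_bits + 7) // 8  # round up to a whole number of bytes
```

A receiver without MPEG-4 system awareness would advance its read position by the returned number of bytes and continue with the SL packet payloads.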
3.6 RSLH structure 3.6 RSLH structure
A Remaining SL Packet Header (RSLH) is what remains of an SL header A Remaining SL Packet Header (RSLH) is what remains of an SL header
after modifications for mapping into this payload format. after modifications for mapping into this payload format.
The following modifications of the SL packet header MUST be applied. The following modifications of the SL packet header MUST be applied.
The other fields of the SL packet header MUST remain unchanged but The other fields of the SL packet header MUST remain unchanged but
are bit-shifted to fill in the gaps left by the operations specified are bit-shifted to fill in the gaps left by the operations specified
below. below.
3.6.1 Removal of fields 3.6.1 Removal of fields
The following SL Packet Header fields -if present- are removed since The following SL Packet Header fields -if present- are removed since
they are mapped either in the RTP header or in the corresponding they are mapped either in the RTP header or in the corresponding
MSLH: MSLH:
. compositionTimeStampFlag . compositionTimeStampFlag
. compositionTimeStamp . compositionTimeStamp
. decodingTimeStampFlag . decodingTimeStampFlag
. decodingTimeStamp . decodingTimeStamp
. packetSequenceNumber . packetSequenceNumber
. AccessUnitEndFlag (in Single-SL mode only) . AccessUnitEndFlag (in Single-SL mode only)
The AccessUnitEndFlag, when present for a given stream, MUST be The AccessUnitEndFlag, when present for a given stream, MUST be
removed from every RSLH when using the Single-SL mode since it has removed from every RSLH when using the Single-SL mode since it has
the same meaning as the Marker bit (and for compatibility with RFC the same meaning as the Marker bit (and for compatibility with RFC
3016). However when using the Multiple-SL mode, AccessUnitEndFlag 3016). However when using the Multiple-SL mode, AccessUnitEndFlag
MUST NOT be removed since it is useful to signal individual AU ends. MUST NOT be removed since it is useful to signal individual AU ends.
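The mode-dependent removal list can be illustrated at the field level. The sketch below models an SL packet header as a Python dict of field names, which is a simplification: the real RSLH is a bit string in which the remaining fields are shifted to close the gaps. The names REMOVED_ALWAYS, removed_sl_fields and to_rslh are hypothetical.

```python
# Fields removed in both modes (mapped to the RTP header or the MSLH).
REMOVED_ALWAYS = (
    "compositionTimeStampFlag",
    "compositionTimeStamp",
    "decodingTimeStampFlag",
    "decodingTimeStamp",
    "packetSequenceNumber",
)


def removed_sl_fields(single_sl_mode: bool) -> tuple:
    """SL packet header fields that are dropped when building an RSLH."""
    if single_sl_mode:
        # In Single-SL mode the RTP Marker bit carries the same meaning.
        return REMOVED_ALWAYS + ("AccessUnitEndFlag",)
    # In Multiple-SL mode AccessUnitEndFlag stays in every RSLH.
    return REMOVED_ALWAYS


def to_rslh(sl_header: dict, single_sl_mode: bool) -> dict:
    """Keep only the fields that remain in the RSLH, in their original order."""
    removed = set(removed_sl_fields(single_sl_mode))
    return {name: value for name, value in sl_header.items()
            if name not in removed}
```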
skipping to change at line 821 skipping to change at line 835
For streams that use the optional degradationPriority field in the For streams that use the optional degradationPriority field in the
SL Packet Headers, only SL packets with the same degradation SL Packet Headers, only SL packets with the same degradation
priority SHALL be transported by one RTP packet so that components priority SHALL be transported by one RTP packet so that components
may dispatch the RTP packets according to appropriate QOS or may dispatch the RTP packets according to appropriate QOS or
protection schemes. Furthermore only the first RSLH of one RTP protection schemes. Furthermore only the first RSLH of one RTP
packet SHALL contain the degradationPriority field since it would be packet SHALL contain the degradationPriority field since it would be
otherwise redundant. otherwise redundant.
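A sender-side sketch of the degradationPriority rule follows. It is an illustration under stated assumptions: SL packets are modelled as dicts, the function name group_by_degradation_priority is hypothetical, only consecutive packets are grouped so that stream order is kept, and MTU limits are ignored.

```python
from itertools import groupby


def group_by_degradation_priority(sl_packets):
    """Split consecutive SL packets into runs sharing one degradationPriority.

    Each run is a candidate for one RTP packet in Multiple-SL mode.
    """
    groups = []
    for _, run in groupby(sl_packets, key=lambda p: p["degradationPriority"]):
        run = [dict(p) for p in run]  # copy so the input is left untouched
        # Only the first RSLH of an RTP packet keeps the field.
        for packet in run[1:]:
            del packet["degradationPriority"]
        groups.append(run)
    return groups
```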
3.7 SLPPSection structure 3.7 SLPPSection structure
The SLPPSection (SL Packet Payload Section) contains the The SLPPSection (SL Packet Payload Section) contains the
concatenated SL Packet Payloads. By definition SL Packet Payloads concatenated SL Packet Payloads. By definition SL Packet Payloads
are byte aligned. are byte aligned.
For efficiency SL packets do not carry their own payload size. This For efficiency SL packets do not carry their own payload size. This
is not an issue for RTP packets that contain a single SL Packet. is not an issue for RTP packets that contain a single SL Packet.
However in the Multiple-SL mode the size of each SL packet payload However in the Multiple-SL mode the size of each SL packet payload
MUST be available to the receiver. MUST be available to the receiver.
If the SL packet payload size is constant for a stream, the size If the SL packet payload size is constant for a stream, the size
information SHOULD NOT be transported in the RTP packet. However in information SHOULD NOT be transported in the RTP packet. However in
that case it MUST be signaled using the ConstantSize parameter (see that case it MUST be signaled using the ConstantSize parameter (see
section 4.1). section 4.1).
If the SL packet payload size is variable then the size of each SL If the SL packet payload size is variable then the size of each SL
packet payload MUST be indicated in the corresponding MSLH. In order packet payload MUST be indicated in the corresponding MSLH. In order
to do so the MSLH MUST contain a PayloadSize field. The number of to do so the MSLH MUST contain a PayloadSize field. The number of
bits on which this PayloadSize field is encoded MUST be indicated bits on which this PayloadSize field is encoded MUST be indicated
using the SizeLength parameter (see section 4.1). using the SizeLength parameter (see section 4.1).
The absence of either ConstantSize or SizeLength indicates the The absence of either ConstantSize or SizeLength indicates the
Single-SL mode i.e. that a single SL packet is transported in each Single-SL mode i.e. that a single SL packet is transported in each
RTP packet for that stream. RTP packet for that stream.
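The three cases above (constant size, per-packet PayloadSize, and Single-SL) can be sketched on the receiver side as follows. The function name split_slpp_section is hypothetical, and the sketch assumes that PayloadSize values are expressed in bytes, like ConstantSize; the excerpt here does not state the unit of PayloadSize.

```python
from typing import List, Optional, Sequence


def split_slpp_section(section: bytes, constant_size: int = 0,
                       payload_sizes: Optional[Sequence[int]] = None
                       ) -> List[bytes]:
    """Split an SLPPSection into individual SL packet payloads."""
    if constant_size and payload_sizes is not None:
        raise ValueError("ConstantSize and PayloadSize cannot both be used")
    if constant_size:
        if len(section) % constant_size:
            raise ValueError("section is not a multiple of ConstantSize")
        return [section[i:i + constant_size]
                for i in range(0, len(section), constant_size)]
    if payload_sizes is not None:
        # One PayloadSize per MSLH, assumed to be in bytes.
        if sum(payload_sizes) != len(section):
            raise ValueError("PayloadSize values do not match section length")
        payloads, pos = [], 0
        for size in payload_sizes:
            payloads.append(section[pos:pos + size])
            pos += size
        return payloads
    # Neither parameter: Single-SL mode, the section is one payload.
    return [section]
```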
skipping to change at line 873 skipping to change at line 887
|-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 8: SLPPSection structure Figure 8: SLPPSection structure
3.8 Interleaving 3.8 Interleaving
SL Packets MAY be interleaved. Senders MAY perform interleaving. SL Packets MAY be interleaved. Senders MAY perform interleaving.
Receivers MUST support interleaving. Receivers MUST support interleaving.
When interleaving of SL packets is used it SHALL be implemented When interleaving of SL packets is used it SHALL be implemented
using the Index field of MSLH. using the Index and IndexDelta fields of MSLH.
The conjunction of RTP sequence number and Index, IndexDelta can
produce a quasi-unique identifier for each SL packet so that a
receiver can unambiguously reconstruct the original order even in
case of out-of-order packets, packet loss or duplication.
However, implementors of receivers must take care that when
IndexLength is small, Index will roll over often. For that reason
timestamps SHOULD be used as the basis for de-interleaving, i.e.
the reordering algorithm should consider timestamps and IndexDelta
first and use Index only when CTS are not available. Symmetrically,
senders MUST either use suitably large values for IndexLength or use
small values only when CTS are either present in MSLH or can
otherwise be unambiguously computed for each SL packet (for example
audio streams as in Appendix.5).
The AUSequenceNumber field of the SL header MUST NOT be used for The AUSequenceNumber field of the SL header MUST NOT be used for
interleaving since firstly it may collide with the Scene Description interleaving since firstly it may collide with the Scene Description
Carousel usage described in section 4.1 and secondly it is not Carousel usage described in section 4.1 and secondly it is not
visible to non-MPEG-4 system receivers. visible to non-MPEG-4 system receivers.
The conjunction of RTP sequence number and Index can produce a
quasi-unique identifier for each SL packet so that a receiver can
unambiguously reconstruct the original order even in case of out-of-
order packets, packet loss or duplication.
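A receiver-side sketch of the reordering advice in this section follows. It is illustrative only: the names unwrap_index and deinterleave are hypothetical, SL packets are modelled as dicts with a "cts" entry (None when no composition time stamp is available) and an "index" entry, and timestamp wrap-around is not handled. The unwrap helper uses ordinary serial-number arithmetic, which is a technique chosen for this sketch and not one the draft prescribes.

```python
def unwrap_index(prev_unwrapped: int, index: int, index_length: int) -> int:
    """Extend a wrapping Index of index_length bits to a growing counter.

    Picks the value congruent to `index` that is closest to the previous one.
    """
    modulus = 1 << index_length
    delta = (index - prev_unwrapped) % modulus
    if delta >= modulus // 2:
        delta -= modulus  # treat as a step backwards (late packet)
    return prev_unwrapped + delta


def deinterleave(sl_packets):
    """Return SL packets in original order.

    Timestamps are used first; Index alone is used only when CTS are missing.
    """
    if all(p["cts"] is not None for p in sl_packets):
        return sorted(sl_packets, key=lambda p: (p["cts"], p["index"]))
    return sorted(sl_packets, key=lambda p: p["index"])
```

With IndexLength equal to 3 the Index wraps every 8 SL packets, which is why the sketch prefers timestamps whenever they are available.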
3.9 Fragmentation Rules 3.9 Fragmentation Rules
This section specifies rules for senders in order to prevent media This section specifies rules for senders in order to prevent media
decoding difficulties at the receiver end. decoding difficulties at the receiver end.
MPEG-4 Access Units are the default fragments for MPEG-4 bitstreams MPEG-4 Access Units are the default fragments for MPEG-4 bitstreams
and SHOULD be mapped directly into RTP packets of this format with and SHOULD be mapped directly into RTP packets of this format with
two exceptions: two exceptions:
- Access Units larger than the MTU - Access Units larger than the MTU
- When using interleaving for better packet loss resilience. - When using interleaving for better packet loss resilience.
In all cases Access Unit start MUST be aligned with SL packet start. In all cases Access Unit start MUST be aligned with SL packet start.
This section gives rules to apply when performing Access Unit This section gives rules to apply when performing Access Unit
fragmentation. fragmentation.
skipping to change at line 921 skipping to change at line 945
Therefore encoders and decoders are both aware whether they are Therefore encoders and decoders are both aware whether they are
operating in such a mode or not (however since this codec operating in such a mode or not (however since this codec
configuration is an opaque data block this is not explicitly configuration is an opaque data block this is not explicitly
signaled by this payload format). signaled by this payload format).
If not operating in such a mode it is obvious that the decoder has If not operating in such a mode it is obvious that the decoder has
to skip packets after a loss until an Access Unit start is received. to skip packets after a loss until an Access Unit start is received.
Similarly decoder implementations that do not implement robust Similarly decoder implementations that do not implement robust
decoding of Access Units fragments have to discard all packets after decoding of Access Units fragments have to discard all packets after
a packet loss until an Access Unit start is received. In the same a packet loss until an Access Unit start is received. In the same
way decoder implementations that do not implement re-synchronization way decoder implementations that do not implement re-synchronization
at any Access Units start have to discard all packets after a packet at any Access Units start have to discard all packets after a packet
loss until a Random Access Point Access Unit is received. These are loss until a Random Access Point Access Unit is received. These are
all obvious things that a good implementation would do. all obvious things that a good implementation would do.
However serious problems would arise for decoder implementations However serious problems would arise for decoder implementations
that try to restart decoding after a packet loss if independently that try to restart decoding after a packet loss if independently
decodable fragments are signaled (in the decoder configuration) but decodable fragments are signaled (in the decoder configuration) but
the fragments actually received are not independently decodable the fragments actually received are not independently decodable
because the RTP sender has made RTP packets on different boundaries because the RTP sender has made RTP packets on different boundaries
skipping to change at line 945 skipping to change at line 973
For this reason the following rules must apply to SL streams that For this reason the following rules must apply to SL streams that
are specifically made for transport with this payload format: are specifically made for transport with this payload format:
SL packets SHOULD be codec-semantic entities in the spirit of ALF SL packets SHOULD be codec-semantic entities in the spirit of ALF
i.e. either complete Access Units or fragments of Access Units that i.e. either complete Access Units or fragments of Access Units that
are independently decodable. Specifically when a given codec has an are independently decodable. Specifically when a given codec has an
independently decodable Access Unit fragments optional syntax this independently decodable Access Unit fragments optional syntax this
option SHOULD be used. option SHOULD be used.
Furthermore when streams are generated using independently decodable Furthermore when streams are generated using independently decodable
Access Units fragments these Access Units fragments MUST be mapped Access Units fragments these Access Units fragments MUST be mapped
one-to-one into SL packets. Consequently independently decodable one-to-one into SL packets. Consequently independently decodable
Access Units fragments MUST NOT be split across several SL packets Access Units fragments MUST NOT be split across several SL packets
and therefore MUST NOT be split across several RTP packets. and therefore MUST NOT be split across several RTP packets.
For example an MPEG-4 audio stream encoded using the ESC syntax MUST For example an MPEG-4 audio stream encoded using the ESC syntax MUST
NOT split one ESC across 2 RTP packets. NOT split one ESC across 2 RTP packets.
This rule is relaxed when using MPEG-4 Video Packets for two This rule is relaxed when using MPEG-4 Video Packets for two
skipping to change at line 977 skipping to change at line 1002
the same SL packet. the same SL packet.
4. Types and Names 4. Types and Names
This section describes the MIME types and names associated with this This section describes the MIME types and names associated with this
payload format. Section 4.1 is intended for registration with IANA payload format. Section 4.1 is intended for registration with IANA
as in RFC 2048. as in RFC 2048.
This format may require additional information about the mapping to This format may require additional information about the mapping to
be made available to the receiver. This is done using parameters be made available to the receiver. This is done using parameters
described in the next section. The absence of any of these fields is described in the next section. The absence of any of these fields is
equivalent to a field set to the default value, which is always equivalent to a field set to the default value, which is always
zero. The absence of any such parameters resolves into a default zero. The absence of any such parameters resolves into a default
"basic" configuration. "basic" configuration compatible with RFC3016 for MPEG-4 video.
In the MPEG-4 framework the SL stream configuration information is In the MPEG-4 framework the SL stream configuration information is
carried using the Object Descriptor. For compatibility with carried using the Object Descriptor. For compatibility with
receivers that do not implement the full MPEG-4 system specification receivers that do not implement the full MPEG-4 system specification
this information MAY also be signaled using parameters described this information MAY also be signaled using parameters described
here. When such information is present both in an Object Descriptor here. When such information is present both in an Object Descriptor
and as a parameter of this payload format it MUST be exactly the and as a parameter of this payload format it MUST be exactly the
same. same.
For transport of MPEG-4 audio and video without the use of MPEG-4 For transport of MPEG-4 audio and video without the use of MPEG-4
systems, as well as to support non-MPEG-4 system receivers, it is systems, as well as to support non-MPEG-4 system receivers, it is
also possible to transport information on the profile and level of also possible to transport information on the profile and level of
the stream and on the decoder configuration. This is also described the stream and on the decoder configuration. This is also described
in the next section. in the next section.
Finally this MIME type also defines a mode parameter and a profile
parameter that are intended for future derivations of this payload
format.
4.1 MIME type registration 4.1 MIME type registration
MIME media type name: "video" or "audio" or "application" MIME media type name: "video" or "audio" or "application"
"video" SHOULD be used for MPEG-4 Video streams (ISO/IEC 14496-2) or
MPEG-4 Systems streams that convey information needed for an
audio/visual presentation.
"video" SHOULD be used for MPEG-4 Visual streams (i.e. video as
defined in ISO/IEC 14496-2 [2] and/or graphics as defined in ISO/IEC
14496-1 [1]) or MPEG-4 Systems streams that convey information
needed for an audio/visual presentation.
"audio" SHOULD be used for MPEG-4 Audio streams (ISO/IEC 14496-3) or "audio" SHOULD be used for MPEG-4 Audio streams (ISO/IEC 14496-3) or
MPEG-4 Systems streams that convey information needed for an audio MPEG-4 Systems streams that convey information needed for an audio
only presentation. only presentation.
"application" SHOULD be used for MPEG-4 Systems streams "application" SHOULD be used for MPEG-4 Systems streams
(ISO/IEC14496-1) that serve other purposes than audio/visual (ISO/IEC14496-1) that serve other purposes than audio/visual
presentation, e.g. in some cases when MPEG-J streams are presentation, e.g. in some cases when MPEG-J streams are
transmitted. transmitted.
MIME subtype name: mpeg4-sl MIME subtype name: mpeg4-generic
Required parameters: none Required parameters: none
Optional parameters: Optional parameters:
Mode:
The mode in which this specification is used. This specification
itself defines only the default mode (Mode=default). When the mode
parameter is not present the default mode SHALL be assumed. In the
default mode all parameters are optional and as defined here. Other
modes may be defined as needed in other RFCs. A mode MUST be a
subset of this specification. Specifically when defining a mode care
MUST be taken that an implementation of this specification can
decode the payload format corresponding to this new mode. For this
reason a mode MUST NOT specify new default values for MIME
parameters and MIME parameters MUST be present (unless they have the
default value) even if it is redundant in case the mode assigns
fixed values. A mode may define additionally that some MIME
parameters are required instead of optional, that some MIME
parameters have fixed values (or ranges), and that there are rules
restricting the usage (for example forbidding the carriage of
multiple AU fragments in the same RTP packet).
Profile:
The meaning of this parameter may be defined by a mode. This is
meant to be used in order to define sub-configurations of a given
mode, for example the maximum delay (and therefore the size of
buffers) induced by the usage of interleaving. Implementations of
this specification can ignore this parameter.
DTSDeltaLength: DTSDeltaLength:
The number of bits on which the DTSDelta field is encoded in MSLH. The number of bits on which the DTSDelta field is encoded in MSLH.
The default value is zero and indicates the absence of DTSFlag and The default value is zero and indicates the absence of DTSFlag and
DTSDelta in MSLH (the stream does not transport decodingTimeStamps). DTSDelta in MSLH (the stream does not transport decodingTimeStamps).
A value larger than zero indicates that there is a DTSFlag in each A value larger than zero indicates that there is a DTSFlag in each
MSLH. Since decodingTimeStamp -if present- must be encoded as a MSLH. Since decodingTimeStamp -if present- must be encoded as a
difference to the RTP time stamp, the DTSDeltaLength parameter MUST difference to the RTP time stamp, the DTSDeltaLength parameter MUST
be present in order to transport decodingTimeStamps with this be present in order to transport decodingTimeStamps with this
payload format. payload format.
skipping to change at line 1057 skipping to change at line 1117
The default value is zero and indicates the absence of OCR for this The default value is zero and indicates the absence of OCR for this
stream. Since objectClockReference -if present- must be encoded as a stream. Since objectClockReference -if present- must be encoded as a
difference to the RTP time stamp, the OCRDeltaLength parameter MUST difference to the RTP time stamp, the OCRDeltaLength parameter MUST
be present in order to transport objectClockReferences with this be present in order to transport objectClockReferences with this
payload format. payload format.
SizeLength: SizeLength:
The number of bits on which the PayloadSize field of MSLH is The number of bits on which the PayloadSize field of MSLH is
encoded. The default value is zero and indicates the Single-SL mode encoded. The default value is zero and indicates the Single-SL mode
(unless ConstantSize is present). Simultaneous presence of this (unless ConstantSize is present). Simultaneous presence of this
parameter and ConstantSize is illegal. Either the SizeLength or parameter and ConstantSize is illegal. Either the SizeLength or
ConstantSize parameter MUST be present in order to signal the ConstantSize parameter MUST be present in order to signal the
Multiple-SL mode of this payload format. Multiple-SL mode of this payload format.
ConstantSize: ConstantSize:
The constant size in bytes of each SL Packet Payload for this The constant size in bytes of each SL Packet Payload for this
stream. The default value is zero and indicates variable SL Packet stream. The default value is zero and indicates variable SL Packet
Payload size (or the Single-SL mode if SizeLength is absent). Payload size (or the Single-SL mode if SizeLength is absent).
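How SizeLength and ConstantSize together select the mode can be sketched as a small check. The function name sl_mode is hypothetical, and the sketch assumes the MIME parameters have already been parsed into a dict of strings. An absent parameter is treated like its default value of zero, as section 4 describes.

```python
def sl_mode(params: dict) -> str:
    """Return "Single-SL" or "Multiple-SL" from the MIME parameters."""
    size_length = int(params.get("SizeLength", 0))
    constant_size = int(params.get("ConstantSize", 0))
    if size_length and constant_size:
        # Simultaneous presence of both parameters is illegal.
        raise ValueError("SizeLength and ConstantSize are mutually exclusive")
    if size_length or constant_size:
        return "Multiple-SL"
    return "Single-SL"
```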
skipping to change at line 1086 skipping to change at line 1146
The number of bits on which the Index is encoded in the first MSLH. The number of bits on which the Index is encoded in the first MSLH.
The default value is zero and indicates the absence of Index and The default value is zero and indicates the absence of Index and
IndexDelta for all MSLHs. Since packetSequenceNumber -if present- IndexDelta for all MSLHs. Since packetSequenceNumber -if present-
must be mapped in MSLH, the IndexLength parameter MUST be present in must be mapped in MSLH, the IndexLength parameter MUST be present in
order to transport packetSequenceNumber with this payload format. order to transport packetSequenceNumber with this payload format.
IndexDeltaLength: IndexDeltaLength:
The number of bits on which the IndexDelta are encoded in any non- The number of bits on which the IndexDelta are encoded in any non-
first MSLH. The default value is zero and indicates that first MSLH. The default value is zero and indicates that
packetSequenceNumber MUST be incremented by one for each SL packet packetSequenceNumber MUST be incremented by one for each SL packet
in the RTP packet (see section 3.5). Since when interleaving in the RTP packet (see section 3.5). IndexDeltaLength parameter MUST
packetSequenceNumber does not increment by 1 inside a RTP packet, be present when using interleaving with this payload format.
the IndexDeltaLength parameter MUST be present when using
interleaving with this payload format.
RSLHSectionSizeLength: RSLHSectionSizeLength:
The number of bits that is used to encode the RSLHSectionSize field. The number of bits that is used to encode the RSLHSectionSize field.
The default value is zero and indicates the absence of the whole The default value is zero and indicates the absence of the whole
RSLHSection for all RTP packets of this stream. Compatibility with RSLHSection for all RTP packets of this stream.
RFC 3016 requires that the RSLHSection must be empty, including the
RSLHSectionSize field. This is the reason why there is such a
variable length with a default value indicating absence of the
RSLHSectionSize field.
SLConfigDescriptor: SLConfigDescriptor:
A base-64 encoding of the SLConfigDescriptor. This SHALL be the A base-64 encoding of the SLConfigDescriptor. This SHALL be the
original SLConfigDescriptor and it SHALL be the same as the one original SLConfigDescriptor and it SHALL be the same as the one
transported by the OD framework, if any. transported by the OD framework, if any.
profile-level-id: profile-level-id:
A decimal representation of the MPEG-4 Profile Level indication A decimal representation of the MPEG-4 Profile Level indication
value. For audio this parameter indicates which MPEG-4 Audio tool value. For audio this parameter indicates which MPEG-4 Audio tool
subsets are applied to encode the audio stream and is defined in subsets are applied to encode the audio stream and is defined in
defined in ISO/IEC 14496-1. For video this parameter indicates which ISO/IEC 14496-1 [1]. For video this parameter indicates which MPEG-4
MPEG-4 Visual tool subsets are applied to encode the video stream Visual tool subsets are applied to encode the video stream and is
and is defined in Table G-1 of ISO/IEC 14496-2. This parameter MAY defined in Table G-1 of ISO/IEC 14496-2 [2]. This parameter MAY be
be used in the capability exchange or session setup procedure to used in the capability exchange or session setup procedure to
indicate MPEG-4 Profile and Level combination of which the relevant indicate MPEG-4 Profile and Level combination of which the relevant
MPEG-4 media codec is capable. If this parameter is not specified by MPEG-4 media codec is capable. If this parameter is not specified
the procedure, its default value of 1 (Simple Profile/Level 1) is its default value is 1 (Simple Profile/Level 1) for video (for
used. compatibility with RFC 3016) and otherwise 0xFE (defined in ISO/IEC
14496-1 [1] as being the generic default value).
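The default rule of the -01 revision for profile-level-id can be sketched as follows. The function name effective_profile_level_id is hypothetical, and the sketch assumes the caller knows the MIME media type name ("video", "audio" or "application") and has the parameters as a dict of strings.

```python
def effective_profile_level_id(params: dict, media_type: str) -> int:
    """Value of profile-level-id after applying the -01 default rule."""
    if "profile-level-id" in params:
        # The parameter is a decimal representation.
        return int(params["profile-level-id"])
    # Video defaults to 1 (Simple Profile/Level 1); anything else to 0xFE.
    return 1 if media_type == "video" else 0xFE
```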
Config: Config:
A hexadecimal representation of an octet string that expresses the A hexadecimal representation of an octet string that expresses the
media payload configuration. Configuration data is mapped onto the media payload configuration. Configuration data is mapped onto the
octet string in an MSB-first basis. The first bit of the octet string in an MSB-first basis. The first bit of the
configuration data SHALL be located at the MSB of the first octet. configuration data SHALL be located at the MSB of the first octet.
In the last octet, zero-valued padding bits, if necessary, shall In the last octet, zero-valued padding bits, if necessary, shall
follow the configuration data. For audio this is a follow the configuration data. For audio streams, config is the
"StreamMuxConfig", as defined in ISO/IEC 14496-3. For video this audio object type specific decoder configuration data
expresses the MPEG-4 Visual configuration information, as defined in AudioSpecificConfig() as defined in ISO/IEC 14496-3 [3]. For video
subclause 6.2.1 Start codes of ISO/IEC14496-2[2][4][9] and the this expresses the MPEG-4 Visual configuration information, as
defined in subclause 6.2.1 Start codes of ISO/IEC14496-2 [2] and the
configuration information indicated by this parameter SHALL be the configuration information indicated by this parameter SHALL be the
same as the configuration information in the corresponding MPEG-4 same as the configuration information in the corresponding MPEG-4
Visual stream, except for first-half-vbv-occupancy and latter-half- Visual stream, except for first-half-vbv-occupancy and latter-half-
vbv-occupancy, if it exists, which may vary in the repeated vbv-occupancy, if it exists, which may vary in the repeated
configuration information inside an MPEG-4 Visual stream (See 6.2.1 configuration information inside an MPEG-4 Visual stream (See 6.2.1
Start codes of ISO/IEC14496-2). Start codes of ISO/IEC14496-2).
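The bit-to-octet mapping described for the Config parameter can be sketched as follows. The function name config_to_hex is hypothetical, the configuration data is given as a string of '0' and '1' characters purely for readability, and the lower-case hex output is a choice made here; the text does not specify the case.

```python
def config_to_hex(bits: str) -> str:
    """Map configuration bits, MSB first, to a hexadecimal octet string."""
    # Zero-valued padding bits follow the data in the last octet.
    padded = bits + "0" * (-len(bits) % 8)
    octets = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return octets.hex()
```

For example, the 12 bits 0001 0010 0001 are padded with four zero bits and yield the two octets 0x12 and 0x10.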
StreamType: StreamType:
The integer value that indicates the type of MPEG-4 stream that is The integer value that indicates the type of MPEG-4 stream that is
carried; its coding corresponds to the values of the streamType as carried; its coding corresponds to the values of the streamType as
skipping to change at line 1169 skipping to change at line 1224
As in RFC <self-reference-to-this>. As in RFC <self-reference-to-this>.
Interoperability considerations: Interoperability considerations:
MPEG-4 provides a large and rich set of tools for the coding of MPEG-4 provides a large and rich set of tools for the coding of
visual objects. For effective implementation of the standard, visual objects. For effective implementation of the standard,
subsets of the MPEG-4 tool sets have been provided for use in subsets of the MPEG-4 tool sets have been provided for use in
specific applications. These subsets, called 'Profiles', limit the specific applications. These subsets, called 'Profiles', limit the
size of the tool set a decoder is required to implement. In order to size of the tool set a decoder is required to implement. In order to
restrict computational complexity, one or more 'Levels' are set for restrict computational complexity, one or more 'Levels' are set for
each Profile. A Profile@Level combination allows: each Profile. A Profile@Level combination allows:
. a codec builder to implement only the subset of the standard he . a codec builder to implement only the subset of the standard he
needs, while maintaining interworking with other MPEG-4 devices needs, while maintaining interoperability with other MPEG-4 devices
included in the same combination, and included in the same combination, and
. checking whether MPEG-4 devices comply with the standard . checking whether MPEG-4 devices comply with the standard
('conformance testing'). ('conformance testing').
A stream SHALL be compliant with the MPEG-4 Profile@Level specified A stream SHALL be compliant with the MPEG-4 Profile@Level specified
by the parameter "profile-level-id". Interoperability between a by the parameter "profile-level-id". Interoperability between a
sender and a receiver may be achieved by specifying the parameter sender and a receiver may be achieved by specifying the parameter
"profile-level-id" in MIME content, or by arranging in the "profile-level-id" in MIME content, or by arranging in the
capability exchange/announcement procedure to set this parameter capability exchange/announcement procedure to set this parameter
mutually to the same value. mutually to the same value.
Published specification: Published specification:
The specifications for MPEG-4 streams are presented in ISO/IEC The specifications for MPEG-4 streams are presented in ISO/IEC
14496-1, 14496-2, and 14496-3. The RTP payload format is described 14496-1, 14496-2, and 14496-3. The RTP payload format is described
in RFC <self-reference-to-this>. in RFC <self-reference-to-this>.
Applications which use this media type: Applications that use this media type:
Multimedia streaming and conferencing tools, Internet messaging and Multimedia streaming and conferencing tools, Internet messaging and
Email applications. Also supra-relativistic elementary particle Email applications. Also trans-galactic supra-relativistic
hyperspace tunneling trans-galactic communication devices :-) elementary particle hyperspace tunneling communication devices :-)
Additional information: none Additional information: none
Magic number(s): none Magic number(s): none
File extension(s): File extension(s):
None. A file format with the extension .mp4 has been defined for None. A file format with the extension .mp4 has been defined for
MPEG-4 content but is not directly correlated with this MIME type MPEG-4 content but is not directly correlated with this MIME type
whose sole purpose is RTP transport. whose sole purpose is RTP transport.
skipping to change at line 1225 skipping to change at line 1280
Multiple parameters SHOULD be expressed as a MIME media type string, Multiple parameters SHOULD be expressed as a MIME media type string,
in the form of a semicolon-separated list of parameter=value pairs in the form of a semicolon-separated list of parameter=value pairs
(see examples in Appendix). (see examples in Appendix).
4.3 Usage of SDP 4.3 Usage of SDP
4.3.1 The a=fmtp keyword 4.3.1 The a=fmtp keyword
It is assumed that one typical way to transport the above-described It is assumed that one typical way to transport the above-described
parameters associated with this payload format is via a SDP message parameters associated with this payload format is via an SDP [10]
for example transported to the client in reply to a RTSP DESCRIBE of message for example transported to the client in reply to a RTSP
via SAP. In that case the (a=fmtp) keyword MUST be used as described [13] DESCRIBE message or via SAP [14]. In that case the (a=fmtp)
in RFC 2327 [10, section 6]. The syntax being then: keyword MUST be used as described in RFC 2327 [10, section 6]. The
syntax being then:
a=fmtp:<format> <parameter name>=<value> a=fmtp:<format> <parameter name>=<value>
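As an illustration of the syntax above, the following minimal sketch builds and parses one a=fmtp line carrying a semicolon-separated list of parameter=value pairs. The helper names build_fmtp and parse_fmtp, the payload type 96, and the parameter values are hypothetical and chosen only for the example; only the parameter names "profile-level-id" and "StreamType" come from this document.

```python
def build_fmtp(payload_type, params):
    """Format one a=fmtp line from a mapping of parameter names to values."""
    pairs = "; ".join(f"{name}={value}" for name, value in params.items())
    return f"a=fmtp:{payload_type} {pairs}"


def parse_fmtp(line):
    """Split an a=fmtp line into (payload type, {parameter name: value string})."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    fmt, _, rest = line[len(prefix):].partition(" ")
    params = {}
    for pair in rest.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"parameter without a value: {pair!r}")
        params[name.strip()] = value.strip()
    return int(fmt), params


# Illustrative values only (assumed, not taken from the draft).
example_line = build_fmtp(96, {"profile-level-id": 1, "StreamType": 4})
```

Values are kept as strings on the parsing side so that a receiver can apply its own per-parameter interpretation.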
4.3.2 SDP example 4.3.2 SDP example
The following is an example of SDP syntax for the description of a The following is an example of SDP syntax for the description of a
session containing one MPEG-4 audio stream, one MPEG-4 video and session containing one MPEG-4 audio stream, one MPEG-4 video and
three MPEG-4 system streams, the first one being BIFS, the second three MPEG-4 system streams, the first one being BIFS, the second
one OD and the third one IPMP. All are transported using this format one OD and the third one IPMP. All are transported using this format
and the AVP profile [12]. Note that the video stream DTSDelta are and the AVP profile [12]. Note that the video stream DTSDelta are
skipping to change at line 1282 skipping to change at line 1337
5.1 SL packetized stream reconstruction 5.1 SL packetized stream reconstruction
The purpose of this section is to document how a receiver can The purpose of this section is to document how a receiver can
reconstruct a valid SL packetized stream. Since this format directly reconstruct a valid SL packetized stream. Since this format directly
transports SL packets this reconstruction is performed by reversing transports SL packets this reconstruction is performed by reversing
the payload structure rules (section 3). We explicitly describe here the payload structure rules (section 3). We explicitly describe here
the most complex transformations. the most complex transformations.
In the following let (i) be the index of SL packets inside one RTP In the following let (i) be the index of SL packets inside one RTP
packet (starting at zero for each RTP packet), let SLPacketHeader.x packet (starting at zero for each RTP packet), let SLPacketHeader.x
denote field x of the reconstructed SL packet header, let MSLH.x denote field x of the reconstructed SL packet header, let MSLH.x
denote field x of the received MSLH, etc. denote field x of the received MSLH, etc.
SLPacketHeader.packetSequenceNumber is restored from MSLH.Index and SLPacketHeader.packetSequenceNumber is restored from MSLH.Index and
MSLH.IndexDelta using: MSLH.IndexDelta using:
if ( IndexLength == 0) { // or is absent if ( IndexLength == 0) { // or is absent
if ( SLConfig.packetSeqNumLength == 0 ) { if ( SLConfig.packetSeqNumLength == 0 ) {
// this stream does not have SL packet sequence number // this stream does not have SL packet sequence number
} }
else { else {
// illegal, normally the sender MUST map // illegal, normally the sender MUST map
// SLPacketHeader.packetSequenceNumber in MSLH // SLPacketHeader.packetSequenceNumber in MSLH
// and set a relevant IndexLength value; // and set a relevant IndexLength value;
// otherwise it is unfortunately impossible for the receiver // otherwise it is unfortunately impossible for the receiver
// to reconstruct the correct sequence // to reconstruct the correct sequence
skipping to change at line 1339 skipping to change at line 1393
// CTS is not transported for this RTP stream // CTS is not transported for this RTP stream
if (i == 0){ // first SL packet in RTP packet if (i == 0){ // first SL packet in RTP packet
if ( SLConfig.useTimeStamps == 1 ) { if ( SLConfig.useTimeStamps == 1 ) {
if ( SLPacketHeader.accessUnitStartFlag == 1 ) { if ( SLPacketHeader.accessUnitStartFlag == 1 ) {
SLPacketHeader.compositionTimeStampFlag(0) = 1; SLPacketHeader.compositionTimeStampFlag(0) = 1;
SLPacketHeader.compositionTimeStamp(0) = RTP TimeStamp; SLPacketHeader.compositionTimeStamp(0) = RTP TimeStamp;
} }
else { else {
// ignore // ignore
} }
} }
else { else {
// empty // empty
} }
} }
else { // non-first SL packets in RTP packet else { // non-first SL packets in RTP packet
if ( SLConfig.useTimeStamps == 1 ) { if ( SLConfig.useTimeStamps == 1 ) {
if ( SLPacketHeader.accessUnitStartFlag == 1 ) { if ( SLPacketHeader.accessUnitStartFlag == 1 ) {
SLPacketHeader.compositionTimeStampFlag(i) = 0; SLPacketHeader.compositionTimeStampFlag(i) = 0;
} }
else { else {
// ignore // ignore
} }
} }
else { else {
skipping to change at line 1396 skipping to change at line 1450
else { else {
// ignore // ignore
} }
} }
else { else {
// empty // empty
} }
} }
else { else {
// DTS is transported for this stream // DTS is transported for this stream
if ( SLConfig.useTimeStamps == 1 ) { if ( SLConfig.useTimeStamps == 1 ) {
if ( SLPacketHeader.accessUnitStartFlag == 1 ) { if ( SLPacketHeader.accessUnitStartFlag == 1 ) {
SLPacketHeader.decodingTimeStampFlag(i) = SLPacketHeader.decodingTimeStampFlag(i) =
MSLH.DTSFlag(i); MSLH.DTSFlag(i);
SLPacketHeader.decodingTimeStamp(i) = SLPacketHeader.decodingTimeStamp(i) =
RTP TimeStamp + MSLH.DTSDelta(i); RTP TimeStamp + MSLH.DTSDelta(i);
} }
else { else {
// ignore DTSFlag (which must be zero) // ignore DTSFlag (which must be zero)
} }
} }
else { else {
// this is strange and sub-optimal at best // this is strange and sub-optimal at best
// a receiver should ignore this // a receiver should ignore this
} }
skipping to change at line 1453 skipping to change at line 1507
from the M bit, as follows: from the M bit, as follows:
if ( SLConfig.useAccessUnitEndFlag == 0 ) { if ( SLConfig.useAccessUnitEndFlag == 0 ) {
// this SL stream does not signal access unit ends // this SL stream does not signal access unit ends
} }
else { else {
SLPacketHeader.AccessUnitEndFlag = M bit; SLPacketHeader.AccessUnitEndFlag = M bit;
} }
In the multipleSL mode the AccessUnitEndFlag is untouched in RSLH. In the multipleSL mode the AccessUnitEndFlag is untouched in RSLH.
The other SL packet header fields SHALL remain as found in RSLH. The other SL packet header fields SHALL remain as found in RSLH.
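The decoding time stamp and access unit end rules given above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function names are hypothetical, the modulo 2^32 wrap-around (RTP time stamps are 32-bit values) is an assumption, and any conversion between the RTP clock and the SL timeStampResolution is left out.

```python
RTP_TS_MODULUS = 1 << 32  # RTP time stamps are 32-bit unsigned values


def reconstruct_dts(rtp_timestamp, dts_flag, dts_delta,
                    use_time_stamps, access_unit_start_flag):
    """Return (decodingTimeStampFlag, decodingTimeStamp or None).

    Mirrors the rule decodingTimeStamp = RTP TimeStamp + MSLH.DTSDelta,
    applied only when the stream uses time stamps and the SL packet
    starts an access unit.
    """
    if not use_time_stamps or not access_unit_start_flag:
        return 0, None  # nothing to restore for this SL packet
    if not dts_flag:
        return 0, None  # sender signalled that no DTS is present
    # Assumption: wrap around like the 32-bit RTP time stamp does.
    return 1, (rtp_timestamp + dts_delta) % RTP_TS_MODULUS


def reconstruct_au_end_flag(m_bit, use_access_unit_end_flag):
    """Return accessUnitEndFlag taken from the RTP M bit, or None if unused."""
    if not use_access_unit_end_flag:
        return None  # this SL stream does not signal access unit ends
    return 1 if m_bit else 0
```

Note that the M bit mapping is only relevant outside the multipleSL mode, since in that mode the AccessUnitEndFlag is carried untouched in RSLH.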
It is obvious that in the general case the reconstruction of the It is obvious that in the general case the reconstruction of the
original SL packetized stream requires SL-awareness. However this original SL packetized stream requires SL-awareness. However this
payload format allows in all cases a receiver that does not know payload format allows in all cases a receiver that does not know
about the SL syntax to reconstruct the semantics of SL for the about the SL syntax to reconstruct the semantics of SL for the
following very useful features: following very useful features:
- Packet order (decoding order) - Packet order (decoding order)
- Access Unit boundaries (using the M bit) - Access Unit boundaries (using the M bit)
- Access Unit fragments (i.e. SL packet boundaries using - Access Unit fragments (i.e. SL packet boundaries using
MSLH.PayloadSize) MSLH.PayloadSize)
- Composition Time Stamps (using the RTP Time Stamp and - Composition Time Stamps (using the RTP Time Stamp and
MSLH.CTSDelta) MSLH.CTSDelta)
- Decoding Time Stamps (using the RTP Time Stamp and MSLH.DTSDelta) - Decoding Time Stamps (using the RTP Time Stamp and MSLH.DTSDelta)
skipping to change at line 1507 skipping to change at line 1562
timely fashion MPEG-4 has defined a number of techniques in order to timely fashion MPEG-4 has defined a number of techniques in order to
encode the scene description in a manner that makes it behave encode the scene description in a manner that makes it behave
similarly to other temporal encoding schemes such as audio and similarly to other temporal encoding schemes such as audio and
video. This payload format is intended for this usage. video. This payload format is intended for this usage.
Note that in many cases the application will consist of first the Note that in many cases the application will consist of first the
reliable transmission of a static initial scene followed by the reliable transmission of a static initial scene followed by the
streaming of animations and updates. For this reason the usage of streaming of animations and updates. For this reason the usage of
this payload format is attractive since it offers a unique solution. this payload format is attractive since it offers a unique solution.
Senders must be aware that suitable schemes should be used when Senders must be aware that suitable schemes should be used when
scene description streams transport sensitive configuration scene description streams transport sensitive configuration
information. For example, if the RTP packet transporting an information. For example, if the RTP packet transporting an
OD-update command is lost, the corresponding media stream is OD-update command is lost, the corresponding media stream is
not accessible to the receiver. not accessible to the receiver.
Redundancy is a possibility and may be added by tools Redundancy is a possibility and may be added by tools
hierarchically higher than this payload format, e.g. by packet-based hierarchically higher than this payload format, e.g. by packet-based
FEC, re-transmission, or similar tools. In such a case, the general FEC, re-transmission, or similar tools. In such a case, the general
congestion control principles have to be observed. congestion control principles have to be observed.
Since BIFS and OD streams may be modified during the session with Since BIFS and OD streams may be modified during the session with
update commands, there is a need to send both update commands and update commands, there is a need to send both update commands and
full BIFS/OD refresh. For that reason MPEG-4 defines Random Access full BIFS/OD refresh. For that reason MPEG-4 defines Random Access
Points (RAP) for scene description streams (OD and BIFS) where by Points (RAP) for scene description streams (OD and BIFS) where by
definition a decoder can restart decoding i.e. receives a "full definition a decoder can restart decoding i.e. receives a "full
skipping to change at line 1562 skipping to change at line 1617
multiplexing scheme that allows selective bundling of several ESs. multiplexing scheme that allows selective bundling of several ESs.
This is beyond the scope of the payload format defined here. This is beyond the scope of the payload format defined here.
MPEG-4's Flexmux multiplexing scheme may be used for this MPEG-4's Flexmux multiplexing scheme may be used for this
purpose and a specific RTP payload format is being developed [11]. purpose and a specific RTP payload format is being developed [11].
Another approach may be to develop a generic RTP multiplexing scheme Another approach may be to develop a generic RTP multiplexing scheme
usable for MPEG-4 data. The multiplexing scheme reported in [8] may usable for MPEG-4 data. The multiplexing scheme reported in [8] may
be a candidate for this approach. be a candidate for this approach.
For MPEG-4 applications, the multiplexing technique needs to address For MPEG-4 applications, the multiplexing technique needs to address
the following requirements: the following requirements:
i. The ESs multiplexed in one stream can change frequently during a i. The ESs multiplexed in one stream can change frequently during a
session. Consequently, the coding type, individual packet size and session. Consequently, the coding type, individual packet size and
temporal relationships between the multiplexed data units must be temporal relationships between the multiplexed data units must be
handled dynamically. handled dynamically.
ii. The multiplexing scheme should have a mechanism to determine the ii. The multiplexing scheme should have a mechanism to determine the
ES identifier (ES_ID) for each of the multiplexed packets. ES_ID is ES identifier (ES_ID) for each of the multiplexed packets. ES_ID is
not a part of the SL header. not a part of the SL header.
iii. In general, an SL packet does not contain information about its iii. In general, an SL packet does not contain information about its
size. The multiplexing scheme should be able to delineate the size. The multiplexing scheme should be able to delineate the
multiplexed packets whose lengths may vary from a few bytes to close multiplexed packets whose lengths may vary from a few bytes to close
to the path-MTU. to the path-MTU.
5.5 Overlap with RFC 3016 5.5 Overlap with RFC 3016
skipping to change at line 1618 skipping to change at line 1673
type (i.e. excluding media data processing) does not exhibit any type (i.e. excluding media data processing) does not exhibit any
significant non-uniformity in the receiver side to cause a denial- significant non-uniformity in the receiver side to cause a denial-
of-service threat. of-service threat.
However, it is possible to inject non-compliant MPEG streams (Audio, However, it is possible to inject non-compliant MPEG streams (Audio,
Video, and Systems) to overload the receiver/decoder's buffers which Video, and Systems) to overload the receiver/decoder's buffers which
might compromise the functionality of the receiver or even crash it. might compromise the functionality of the receiver or even crash it.
This is especially true for end-to-end systems like MPEG where the This is especially true for end-to-end systems like MPEG where the
buffer models are precisely defined. buffer models are precisely defined.
MPEG-4 Systems supports stream types including commands that are MPEG-4 Systems supports stream types including commands that are
executed on the terminal like OD commands, BIFS commands, etc. and executed on the terminal like OD commands, BIFS commands, etc. and
programmatic content like MPEG-J (Java(TM) Byte Code) and programmatic content like MPEG-J (Java(TM) Byte Code) and
ECMAScript. It is possible to use one or more of the above in a ECMAScript. It is possible to use one or more of the above in a
manner non-compliant to MPEG to crash or temporarily make the manner non-compliant to MPEG to crash or temporarily make the
receiver unavailable. receiver unavailable.
Authentication mechanisms can be used to validate the sender and Authentication mechanisms can be used to validate the sender and
the data to prevent security problems due to non-compliant malignant the data to prevent security problems due to non-compliant malignant
MPEG-4 streams. MPEG-4 streams.
A security model is defined in MPEG-4 Systems for streams carrying A security model is defined in MPEG-4 Systems for streams carrying
MPEG-J access units, which comprise Java(TM) classes and objects. MPEG-J MPEG-J access units, which comprise Java(TM) classes and objects. MPEG-J
defines a set of Java APIs and a secure execution model. MPEG-J defines a set of Java APIs and a secure execution model. MPEG-J
content can call this set of APIs and Java(TM) methods from a set of content can call this set of APIs and Java(TM) methods from a set of
Java packages supported in the receiver within the defined security Java packages supported in the receiver within the defined security
model. According to this security model, downloaded byte code is model. According to this security model, downloaded byte code is
skipping to change at line 1664 skipping to change at line 1719
8. References 8. References
[1] ISO/IEC 14496-1:2001 MPEG-4 Systems [1] ISO/IEC 14496-1:2001 MPEG-4 Systems
[2] ISO/IEC 14496-2:2001 MPEG-4 Visual [2] ISO/IEC 14496-2:2001 MPEG-4 Visual
[3] ISO/IEC 14496-3:2001 MPEG-4 Audio [3] ISO/IEC 14496-3:2001 MPEG-4 Audio
[4] ISO/IEC 14496-6:2001 Delivery Multimedia Integration Framework. [4] ISO/IEC 14496-6:2001 Delivery Multimedia Integration Framework.
[5] Schulzrinne, Casner, Frederick, Jacobson RTP: A Transport [5] H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, RTP: A
Protocol for Real Time Applications RFC 1889, Internet Engineering Transport Protocol for Real Time Applications, RFC 1889, Internet
Task Force, January 1996. Engineering Task Force, January 1996.
[6] S. Bradner, Key words for use in RFCs to Indicate Requirement [6] S. Bradner, Key words for use in RFCs to Indicate Requirement
Levels, RFC 2119, Internet Engineering Task Force, March 1997. Levels, RFC 2119, Internet Engineering Task Force, March 1997.
[7] Y. Kikuchi, T. Nomura, S. Fukunaga, Y. Matsui, H. Kimata, RTP [7] Y. Kikuchi, T. Nomura, S. Fukunaga, Y. Matsui, H. Kimata, RTP
payload format for MPEG-4 Audio/Visual streams, Internet Engineering payload format for MPEG-4 Audio/Visual streams, Internet Engineering
Task Force, RFC 3016. Task Force, RFC 3016.
[8] B. Thompson, T. Koren, D. Wing, Tunneling multiplexed Compressed [8] B. Thompson, T. Koren, D. Wing, Tunneling multiplexed Compressed
RTP ("TCRTP"), work in progress, draft-ietf-avt-tcrtp-02.txt, RTP ("TCRTP"), work in progress, draft-ietf-avt-tcrtp-02.txt,
November 2000. November 2000.
[9] D. Singer, Y. Lim, A Framework for the delivery of MPEG-4 over [9] D. Singer, Y. Lim, A Framework for the delivery of MPEG-4 over
IP-based Protocols, work in progress, draft-singer-mpeg4-ip-02.txt, IP-based Protocols, work in progress, draft-singer-mpeg4-ip-02.txt,
May 2001. May 2001.
[10] Handley, Jacobson, SDP: Session Description Protocol, RFC 2327, [10] M. Handley, V. Jacobson, SDP: Session Description Protocol, RFC
Internet Engineering Task Force, April 1998. 2327, Internet Engineering Task Force, April 1998.
[11] C. Roux et al., RTP Payload Format for MPEG-4 FlexMultiplexed [11] C. Roux et al., RTP Payload Format for MPEG-4 FlexMultiplexed
Streams, work in progress, draft-curet-avt-rtp-mpeg4-flexmux-00.txt, Streams, work in progress, draft-curet-avt-rtp-mpeg4-flexmux-00.txt,
February 2001. February 2001.
[12] H. Schulzrinne, RTP Profile for Audio and Video Conferences [12] H. Schulzrinne, RTP Profile for Audio and Video Conferences
with Minimal Control, RFC1890, Internet Engineering Task Force, with Minimal Control, RFC1890, Internet Engineering Task Force,
January 1996. January 1996.
[13] H. Schulzrinne, A. Rao, R. Lanphier, Real Time Streaming
Protocol, RFC 2326, Internet Engineering Task Force, April 1998.
[14] M. Handley, C. Perkins, E. Whelan, Session Announcement
Protocol, RFC 2974, Internet Engineering Task Force, October 2000.
9. Authors' Addresses 9. Authors' Addresses
Olivier Avaro Olivier Avaro
France Telecom France Telecom
35 A Schutzenhuttenweg 35 A Schutzenhuttenweg
60598 Frankfurt am Main 60598 Frankfurt am Main
Deutschland Deutschland
e-mail: olivier.avaro@francetelecom.fr e-mail: olivier.avaro@francetelecom.fr
Andrea Basso Andrea Basso
skipping to change at line 1728 skipping to change at line 1789
e-mail: casner@acm.org e-mail: casner@acm.org
M. Reha Civanlar M. Reha Civanlar
AT&T Labs - Research AT&T Labs - Research
100 Schultz Drive 100 Schultz Drive
Red Bank, NJ 07701 Red Bank, NJ 07701
USA USA
e-mail: civanlar@research.att.com e-mail: civanlar@research.att.com
Philippe Gentric Philippe Gentric
Philips Digital Networks MP4Net Philips Digital Networks MP4Net
51 rue Carnot 51 rue Carnot
92156 Suresnes 92156 Suresnes
France France
e-mail: philippe.gentric@philips.com e-mail: philippe.gentric@philips.com
Carsten Herpel Carsten Herpel
THOMSON multimedia THOMSON multimedia
Karl-Wiechert-Allee 74 Karl-Wiechert-Allee 74
30625 Hannover 30625 Hannover
Germany Germany
skipping to change at line 1778 skipping to change at line 1839
Cederlaan 4 Cederlaan 4
5600 JB Eindhoven 5600 JB Eindhoven
Netherlands Netherlands
e-mail : jan.vandermeer@philips.com e-mail : jan.vandermeer@philips.com
APPENDIX: Examples of usage APPENDIX: Examples of usage
This payload format has been designed to transport efficiently a This payload format has been designed to transport efficiently a
very versatile packetization scheme: the MPEG-4 Synch Layer; as a very versatile packetization scheme: the MPEG-4 Synch Layer; as a
result its complexity is larger than the average RTP payload format. result its complexity is larger than the average RTP payload format.
For this reason this section describes a number of key examples of For this reason this section describes a number of key examples of
how this payload format can be used. how this payload format can be used.
A C++-like syntax called SDL (Syntactic Description Language) A C++-like syntax called SDL (Syntactic Description Language)
defined in [1, section 14] is used to economically describe MPEG-4 defined in [1, section 14] is used to economically describe MPEG-4
system data structures. system data structures.
However, as discussed in section 2, this payload format can also be
used without explicit knowledge of SL (logically equivalent to
configuring the SL headers as being empty); several examples
(Appendix 1,3,4,5) cover this case.
Furthermore these examples assume that the (a=fmtp) SDP syntax is Furthermore these examples assume that the (a=fmtp) SDP syntax is
used to convey the MIME parameters of the payload format. used to convey the MIME parameters of the payload format.
Appendix.1 MPEG-4 Video Appendix.1 RFC 3016 compatible MPEG-4 Video (no SL)
This is an example of a video stream where the SL is configured to
produce RTP packets compatible with RFC 3016.
SLConfigDescriptor
In this example the SLConfigDescriptor is:
class SLConfigDescriptor extends BaseDescriptor : bit(8)
tag=SLConfigDescrTag {
bit(8) predefined;
if (predefined==0) {
bit(1) useAccessUnitStartFlag; = 0
bit(1) useAccessUnitEndFlag; = 1
bit(1) useRandomAccessPointFlag; = 0
bit(1) hasRandomAccessUnitsOnlyFlag; = 0
bit(1) usePaddingFlag; = 0
bit(1) useTimeStampsFlag; = 0
bit(1) useIdleFlag; = 0
bit(1) durationFlag; = 0
bit(32) timeStampResolution; = 0
bit(32) OCRResolution; = 0
bit(8) timeStampLength; = 0
bit(8) OCRLength; = 0
bit(8) AU_Length; = 0
bit(8) instantBitrateLength; = 0
bit(4) degradationPriorityLength; = 0
bit(5) AU_seqNumLength; = 0
bit(5) packetSeqNumLength; = 0
bit(2) reserved=0b11;
}
if (durationFlag) {
bit(32) timeScale; // NOT USED
bit(16) accessUnitDuration; // NOT USED
bit(16) compositionUnitDuration; // NOT USED
}
if (!useTimeStampsFlag) {
bit(timeStampLength) startDecodingTimeStamp; = 0
bit(timeStampLength) startCompositionTimeStamp; = 0
}
}
SL Packet Header structure
With this configuration we have the following SL packet header
structure:
aligned(8) class SL_PacketHeader (SLConfigDescriptor SL) {
if (SL.useAccessUnitEndFlag) {
bit(1) accessUnitEndFlag; // 1 bit
}
}
In this case this payload produces RTP packets that are exactly
conformant to RFC 3016 and the Synch Layer is reduced to a purely
logical construction that neither sender nor receiver need to
implement.
Parameters
This configuration is the default one; no parameters are required.
RTP packet structure
Note that accessUnitEndFlag is mapped to the RTP header M bit.
+=========================================+=============+
| Field | size |
+=========================================+=============+
| RTP header | - |
+-----------------------------------------+-------------+
| SL packet payload | 1400 bytes |
+-----------------------------------------+-------------+
Overhead
In this example we have an RTP overhead of 40 bytes for 1400 bytes
of payload i.e. 3 % overhead.
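The overhead figure can be checked with a short calculation. The sketch below assumes that the 40 bytes quoted above are the sum of an IPv4 header without options (20 bytes), a UDP header (8 bytes) and the fixed RTP header without CSRC entries (12 bytes); this breakdown is an assumption and is not stated in the example itself.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumed breakdown)
UDP_HEADER_BYTES = 8    # UDP header
RTP_HEADER_BYTES = 12   # fixed RTP header, no CSRC list


def overhead_percent(header_bytes, payload_bytes):
    """Header bytes expressed as a percentage of the payload bytes."""
    return 100.0 * header_bytes / payload_bytes


total_header_bytes = IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER_BYTES
video_overhead = overhead_percent(total_header_bytes, 1400)
```

With these values the ratio is 40/1400, slightly below 2.9 %, which rounds to the 3 % quoted above.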
Appendix.2 MPEG-4 Video with SL
Let us consider the case of a 30 frames per second MPEG-4 video Let us consider the case of a 30 frames per second MPEG-4 video
stream whose bit rate is high enough that Access Units have to be stream whose bit rate is high enough that Access Units have to be
split in several SL packets (typically above 300 kb/s). split in several SL packets (typically above 300 kb/s).
Let us assume also that the video codec generates in that case Video Let us assume also that the video codec generates in that case Video
Packets suitable to fit in one SL packet i.e. that the video codec is Packets suitable to fit in one SL packet i.e. that the video codec is
MTU aware and the MTU is 1500 bytes. We assume furthermore that this MTU aware and the MTU is 1500 bytes. We assume furthermore that this
stream contains B frames and that decodingTimeStamps are present. stream contains B frames and that decodingTimeStamps are present.
SLConfigDescriptor SLConfigDescriptor
In this example the SLConfigDescriptor is: In this example the SLConfigDescriptor is:
class SLConfigDescriptor extends BaseDescriptor : bit(8) class SLConfigDescriptor extends BaseDescriptor : bit(8)
tag=SLConfigDescrTag { tag=SLConfigDescrTag {
bit(8) predefined; bit(8) predefined;
if (predefined==0) { if (predefined==0) {
bit(1) useAccessUnitStartFlag; = 1 bit(1) useAccessUnitStartFlag; = 1
bit(1) useAccessUnitEndFlag; = 0 bit(1) useAccessUnitEndFlag; = 0
skipping to change at line 1837 skipping to change at line 1991
bit(2) reserved=0b11; bit(2) reserved=0b11;
} }
if (durationFlag) { if (durationFlag) {
bit(32) timeScale; // NOT USED bit(32) timeScale; // NOT USED
bit(16) accessUnitDuration; // NOT USED bit(16) accessUnitDuration; // NOT USED
bit(16) compositionUnitDuration; // NOT USED bit(16) compositionUnitDuration; // NOT USED
} }
if (!useTimeStampsFlag) { if (!useTimeStampsFlag) {
bit(timeStampLength) startDecodingTimeStamp; // NOT USED bit(timeStampLength) startDecodingTimeStamp; // NOT USED
bit(timeStampLength) startCompositionTimeStamp; // NOT USED bit(timeStampLength) startCompositionTimeStamp; // NOT USED
} }
} }
The useRandomAccessPointFlag is set so that the The useRandomAccessPointFlag is set so that the
randomAccessPointFlag can indicate that the corresponding SL packet randomAccessPointFlag can indicate that the corresponding SL packet
contains a GOV and the first Video Packet of an Intra coded frame. contains a GOV and the first Video Packet of an Intra coded frame.
SL Packet Header structure SL Packet Header structure
With this configuration we have the following SL packet header With this configuration we have the following SL packet header
structure: structure:
aligned(8) class SL_PacketHeader (SLConfigDescriptor SL) { aligned(8) class SL_PacketHeader (SLConfigDescriptor SL) {
bit(1) accessUnitStartFlag; // 1 bit bit(1) accessUnitStartFlag; // 1 bit
if (accessUnitStartFlag) { if (accessUnitStartFlag) {
bit(1) randomAccessPointFlag; // 1 bit bit(1) randomAccessPointFlag; // 1 bit
bit(1) decodingTimeStampFlag; // 1 bit bit(1) decodingTimeStampFlag; // 1 bit
bit(1) compositionTimeStampFlag; // 1 bit bit(1) compositionTimeStampFlag; // 1 bit
if (decodingTimeStampFlag) { if (decodingTimeStampFlag) {
bit(SL.timeStampLength) decodingTimeStamp; bit(SL.timeStampLength) decodingTimeStamp;
} }
if (compositionTimeStampFlag) { if (compositionTimeStampFlag) {
bit(SL.timeStampLength) compositionTimeStamp; bit(SL.timeStampLength) compositionTimeStamp;
} }
} }
Parameters Parameters
skipping to change at line 1894 skipping to change at line 2048
+=========================================+=============+ +=========================================+=============+
| Field | size | | Field | size |
+=========================================+=============+ +=========================================+=============+
| RTP header | - | | RTP header | - |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| DTSFlag = 1 | 1 bit | | DTSFlag = 1 | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| DTSDelta | 7 bits | | DTSDelta | 7 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| bits to byte alignment | 0 bits | | bits to byte alignment | 0 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| RSLHSectionSize = 4 | 3 bits | | RSLHSectionSize = 4 | 3 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| accessUnitStartFlag = 1 | 1 bit | | accessUnitStartFlag = 1 | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| randomAccessPointFlag | 1 bit | | randomAccessPointFlag | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| decodingTimeStampFlag | 1 bit | | decodingTimeStampFlag | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| compositionTimeStampFlag | 1 bit | | compositionTimeStampFlag | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| bits to byte alignment | 1 bit | | bits to byte alignment | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| SL packet payload | N bytes | | SL packet payload | N bytes |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
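The field list above adds up to 16 bits, which is where the 2 bytes of payload header come from. A minimal Python sketch of that packing follows. It assumes the fields are written most significant bit first in table order and that DTSDelta is an unsigned 7-bit value; the function name and its arguments are hypothetical and not part of the draft.

```python
def pack_first_fragment_header(dts_delta, random_access, has_dts, has_cts):
    """Pack the header fields of a first-fragment packet into 2 bytes.

    Assumption: fields are laid out MSB first, in the order of the table.
    """
    if not 0 <= dts_delta < 128:
        raise ValueError("DTSDelta must fit in 7 bits")
    fields = [
        (1, 1),                   # DTSFlag = 1
        (dts_delta, 7),           # DTSDelta
        (4, 3),                   # RSLHSectionSize = 4 (four flag bits follow)
        (1, 1),                   # accessUnitStartFlag = 1
        (int(random_access), 1),  # randomAccessPointFlag
        (int(has_dts), 1),        # decodingTimeStampFlag
        (int(has_cts), 1),        # compositionTimeStampFlag
        (0, 1),                   # 1 bit of padding to byte alignment
    ]
    value = 0
    nbits = 0
    for field, width in fields:
        value = (value << width) | field
        nbits += width
    return value.to_bytes(nbits // 8, "big")


header = pack_first_fragment_header(5, True, False, False)
print(header.hex())
```

The two alignment entries of the table appear here as the implicit byte boundary after DTSDelta and the single explicit padding bit at the end.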
For packets that transport non-first fragments of Access Units we For packets that transport non-first fragments of Access Units we
have: have:
+=========================================+=============+ +=========================================+=============+
| Field | size | | Field | size |
+=========================================+=============+ +=========================================+=============+
| RTP header | - | | RTP header | - |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| DTSFlag = 0 | 1 bit | | DTSFlag = 0 | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
skipping to change at line 1941 skipping to change at line 2094
| bits to byte alignment | 4 bits | | bits to byte alignment | 4 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| SL packet payload | N bytes | | SL packet payload | N bytes |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
Overhead estimation Overhead estimation
In this example we have an RTP overhead of 40 + 2 bytes for 1400 In this example we have an RTP overhead of 40 + 2 bytes for 1400
bytes of payload, i.e. 3 % overhead. bytes of payload, i.e. 3 % overhead.
Appendix.2 RFC 3016 compatible MPEG-4 Video Appendix.3 Low delay MPEG-4 Audio (no SL)
This is an example of a video stream where the SL is configured to
produce RTP packets compatible with RFC 3016.
SLConfigDescriptor
In this example the SLConfigDescriptor is:
class SLConfigDescriptor extends BaseDescriptor : bit(8)
tag=SLConfigDescrTag {
bit(8) predefined;
if (predefined==0) {
bit(1) useAccessUnitStartFlag; = 0
bit(1) useAccessUnitEndFlag; = 1
bit(1) useRandomAccessPointFlag; = 0
bit(1) hasRandomAccessUnitsOnlyFlag; = 0
bit(1) usePaddingFlag; = 0
bit(1) useTimeStampsFlag; = 0
bit(1) useIdleFlag; = 0
bit(1) durationFlag; = 0
bit(32) timeStampResolution; = 0
bit(32) OCRResolution; = 0
bit(8) timeStampLength; = 0
bit(8) OCRLength; = 0
bit(8) AU_Length; = 0
bit(8) instantBitrateLength; = 0
bit(4) degradationPriorityLength; = 0
bit(5) AU_seqNumLength; = 0
bit(5) packetSeqNumLength; = 0
bit(2) reserved=0b11;
}
if (durationFlag) {
bit(32) timeScale; // NOT USED
bit(16) accessUnitDuration; // NOT USED
bit(16) compositionUnitDuration; // NOT USED
}
if (!useTimeStampsFlag) {
bit(timeStampLength) startDecodingTimeStamp; = 0
bit(timeStampLength) startCompositionTimeStamp; = 0
}
}
SL Packet Header structure
With this configuration we have the following SL packet header
structure:
aligned(8) class SL_PacketHeader (SLConfigDescriptor SL) {
if (SL.useAccessUnitEndFlag) {
bit(1) accessUnitEndFlag; // 1 bit
}
}
In this case this payload produces RTP packets that are exactly
conformant to RFC 3016 and the Synch Layer is reduced to a purely
logical construction that neither sender nor receiver need to
implement.
Parameters
This configuration is the default one; no parameters are required.
RTP packet structure
Note that accessUnitEndFlag is mapped to the RTP header M bit.
+=========================================+=============+
| Field | size |
+=========================================+=============+
| RTP header | - |
+-----------------------------------------+-------------+
| SL packet payload | 1400 bytes |
+-----------------------------------------+-------------+
Overhead
In this example we have a RTP overhead of 40 bytes for 1400 bytes of
payload i.e. 3 % overhead.
Appendix.3 Low delay MPEG-4 Audio
This example is for a low delay audio service. For this reason a This example is for a low delay audio service. For this reason a
single SL packet is transported in each RTP packet. single SL packet is transported in each RTP packet. Actually each SL
packet contains a complete Access Unit.
SLConfigDescriptor SLConfigDescriptor
Since CTS=DTS and Access Unit duration is constant, signaling of Since CTS=DTS and Access Unit duration is constant, signaling of
MPEG-4 time stamps is not needed (the durationFlag of SLConfig is MPEG-4 time stamps is not needed (the durationFlag of SLConfig is
set). set).
We also assume here an audio Object Type for which all Access Units We also assume here an audio Object Type for which all Access Units
are Random Access Points, which is signaled using the are Random Access Points, which is signaled using the
hasRandomAccessUnitsOnlyFlag in the SLConfigDescriptor. hasRandomAccessUnitsOnlyFlag in the SLConfigDescriptor.
skipping to change at line 2053 skipping to change at line 2121
and equal to 5 bytes (which is signaled with AU_Length). and equal to 5 bytes (which is signaled with AU_Length).
In this example the SLConfigDescriptor is: In this example the SLConfigDescriptor is:
class SLConfigDescriptor extends BaseDescriptor : bit(8) class SLConfigDescriptor extends BaseDescriptor : bit(8)
tag=SLConfigDescrTag { tag=SLConfigDescrTag {
bit(8) predefined; bit(8) predefined;
if (predefined==0) { if (predefined==0) {
bit(1) useAccessUnitStartFlag; = 0 bit(1) useAccessUnitStartFlag; = 0
bit(1) useAccessUnitEndFlag; = 0 bit(1) useAccessUnitEndFlag; = 0
bit(1) useRandomAccessPointFlag; = 0 bit(1) useRandomAccessPointFlag; = 0
bit(1) hasRandomAccessUnitsOnlyFlag; = 1 bit(1) hasRandomAccessUnitsOnlyFlag; = 1
bit(1) usePaddingFlag; = 0 bit(1) usePaddingFlag; = 0
bit(1) useTimeStampsFlag; = 0 bit(1) useTimeStampsFlag; = 0
bit(1) useIdleFlag; = 0 bit(1) useIdleFlag; = 0
bit(1) durationFlag; = 1 // signals constant AU duration bit(1) durationFlag; = 1 // signals constant AU duration
bit(32) timeStampResolution; = 0 bit(32) timeStampResolution; = 0
bit(32) OCRResolution; = 0 bit(32) OCRResolution; = 0
bit(8) timeStampLength; = 0 bit(8) timeStampLength; = 0
bit(8) OCRLength; = 0 bit(8) OCRLength; = 0
bit(8) AU_Length; = 5 bit(8) AU_Length; = 5
bit(8) instantBitrateLength; = 0 bit(8) instantBitrateLength; = 0
bit(4) degradationPriorityLength; = 0 bit(4) degradationPriorityLength; = 0
bit(5) AU_seqNumLength; = 0 bit(5) AU_seqNumLength; = 0
bit(5) packetSeqNumLength; = 0 bit(5) packetSeqNumLength; = 0
bit(2) reserved=0b11; bit(2) reserved=0b11;
} }
if (durationFlag) { if (durationFlag) {
bit(32) timeScale; = 1000 // for milliseconds bit(32) timeScale; = 1000 // for milliseconds
skipping to change at line 2087 skipping to change at line 2155
bit(16) compositionUnitDuration; = 10 // ms bit(16) compositionUnitDuration; = 10 // ms
} }
if (!useTimeStampsFlag) { if (!useTimeStampsFlag) {
bit(timeStampLength) startDecodingTimeStamp; = 0 bit(timeStampLength) startDecodingTimeStamp; = 0
bit(timeStampLength) startCompositionTimeStamp; = 0 bit(timeStampLength) startCompositionTimeStamp; = 0
} }
} }
SL packet header SL packet header
With this configuration the SL packet header is empty. With this configuration the SL packet header is empty. The Synch
Layer is reduced to a purely logical construction that neither
sender nor receiver need to implement.
Parameters Parameters
No parameters are required. No parameters are required.
RTP packet structure RTP packet structure
Note that the RTP header M bit should always be set to 1. Note that the RTP header M bit should always be set to 1.
+=========================================+=============+ +=========================================+=============+
| Field | size | | Field | size |
+=========================================+=============+ +=========================================+=============+
| RTP header | - | | RTP header | - |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| SL packet payload | 5 bytes | | SL packet payload | 5 bytes |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
Overhead estimation Overhead estimation
The overhead is extremely large, i.e. 800 %, since 40 bytes The overhead is extremely large, i.e. 800 %, since 40 bytes
of headers are required to transport 5 bytes of data. Note however of headers are required to transport 5 bytes of data. Note however
that RTP header compression would work well since time stamp that RTP header compression would work well since time stamp
increments are constant. increments are constant.
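The overhead figures quoted in these appendices are the ratio of header bytes to payload bytes. A small Python sketch of that arithmetic is given below. It assumes the 40 bytes of headers are 20 bytes of IPv4, 8 bytes of UDP and 12 bytes of RTP (the draft only states the total), and the function name is hypothetical.

```python
def overhead_percent(header_bytes, payload_bytes):
    """Return header overhead as a percentage of the payload size."""
    if payload_bytes <= 0:
        raise ValueError("payload size must be positive")
    return 100.0 * header_bytes / payload_bytes


# Assumed breakdown of the 40 header bytes: IPv4 + UDP + RTP.
HEADERS = 20 + 8 + 12

print(overhead_percent(HEADERS, 5))         # low delay audio, one 5-byte AU
print(overhead_percent(HEADERS + 2, 1400))  # video example, 2 extra header bytes
```

For the 5-byte Access Unit this gives 800 %, and for the video example with 1400 bytes of payload it gives 3 %.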
Appendix.4 Media delivery MPEG-4 Audio Appendix.4 Media delivery MPEG-4 Audio (no SL)
This example is for a media delivery service where delay is not an This example is for a media delivery service where delay is not an
issue but efficiency is. In this case several SL Packets are issue but efficiency is. In this case several SL Packets are
transported in each RTP packet. transported in each RTP packet.
SLConfigDescriptor SLConfigDescriptor
Is the same as in Appendix.3 Similar to previous example.
SL packet header SL packet header
With this configuration the SL packet header is empty. With this configuration the SL packet header is empty. The Synch
Layer is reduced to a purely logical construction that neither
sender nor receiver need to implement.
Parameters Parameters
The absence of RSLHSectionSizeLength indicates that the RSLHSection The absence of RSLHSectionSizeLength indicates that the RSLHSection
is empty. is empty.
The size of SL Packets (which are all complete Access Units in this The size of SL Packets (which are all complete Access Units in this
case) is constant and is indicated with: case) is constant and is indicated with:
a=fmtp:<format> ConstantSize=5 a=fmtp:<format> ConstantSize=5
skipping to change at line 2157 skipping to change at line 2229
to the receiver that only complete Access Units are transported. to the receiver that only complete Access Units are transported.
+=========================================+=============+ +=========================================+=============+
| Field | size | | Field | size |
+=========================================+=============+ +=========================================+=============+
| RTP header | - | | RTP header | - |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| SL packet payload | 5 bytes | | SL packet payload | 5 bytes |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| SL packet payload | 5 bytes | | SL packet payload | 5 bytes |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| etc, until MTU is reached | | etc, until MTU is reached |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| SL packet payload | 5 bytes | | SL packet payload | 5 bytes |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
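Because every Access Unit has the size announced by ConstantSize, a receiver can recover the Access Units by cutting the RTP payload into equal slices. The following Python sketch illustrates this under the assumption that the payload contains nothing but concatenated Access Units, as in the table above; the function name is hypothetical.

```python
def split_constant_size(payload, constant_size):
    """Split an RTP payload into Access Units of ConstantSize bytes each."""
    if constant_size <= 0:
        raise ValueError("ConstantSize must be positive")
    if len(payload) % constant_size != 0:
        raise ValueError("payload is not a whole number of Access Units")
    return [payload[i:i + constant_size]
            for i in range(0, len(payload), constant_size)]


units = split_constant_size(bytes(range(15)), 5)
print(len(units))
```

A payload whose length is not a multiple of ConstantSize is rejected here; how a real receiver should treat such a packet is not specified in this section.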
Overhead estimation Overhead estimation
The overhead is 3% i.e. minimal. The overhead is 3% i.e. minimal.
Appendix.5 A more complex case: AAC with interleaving Appendix.5 AAC with interleaving (no SL)
Let us consider AAC at 128 kb/s where each Access Unit is on
average 320 bytes. Interleaving is applied with a continuous
interleaving scheme (see table below) where 4 Access Units are used
to construct each RTP packet in order to match an MTU of 1500 bytes.
IndexDelta is constant and equal to 2 (since +1 is automatically
added); it is encoded on 3 bits.
Index (being encoded on 3 bits) rolls over very fast and is not very
useful for reordering. However, this is a case, as explained in section
3.8, where time stamps should be used for de-interleaving: receivers
know that each SL packet is a complete Access Unit because all RTP
packets have the M bit set to 1. Therefore, since Access Unit
duration is constant, Access Unit timestamps can be computed from
RTP timestamps and IndexDelta values; this can be used for
de-interleaving even in case of losses.
+------------+---------------+-------------+------------------+
| RTP packet | RTP Timestamp | AUs         | Index,IndexDelta |
+------------+---------------+-------------+------------------+
| 1          | CTS(AU1)      | 1           | 1                |
+------------+---------------+-------------+------------------+
| 2          | CTS(AU2)      | 2, 5        | 2,2              |
+------------+---------------+-------------+------------------+
| 3          | CTS(AU3)      | 3, 6, 9     | 3,2,2            |
+------------+---------------+-------------+------------------+
| 4          | CTS(AU4)      | 4, 7,10,13  | 4,2,2,2          |
+------------+---------------+-------------+------------------+
| 5          | CTS(AU8)      | 8,11,14,17  | 0,2,2,2          |
+------------+---------------+-------------+------------------+
| 6          | CTS(AU12)     | 12,15,18,21 | 4,2,2,2          |
+------------+---------------+-------------+------------------+
| 7          | CTS(AU16)     | 16,19,22,25 | 0,2,2,2          |
+------------+---------------+-------------+------------------+
| 8          | CTS(AU20)     | 20,23,26,29 | 4,2,2,2          |
+------------+---------------+-------------+------------------+
| 9          | CTS(AU24)     | 24,27,30,33 | 0,2,2,2          |
+------------+---------------+-------------+------------------+
| 10         | CTS(AU28)     | 28,31,34,37 | 4,2,2,2          |
+------------+---------------+-------------+------------------+
| etc                                                         |
+------------+---------------+-------------+------------------+
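The timestamp-based de-interleaving described for this scheme can be sketched in a few lines of Python. The sketch assumes that the RTP timestamp of a packet equals the CTS of its first Access Unit (as in the table), that each IndexDelta is stored minus one, and that the Access Unit duration is 1024 timestamp ticks. That last value is an illustrative assumption, not a figure from this draft, and timestamp wrap-around is ignored. The function name is hypothetical.

```python
def access_unit_timestamps(rtp_timestamp, index_deltas, au_duration):
    """Compute the timestamp of every Access Unit carried in one RTP packet.

    rtp_timestamp: timestamp of the first Access Unit in the packet.
    index_deltas:  IndexDelta values of the following Access Units.
    au_duration:   constant Access Unit duration in timestamp ticks.
    """
    timestamps = [rtp_timestamp]
    for delta in index_deltas:
        # IndexDelta is coded minus one, so the distance in AUs is delta + 1.
        timestamps.append(timestamps[-1] + (delta + 1) * au_duration)
    return timestamps


AU_DURATION = 1024  # assumed ticks per Access Unit, for illustration only

# RTP packet 4 of the table: first AU is AU4, followed by three IndexDelta = 2.
cts_au4 = 3 * AU_DURATION  # taking CTS(AU1) = 0
print(access_unit_timestamps(cts_au4, [2, 2, 2], AU_DURATION))
```

With CTS(AU1) taken as 0, the four results correspond to AU4, AU7, AU10 and AU13, so a receiver can sort Access Units by these values without relying on the 3-bit Index.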
SLConfigDescriptor
Similar to previous example.
SL Packet Header
Similar to previous example (empty).
Parameters
The resulting concatenated fmtp line is:
a=fmtp:<format> SizeLength=13;IndexLength=3;IndexDeltaLength=3
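A receiver has to turn this fmtp line into numeric field widths before it can parse the payload header. The Python sketch below shows one way to do that. It assumes every parameter on the line is an integer written as name=value and separated by semicolons, and it uses payload type 96 as a stand-in for the <format> placeholder; both the function name and that payload type are hypothetical.

```python
def parse_fmtp(line):
    """Parse an SDP 'a=fmtp:' line into (format, {parameter: int})."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an fmtp attribute line")
    fmt, _, params = line[len(prefix):].partition(" ")
    result = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        name, _, value = item.partition("=")
        result[name.strip()] = int(value)
    return fmt, result


# 96 stands in for <format>; it is an assumed dynamic payload type.
fmt, params = parse_fmtp(
    "a=fmtp:96 SizeLength=13;IndexLength=3;IndexDeltaLength=3")
print(fmt, params)
```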
RTP packet structure
+=========================================+=============+
| Field | size |
+=========================================+=============+
| RTP header | - |
+-----------------------------------------+-------------+
MSLHSection
+=========================================+=============+
| MSLHSection size in bits = 64 | 2 bytes |
+-----------------------------------------+-------------+
| PayloadSize | 13 bits |
+-----------------------------------------+-------------+
| Index | 3 bits |
+-----------------------------------------+-------------+
| PayloadSize | 13 bits |
+-----------------------------------------+-------------+
| IndexDelta | 3 bits |
+-----------------------------------------+-------------+
| PayloadSize | 13 bits |
+-----------------------------------------+-------------+
| IndexDelta | 3 bits |
+-----------------------------------------+-------------+
| PayloadSize | 13 bits |
+-----------------------------------------+-------------+
| IndexDelta | 3 bits |
+-----------------------------------------+-------------+
| bits to byte alignment | 0 bits |
+-----------------------------------------+-------------+
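With SizeLength=13, IndexLength=3 and IndexDeltaLength=3, each of the four Access Units contributes 16 bits to the header entries, i.e. 4 x (13 + 3) = 64 bits with no alignment padding. The following Python sketch packs these entries. It assumes MSB-first bit order and leaves out the leading 2-byte size field; the function name and argument names are hypothetical.

```python
def pack_mslh_entries(sizes, first_index, index_deltas,
                      size_length=13, index_length=3, index_delta_length=3):
    """Pack (PayloadSize, Index/IndexDelta) pairs; return (bit count, bytes)."""
    if len(index_deltas) != len(sizes) - 1:
        raise ValueError("need one IndexDelta per Access Unit after the first")
    tags = [(first_index, index_length)]
    tags += [(delta, index_delta_length) for delta in index_deltas]
    value = 0
    nbits = 0
    for size, (tag, tag_length) in zip(sizes, tags):
        if size >= 1 << size_length or tag >= 1 << tag_length:
            raise ValueError("field value does not fit in its bit width")
        value = (value << size_length) | size
        value = (value << tag_length) | tag
        nbits += size_length + tag_length
    padding = -nbits % 8  # bits to byte alignment
    value <<= padding
    return nbits, value.to_bytes((nbits + padding) // 8, "big")


# Four Access Units of 320 bytes, Index 4 then IndexDelta 2, 2, 2.
nbits, data = pack_mslh_entries([320, 320, 320, 320], 4, [2, 2, 2])
print(nbits, len(data), data.hex())
```

For this input the entries occupy 64 bits, i.e. 8 bytes, which is the figure used in the overhead estimation of this example.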
SLPPSection
+=========================================+=============+
| AAC Access Unit | x bytes |
+-----------------------------------------+-------------+
| AAC Access Unit | x bytes |
+-----------------------------------------+-------------+
| AAC Access Unit | x bytes |
+-----------------------------------------+-------------+
| AAC Access Unit | x bytes |
+-----------------------------------------+-------------+
Overhead estimation
The MSLHSection is 8 bytes; in this example we therefore have an RTP
overhead of 40 + 8 bytes for approximately 1400 bytes of payload, i.e.
around 4 % overhead.
Appendix.6 A more complex case: AAC with interleaving and SL
Let us consider AAC around 130 kb/s where each Access Unit is split Let us consider AAC around 130 kb/s where each Access Unit is split
in 4 SL packets corresponding to Error Sensitivity Categories (ESC) in 4 SL packets corresponding to Error Sensitivity Categories (ESC)
of maximum 90 bytes for which interleaving is very useful in terms of maximum 90 bytes for which interleaving is very useful in terms
of error resilience. We thus use an interleaving scheme where 15 SL of error resilience. We thus use an interleaving scheme where 15 SL
Packets (extracted from 15 consecutive Access Units) are used to Packets (extracted from 15 consecutive Access Units) are used to
construct each RTP packet in order to match a MTU of 1500 bytes. construct each RTP packet in order to match a MTU of 1500 bytes.
Note that since ESC fragments are not byte aligned we also use the Note that since ESC fragments are not byte aligned we also use the
paddingFlag and paddingBits features of the Synch Layer. paddingFlag and paddingBits features of the Synch Layer.
skipping to change at line 2209 skipping to change at line 2395
reports indicating high loss rates) can (for example) choose to reports indicating high loss rates) can (for example) choose to
duplicate for each interleaving sequence the first RTP packet that duplicate for each interleaving sequence the first RTP packet that
contains the most useful data in terms of ESC or apply other error contains the most useful data in terms of ESC or apply other error
protection techniques, with due care to congestion issues. protection techniques, with due care to congestion issues.
In this example we will also show several other SL features (OCR, AU In this example we will also show several other SL features (OCR, AU
boundary flags, padding, as detailed below). boundary flags, padding, as detailed below).
One feature demonstrated by this example is the degradation One feature demonstrated by this example is the degradation
priority. We assume degradation priority can take 4 different priority. We assume degradation priority can take 4 different
values, mapped to Error Sensitivity Categories, and is encoded on 2 values, mapped to Error Sensitivity Categories, and is encoded on 2
bits. This interleaving scheme makes sure that only SL packets of bits. This interleaving scheme makes sure that only SL packets of
identical degradation priorities are grouped in the same RTP packet identical degradation priorities are grouped in the same RTP packet
(3.6.3) and that only the first RSLH of each RTP packet transports (3.6.3) and that only the first RSLH of each RTP packet transports
the degradation priority. the degradation priority.
We also assume that for each last SL packet of each RTP packet the We also assume that for each last SL packet of each RTP packet the
server inserts an OCR. server inserts an OCR.
SLConfigDescriptor SLConfigDescriptor
In this example the SLConfigDescriptor is: In this example the SLConfigDescriptor is:
class SLConfigDescriptor extends BaseDescriptor : bit(8) class SLConfigDescriptor extends BaseDescriptor : bit(8)
tag=SLConfigDescrTag { tag=SLConfigDescrTag {
bit(8) predefined; bit(8) predefined;
if (predefined==0) { if (predefined==0) {
bit(1) useAccessUnitStartFlag; = 1 bit(1) useAccessUnitStartFlag; = 1
bit(1) useAccessUnitEndFlag; = 1 bit(1) useAccessUnitEndFlag; = 1
bit(1) useRandomAccessPointFlag; = 0 bit(1) useRandomAccessPointFlag; = 0
bit(1) hasRandomAccessUnitsOnlyFlag; = 1 bit(1) hasRandomAccessUnitsOnlyFlag; = 1
bit(1) usePaddingFlag; = 1 // we need to signal padding bits bit(1) usePaddingFlag; = 1 // we need to signal padding bits
bit(1) useTimeStampsFlag; = 0 bit(1) useTimeStampsFlag; = 0
bit(1) useIdleFlag; = 0 bit(1) useIdleFlag; = 0
bit(1) durationFlag; = 1 bit(1) durationFlag; = 1
skipping to change at line 2265 skipping to change at line 2451
bit(timeStampLength) startDecodingTimeStamp; = 0 bit(timeStampLength) startDecodingTimeStamp; = 0
bit(timeStampLength) startCompositionTimeStamp; = 0 bit(timeStampLength) startCompositionTimeStamp; = 0
} }
} }
SL Packet Header structure SL Packet Header structure
With this configuration we have the following SL packet header With this configuration we have the following SL packet header
structure: structure:
aligned(8) class SL_PacketHeader (SLConfigDescriptor SL) { aligned(8) class SL_PacketHeader (SLConfigDescriptor SL) {
bit(1) accessUnitStartFlag; bit(1) accessUnitStartFlag;
bit(1) accessUnitEndFlag; bit(1) accessUnitEndFlag;
bit(1) OCRflag; bit(1) OCRflag;
bit(1) paddingFlag; bit(1) paddingFlag;
if (paddingFlag) bit(3) paddingBits; if (paddingFlag) bit(3) paddingBits;
bit(SL.packetSeqNumLength) packetSequenceNumber; bit(SL.packetSeqNumLength) packetSequenceNumber;
bit(1) DegPrioflag; bit(1) DegPrioflag;
if (DegPrioflag) { if (DegPrioflag) {
bit(SL.degradationPriorityLength) degradationPriority;} bit(SL.degradationPriorityLength) degradationPriority;}
if (OCRflag) { if (OCRflag) {
bit(SL.OCRLength) objectClockReference;} bit(SL.OCRLength) objectClockReference;}
} }
} }
Parameters Parameters
The RSLHSectionSize cannot exceed 2 bits, which is encoded on 2 bits The RSLHSectionSize cannot exceed 2 bits, which is encoded on 2 bits
and signaled by RSLHSectionSizeLength. and signaled by RSLHSectionSizeLength.
The resulting concatenated fmtp line is: The resulting concatenated fmtp line is:
a=fmtp:<format> a=fmtp:<format>
SizeLength=6;RSLHSectionSizeLength=2;IndexLength=2;IndexDeltaLength= SizeLength=6;RSLHSectionSizeLength=2;IndexLength=2;IndexDeltaLength=
2;OCRDeltaLength=16 2;OCRDeltaLength=16
RTP packet structure RTP packet structure
skipping to change at line 2311 skipping to change at line 2497
MSLHSection MSLHSection
+=========================================+=============+ +=========================================+=============+
| MSLHSection size in bits = 135 | 2 bytes | | MSLHSection size in bits = 135 | 2 bytes |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| PayloadSize | 7 bits | | PayloadSize | 7 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| Index = 0 or 1 or 2 or 3 | 2 bits | | Index = 0 or 1 or 2 or 3 | 2 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| PayloadSize | 7 bits | | PayloadSize | 7 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| SLPSeqDeltaNum = 3 | 2 bits | | IndexDelta = 3 | 2 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| etc + 12 times 9 bits | | etc + 12 times 9 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| PayloadSize | 7 bits | | PayloadSize | 7 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| SLPSeqDeltaNum = 3 | 2 bits | | IndexDelta = 3 | 2 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| bits to byte alignment | 7 bits | | bits to byte alignment | 7 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
RSLHSection RSLHSection
+=========================================+=============+ +=========================================+=============+
| RSLHSectionSize | 6 bits | | RSLHSectionSize | 6 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| accessUnitStartFlag | 1 bit | | accessUnitStartFlag | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| accessUnitEndFlag | 1 bit | | accessUnitEndFlag | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| OCRFlag = 0 | 1 bit | | OCRFlag = 0 | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| paddingFlag = 1 | 1 bit | | paddingFlag = 1 | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| paddingBits | 3 bits | | paddingBits | 3 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| DegPrioflag = 1 | 1 bit | | DegPrioflag = 1 | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| degradationPriority | 2 bits | | degradationPriority | 2 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| accessUnitStartFlag | 1 bit | | accessUnitStartFlag | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| accessUnitEndFlag | 1 bit | | accessUnitEndFlag | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| OCRFlag = 0 | 1 bit | | OCRFlag = 0 | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| paddingFlag = 1 | 1 bit | | paddingFlag = 1 | 1 bit |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
skipping to change at line 2378 skipping to change at line 2564
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| bits to byte alignment | 5 bits | | bits to byte alignment | 5 bits |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
SLPPSection SLPPSection
+=========================================+=============+ +=========================================+=============+
| SL packet payload |max 90 bytes | | SL packet payload |max 90 bytes |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| etc + 13 SL packets | | etc + 13 SL packets |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
| SL packet payload |max 90 bytes | | SL packet payload |max 90 bytes |
+-----------------------------------------+-------------+ +-----------------------------------------+-------------+
Note that in the above table the last SL packet in the RTP packet Note that in the above table the last SL packet in the RTP packet
has a payload that is byte-aligned (at the end). When this happens has a payload that is byte-aligned (at the end). When this happens
paddingFlag is set to zero and the paddingBits field is omitted. paddingFlag is set to zero and the paddingBits field is omitted.
Overhead estimation Overhead estimation
The MSLHSection is 19 bytes, the RSLHSection is 16 bytes; in this The MSLHSection is 19 bytes, the RSLHSection is 16 bytes; in this
example we therefore have an RTP overhead of 40 + 35 bytes for 1350 example we therefore have an RTP overhead of 40 + 35 bytes for 1350
bytes (max) of payload, i.e. around 6 % overhead. bytes (max) of payload, i.e. around 6 % overhead.
 End of changes. 

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/