draft-ietf-avt-uxp-03.txt   draft-ietf-avt-uxp-04.txt 
Internet Engineering Task Force G. Liebl, Internet Engineering Task Force G. Liebl
T.Stockhammer
Internet Draft LNT, Munich Univ. of Internet Draft LNT, Munich Univ. of
Technology Technology
Document: draft-ietf-avt-uxp-03.txt Document: draft-ietf-avt-uxp-04.txt
June 2002 M. Wagner, J.Pandel, November 2002 M. Wagner, J. Pandel,
W. Weng, G. Baese, W. Weng
M. Nguyen, F. Burkert Expires: May 2003 Siemens AG, Munich
Expires: December 2002 Siemens AG, Munich
An RTP Payload Format for Erasure-Resilient Transmission of An RTP Payload Format for Erasure-Resilient Transmission of
Progressive Multimedia Streams Progressive Multimedia Streams
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026. with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Internet-Drafts are draft documents valid for a maximum Drafts. Internet-Drafts are draft documents valid for a maximum
of six months and may be updated, replaced, or obsoleted by other of six months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet- documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as "work Drafts as reference material or to cite them other than as "work
in progress." in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
skipping to change at page 2, line ? skipping to change at page 2, line ?
stream, thus allowing a graceful degradation of application stream, thus allowing a graceful degradation of application
quality with increasing packet loss rate on the network. Hence, quality with increasing packet loss rate on the network. Hence,
this type of unequal erasure protection (UXP) scheme is intended this type of unequal erasure protection (UXP) scheme is intended
to cope with the rapidly varying channel conditions on wireless to cope with the rapidly varying channel conditions on wireless
access links to the Internet backbone. Nevertheless, backward access links to the Internet backbone. Nevertheless, backward
compatibility to currently standardized non-progressive compatibility to currently standardized non-progressive
multimedia codecs is ensured, since equal erasure protection multimedia codecs is ensured, since equal erasure protection
(EXP) represents a subset of generic UXP. By applying (EXP) represents a subset of generic UXP. By applying
interleaving and RS codes a payload format is defined, which can interleaving and RS codes a payload format is defined, which can
be easily integrated into the existing framework for RTP. be easily integrated into the existing framework for RTP.
1. Introduction 1. Introduction
Due to the increasing popularity of high-quality multimedia Due to the increasing popularity of high-quality multimedia
applications over the Internet and the high level of public applications over the Internet and the high level of public
acceptance of existing mobile communication systems, there is a acceptance of existing mobile communication systems, there is a
strong demand for a future combination of these two techniques: strong demand for a future combination of these two techniques:
One possible scenario consists of an integrated communication One possible scenario consists of an integrated communication
environment, where users can set up multimedia connections environment, where users can set up multimedia connections
anytime and anywhere via radio access links to the Internet. anytime and anywhere via radio access links to the Internet.
For this reason, several packet-oriented transmission modes have For this reason, several packet-oriented transmission modes have
been proposed for next generation wireless standards like EGPRS been proposed for next generation wireless standards like EGPRS
(Enhanced General Packet Radio Service) or UMTS (Universal Mobile (Enhanced General Packet Radio Service) or UMTS (Universal Mobile
Telecommunications System), which are mostly based on the same Telecommunications System), which are mostly based on the same
principle: Long message blocks, i.e. IP packets, that enter the principle: Long message blocks, i.e. IP packets, that enter the
wireless part of the network are split up into segments of wireless part of the network are split up into segments of
desired length, which can be multiplexed onto link layer packets desired length, which can be multiplexed onto link layer packets
of fixed size. The latter are then transmitted sequentially over of fixed size. The latter are then transmitted sequentially over
the wireless link, reassembled, and passed on to the next network the wireless link, reassembled, and passed on to the next network
element. element.
skipping to change at page 3, line 6 skipping to change at page 2, line ?
application cannot operate on partly complete blocks, they were application cannot operate on partly complete blocks, they were
optimized with respect to assigning equal erasure protection over optimized with respect to assigning equal erasure protection over
the whole message block. However, recent developments both in the whole message block. However, recent developments both in
audio and video coding have introduced the notion of audio and video coding have introduced the notion of
progressively encoded media streams, for which unequal erasure progressively encoded media streams, for which unequal erasure
protection strategies seem to be more promising, as will be protection strategies seem to be more promising, as will be
explained in more detail below. Although the scheme defined in explained in more detail below. Although the scheme defined in
[1] is in principle capable of supporting some kind of unequal [1] is in principle capable of supporting some kind of unequal
erasure protection, possible implementations seem to be quite erasure protection, possible implementations seem to be quite
complex with respect to the gain in performance. Finally, in [1] complex with respect to the gain in performance. Finally, in [1]
it is assumed that subsequent RTP packets can have variable it is assumed that consecutive RTP packets can have variable
length, which would cause significant segmentation overhead at length, which would cause significant segmentation overhead at
the link layer of almost all wireless systems. the link layer of almost all wireless systems.
This document defines a payload format for RTP, such that This document defines a payload format for RTP, such that
different elements in a progressively encoded multimedia stream different elements in a progressively encoded multimedia stream
can be protected against packet erasures according to their can be protected against packet erasures according to their
respective quality-of-service requirement. The general principle, respective quality-of-service requirement. The general principle,
including the use of Reed-Solomon codes together with an including the use of Reed-Solomon codes together with an
appropriate interleaving scheme for adding redundancy, follows appropriate interleaving scheme for adding redundancy, follows
the ideas already presented in [2], but allows for finer the ideas already presented in [2], but allows for finer
granularity in the structure of the progressive media stream. The granularity in the structure of the progressive media stream. The
proposed scheme is generic in the way that it (1) is independent proposed scheme is generic in the way that it (1) is independent
of the type of media stream, be it audio or video, and (2) can be of the type of media stream, be it audio or video, and (2) can be
adapted to varying transmission quality very quickly by use of adapted to varying transmission quality very quickly by use of
inband-signaling. inband-signaling.
2. Conventions used in this document 2. Conventions used in this Document
The following terms are used throughout this document: The following terms are used throughout this document:
1.) Message block: a higher layer transport unit (e.g. an IP 1.) Segment: denotes a link layer transport unit.
packet), that enters/leaves the segmentation/reassembly 2.) Segmentation/Reassembly Process: If the size of the
stage at the interface to wireless data link layers.
2.) Segment: denotes a link layer transport unit.
3.) CRC: Cyclic Redundancy Check, usually added to transport
units at the sender to detect the existence of erroneous
bits in a transport unit at the receiver.
4.) Segmentation/Reassembly Process: If the size of the
transport units at the link layer is smaller than that at transport units at the link layer is smaller than that at
the upper layers, message blocks have to be split up into the upper layers, message blocks have to be split up into
several parts, i.e. segments, which are then transmitted several parts, i.e. segments, which are then transmitted
subsequently over the link. If nothing is lost, the original subsequently over the link. If nothing is lost, the original
message block can be restored at the receiving entity message block can be restored at the receiving entity
(reassembly). (reassembly).
5.) Quality-of-service: application-dependent criterion to 3.) Codec: denotes a functional pair consisting of a source
define a certain desired operation point.
6.) Codec: denotes a functional pair consisting of a source
encoding unit at the sender and a corresponding source encoding unit at the sender and a corresponding source
decoding unit at the receiver; usually standardized for decoding unit at the receiver; usually standardized for
different multimedia applications like audio or video. different media applications like audio or video.
7.) Media stream: A bitstream which results at the output of an 4.) Media stream: A bitstream which results at the output of an
encoder for a specific media type, e.g. H.263, MPEG-4-video. encoder for a specific media type, e.g. H.263, MPEG-4
8.) Progressive media stream: A media stream which can be Visual.
divided into successive elements. . The distinct elements 5.) Progressive media stream: A media stream which can be
are of different importance to the reconstruction process at divided into successive elements. The distinct elements are
the decoder and are commonly ordered from highest to least of different importance to the decoding process and are
importance, where the latter elements depend on the commonly ordered from highest to least importance, where the
previous. latter elements depend on the previous.
9.) Progressive source coding: results in a progressive media 6.) Progressive source coding: results in a progressive media
stream. stream.
10.) Reed-Solomon (RS) code: belongs to the class of linear 7.) Reed-Solomon (RS) code: belongs to the class of linear
nonbinary block codes, and is uniquely specified by the nonbinary block codes, and is uniquely specified by the
block length n, the number of parity symbols t, and the block length n, the number of parity symbols t, and the
symbol alphabet. symbol alphabet.
11.) n: is a variable, which denotes both the block length of an 8.) n: is a variable, which denotes both the block length of an
RS codeword, and the number of columns in a TB (see 19). RS codeword, and the number of columns in a TB (see 19).
12.) k: is a variable, which denotes the number of information 9.) k: is a variable, which denotes the number of information
symbols in a RS codeword. symbols in an RS codeword.
13.) t: is a variable, which denotes the number of parity symbols 10.) t: is a variable, which denotes the number of parity symbols
in a RS codeword. in an RS codeword.
14.) Erasure: When a packet is lost during transmission, an 11.) Erasure: When a packet is lost during transmission, an
erasure is said to have happened. Since the position of the erasure is said to have happened. Since the position of the
erased packet in a sequence is usually known, a erased packet in a sequence is usually known, a
corresponding erasure marker can be set at the receiving corresponding erasure marker can be set at the receiving
entity. entity.
15.) Base layer: comprises the first and most important elements
of the progressively encoded source, without which all 12.) Base layer: comprises the first and most important elements
of the progressive media stream, without which all
subsequent information is useless. subsequent information is useless.
16.) Enhancement layer: comprises one or more sets of the less 13.) Enhancement layer: comprises one or more sets of the less
important subsequent elements of the progressively encoded important subsequent elements of the progressive media
source. A specific enhancement layer can be decoded, if and stream. A specific enhancement layer can be decoded, if and
only if the base layer and all previous enhancement layer only if the base layer and all previous enhancement layer
data (of higher importance) is available. data (of higher importance) are available.
17.) Info stream: denotes the bitstream which has to be 14.) Info stream: denotes the bitstream which has to be
protected by the UXP scheme. It usually consists of the protected by the UXP scheme. It usually consists of the
media stream (progressively source encoded or not), which is media stream (progressively source encoded or not), which is
arranged according to a desired syntax (e.g. to achieve an arranged according to a desired syntax (e.g. to achieve an
appropriate framing, see 6.3 ). In any case, it is assumed appropriate framing, see Sect. 6.3 ). In any case, it is
that every info stream is already octet-aligned according to assumed that every info stream is already octet-aligned
the standard procedures defined in the context of the used according to the standard procedures defined in the context
syntax specifications. of the used syntax specifications.
18.) Info octet: Denotes one element of the info stream. 15.) Info octet: Denotes one element of the info stream.
19.) Transmission block (TB): denotes a memory array of L rows 16.) Transmission block (TB): denotes a memory array of L rows
and n columns. Each row of a TB represents an RS codeword, and n columns. Each row of a TB represents an RS codeword,
whereas each column, together with the respective UXP header whereas each column, together with the respective UXP header
(see 36) in front, forms the payload of a single RTP packet. (see 36) in front, forms the payload of a single RTP packet.
Each TB consists of at least two distinct transmission sub Each TB consists of at least two distinct transmission sub
blocks (TSB, see 20): The first L_s rows belong to the blocks (TSB, see 20): The first L_s rows belong to the
signaling TSB, whereas the last L_d=(L-L_s) rows belong to signaling TSB, whereas the last L_d=(L-L_s) rows belong to
one or more data TSB. one or more data TSB.
17.) Transmission sub block (TSB): denotes a memory array of
20.) Transmission sub block (TSB): denotes a memory array of
0<l<L rows and n columns, which is a horizontal slice of a 0<l<L rows and n columns, which is a horizontal slice of a
TB. Depending on whether the info octet positions are filled TB. Depending on whether the info octet positions are filled
with descriptors (see 31) or media data, the TSB is of type with descriptors (see 31) or media data, the TSB is of type
signaling or data, respectively. signaling or data, respectively.
21.) L: is a variable, which denotes both the number of rows in a 18.) L: is a variable, which denotes both the number of rows in a
TB and the payload length (without UXP header, see 36) of an TB and the payload length (without UXP header, see 36) of an
RTP packet in octets. RTP packet in octets.
22.) Unequal erasure protection (UXP): denotes a specific 19.) Unequal erasure protection (UXP): denotes a specific
strategy which varies the level of erasure protection across strategy which varies the level of erasure protection across
a TB according to a given redundancy profile. a TB according to a given redundancy profile.
20.) Equal erasure protection (EXP): is a subset of UXP, for
23.) Equal erasure protection (EXP): is a subset of UXP, for
which the level of erasure protection is kept constant which the level of erasure protection is kept constant
across a TB. across a TB.
24.) Redundancy profile: describes the size of the different 21.) Redundancy profile: describes the size of the different
erasure protection classes in a TB, i.e. the number of rows erasure protection classes in a TB, i.e. the number of rows
(codewords) per class. (codewords) per class.
25.) Erasure protection class: contains a set of rows (codewords) 22.) Erasure protection class: contains a set of rows (codewords)
of the TB with same erasure correction capability. of the TB with same erasure correction capability.
26.) i: is a variable, which denotes the number of parity 23.) i: is a variable, which denotes the number of parity
symbols for each row in erasure protection class i. symbols for each row in erasure protection class i.
27.) EPC_i: is a variable, which denotes the set of rows 24.) EPC_i: is a variable, which denotes the set of rows
contained in erasure protection class i. contained in erasure protection class i.
28.) R_i: is a variable, which denotes the total number of rows
25.) R_i: is a variable, which denotes the total number of rows
contained in erasure protection class i, i.e. the contained in erasure protection class i, i.e. the
cardinality of EPC_i. cardinality of EPC_i.
29.) T: is a variable, which denotes the number of parity 26.) T: is a variable, which denotes the number of parity
symbols for each row in the highest erasure protection class symbols for each row in the highest erasure protection class
(with respect to application data) in a TB. (with respect to application data) in a TB.
30.) EPV: denotes the erasure protection vector of length (T+1) 27.) EPV: denotes the erasure protection vector of length (T+1)
used to describe a certain redundancy profile. used to describe a certain redundancy profile.
31.) DP: descriptor used for in-band signaling of the erasure 28.) DP: descriptor used for in-band signaling of the erasure
protection vector. protection vector.
32.) SI: stuffing indicator, which contains the number of media 29.) SI: stuffing indicator, which contains the number of media
stuffing symbols at the end of a data TSB (see 34). stuffing symbols at the end of a data TSB (see 34).
33.) Descriptor Stuffing: insertion of otherwise unused 30.) Descriptor Stuffing: insertion of otherwise unused
descriptor values (i.e. 0x00) at the end of the signaling descriptor values (i.e. 0x00) at the end of the signaling
TSB. Descriptor stuffing is performed, if the final sequence TSB. Descriptor stuffing is performed, if the final sequence
of descriptors and stuffing indicators for a valid of descriptors and stuffing indicators for a valid
redundancy profile is shorter than the space initially redundancy profile is shorter than the space initially
reserved for it in the signaling TSB. reserved for it in the signaling TSB.
34.) Media Stuffing: insertion of additional symbols at the end 31.) Media Stuffing: insertion of additional symbols at the end
of a data TSB. Media stuffing is performed, if the info of a data TSB. Media stuffing is performed, if the info
stream (see 17) is shorter than the space reserved for it in stream (see 17) is shorter than the space reserved for it in
the data TSB for a desired redundancy profile. Since the the data TSB for a desired redundancy profile. Since the
number of stuffing symbols is signaled in the respective SI, number of stuffing symbols is signaled in the respective SI,
any octet value may be used (e.g. 0x00). any octet value may be used (e.g. 0x00).
35.) Interleaver: performs the spreading of a codeword, i.e. a 32.) Interleaver: performs the spreading of a codeword, i.e. a
row in the TB, over n successive packets, such that the row in the TB, over n successive packets, such that the
probability of an erasure burst in a codeword is kept small. probability of an erasure burst in a codeword is kept small.
36.) UXP header: is the additional header information contained 33.) UXP header: is the additional header information contained
in each RTP packet after UXP has been applied. It is always in each RTP packet after UXP has been applied. It is always
present at the start of the payload section of an RTP present at the start of the payload section of an RTP
packet. packet.
37.) X: denotes a currently not used extension field of 1 bit in 34.) X: denotes a currently not used extension field of 1 bit in
the UXP header. the UXP header.
38.) P: is a variable which denotes the number of parity symbols 35.) P: is a variable which denotes the number of parity symbols
per row used to protect the inband signaling of the per row used to protect the inband signaling of the
redundancy profile. redundancy profile.
39.) ceil(.): denotes the ceiling function, i.e. rounding up to 36.) ceil(.): denotes the ceiling function, i.e. rounding up to
the next integer. the next integer.
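The relation between n, the parity count i of a class, and R_i can be illustrated with a small calculation. The following Python sketch is only an illustration under stated assumptions: it assumes a redundancy profile is handed over as a list whose entry i holds R_i (the number of rows with i parity symbols), and the function name info_capacity is hypothetical and not part of the payload format.

```python
def info_capacity(n, rows_per_class):
    """Return the number of info octets that fit into the described rows.

    n              -- number of columns of the TB (codeword length)
    rows_per_class -- assumed representation of a redundancy profile:
                      rows_per_class[i] is R_i, the number of rows
                      (codewords) that carry i parity symbols
    """
    if any(r < 0 for r in rows_per_class):
        raise ValueError("row counts must be non-negative")
    if len(rows_per_class) - 1 > n:
        raise ValueError("a row cannot hold more than n parity symbols")
    # A row in class i has i parity symbols, hence n - i info symbols.
    return sum(r * (n - i) for i, r in enumerate(rows_per_class))


# Example: n = 10 columns, R_1 = 2 rows and R_3 = 3 rows.
profile = [0, 2, 0, 3]
print(sum(profile), info_capacity(10, profile))
```

In this example the five rows offer 2*9 + 3*7 = 39 info octets; a shorter info stream would have to be padded by media stuffing as defined above.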
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
RFC-2119. RFC-2119.
3. Reed-Solomon Codes 3. Reed-Solomon Codes
Reed-Solomon (RS) codes are a special class of linear nonbinary Reed-Solomon (RS) codes are a special class of linear nonbinary
block codes, which are known to offer maximum erasure correction block codes, which are known to offer maximum erasure correction
capability with minimum amount of redundancy. capability with minimum amount of redundancy.
An arbitrary t-erasure-correcting (n,k) RS code defined over An arbitrary t-erasure-correcting (n,k) RS code defined over
Galois field GF(q) has the following parameters [3]: Galois field GF(q) has the following parameters [3]:
- Block length: n=q-1 - Block length: n=q-1
- No. of information symbols in a codeword: k - No. of information symbols in a codeword: k
- No. of parity-check symbols in a codeword: n-k=t - No. of parity-check symbols in a codeword: n-k=t
- Minimum distance: d=t+1 - Minimum distance: d=t+1
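The parameter relations listed above can be checked with a few lines of Python. This is a minimal sketch, not an RS encoder or decoder: the class name RSCode and its method names are hypothetical, and it only encodes the relations n=q-1, k=n-t, d=t+1 together with the fact that a t-erasure-correcting code tolerates at most t erased symbols per codeword.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RSCode:
    q: int  # size of the Galois field GF(q)
    t: int  # number of parity-check symbols per codeword

    def __post_init__(self):
        # At least one information symbol must remain (k >= 1).
        if not 0 <= self.t < self.q - 1:
            raise ValueError("t must satisfy 0 <= t < q-1")

    @property
    def n(self):
        return self.q - 1          # block length

    @property
    def k(self):
        return self.n - self.t     # information symbols

    @property
    def d(self):
        return self.t + 1          # minimum distance

    def can_recover(self, erasures):
        # Up to t erased symbols per codeword can be corrected.
        return 0 <= erasures <= self.t


code = RSCode(q=256, t=4)
print(code.n, code.k, code.d, code.can_recover(4), code.can_recover(5))
```

For GF(256) and t=4 this yields n=255, k=251, d=5; four erasures are recoverable, five are not.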
skipping to change at page 6, line 45 skipping to change at page 6, line 37
+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+
<------------><---> <------------><--->
k=n-t t k=n-t t
(&:info) (*:parity) (&:info) (*:parity)
Fig. 1: Structure of a systematic RS codeword Fig. 1: Structure of a systematic RS codeword
4. Progressive Source Coding 4. Progressive Source Coding
The output of an encoder for a specific media type, e.g. H.263 or The output of an encoder for a specific media type, e.g. H.263 or
MPEG-4-video is said to be a media stream. If the media stream MPEG-4 Visual is said to be a media stream. If the media stream
consists of several distinct elements, which are of different consists of several distinct elements, which are of different
importance with respect to the quality of the reconstruction importance with respect to the quality of the decoding process at
process at the receiver, then the media stream is progressive. the receiver, then the media stream is progressive. The
The progressive media stream is often organized in separate progressive media stream is often organized in separate layers.
layers. Hence, there exists at least one layer, often called base Hence, there exists at least one layer, often called base layer,
layer, without which reconstruction fails at all, whereas all the without which decoding fails at all, whereas all the other
other layers, often called enhancement layers, just help to layers, often called enhancement layers, just help to continually
continually improve the quality. Consequently, the different improve the quality. Consequently, the different layers are
layers are usually contained in the (source-)encoded media stream usually contained in the (source-)encoded media stream in
in decreasing order of importance, i.e. the base layer data is decreasing order of importance, i.e. the base layer data is
followed by the various enhancement layers. followed by the various enhancement layers.
An example can be found in the fine granular scalability modes An example can be found in the fine granular scalability modes
which have been proposed to various standardization bodies like which have been proposed to various standardization bodies like
MPEG-4, where the resolution of the scaling process in the MPEG, where the resolution of the scaling process in the
progressive source encoder is as low as one symbol in the progressive source encoder is as low as one symbol in the
enhancement layer [4]. Another example is given by data enhancement layer [4]. Another example is given by data
partitioning which can be applied to the ITU/MPEG H.26L standard partitioning which can be applied to the ITU/MPEG H.26L standard
[5], MPEG-4, and H.263++. Also, the existence of I, P, and B [5], MPEG-4, and H.263++. Also, the existence of I, P, and B
frames in streams which comply with standards like MPEG-2 can be frames in streams which comply with standards like MPEG-2 can be
interpreted as progressive. interpreted as progressive.
From the above definition, it is quite obvious that the most From the above definition, it is quite obvious that the most
important base layer data must be protected as strongly as important base layer data must be protected as strongly as
possible against packet loss during transmission. However, the possible against packet loss during transmission. However, the
protection of the enhancement layers could be continually protection of the enhancement layers can be continually lowered,
lowered, since a loss at this stage has only minor consequences since a loss at these stages has only minor consequences for the
for the reconstruction process. Thus, by using a suitable unequal decoding process. Thus, by using a suitable unequal erasure
erasure protection strategy across a progressive media stream, protection strategy across a progressive media stream, the
the overhead due to redundancy is reduced. Furthermore, if overhead due to redundancy is reduced. Furthermore, if channel
channel conditions get worse during transmission, only more and conditions get worse during transmission, only more and more
more enhancement layers are lost, i.e. a graceful degradation in enhancement layers are lost, i.e. a graceful degradation in
application quality at the receiver is achieved [6]. application quality at the receiver is achieved [6].
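The graceful degradation described above can be made concrete with a short Python sketch. It rests on assumptions that are labeled here and in the code: every lost packet erases exactly one symbol in each codeword (row), a row with i parity symbols survives up to i erasures, and the redundancy profile is given as a list whose entry i is the number of rows with i parity symbols. The function name recoverable_rows is hypothetical.

```python
def recoverable_rows(rows_per_class, lost_packets):
    """Map each non-empty protection class to its row count if it survives.

    Assumption: rows_per_class[i] is the number of rows carrying i parity
    symbols, and each lost packet erases one symbol in every row.
    """
    if lost_packets < 0:
        raise ValueError("lost_packets must be non-negative")
    return {
        i: rows
        for i, rows in enumerate(rows_per_class)
        if rows > 0 and i >= lost_packets
    }


# Strong protection (5 parity symbols) for the base layer,
# weaker protection (3 and 1) for the enhancement layers.
profile = [0, 2, 0, 3, 0, 1]
for lost in (0, 2, 4, 6):
    print(lost, recoverable_rows(profile, lost))
```

With this profile, two lost packets cost only the weakest class, four lost packets leave just the most strongly protected row, and six lost packets leave nothing, so quality drops step by step rather than all at once.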
Nevertheless, it should be mentioned that the specific structure Nevertheless, it should be mentioned that the specific structure
of the media stream strongly depends on the actual media codec in of the media stream strongly depends on the actual media codec in
use and does not always provide suitable mechanisms for transport use and does not always provide suitable mechanisms for transport
over data networks, like framing (see also 6.3 ). In order to over data networks, like framing (see also Sect. 6.3 ). In order
keep the description of the unequal erasure protection strategy to keep the description of the unequal erasure protection
in section 5 as general as possible, the final bitstream which strategy in Sect. 5 as general as possible, the final bitstream
has to be protected by the proposed UXP scheme will be called which has to be protected by the proposed UXP scheme will be
"info stream" in the following. Furthermore, it is assumed that called "info stream" in the following. Furthermore, it is assumed
every info stream is already octet-aligned according to the that every info stream is already octet-aligned according to the
standard procedures defined in the context of the used syntax standard procedures defined in the context of the used syntax
specifications. specifications.
5. General Structure of UXP schemes 5. General Structure of UXP Schemes
In this section, the principal features of the proposed UXP In this section, the principal features of the proposed UXP
scheme are described with a special focus on the protection and scheme are described with a special focus on the protection and
reconstruction procedure which is applied to the info stream. In reconstruction procedure which is applied to the info stream. In
addition, the behavior of the sender and receiver is specified as addition, the behavior of the sender and receiver is specified as
far as it concerns the reconstruction of the info stream. far as it concerns the reconstruction of the info stream.
However, the complete UXP payload structure, including the However, the complete UXP payload structure, including the
additional UXP header, is described in section 6. additional UXP header, is described in Sect. 6.
The reason for using the term "info stream" as well as the The reason for using the term "info stream" as well as the
details of the construction are described in Section 6.3 . For details of the construction are described in Sect. 6.3 . For now,
now, we assume that we have an info stream which has to be we assume that we have an info stream which has to be protected.
protected.
Fig. 1 already illustrated the structure of a systematic Fig. 1 already illustrated the structure of a systematic RS
codeword, which shall be represented by a single row with n codeword, which shall be represented by a single row with n
successive symbols that contain the information and the parity successive symbols that contain the information and the parity
octets. This structure shall now be extended by forming a octets. This structure shall now be extended by forming a
transmission block (TB) consisting of L codewords of length n transmission block (TB) consisting of L codewords of length n
octets each, which amounts to a total of L rows and n columns octets each, which amounts to a total of L rows and n columns
[7]: Each column, together with the respective UXP header in [7]: Each column, together with the respective UXP header in
front, shall represent the payload of an RTP packet, i.e. the front, shall represent the payload of an RTP packet, i.e. the
whole data of a TB is transmitted via a sequence of n RTP packets whole data of a TB is transmitted via a sequence of n RTP packets
all carrying a payload of length (L+2) octets (UXP header all carrying a payload of length (L+2) octets (UXP header
included). included).
Each TB usually consists of two or more horizontal sub blocks, Each TB usually consists of two or more horizontal sub blocks,
the so-called transmission sub blocks (TSB), as can be seen in the so-called transmission sub blocks (TSB), as can be seen in
Fig. : The first L_s rows always belong to the signaling TSB, Fig. 2: The first L_s rows always belong to the signaling TSB,
which is used to convey the actual redundancy profile in the data which is used to convey the actual redundancy profile in the data
part to the receiver (see 6.4.). The following L_d=(L-L_s) rows part to the receiver (see 6.4.). The following L_d=(L-L_s) rows
belong to one or more data TSBs, which contain the interleaved belong to one or more data TSBs, which contain the interleaved
and RS encoded info stream, as will be described below. and RS encoded info stream, as will be described below.
Transmission Block (TB) Transmission Block (TB)
/\ +-+-+-+-+-+-+-+-+-+ /\ /\ +-+-+-+-+-+-+-+-+-+ /\
| | signaling TSB | | L_s octets | | signaling TSB | | L_s octets
| +-+-+-+-+-+-+-+-+-+ \/ | +-+-+-+-+-+-+-+-+-+ \/
| | | /\ /\ | | | /\ /\
| + data TSB #1 + | L_d(1) octets | | + data TSB #1 + | L_d(1) octets |
| | | | | | | | | |
| +-+-+-+-+-+-+-+-+-+ \/ | | +-+-+-+-+-+-+-+-+-+ \/ |
L octets | | | /\ | L octets | | | /\ |
payload | + data TSB #2 + | L_d(2) octets | payload | + data TSB #2 + | L_d(2) octets |
per packet | + | | | L_d per packet | + | | | L_d oct.
octets
| +-+-+-+-+-+-+-+-+-+ \/ | | +-+-+-+-+-+-+-+-+-+ \/ |
| | . | . | | | . | . |
| + . + . | | + . + . |
| | . | . | | | . | . |
| +-+-+-+-+-+-+-+-+-+ /\ | | +-+-+-+-+-+-+-+-+-+ /\ |
| | data TSB #z | | L_d(z) octets | | | data TSB #z | | L_d(z) octets |
\/ +-+-+-+-+-+-+-+-+-+ \/ \/ \/ +-+-+-+-+-+-+-+-+-+ \/ \/
<-----------------> <----------------->
n packets n packets
Fig. 2: General structure of a TB Fig. 2: General structure of a TB
Since the UXP procedure is mainly applied to the data TSBs, it Since the UXP procedure is mainly applied to the data TSBs, it
will be described next, whereas the content and syntax of the will be described next, whereas the content and syntax of the
signaling TSB will be defined in section 6.4. signaling TSB will be defined in section 6.4.
For simplicity, only a single data TSB will be For simplicity, only a single data TSB will be
assumed throughout the following explanation of the encoding and assumed throughout the following explanation of the encoding and
decoding procedure. However, an extension to more than one data decoding procedure. However, an extension to more than one data
TSB per TB is straightforward, and will be shown in section 6.5. TSB per TB is straightforward, and will be shown in section 6.5.
As depicted in Fig. 3, the rows of a transmission sub block shall As depicted in Fig. 3, the rows of a transmission sub block shall
be partitioned into T+1 different classes EPC_i, where i=0...T, be partitioned into T+1 different classes EPC_i, where i=0...T,
such that each class contains exactly R_i=|EPC_i| consecutive such that each class contains exactly R_i=|EPC_i| consecutive
rows of the matrix, where the R_i have to satisfy the following rows of the matrix, where the R_i have to satisfy the following
relationship: relationship:
A_0+A_1+...+A_T=L_d R_0+R_1+...+R_T=L_d
Data Transmission Sub Block (data TSB) Data Transmission Sub Block (data TSB)
T T
<-------> <------->
/\ +-+-+-+-+-+-+-+-+-+ /\ /\ +-+-+-+-+-+-+-+-+-+ /\
| |&|&|&|&|&|*|*|*|*| | | |&|&|&|&|&|*|*|*|*| |
| +-+-+-+-+-+-+-+-+-+ | A_T=3 | +-+-+-+-+-+-+-+-+-+ | R_T=3
| |&|&|&|&|&|*|*|*|*| | | |&|&|&|&|&|*|*|*|*| |
| +-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+ |
L_d octets | |&|&|&|&|&|*|*|*|*| \/ L_d octets | |&|&|&|&|&|*|*|*|*| \/
per packet | +-+-+-+-+-+-+-+-+-+ /\ per packet | +-+-+-+-+-+-+-+-+-+ /\
| |%|%|%|%|%|%|*|*|*| | A_(T-1)=1 | |%|%|%|%|%|%|*|*|*| | R_(T-1)=1
| +-+-+-+-+-+-+-+-+-+ \/ | +-+-+-+-+-+-+-+-+-+ \/
| |$|$|$|$|$|$|$|*|*| . | |$|$|$|$|$|$|$|*|*| .
| +-+-+-+-+-+-+-+-+-+ . | +-+-+-+-+-+-+-+-+-+ .
| |!|!|!|!|!|!|!|!|*| . | |!|!|!|!|!|!|!|!|*| .
| +-+-+-+-+-+-+-+-+-+ /\ | +-+-+-+-+-+-+-+-+-+ /\
| |#|#|#|#|#|#|#|#|#| | A_0=1 | |#|#|#|#|#|#|#|#|#| | R_0=1
\/ +-+-+-+-+-+-+-+-+-+ \/ \/ +-+-+-+-+-+-+-+-+-+ \/
<-----------------> <----------------->
n packets n packets
&,%,$,!,# : info octets belonging to a certain info stream in &,%,$,!,# : info octets belonging to a certain info stream in
decreasing order of importance decreasing order of importance
* : parity octets gained from Reed-Solomon coding * : parity octets gained from Reed-Solomon coding
Fig. 3: General structure for coding with unequal erasure Fig. 3: General structure for coding with unequal erasure
protection protection
Furthermore, all rows in a particular class EPC_i shall contain Furthermore, all rows in a particular class EPC_i shall contain
exactly the same number of parity octets, which is equal to the exactly the same number of parity octets, which is equal to the
index i of the class. For each row in a certain class EPC_i, the index i of the class. For each row in a certain class EPC_i, the
same (n,n-i) RS code shall be applied. same (n,n-i) RS code shall be applied.
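The row layout implied by these rules can be illustrated with a minimal sketch. The function names are hypothetical and not part of the draft; the example values are read off Fig. 3 as drawn (n=9, T=4, one row in each of the classes EPC_0 to EPC_3, three rows in EPC_4), which is an assumption about the rows hidden behind the dots.

```python
def row_parity_counts(r):
    """Parity octets per row, top to bottom (class EPC_T occupies the top rows).

    r[i] is R_i, the number of rows in class EPC_i, for i = 0...T.
    """
    T = len(r) - 1
    counts = []
    for i in range(T, -1, -1):
        counts.extend([i] * r[i])  # every row of EPC_i carries i parity octets
    return counts


def info_capacity(r, n):
    """Number of info octet positions in a data TSB with n columns."""
    # A row of class EPC_i is a codeword of an (n, n-i) RS code,
    # so it holds n-i info octets.
    return sum(r_i * (n - i) for i, r_i in enumerate(r))


# Values assumed from Fig. 3: R_0=1, R_1=1, R_2=1, R_3=1, R_4=3, n=9
r = (1, 1, 1, 1, 3)
n = 9
L_d = sum(r)                     # R_0+R_1+...+R_T = L_d
print(L_d, row_parity_counts(r), info_capacity(r, n))
```

For these values the data TSB has L_d=7 rows, of which 45 octet positions carry info octets and 18 carry parity octets.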
As can be observed from Fig. 3, class EPC_T contains the largest As can be observed from Fig. 3, class EPC_T contains the largest
number of parity octets per row, i.e. offers the highest erasure number of parity octets per row, i.e. offers the highest erasure
protection capability in the block. Consequently, the most protection capability in the block. Consequently, the most
important element in the info stream must be assigned to class important element in the info stream must be assigned to class
EPC_T, where the value of T should be chosen according to the EPC_T, where the value of T should be chosen according to the
desired outage threshold of the application given a certain desired outage threshold of the application given a certain
skipping to change at page 11, line 21 skipping to change at page 10, line 45
for each of them (quality-of-service requirement). for each of them (quality-of-service requirement).
5.) Any suitable optimization algorithm may be used for deriving 5.) Any suitable optimization algorithm may be used for deriving
an adequate redundancy profile. However, the result has to an adequate redundancy profile. However, the result has to
satisfy the following constraints: satisfy the following constraints:
a) All available info octet positions in the data TSB have to be a) All available info octet positions in the data TSB have to be
completely filled. If the info stream is too short for a desired completely filled. If the info stream is too short for a desired
profile, media stuffing may be applied to the empty info octet profile, media stuffing may be applied to the empty info octet
positions at the end of the data TSB by appending a sufficient positions at the end of the data TSB by appending a sufficient
number of octets (with arbitrary value, e.g. 0x00). The actual number of octets (with arbitrary value, e.g. 0x00). The actual
number of stuffing symbols per data TSB is then signaled via the number of stuffing symbols per data TSB is then signaled via the
respective stuffing indicator (see 6.4.). However, before respective stuffing indicator (see Sect. 6.4.). However, before
resorting to any stuffing, it should be checked whether it is resorting to any stuffing, it should be checked whether it is
possible to strengthen the protection of certain rows instead, possible to strengthen the protection of certain rows instead,
thus improving the overall robustness of the decoding process. thus improving the overall robustness of the decoding process.
b) The info stream should be fully contained within the data TSB b) The info stream should be fully contained within the data TSB
(unless cutting it off at a specific point is explicitly allowed (unless cutting it off at a specific point is explicitly allowed
by the properties of the used media codec). by the properties of the used media codec).
c) The number of required descriptors and stuffing indicators c) The number of required descriptors and stuffing indicators
(see section 6.4.) to signal the profile shall not exceed the (see section 6.4.) to signal the profile shall not exceed the
space initially reserved for them in the signaling TSB. space initially reserved for them in the signaling TSB.
Constraints a) and b) should be already incorporated in the Constraints a) and b) should be already incorporated in the
optimization algorithm. However, if constraint c) is not met, the optimization algorithm. However, if constraint c) is not met, the
data TSB has to be reduced by one row in favor of the signaling data TSB has to be reduced by one row in favor of the signaling
TSB to accommodate more space for the descriptors and stuffing TSB to accommodate more space for the descriptors and stuffing
indicators, i.e. steps 2-5 have to be repeated until a valid indicators, i.e. steps 2-5 have to be repeated until a valid
redundancy profile has been obtained. redundancy profile has been obtained.
6.) For each nonempty class EPC_i, i=T...0, in the data TSB, the 6.) For each nonempty class EPC_i, i=T...0, in the data TSB, the
following steps have to be performed: following steps have to be performed:
a) All rows of this specific class shall be filled from left to a) All rows of this specific class shall be filled from left to
right and top to bottom with data octets of the info stream in right and top to bottom with data octets of the info stream.
decreasing order of importance (i.e. starting with the most
important element).
b) For each row in the class, the required i parity-check octets b) For each row in the class, the required i parity-check octets
are computed from the same set of codewords of an (n,n-i) RS are computed from the same set of codewords of an (n,n-i) RS
code, and filled in the empty positions at the end of each row. code, and filled in the empty positions at the end of each row.
Thus, every row in the class constitutes a valid codeword of the Thus, every row in the class constitutes a valid codeword of the
chosen RS code. chosen RS code.
7.) After having filled the whole data TSB with information and 7.) After having filled the whole data TSB with information and
parity octets, the redundancy profile is mapped to the signaling parity octets, the redundancy profile is mapped to the signaling
TSB as described in section 6.4. TSB as described in section 6.4.
8.) Each column of the resulting TB is now read out octet-wise 8.) Each column of the resulting TB is now read out octet-wise
from top to bottom and, together with the respective UXP header from top to bottom and, together with the respective UXP header
(see section 6.2.) in front, is mapped onto the payload section (see section 6.2.) in front, is mapped onto the payload section
of one and only one RTP packet. of one and only one RTP packet.
9.) The n resulting RTP packets shall be transmitted subsequently 9.) The n resulting RTP packets shall be transmitted
to the remote host, starting with the leftmost one. consecutively to the remote host, starting with the leftmost one.
10.) At the corresponding protocol entity at the remote host, the 10.) At the corresponding protocol entity at the remote host, the
payload (without the UXP header) of all successfully received RTP payload (without the UXP header) of all successfully received RTP
packets belonging to the same sending TB shall be filled into a packets belonging to the same sending TB shall be filled into a
similar receiving TB column-wise from top to bottom and left to similar receiving TB column-wise from top to bottom and left to
right. right.
11.) For every erased packet of a received TB, the respective 11.) For every erased packet of a received TB, the respective
column in the TB shall be filled with a suitable erasure marker. column in the TB shall be filled with a suitable erasure marker.
12.) Before any other operations can be performed, the redundancy 12.) Before any other operations can be performed, the redundancy
profile has to be restored from the signaling TSB according to profile has to be restored from the signaling TSB according to
the procedure defined in section 6.4.. If the attempt fails the procedure defined in Sect. 6.4.. If the attempt fails because
because of too many lost packets, the whole TB shall be discarded of too many lost packets, the whole TB shall be discarded and the
and the receiving entity should wait for the next incoming TB. receiving entity should wait for the next incoming TB.
13.) If the attempt to recover the redundancy profile has been 13.) If the attempt to recover the redundancy profile has been
successful, a decoding operation shall be performed for each row successful, a decoding operation shall be performed for each row
of the data TSB by applying any suitable algorithm for erasure of the data TSB by applying any suitable algorithm for erasure
decoding. decoding.
14.) For all rows of the data TSB for which the decoding 14.) For all rows of the data TSB for which the decoding
operation has been successful, the reconstructed data octets are operation has been successful, the reconstructed data octets are
read out from left to right and top to bottom, and appended to read out from left to right and top to bottom, and appended to
the reconstructed version of the info stream. the reconstructed version of the info stream.
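The column-wise read-out at the sender (steps 8 and 9) and the column-wise fill at the receiver (steps 10 and 11) can be sketched as follows. This is only an illustration: the function names and the use of None as erasure marker are assumptions, the UXP header is left out, and RS encoding and decoding are not shown.

```python
ERASED = None  # hypothetical erasure marker; the draft only requires "a suitable" one


def tb_to_payloads(tb):
    """Sender side: read each column of the TB top to bottom.

    tb is a list of L rows, each a sequence of n octets. The result is a
    list of n payloads of L octets each, leftmost column first.
    """
    n = len(tb[0])
    return [bytes(row[c] for row in tb) for c in range(n)]


def payloads_to_tb(payloads, L):
    """Receiver side: rebuild the TB column by column.

    payloads[c] is the payload of column c, or None if that packet was lost.
    Lost columns are filled with the erasure marker.
    """
    n = len(payloads)
    tb = [[ERASED] * n for _ in range(L)]
    for c, column in enumerate(payloads):
        if column is None:
            continue
        for r in range(L):
            tb[r][c] = column[r]
    return tb


tb = [[1, 2, 3],
      [4, 5, 6]]
sent = tb_to_payloads(tb)
received = [sent[0], None, sent[2]]      # the middle packet is lost
print(payloads_to_tb(received, L=2))
```

Every row of the rebuilt TB shows the erasure in the same column, which is the interleaving property the encoding rules rely on.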
One can easily realize that the above rules describe an One can easily realize that the above rules describe an
interleaver, i.e. at the sender a single codeword of a TB is interleaver, i.e. at the sender a single codeword of a TB is
spread out over n successive packets. Thus, each codeword of a spread out over n successive packets. Thus, each codeword of a
transmitted TB experiences the same number of erasures at exactly transmitted TB experiences the same number of erasures at exactly
the same positions. the same positions.
Two important conclusions can be drawn from this: Two important conclusions can be drawn from this:
a) Since the same RS code is applied to all rows contained in a a) Since the same RS code is applied to all rows contained in a
specific class, either all of them can be correctly decoded or specific class, either all of them can be correctly decoded or
not. Hence, there exist no partly decodable classes at the none. Hence, there exist no partly decodable classes at the
receiver. receiver.
b) If decoding is successful for a certain class EPC_i, all the b) If decoding is successful for a certain class EPC_i, all the
classes EPC_(i+1)...EPC_T can also be decoded, since they are classes EPC_(i+1)...EPC_T can also be decoded, since they are
protected by at least one more parity octet per row. Together protected by at least one more parity octet per row. Together
with rule 6, it is therefore always ensured that in case a with rule 6, it is therefore always ensured that in case a
decodable enhancement layer exists, all other layers it depends decodable enhancement layer exists, all other layers it depends
on can also be reconstructed! on can also be reconstructed!
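Both conclusions follow from the fact that an (n,n-i) RS code can recover a codeword as long as no more than i of its n octets are erased. A minimal sketch, with a hypothetical function name, of which classes remain decodable after a given number of lost packets:

```python
def decodable_classes(T, lost_packets):
    """Indices i of the classes EPC_i that can still be decoded.

    Every row sees the same lost_packets erasures, and a row of EPC_i
    tolerates at most i erasures, so EPC_i is decodable iff i >= lost_packets.
    """
    return [i for i in range(T + 1) if i >= lost_packets]


print(decodable_classes(T=4, lost_packets=2))
```

The result is always a contiguous range ending at T (or empty), so a decodable class never depends on a more strongly protected class that has been lost.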
Given the maximum erasure protection value T, the redundancy Given the maximum erasure protection value T, the redundancy
profile for a data TSB of size (L_d x n) shall be denoted by a profile for a data TSB of size (L_d x n) shall be denoted by a
so-called erasure protection vector EPV of length (T+1), where so-called erasure protection vector EPV of length (T+1), where
EPV:=(A_0,A_1,...,A_(T-1),A_T) EPV:=(R_0,R_1,...,R_(T-1),R_T)
From the above definition, it is easy to realize that the trivial From the above definition, it is easy to realize that the trivial
cases of no erasure protection and EXP are a subset of UXP: cases of no erasure protection and EXP are a subset of UXP:
a) no erasure protection at all: all application data is mapped a) no erasure protection at all: all application data is mapped
onto onto
class EPC_0, i.e. EPV=(L_d,0,0,...,0). class EPC_0, i.e. EPV=(L_d,0,0,...,0).
b) EXP: all application data is mapped onto class EPC_T, i.e. b) EXP: all application data is mapped onto class EPC_T, i.e.
EPV=(0,0,...,0,A_T=L_d). EPV=(0,0,...,0,R_T=L_d).
Hence, backward compatibility to currently standardized non- Hence, backward compatibility to currently standardized non-
progressive multimedia codecs is definitely achieved. progressive multimedia codecs is definitely achieved.
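The two trivial profiles can be written out as a short sketch. The helper names are hypothetical, and the vectors use the order (R_0,...,R_T) of the EPV definition above.

```python
def epv_no_protection(L_d, T):
    """All L_d rows in class EPC_0: EPV = (L_d, 0, ..., 0)."""
    return (L_d,) + (0,) * T


def epv_exp(L_d, T):
    """All L_d rows in class EPC_T: EPV = (0, ..., 0, L_d)."""
    return (0,) * T + (L_d,)


def parity_octets(epv):
    """Total number of parity octets in the data TSB for a given EPV."""
    return sum(i * r_i for i, r_i in enumerate(epv))


print(epv_no_protection(7, 4), parity_octets(epv_no_protection(7, 4)))
print(epv_exp(7, 4), parity_octets(epv_exp(7, 4)))
```

Any other vector whose entries sum to L_d lies between these two extremes in terms of redundancy.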
6. RTP payload structure 6. RTP payload structure
This section is organized as follows. First, the specific This section is organized as follows. First, the specific
settings in the RTP header is shown. Next, the RTP payload header settings in the RTP header are shown. Next, the RTP payload
for UXP (the so-called UXP header) is specified. After that, the header for UXP (the so-called UXP header) is specified. After
structure of the bitstream which is protected by UXP, the so- that, the structure of the bitstream which is protected by UXP,
called info stream, is discussed. Finally, the in-band signaling the so-called info stream, is discussed. Finally, the in-band
of the erasure protection vector is introduced signaling of the erasure protection vector is introduced.
For every packet, the UXP payload is formed by reading out a For every packet, the UXP payload is formed by reading out a
column of the TB and prefixing it with the UXP header. Thus, an column of the TB and prefixing it with the UXP header. Thus, an
UXP-compliant RTP packet looks as follows: UXP-compliant RTP packet looks as follows:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
|RTP Header| UXP Header| one column of the TB | |RTP Header| UXP Header| one column of the TB |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
6.1 Specific settings in the RTP header 6.1 Specific Settings in the RTP Header
The timestamp of each RTP packet is set to the sampling time of The timestamp of each RTP packet is set to the sampling time of
the first octet of the progressive media stream in the the first octet of the progressive media stream in the
corresponding TB. If several data TSBs are included in one TB, corresponding TB. If several data TSBs are included in one TB,
the sampling time of data TSB #1 is relevant. This results in the the sampling time of data TSB #1 is relevant. This results in the
TS value being the same for all RTP packets belonging to a TS value being the same for all RTP packets belonging to a
specific TB. specific TB.
The payload type is of dynamic type, and obtained through out-of- The payload type is of dynamic type, and obtained through out-of-
band signaling similar to [1]. End systems, which cannot band signaling similar to [1]. End systems, which cannot
recognize a payload type, must discard it. recognize a payload type, must discard it.
The marker bit is set to 1 for every last packet in a TB. The marker bit is set to 1 in the last packet of a TB; otherwise,
Otherwise, its value is 0. its value is 0.
All other fields in the RTP header are set to those values All other fields in the RTP header are set to those values
proposed for regular multimedia transmission using the RTP-format proposed for regular multimedia transmission using the RTP-format
of the media stream which is protected by UXP. of the media stream which is protected by UXP, e.g. for MPEG-4
Visual as specified in RFC 3016.
6.2. Structure of the UXP header 6.2. Structure of the UXP Header
The UXP header shall consist of 2 octets, and is shown in Fig. 4: The UXP header shall consist of 2 octets, and is shown in Fig. 4:
0 1 1 1 1 1 1 0 1 1 1 1 1 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|X| block PT | block length n| |X| block PT | block length n|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Fig. 4: Proposed UXP header Fig. 4: Proposed UXP header
The fields in the header shall be defined as follows: The fields in the UXP header are defined as follows:
- X (bit 0): extension bit, reserved for future enhancements, - X (bit 0): extension bit, reserved for future enhancements,
currently not in use -> default value: 0 currently not in use -> default value: 0
- block PT (bits 1-7): regular RTP payload type to indicate the - block PT (bits 1-7): regular RTP payload type to indicate the
media type contained in the info stream media type contained in the info stream
- block length n (bits 8-15): indicates total number of RTP - block length n (bits 8-15): indicates total number of RTP
packets resulting from one TB (which equals packets resulting from one TB (which equals
the number of columns of the TB) the number of columns of the TB)
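The field layout of Fig. 4 can be illustrated with a minimal sketch for writing and reading the two header octets. The function names are hypothetical, and bit 0 is assumed to be the most significant bit of the first octet, as is usual for header diagrams of this kind.

```python
def pack_uxp_header(block_pt, n, x=0):
    """Build the 2-octet UXP header: X (1 bit), block PT (7 bits), n (8 bits)."""
    if x not in (0, 1):
        raise ValueError("X is a single bit")
    if not 0 <= block_pt <= 0x7F:
        raise ValueError("block PT must fit into 7 bits")
    if not 0 <= n <= 0xFF:
        raise ValueError("block length n must fit into 8 bits")
    return bytes([(x << 7) | block_pt, n])


def parse_uxp_header(payload):
    """Return (X, block PT, n) from the first two octets of a UXP payload."""
    if len(payload) < 2:
        raise ValueError("payload shorter than the UXP header")
    return payload[0] >> 7, payload[0] & 0x7F, payload[1]


header = pack_uxp_header(block_pt=34, n=20)   # arbitrary example values
print(header.hex(), parse_uxp_header(header))
```

The sketch only checks that the values fit into their fields; it does not decide which values of n are permitted by the RS code in use.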
The syntax of the info stream which is protected by UXP is The syntax of the info stream which is protected by UXP is
specified by the RTP payload type field contained in the UXP specified by the RTP payload type field contained in the UXP
header. The details of the info stream are described in Sec. 6.3. header. The details of the info stream are described in Sec. 6.3.
For example, payload type H.263 means that the info stream For example, payload type H.263 means that the info stream
conforms to the specifications of the RTP profile for H.263 and conforms to the specifications of the RTP profile for H.263 and
does not represent the "raw" H.263 media stream produced by an does not represent the "raw" H.263 media stream produced by an
H.263 encoder. H.263 encoder.
However, UXP can also be applied to the "raw" media stream (in However, UXP can also be applied to the "raw" media stream (in
case it is already octet-aligned), if this can be signaled to the case it is already octet-aligned), if this can be signaled to the
receiver via other means, e.g. by use of H.245 or SDP. receiver via other means, e.g. by use of H.245 or SDP.
Based on the RTP sequence number, the marker bit, and the Based on the RTP sequence number, the marker bit, and the
repetition of the block length n in each UXP header, the repetition of the block length n in each UXP header, the
receiving entity is able to recognize both TB boundaries and the receiving entity is able to recognize both TB boundaries and the
actual position of lost packets in the TB. actual position of packets (both received and lost ones) in the
TB.
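One possible way to derive packet positions is sketched below. It assumes that the last packet of the TB (marker bit set to 1) has been received and that the n packets of a TB carry consecutive 16-bit RTP sequence numbers; the function names are hypothetical. If the marker packet itself is lost, a receiver would need other hints, for example the change of the RTP timestamp between TBs.

```python
def column_index(seq, marker_seq, n):
    """Column (0 = leftmost) of the packet with RTP sequence number seq.

    marker_seq is the sequence number of the TB's last packet and n is the
    block length from the UXP header. Arithmetic is modulo 2**16 to cope
    with sequence number wrap-around.
    """
    distance = (marker_seq - seq) & 0xFFFF
    if distance >= n:
        raise ValueError("packet does not belong to this TB")
    return n - 1 - distance


def lost_columns(received_seqs, marker_seq, n):
    """Columns of the TB for which no packet has arrived."""
    present = {column_index(s, marker_seq, n) for s in received_seqs}
    return [c for c in range(n) if c not in present]


# A TB of n=5 packets with sequence numbers 65534, 65535, 0, 1, 2
print(lost_columns([65534, 0, 2], marker_seq=2, n=5))
```

The columns reported as lost are the ones to be filled with erasure markers before decoding.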
6.3 Framing and Timing Mechanism in UXP: The info stream. 6.3 Framing and Timing Mechanism in UXP: The Info Stream
As described in Sect. 5, UXP creates its own packetization scheme
As described in section 5, UXP creates its own packetization by interleaving. The regular framing and timing structure of RTP
scheme by interleaving. The regular framing and timing structure is therefore destroyed. This section describes which kind of
of RTP is therefore destroyed. This section describes which kind problems arise with interleaving and how they can be solved. This
of problems arise with interleaving and how they can be solved. finally leads to the specification of the info stream.
This finally leads to the specification of the info stream.
The timestamp of an RTP packet usually describes the sampling The timestamp of an RTP packet usually describes the sampling
time of the first octet included in the RTP data packet. This is time of the first octet included in the RTP data packet. This is
in principle also true for UXP RTP packets. According to the time in principle also true for UXP RTP packets. According to the time
stamp definition in 6.1 every packet contains the timestamp of stamp definition in Sect. 6.1 every packet contains the
the sampling time of the first octet in the corresponding TB. timestamp of the sampling time of the first octet in the
Therefore, all packets which belong to one TB contain the same corresponding TB. Therefore, all packets which belong to one TB
timestamp. This can lead to problems since due to the theoretical contain the same timestamp. This can lead to problems since due
size limit of a TB, it can contain data from different sampling to the theoretical size limit of a TB (the limit for the number
of columns is 256, and the limit for the number of rows is the
maximum packet size), it can contain data from different sampling
time instances, e.g. several video frames. Then the timing time instances, e.g. several video frames. Then the timing
information of the later frames has to be determined from the information of the later frames has to be determined from the
media stream itself and not from the RTP timestamp. media stream itself and not from the RTP timestamp.
A second problem arising with interleaving is that the framing A second problem arising with interleaving is that the framing
mechanism of RTP is not supported. Consider a media encoder, mechanism of RTP is not supported. Consider a media encoder,
which does not create a fully decodable bitstream, e.g. H.26L which does not create a fully decodable bitstream, e.g. H.26L
with the video coding layer (VCL) and network adaptation layer with the video coding layer (VCL) and network adaptation layer
(NAL) concept [9]. In this concept the VCL creates slices which (NAL) concept [9]. In this concept the VCL creates slices which
are NAL prepared for transmission over several networks at the are prepared for transmission over several networks at the NAL.
NAL. Consequently, in case of RTP transmission, header Consequently, in case of RTP transmission, header information
information which allows to decode the slices is included only in which allows to decode the slices is included only in the RTP
the RTP packets. Thus, to fill an UXP TB with the "raw" media packets. Thus, to fill an UXP TB with the "raw" media stream from
stream from the VCL can lead, even without packet losses, to a the VCL can lead, even without packet losses, to a non-decodable
non-decodable stream. stream.
The framing problem can be solved in two ways: The framing problem can be solved in two ways:
One solution could be to use the RTP payload specification of the One solution could be to use the RTP payload specification of a
media stream to create a bitstream with an appropriate framing, given media stream to create a bitstream with an appropriate
the so-called info stream. For example, to create an H.263 info framing, resulting in the so-called info stream. For example, to
stream, the following steps are necessary: create an H.263 info stream, the following steps are necessary:
1.) Generate an H.263-compliant media stream, i.e. take a slice 1.) Generate an H.263-compliant media stream, i.e. take a slice
or a video frame directly from the H.263 encoder. or a video frame directly from the H.263 encoder.
2.) Apply the H.263 payload specification (e.g. RFC 2429) to 2.) Apply the H.263 payload specification (e.g. RFC 2429) to
create the RTP payload for only one packet. create the RTP payload for only one packet.
3.) Insert the latter row by row into one data TSB. 3.) Insert the latter row by row into one data TSB.
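The row-by-row insertion of step 3 can be sketched as follows, assuming the fill order of the encoding rules in Sect. 5 (class EPC_T first, left to right and top to bottom) and media stuffing with 0x00 at the end. The function name and the return format are hypothetical; the parity octets would be appended to each row by the RS encoder afterwards.

```python
def fill_data_tsb(info, r, n, stuffing=0x00):
    """Distribute the info stream over the rows of one data TSB.

    info : bytes of the info stream (e.g. one RTP payload of the media format)
    r    : (R_0, ..., R_T), number of rows per class
    n    : number of columns of the TB
    Returns (rows, n_stuff), where rows holds only the info octets of each
    row, top to bottom, and n_stuff is the number of stuffing octets added.
    """
    T = len(r) - 1
    capacity = sum(r_i * (n - i) for i, r_i in enumerate(r))
    if len(info) > capacity:
        raise ValueError("info stream does not fit into the data TSB")
    n_stuff = capacity - len(info)
    padded = bytes(info) + bytes([stuffing]) * n_stuff
    rows, pos = [], 0
    for i in range(T, -1, -1):          # strongest class first
        for _ in range(r[i]):
            k = n - i                   # info octets per row of EPC_i
            rows.append(padded[pos:pos + k])
            pos += k
    return rows, n_stuff


rows, n_stuff = fill_data_tsb(bytes(range(10)), r=(1, 1, 1), n=5)
print([list(row) for row in rows], n_stuff)
```

In this example the 10 info octets occupy rows of 3, 4 and 5 info octets, and the last row ends with 2 stuffing octets, whose count would be signaled via the stuffing indicator.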
It is possible to apply the procedure mentioned above several It is possible to apply the procedure mentioned above several
times for different data TSBs (see 6.5.). Due to the in-band times for different data TSBs (see Sect. 6.5.). Due to the in-
signaling, it is possible to determine the beginning and end of band signaling, it is possible to determine the beginning and end
every TSB without parsing the whole TB. This allows a fast of every TSB without parsing the whole TB. This allows a fast
decomposition of the TB into the different TSB. decomposition of the TB into the different TSBs.
Another solution of the framing problem would be to rely on the
Another solution of the framing problem would be to relay on the
framing mechanism of the media stream. This is, for example, framing mechanism of the media stream. This is, for example,
possible for media streams which contain start codes. possible for media streams which contain start codes.
The timing problem can be solved in two ways. The timing problem can be solved in two ways.
One solution is to comply with the RTP payload specification of One solution is to comply with the RTP payload specification of
the media stream. If the specification allows to put into one the media stream. If the specification allows to put into one
packet octets which belong to different sampling times, this packet octets which belong to different sampling times, this
should also be allowed for a TB. should also be allowed for a TB.
The second solution for the timing problem is to rely on the The second solution for the timing problem is to rely on the
timing information contained in the media stream itself, if timing information contained in the media stream itself, if
available. available.
skipping to change at page 16, line 9 skipping to change at page 15, line 24
and two different modes for timing: and two different modes for timing:
1.) timing rules of the RTP payload specification for the media 1.) timing rules of the RTP payload specification for the media
stream, stream,
2.) timing information within the media stream. 2.) timing information within the media stream.
All combinations of timing and framing modes are possible, but All combinations of timing and framing modes are possible, but
framing mode 1 and timing mode 1 represent the default mode of framing mode 1 and timing mode 1 represent the default mode of
operation for UXP. The use of other timing and framing modes has operation for UXP. The use of other timing and framing modes has
to be signaled by non RTP means. to be signaled by non RTP means.
The info stream is thus defined by the media stream together with The info stream is thus defined by the media stream together with
framing and timing rules. framing and timing rules.
In the following, some examples will be given: In the following, some examples will be given:
1.) The info stream for MPEG-4 video according to RFC 3016 is 1.) The info stream for MPEG-4 Visual according to RFC 3016 is
the pure MPEG-4 compliant media stream, since RFC 3016 the pure MPEG-4 compliant media stream, since RFC 3016
specifies (in case of video) to take the MPEG-4 compliant specifies (in case of video) to take the MPEG-4 compliant
video stream as payload. video stream as payload.
2.) The info stream for H.263+ can be created according to RFC 2.) The info stream for H.263+ can be created according to RFC
2429 as follows: 2429 as follows:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
|H.263+ payload| H.263+ compliant stream (possibly changed with| |H.263+ payload| H.263+ compliant stream (possibly changed with|
|header | respect to RFC 2429) containing a slice/frame | |header | respect to RFC 2429) containing a slice/frame |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
This info stream is inserted into one single data TSB. This info stream is inserted into one single data TSB.
If necessary, for example, if the slices are too short to achieve If necessary, for example, if the slices are too short to achieve
a reasonable TB size, several info streams can be inserted in one a reasonable TB size, several info streams can be inserted in one
TB by concatenating several data TSBs to one TB (see 6.5.). TB by concatenating several data TSBs to a single TB (see Sect.
6.5.).
6.4. In-band signaling of the structure of the redundancy profile 6.4. In-band Signaling of the Structure of the Redundancy Profile
To enable a dynamic adaptation to varying link conditions, the To enable a dynamic adaptation to varying link conditions, the
actual redundancy profile used in the data TSB as well as the actual redundancy profile used in the data TSB as well as the
beginning and end of a TSB must be signaled to the receiving beginning and end of a TSB must be signaled to the receiving
entity. Since out-of-band signaling either results in excessive entity. Since out-of-band signaling either results in excessive
additional control traffic, or prevents quick changes of the additional control traffic, or prevents quick changes of the
profile between successive TBs, an in-band signaling procedure is profile between successive TBs, an in-band signaling procedure is
desired. desired.
As without knowledge of the correct redundancy profile, the Since without knowledge of the correct redundancy profile, the
decoding process cannot be applied to any of the erasure decoding process cannot be applied to any of the erasure
protection classes, it has to be protected at least as strongly protection classes, the redundancy profile has to be protected at
as the most important element in the info stream. Therefore, an least as strongly as the most important element in the info
additional class EPC_P is used in the signaling TSB, where the stream. Therefore, an additional class EPC_P is used in the
number of parity symbols is by default set to the following signaling TSB, where the number of parity symbols is by default
value: set to the following value:
P=ceil(n/2) P=ceil(n/2)
Hence, up to 50% of the RTP packets can be lost, before the Hence, up to 50% of the RTP packets can be lost, before the
redundancy profile cannot be recovered anymore. This seems to be redundancy profile cannot be recovered anymore. This seems to be
a reasonable value for the lowest point of operation over a lossy a reasonable value for the lowest point of operation over a lossy
link. Alternatively, P may be explicitly signaled during session link. Alternatively, P may be explicitly signaled during session
setup by means of SDP or H.245 protocol. setup by means of SDP or H.245 protocol.
Consequently, since all other classes must have equal or less Consequently, since all other classes must have equal or less
erasure protection capability, the maximum allowable value for erasure protection capability, the maximum allowable value for
class EPC_T in the data TSB is now limited to T<=P. class EPC_T in the data TSB is now limited to T<=P.
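The default protection of the signaling class and the resulting limit on T can be sketched as follows. This is only an illustration of the formula P=ceil(n/2) above; the function names are hypothetical and not part of the draft.

```python
def signaling_parity(n: int) -> int:
    """Default number of parity symbols P for class EPC_P, P = ceil(n/2)."""
    if n < 1:
        raise ValueError("n must be a positive number of packets")
    # Integer form of ceil(n/2); avoids floating point.
    return (n + 1) // 2


def max_class_allowed(t: int, n: int) -> bool:
    """True if the highest data class index T respects the limit T <= P."""
    return 0 <= t <= signaling_parity(n)


if __name__ == "__main__":
    # Width n=20 gives P=10, the value used in the examples of this section.
    print(signaling_parity(20), max_class_allowed(6, 20))
```

With n=20 the sketch gives P=10, so a data TSB with T=6 stays within the limit.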
The signaling of the erasure protection vector is accomplished by The signaling of the erasure protection vector is accomplished by
means of descriptors. For each class EPC_i with R_i>0, there is a means of descriptors. In the following we describe an efficient
descriptor DP_i providing information about the size of class encoding scheme for the descriptors.
EPC_i (i.e. the value of R_i) and establishing a relationship For each class EPC_i with R_i>0, there is a descriptor DP_i
between the erasure protection of class EPC_i and that of the providing information about the size of class EPC_i (i.e. the
first preceding class EPC_(i+j) with A_(i+j)>0, where j>0. A value of R_i) and establishing a relationship between the erasure
protection of class EPC_i and that of the class EPC_(i+j), where
j>0 and j is the smallest value for which R_(i+j)>0 is true. A
descriptor DP_i is mapped onto one octet, which is sub-divided descriptor DP_i is mapped onto one octet, which is sub-divided
into two half-octets (i.e. the higher and the lower four bits). into two half-octets (i.e. the higher and the lower four bits).
The first half-octet is of type unsigned and contains the 4-bit The first half-octet is of type unsigned and contains the 4-bit
representation of the decimal value R_i. The second half-octet is representation of the decimal value R_i. The second half-octet is
of type signed and contains the difference in erasure protection of type signed and contains the difference in erasure protection
between class EPC_i and class EPC_(i+j), i.e. the signed 4-bit between class EPC_i and class EPC_(i+j), i.e. the signed 4-bit
representation of the decimal value (-j) (where the MSB denotes representation of the decimal value (-j) (where the MSB denotes
the sign, and the lower three bits the absolute value). Note that the sign, and the lower three bits the absolute value). Note that
the erasure protection P of class EPC_P is fixed, whereas the the erasure protection P of class EPC_P is fixed, whereas the
size A_P may vary. size R_P may vary.
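The mapping of one descriptor onto an octet can be sketched as follows: the upper half-octet holds the unsigned class size, the lower half-octet holds the difference in erasure protection in sign-and-magnitude form (MSB is the sign, three bits of absolute value). The function names are hypothetical, and the range checks (size 0..15, difference -7..+7) are assumptions that follow from the 4-bit fields described above.

```python
def encode_descriptor(size: int, delta: int) -> int:
    """Pack a class size and a signed protection difference into one octet."""
    if not 0 <= size <= 15:
        raise ValueError("size must fit into four unsigned bits")
    if not -7 <= delta <= 7:
        raise ValueError("difference must fit into sign plus three bits")
    sign = 0x8 if delta < 0 else 0x0
    return (size << 4) | sign | abs(delta)


def decode_descriptor(octet: int) -> tuple[int, int]:
    """Inverse of encode_descriptor: return (size, delta)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("descriptor must be a single octet")
    size = octet >> 4
    low = octet & 0x0F
    magnitude = low & 0x7
    delta = -magnitude if low & 0x8 else magnitude
    return size, delta


if __name__ == "__main__":
    # A class of 10 rows whose protection is 4 below the preceding class.
    print(hex(encode_descriptor(10, -4)), decode_descriptor(0xAC))
```

A class of size 10 with a difference of -4 yields the octet 0xAC, which is the first descriptor in the example of this section.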
Thus, the data to be filled into class EPC_P shall consist of a Thus, the data to be filled into class EPC_P shall consist of a
sequence of descriptors separated by stuffing indicators (see sequence of descriptors separated by stuffing indicators (see
below), where the number of descriptors is primarily given by the below), where the number of descriptors is primarily given by the
number of protection classes EPC_i, 0<=i<=T, in the data TSB with number of protection classes EPC_i, 0<=i<=T, in the data TSB with
R_i>0. R_i>0.
Without a-priori knowledge, the initial value for the size of the Without a-priori knowledge, the initial value for the size of the
signaling TSB should be set to one (row). When the number of signaling TSB, R_P, should be set to one (row). When the number
necessary descriptors and stuffing indicators exceeds the (n-P) of necessary descriptors and stuffing indicators exceeds the (n-
information positions, one or more additional rows have to be P) information positions, one or more additional rows have to be
reserved. This is usually done by increasing the value for L_s to reserved. This is usually done by increasing the value for L_s to
A_P>1, i.e. the data TSB is reduced to (L-A_P) rows. Hence, in R_P>1, i.e. the data TSB is reduced to (L-R_P) rows. Hence, in
order to indicate the actual size of the signaling TSB, an order to indicate the actual size of the signaling TSB, an
additional descriptor is inserted at the very beginning, which additional descriptor is inserted at the very beginning, which
takes on the value 0xq0, where q denotes the (hexadecimal) four-bit takes on the value 0xq0, where q denotes the (hexadecimal) four-bit
representation of the decimal value A_P. representation of the decimal value R_P.
Furthermore, the end of each data TSB is signaled by the Furthermore, the end of each data TSB is signaled by the
otherwise unused descriptor value 0x00, followed by exactly one otherwise unused descriptor value 0x00, followed by exactly one
stuffing indicator (SI). The latter is mapped onto an octet, stuffing indicator (SI). The latter is mapped onto an octet,
which is of type unsigned and contains the 8-bit representation which is of type unsigned and contains the 8-bit representation
of the decimal value of the number of media stuffing symbols used of the decimal value of the number of media stuffing symbols used
at the end of the respective data TSB. at the end of the respective data TSB.
The (extended) sequence of descriptors and stuffing indicators is The (extended) sequence of descriptors and stuffing indicators is
then mapped to the octet positions in the A_P rows of the then mapped to the octet positions in the R_P rows of the
signaling TSB from left to right and top to bottom. Each row is signaling TSB from left to right and top to bottom. Each row is
then encoded with the same (n,n-P) RS code. then encoded with the same (n,n-P) RS code.
If the number of descriptors and stuffing indicators is less than If the number of descriptors and stuffing indicators is less than
the available octet positions, however, empty positions in class the available octet positions, however, empty positions in class
EPC_P may be filled up with the otherwise unused descriptor 0x00. EPC_P may be filled up with the otherwise unused descriptor 0x00.
At the receiving entity, the sequence of descriptors shall be At the receiving entity, the sequence of descriptors shall be
recovered by performing erasure decoding on the first row of the recovered by performing erasure decoding on the first row of the
TB (which definitely belongs to the signaling TSB) using the same TB (which definitely belongs to the signaling TSB) using the same
algorithm as later for the data TSB. If successful, the very algorithm as later for the data TSB. If successful, the very
first descriptor now indicates the number of rows of the first descriptor now indicates the number of rows of the
signaling TSB, and the next (A_P-1) rows are decoded to signaling TSB, and the next (R_P-1) rows are decoded to
reconstruct the redundancy profile for the data TSB(s), together reconstruct the redundancy profile for the data TSB(s), together
with the number of media stuffing symbols denoted by the with the number of media stuffing symbols denoted by the
respective SI(s). respective SI(s).
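The receiver side can be sketched as follows, assuming the rows of the signaling TSB have already been erasure decoded and their info octets are available as one list. The function name is hypothetical. Two assumptions are made that the text does not state in this form: a 0x00 octet directly after an SI is treated as descriptor stuffing, and for concatenated data TSBs the first descriptor of a later TSB is taken relative to the last class of the preceding one.

```python
def parse_signaling(octets: list[int], n: int) -> tuple[int, list[tuple[dict[int, int], int]]]:
    """Return (rows of signaling TSB, [(class index -> size, stuffing count), ...])."""
    p = (n + 1) // 2          # default protection of class EPC_P
    rows = octets[0] >> 4     # first descriptor 0xq0 carries the row count
    tsbs = []
    sizes: dict[int, int] = {}
    current = p               # protection of the preceding class
    pos = 1
    while pos < len(octets):
        octet = octets[pos]
        if octet == 0x00:
            if not sizes:
                break         # only descriptor stuffing remains
            if pos + 1 >= len(octets):
                raise ValueError("end marker without stuffing indicator")
            tsbs.append((sizes, octets[pos + 1]))
            sizes = {}
            pos += 2
            continue
        low = octet & 0x0F
        delta = -(low & 0x7) if low & 0x8 else (low & 0x7)
        current += delta
        sizes[current] = octet >> 4
        pos += 1
    if sizes:
        raise ValueError("descriptor sequence not terminated by 0x00")
    return rows, tsbs


if __name__ == "__main__":
    example = [0x10, 0xAC, 0x39, 0x2A, 0x29, 0x7A, 0x00, 0x03, 0x00, 0x00]
    print(parse_signaling(example, 20))
```

In a real receiver the first row would be decoded alone to learn the row count before the remaining rows are decoded; the sketch skips that staging for brevity.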
The complete structure of the TB is now depicted in Fig. 5. The complete structure of the TB is now depicted in Fig. 5.
Transmission Block (TB) Transmission Block (TB)
P P
<---------> <--------->
/\ +-+-+-+-+-+-+-+-+-+ /\ /\ +-+-+-+-+-+-+-+-+-+ /\
| |?|?|?|?|*|*|*|*|*| | A_P=1 | |?|?|?|?|*|*|*|*|*| | R_P=1
| +-+-+-+-+-+-+-+-+-+ \/ | +-+-+-+-+-+-+-+-+-+ \/
| |&|&|&|&|&|*|*|*|*| /\ | |&|&|&|&|&|*|*|*|*| /\
| +-+-+-+-+-+-+-+-+-+ | A_T=3 | +-+-+-+-+-+-+-+-+-+ | R_T=3
| |&|&|&|&|&|*|*|*|*| | | |&|&|&|&|&|*|*|*|*| |
| +-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+ |
L octets | |&|&|&|&|&|*|*|*|*| \/ L octets | |&|&|&|&|&|*|*|*|*| \/
payload | +-+-+-+-+-+-+-+-+-+ /\ payload | +-+-+-+-+-+-+-+-+-+ /\
per packet | |%|%|%|%|%|%|*|*|*| | A_(T-1)=1 per packet | |%|%|%|%|%|%|*|*|*| | R_(T-1)=1
| +-+-+-+-+-+-+-+-+-+ \/ | +-+-+-+-+-+-+-+-+-+ \/
| |$|$|$|$|$|$|$|*|*| . | |$|$|$|$|$|$|$|*|*| .
| +-+-+-+-+-+-+-+-+-+ . | +-+-+-+-+-+-+-+-+-+ .
| |!|!|!|!|!|!|!|!|*| . | |!|!|!|!|!|!|!|!|*| .
| +-+-+-+-+-+-+-+-+-+ /\ | +-+-+-+-+-+-+-+-+-+ /\
| |#|#|#|#|#|#|#|#|#| | A_0=1 | |#|#|#|#|#|#|#|#|#| | R_0=1
\/ +-+-+-+-+-+-+-+-+-+ \/ \/ +-+-+-+-+-+-+-+-+-+ \/
<-----------------> <----------------->
n packets n packets
? : descriptors and stuffing indicators for in-band ? : descriptors and stuffing indicators for in-band
signaling of the redundancy profile signaling of the redundancy profile
&,%,$,!,# : info octets belonging to a certain element of the &,%,$,!,# : info octets belonging to a certain element of the
info stream in decreasing order of importance info stream in decreasing order of importance
* : parity octets gained from Reed-Solomon coding * : parity octets gained from Reed-Solomon coding
Fig. 5: General structure for UXP with in-band signaling of the Fig. 5: General structure for UXP with in-band signaling of the
redundancy profile redundancy profile
The following simple example is meant to illustrate the idea The following simple example is meant to illustrate the idea
behind using descriptors: Let an erasure protection vector of behind using descriptors: Let an erasure protection vector of
length T+1=7 be given as follows: length T+1=7 be given as follows:
EPV=(A_0,A_1,...,A_5,A_6)=(7,0,2,2,0,3,10) EPV=(R_0,R_1,...,R_5,R_6)=(7,0,2,2,0,3,10)
Hence, the length L of the TB (including one row for the Hence, the length L of the TB (including one row for the
signaling TSB) is equal to 7+2+2+3+10+1=25 (rows/octets). If the signaling TSB) is equal to 7+2+2+3+10+1=25 (rows/octets). If the
width is assumed to be equal to 20 (columns/packets), then the width is assumed to be equal to 20 (columns/packets), then the
erasure protection of the descriptors is P=10. erasure protection of the descriptors is P=10.
The corresponding sequence of descriptors can be written as The corresponding sequence of descriptors can be written as
DP=(DP_6,DP_5,DP_3,DP_2,DP_0)=(0xAC,0x39,0x2A,0x29,0x7A), DP=(DP_6,DP_5,DP_3,DP_2,DP_0)=(0xAC,0x39,0x2A,0x29,0x7A),
where the values of the descriptors are given in hexadecimal where the values of the descriptors are given in hexadecimal
notation. Next, the descriptor indicating the length of the notation. Next, the descriptor indicating the length of the
signaling TSB has to be inserted, the end of the data TSB has to signaling TSB has to be inserted, the end of the data TSB has to
be marked by 0x00, and the SI has to be appended. If the number be marked by 0x00, and the SI has to be appended. If the number
of media stuffing symbols is assumed to be 3, the 10 info octets of media stuffing symbols is assumed to be 3, the 10 info octets
in the signaling TSB take on the following values (descriptor in the signaling TSB take on the following values (descriptor
stuffing included): stuffing included):
(0x10,0xAC,0x39,0x2A,0x29,0x7A,0x00,0x03,0x00,0x00) (0x10,0xAC,0x39,0x2A,0x29,0x7A,0x00,0x03,0x00,0x00)
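The example above can be reproduced with a short sketch. The function name and argument layout are hypothetical; the EPV is given as a list indexed by class (R_0 first), and class sizes above 15 or differences above 7 are simply rejected, since the text of this section does not say how they would be signaled.

```python
def signaling_octets(epv: list[int], n: int, stuffing: int, rows: int = 1) -> list[int]:
    """Info octets of the signaling TSB for a single data TSB."""
    p = (n + 1) // 2                      # P = ceil(n/2)
    body = [rows << 4]                    # leading descriptor 0xq0
    prev = p                              # first descriptor is relative to EPC_P
    for i in range(len(epv) - 1, -1, -1): # highest protection class first
        size = epv[i]
        if size == 0:
            continue                      # empty classes get no descriptor
        delta = i - prev
        if size > 15 or abs(delta) > 7:
            raise ValueError("value does not fit into a half-octet")
        sign = 0x8 if delta < 0 else 0x0
        body.append((size << 4) | sign | abs(delta))
        prev = i
    body += [0x00, stuffing]              # end marker followed by the SI
    capacity = rows * (n - p)             # info positions in the signaling TSB
    if len(body) > capacity:
        raise ValueError("signaling TSB too small, reserve more rows")
    return body + [0x00] * (capacity - len(body))


if __name__ == "__main__":
    epv = [7, 0, 2, 2, 0, 3, 10]
    print([hex(o) for o in signaling_octets(epv, n=20, stuffing=3)])
    print("L =", sum(epv) + 1)            # rows of data TSB plus one signaling row
```

For the EPV above with n=20 and three media stuffing symbols the sketch returns the ten octets listed in the example, and the TB length evaluates to 25.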
6.5. Optional Concatenation of Transmission Sub Blocks
6.5. Optional Concatenation of Transmission Sub Blocks:
The following procedure may be applied if a single info stream The following procedure may be applied if a single info stream
would be too short to achieve an efficient mapping to a would be too short to achieve an efficient mapping to a
transmission block with respect to the fixed payload length L and transmission block with respect to the fixed payload length L and
the desired number of packets n. For example, intra-coded video the desired number of packets n. For example, intra-coded video
frames (I-frames) are usually much larger than the following frames (I-frames) are usually much larger than the following
predicted ones (P-frames). In this case, a certain number z of predicted ones (P-frames). In this case, a certain number z of
successive small info streams should be each mapped to a successive small info streams should be each mapped to a
transmission sub block with length L_d(y) and width n, such that transmission sub block with length L_d(y) and width n, such that
L_d(1)+L_d(2)+...+L_d(z)=L_d. L_d(1)+L_d(2)+...+L_d(z)=L_d.
The resulting transmission sub blocks can then be easily The resulting transmission sub blocks can then be easily
concatenated to form a TB of size L x n having one common concatenated to form a TB of size L x n having one common
signaling TSB: Since the second half-octet of the descriptors is signaling TSB (see Fig. 2): Since the second half-octet of the
of type signed, we are able to incorporate both decreasing and descriptors is of type signed (cf. Sect. 6.4.), we are able to
increasing erasure protection profiles within one single signal both decreasing and increasing erasure protection
signaling TSB. profiles.
Note that once the lengths L_d(y) of the individual blocks have
been fixed, the respective redundancy profiles can be determined
independently of each other. However, the space initially
reserved for the signaling TSB should be already large enough to
avoid profile recalculation for each of the data TSBs in case the
sequence of descriptors gets too long!
Again, we will give a simple example to illustrate this idea: Let Again, we will give a simple example to illustrate this idea: Let
the erasure protection vectors for two concatenated data TSBs be the erasure protection vectors for two concatenated data TSBs be
given as follows: given as follows:
EPV1=(A1_0,A1_1,...,A1_5,A1_6)=(0,0,2,2,0,3,10),
EPV2=(A2_0,A2_1,...,A2_5,A2_6)=(0,0,2,2,0,3,10). EPV1=(R1_0,R1_1,...,R1_5,R1_6)=(0,0,2,2,0,3,10),
EPV2=(R2_0,R2_1,...,R2_5,R2_6)=(0,0,2,2,0,3,10).
Hence, two single identical data TSBs will be concatenated to Hence, two single identical data TSBs will be concatenated to
form a TB of length L=2*(2+2+3+10)+2=36 (rows/octets). If the form a TB of length L=2*(2+2+3+10)+2=36 (rows/octets). If the
width is again assumed to be equal to 20 (columns/packets), then width is again assumed to be equal to 20 (columns/packets), then
the erasure protection of the descriptors is P=10, and therefore the erasure protection of the descriptors is P=10. We reserve a
a total of two rows for the signaling TSB have been reserved this total of two rows for the signaling TSB. The corresponding
time. The corresponding sequence of descriptors can now be sequence of descriptors can now be written as
written as DP=(0xAC,0x39,0x2A,0x29,0xA4,0x39,0x2A,0x29), where DP=(0xAC,0x39,0x2A,0x29,0xA4,0x39,0x2A,0x29), where the values of
the values of the descriptors are given in hexadecimal notation. the descriptors are given in hexadecimal notation. The values of
If the number of media stuffing symbols is assumed to be 3 for the first four descriptors are taken from the descriptor of EPV1
each data TSB, the 20 info octet positions in the signaling TSB as described in Sect. 6.4. (without the SI). The last four
are filled with the following values (descriptor stuffing descriptors are taken from the descriptor of EPV2 (without SI)
included): with one exception. The fifth descriptor of DP (i.e. 0xA4) is
created as follows: The first half-octet is created according to
Sect. 6.4. However, the second half-octet no longer describes the
difference in erasure protection relative to class EPC_P. It
rather describes the difference relative to the last class of the
first data TSB, i.e. class 6 minus class 2, which gives +4 here
and can in general be a positive or negative number. If the
number of media stuffing
symbols is assumed to be 3 for each data TSB, the 20 info octet
positions in the signaling TSB are filled with the following
values (descriptor stuffing included):
(0x20,0xAC,0x39,0x2A,0x29,0x00,0x03,0xA4,0x39,0x2A,0x29,0x00,0x03 (0x20,0xAC,0x39,0x2A,0x29,0x00,0x03,0xA4,0x39,0x2A,0x29,0x00,0x03
, ,
0x00,0x00,0x00,0x00,0x00,0x00,0x00) 0x00,0x00,0x00,0x00,0x00,0x00,0x00)
Generalizing from the example above, the following rule MUST be
used to create the resulting descriptors for concatenated data
TSB #u and data TSB #v, where v=u+1:
Let EPVu=(Ru_0,Ru_1,...) and EPVv=(Rv_0,Rv_1,...) be the
corresponding erasure protection vectors and DPu and DPv the
corresponding descriptors created according to Sect. 6.4. (with
stuffing). Let w be the smallest index for which Ru_w>0. Let x
be the largest index for which Rv_x>0. The resulting descriptor
sequence is created by concatenating DPu and DPv, where the first
descriptor of DPv is changed as follows:
The second half-octet is the signed 4-bit representation of the
difference in erasure protection between class x of TSB #v and
class w of TSB #u, i.e. x-w (+4 in the example above).
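The concatenation rule can be sketched as follows. The function name is hypothetical. The sketch reads the modified half-octet as the signed difference between the class indices x and w (first class of the later TSB minus last class of the earlier one); this reading is an assumption, chosen because it reproduces the value 0xA4 in the example above.

```python
def concatenated_signaling(epvs: list[list[int]], n: int,
                           stuffings: list[int], rows: int) -> list[int]:
    """Info octets of one common signaling TSB for several data TSBs."""
    p = (n + 1) // 2
    body = [rows << 4]
    prev = p                                   # only the first TSB starts from EPC_P
    for epv, stuffing in zip(epvs, stuffings):
        for i in range(len(epv) - 1, -1, -1):
            size = epv[i]
            if size == 0:
                continue
            delta = i - prev                   # positive when protection increases
            if size > 15 or abs(delta) > 7:
                raise ValueError("value does not fit into a half-octet")
            sign = 0x8 if delta < 0 else 0x0
            body.append((size << 4) | sign | abs(delta))
            prev = i                           # carried over into the next TSB
        body += [0x00, stuffing]               # end marker and SI per data TSB
    capacity = rows * (n - p)
    if len(body) > capacity:
        raise ValueError("signaling TSB too small, reserve more rows")
    return body + [0x00] * (capacity - len(body))


if __name__ == "__main__":
    epv = [0, 0, 2, 2, 0, 3, 10]
    octets = concatenated_signaling([epv, epv], n=20, stuffings=[3, 3], rows=2)
    print([hex(o) for o in octets])
```

Because the previous class index is carried across the TSB boundary, no special case is needed for the first descriptor of the second TSB; the positive difference falls out of the same subtraction.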
8. Security Considerations 7. Security Considerations
The payload of the RTP-packets consists of an interleaved media The payload of the RTP-packets consists of an interleaved media
and parity stream. Therefore, it is reasonable to encrypt the and parity stream. Therefore, it is reasonable to encrypt the
resulting stream with one key rather than using different keys resulting stream with one key rather than using different keys
for media and parity data. It should also be noted that for media and parity data. It should also be noted that
encryption of the media data without encryption of the parity encryption of the media data without encryption of the parity
data could enable known-plaintext attacks. data could enable known-plaintext attacks.
The overall proportion between parity octets and info octets The overall proportion between parity octets and info octets
should be chosen carefully if the packet loss is due to network should be chosen carefully if the packet loss is due to network
congestion. If the proportion of parity octets per TB is congestion. If the proportion of parity octets per TB is
increased in this case, it could lead to increasing network increased in this case, it could lead to increasing network
congestion. Therefore, the proportion between parity octets and congestion. Therefore, the proportion between parity octets and
info octets per TB MUST NOT be increased as packet loss increases info octets per TB MUST NOT be increased as packet loss increases
due to network congestion. due to network congestion.
The overall ratio between parity and info octets MUST NOT be The overall ratio between parity and info octets MUST NOT be
higher than 1:1, i.e. the absolute bitrate spent for redundancy higher than 1:1, i.e. the absolute bitrate spent for redundancy
must not be larger than the bitrate required for transmission of must not be larger than the bitrate required for transmission of
multimedia data itself. multimedia data itself.
9. Application Statement 8. Application Statement
There are currently two different schemes proposed for unequal There are currently two different schemes proposed for unequal
error protection in the IETF-AVT: Unequal Level Protection (ULP) error protection in the IETF-AVT: Unequal Level Protection (ULP)
and Unequal Erasure Protection (UXP). and Unequal Erasure Protection (UXP).
Although both methods seem to address the same problem, the Although both methods seem to address the same problem, the
proposed solutions differ in many respects. This section tries to proposed solutions differ in many respects. This section tries to
describe possible application scenarios and to show the strength describe possible application scenarios and to show the strengths
and weaknesses of both approaches. and weaknesses of both approaches.
The main difference between both approaches is that while ULP The main difference between both approaches is that while ULP
preserves the structure of the packets which have to be protected preserves the structure of the packets which have to be protected
and provides the redundancy in extra packets, UXP interleaves the and provides the redundancy in extra packets, UXP interleaves the
info stream which has to be protected, inserts the redundancy info stream which has to be protected, inserts the redundancy
information, and thus creates a totally new packet structure. information, and thus creates a totally new packet structure.
Another difference concerns multicast compatibility: It cannot be Another difference concerns multicast compatibility: It cannot be
assumed that all future terminals will be able to apply UXP/ULP. assumed that all future terminals will be able to apply UXP/ULP.
Therefore, backward compatibility could be an issue in some Therefore, backward compatibility could be an issue in some
cases. Since ULP does not change the original packet structure, cases. Since ULP does not change the original packet structure,
but only adds some extra packets, it is possible for terminals but only adds some extra packets, it is possible for terminals
which do not which do not
support ULP to discard the extra packets. In case of UXP, support ULP to discard the extra packets. In case of UXP,
however, two separate streams with and without erasure protection however, two separate streams with and without erasure protection
have to be sent, which increases the overall data rate. have to be sent, which increases the overall data rate.
Next, both approaches offer different mechanisms to adjust packet Next, both approaches offer different mechanisms to adjust packet
sizes, if necessary: UXP allows to adjust the packet sizes sizes, if necessary: UXP allows to adjust the packet sizes
arbitrarily. This is an advantage in case the loss probability is arbitrarily. This is an advantage in case the loss probability is
skipping to change at page 21, line 30 skipping to change at page 20, line 51
of slice structures. of slice structures.
Since ULP does not change the existing packetization scheme, this Since ULP does not change the existing packetization scheme, this
flexibility does not exist. flexibility does not exist.
The ability of UXP to adjust the packet size arbitrarily can be The ability of UXP to adjust the packet size arbitrarily can be
especially exploited in a streaming scenario, if a delay of especially exploited in a streaming scenario, if a delay of
several hundred milliseconds is acceptable. It is then possible several hundred milliseconds is acceptable. It is then possible
to fill several video frames into a single TB of desired size, to fill several video frames into a single TB of desired size,
e.g. a group of pictures consisting of I-frame, P-frames and B- e.g. a group of pictures consisting of I-frame, P-frames and B-
frames. The redundancy scheme can thus be selected in such a way frames. The redundancy scheme can thus be selected in such a way
as to guarantee the following property: In case of packet loss, as to guarantee the following property: In case of packet loss,
the streams for P-frames are only recoverable, if the I-frame, on the P-frames are only recoverable if the I-frame on which the
which the decoding of P-frames depends, is recoverable. The same decoding of P-frames depends is recoverable. The same is true for
is true for B-frames, which can only be decoded if the respective B-frames, which can only be decoded if the respective P-frames
P-frames are recoverable. This prevents situations in which, for are recoverable. This prevents situations in which, for example,
example, the B-frames have been received correctly, but the P- the B-frames have been received correctly, but the P-frames have
frames have been lost, i.e. assures a gradual decrease in been lost, i.e. assures a gradual decrease in application quality
application quality also on the frame level. Of course, a similar also on the frame level. Of course, a similar encoding is
encoding is possible with ULP. But in this case one might have to possible with ULP. But in this case one might have to send
send several frames within one packet which leads to large packet several frames within one packet which leads to large packet
sizes. sizes.
Furthermore, decoding delay is also a crucial issue in Furthermore, decoding delay is also a crucial issue in
communications. Again, both approaches have different delay communications. Again, both approaches have different delay
properties: UXP introduces a decoding delay because a reasonable properties: UXP introduces a decoding delay because a reasonable
amount of correctly received packets are necessary to start amount of correctly received packets are necessary to start
decoding of a TB. The delay in general depends on the dimensions decoding of a TB. The delay in general depends on the dimensions
of the interleaver. This should be considered for any system of the interleaver. This should be considered for any system
design which includes UXP. design which includes UXP.
With ULP, every correctly received media packet can be decoded With ULP, every correctly received media packet can be decoded
right away. However, a significant delay is introduced, if right away. However, a significant delay is introduced, if
packets are corrupted, because in this case one has to wait for packets are corrupted, because in this case one has to wait for
several redundancy packets. Thus, the delay is in general several redundancy packets. Thus, the delay is in general
dependent on the actual ULP-FEC-packet scheme and cannot be dependent on the actual ULP-FEC-packet scheme and cannot be
considered in advance during the system design phase. considered in advance during the system design phase.
Finally, we want to point out that UXP uses RS-codes which are Finally, we want to point out that UXP uses RS codes which are
known known
to be the most efficient type of block codes in terms of erasure to be the most efficient type of block codes in terms of erasure
correction capability. correction capability.
10. Intellectual Property Considerations 9. Intellectual Property Considerations
Siemens AG has filed patent applications that might possibly have Siemens AG has filed patent applications that might possibly have
technical relations to this contribution. technical relations to this contribution.
On IPR related issues, Siemens AG refers to the Siemens Statement On IPR related issues, Siemens AG refers to the Siemens Statement
on Patent Licensing, see http://www.ietf.org/ietf/IPR/SIEMENS- on Patent Licensing, see http://www.ietf.org/ietf/IPR/SIEMENS-
General. General.
11. References 10. References
[1] J. Rosenberg and H. Schulzrinne, "An RTP Payload Format for [1] J. Rosenberg and H. Schulzrinne, "An RTP Payload Format for
Generic Forward Error Correction", Request for Comments 2733, Generic Forward Error Correction", Request for Comments 2733,
Internet Engineering Task Force, Dec. 1999. Internet Engineering Task Force, Dec. 1999.
[2] A. Albanese, J. Bloemer, J. Edmonds, M. Luby, and M. Sudan, [2] A. Albanese, J. Bloemer, J. Edmonds, M. Luby, and M. Sudan,
"Priority encoding transmission", IEEE Trans. Inform. Theory, "Priority encoding transmission", IEEE Trans. Inform. Theory,
vol. 42, no. 6, pp. 1737-1744, Nov. 1996. vol. 42, no. 6, pp. 1737-1744, Nov. 1996.
[3] Shu Lin and Daniel J. Costello, Error Control Coding: [3] Shu Lin and Daniel J. Costello, Error Control Coding:
Fundamentals and Applications, Prentice-Hall, Inc., Englewood Fundamentals and Applications, Prentice-Hall, Inc., Englewood
Cliffs, N.J., 1983. Cliffs, N.J., 1983.
[4] W. Li: "Streaming video profile in MPEG-4", IEEE trans. on [4] W. Li: "Streaming video profile in MPEG-4", IEEE Trans. on
Circuits and Systems for Video Technology, Vol. 11, no. 3, 301- Circuits and Systems for Video Technology, Vol. 11, no. 3, 301-
317, Mar 2001. 317, March 2001.
[5] G. Blaettermann, G. Heising, and D. Marpe: "A Quality [5] G. Blaettermann, G. Heising, and D. Marpe: "A Quality
Scalable Mode for H.26L", ITU-T SG16, Q.15, Q15-J24, Osaka, May Scalable Mode for H.26L", ITU-T SG16, Q.15, Q15-J24, Osaka, May
2000. 2000.
[6] F. Burkert, T. Stockhammer, and J. Pandel, "Progressive A/V [6] F. Burkert, T. Stockhammer, and J. Pandel, "Progressive A/V
coding for lossy packet networks - a principle approach", Tech. coding for lossy packet networks - a principle approach", Tech.
Rep., ITU-T SG16, Q.15, Q15-I36, Red Bank, N.J., Oct. 1999. Rep., ITU-T SG16, Q.15, Q15-I36, Red Bank, N.J., Oct. 1999.
[7] Guenther Liebl, "Modeling, theoretical analysis, and coding [7] Guenther Liebl, "Modeling, theoretical analysis, and coding
for wireless packet erasure channels", Diploma Thesis, Inst. for for wireless packet erasure channels", Diploma Thesis, Inst. for
Communications Engineering, Munich University of Technology, Communications Engineering, Munich University of Technology,
1999. 1999.
[8] U. Horn, K. Stuhlmuller, M. Link, and B. Girod, "Robust [8] U. Horn, K. Stuhlmuller, M. Link, and B. Girod, "Robust
Internet video transmission based on scalable coding and unequal Internet video transmission based on scalable coding and unequal
error protection", Image Com., vol. 15, no. 1-2, pp. 77-94, Sep. error protection", Image Com., vol. 15, no. 1-2, pp. 77-94, Sep.
1999. 1999.
[9] S. Wenger, "H.26L over IP: The IP-Network Adaptation Layer", [9] S. Wenger, "H.26L over IP: The IP-Network Adaptation Layer",
Packet Video 2002, Pittsburgh, Pennsylvania, USA, April 24- Packet Video 2002, Pittsburgh, Pennsylvania, USA, April 24-
26, 2002. 26, 2002.
11. Acknowledgments
Many thanks to Philippe Gentric, Stephen Casner, and Hermann
Hellwagner for helpful comments and improvements. The authors
would like to thank Thomas Stockhammer who came up with the
original idea of UXP. Also, the help of Gero Baese, Frank
Burkert, and Minh Ha Nguyen for the development of UXP is well
acknowledged.
12. Acknowledgments 12. Author's Addresses
Many thanks to Philippe Gentric and Stephen Casner for helpful Guenther Liebl
comments and improvements.
13. Author's Addresses
Guenther Liebl, Thomas Stockhammer
Institute for Communications Engineering (LNT) Institute for Communications Engineering (LNT)
Munich University of Technology Munich University of Technology
D-80290 Munich D-80290 Munich
Germany Germany
Email: {liebl,tom}@lnt.e-technik.tu-muenchen.de Email: {liebl}@lnt.e-technik.tu-muenchen.de
Minh-Ha Nguyen, Frank Burkert
Siemens AG - ICM D MP RD MCH 83/81
D-81675 Munich
Germany
Email: {minhha.nguyen,frank.burkert}@mch.siemens.de
Marcel Wagner, Juergen Pandel, Wenrong Weng, Gero Baese Marcel Wagner, Juergen Pandel, Wenrong Weng
Siemens AG - Corporate Technology CT IC 2 Siemens AG - Corporate Technology CT IC 2
D-81730 Munich D-81730 Munich
Germany Germany
Email: Email:
{marcel.wagner,juergen.pandel,wenrong.weng,gero.baese}@mchp.sieme {marcel.wagner,juergen.pandel,wenrong.weng}@mchp.siemens.de
ns.de
Full Copyright Statement Full Copyright Statement
"Copyright (C) The Internet Society (date). All Rights Reserved. "Copyright (C) The Internet Society (date). All Rights Reserved.
This document and translations of it may be copied and furnished This document and translations of it may be copied and furnished
to others, and derivative works that comment on or otherwise to others, and derivative works that comment on or otherwise
explain it or assist in its implementation may be prepared, explain it or assist in its implementation may be prepared,
copied, published and distributed, in whole or in part, without copied, published and distributed, in whole or in part, without
restriction of any kind, provided that the above copyright notice restriction of any kind, provided that the above copyright notice
and this paragraph are included on all such copies and derivative and this paragraph are included on all such copies and derivative
works. However, this document itself may not be modified in any works. However, this document itself may not be modified in any