draft-ietf-avt-uxp-00.txt   draft-ietf-avt-uxp-01.txt 
Internet Engineering Task Force G. Liebl, Internet Engineering Task Force G. Liebl,
T.Stockhammer T.Stockhammer
Internet Draft LNT, Munich Univ. of Internet Draft LNT, Munich Univ. of
Technology Technology
Document: draft-ietf-avt-uxp-00.txt Document: draft-ietf-avt-uxp-01.txt
February 2001 M. Wagner, J.Pandel, November 2001 M. Wagner, J.Pandel,
G. Baese, M. Nguyen, W. Weng, G. Baese,
F. Burkert M. Nguyen, F. Burkert
Expires: August 2001 Siemens AG, Munich Expires: May 2002 Siemens AG, Munich
An RTP Payload Format for Erasure-Resilient Transmission of Progressive An RTP Payload Format for Erasure-Resilient Transmission of Progressive
Multimedia Streams Multimedia Streams
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026 []. all provisions of Section 10 of RFC2026 [].
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
skipping to change at page 2, line ? skipping to change at page 2, line ?
source stream, thus allowing a graceful degradation of application source stream, thus allowing a graceful degradation of application
quality with increasing packet loss rate on the network. Hence, this quality with increasing packet loss rate on the network. Hence, this
type of unequal erasure protection (UXP) scheme is intended to cope type of unequal erasure protection (UXP) scheme is intended to cope
with the rapidly varying channel conditions on wireless access links with the rapidly varying channel conditions on wireless access links
to the Internet backbone. Nevertheless, backward compatibility to to the Internet backbone. Nevertheless, backward compatibility to
currently standardized non-progressive multimedia codecs is ensured, currently standardized non-progressive multimedia codecs is ensured,
since equal erasure protection (EXP) represents a subset of generic since equal erasure protection (EXP) represents a subset of generic
UXP. By defining a comparatively simple payload format, the proposed UXP. By defining a comparatively simple payload format, the proposed
scheme can be easily integrated into the existing framework for RTP. scheme can be easily integrated into the existing framework for RTP.
Liebl, Stockhammer, Wagner, Pandel, Baese, Nguyen, Burkert [Page1] Liebl,Stockhammer,Wagner,Pandel,Weng,Baese,Nguyen,Burkert [Page1]
2. Conventions used in this document 2. Conventions used in this document
The following terms are used throughout this document: The following terms are used throughout this document:
1.) Message block: a higher layer transport unit (e.g. an IP 1.) Message block: a higher layer transport unit (e.g. an IP
packet), that enters/leaves the segmentation/reassembly stage at the packet), that enters/leaves the segmentation/reassembly stage at the
interface to wireless data link layers. interface to wireless data link layers.
2.) Segment: denotes a link layer transport unit. 2.) Segment: denotes a link layer transport unit.
skipping to change at page 2, line ? skipping to change at page 2, line ?
receiving entity (reassembly). receiving entity (reassembly).
5.) Quality-of-service: application-dependent criterion to define a 5.) Quality-of-service: application-dependent criterion to define a
certain desired operation point. certain desired operation point.
6.) Codec: denotes a functional pair consisting of a source encoding 6.) Codec: denotes a functional pair consisting of a source encoding
unit at the sender and a corresponding source decoding unit at the unit at the sender and a corresponding source decoding unit at the
receiver; usually standardized for different multimedia applications receiver; usually standardized for different multimedia applications
like audio or video. like audio or video.
7.) Progressive source coding: results in a stream of coded data 7.) Progressive source coding: results in successive blocks of
whose distinct elements are of different importance to the (source-)encoded data (e.g. a single video or audio frame), each of
reconstruction process at the decoder. Elements are commonly ordered which can be viewed as a bitstream of certain length, whose distinct
from highest to least importance, where the latter elements depend elements are of different importance to the reconstruction process
on the previous. at the decoder. Elements are commonly ordered from highest to least
importance, where the latter elements depend on the previous.
8.) Reed-Solomon (RS) code: belongs to the class of linear nonbinary 8.) Reed-Solomon (RS) code: belongs to the class of linear nonbinary
block codes, and is uniquely specified by the block length n, the block codes, and is uniquely specified by the block length n, the
number of parity symbols t, and the symbol alphabet. number of parity symbols t, and the symbol alphabet.
9.) n: is a variable, which denotes both the block length of a RS 9.) n: is a variable, which denotes both the block length of a RS
codeword, and the number of columns in a TB (see 15). codeword, and the number of columns in a TB (see 16).
10.) k: is a variable, which denotes the number of information 10.) k: is a variable, which denotes the number of information
symbols in a RS codeword. symbols in a RS codeword.
11.) t: is a variable, which denotes the number of parity symbols in 11.) t: is a variable, which denotes the number of parity symbols in
a RS codeword. a RS codeword.
12.) Erasure: When a packet is lost during transmission, an erasure 12.) Erasure: When a packet is lost during transmission, an erasure
is said to have happened. Since the position of the erased packet in is said to have happened. Since the position of the erased packet in
a sequence is usually known, a corresponding erasure marker can be a sequence is usually known, a corresponding erasure marker can be
skipping to change at page 3, line 17 skipping to change at page 3, line 15
13.) Base layer: comprises the first and most important elements in 13.) Base layer: comprises the first and most important elements in
a progressively encoded bitstream, without which all subsequent a progressively encoded bitstream, without which all subsequent
information is useless. information is useless.
14.) Enhancement layer: comprises one or more sets of the less 14.) Enhancement layer: comprises one or more sets of the less
important subsequent elements in a progressively encoded bitstream. important subsequent elements in a progressively encoded bitstream.
A specific enhancement layer can be decoded, if and only if the base A specific enhancement layer can be decoded, if and only if the base
layer and all previous enhancement layer data (of higher importance) layer and all previous enhancement layer data (of higher importance)
is available. is available.
15.) Transmission block (TB): denotes a memory array of L rows and n 15.) Info stream: denotes the final bitstream which has to be
protected by the proposed UXP scheme. It usually consists of the
(source-encoded) bitstream (progressive or not), which is already
arranged according to a desired syntax (e.g. as specified in the
respective RTP profile for the media codec in use).
In any case, it is assumed that every info stream is already octet-
aligned according to the standard procedures defined in the context
of the used syntax specifications.
16.) Transmission block (TB): denotes a memory array of L rows and n
columns. Each row of a TB represents a RS codeword, whereas each columns. Each row of a TB represents a RS codeword, whereas each
column represents the payload of an RTP packet. column, together with the respective UXP header (see 33) in front,
forms the payload of a single RTP packet.
Each TB consists of at least two distinct transmission sub blocks
(TSB, see 17): The first L_s rows belong to the signaling TSB,
whereas the last L_d=(L-L_s) rows belong to one or more data TSB.
16.) L: is a variable, which denotes both the number of rows in a TB 17.) Transmission sub block (TSB): denotes a memory array of 0<l<L
and the payload length of an RTP packet in bytes. rows and n columns, which is a horizontal slice of a TB. Depending
on whether the info byte positions are filled with descriptors (see
28) or media data, the TSB is of type signaling or data,
respectively.
17.) Unequal erasure protection (UXP): denotes a specific strategy 18.) L: is a variable, which denotes both the number of rows in a TB
and the payload length (without UXP header) of an RTP packet in
bytes.
19.) Unequal erasure protection (UXP): denotes a specific strategy
which varies the level of erasure protection across a TB according which varies the level of erasure protection across a TB according
to a given redundancy profile. to a given redundancy profile.
18.) Equal erasure protection (EXP): is a subset of UXP, for which 20.) Equal erasure protection (EXP): is a subset of UXP, for which
the level of erasure protection is kept constant across a TB. the level of erasure protection is kept constant across a TB.
19.) Redundancy profile: describes the size of the different erasure 21.) Redundancy profile: describes the size of the different erasure
protection classes in a TB, i.e. the number of rows (codewords) per protection classes in a TB, i.e. the number of rows (codewords) per
class. class.
20.) Erasure protection class: contains a set of rows (codewords) of 22.) Erasure protection class: contains a set of rows (codewords) of
the TB with the same erasure correction capability. the TB with the same erasure correction capability.
21.) i: is a variable, which denotes the number of parity bytes for 23.) i: is a variable, which denotes the number of parity bytes for
each row in erasure protection class i. each row in erasure protection class i.
22.) CA_i: is a variable, which denotes the set of rows contained in 24.) CA_i: is a variable, which denotes the set of rows contained in
erasure protection class i. erasure protection class i.
23.) A_i: is a variable, which denotes the total number of rows 25.) A_i: is a variable, which denotes the total number of rows
contained in erasure protection class i, i.e. the cardinality of contained in erasure protection class i, i.e. the cardinality of
CA_i. CA_i.
24.) T: is a variable, which denotes the number of parity bytes for 26.) T: is a variable, which denotes the number of parity bytes for
each row in the highest erasure protection class (with respect to each row in the highest erasure protection class (with respect to
application data) in a TB. application data) in a TB.
25.) AV: denotes the erasure protection vector of length (T+1) used 27.) AV: denotes the erasure protection vector of length (T+1) used
to describe a certain redundancy profile. to describe a certain redundancy profile.
26.) DP: descriptor used for in-band signaling of the erasure 28.) DP: descriptor used for in-band signaling of the erasure
protection vector protection vector.
27.) Stuffing: insertion of predefined symbol patterns. Stuffing is
performed, if the information part of an erasure protection class
cannot be filled completely with (application) payload data.
28.) Interleaver: performs the spreading of a codeword, i.e. a row 29.) SI: stuffing indicator, which contains the number of media
stuffing symbols at the end of a data TSB (see 31).
30.) Descriptor Stuffing: insertion of otherwise unused descriptor
values (i.e. 0x00) at the end of the signaling TSB. Descriptor
stuffing is performed, if the final sequence of descriptors and
stuffing indicators for a valid redundancy profile is shorter than
the space initially reserved for it in the signaling TSB.
31.) Media Stuffing: insertion of additional symbols at the end of a
data TSB. Media stuffing is performed, if the info stream (see 15)
is shorter than the space reserved for it in the data TSB for a
desired redundancy profile. Since the number of stuffing symbols is
signaled in the respective SI, any byte value may be used (e.g.
0x00).
32.) Interleaver: performs the spreading of a codeword, i.e. a row
in the TB, over n successive packets, such that the probability of in the TB, over n successive packets, such that the probability of
an erasure burst in a codeword is kept small. an erasure burst in a codeword is kept small.
29.) UXP header: is the additional header information contained in 33.) UXP header: is the additional header information contained in
each RTP packet after UXP has been applied. each RTP packet after UXP has been applied. It is always present at
the start of the payload section of an RTP packet.
30.) X: denotes a currently not used extension field of 1 bit in the 34.) X: denotes a currently not used extension field of 1 bit in the
UXP header. UXP header.
31.) P: is a variable which denotes the number of parity symbols per 35.) P: is a variable which denotes the number of parity symbols per
row used to protect the inband signaling of the redundancy profile. row used to protect the inband signaling of the redundancy profile.
32.) ceil(.): denotes the ceiling function, i.e. rounding up to the 36.) ceil(.): denotes the ceiling function, i.e. rounding up to the
next integer. next integer.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in RFC-2119 []. this document are to be interpreted as described in RFC-2119 [].
3. Introduction 3. Introduction
Due to the increasing popularity of high-quality multimedia Due to the increasing popularity of high-quality multimedia
applications over the Internet and the high level of public applications over the Internet and the high level of public
skipping to change at page 6, line 47 skipping to change at page 7, line 26
5. Progressive Source Coding 5. Progressive Source Coding
If the output of a multimedia codec, be it audio or video, is said If the output of a multimedia codec, be it audio or video, is said
to be progressive, the encoded bitstream must consist of several to be progressive, the encoded bitstream must consist of several
distinct elements, often organized in separate layers. The latter distinct elements, often organized in separate layers. The latter
shall be defined via their relative importance with respect to the shall be defined via their relative importance with respect to the
quality of the reconstruction process at the receiver. Hence, there quality of the reconstruction process at the receiver. Hence, there
exists at least one layer, often called base layer, without which exists at least one layer, often called base layer, without which
reconstruction fails altogether, whereas all the other layers, often reconstruction fails altogether, whereas all the other layers, often
called enhancement layers, just help to continually improve the called enhancement layers, just help to continually improve the
quality. Consequently, the different layers shall be mapped on the quality. Consequently, the different layers are usually contained in
bitstream in decreasing order of importance, i.e. the base layer the (source-)encoded bitstream in decreasing order of importance,
data is followed by the various enhancement layers. i.e. the base layer data is followed by the various enhancement
layers.
An example can be found in the fine granular scalability modes which An example can be found in the fine granular scalability modes which
have been proposed to various standardization bodies like MPEG-4 [4] have been proposed to various standardization bodies like MPEG-4 [4]
or ITU (H.26L) [5], where the resolution of the scaling process in or ITU (H.26L) [5], where the resolution of the scaling process in
the progressive source encoder is as low as one symbol in the the progressive source encoder is as low as one symbol in the
enhancement layer. enhancement layer.
From the above definition, it is quite obvious that the most From the above definition, it is quite obvious that the most
important base layer data must be protected as strongly as possible important base layer data must be protected as strongly as possible
against packet loss during transmission. However, the protection of against packet loss during transmission. However, the protection of
the enhancement layers could be continually lowered, since a loss at the enhancement layers could be continually lowered, since a loss at
this stage has only minor consequences for the reconstruction this stage has only minor consequences for the reconstruction
process. Thus, by using a suitable unequal erasure protection process. Thus, by using a suitable unequal erasure protection
strategy across the message block, which contains the progressively strategy across a progressive source stream, the overhead due to
encoded source stream, the overhead due to redundancy spent per redundancy spent per (channel-)encoded block is reduced.
block is reduced. Furthermore, if channel conditions get worse Furthermore, if channel conditions get worse during transmission,
during transmission, only more and more enhancement layers are lost, only more and more enhancement layers are lost, i.e. a graceful
i.e. a graceful degradation in application quality at the receiver degradation in application quality at the receiver is achieved [6].
is achieved [6].
Nevertheless, it should be mentioned that the specific structure of
a (source-)encoded bitstream strongly depends on the actual media
codec in use, and the desired syntax which is used for adapting the
output of the codec to a suitable transport level format (see also
7.3). In order to keep the description of the unequal erasure
protection strategy in section 6 as general as possible, the final
bitstream which has to be protected by the proposed UXP scheme will
be called "info stream" in the following. Furthermore, it is assumed
that every info stream is already octet-aligned according to the
standard procedures defined in the context of the used syntax
specifications.
6. General Structure of UXP schemes 6. General Structure of UXP schemes
In this section, the principal features of the proposed UXP scheme
are described with a special focus on the protection and
reconstruction procedure which is applied to the info stream. In
addition, the behavior of the sender and receiver is specified as
far as it concerns the reconstruction of the info stream. However,
the complete UXP payload structure, including the additional UXP
header, is described in section 7.
Fig. 1 already illustrated the structure of a systematic codeword, Fig. 1 already illustrated the structure of a systematic codeword,
which shall be represented by a single row and n successive columns which shall be represented by a single row and n successive columns
that contain the information and the parity bytes. This structure that contain the information and the parity bytes. This structure
shall now be extended by forming a transmission block (TB) shall now be extended by forming a transmission block (TB)
consisting of L codewords of length n bytes each, which amounts to a consisting of L codewords of length n bytes each, which amounts to a
total of L rows and n columns [7]: Each column shall represent the total of L rows and n columns [7]: Each column, together with the
payload of an RTP packet, i.e. the whole data of a TB is transmitted respective UXP header in front, shall represent the payload of an
via a sequence of n RTP packets all carrying a payload of length L RTP packet, i.e. the whole data of a TB is transmitted via a
bytes. sequence of n RTP packets all carrying a payload of length (L+2)
bytes (UXP header included).
The value of L should be chosen in such a way that the whole length The value of L should be chosen in such a way that the whole length
of the resulting IP packet (i.e. RTP payload plus sum of UXP, RTP, of the resulting IP packet (i.e. RTP payload plus sum of RTP, UDP,
UDP, and IP header) equals a multiple of the segment size on the and IP header) equals a multiple of the segment size on the wireless
wireless link to avoid stuffing at the data link layer. link to avoid stuffing at the data link layer.
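The choice of L can be illustrated with a minimal sketch for the -01 text, in which each packet carries L bytes of TB data plus the UXP header. The 2-byte UXP header follows from the (L+2) payload length stated above. The remaining header sizes (IPv4 without options, UDP, RTP without CSRC entries), the function name choose_rows, and the example segment and packet sizes are assumptions made for illustration only.

```python
# Header sizes in bytes. The 2-byte UXP header follows from the (L+2) payload
# length stated in the text; the other values are assumptions (IPv4 without
# options, RTP fixed header without CSRC entries).
IP_HDR, UDP_HDR, RTP_HDR, UXP_HDR = 20, 8, 12, 2


def choose_rows(segment_size: int, max_ip_packet: int) -> int:
    """Return the largest L for which the IP packet fills whole segments."""
    overhead = IP_HDR + UDP_HDR + RTP_HDR + UXP_HDR
    # Largest multiple of the segment size that still fits the packet limit.
    packet_len = (max_ip_packet // segment_size) * segment_size
    rows = packet_len - overhead
    if rows <= 0:
        raise ValueError("no positive L fits the given sizes")
    return rows


# Hypothetical link: 30-byte segments, IP packets of at most 500 bytes.
L = choose_rows(segment_size=30, max_ip_packet=500)
print(L)
```

With these hypothetical numbers the IP packet is 480 bytes long, i.e. exactly 16 segments, so no stuffing is needed at the data link layer.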
As depicted in Fig. 2, the rows of the block shall be partitioned Each TB usually consists of two or more horizontal slices, the so-
into T+1 different classes CA_i, where i=0...T, such that each class called transmission sub blocks (TSB), as can be seen in Fig. 2: The
contains exactly A_i=|CA_i| consecutive rows of the matrix, where first L_s rows always belong to the signaling TSB, which is used to
the A_i have to satisfy the following relationship: convey the actual redundancy profile in the data part to the
receiver (see 7.3). The following L_d=(L-L_s) rows belong to one or
more data TSBs, which contain the interleaved and RS encoded info
stream, as will be described below.
A_0+A_1+...+A_T=L
Transmission Block (TB) Transmission Block (TB)
/\ +-+-+-+-+-+-+-+-+-+ /\
| | signaling TSB | | L_s bytes
| +-+-+-+-+-+-+-+-+-+ \/
| | | /\ /\
| + data TSB #1 + | L_d(1) bytes |
| | | | |
| +-+-+-+-+-+-+-+-+-+ \/ |
L bytes | | | /\ |
payload | + data TSB #2 + | L_d(2) bytes |
per packet | + | | | L_d bytes
| +-+-+-+-+-+-+-+-+-+ \/ |
| | . | . |
| + . + . |
| | . | . |
| +-+-+-+-+-+-+-+-+-+ /\ |
| | data TSB #z | | L_d(z) bytes |
\/ +-+-+-+-+-+-+-+-+-+ \/ \/
<----------------->
n packets
Fig. 2: General structure of a TB
Since the UXP procedure is mainly applied to the data TSBs, it will
be described next, whereas the content and syntax of the signaling
TSB will be defined in section 7.3.
For simplicity, only a single data TSB will be
assumed throughout the following explanation of the encoding and
decoding procedure. However, an extension to more than one data TSB
per TB is straightforward, and will be shown in section 7.4.
As depicted in Fig. 3, the rows of a transmission sub block shall be
partitioned into T+1 different classes CA_i, where i=0...T, such
that each class contains exactly A_i=|CA_i| consecutive rows of the
matrix, where the A_i have to satisfy the following relationship:
A_0+A_1+...+A_T=L_d
Data Transmission Sub Block (data TSB)
T T
<-------> <------->
/\ +-+-+-+-+-+-+-+-+-+ /\ /\ +-+-+-+-+-+-+-+-+-+ /\
| |&|&|&|&|&|*|*|*|*| | | |&|&|&|&|&|*|*|*|*| |
| +-+-+-+-+-+-+-+-+-+ | A_T=3 | +-+-+-+-+-+-+-+-+-+ | A_T=3
| |&|&|&|&|&|*|*|*|*| | | |&|&|&|&|&|*|*|*|*| |
| +-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+ |
L bytes | |&|&|&|&|&|*|*|*|*| \/ L_d bytes | |&|&|&|&|&|*|*|*|*| \/
payload | +-+-+-+-+-+-+-+-+-+ /\ per packet | +-+-+-+-+-+-+-+-+-+ /\
per packet | +%|%|%|%|%|%|*|*|*| | A_(T-1)=1 | +%|%|%|%|%|%|*|*|*| | A_(T-1)=1
| +-+-+-+-+-+-+-+-+-+ \/ | +-+-+-+-+-+-+-+-+-+ \/
| |$|$|$|$|$|$|$|*|*| . | |$|$|$|$|$|$|$|*|*| .
| +-+-+-+-+-+-+-+-+-+ . | +-+-+-+-+-+-+-+-+-+ .
| |||||||||*| . | |||||||||*| .
| +-+-+-+-+-+-+-+-+-+ /\ | +-+-+-+-+-+-+-+-+-+ /\
| |#|#|#|#|#|#|#|#|#| | A_0=1 | |#|#|#|#|#|#|#|#|#| | A_0=1
\/ +-+-+-+-+-+-+-+-+-+ \/ \/ +-+-+-+-+-+-+-+-+-+ \/
<-----------------> <----------------->
n packets n packets
&,%,$,,# : info bytes belonging to a certain source coding layer in &,%,$,,# : info bytes belonging to a certain info stream in
decreasing order of importance decreasing order of importance
* : parity bytes gained from Reed-Solomon coding * : parity bytes gained from Reed-Solomon coding
Fig. 2: General structure for coding with unequal erasure protection Fig. 3: General structure for coding with unequal erasure protection
Furthermore, all rows in a particular class CA_i shall contain Furthermore, all rows in a particular class CA_i shall contain
exactly the same number of parity bytes, which is equal to the index exactly the same number of parity bytes, which is equal to the index
i of the class. For each row in a certain class CA_i, the same (n,n- i of the class. For each row in a certain class CA_i, the same (n,n-
i) RS code shall be applied. i) RS code shall be applied.
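The relationship between the class sizes A_i and the capacity of a data TSB can be sketched as follows. The list av stands for the class sizes A_0...A_T, so av[i] rows are encoded with an (n,n-i) RS code. The function name profile_summary and the example profile are assumptions chosen for illustration; the profile is only loosely modeled on the figure above.

```python
def profile_summary(n: int, av: list[int]) -> tuple[int, int, int]:
    """Return (rows, info_bytes, parity_bytes) for a redundancy profile.

    av[i] is A_i, the number of rows carrying i parity bytes (i = 0..T).
    """
    if len(av) > n:
        # T = len(av) - 1 parity bytes must leave at least one info byte.
        raise ValueError("T must be smaller than n")
    rows = sum(av)                                   # A_0 + ... + A_T
    info_bytes = sum(a * (n - i) for i, a in enumerate(av))
    parity_bytes = sum(a * i for i, a in enumerate(av))
    return rows, info_bytes, parity_bytes


# Hypothetical profile: n = 9 packets, T = 4, three rows in the strongest class.
rows, info_bytes, parity_bytes = profile_summary(9, [1, 1, 1, 1, 3])
print(rows, info_bytes, parity_bytes)
```

For this example the profile occupies 7 rows, of which 45 byte positions carry info bytes and 18 carry parity bytes; together they fill the 7 x 9 array.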
As can be observed from Fig. 2, class CA_T contains the largest As can be observed from Fig. 3, class CA_T contains the largest
number of parity bytes per row, i.e. offers the highest erasure number of parity bytes per row, i.e. offers the highest erasure
protection capability in the block. Consequently, all base layer protection capability in the block. Consequently, the most important
data must be assigned to class CA_T, where the value of T should be element in the info stream must be assigned to class CA_T, where the
chosen according to the desired outage threshold of the base layer value of T should be chosen according to the desired outage
given a certain packet erasure rate on the link. threshold of the application given a certain packet erasure rate on
the link.
All other classes CA_(T-1)...CA_0 shall be sequentially filled with All other classes CA_(T-1)...CA_0 shall be sequentially filled with
enhancement layer data in decreasing order of importance, where the the remaining elements of the info stream in decreasing order of
optimal choice for the size of each class (0 or more rows), i.e. the importance, where the optimal choice for the size of each class (0
structure of the redundancy profile, should depend on the quality- or more rows), i.e. the structure of the redundancy profile, should
of-service requirements for the various layers. depend on the quality-of-service requirements for the various
(progressively-encoded) layers.
The following set of rules contains a compact description of all the The following set of rules contains a compact description of all the
operations that must be performed for each transmission block: operations that must be performed for each transmission block:
1.) The total number of columns n of the TB shall be chosen 1.) The total number of columns n of the TB shall be chosen
according to the actual delay constraints of the application. according to the actual delay constraints of the application.
2.) The maximum erasure correction capability T should be chosen 2.) Next, the expected number of rows reserved for the signaling TSB
according to the desired outage threshold of the base layer given has to be selected, which limits the data TSB to L_d=(L-L_s) rows.
the actual packet erasure rate on the link.
3.) The redundancy profile for the rest of the TB should depend on 3.) The maximum erasure correction capability T in the data TSB
the size and number of the various layers in the progressive source should be chosen according to the desired outage threshold of the
application given the actual packet erasure rate on the link.
4.) The redundancy profile for the rest of the data TSB should
depend on the size and number of the various layers in the info
stream, as well as the desired probability of successful decoding stream, as well as the desired probability of successful decoding
for each of them (quality-of-service requirement). for each of them (quality-of-service requirement).
4.) Beginning with the base layer, each layer in the progressive 5.) Any suitable optimization algorithm may be used for deriving an
source stream shall be assigned to exactly one class CA_T...CA_0 in adequate redundancy profile. However, the result has to satisfy the
decreasing order of importance. following constraints:
a) All available info byte positions in the data TSB have to be
completely filled. If the info stream is too short for a desired
profile, media stuffing may be applied to the empty info byte
positions at the end of the data TSB by appending a sufficient
number of bytes (with arbitrary value, e.g. 0x00). The actual number
of stuffing symbols per data TSB is then signaled via the respective
stuffing indicator (see 7.3). However, before resorting to any
stuffing, it should be checked whether it is possible to strengthen
the protection of certain rows instead, thus improving the overall
robustness of the decoding process.
b) The info stream should be fully contained within the data TSB
(unless cutting it off at a specific point is explicitly allowed by
the properties of the used media codec).
c) The number of required descriptors and stuffing indicators (see
section 7.3) to signal the profile shall not exceed the space
initially reserved for them in the signaling TSB.
Constraints a) and b) should be already incorporated in the
optimization algorithm. However, if constraint c) is not met, the
data TSB has to be reduced by one row in favor of the signaling TSB
to accommodate more space for the descriptors and stuffing
indicators, i.e. steps 2-5 have to be repeated until a valid
redundancy profile has been obtained.
5.) For each nonempty class CA_i, i=T...0, the following steps have 6.) For each nonempty class CA_i, i=T...0, in the data TSB, the
to be performed: following steps have to be performed:
a) All rows of this specific class shall be filled from left to a) All rows of this specific class shall be filled from left to
right and top to bottom with data bytes of the corresponding layer. right and top to bottom with data bytes of the info stream in
If the size of the layer is less than the available space for this decreasing order of importance (i.e. starting with the most
class, the empty positions may be filled with the first bytes of the important element).
next layer (in decreasing order of importance), such that there is
no overhead due to stuffing.
b) For each row in the class, the required i parity-check bytes are b) For each row in the class, the required i parity-check bytes are
computed from the same set of codewords of an (n,n-i) RS code, and computed from the same set of codewords of an (n,n-i) RS code, and
filled in the empty positions at the end of each row. Thus, every filled in the empty positions at the end of each row. Thus, every
row in the class constitutes a valid codeword of the chosen RS code. row in the class constitutes a valid codeword of the chosen RS code.
6.) If the total length of the progressively encoded source stream 7.) After having filled the whole data TSB with information and
exceeds the number of available info byte positions in the TB for parity bytes, the redundancy profile is mapped to the signaling TSB
the chosen redundancy profile, the final bytes of the least as described in section 7.3.
important enhancement layer shall be cut off until the remaining
parts fit completely into the TB.
7.) If the total length of the progressively encoded source stream
is less than the number of available info byte positions in the TB
for the chosen redundancy profile, byte-stuffing shall be applied to
the empty positions in the last class such that the stuffing value
does not influence the performance of the multimedia decoder at the
receiver.
8.) After having filled the whole TB with information and parity 8.) Each column of the resulting TB is now read out byte-wise from
bytes, each column is read out byte-wise from top to bottom and top to bottom and, together with the respective UXP header (see
mapped onto the payload part of one and only one RTP packet. section 7.2) in front, is mapped onto the payload section of one and
only one RTP packet.
9.) The n resulting RTP packets shall be transmitted successively to 9.) The n resulting RTP packets shall be transmitted successively to
the remote host, starting with the leftmost one. the remote host, starting with the leftmost one.
10.) At the corresponding protocol entity at the remote host, the 10.) At the corresponding protocol entity at the remote host, the
payload of all successfully received RTP packets belonging to the payload (without the UXP header) of all successfully received RTP
same sending TB shall be filled into a similar receiving TB column- packets belonging to the same sending TB shall be filled into a
wise from top to bottom and left to right. similar receiving TB column-wise from top to bottom and left to
right.
11.) For every erased packet of a received TB, the respective column 11.) For every erased packet of a received TB, the respective column
in the TB shall be filled with a suitable erasure marker. in the TB shall be filled with a suitable erasure marker.
12.) Given the redundancy profile assigned by the sender, for each 12.) Before any other operations can be performed, the redundancy
row a decoding operation shall be performed by applying any suitable profile has to be restored from the signaling TSB according to the
algorithm for erasure decoding. procedure defined in section 7.3. If the attempt fails because of
too many lost packets, the whole TB shall be discarded and the
receiving entity should wait for the next incoming TB (the source
decoder may be informed about the missing info stream, if required).
13.) For all rows for which the decoding operation has been 13.) If the attempt to recover the redundancy profile has been
successful, the reconstructed data bytes are read out from left to successful, a decoding operation shall be performed for each row of
right and top to bottom, and appended to the reconstructed version the data TSB by applying any suitable algorithm for erasure
of the progressive data stream. decoding.
14.) For all rows for which the decoding operation has not been 14.) For all rows of the data TSB for which the decoding operation
successful, a sufficient number of suitable dummy symbols may be has been successful, the reconstructed data bytes are read out from
added to the reconstructed data stream to inform the source decoder left to right and top to bottom, and appended to the reconstructed
version of the info stream.
15.) For all rows of the data TSB for which the decoding operation
has failed, a sufficient number of suitable dummy symbols may be
added to the reconstructed info stream to inform the source decoder
about the missing symbols. about the missing symbols.
One can easily realize that the above rules describe an interleaver, One can easily realize that the above rules describe an interleaver,
i.e. at the sender a single codeword of a TB is spread out over n i.e. at the sender a single codeword of a TB is spread out over n
successive packets. Thus, each codeword of a transmitted TB successive packets. Thus, each codeword of a transmitted TB
experiences the same number of erasures at exactly the same experiences the same number of erasures at exactly the same
positions. positions.
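The interleaving property described above can be illustrated with a
minimal sketch. It assumes the TB is held as a list of L rows with n
byte values each; the function names and the use of None as erasure
marker are choices made for this illustration only and are not
defined by this document.

```python
ERASED = None  # erasure marker chosen for this sketch (assumption)

def tb_to_payloads(tb):
    """Sender: read each column of the TB from top to bottom."""
    n = len(tb[0])
    return [bytes(row[c] for row in tb) for c in range(n)]

def payloads_to_tb(payloads, num_rows):
    """Receiver: refill the TB column-wise; a lost packet is passed
    as None and its whole column is filled with the erasure marker."""
    n = len(payloads)
    return [[payloads[c][r] if payloads[c] is not None else ERASED
             for c in range(n)]
            for r in range(num_rows)]

# Toy TB with L=3 rows and n=4 columns (values are arbitrary).
tb = [[10 * r + c for c in range(4)] for r in range(3)]
packets = tb_to_payloads(tb)
packets[2] = None                      # third packet is lost
rx = payloads_to_tb(packets, num_rows=3)
erased_positions = [[c for c, v in enumerate(row) if v is ERASED]
                    for row in rx]
```

In this sketch every row of the receiving TB carries its erasure at
the same column index, which is the property used in conclusions a)
and b) below.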
Two important conclusions can be drawn from this: Two important conclusions can be drawn from this:
a) Since the same RS code is applied to all rows contained in a a) Since the same RS code is applied to all rows contained in a
specific class, either all of them can be correctly decoded or not. specific class, either all of them can be correctly decoded or not.
Hence, there exist no partly decodable classes at the receiver. Hence, there exist no partly decodable classes at the receiver.
b) If decoding is successful for a certain class CA_i, all the b) If decoding is successful for a certain class CA_i, all the
classes CA_(i+1)...CA_T can also be decoded, since they are classes CA_(i+1)...CA_T can also be decoded, since they are
protected by at least one more parity byte per row. Together with protected by at least one more parity byte per row. Together with
rule 4, it is therefore always ensured, that in case a decodable rule 6, it is therefore always ensured, that in case a decodable
enhancement layer exists, the base layer it depends on can also be enhancement layer exists, all other layers it depends on can also be
reconstructed! reconstructed!
Given the maximum erasure protection value T, the redundancy profile Given the maximum erasure protection value T, the redundancy profile
for a TB of size (L x n) shall be denoted by a so-called erasure for a data TSB of size (L_d x n) shall be denoted by a so-called
protection vector AV of length (T+1), where erasure protection vector AV of length (T+1), where
AV:=(A_0,A_1,...,A_(T-1),A_T) AV:=(A_0,A_1,...,A_(T-1),A_T)
From the above definition, it is easy to realize that the trivial From the above definition, it is easy to realize that the trivial
cases of no erasure protection and EXP are a subset of UXP: cases of no erasure protection and EXP are a subset of UXP:
a) no erasure protection at all: all application data is mapped onto a) no erasure protection at all: all application data is mapped onto
class CA_0, i.e. AV=(L,0,0,...,0). class CA_0, i.e. AV=(L_d,0,0,...,0).
b) EXP: all application data is mapped onto class CA_T, i.e. b) EXP: all application data is mapped onto class CA_T, i.e.
AV=(0,0,...,0,A_T=L). AV=(0,0,...,0,A_T=L_d).
Hence, backward compatibility to currently standardized non- Hence, backward compatibility to currently standardized non-
progressive multimedia codecs is definitely achieved. progressive multimedia codecs is definitely achieved.
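As a hedged illustration of the erasure protection vector, the
following sketch counts the info byte positions offered by a data
TSB. It assumes that every row of class CA_i carries i parity bytes
and (n-i) info bytes, which is how the classes are read here; the
function name is hypothetical.

```python
def info_capacity(av, n):
    """Number of info byte positions in a data TSB of width n.

    av[i] is A_i, the number of rows in class CA_i; each such row is
    assumed to hold i parity bytes and (n - i) info bytes.
    """
    return sum(a_i * (n - i) for i, a_i in enumerate(av))

# a) no erasure protection: AV=(L_d,0,...,0) uses all L_d*n positions
unprotected = info_capacity((5, 0, 0), n=8)
# b) EXP: AV=(0,...,0,L_d) leaves L_d*(n-T) positions, here T=2
equal = info_capacity((0, 0, 5), n=8)
# a mixed profile with T=6 and n=20 columns
mixed = info_capacity((7, 0, 2, 2, 0, 3, 10), n=20)
```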
7. RTP payload structure 7. RTP payload structure
For every packet whose payload results from reading out a column of For every packet whose payload is formed by reading out a column of
the TB, the RTP header must be followed by an UXP header. the TB, the RTP header must be followed by an UXP header.
7.1. Specific settings in the RTP header 7.1. Specific settings in the RTP header
The timestamp of each RTP packet resulting from reading out a TB is The timestamp of each RTP packet resulting from reading out a TB is
set to the time instant when the first byte of the progressive set to the time instant when the first byte of the progressive
source data stream has been written into the TB. This results in the source data stream has been written into the TB. This results in the
TS value being the same for all RTP packets belonging to a specific TS value being the same for all RTP packets belonging to a specific
TB. TB.
The payload type is of dynamic type, and obtained through out-of- The payload type is of dynamic type, and obtained through out-of-
band signaling similar to [1]. The signaling protocol must establish band signaling similar to [1]. The signaling protocol must establish
a payload length to be associated with the payload type value. End a payload length to be associated with the payload type value. End
systems, which cannot recognize a payload type, must discard it. systems, which cannot recognize a payload type, must discard it.
The marker bit is set to 1 for every last packet in a TB. Otherwise,
its value is 0.
All other fields in the RTP header are set to those values proposed All other fields in the RTP header are set to those values proposed
for regular multimedia transmission using the same source codecs, for regular multimedia transmission using the same source codecs,
but no erasure protection scheme enabled. but no erasure protection scheme enabled.
The RTP payload shall consist of the UXP header followed by one
column of the TB.
7.2. Structure of the UXP header 7.2. Structure of the UXP header
The UXP header shall consist of 2 octets, and is shown in Fig. 3: The UXP header shall consist of 2 octets, and is shown in Fig. 4:
0 1 1 1 1 1 1 0 1 1 1 1 1 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|X| block PT | block length n| |X| block PT | block length n|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Fig. 3: Proposed UXP header Fig. 4: Proposed UXP header
The fields in the header shall be defined as follows: The fields in the header shall be defined as follows:
- X (bit 0): extension bit, reserved for future enhancements, - X (bit 0): extension bit, reserved for future enhancements,
currently not in use -> default value: 0 currently not in use -> default value: 0
- block PT (bits 1-7): regular RTP payload type to indicate the - block PT (bits 1-7): regular RTP payload type to indicate the
primary source encoding of the media media type contained in the info stream
- block length n (bits 8-15): indicates total number of RTP packets - block length n (bits 8-15): indicates total number of RTP packets
resulting from one TB (which equals resulting from one TB (which equals
the number of columns of the TB) the number of columns of the TB)
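The two-octet header layout can be sketched as follows. This is only
an illustration under the assumption that bit 0 in the figure is the
most significant bit of the first octet, as is customary for header
diagrams of this kind; the function names are hypothetical.

```python
import struct

def pack_uxp_header(block_pt, n, x=0):
    """Build the 2-octet UXP header: X (1 bit), block PT (7 bits),
    block length n (8 bits)."""
    if x not in (0, 1) or not 0 <= block_pt < 128 or not 0 <= n < 256:
        raise ValueError("field value out of range")
    return struct.pack("!BB", (x << 7) | block_pt, n)

def unpack_uxp_header(data):
    """Return (X, block PT, block length n) from the first 2 octets."""
    first, n = struct.unpack("!BB", data[:2])
    return first >> 7, first & 0x7F, n

# Arbitrary example values: block PT 96, TB with 20 columns.
header = pack_uxp_header(block_pt=96, n=20)
fields = unpack_uxp_header(header)
```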
Based on the RTP sequence number and the repetition of the block The syntax of the info stream which is protected by UXP is specified
length n in each UXP header, the receiving entity is able to by the RTP payload type field contained in the UXP header. For
recognize both TB boundaries and the actual position of lost packets example, payload type H.263 means that the info stream conforms to
in the TB. Furthermore, the specific choice of equal TS values for the specifications of the RTP profile for H.263, but does not
all RTP packets belonging to a TB allows for overcoming possible represent the "raw" H.263 stream produced by a H.263 encoder.
sequence number overflow. However, UXP can also be applied to the raw output of the media
codec (in case it is already octet-aligned), if this can be signaled
to the receiver via other means, e.g. by use of H.245 or SDP.
Based on the RTP sequence number, the marker bit, and the repetition
of the block length n in each UXP header, the receiving entity is
able to recognize both TB boundaries and the actual position of lost
packets in the TB. Furthermore, the specific choice of equal TS
values for all RTP packets belonging to a TB allows for overcoming
possible sequence number overflow.
7.3. In-band signaling of the structure of the redundancy profile 7.3. In-band signaling of the structure of the redundancy profile
To enable a dynamic adaptation to varying link conditions, the To enable a dynamic adaptation to varying link conditions, the
actual redundancy profile used for a specific TB must be signaled to actual redundancy profile used in the data TSB must be signaled to
the receiving entity. Since out-of-band signaling either results in the receiving entity. Since out-of-band signaling either results in
excessive additional control traffic, or prevents quick changes of excessive additional control traffic, or prevents quick changes of
the profile between successive TBs, an in-band signaling procedure the profile between successive TBs, an in-band signaling procedure
is desired. is desired.
As without knowledge of the correct redundancy profile, the decoding As without knowledge of the correct redundancy profile, the decoding
process cannot be applied to any of the erasure protection classes, process cannot be applied to any of the erasure protection classes,
it has to be protected as least as strongly as the base layer data it has to be protected at least as strongly as the most important
against packet loss. Therefore, a new class CA_P is added to the element in the info stream against packet loss. Therefore, an
beginning of the TB, where the number of parity symbols is by additional class CA_P is used in the signaling TSB, where the number
default set to the following value: of parity symbols is by default set to the following value:
P=ceil(n/2) P=ceil(n/2)
Hence, up to 50% of the RTP packets can be lost, before the Hence, up to 50% of the RTP packets can be lost, before the
redundancy profile cannot be recovered anymore. This seems to be a redundancy profile cannot be recovered anymore. This seems to be a
reasonable value for the lowest point of operation over a lossy reasonable value for the lowest point of operation over a lossy
link. Alternatively, p may be explicitly signaled during session link. Alternatively, p may be explicitly signaled during session
setup by means of SDP or H.245 protocol. setup by means of SDP or H.245 protocol.
Consequently, since all other classes must have equal or less Consequently, since all other classes must have equal or less
erasure protection capability, the maximum allowable value for class erasure protection capability, the maximum allowable value for class
CA_T is now limited to T<=P. CA_T in the data TSB is now limited to T<=P.
The signaling of the erasure protection vector is accomplished by The signaling of the erasure protection vector is accomplished by
means of descriptors. For each class CA_i with A_i>0, there is a means of descriptors. For each class CA_i with A_i>0, there is a
descriptor DP_i providing information about the size of class CA_i descriptor DP_i providing information about the size of class CA_i
(i.e. the value of A_i) and establishing a relationship between the (i.e. the value of A_i) and establishing a relationship between the
erasure protection of class CA_i and that of the first preceding erasure protection of class CA_i and that of the first preceding
class CA_(i+j) with A_(i+j)>0, where j>0. A descriptor DP_i is class CA_(i+j) with A_(i+j)>0, where j>0. A descriptor DP_i is
mapped onto one byte, which is sub-divided into two half-bytes (i.e. mapped onto one byte, which is sub-divided into two half-bytes (i.e.
the higher and the lower four bits). The first half-byte is of type the higher and the lower four bits). The first half-byte is of type
unsigned and contains the 4-bit representation of the decimal value unsigned and contains the 4-bit representation of the decimal value
A_i. The second half-byte is of type signed and contains the A_i. The second half-byte is of type signed and contains the
difference in erasure protection between class CA_i and class difference in erasure protection between class CA_i and class
CA_(i+j), i.e. the signed 4-bit representation of the decimal value CA_(i+j), i.e. the signed 4-bit representation of the decimal value
-j. Note that the erasure protection p and the size A_p=1 of class (-j) (where the MSB denotes the sign, and the lower three bits the
CA_p are fixed. absolute value). Note that the erasure protection p of class CA_p is
fixed, whereas the size A_p may vary.
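The mapping of one descriptor onto a byte can be sketched as shown
below, assuming the sign-and-magnitude reading of the second
half-byte given above (MSB is the sign, lower three bits the absolute
value). The function names are hypothetical. Note that the 4-bit
fields limit A_i to 15 and the absolute difference to 7 in this
reading.

```python
def encode_descriptor(a_i, diff):
    """a_i: class size A_i (1..15); diff: signed difference in
    erasure protection to the preceding non-empty class (-7..7)."""
    if not 0 < a_i <= 15 or not -7 <= diff <= 7:
        raise ValueError("value does not fit into a half-byte")
    sign = 0x8 if diff < 0 else 0x0
    return (a_i << 4) | sign | abs(diff)

def decode_descriptor(byte):
    """Inverse mapping: returns (A_i, signed difference)."""
    magnitude = byte & 0x7
    return byte >> 4, (-magnitude if byte & 0x8 else magnitude)

# A class of 10 rows whose protection is 4 below the preceding class
d1 = encode_descriptor(10, -4)
# The same class size, but 4 above the preceding class
d2 = encode_descriptor(10, +4)
```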
Thus, the data to be filled into class CA_p shall consist of a Thus, the data to be filled into class CA_p shall consist of a
sequence of descriptors, where the number of descriptors is given by sequence of descriptors separated by stuffing indicators (see
the number of protection classes CA_i, 0<=i<=T, with A_i>0. When the below), where the number of descriptors is primarily given by the
number of necessary descriptors exceeds the n-p information number of protection classes CA_i, 0<=i<=T, in the data TSB with
positions, the remaining descriptors are assigned to the next non- A_i>0.
empty class CA_i providing the highest erasure protection. If the Without a-priori knowledge, the initial value for the size of the
number of descriptors is less than n-p, however, empty positions in signaling TSB should be set to one (row). When the number of
class CA_p may be filled up with the first bytes of the base layer necessary descriptors and stuffing indicators exceeds the (n-p)
to avoid stuffing. information positions, one or more additional rows have to be
reserved. This is usually done by increasing the value for L_s to
A_p>1, i.e. the data TSB is reduced to (L-A_p) rows. Hence, in order
to indicate the actual size of the signaling TSB, an additional
descriptor is inserted at the very beginning, which takes on the
value 0xq0, where q denotes the (hexadecimal) four-bit representation of
the decimal value A_p.
The transition from descriptors to payload data needs not to be Furthermore, the end of each data TSB is signaled by the otherwise
signaled to the decoder, since it can be determined by the decoder unused descriptor value 0x00, followed by exactly one stuffing
through evaluation of the decoded descriptors and the a-priori indicator (SI). The latter is mapped onto a byte, which is of type
knowledge of the length L of the transmission block TB. unsigned and contains the 8-bit representation of the decimal value
Nevertheless, it can also be signaled explicitly by the otherwise of the number of media stuffing symbols used at the end of the
unused descriptor 0x00. respective data TSB.
The complete structure of the TB is now depicted in Fig. 4. The (extended) sequence of descriptors and stuffing indicators is
then mapped to the info byte positions in the A_p rows of the
signaling TSB from left to right and top to bottom. Each row is then
encoded with the same (n,n-p) RS code.
If the number of descriptors and stuffing indicators is less than
the available info byte positions, however, empty positions in class
CA_p may be filled up with the otherwise unused descriptor 0x00.
At the receiving entity, the sequence of descriptors shall be
recovered by performing erasure decoding on the first row of the TB
(which definitely belongs to the signaling TSB) using the same
algorithm as later for the data TSB. If successful, the very first
descriptor now indicates the number of rows of the signaling TSB,
and the next (A_p-1) rows are decoded to reconstruct the redundancy
profile for the data TSB(s), together with the number of media
stuffing symbols denoted by the respective SI(s).
The complete structure of the TB is now depicted in Fig. 5.
Transmission Block (TB) Transmission Block (TB)
P P
<---------> <--------->
/\ +-+-+-+-+-+-+-+-+-+ /\ /\ +-+-+-+-+-+-+-+-+-+ /\
| |?|?|?|?|*|*|*|*|*| | A_P=1 | |?|?|?|?|*|*|*|*|*| | A_P=1
| +-+-+-+-+-+-+-+-+-+ \/ | +-+-+-+-+-+-+-+-+-+ \/
| |&|&|&|&|&|*|*|*|*| /\ | |&|&|&|&|&|*|*|*|*| /\
| +-+-+-+-+-+-+-+-+-+ | A_T=3 | +-+-+-+-+-+-+-+-+-+ | A_T=3
| |&|&|&|&|&|*|*|*|*| | | |&|&|&|&|&|*|*|*|*| |
| +-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+ |
L bytes | |&|&|&|&|&|*|*|*|*| \/ L bytes | |&|&|&|&|&|*|*|*|*| \/
payload | +-+-+-+-+-+-+-+-+-+ /\ payload | +-+-+-+-+-+-+-+-+-+ /\
per packet | +%|%|%|%|%|%|*|*|*| | A_(T-1)=1 per packet | +%|%|%|%|%|%|*|*|*| | A_(T-1)=1
| +-+-+-+-+-+-+-+-+-+ \/ | +-+-+-+-+-+-+-+-+-+ \/
| |$|$|$|$|$|$|$|*|*| . | |$|$|$|$|$|$|$|*|*| .
| +-+-+-+-+-+-+-+-+-+ . | +-+-+-+-+-+-+-+-+-+ .
| |||||||||*| . | |||||||||*| .
| +-+-+-+-+-+-+-+-+-+ /\ | +-+-+-+-+-+-+-+-+-+ /\
| |#|#|#|#|#|#|#|#|#| | A_0=1 | |#|#|#|#|#|#|#|#|#| | A_0=1
\/ +-+-+-+-+-+-+-+-+-+ \/ \/ +-+-+-+-+-+-+-+-+-+ \/
<-----------------> <----------------->
n packets n packets
? : descriptors for in-band signaling of the redundancy ? : descriptors and stuffing indicators for in-band
profile signaling of the redundancy profile
&,%,$,,# : info bytes belonging to a certain source coding layer &,%,$,,# : info bytes belonging to a certain element of the
in decreasing order of importance info stream in decreasing order of importance
* : parity bytes gained from Reed-Solomon coding * : parity bytes gained from Reed-Solomon coding
Fig. 4: General structure for UXP with in-band signaling of the Fig. 5: General structure for UXP with in-band signaling of the
redundancy profile redundancy profile
The following simple example is meant to illustrate the idea behind The following simple example is meant to illustrate the idea behind
using descriptors: Let an erasure protection vector of length T+1=7 using descriptors: Let an erasure protection vector of length T+1=7
be given as follows: be given as follows:
AV=(A_0,A_1,...,A_5,A_6)=(7,0,2,2,0,3,10) AV=(A_0,A_1,...,A_5,A_6)=(7,0,2,2,0,3,10)
Hence, the length L of the TB (including one row for the Hence, the length L of the TB (including one row for the signaling
descriptors) is equal to 7+2+2+3+10+1=25 (rows/bytes). If the width TSB) is equal to 7+2+2+3+10+1=25 (rows/bytes). If the width is
is assumed to be equal to 20 (columns/packets), then the erasure assumed to be equal to 20 (columns/packets), then the erasure
protection of the descriptors is p=10. protection of the descriptors is p=10.
The corresponding sequence of descriptors can be written as The corresponding sequence of descriptors can be written as
DP=(DP_6,DP_5,DP_3,DP_2,DP_0)=(0xAC,0x39,0x2A,0x29,0x7A), DP=(DP_6,DP_5,DP_3,DP_2,DP_0)=(0xAC,0x39,0x2A,0x29,0x7A),
where the values of the descriptors are given in hexadecimal where the values of the descriptors are given in hexadecimal
notation. notation. Next, the descriptor indicating the length of the
signaling TSB has to be inserted, the end of the data TSB has to be
marked by 0x00, and the SI has to be appended. If the number of
media stuffing symbols is assumed to be 3, the 10 info bytes in the
signaling TSB take on the following values (descriptor stuffing
included):
Optional Concatenation of Transmission Blocks: (0x10,0xAC,0x39,0x2A,0x29,0x7A,0x00,0x03,0x00,0x00)
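The construction of the info bytes of the signaling TSB in this
example can be retraced with the following sketch. It is not a
normative procedure; the function names are hypothetical, and the
sign-and-magnitude coding of the second half-byte is assumed as
described in the text.

```python
def descriptors(av, p):
    """Descriptors for all non-empty classes, strongest first.
    The first difference is taken relative to the protection p of
    the signaling class."""
    out, prev = [], p
    for i in range(len(av) - 1, -1, -1):
        if av[i] == 0:
            continue
        diff = i - prev
        if av[i] > 15 or abs(diff) > 7:
            raise ValueError("value does not fit into a half-byte")
        out.append((av[i] << 4) | (0x8 if diff < 0 else 0x0) | abs(diff))
        prev = i
    return out

def signaling_bytes(av, n, p, stuffing, a_p=1):
    """Size descriptor, class descriptors, end marker 0x00, SI, and
    descriptor stuffing up to a_p*(n-p) info byte positions."""
    seq = [a_p << 4] + descriptors(av, p) + [0x00, stuffing]
    room = a_p * (n - p)
    if len(seq) > room:
        raise ValueError("signaling TSB too small, increase a_p")
    return seq + [0x00] * (room - len(seq))

sig = signaling_bytes((7, 0, 2, 2, 0, 3, 10), n=20, p=10, stuffing=3)
```

With the values of the example (n=20, p=10, 3 media stuffing
symbols), the sketch yields the ten info bytes listed above.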
The following procedure may be applied if a single message block 7.4 Optional Concatenation of Transmission Sub Blocks:
would be too short to achieve an efficient mapping to a transmission
block with respect to the fixed payload length L and the desired
number of packets n. For example, intra-coded video frames (I-
frames) are usually much larger than the following predicted ones
(P-frames). In this case, a certain number z of successive small
message blocks should be each mapped to a transmission block with
length L(y) and width n, such that L(1)+L(2)+...+L(z)=L.
The resulting transmission blocks can then be easily concatenated to
form a super-TB of size L x n.
Since the second half-byte of the descriptors is of type signed, we The following procedure may be applied if a single info stream would
are able to signal both decreasing and increasing erasure protection be too short to achieve an efficient mapping to a transmission block
profiles within one single sequence of descriptors at the beginning with respect to the fixed payload length L and the desired number of
of the super-TB. packets n. For example, intra-coded video frames (I-frames) are
usually much larger than the following predicted ones (P-frames). In
this case, a certain number z of successive small info streams
should be each mapped to a transmission sub block with length L_d(y)
and width n, such that L_d(1)+L_d(2)+...+L_d(z)=L_d.
The resulting transmission sub blocks can then be easily
concatenated to form a TB of size L x n having one common signaling
TSB: Since the second half-byte of the descriptors is of type
signed, we are able to incorporate both decreasing and increasing
erasure protection profiles within one single signaling TSB.
Note that once the lengths L_d(y) of the individual blocks have been
fixed, the respective redundancy profiles can be determined
independently of each other. However, the space initially reserved
for the signaling TSB should be already large enough to avoid
profile recalculation for each of the data TSBs in case the sequence
of descriptors gets too long!
Again, we will give a simple example to illustrate this idea: Let Again, we will give a simple example to illustrate this idea: Let
the erasure protection vectors for two concatenated TBs be given as the erasure protection vectors for two concatenated data TSBs be
follows: given as follows:
AV1=(A1_0,A1_1,...,A1_5,A1_6)=(0,0,2,2,0,3,10), AV1=(A1_0,A1_1,...,A1_5,A1_6)=(0,0,2,2,0,3,10),
AV2=(A2_0,A2_1,...,A2_5,A2_6)=(0,0,2,2,0,3,10). AV2=(A2_0,A2_1,...,A2_5,A2_6)=(0,0,2,2,0,3,10).
Hence, two single identical TBs will be concatenated to form a Hence, two single identical data TSBs will be concatenated to form a
super-TB of length L=2*(2+2+3+10)+1=35 (rows/bytes). If the width is TB of length L=2*(2+2+3+10)+2=36 (rows/bytes). If the width is again
again assumed to be equal to 20 (columns/packets), then the erasure assumed to be equal to 20 (columns/packets), then the erasure
protection of the descriptors is p=10. The corresponding sequence of protection of the descriptors is p=10, and therefore a total of two
descriptors can now be written as rows for the signaling TSB have been reserved this time. The
corresponding sequence of descriptors can now be written as
DP=(0xAC,0x39,0x2A,0x29,0xA4,0x39,0x2A,0x29), where the values of DP=(0xAC,0x39,0x2A,0x29,0xA4,0x39,0x2A,0x29), where the values of
the descriptors are given in hexadecimal notation. the descriptors are given in hexadecimal notation. If the number of
media stuffing symbols is assumed to be 3 for each data TSB, the 20
info byte positions in the signaling TSB are filled with the
following values (descriptor stuffing included):
(0x20,0xAC,0x39,0x2A,0x29,0x00,0x03,0xA4,0x39,0x2A,0x29,0x00,0x03,
0x00,0x00,0x00,0x00,0x00,0x00,0x00)
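On the receiving side, the decoded info bytes of the signaling TSB
can be turned back into the individual redundancy profiles. The
sketch below is one possible reading, not a normative parser: it
assumes that the protection level carries over from the last class of
one data TSB to the first class of the next, and that a 0x00 byte
found before any descriptor of a new data TSB is descriptor stuffing.
The function name and the parameter t (the value T) are hypothetical.

```python
def parse_signaling(info, p, t):
    """Return (A_p, [(AV, SI), ...]) from the info bytes of the
    signaling TSB, given the protection p of class CA_p and T=t."""
    a_p = info[0] >> 4                 # first descriptor has the form 0xq0
    profiles, av, level, i = [], [0] * (t + 1), p, 1
    while i < len(info):
        byte = info[i]
        if byte == 0x00:
            if not any(av):
                break                  # only descriptor stuffing remains
            profiles.append((tuple(av), info[i + 1]))  # SI follows 0x00
            av = [0] * (t + 1)
            i += 2
            continue
        magnitude = byte & 0x7
        level += -magnitude if byte & 0x8 else magnitude
        av[level] = byte >> 4          # A_i of the class at this level
        i += 1
    return a_p, profiles

info = [0x20, 0xAC, 0x39, 0x2A, 0x29, 0x00, 0x03,
        0xA4, 0x39, 0x2A, 0x29, 0x00, 0x03,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]
a_p, profiles = parse_signaling(info, p=10, t=6)
```

Applied to the 20 info bytes of the example, the sketch recovers
A_p=2 and twice the vector (0,0,2,2,0,3,10), each with 3 media
stuffing symbols.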
8. Security Considerations 8. Security Considerations
The payload of the RTP-packets consists of an interleaved The payload of the RTP-packets consists of an interleaved multimedia
multimedia- and parity-stream. Therefore, it is reasonable to and parity stream. Therefore, it is reasonable to encrypt the
encrypt the resulting stream with one key rather than using resulting stream with one key rather than using different keys for
different keys for multimedia and parity-data. It should also be multimedia and parity data. It should also be noted that encryption
noted that encryption of the multimedia data without encryption of of the multimedia data without encryption of the parity data could
the parity-data could enable known-plaintext attacks. enable known-plaintext attacks.
The amount of parity bytes per TB should be chosen carefully if the The overall proportion between parity bytes and info bytes should be
packet loss is due to network congestion. If the amount of parity chosen carefully if the packet loss is due to network congestion. If
bytes per TB is raised to cope with increasing packet loss, this can the proportion of parity bytes per TB is increased in this case, it
lead to increasing network congestion. Therefore, the amount of could lead to increasing network congestion. Therefore, the
parity bytes per TB MUST NOT be significantly increased as packet proportion between parity bytes and info bytes per TB MUST NOT be
loss increases due to network congestion. increased as packet loss increases due to network congestion.
9. References The overall ratio between parity and info bytes MUST NOT be higher
than 1:1, i.e. the absolute bitrate spent for redundancy must not be
larger than the bitrate required for transmission of multimedia data
itself.
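A sender could check the 1:1 limit before transmitting a TB, for
example as in the following sketch. The function name is
hypothetical, and two assumptions are made that are not fixed by the
text: every row of class CA_i holds i parity bytes, and the
descriptor bytes of the signaling TSB are counted on the info side.

```python
def parity_info_ratio(av, n, p, a_p=1):
    """Ratio of parity bytes to all remaining bytes of one TB with a
    single data TSB described by av and a_p signaling rows."""
    parity = a_p * p + sum(a_i * i for i, a_i in enumerate(av))
    total = (a_p + sum(av)) * n
    return parity / (total - parity)

# Values of the example in section 7.3: n=20, p=10, one signaling row
ratio = parity_info_ratio((7, 0, 2, 2, 0, 3, 10), n=20, p=10)
within_limit = ratio <= 1.0
```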
9. Application Statement
There are currently two different schemes proposed for unequal error
protection in the IETF-AVT: Unequal Level Protection (ULP) and
Unequal Erasure Protection (UXP).
Although both methods seem to address the same problem, the proposed
solutions differ in many respects. This section tries to describe
possible application scenarios and to show the strengths and
weaknesses of both approaches.
The main difference between both approaches is that while ULP
preserves the structure of the packets which have to be protected and
provides the redundancy in extra packets, UXP interleaves the info
stream which has to be protected, inserts the redundancy information,
and thus creates a totally new packet structure.
Another difference concerns multicast compatibility: It cannot be
assumed that all future terminals will be able to apply UXP/ULP.
Therefore, backward compatibility could be an issue in some cases.
Since ULP does not change the original packet structure, but only
adds some extra packets, it is possible for terminals which do not
support ULP to discard the extra packets. In case of UXP, however,
two separate streams with and without erasure protection have to be
sent, which increases the bandwidth.
Next, both approaches offer different mechanisms to adjust packet
sizes, if necessary: UXP allows the packet sizes to be adjusted
arbitrarily. This is an advantage in case the loss probability is
dependent on the packet length, which happens, for example, if the
end-to-end connection contains wireless links. In this case proper
adjustment of the packet size is one essential network adaptation
technique. In addition, if a pre-encoded stream is sent over the
network, the packet size can be adjusted independently of slice
structures.
Since ULP does not change the existing packetization scheme, this
flexibility does not exist.
The ability of UXP to adjust the packet size arbitrarily can be
especially exploited in a streaming scenario, if a delay of several
hundred milliseconds is acceptable. It is then possible to fill
several video frames into a single TB of desired size, e.g. a group
of pictures consisting of I-frame, P-frames and B-frames. The
redundancy scheme can thus be selected in such a way as to guarantee
the following property: In case of packet loss, the streams for P-
frames are only recoverable, if the I-frame, on which the decoding of
P-frames depends, is recoverable. The same is true for B-frames,
which can only be decoded if the respective P-frames are recoverable.
This prevents situations in which, for example, the B-frames have
been received correctly, but the P-frames have been lost, i.e.
assures a gradual decrease in application quality also on the frame
level. Of course, a similar encoding is possible with ULP. But in
this case one might have to send several frames within one packet
which leads to large packet sizes.
Finally, decoding delay is also a crucial issue in communications.
Again, both approaches have different delay properties: UXP
introduces a decoding delay because a reasonable number of correctly
received packets is necessary to start decoding of a TB. The delay
in general depends on the dimensions of the interleaver. This should
be considered for any system design which includes UXP.
With ULP, every correctly received media packet can be decoded right
away. However, a significant delay is introduced, if packets are
corrupted, because in this case one has to wait for several
redundancy packets. Thus, the delay is in general dependent on the
actual ULP-FEC-packet scheme and cannot be considered in advance
during the system design phase.
10. Intellectual Property Considerations
Siemens AG has filed patent applications that might possibly have
technical relations to this contribution.
On IPR related issues, Siemens AG refers to the Siemens Statement on
Patent Licensing, see http://www.ietf.org/ietf/IPR/SIEMENS-General.
11. References
[1] J. Rosenberg and H. Schulzrinne, "An RTP Payload Format for [1] J. Rosenberg and H. Schulzrinne, "An RTP Payload Format for
Generic Forward Error Correction", Request for Comments 2733, Generic Forward Error Correction", Request for Comments 2733,
Internet Engineering Task Force, Dec. 1999. Internet Engineering Task Force, Dec. 1999.
[2] A. Albanese, J. Bloemer, J. Edmonds, M. Luby, and M. Sudan, [2] A. Albanese, J. Bloemer, J. Edmonds, M. Luby, and M. Sudan,
"Priority encoding transmission", IEEE Trans. Inform. Theory, vol. "Priority encoding transmission", IEEE Trans. Inform. Theory, vol.
42, no. 6, pp. 1737-1744, Nov. 1996. 42, no. 6, pp. 1737-1744, Nov. 1996.
[3] Shu Lin and Daniel J. Costello, Error Control Coding: [3] Shu Lin and Daniel J. Costello, Error Control Coding:
skipping to change at page 16, line 5 skipping to change at page 21, line 34
Mode for H.26L", ITU-T SG16, Q.15, Q15-J24, Osaka, May 2000. Mode for H.26L", ITU-T SG16, Q.15, Q15-J24, Osaka, May 2000.
[6] F. Burkert, T. Stockhammer, and J. Pandel, "Progressive A/V [6] F. Burkert, T. Stockhammer, and J. Pandel, "Progressive A/V
coding for lossy packet networks - a principle approach", Tech. coding for lossy packet networks - a principle approach", Tech.
Rep., ITU-T SG16, Q.15, Q15-I36, Red Bank, N.J., Oct. 1999. Rep., ITU-T SG16, Q.15, Q15-I36, Red Bank, N.J., Oct. 1999.
[7] Guenther Liebl, "Modeling, theoretical analysis, and coding for [7] Guenther Liebl, "Modeling, theoretical analysis, and coding for
wireless packet erasure channels", Diploma Thesis, Inst. for wireless packet erasure channels", Diploma Thesis, Inst. for
Communications Engineering, Munich University of Technology, 1999. Communications Engineering, Munich University of Technology, 1999.
10. Acknowledgments 12. Acknowledgments
Many thanks to Thomas Stockhammer, who initially came up with the Many thanks to Thomas Stockhammer, who initially came up with the
idea of unequal erasure protection to improve progressive video idea of unequal erasure protection to improve progressive video
transmission over lossy networks. transmission over lossy networks.
11. Author's Addresses 13. Author's Addresses
Guenther Liebl, Thomas Stockhammer Guenther Liebl, Thomas Stockhammer
Institute for Communications Engineering (LNT) Institute for Communications Engineering (LNT)
Munich University of Technology Munich University of Technology
D-80290 Munich D-80290 Munich
Germany Germany
Email: {liebl,tom}@lnt.e-technik.tu-muenchen.de Email: {liebl,tom}@lnt.e-technik.tu-muenchen.de
Minh-Ha Nguyen, Frank Burkert Minh-Ha Nguyen, Frank Burkert
Siemens AG - ICM D MP RD MCH 83/81 Siemens AG - ICM D MP RD MCH 83/81
D-81675 Munich D-81675 Munich
Germany Germany
Email: {minhha.nguyen,frank.burkert}@mch.siemens.de Email: {minhha.nguyen,frank.burkert}@mch.siemens.de
Marcel Wagner, Juergen Pandel, Gero Baese Marcel Wagner, Juergen Pandel, Wenrong Weng, Gero Baese
Siemens AG - Corporate Technology CT IC 2 Siemens AG - Corporate Technology CT IC 2
D-81730 Munich D-81730 Munich
Germany Germany
Email: {marcel.wagner,juergen.pandel,gero.baese}@mchp.siemens.de Email: {marcel.wagner,juergen.pandel,wenrong.weng,gero.baese}@mchp.siemens.de
Full Copyright Statement Full Copyright Statement
"Copyright (C) The Internet Society (date). All Rights Reserved. "Copyright (C) The Internet Society (date). All Rights Reserved.
This document and translations of it may be copied and furnished to This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this are included on all such copies and derivative works. However, this
 End of changes. 