draft-ietf-rmcat-video-traffic-model-05.txt | draft-ietf-rmcat-video-traffic-model-06.txt | |||
---|---|---|---|---|
Network Working Group X. Zhu | Network Working Group X. Zhu | |||
Internet-Draft S. Mena | Internet-Draft S. Mena | |||
Intended status: Informational Cisco Systems | Intended status: Informational Cisco Systems | |||
Expires: January 20, 2019 Z. Sarker | Expires: May 7, 2019 Z. Sarker | |||
Ericsson AB | Ericsson AB | |||
July 19, 2018 | November 3, 2018 | |||
Video Traffic Models for RTP Congestion Control Evaluations | Video Traffic Models for RTP Congestion Control Evaluations | |||
draft-ietf-rmcat-video-traffic-model-05 | draft-ietf-rmcat-video-traffic-model-06 | |||
Abstract | Abstract | |||
This document describes two reference video traffic models for | This document describes two reference video traffic models for | |||
evaluating RTP congestion control algorithms. The first model | evaluating RTP congestion control algorithms. The first model | |||
statistically characterizes the behavior of a live video encoder in | statistically characterizes the behavior of a live video encoder in | |||
response to changing requests on target video rate. The second model | response to changing requests on target video rate. The second model | |||
is trace-driven, and emulates the output of actual encoded video | is trace-driven, and emulates the output of actual encoded video | |||
frame sizes from a high-resolution test sequence. Both models are | frame sizes from a high-resolution test sequence. Both models are | |||
designed to strike a balance between simplicity, repeatability, and | designed to strike a balance between simplicity, repeatability, and | |||
skipping to change at page 1, line 42 ¶ | skipping to change at page 1, line 42 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on January 20, 2019. | This Internet-Draft will expire on May 7, 2019. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 24 ¶ | skipping to change at page 2, line 24 ¶ | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
3. Desired Behavior of A Synthetic Video Traffic Model . . . . . 3 | 3. Desired Behavior of A Synthetic Video Traffic Model . . . . . 3 | |||
4. Interactions Between Synthetic Video Traffic Source and | 4. Interactions Between Synthetic Video Traffic Source and | |||
Other Components at the Sender . . . . . . . . . . . . . . . 4 | Other Components at the Sender . . . . . . . . . . . . . . . 4 | |||
5. A Statistical Reference Model . . . . . . . . . . . . . . . . 6 | 5. A Statistical Reference Model . . . . . . . . . . . . . . . . 6 | |||
5.1. Time-damped response to target rate update . . . . . . . 7 | 5.1. Time-damped response to target rate update . . . . . . . 7 | |||
5.2. Temporary burst and oscillation during transient . . . . 8 | 5.2. Temporary burst and oscillation during the transient | |||
period . . . . . . . . . . . . . . . . . . . . . . . . . 8 | ||||
5.3. Output rate fluctuation at steady state . . . . . . . . . 8 | 5.3. Output rate fluctuation at steady state . . . . . . . . . 8 | |||
5.4. Rate range limit imposed by video content . . . . . . . . 9 | 5.4. Rate range limit imposed by video content . . . . . . . . 9 | |||
6. A Trace-Driven Model . . . . . . . . . . . . . . . . . . . . 9 | 6. A Trace-Driven Model . . . . . . . . . . . . . . . . . . . . 9 | |||
6.1. Choosing the video sequence and generating the traces . . 10 | 6.1. Choosing the video sequence and generating the traces . . 10 | |||
6.2. Using the traces in the synthetic codec . . . . . . . . . 11 | 6.2. Using the traces in the synthetic codec . . . . . . . . . 11 | |||
6.2.1. Main algorithm . . . . . . . . . . . . . . . . . . . 11 | 6.2.1. Main algorithm . . . . . . . . . . . . . . . . . . . 11 | |||
6.2.2. Notes to the main algorithm . . . . . . . . . . . . . 13 | 6.2.2. Notes to the main algorithm . . . . . . . . . . . . . 13 | |||
6.3. Varying frame rate and resolution . . . . . . . . . . . . 13 | 6.3. Varying frame rate and resolution . . . . . . . . . . . . 13 | |||
7. Combining The Two Models . . . . . . . . . . . . . . . . . . 14 | 7. Combining The Two Models . . . . . . . . . . . . . . . . . . 14 | |||
8. Implementation Status . . . . . . . . . . . . . . . . . . . . 15 | 8. Implementation Status . . . . . . . . . . . . . . . . . . . . 15 | |||
skipping to change at page 3, line 42 ¶ | skipping to change at page 3, line 44 ¶ | |||
A live video encoder employs encoder rate control to meet a target | A live video encoder employs encoder rate control to meet a target | |||
rate by varying its encoding parameters, such as quantization step | rate by varying its encoding parameters, such as quantization step | |||
size, frame rate, and picture resolution, based on its estimate of | size, frame rate, and picture resolution, based on its estimate of | |||
the video content (e.g., motion and scene complexity). In practice, | the video content (e.g., motion and scene complexity). In practice, | |||
however, several factors prevent the output video rate from perfectly | however, several factors prevent the output video rate from perfectly | |||
conforming to the input target rate. | conforming to the input target rate. | |||
Due to uncertainties in the captured video scene, the output rate | Due to uncertainties in the captured video scene, the output rate | |||
typically deviates from the specified target. In the presence of a | typically deviates from the specified target. In the presence of a | |||
significant change in target rate, it sometimes takes several frames | significant change in target rate, the encoder output frame sizes | |||
before the encoder output rate converges to the new target. Finally, | sometimes fluctuate for a short, transient period of time before the | |||
while most of the frames in a live session are encoded in predictive | output rate converges to the new target. Finally, while most of the | |||
mode, the encoder can occasionally generate a large intra-coded frame | frames in a live session are encoded in predictive mode, the encoder | |||
(or a frame partially containing intra-coded blocks) in an attempt to | can occasionally generate a large intra-coded frame (or a frame | |||
recover from losses, to re-sync with the receiver, or during the | partially containing intra-coded blocks) in an attempt to recover | |||
transient period of responding to target rate or spatial resolution | from losses, to re-sync with the receiver, or during the transient | |||
changes. | period of responding to target rate or spatial resolution changes. | |||
Hence, a synthetic video source should have the following | Hence, a synthetic video source should have the following | |||
capabilities: | capabilities: | |||
o To change bitrate. This includes the ability to change framerate and/ | o To change bitrate. This includes the ability to change framerate and/ | |||
or spatial resolution, or to skip frames when required. | or spatial resolution, or to skip frames when required. | |||
o To fluctuate around the target bitrate specified by the congestion | o To fluctuate around the target bitrate specified by the congestion | |||
control module. | control module. | |||
skipping to change at page 5, line 15 ¶ | skipping to change at page 5, line 15 ¶ | |||
Section 6 --- follow the same set of interactions. | Section 6 --- follow the same set of interactions. | |||
The synthetic video source dynamically generates a sequence of dummy | The synthetic video source dynamically generates a sequence of dummy | |||
video frames with varying size and interval. These dummy frames are | video frames with varying size and interval. These dummy frames are | |||
processed by other modules in order to transmit the video stream over | processed by other modules in order to transmit the video stream over | |||
the network. During the lifetime of a video transmission session, | the network. During the lifetime of a video transmission session, | |||
the synthetic video source will typically be required to adapt its | the synthetic video source will typically be required to adapt its | |||
encoding bitrate, and sometimes the spatial resolution and frame | encoding bitrate, and sometimes the spatial resolution and frame | |||
rate. | rate. | |||
In our model, the synthetic video source module has a group of | In this model, the synthetic video source module has a group of | |||
incoming and outgoing interface calls that allow for interaction with | incoming and outgoing interface calls that allow for interaction with | |||
other modules. The following are some of the possible incoming | other modules. The following are some of the possible incoming | |||
interface calls --- marked as (a) in Figure 1 --- that the synthetic | interface calls --- marked as (a) in Figure 1 --- that the synthetic | |||
video traffic source may accept. The list is not exhaustive and can | video traffic source may accept. The list is not exhaustive and can | |||
be complemented by other interface calls if deemed necessary. | be complemented by other interface calls if deemed necessary. | |||
o Target rate R_v: target rate request, typically calculated by the | o Target rate R_v: target rate request, typically calculated by the | |||
congestion control module and updated dynamically over time. | congestion control module and updated dynamically over time. | |||
Depending on the congestion control algorithm in use, the update | Depending on the congestion control algorithm in use, the update | |||
requests can either be periodic (e.g., once per second), or on- | requests can either be periodic (e.g., once per second), or on- | |||
skipping to change at page 7, line 14 ¶ | skipping to change at page 7, line 14 ¶ | |||
+===========+====================================+================+ | +===========+====================================+================+ | |||
| Notation | Parameter Name | Example Value | | | Notation | Parameter Name | Example Value | | |||
+===========+====================================+================+ | +===========+====================================+================+ | |||
| R_v | Target rate request | 1 Mbps | | | R_v | Target rate request | 1 Mbps | | |||
+-----------+------------------------------------+----------------+ | +-----------+------------------------------------+----------------+ | |||
| FPS | Target frame rate | 30 Hz | | | FPS | Target frame rate | 30 Hz | | |||
+-----------+------------------------------------+----------------+ | +-----------+------------------------------------+----------------+ | |||
| tau_v | Encoder reaction latency | 0.2 s | | | tau_v | Encoder reaction latency | 0.2 s | | |||
+-----------+------------------------------------+----------------+ | +-----------+------------------------------------+----------------+ | |||
| K_d | Burst duration during transient | 8 frames | | | K_d | Burst duration of the transient | 8 frames | | |||
| | period | | | ||||
+-----------+------------------------------------+----------------+ | +-----------+------------------------------------+----------------+ | |||
| K_B | Burst frame size during transient | 13.5 KBytes* | | | K_B | Burst frame size during the | 13.5 KBytes* | | |||
| | transient period | | | ||||
+-----------+------------------------------------+----------------+ | +-----------+------------------------------------+----------------+ | |||
| t0 | Reference frame interval 1/FPS | 33 ms | | | t0 | Reference frame interval 1/FPS | 33 ms | | |||
+-----------+------------------------------------+----------------+ | +-----------+------------------------------------+----------------+ | |||
| B0 | Reference frame size R_v/8/FPS | 4.17 KBytes | | | B0 | Reference frame size R_v/8/FPS | 4.17 KBytes | | |||
+-----------+------------------------------------+----------------+ | +-----------+------------------------------------+----------------+ | |||
| | Scaling parameter of the zero-mean | | | | | Scaling parameter of the zero-mean | | | |||
| | Laplacian distribution describing | | | | | Laplacian distribution describing | | | |||
| SCALE_t | deviations in normalized frame | 0.15 | | | SCALE_t | deviations in normalized frame | 0.15 | | |||
| | interval (t-t0)/t0 | | | | | interval (t-t0)/t0 | | | |||
+-----------+------------------------------------+----------------+ | +-----------+------------------------------------+----------------+ | |||
skipping to change at page 8, line 7 ¶ | skipping to change at page 8, line 8 ¶ | |||
5.1. Time-damped response to target rate update | 5.1. Time-damped response to target rate update | |||
While the congestion control module can update its target rate | While the congestion control module can update its target rate | |||
request R_v at any time, the statistical model dictates that the | request R_v at any time, the statistical model dictates that the | |||
encoder will only react to such changes tau_v seconds after a | encoder will only react to such changes tau_v seconds after a | |||
previous rate transition. In other words, when the encoder has | previous rate transition. In other words, when the encoder has | |||
reacted to a rate change request at time t, it will simply ignore all | reacted to a rate change request at time t, it will simply ignore all | |||
subsequent rate change requests until time t+tau_v. | subsequent rate change requests until time t+tau_v. | |||
5.2. Temporary burst and oscillation during transient | 5.2. Temporary burst and oscillation during the transient period | |||
The output rate R_o during the period [t, t+tau_v] is considered to | The output rate R_o during the period [t, t+tau_v] is considered to | |||
be in transient. Based on observations from video encoder output | be in a transient state. Based on observations from video encoder | |||
data, the transient behavior of an encoder upon reacting to a new | output data, the encoder reaction to a new target rate request can be | |||
target rate request is modelled in the form of high variation in | characterized by high variation in output frame sizes. It is assumed | |||
output frame sizes. It is assumed that the overall average output | in the model that the overall average output rate R_o during this | |||
rate R_o during this period matches the target rate R_v. | transient period matches the target rate R_v. Consequently, the | |||
Consequently, the occasional burst of large frames is followed by | occasional burst of large frames is followed by smaller-than-average | |||
smaller-than-average encoded frames. | encoded frames. | |||
This temporary burst is characterized by two parameters: | This temporary burst is characterized by two parameters: | |||
o burst duration K_d: number of frames in the burst event; and | o burst duration K_d: number of frames in the burst event; and | |||
o burst frame size K_B: size of the initial burst frame which is | o burst frame size K_B: size of the initial burst frame which is | |||
typically significantly larger than average frame size at steady | typically significantly larger than average frame size at steady | |||
state. | state. | |||
It can be noted that these burst parameters can also be used to mimic | It can be noted that these burst parameters can also be used to mimic | |||
skipping to change at page 10, line 24 ¶ | skipping to change at page 10, line 24 ¶ | |||
are representative of the target use cases for the video traffic | are representative of the target use cases for the video traffic | |||
model. For the example use case of interactive video conferencing, | model. For the example use case of interactive video conferencing, | |||
it is recommended to choose a low-motion sequence that resembles a | it is recommended to choose a low-motion sequence that resembles a | |||
"talking head", e.g. from a news broadcast or recording of an actual | "talking head", e.g. from a news broadcast or recording of an actual | |||
video conferencing call. | video conferencing call. | |||
The length of the chosen video sequence is a tradeoff. If it is too | The length of the chosen video sequence is a tradeoff. If it is too | |||
long, it will be difficult to manage the data structures containing | long, it will be difficult to manage the data structures containing | |||
the traces. If it is too short, there will be an obvious periodic | the traces. If it is too short, there will be an obvious periodic | |||
pattern in the output frame sizes, leading to biased results when | pattern in the output frame sizes, leading to biased results when | |||
evaluating congestion control performance. In our experience, a | evaluating congestion control performance. It has been empirically | |||
sequence with a length between 2 and 4 minutes is a fair tradeoff. | determined that a sequence with a length between 2 and 4 minutes | |||
strikes a fair tradeoff. | ||||
Given the chosen raw video sequence, denoted S, one can use a live | Given the chosen raw video sequence, denoted S, one can use a live | |||
encoder, e.g. some implementation of [H264] or [HEVC], to produce a | encoder, e.g. some implementation of [H264] or [HEVC], to produce a | |||
set of encoded sequences. As discussed in Section 3, the output | set of encoded sequences. As discussed in Section 3, the output | |||
bitrate of the live encoder can be achieved by tuning three input | bitrate of the live encoder can be achieved by tuning three input | |||
parameters: quantization step size, frame rate, and picture | parameters: quantization step size, frame rate, and picture | |||
resolution. In order to simplify the choice of these parameters for | resolution. In order to simplify the choice of these parameters for | |||
a given target rate, one can typically assume a fixed frame rate | a given target rate, one can typically assume a fixed frame rate | |||
(e.g. 30 fps) and a fixed resolution (e.g., 720p) when configuring | (e.g. 30 fps) and a fixed resolution (e.g., 720p) when configuring | |||
the live encoder. See Section 6.3 for a discussion on how to relax | the live encoder. See Section 6.3 for a discussion on how to relax | |||
skipping to change at page 13, line 14 ¶ | skipping to change at page 13, line 14 ¶ | |||
factor = R_v / R_min | factor = R_v / R_min | |||
framesize = max(1, factor * Traces[R_min][t_current]) | framesize = max(1, factor * Traces[R_min][t_current]) | |||
c) R_v >= R_max: the output frame size is calculated by scaling with | c) R_v >= R_max: the output frame size is calculated by scaling with | |||
respect to the highest bitrate R_max: | respect to the highest bitrate R_max: | |||
factor = R_v / R_max | factor = R_v / R_max | |||
framesize = factor * Traces[R_max][t_current] | framesize = factor * Traces[R_max][t_current] | |||
In case b), we set the minimum output size to 1 byte, since the value | In case b), the minimum output size is set to 1 byte, since the value | |||
of factor can be arbitrarily close to 0. | of factor can be arbitrarily close to 0. | |||
6.2.2. Notes to the main algorithm | 6.2.2. Notes to the main algorithm | |||
Note that the main algorithm as described above can be further extended | Note that the main algorithm as described above can be further extended | |||
to mimic some additional typical behaviors of a live video encoder. | to mimic some additional typical behaviors of a live video encoder. | |||
Two examples are given below: | Two examples are given below: | |||
o I-frames on demand: The synthetic codec can be extended to | o I-frames on demand: The synthetic codec can be extended to | |||
simulate the sending of I-frames on demand, e.g., as a reaction to | simulate the sending of I-frames on demand, e.g., as a reaction to | |||
skipping to change at page 14, line 42 ¶ | skipping to change at page 14, line 42 ¶ | |||
whereas it is straightforward for a trace-driven model to obtain | whereas it is straightforward for a trace-driven model to obtain | |||
encoded frame size data. On the other hand, once validated, the | encoded frame size data. On the other hand, once validated, the | |||
statistical model is more flexible in mimicking a wide range of | statistical model is more flexible in mimicking a wide range of | |||
encoder/content behaviors by simply varying the corresponding | encoder/content behaviors by simply varying the corresponding | |||
parameters in the model. In this regard, a trace-driven model relies | parameters in the model. In this regard, a trace-driven model relies | |||
-- by definition -- on additional data collection efforts for | -- by definition -- on additional data collection efforts for | |||
accommodating new codecs or video contents. | accommodating new codecs or video contents. | |||
In general, the trace-driven model is more realistic for mimicking | In general, the trace-driven model is more realistic for mimicking | |||
ongoing, steady-state behavior of a video traffic source whereas the | ongoing, steady-state behavior of a video traffic source whereas the | |||
statistical model is more versatile for simulating transient events | statistical model is more versatile for simulating its transient- | |||
(e.g., when target rate changes from A to B with temporary bursts | state behavior such as a sudden rate change. It is also possible to | |||
during the transition). It is also possible to combine both models | combine both methods into a hybrid model, so that the steady-state | |||
into a hybrid approach, using traces during steady-state and | behavior is driven by traces during steady-state and the transient- | |||
statistical model during transients. | state behavior is driven by the statistical model. | |||
+---------------+ | transient +---------------+ | |||
transient | Generate next | | state | Generate next | | |||
+------>| K_d transient | | +------>| K_d transient | | |||
+-------------+ / | frames | | +-------------+ / | frames | | |||
R_v | Compare | / +---------------+ | R_v | Compare | / +---------------+ | |||
------->| against |/ | ------->| against |/ | |||
| previous | | | previous | | |||
| target rate |\ | | target rate |\ | |||
+-------------+ \ +---------------+ | +-------------+ \ +---------------+ | |||
\ | Generate next | | \ | Generate next | | |||
+------>| frame from | | +------>| frame from | | |||
steady-state | trace | | steady | trace | | |||
+---------------+ | state +---------------+ | |||
Figure 3: Hybrid approach for modeling video traffic | Figure 3: A hybrid video traffic model | |||
As shown in Figure 3, the video traffic model operates in transient | As shown in Figure 3, the video traffic model operates in transient | |||
state if the requested target rate R_v is substantially higher than | state if the requested target rate R_v is substantially higher than | |||
the previous target, or else it operates in steady state. During | the previous target, or else it operates in steady state. During the | |||
transient state, a total of K_d frames are generated by the | transient state, a total of K_d frames are generated by the | |||
statistical model, resulting in one (1) big burst frame with size K_B | statistical model, resulting in one (1) big burst frame with size K_B | |||
followed by K_d-1 smaller frames. When operating at steady-state, | followed by K_d-1 smaller frames. When operating at steady-state, | |||
the video traffic model simply generates a frame according to the | the video traffic model simply generates a frame according to the | |||
trace-driven model given the target rate, while modulating the frame | trace-driven model given the target rate, while modulating the frame | |||
interval according to the distribution specified by the statistical | interval according to the distribution specified by the statistical | |||
model. One example criterion for determining whether the traffic | model. One example criterion for determining whether the traffic | |||
model should operate in transient state is whether the rate increase | model should operate in transient state is whether the rate increase | |||
exceeds 10% of previous target rate. Finally, as this model follows | exceeds 10% of previous target rate. Finally, as this model follows | |||
transient state behavior dictated by the statistical model, upon a | transient state behavior dictated by the statistical model, upon a | |||
End of changes. 18 change blocks. | ||||
38 lines changed or deleted | 42 lines changed or added | |||
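
The mode selection and transient burst described for Figure 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not code from the draft. The names BurstParams, is_transient, and transient_frame_sizes are hypothetical, and 1 KByte is taken as 1000 bytes. The even split of the remaining byte budget across the K_d-1 follow-up frames is also an assumption, chosen so that the average rate over the burst matches R_v as Section 5.2 assumes. The draft's own sizing rule for those frames is not shown in this excerpt.

```python
from dataclasses import dataclass


@dataclass
class BurstParams:
    """Example values from the parameter table (hypothetical container)."""
    k_d: int = 8          # burst duration K_d, in frames
    k_b: float = 13500.0  # burst frame size K_B, in bytes (13.5 KBytes)
    fps: float = 30.0     # target frame rate FPS, in Hz


def is_transient(prev_rate_bps, new_rate_bps, threshold=0.10):
    """Example criterion from the text: the rate increase exceeds
    10% of the previous target rate."""
    return (new_rate_bps - prev_rate_bps) > threshold * prev_rate_bps


def transient_frame_sizes(rate_bps, params):
    """Return K_d frame sizes in bytes: one burst frame of size K_B
    followed by K_d-1 smaller frames.

    Assumption: the smaller frames evenly share whatever remains of the
    byte budget K_d * B0, so the burst averages out to the target rate.
    """
    if params.k_d == 1:
        return [params.k_b]
    b0 = rate_bps / 8.0 / params.fps          # reference frame size B0 = R_v/8/FPS
    remaining = max(params.k_d * b0 - params.k_b, 0.0)
    small = max(1.0, remaining / (params.k_d - 1))  # keep at least 1 byte per frame
    return [params.k_b] + [small] * (params.k_d - 1)


if __name__ == "__main__":
    prev_rate, new_rate = 800_000, 1_000_000  # bits per second
    if is_transient(prev_rate, new_rate):
        print(transient_frame_sizes(new_rate, BurstParams()))
    else:
        print("steady state: take the next frame from the trace")
```

With the example values in the parameter table (R_v = 1 Mbps, FPS = 30 Hz, K_d = 8, K_B = 13.5 KBytes), this sketch yields one 13.5 KByte frame followed by seven frames of roughly 2.83 KBytes each, for a total of K_d * B0.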