draft-ietf-payload-vp9-14.txt   draft-ietf-payload-vp9-15.txt 
AVTCore Working Group J. Uberti AVTCore Working Group J. Uberti
Internet-Draft S. Holmer Internet-Draft S. Holmer
Intended status: Standards Track M. Flodman Intended status: Standards Track M. Flodman
Expires: 5 December 2021 D. Hong Expires: 6 December 2021 D. Hong
Google Google
J. Lennox J. Lennox
8x8 / Jitsi 8x8 / Jitsi
3 June 2021 4 June 2021
RTP Payload Format for VP9 Video RTP Payload Format for VP9 Video
draft-ietf-payload-vp9-14 draft-ietf-payload-vp9-15
Abstract Abstract
This specification describes an RTP payload format for the VP9 video This specification describes an RTP payload format for the VP9 video
codec. The payload format has wide applicability, as it supports codec. The payload format has wide applicability, as it supports
applications from low bit-rate peer-to-peer usage, to high bit-rate applications from low bit-rate peer-to-peer usage, to high bit-rate
video conferences. It includes provisions for temporal and spatial video conferences. It includes provisions for temporal and spatial
scalability. scalability.
Status of This Memo Status of This Memo
skipping to change at page 1, line 38 skipping to change at page 1, line 38
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 5 December 2021. This Internet-Draft will expire on 6 December 2021.
Copyright Notice Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
skipping to change at page 4, line 48 skipping to change at page 4, line 48
and helps it understand the temporal layer structure. Since this is and helps it understand the temporal layer structure. Since this is
signaled in each packet it makes it possible to have very flexible signaled in each packet it makes it possible to have very flexible
temporal layer hierarchies, and scalability structures which are temporal layer hierarchies, and scalability structures which are
changing dynamically. changing dynamically.
In non-flexible mode, frames are encoded using a fixed, recurring In non-flexible mode, frames are encoded using a fixed, recurring
pattern of dependencies; the set of pictures that recur in this pattern of dependencies; the set of pictures that recur in this
pattern is known as a Picture Group (PG). In this mode, the inter- pattern is known as a Picture Group (PG). In this mode, the inter-
picture dependencies (the reference indices) of the Picture Group picture dependencies (the reference indices) of the Picture Group
MUST be pre-specified as part of the scalability structure (SS) data. MUST be pre-specified as part of the scalability structure (SS) data.
A Picture Group is a recurring pattern of spatial and temporal Each packet has an index to refer to one of the described pictures in
dependencies which In this mode, each packet has an index to refer to the PG, from which the pictures referenced by the picture transmitted
one of the described pictures in the PG, from which the pictures in the current packet for inter-picture prediction can be identified.
referenced by the picture transmitted in the current packet for
inter-picture prediction can be identified.
(Note: A "Picture Group", as used in this document, is not the same (Note: A "Picture Group", as used in this document, is not the same
thing as the term "Group of Pictures" as it is traditionally used in thing as the term "Group of Pictures" as it is traditionally used in
video coding, i.e. to mean an independently-decodable run of video coding, i.e. to mean an independently-decodable run of
pictures beginning with a keyframe.) pictures beginning with a keyframe.)
The SS data can also be used to specify the resolution of each The SS data can also be used to specify the resolution of each
spatial layer present in the VP9 stream for both flexible and non- spatial layer present in the VP9 stream for both flexible and non-
flexible modes. flexible modes.
skipping to change at page 7, line 49 skipping to change at page 7, line 49
V: | SS | V: | SS |
| .. | | .. |
+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+
Figure 3 Figure 3
I: Picture ID (PID) present. When set to one, the OPTIONAL PID MUST I: Picture ID (PID) present. When set to one, the OPTIONAL PID MUST
be present after the mandatory first octet and specified as below. be present after the mandatory first octet and specified as below.
Otherwise, PID MUST NOT be present. If the V bit was set in the Otherwise, PID MUST NOT be present. If the V bit was set in the
stream's most recent start of a keyframe (i.e. the SS field was stream's most recent start of a keyframe (i.e. the SS field was
present, and non-flexible scalability mode is in use), then this present) and the F bit is set to 0 (i.e. non-flexible scalability
bit MUST be set on every packet. mode is in use), then this bit MUST be set on every packet.
P: Inter-picture predicted frame. When set to zero, the frame does P: Inter-picture predicted frame. When set to zero, the frame does
not utilize inter-picture prediction. In this case, up-switching not utilize inter-picture prediction. In this case, up-switching
to a current spatial layer's frame is possible from directly lower to a current spatial layer's frame is possible from directly lower
spatial layer frame. P SHOULD also be set to zero when encoding a spatial layer frame. P SHOULD also be set to zero when encoding a
layer synchronization frame in response to an LRR layer synchronization frame in response to an LRR
[I-D.ietf-avtext-lrr] message (see Section 5.3). When P is set to [I-D.ietf-avtext-lrr] message (see Section 5.3). When P is set to
zero, the TID field (described below) MUST also be set to 0 (if zero, the TID field (described below) MUST also be set to 0 (if
present). Note that the P bit does not forbid intra-picture, present). Note that the P bit does not forbid intra-picture,
inter-layer prediction from earlier frames of the same picture, if inter-layer prediction from earlier frames of the same picture, if
skipping to change at page 13, line 14 skipping to change at page 13, line 14
Note that for a given picture, all frames follow the same inter- Note that for a given picture, all frames follow the same inter-
picture dependency structure. However, the frame rate of each picture dependency structure. However, the frame rate of each
spatial layer can be different from each other and this can be spatial layer can be different from each other and this can be
described with the use of the D bit described above. The described with the use of the D bit described above. The
specified dependency structure in the SS data MUST be for the specified dependency structure in the SS data MUST be for the
highest frame rate layer. highest frame rate layer.
In a scalable stream sent with a fixed pattern, the SS data SHOULD be In a scalable stream sent with a fixed pattern, the SS data SHOULD be
included in the first packet of every key frame. This is a packet included in the first packet of every key frame. This is a packet
with P bit equal to zero, SID or Lis not the bit equal to zero, and B with P bit equal to zero, SID or L bit equal to zero, and B bit equal
bit equal to 1. The SS data MUST only be changed on the picture that to 1. The SS data MUST only be changed on the picture that
corresponds to the first picture specified in the previous SS data's corresponds to the first picture specified in the previous SS data's
PG (if the previous SS data's N_G was greater than 0). PG (if the previous SS data's N_G was greater than 0).
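As an illustration of the condition in the revised (-15) text, the check for "first packet of a key frame" can be sketched as below. This is a minimal sketch, not part of the draft: the `Descriptor` type and the `is_keyframe_start` name are hypothetical, and the fields are assumed to have already been parsed from the payload descriptor (SID is only available when the L bit is set).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    """Hypothetical container for already-parsed payload descriptor fields."""
    p: bool                     # P bit: inter-picture predicted frame
    l: bool                     # L bit: layer indices present
    b: bool                     # B bit: start of a frame
    sid: Optional[int] = None   # spatial layer ID, only meaningful when l is True


def is_keyframe_start(d: Descriptor) -> bool:
    """True for a packet with P equal to zero, SID or L bit equal to zero, and B equal to 1."""
    base_layer = (not d.l) or d.sid == 0
    return (not d.p) and base_layer and d.b
```

A sender following the SHOULD above would attach the SS data to packets for which this check holds; a receiver could use the same check to decide where to expect it.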
4.3. Frame Fragmentation 4.3. Frame Fragmentation
VP9 frames are fragmented into packets, in RTP sequence number order, VP9 frames are fragmented into packets, in RTP sequence number order,
beginning with a packet with the B bit set, and ending with a packet beginning with a packet with the B bit set, and ending with a packet
with the E bit set. There is no mechanism for finer-grained access with the E bit set. There is no mechanism for finer-grained access
to parts of a VP9 frame. to parts of a VP9 frame.
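The fragmentation rule above can be sketched from the receiver's side as follows. This is a minimal sketch under stated assumptions, not the draft's algorithm: the `Fragment` type and `reassemble` function are hypothetical names, the input is assumed to be already ordered by RTP sequence number with the payload descriptor stripped, and a frame with any missing packet is simply dropped.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class Fragment:
    """Hypothetical view of one received packet."""
    seq: int      # RTP sequence number (16-bit, wraps at 65536)
    b: bool       # B bit: first packet of a frame
    e: bool       # E bit: last packet of a frame
    data: bytes   # VP9 payload bytes after the payload descriptor


def reassemble(fragments: Iterable[Fragment]) -> Iterator[bytes]:
    """Yield complete frames from fragments given in sequence-number order."""
    parts: Optional[List[bytes]] = None   # None means "not inside a frame"
    expected = 0
    for frag in fragments:
        if frag.b:
            parts = []                    # a B bit always starts a fresh frame
        elif parts is None or frag.seq != expected:
            parts = None                  # missing start or gap: drop the frame
            continue
        parts.append(frag.data)
        expected = (frag.seq + 1) & 0xFFFF
        if frag.e:
            yield b"".join(parts)
            parts = None
```

Because there is no finer-grained access to parts of a frame, the sketch only emits a frame once every packet from the B bit through the E bit has been seen without a sequence-number gap.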
 End of changes. 7 change blocks. 
13 lines changed or deleted 11 lines changed or added
