Date: Monday, 29 June 2020 to Friday, 3 July 2020
The 131st WG 11 (MPEG) meeting was held online, 29 June – 3 July 2020
WG11 (MPEG) Announces VVC – the Versatile Video Coding Standard
WG11 (MPEG) is pleased to announce the completion of the new Versatile Video Coding (VVC) standard at its 131st meeting. The document has been progressed to its final approval ballot as ISO/IEC 23090-3 and will also be known as H.266 in the ITU-T.
VVC is the latest in a series of very successful standards for video coding that have been jointly developed with ITU-T, and it is the direct successor to the well-known and widely used HEVC (Rec. ITU-T H.265 | ISO/IEC 23008-2) and AVC (Rec. ITU-T H.264 | ISO/IEC 14496-10) standards. VVC provides a major benefit in compression over HEVC. Plans are underway to conduct a verification test with formal subjective testing to confirm that VVC achieves an estimated 50% bit rate reduction versus HEVC for equal subjective video quality. Test results have already demonstrated that VVC typically provides about a 40% bit rate reduction for 4K/UHD video sequences in tests using objective metrics. Application areas especially targeted for the use of VVC include ultra-high definition 4K and 8K video, video with a high dynamic range and wide colour gamut, and video for immersive media applications such as 360° omnidirectional video. Conventional standard-definition and high-definition video content is also supported with similar gains in compression. In addition to improving coding efficiency, VVC also provides highly flexible syntax supporting such use cases as subpicture bitstream extraction, bitstream merging, temporal sublayering and layered coding scalability.
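As a back-of-the-envelope illustration of the bit rate reductions quoted above, the following sketch computes the VVC bit rate that would correspond to a given HEVC bit rate. The 20 Mbit/s HEVC starting point and the function name are hypothetical and chosen only for illustration; only the 40% (objective metrics) and 50% (estimated subjective) reductions come from the text.

```python
def reduced_bitrate_kbps(reference_kbps, reduction_percent):
    """Bit rate remaining after a given percentage reduction."""
    return reference_kbps * (100 - reduction_percent) / 100


# Hypothetical HEVC bit rate for a 4K/UHD stream (assumption, not a test result).
HEVC_KBPS = 20_000

# About 40% reduction reported with objective metrics.
vvc_objective_kbps = reduced_bitrate_kbps(HEVC_KBPS, 40)

# Estimated 50% reduction at equal subjective quality (still to be verified).
vvc_subjective_kbps = reduced_bitrate_kbps(HEVC_KBPS, 50)

print(f"HEVC reference:        {HEVC_KBPS} kbit/s")
print(f"VVC at 40% reduction:  {vvc_objective_kbps:.0f} kbit/s")
print(f"VVC at 50% reduction:  {vvc_subjective_kbps:.0f} kbit/s")
```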
The VVC standard includes the specification of six profiles to serve the needs of industry in a wide variety of applications. These include the “Main 10” profile that supports 8- and 10-bit 4:2:0 video, the “Main 10 4:4:4” profile with 4:4:4 and 4:2:2 format support, corresponding “Multilayer Main 10” and “Multilayer Main 10 4:4:4” profiles with support for layered coding, and the “Main 10 Still Picture” and “Main 10 4:4:4 Still Picture” profiles for still image coding employing the same coding tools as in the corresponding video profiles.
MPEG also announces completion of ISO/IEC 23002-7 “Versatile supplemental enhancement information for coded video bitstreams” (VSEI), developed jointly with ITU-T as Rec. ITU-T H.274. The new VSEI standard specifies the syntax and semantics of video usability information (VUI) parameters and supplemental enhancement information (SEI) messages for use with coded video bitstreams. VSEI is especially intended for use with VVC, although it is drafted to be generic and flexible so that it may also be used with other types of coded video bitstreams. Once specified in VSEI, different video coding standards and systems-environment specifications can re-use the same SEI messages without the need for defining special-purpose data customized to the specific usage context.
Point Cloud Compression – WG11 (MPEG) promotes a Video-based Point Cloud Compression Technology to the FDIS stage
At its 131st meeting, WG11 (MPEG) promoted its Video-based Point Cloud Compression (V-PCC) standard to Final Draft International Standard (FDIS) stage. V-PCC addresses lossless and lossy coding of 3D point clouds with associated attributes such as colors and reflectance. Point clouds are typically represented by extremely large amounts of data, which is a significant barrier for mass market applications. However, the relative ease of capturing and rendering spatial information as point clouds, compared to other volumetric video representations, makes point clouds increasingly popular for presenting immersive volumetric data. With the current V-PCC encoder implementation providing compression in the range of 100:1 to 300:1, a dynamic point cloud of one million points could be encoded at 8 Mbit/s with good perceptual quality. Real-time decoding and rendering of V-PCC bitstreams has also been demonstrated on current mobile hardware.
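The relationship between the quoted compression range and the 8 Mbit/s figure can be checked with a short calculation. The sketch below is only a plausibility check under stated assumptions: the text gives the point count, the target bit rate and the 100:1 to 300:1 range, while the raw representation (10-bit x, y, z geometry plus 8-bit R, G, B color per point) and the 30 frames per second are assumptions made here for illustration.

```python
def raw_bitrate_bps(points_per_frame, bits_per_point, frames_per_second):
    """Uncompressed bit rate of a dynamic point cloud in bit/s."""
    return points_per_frame * bits_per_point * frames_per_second


POINTS_PER_FRAME = 1_000_000      # from the text
TARGET_BPS = 8_000_000            # 8 Mbit/s, from the text

# Assumptions (not from the text): 10 bits per coordinate, 8 bits per color
# channel, 30 frames per second.
BITS_PER_POINT = 3 * 10 + 3 * 8
FRAMES_PER_SECOND = 30

raw_bps = raw_bitrate_bps(POINTS_PER_FRAME, BITS_PER_POINT, FRAMES_PER_SECOND)
compression_ratio = raw_bps / TARGET_BPS

print(f"raw bit rate:      {raw_bps / 1e9:.2f} Gbit/s")
print(f"compression ratio: {compression_ratio:.1f}:1")
```

Under these assumptions the required ratio falls inside the 100:1 to 300:1 range quoted for the current encoder implementation; a different raw format or frame rate would shift the result.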
The V-PCC standard leverages video compression technologies and the video eco-system in general (hardware acceleration, transmission services and infrastructure), while enabling new kinds of applications. The V-PCC standard contains several profiles that leverage existing AVC and HEVC implementations, which may make them suitable to run on existing and emerging platforms. The standard is also extensible to upcoming video specifications such as Versatile Video Coding (VVC) and Essential Video Coding (EVC).
The V-PCC standard is based on Visual Volumetric Video-based Coding (V3C), which is expected to be re-used by other MPEG-I volumetric codecs under development. MPEG is also developing a standard for carriage of V-PCC and V3C data (ISO/IEC 23090-10), which was promoted to DIS status at the 130th MPEG meeting.
By providing high-level immersiveness at currently available bandwidths, the V-PCC standard is expected to enable several types of applications and services such as six Degrees of Freedom (6 DoF) immersive media, virtual reality (VR) / augmented reality (AR), immersive real-time communication and cultural heritage.
MPEG-H 3D Audio – WG11 (MPEG) promotes Baseline Profile for 3D Audio to final stage
At its 131st meeting, WG11 (MPEG) announces the completion of the new ISO/IEC 23008-3:2019, Amendment 2, “3D Audio Baseline profile, Corrections and Improvements,” which has been promoted to Final Draft Amendment (FDAM) status. This amendment introduces a new profile called Baseline profile addressing industry demands. Tailored for broadcast, streaming, and high-quality immersive music delivery use cases, the 3D Audio Baseline profile supports channel and object signals and is a subset of the existing Low Complexity profile. The 3D Audio Baseline profile can be signaled in a backwards compatible fashion, enabling interoperability with existing devices implementing the 3D Audio Low Complexity profile. In addition to its advanced loudness and Dynamic Range Control (DRC), interactivity and accessibility features, the Baseline profile enables the usage of up to 24 audio objects in Level 3 for high quality immersive music delivery.
At the same time, MPEG initiates New Editions at Committee Draft (CD) status for MPEG-H 3D Audio Reference Software and Conformance which incorporate the 3D Audio Baseline profile functionality.
In addition to finalizing the Amendment, WG11 made available the “MPEG-H 3D Audio Baseline Profile Verification Test Report”. This report presents the results of five subjective listening tests assessing the performance of the 3D Audio Baseline profile. Covering a wide range of bit rates and immersive audio use cases, the tests were conducted at nine different test sites with a total of 341 listeners.
Analysis of the test data resulted in the following conclusions:
- Test 1 measured performance for the “Ultra-HD Broadcast” use case, in which highly immersive audio material was coded at 768 kb/s and presented using 22.2 or 7.1+4H channel loudspeaker layouts. The test showed that at the bit rate of 768 kb/s, the 3D Audio Baseline Profile easily achieves “ITU-R High-Quality Emission” quality, as needed in broadcast applications.
- Test 2 measured performance for the “HD Broadcast” or “A/V Streaming” use case, in which immersive audio material was coded at three bit rates: 512 kb/s, 384 kb/s and 256 kb/s and presented using 7.1+4H or 5.1+2H channel loudspeaker layouts. The test showed that for all bit rates, the 3D Audio Baseline Profile achieved a quality of “Excellent” on the MUSHRA subjective quality scale.
- Test 3 measured performance for the “High Efficiency Broadcast” use case, in which audio material was coded at three bit rates, with specific bit rates depending on the number of channels in the material. Bit rates ranged from 256 kb/s (5.1+2H) to 48 kb/s (stereo). The test showed that for all bit rates, the 3D Audio Baseline Profile achieved a quality of “Excellent” on the MUSHRA subjective quality scale.
- Test 4 measured performance for the “Mobile” use case, in which immersive audio material was coded at 384 kb/s, and presented via headphones. The 3D Audio FD binaural renderer was used to render a virtual, immersive audio sound stage for the headphone presentation. The test showed that at 384 kb/s, the 3D Audio Baseline Profile with binaural rendering achieved a quality of “Excellent” on the MUSHRA subjective quality scale.
- Test 5 measured performance for the “High Quality Immersive Music Delivery” use case in which object based immersive music is delivered to the receiver with up to 24 objects at high per object bit rates. This test used 11.1 (as 7.1+4H) as presentation format, with material coded at a rate of 1536 kb/s. The test showed that at that bit rate, the 3D Audio Baseline Profile easily achieves “ITU-R High-Quality Emission” quality, as needed in high quality music delivery applications.
Call for Proposals on Technologies for MPEG-21 Contracts to Smart Contracts Conversion
In the last few years, WG11 (MPEG) has developed a number of standardized ontologies catering to the needs of the music and media industry with respect to codification of Intellectual Property Rights (IPR) information toward the fair trade of music and media. MPEG IPR ontologies and contract expression languages have been developed under the MPEG-21 Multimedia Framework (ISO/IEC 21000) family of standards. MPEG IPR ontologies and contracts can be used by music and media value chain stakeholders to share and exchange all metadata and contractual information in an interoperable way. However, a challenge has been identified: how can MPEG IPR ontologies and contracts be converted to smart contracts that can be executed on existing blockchain environments, thus enriching those environments with the inference and reasoning capabilities inherently associated with ontologies? Addressing this challenge in a standard way for several smart contract languages would also ensure that MPEG IPR ontologies and contracts prevail as the interlingua for transferring verified contractual data from one blockchain to another.
At its 131st meeting, MPEG issued a Call for Proposals (CfP) on technologies for MPEG-21 IPR contracts to smart contracts conversion. All parties that believe they have relevant technologies are invited to submit proposals for consideration by MPEG. These parties do not necessarily have to be MPEG members. The review of the submissions is planned in the context of the 132nd MPEG meeting. Please contact Jörn Ostermann (email@example.com) for details on attending this meeting if you are not an MPEG delegate.
WG11 (MPEG) issues a Call for Proposals on extension and improvements to 23092 standard series
The current MPEG-G standard series (ISO/IEC 23092) is the first generation of MPEG standards that address the representation, compression, and transport of genome sequencing data, supporting with a single unified approach data from the output of sequencing machines up to secondary and tertiary analysis. New technology for compressing and indexing a wide variety of annotation data is currently in an advanced phase of standardization.
In line with the traditional MPEG practice of investigating and applying whenever possible improvements to the performance and functionality of its standards, at its 131st meeting, MPEG has issued a Call for Proposals (CfP) addressing two specific objectives: (i) to increase the speed performance of massively parallel codec implementations and (ii) to enable advanced queries and search capabilities on the compressed data.
Answers to the CfP are expected to be evaluated prior to the 132nd MPEG meeting. The best-performing technologies are expected to be introduced in a new high-performance profile of the current ISO/IEC 23092 standard series.
Widening support for storage and delivery of MPEG-5 EVC
At its 131st meeting, WG11 (MPEG) widened the support for storage and delivery of MPEG-5 Essential Video Coding (EVC; ISO/IEC 23094-1).
- One of the oldest but most popular MPEG standards for content delivery, MPEG-2 Systems (ISO/IEC 13818-1), is adding support for EVC. WG11 (MPEG) promoted the 3rd amendment to the 2019 edition of the MPEG-2 Systems standard to the Committee Draft of Amendment stage, the first milestone of the ISO standard development process. It is entitled Carriage of EVC in MPEG-2 TS and update of the MPEG-H 3D Audio descriptor, and it defines all of the necessary descriptors and the T-STD model extension to carry MPEG-5 EVC elementary streams.
- Recognizing that the use of video coding standards for still picture applications is rapidly growing in the market, WG11 (MPEG) promoted the 3rd amendment to the Image File Format to the Committee Draft of Amendment stage, the first milestone of the ISO standard development process. It is entitled Support for VVC, EVC, slideshows and other improvements, and it includes support for the most advanced video coding standard, Versatile Video Coding (VVC), as well as EVC, to provide a complete set of choices to markets whose requirements vary widely.
It is currently expected that both standards will reach their final milestone by mid-2021.
Multi-Image Application Format adds support of HDR
Less than two years after reaching the final milestone of its standardization, the Multi-Image Application Format (MIAF; ISO/IEC 23000-22) has become the default format for the storage of still pictures in smartphones. However, it lacks support for one of the killer features for image quality enhancement, i.e., High Dynamic Range (HDR). To quickly answer this market need, WG11 (MPEG) has promoted the 2nd Amendment to the Multi-Image Application Format, MIAF HEVC Advanced HDR profile and other clarifications, to its first milestone in the ISO standard development process. This amendment adds support for the PQ (Perceptual Quantizer) and HLG (Hybrid Log Gamma) color transfer characteristics and for P3 mastering display color volume properties with a D65 white point for HEVC-encoded still pictures, in order to support widely used HDR technologies. It is currently expected that the standard will reach its final milestone by mid-2021.
Carriage of Geometry-based Point Cloud Data progresses to Committee Draft
At its 131st meeting, WG11 (MPEG) promoted the carriage of Geometry-based point cloud data (ISO/IEC 23090-18) to the Committee Draft stage, the first milestone of the ISO standard development process. This is the second standard introducing support for volumetric media in the industry-famous ISO base media file format (ISOBMFF) family of standards, after the standard on the carriage of video-based point cloud data (ISO/IEC 23090-10). ISO/IEC 23090-18 supports the carriage of point cloud data within multiple file format tracks in order to allow individual access to each of the attributes comprising a single point cloud. It also allows the carriage of point cloud data in a single file format track for simple applications. Because point cloud data can cover a large geographical area and the size of the data can be massive in some applications, the standard supports 3D region-based partial access to the data stored in the file, so that an application can efficiently access only the portion of data it needs to process. It is currently expected that the standard will reach its final milestone by mid-2021.
MPEG Immersive Video (MIV) progresses to Committee Draft
At the 131st MPEG meeting, it was decided to output the committee draft of ISO/IEC 23090-12 MPEG Immersive Video. The name was changed from “Immersive Video” to “MPEG Immersive Video” (MIV), to clearly differentiate from other uses of the term “Immersive Video” in general parlance. MIV supports compression of immersive video content, in which a real or virtual 3D scene is captured by multiple real or virtual cameras. The use of this standard enables storage and distribution of immersive video content over existing and future networks, for playback with 6 degrees of freedom of view position and orientation.
Neural Network Compression for Multimedia Applications – WG11 (MPEG) progresses to Committee Draft
Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, such as visual and acoustic classification, extraction of multimedia descriptors or image and video coding. The trained neural networks for these applications contain a large number of parameters (i.e., weights), resulting in a considerable size. Thus, transferring them to a number of clients that use them in applications (e.g., mobile phones, smart cameras) requires a compressed representation of neural networks.
WG11 (MPEG) has completed the CD of the specification at its 131st meeting. Because the compression of neural networks is likely to have both hardware-dependent and hardware-independent components, the standard is designed as a toolbox of compression technologies. The specification contains different parameter sparsification, parameter reduction (e.g., matrix decomposition), parameter quantization, and entropy coding methods, which can be assembled into encoding pipelines combining one or more (in the case of sparsification/reduction) methods from each group. The results show that trained neural networks for many common multimedia problems such as image or audio classification or image compression can be compressed to 10% of their original size with no or very small performance loss, and even significantly more at small performance loss. The specification is independent of a particular neural network exchange format, and interoperability with common formats is described in the annexes.
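The idea of an encoding pipeline assembled from one method per group can be sketched in a few lines. The following is a toy illustration only and does not reproduce the tools of the specification: magnitude pruning stands in for sparsification, uniform 8-bit quantization stands in for parameter quantization, and DEFLATE (Python's zlib) is swapped in as a generic stand-in for the entropy coder. All function names, the threshold, the step size and the synthetic Gaussian weights are assumptions made here.

```python
import random
import struct
import zlib


def sparsify(weights, threshold):
    """Magnitude pruning: set weights with |w| below the threshold to zero."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize(weights, step):
    """Uniform quantization to integer levels, clamped to the signed 8-bit range."""
    return [max(-128, min(127, round(w / step))) for w in weights]


def entropy_code(levels):
    """Stand-in entropy coder: pack levels as int8 and compress with DEFLATE."""
    return zlib.compress(struct.pack(f"{len(levels)}b", *levels), 9)


def decode(payload, step):
    """Invert entropy coding and quantization (pruning is not invertible)."""
    raw = zlib.decompress(payload)
    levels = struct.unpack(f"{len(raw)}b", raw)
    return [level * step for level in levels]


# Synthetic "trained" weights (assumption: zero-mean Gaussian, 32-bit floats).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]
original_size = len(struct.pack(f"{len(weights)}f", *weights))

THRESHOLD = 0.05   # hypothetical pruning threshold
STEP = 0.01        # hypothetical quantization step size

pruned = sparsify(weights, THRESHOLD)
levels = quantize(pruned, STEP)
payload = entropy_code(levels)
restored = decode(payload, STEP)

print(f"original:   {original_size} bytes")
print(f"compressed: {len(payload)} bytes")
print(f"fraction of original size: {len(payload) / original_size:.3f}")
```

The size fraction printed by this toy says nothing about the results reported for the standard; whether a given pruning threshold and step size preserve task performance has to be measured on the actual network.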
WG11 (MPEG) issues Committee Draft of Conformance and Reference Software for Essential Video Coding (EVC)
At its 131st meeting, WG11 (MPEG) promoted the specification of the Conformance and Reference Software for Essential Video Coding (ISO/IEC 23094-4) to Committee Draft (CD) level. The Essential Video Coding (EVC) standard (ISO/IEC 23094-1) provides an improved compression capability over existing video coding standards with timely publication of licensing terms. The issued specification of the Conformance and Reference Software for Essential Video Coding includes conformance bitstreams as well as reference software for the generation of those conformance bitstreams. This important standard will greatly help industry achieve effective interoperability between products using EVC and provide valuable information to ease the development of such products. The final specification is expected to be available in early 2021.