MPEG issues Call for Learning-Based Video Codecs for Study of Quality Assessment
At the 144th MPEG meeting, MPEG Visual Quality Assessment (AG 5) issued a call for learning-based video codecs for study of quality assessment. AG 5 has been conducting subjective quality evaluations for coded video content and studying their correlation with objective quality metrics. Most of these studies focused on the High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC) standards. MPEG maintains the Compressed Video for study of Quality Metrics (CVQM) dataset for the purpose of this study.
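As an illustration of what studying the correlation between subjective scores and an objective metric can look like, the following minimal sketch computes Pearson (linear) and Spearman (rank) correlation coefficients. The function names and the sample values are made up for this example and are not taken from the CVQM dataset or from any AG 5 procedure; the rank helper also assumes there are no tied scores.

```python
import math

def pearson(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties (illustration only)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = float(rank)
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Toy values, invented for this sketch: one subjective score and one
# objective metric score per coded sequence.
subjective_scores = [1.8, 2.9, 3.7, 4.4]
objective_scores = [28.0, 31.5, 34.0, 38.5]

print("Pearson: ", pearson(subjective_scores, objective_scores))
print("Spearman:", spearman(subjective_scores, objective_scores))
```

A metric that orders the sequences exactly as viewers do yields a Spearman coefficient of 1 even when the relationship is not perfectly linear, which is why both coefficients are often reported together.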
Given the recent advancements in the development of learning-based video compression algorithms, MPEG is studying compression using learning-based codecs. MPEG anticipates that a reconstructed video that has been compressed using learning-based codecs will exhibit different types of distortion than those induced by traditional block-based motion-compensated video coding designs. To facilitate a deeper understanding of these distortions and their impact on visual quality, MPEG issued a public call for learning-based video codecs for study of quality assessment. MPEG welcomes inputs in response to the call. After evaluating the responses, MPEG will invite the proponents of those responses that meet the call's requirements to submit compressed bitstreams for further study of their subjective quality and potential inclusion in the CVQM dataset.
Given the continued rapid advancements in the development of learning-based video compression algorithms, MPEG will keep this call open and anticipates future updates to the call.
Interested parties are requested to contact the MPEG AG 5 Convenor Mathias Wien (email@example.com) and submit responses for review at the 145th MPEG meeting in January 2024. Further details are given in the call, issued as AG 5 document N 104 and available from the mpeg.org website.
MPEG evaluates Call for Proposals on Feature Compression for Video Coding for Machines
At the 144th MPEG meeting, MPEG Technical Requirements (WG 2) evaluated the responses to the Call for Proposals (CfP) on Feature Compression for Video Coding for Machines (FCVCM). Feature Compression for Video Coding for Machines investigates technology directed towards compression of intermediate ‘features’ encountered within neural networks, enabling use cases such as distributed execution of neural networks. This stands in contrast to Video Coding for Machines, which compresses conventional video data but with optimizations targeting machine consumption of the decoded video, rather than human consumption.
According to the 12 responses received to this CfP, the overall pipeline of FCVCM can be divided into two stages: (1) feature reduction and (2) feature coding. Technologies related to feature reduction include – but are not limited to – neural network-based feature fusion, temporal and spatial resampling, and adaptive feature truncation. Technologies related to feature coding include learning-based codecs, existing block-based video codecs, and hybrid codecs.
All responses were evaluated on three tasks across four datasets. The results show an overall gain, measured in average Bjøntegaard-Delta (BD) rate, of up to 94% against the feature anchors and 69% against the visual anchors. All requirements defined by WG 2 were addressed by different proposals, and a test model has been defined.
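For readers unfamiliar with the BD rate, the sketch below shows the idea behind it: the average difference in log bitrate between two rate-quality curves over their shared quality range, expressed as a percentage. It is a simplified variant that uses piecewise-linear interpolation, whereas BD-rate tools commonly fit a cubic polynomial or a piecewise-cubic curve. The function names and the sample points are hypothetical, and this is not the evaluation procedure defined in the CfP. The quality axis could be a distortion measure or a task-accuracy score; each curve is assumed to have distinct quality values.

```python
import math

def _interp(x, xs, ys):
    """Piecewise-linear interpolation of y at x; xs must be strictly increasing."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the sampled range")

def _mean_log_rate(points, lo, hi):
    """Average of log(rate) over quality in [lo, hi], using the trapezoid rule."""
    pts = sorted(points, key=lambda p: p[1])      # sort by quality
    qualities = [q for _, q in pts]
    log_rates = [math.log(r) for r, _ in pts]
    knots = [lo] + [q for q in qualities if lo < q < hi] + [hi]
    vals = [_interp(k, qualities, log_rates) for k in knots]
    area = sum((b - a) * (va + vb) / 2
               for a, b, va, vb in zip(knots, knots[1:], vals, vals[1:]))
    return area / (hi - lo)

def bd_rate(anchor, test):
    """Simplified BD rate in percent; each input is a list of (rate, quality).

    Under the usual convention, a negative value means the test codec needs
    less bitrate than the anchor for the same quality.
    """
    lo = max(min(q for _, q in anchor), min(q for _, q in test))
    hi = min(max(q for _, q in anchor), max(q for _, q in test))
    if hi <= lo:
        raise ValueError("the two curves share no quality range")
    diff = _mean_log_rate(test, lo, hi) - _mean_log_rate(anchor, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0

# Hypothetical rate-quality points: the test curve reaches each quality
# level at half the anchor's bitrate.
anchor_points = [(1000, 30.0), (2000, 33.0), (4000, 36.0), (8000, 39.0)]
test_points = [(500, 30.0), (1000, 33.0), (2000, 36.0), (4000, 39.0)]
print(bd_rate(anchor_points, test_points))  # approximately -50 (half the bitrate)
```

Working in the log-rate domain is what makes the result a relative (percentage) bitrate difference rather than an absolute one.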
Given the success of this call, MPEG will continue working on video feature compression methods for machine vision purposes. The work will continue in MPEG Video Coding (WG 4) where a new standardization project will be started and is planned to be completed and reach the status of Final Draft International Standard (FDIS) by July 2025.
WG 2 thanks the proponents who submitted responses to the CfP and the test administrator. MPEG will continue to collect and solicit feedback to improve the test model in the upcoming meetings.
MPEG progresses ISOBMFF-related Standards for the Carriage of Network Abstraction Layer Video Data
At the 144th MPEG meeting, MPEG Systems (WG 3) progressed the development of various ISO Base Media File Format (ISOBMFF) related standards.
As a part of the family of ISOBMFF-related standards, ISO/IEC 14496-15 defines the carriage of Network Abstraction Layer (NAL) unit structured video data such as Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC), Essential Video Coding (EVC), and Low Complexity Enhancement Video Coding (LCEVC). ISO/IEC 14496-15 has been further improved by adding support for enhanced features such as Picture-in-Picture (PiP) use cases particularly enabled by VVC, which resulted in the approval of the Final Draft Amendment (FDAM). Additionally, separately developed amendments have been consolidated in the 7th edition of ISO/IEC 14496-15, which has been promoted to Final Draft International Standard (FDIS), the final milestone of the standard development.
At the same time, the 2nd edition of ISO/IEC 14496-32 (file format reference software and conformance) has been promoted to Committee Draft (CD), the first stage of standard development, and is planned to be completed and reach the status of Final Draft International Standard (FDIS) by the beginning of 2025. This standard will be essential for industry professionals who require a reliable and standardized method of verifying the conformance of their implementations.
MPEG enhances the Support of Energy-Efficient Media Consumption
At the 144th MPEG meeting, MPEG Systems (WG 3) promoted the ISO/IEC 23001-11 Amendment 1 (energy-efficient media consumption (green metadata) for Essential Video Coding (EVC)) to Final Draft Amendment (FDAM), the final milestone of the standard development. This latest amendment defines metadata that enables a reduction in decoder power consumption for ISO/IEC 23094-1 (Essential Video Coding (EVC)).
At the same time, ISO/IEC 23001-11 Amendment 2 (energy-efficient media consumption for new display power reduction metadata) has been promoted to Committee Draft Amendment (CDAM), the first stage of standard development. This amendment introduces a novel way to carry metadata about display power reduction encoded as a video elementary stream interleaved with the video it describes. The amendment is expected to be completed and reach the status of Final Draft Amendment (FDAM) by the beginning of 2025. These developments represent a significant step towards more energy-efficient media consumption and a more sustainable future.
MPEG ratifies the Support of Temporal Scalability for Geometry-based Point Cloud Compression
At the 144th MPEG meeting, MPEG Systems (WG 3) promoted ISO/IEC 23090-18 Amendment 1 (support of temporal scalability) to Final Draft Amendment (FDAM), the final stage of standard development. The amendment enables the compression of a single elementary stream of point cloud data using ISO/IEC 23090-9 and storing it in more than one track of ISO Base Media File Format (ISOBMFF)-based files, thereby enabling support for applications that require multiple frame rates within a single file. The amendment introduces a track grouping mechanism to indicate that multiple tracks each carry a specific temporal layer of a single elementary stream. The amendment also specifies how to reconstruct a single elementary stream from the data stored in more than one track, taking into consideration the frame rate suitable for specific applications.
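The idea of storing temporal layers in separate tracks and reconstructing a stream at a chosen frame rate can be sketched as follows. This is a conceptual illustration only: the dyadic layer assignment, the function names, and the dictionary standing in for a track group are assumptions made for this example, and they do not reflect the actual syntax or layer assignment rules of ISO/IEC 23090-18 or ISO/IEC 23090-9.

```python
def temporal_layer(frame_index, num_layers):
    """Hypothetical dyadic assignment: layer 0 holds every 2**(num_layers-1)-th frame."""
    for layer in range(num_layers):
        step = 2 ** (num_layers - 1 - layer)
        if frame_index % step == 0:
            return layer
    return num_layers - 1  # not reached: the last layer has step 1

def split_into_tracks(frames, num_layers):
    """Distribute the frames of one elementary stream over one track per layer."""
    tracks = {layer: [] for layer in range(num_layers)}
    for index, frame in enumerate(frames):
        tracks[temporal_layer(index, num_layers)].append((index, frame))
    return tracks

def reconstruct(tracks, max_layer):
    """Merge all tracks up to max_layer back into a single stream in frame order."""
    samples = [sample
               for layer, track in tracks.items() if layer <= max_layer
               for sample in track]
    return [frame for _, frame in sorted(samples, key=lambda s: s[0])]

frames = [f"frame{i}" for i in range(8)]
tracks = split_into_tracks(frames, num_layers=3)

print(reconstruct(tracks, max_layer=0))  # quarter frame rate: frame0, frame4
print(reconstruct(tracks, max_layer=1))  # half frame rate: frame0, frame2, frame4, frame6
print(reconstruct(tracks, max_layer=2) == frames)  # full frame rate restores the stream
```

A player that only needs a lower frame rate reads fewer tracks, while a player that needs the full frame rate merges all of them.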
MPEG reaches the First Milestone for the Interchange of 3D Graphics Formats
At the 144th MPEG meeting, MPEG Coding of 3D Graphics and Haptics (WG 7) promoted ISO/IEC 23090-28 (efficient 3D graphics media representation for render-based systems and applications) to Committee Draft (CD), the first stage of standard development. This standard aims to streamline the interchange of 3D graphics formats. It primarily tackles the challenge of consistent asset interchange among prevalent 3D formats such as glTF, USD, ITMF, and others across multiple rendering platforms. For instance, a glTF scene might not render consistently on different renderers or players due to existing interchange limitations. ISO/IEC 23090-28 addresses this by introducing a comprehensive metadata vocabulary designed to ensure compatibility between popular 3D model formats on platforms like Unity Technologies and Unreal Engine. It specifies initial mappings, starting with the alignment of ISO/IEC 23090-28 metadata to the glTF 2.0 specification, most recently recognized as ISO/IEC 12113. This standard is planned to be completed, i.e., to reach the status of Final Draft International Standard (FDIS), by the beginning of 2025.
MPEG announces Completion of Coding of Genomic Annotations
At the 144th MPEG meeting, MPEG Genomic Coding (WG 8) announced the completion of ISO/IEC 23092‑6 (coding of genomic annotations). This standard addresses the need to provide compressed representations of genomic annotations linked to the compressed representation of raw sequencing data and metadata.
ISO/IEC 23092-6 complements the existing MPEG genomics standards so that the series covers not only primary (raw sequencing data) and secondary (aligned sequencing data) genomic data but also tertiary genomic data, with efficient compression, indexing, and search capabilities. Tertiary data includes variant calls, gene expressions, mapping statistics, contact matrices (e.g., Hi-C), genomic track information, and functional annotations, which are collectively referred to as annotation data in the ISO/IEC 23092 series of standards. The formats specified in ISO/IEC 23092-6 also include advanced features such as selective encryption and signing of the data, auditing support, data provenance information, traceability, and support for direct linkage to external clinical data repositories expressed in common standard formats.