MPEG 140

MPEG 140 took place in Mainz from 2022-10-24 until 2022-10-28.

Press Release

MPEG evaluates the Call for Proposals on Video Coding for Machines

At the 140th MPEG meeting, MPEG Technical Requirements (WG 2) evaluated the responses to the Call for Proposals (CfP) for technologies and solutions enabling efficient video coding for machine vision tasks. A total of 17 responses to this CfP were received, covering technologies such as (i) learning-based video codecs, (ii) block-based video codecs, (iii) hybrid solutions combining (i) and (ii), and (iv) novel video coding architectures. Several proposals use a region-of-interest-based approach, in which different areas of the frames are coded at different quality levels.
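
The region-of-interest idea can be illustrated with a minimal sketch. It is not taken from any CfP response: the function name, the box format, and the quantization parameter (QP) values are all hypothetical, and boxes are assumed to lie within the frame. Blocks that overlap a region of interest get a lower QP (higher quality) than the background.

```python
from typing import List, Tuple

# Hypothetical ROI format: (x0, y0, x1, y1) in pixels, x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def roi_qp_map(width: int, height: int, block: int, rois: List[Box],
               qp_roi: int = 27, qp_background: int = 42) -> List[List[int]]:
    """Return a per-block QP grid; blocks overlapping an ROI get the lower QP."""
    cols = (width + block - 1) // block
    rows = (height + block - 1) // block
    grid = [[qp_background] * cols for _ in range(rows)]
    for x0, y0, x1, y1 in rois:
        # Visit every block touched by the box (boxes assumed inside the frame).
        for r in range(y0 // block, min(rows, (y1 - 1) // block + 1)):
            for c in range(x0 // block, min(cols, (x1 - 1) // block + 1)):
                grid[r][c] = qp_roi
    return grid


# Illustrative values only: a 128x64 frame, 32x32 blocks, one detected object.
qp_grid = roi_qp_map(width=128, height=64, block=32, rois=[(40, 10, 70, 30)])
for row in qp_grid:
    print(row)
```

An encoder that accepts per-block quality control could consume such a grid. How the regions are found and how quality is signalled differs between proposals and is not described in this press release.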

The responses to the CfP reported an improvement in compression efficiency of up to 57% on object tracking, up to 45% on instance segmentation, and up to 39% on object detection, in terms of bit rate reduction for equivalent task performance. Notably, all requirements defined by WG 2 were addressed by a variety of proposals.
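
As a worked illustration of "bit rate reduction for equivalent task performance", the following sketch compares a single operating point. The function name and the bit rates are made up for illustration. Formal evaluations usually average savings over several rate points rather than one.

```python
def bitrate_reduction_percent(anchor_kbps: float, proposal_kbps: float) -> float:
    """Percentage bit-rate saving of a proposal over an anchor at equal task accuracy."""
    if anchor_kbps <= 0:
        raise ValueError("anchor bit rate must be positive")
    return 100.0 * (anchor_kbps - proposal_kbps) / anchor_kbps


# Illustrative numbers only: both codecs are assumed to reach the same
# tracking accuracy, the anchor at 1000 kbit/s and the proposal at 430 kbit/s.
saving = bitrate_reduction_percent(anchor_kbps=1000.0, proposal_kbps=430.0)
print(f"{saving:.1f}% bit-rate reduction")
```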

Given the success of this call, MPEG will continue working on video compression methods for machine vision tasks. The work will continue in MPEG Video Coding (WG 4) within a new standardization project. A test model will be developed based on technologies from the responses to the CfP and results from the first round of core experiments in one or two meeting cycles. At the same time, the Joint Video Team with ITU-T SG 16 (WG 5) will study encoder optimization methods for machine vision tasks on top of existing MPEG video compression standards.

WG 2 thanks all proponents who submitted responses to this CfP. MPEG will continue to collect and solicit feedback to improve the test model for video coding for machines in the upcoming meetings.

MPEG evaluates Call for Evidence on Video Coding for Machines Feature Coding

At the 140th MPEG meeting, MPEG Technical Requirements (WG 2) evaluated the responses to the Call for Evidence (CfE) for technologies and solutions enabling efficient feature coding for machine vision tasks. A total of eight responses to this CfE were received, of which six were considered valid based on the conditions described in the call. The following results were reported:

  • For the tested video dataset, increases in compression efficiency of up to 87% compared to the video anchor and over 90% compared to the feature anchor were reported.
  • For the tested image dataset, the compression efficiency can be increased by over 90% compared to both image and feature anchors.

Based on the successful outcome of the CfE, WG 2 will continue working toward issuing a Call for Proposals (CfP). WG 2 thanks all proponents who submitted responses to this CfE.

MPEG reaches the First Milestone for Haptics Coding

At the 140th MPEG meeting, MPEG Coding of 3D Graphics and Haptics (WG 7) reached the first milestone in the approval process for the Haptics Coding standard (ISO/IEC CD 23090-31) by promoting the text to Committee Draft (CD) status. The CD comprises the MPEG-I Haptics Phase 1 codec specification, which includes a JSON descriptive format based on a parametric representation of haptics and a perceptually optimized wavelet compression format addressing temporal haptic signals. These formats allow the MPEG-I Haptics Phase 1 codec to be used for the creation, editing, and interchange of haptics as well as for the efficient encoding, distribution, streaming, and storage of haptics. The JSON format is compatible with the current glTF specification, allowing for future extensions of spatial and interactive haptics. The technologies selected for the CD include descriptive, human-readable representations and highly efficient psychophysical compression schemes, as well as support for both vibrotactile and kinesthetic devices. They incorporate a number of refinements and enhancements to the initial set of technologies retained after the call for proposals (termed RM0) that have passed rigorous objective and subjective perceptual tests designed to assess the quality of haptics at various bitrates.

MPEG completes a New Standard for Video Decoding Interface for Immersive Media

One of the most distinctive features of immersive media compared to 2D media is that only a tiny portion of the content is presented to the user. This portion is interactively selected at the time of consumption. For example, a user cannot see the front and back sides of the same point cloud object simultaneously. Thus, for efficiency reasons and depending on the user's viewpoint, only the front or the back side needs to be delivered, decoded, and presented. Similarly, parts of the scene behind the observer may not need to be accessed.

At the 140th MPEG meeting, MPEG Systems (WG 3) reached the final milestone of the Video Decoding Interface for Immersive Media (VDI) standard (ISO/IEC 23090-13) by promoting the text to Final Draft International Standard (FDIS). The standard defines the basic framework and specific implementation of this framework for various video coding standards, including support for application programming interface (API) standards that are widely used in practice, e.g., Vulkan by Khronos.

The VDI standard allows for dynamic adaptation of video bitstreams to provide the decoded output pictures in such a way that the number of actual video decoders can be smaller than the number of the elementary video streams to be decoded. In other cases, virtual instances of video decoders can be associated with the portions of elementary streams required to be decoded. With this standard, the resource requirements of a platform running multiple virtual video decoder instances can be further optimized by considering the specific decoded video regions to be presented to the users rather than considering only the number of video elementary streams in use. The first edition of the VDI standard includes support for the following video coding standards: High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC), and Essential Video Coding (EVC).
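
The resource argument above can be sketched as follows. This is a conceptual illustration, not the VDI API: the class, the function, and the region labels are hypothetical, and bitstream merging is not modelled. Only streams covering currently visible regions are scheduled, and they are spread over a pool of decoder instances that may be smaller than the number of streams.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class ElementaryStream:
    stream_id: str
    region: str  # hypothetical label for the part of the scene the stream covers


def assign_decoders(streams: List[ElementaryStream], visible_regions: Set[str],
                    num_decoders: int) -> Dict[int, List[str]]:
    """Spread the streams that need decoding over the decoder instances (round-robin)."""
    if num_decoders <= 0:
        raise ValueError("at least one decoder instance is required")
    needed = [s.stream_id for s in streams if s.region in visible_regions]
    assignment: Dict[int, List[str]] = {d: [] for d in range(num_decoders)}
    for i, stream_id in enumerate(needed):
        assignment[i % num_decoders].append(stream_id)
    return assignment


# Four elementary streams, but the viewer currently sees only two regions.
streams = [
    ElementaryStream("es0", "front"),
    ElementaryStream("es1", "back"),
    ElementaryStream("es2", "left"),
    ElementaryStream("es3", "right"),
]
plan = assign_decoders(streams, visible_regions={"front", "left"}, num_decoders=1)
print(plan)
```

In this toy setup, one decoder instance serves the two visible streams, while the two streams outside the view are never scheduled.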

MPEG completes Development of Conformance and Reference Software for Compression of Neural Networks

At the 140th MPEG meeting, MPEG Video Coding (WG 4) reached the final milestone for Conformance and Reference Software for Compression of Neural Networks (ISO/IEC 15938-18) by promoting the text to Final Draft International Standard (FDIS). It complements the recently published first edition of the standard for Compression of Neural Networks for Multimedia Content Description and Analysis (ISO/IEC 15938-17).

The neural network coding standard is designed as a toolbox of coding technologies. The specification contains different methods for three compression steps, i.e., parameter reduction (e.g., pruning, sparsification, and matrix decomposition), parameter transformation (e.g., quantization), and entropy coding methods, that can be assembled into encoding pipelines combining one or more (in the case of reduction) methods from each step. The reference software is written in Python and provides a framework defining interfaces for these three steps in the coding pipeline and components implementing all supported methods. Additionally, bitstreams for testing the conformance to the neural network coding standard are provided.
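
The three-step structure can be illustrated with a toy pipeline. This is not the reference software and does not use its interfaces. All function names are hypothetical, magnitude pruning and uniform quantization stand in for the reduction and transformation tools, and zlib (DEFLATE) stands in for the entropy coder of the standard.

```python
import struct
import zlib
from typing import List


def prune(weights: List[float], threshold: float) -> List[float]:
    """Parameter reduction: zero out weights whose magnitude is below the threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize(weights: List[float], step: float) -> List[int]:
    """Parameter transformation: uniform quantization to integer levels."""
    return [round(w / step) for w in weights]


def entropy_encode(levels: List[int]) -> bytes:
    """Entropy coding stand-in: pack levels as int8 (must fit -128..127), then DEFLATE."""
    return zlib.compress(struct.pack(f"{len(levels)}b", *levels))


def entropy_decode(payload: bytes) -> List[int]:
    """Inverse of entropy_encode."""
    raw = zlib.decompress(payload)
    return list(struct.unpack(f"{len(raw)}b", raw))


def dequantize(levels: List[int], step: float) -> List[float]:
    """Map integer levels back to reconstructed weight values."""
    return [q * step for q in levels]


# Illustrative weights only; a real model has millions of parameters.
weights = [0.02, -0.51, 0.0, 0.26, -0.01, 0.74, 0.03, -0.24]
step = 0.25
levels = quantize(prune(weights, threshold=0.05), step)
payload = entropy_encode(levels)
restored = dequantize(entropy_decode(payload), step)
print(levels)
print(restored)
```

Each stage exposes a plain function interface, so one method per step can be swapped without touching the others. That mirrors the toolbox idea described above, though only loosely.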

MPEG White Papers

At the 140th MPEG meeting, MPEG Liaison and Communication (AG 3) approved the following two MPEG white papers.

MPEG-H 3D Audio

The MPEG-H 3D Audio standard specifies a universal audio coding and rendering environment that is designed to efficiently represent high-quality spatial or immersive audio content for storage and transmission. Since there is no generally accepted “one-size-fits-all” format for immersive audio, it supports (i) common loudspeaker setups including mono, stereo, surround, and 3D audio (i.e., setups including loudspeakers above ear level and possibly below ear level) and (ii) rendering over a wide range of reproduction conditions (i.e., various loudspeaker setups or headphones, possibly with background noise in the listening environment).

MPEG-I Scene Description

MPEG has been working on technologies and standards for immersive media under the umbrella of the MPEG immersive media coding project (MPEG-I). MPEG Systems (WG 3) recognized the need for an interoperable and distributable scene description solution as a key element to foster the emergence of immersive media services and to enable the delivery of immersive content in the consumer market. As part of the MPEG-I project, WG 3 started investigating architectures for immersive media and possible solutions for a scene description format in 2017, which resulted in the ISO/IEC 23090-14 standard.

This white paper introduces ISO/IEC 23090-14, which provides a set of extensions under the “MPEG” prefix to Khronos glTF (also available as ISO/IEC 12113), as well as extensions to the MPEG-defined ISO Base Media file format, also known as ISO/IEC 14496-12 ISOBMFF. These extensions enable the description and delivery of timed immersive media into glTF-based immersive scenes. Furthermore, the standard defines an architecture together with an application programming interface (API) that allows the application to separate the access to the immersive timed media content from the rendering of this media. The white paper concludes with an outlook and future plans for the standard.

Standard documents published in MPEG 140

MPEG-I

# | Part | Title
2 | Omnidirectional Media Format | Technologies under Consideration for OMAF
3 | Versatile Video Coding | Preliminary working draft 2 of SEI processing order SEI message in VVC
3 | Versatile Video Coding | Test Model 18 for Versatile Video Coding (VTM 18)
4 | Immersive Audio | MPEG-I Immersive Audio Encoder Input Format, Version 3
6 | Immersive Media Metrics | Technologies under Consideration for ISO/IEC 23090-6
7 | Immersive Media Metadata | WD of ISO/IEC 23090-7 AMD 1 Common metadata for immersive media
7 | Immersive Media Metadata | Technologies under Consideration for Immersive media metadata
8 | Network based Media Processing | NBMP reference software and conformance framework
8 | Network based Media Processing | Technologies under Consideration for NBMP
10 | Carriage of Visual Volumetric Video-based Coding Data | Defect under investigation on ISO/IEC 23090-10
10 | Carriage of Visual Volumetric Video-based Coding Data | Technologies under consideration on carriage of V3C data
12 | Immersive Video | Test Model 15 for MPEG immersive video
12 | Immersive Video | Call for MPEG immersive video test materials
13 | Video Decoding Interface for Immersive Media | Technologies under consideration for VDI
13 | Video Decoding Interface for Immersive Media | Procedures for standard development and software of ISO/IEC 23090-13
14 | Scene Description for MPEG Media | WD of ISO/IEC 23090-14 AMD 2 Support for haptics, augmented reality, avatars, interactivity, MPEG-I audio and lighting
14 | Scene Description for MPEG Media | Technologies under Consideration on Scene Description
14 | Scene Description for MPEG Media | Procedures for standard development for ISO/IEC 23090-14 (MPEG-I Scene Description)
14 | Scene Description for MPEG Media | Exploration Experiments for MPEG-I Scene Description
14 | Scene Description for MPEG Media | Final registration of Khronos extensions for 1st edition
14 | Scene Description for MPEG Media | Draft registration of Khronos extensions 2nd edition
17 | Reference Software and Conformance for OMAF | WD of Reference software and conformance for omnidirectional media format (OMAF) 2nd edition
18 | Carriage of Geometry-based Point Cloud Compression Data | Potential improvement of ISO/IEC 23090-18 DAM 1 Support for temporal scalability
18 | Carriage of Geometry-based Point Cloud Compression Data | Technologies under Considerations on Carriage of geometry-based point cloud compression data
24 | Conformance and Reference Software for Scene Description for MPEG Media | WD of Conformance and reference software for scene description
24 | Conformance and Reference Software for Scene Description for MPEG Media | Procedures for test scenarios and reference software development for MPEG-I Scene Description
25 | Conformance and Reference Software for Carriage of Visual Volumetric Video-based Coding Data | Text of ISO/IEC CD 23090-25 Conformance and reference software for carriage of visual volumetric video-based coding data
26 | Conformance and Reference Software for Carriage of Geometry-based Point Cloud Compression Data | WD of ISO/IEC 23090-26 Conformance and reference software for carriage of geometry-based point cloud compression data
31 | Haptics coding | Text of ISO/IEC CD 23090-31 Haptics Coding
32 | Carriage of haptics data | WD of ISO/IEC 23090-32 Carriage of haptics data

MPEG-DASH

# | Part | Title
1 | Media Presentation Description and Segment Formats | Draft text of ISO/IEC 23009-1 5th edition FDAM 1 Alternative MPD event, nonlinear playback and other extensions
1 | Media Presentation Description and Segment Formats | WD of ISO/IEC 23009-1 5th edition AMD 2 EDRAP streaming and other extensions
1 | Media Presentation Description and Segment Formats | Technologies under Consideration for DASH
1 | Media Presentation Description and Segment Formats | Defects under Investigation on DASH
7 | Delivery of CMAF content with DASH | Exploration on alignment of ISOBMFF/DASH/CMAF terminology, concepts and solutions
9 | Encoder and packager synchronization | WD of ISO/IEC 23009-9 Redundant encoding and packaging for segmented live media (REAP)

MPEG-H

# | Part | Title
12 | Image File Format | Draft text of ISO/IEC 23008-12 DAM 1 Support for predictive image coding, bursts, bracketing and other improvements
12 | Image File Format | Preliminary WD of ISO/IEC 23008-12 2nd Edition AMD 2 Renderable text items and other improvements
12 | Image File Format | Technology under Consideration on ISO/IEC 23008-12
12 | Image File Format | Defect Report for ISO/IEC 23008-12:2017

MPEG-4

# | Part | Title
1 | Systems | Text of ISO/IEC CD 14496-1 5th edition Systems
12 | ISO base Media File Format | Potential improvement of ISO/IEC DIS 14496-12 8th edition ISO base media file format
12 | ISO base Media File Format | Preliminary WD of 14496-12 8th Edition AMD 1 Support for T.35, original sample duration and other improvements
12 | ISO base Media File Format | Technologies under Consideration for ISO/IEC 14496-12 (ISOBMFF)
12 | ISO base Media File Format | Defect Report of ISO/IEC 14496-12
14 | MP4 File Format | Technologies under Consideration for ISO/IEC 14496-14
15 | Carriage of Network Abstraction Layer (NAL) Unit Structured Video in the ISO base Media File Format | Draft text of ISO/IEC 14496-15 6th edition DAM 2 Picture-in-picture support and other extensions
15 | Carriage of Network Abstraction Layer (NAL) Unit Structured Video in the ISO base Media File Format | Preliminary WD of 14496-15 6th edition AMD 3 Support for neural-network post-filter supplemental enhancement information and other improvements
15 | Carriage of Network Abstraction Layer (NAL) Unit Structured Video in the ISO base Media File Format | Technologies under Consideration for ISO/IEC 14496-15
22 | Open Font Format | WD of ISO/IEC 14496-22 5th edition Open font format
32 | File Format Reference | Technology under consideration on ISO/IEC 14496-32 File format reference software and conformance
34 | Syntactic description language | Text of ISO/IEC CD 14496-34 Syntactic description language

MPEG-2

# | Part | Title
1 | Systems | Defects under investigation for ISO/IEC 13818-1

MPEG-IoMT

# | Part | Title
4 | Reference Software and Conformance for OMAF | IoMT Simulation Software and User Manual

MPEG-CICP

# | Part | Title
2 | Video | Preliminary working draft of additional colour type identifiers for AVC, HEVC and Video CICP

MPEG-C

# | Part | Title
9 | Film grain synthesis technology for video applications | Working draft 3 of ISO/IEC TR 23002-9 Film grain synthesis technology for video applications

MPEG-B

# | Part | Title
10 | Carriage of Timed Metadata Metrics of Media in ISO Base Media File Format | Technologies under Consideration for ISO/IEC 23001-10
11 | Energy-Efficient Media Consumption (green metadata) | WD of ISO/IEC 23001-11 AMD 1 Energy-efficient media consumption (green metadata) for EVC
16 | Derived Visual Tracks in the ISO Base Media File Format | Technologies under Consideration for Derived visual tracks including further visual derivations
17 | Carriage of Uncompressed Video in ISOBMFF | Draft text of ISO/IEC DIS 23001-17 Carriage of uncompressed video and images in ISOBMFF

MPEG-A

# | Part | Title
19 | Common Media Application Format (CMAF) for Segmented Media | Proposed Working Draft of ISO/IEC 23000-19 AMD New Structural CMAF Brand Profile
19 | Common Media Application Format (CMAF) for Segmented Media | Technology under consideration on CMAF
22 | Multi-Image Application Format (MIAF) | Text of ISO/IEC 23000-22 CDAM 3 Chroma subsampling and other technologies
23 | Decentralized media rights application format | Technologies under Consideration for ISO/IEC 23000-23

Administrative

Title
Request for offers to host an MPEG meeting (MPEG 143 - MPEG 150)

Explorations

# | Part | Title
34 | Video Coding for Machines | CfE response report for Video Coding for Machines
34 | Video Coding for Machines | CfP response report for Video Coding for Machines
36 | Neural Network-based Video Compression | Exploration experiment on neural network-based video coding (EE1)
36 | Neural Network-based Video Compression | Description of algorithms and software in Neural Network-based Video Coding (NNVC)
41 | Enhanced compression beyond VVC capability | Exploration experiment on enhanced compression beyond VVC capability (EE2)
41 | Enhanced compression beyond VVC capability | Algorithm description of Enhanced Compression Model 7 (ECM 7)
41 | Enhanced compression beyond VVC capability | Visual quality comparison of ECM/VTM encoding

Management

Title
WG2 AHGs established at the 9th MPEG WG2 meeting (MPEG 140)
MPEG Roadmap after the MPEG 140 meeting
MPEG Roadmap after the MPEG 140 meeting (Extended PPT)
List of SC 29/WG 03 AHGs established at the 9th meeting (MPEG 140)

All

Title
White paper on MPEG-H 3D Audio
White paper on MPEG-I Scene Description
Assets of communication
Press Release of MPEG 140th meeting