IBC2022 Technical Papers: Data Compression for 6 Degrees of Freedom Virtual Reality Applications


Summary

Six degrees of freedom (6 DoF) tracking is used in virtual reality (VR) applications to improve the user experience compared with standard 3 DoF solutions. Due to its scattered nature, 6 DoF information is usually represented as a point cloud, where each element describes the position of a point in 3D space together with its attributes (for example, color and transparency). Although it enhances the user experience, 6 DoF requires a larger data volume than 3 DoF, which has made content distribution difficult and has limited its use to high-end specialized devices. The goal of our work was to design a new point cloud compression system that allows real-time 6 DoF VR applications to run on high-end consumer devices, such as gaming laptops and desktops. Although our solution is designed specifically for the PresenZ 6 DoF VR movie format, it can easily be applied to other volumetric video formats as well.

Introduction

In a typical virtual reality (VR) scenario, degrees of freedom (DoF) describe how the movement of a user wearing a headset is tracked within a three-dimensional (3D) space, so that the image seen by the user can be adjusted accordingly. 3 DoF applications track only rotational motion around the x, y, and z axes (known as pitch, yaw, and roll), while 6 DoF applications also track translational motion (surge, sway, and heave), that is, forward/backward, left/right, and up/down movement. In addition to an improved user experience, 6 DoF VR can help reduce motion sickness and feelings of disorientation by providing a better sense of presence.
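As a rough illustration of the distinction, the sketch below models a headset pose with three rotational and three translational components. The `Pose` type, its field names and units, and the `restrictTo3DoF` helper are hypothetical names chosen for this example; they are not taken from any VR runtime or from the system described here.

```cpp
#include <cassert>

// Hypothetical headset pose; names and units are assumptions for illustration.
struct Pose {
    // Rotational components (radians), tracked by both 3 DoF and 6 DoF systems.
    float pitch = 0.0f;
    float yaw   = 0.0f;
    float roll  = 0.0f;
    // Translational components (metres), tracked only by 6 DoF systems.
    float surge = 0.0f;  // forward/backward
    float sway  = 0.0f;  // left/right
    float heave = 0.0f;  // up/down
};

// A 3 DoF application keeps the rotation but discards head translation.
inline Pose restrictTo3DoF(const Pose& p) {
    Pose out = p;
    out.surge = 0.0f;
    out.sway  = 0.0f;
    out.heave = 0.0f;
    return out;
}

// Usage: leaning 10 cm forward changes a 6 DoF pose but not its 3 DoF counterpart.
inline void exampleUsage() {
    Pose head;
    head.yaw   = 0.5f;
    head.surge = 0.10f;
    Pose limited = restrictTo3DoF(head);
    assert(limited.yaw == head.yaw);   // rotation is preserved
    assert(limited.surge == 0.0f);     // translation is dropped
}
```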

Due to its scattered nature, 6 DoF information is usually represented in the form of a point cloud, where each element describes the three-dimensional position of a point, as well as its color, transparency, orientation, and movement. It may also contain additional data, such as information about the camera(s) used to capture the 3D view. The actual number of points depends on the complexity of the visual scene: a typical frame may consist of more than 5 million points.

Although it enhances the user experience, 6 DoF requires a larger data volume than 3 DoF, which has made content distribution difficult and has limited its use to high-end specialized devices. The main challenges to address are: 1) high data throughput, which typically exceeds the capacity of conventional communication channels, such as 500 MB/s for solid-state drives (SSDs), and 2) real-time video rendering at relatively high frame rates (30 frames per second). In this work, we describe our approach to addressing these challenges using a new data compression scheme designed specifically for point cloud datasets.
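To put the first challenge in numbers, the following back-of-the-envelope sketch combines the figures quoted above (5 million points per frame, 30 frames per second, a 500 MB/s SSD) with an assumed uncompressed size of 16 bytes per point. That size is an assumption made only for this example (three 32-bit floats for position plus 8-bit RGBA); points that also carry orientation and movement would be larger. The function and constant names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Uncompressed data rate in bytes per second.
constexpr std::uint64_t rawBytesPerSecond(std::uint64_t pointsPerFrame,
                                          std::uint64_t bytesPerPoint,
                                          std::uint64_t framesPerSecond) {
    return pointsPerFrame * bytesPerPoint * framesPerSecond;
}

// Smallest compression ratio that fits the raw rate into the channel.
constexpr double requiredCompressionRatio(std::uint64_t rawRate,
                                          std::uint64_t channelRate) {
    return static_cast<double>(rawRate) / static_cast<double>(channelRate);
}

// Figures from the text: 5 million points per frame, 30 frames per second.
// ASSUMPTION: 16 bytes per point (3 x 32-bit float position + 8-bit RGBA).
constexpr std::uint64_t kRawRate = rawBytesPerSecond(5000000ULL, 16ULL, 30ULL);

// 500 MB/s SSD from the text, taken here as decimal megabytes.
constexpr std::uint64_t kSsdRate = 500000000ULL;

static_assert(kRawRate == 2400000000ULL, "5e6 points * 16 B * 30 fps = 2.4 GB/s");

// Runtime sanity check: under these assumptions the raw stream is several
// times larger than what the SSD can deliver.
inline void checkEstimate() {
    assert(requiredCompressionRatio(kRawRate, kSsdRate) > 4.0);
}
```

Under these assumptions the raw stream is about 2.4 GB/s, so a compression ratio of roughly 4.8:1 would be needed just to match the SSD read rate, before leaving any headroom for decoding and rendering.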

Our data compression format describes each frame individually and consists of a static header layer followed by several optional data layers. The static header layer describes basic information, such as the number of points and the color space used, as well as the types and attributes of the coding tools and techniques used for subsets of the point cloud. Depending on the information contained in the static header, there may be additional header layers in the bitstream that further describe encoding methods, parameters, and metadata. Finally, additional data layers store the encoded values for each attribute.
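The layered layout described above can be pictured with a minimal sketch of a static header that announces which optional layers follow. Everything concrete here is an assumption for illustration: the field names, the 6-byte little-endian encoding, and the flag values are hypothetical and do not reproduce the actual bitstream syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// HYPOTHETICAL flags telling the decoder which optional layers follow.
enum LayerFlag : std::uint8_t {
    kHasExtraHeader = 1u << 0,  // an additional header layer follows
    kHasPosition    = 1u << 1,  // data layer with encoded positions
    kHasColor       = 1u << 2,  // data layer with encoded colors
    kHasNormal      = 1u << 3   // data layer with encoded orientations
};

// HYPOTHETICAL static header: point count, color space id, layer flags.
struct StaticHeader {
    std::uint32_t pointCount;
    std::uint8_t  colorSpace;
    std::uint8_t  layerFlags;
};

constexpr std::size_t kStaticHeaderSize = 6;  // 4 + 1 + 1 bytes

inline bool hasLayer(const StaticHeader& h, LayerFlag f) {
    return (h.layerFlags & f) != 0;
}

// Serialize the header as little-endian bytes.
inline std::vector<std::uint8_t> writeStaticHeader(const StaticHeader& h) {
    std::vector<std::uint8_t> out;
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((h.pointCount >> shift) & 0xFFu));
    }
    out.push_back(h.colorSpace);
    out.push_back(h.layerFlags);
    assert(out.size() == kStaticHeaderSize);
    return out;
}

// Parse the header; returns false if the buffer is too short.
inline bool readStaticHeader(const std::vector<std::uint8_t>& in, StaticHeader& h) {
    if (in.size() < kStaticHeaderSize) {
        return false;
    }
    h.pointCount = 0;
    for (int i = 0; i < 4; ++i) {
        h.pointCount |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    }
    h.colorSpace = in[4];
    h.layerFlags = in[5];
    return true;
}
```

A decoder built along these lines would read the static header first, then use `hasLayer` to decide whether to parse further header layers and which attribute layers to expect, which is one way to keep each frame independently decodable.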

We also designed and implemented the Markup API, which allows a series of point cloud frames to be encoded and decoded in real time on high-end laptops and gaming desktops. Our encoder and decoder are implemented in C++, using technologies such as Hyper-Threading and Intel Single-Instruction-Multiple-Data (SIMD) instructions.

This paper reviews background work on point cloud compression and virtual reality applications, describes our approach in detail, presents our experimental results and conclusions, and discusses potential future developments.

