A Low Complexity Algorithm for 3D-HEVC Depth Map Intra Coding Based on MAD and ResNet
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/11045913/ |
Summary: | As an extension of HEVC, 3D-HEVC retains HEVC's quadtree structure and is currently recognized as the most widely adopted international standard for stereoscopic video coding. In intra coding, the quadtree partition is determined recursively through rate-distortion cost calculations, which demands extensive computation and results in high encoding complexity. To reduce this complexity, the paper proposes a deep learning-based encoding algorithm that replaces the coding unit (CU) partitioning process used in HTM. First, the Mean Absolute Difference (MAD) is introduced to quantify the dispersion of pixel values around the mean within a region. Based on the ratio of a CU's MAD to its pixel mean, $64\times 64$ CUs are classified as smooth or complex. For smooth CUs, partitioning is terminated early to avoid redundant rate-distortion optimization (RDO) computations. For complex CUs, a lightweight ResNet (Residual Neural Network) model is proposed in which standard convolutions are replaced with depthwise separable convolutions (DSC) to reduce the number of parameters. The model integrates local and global features to predict the partitioning at each depth, and takes the quantization parameter (QP) as an additional input to improve prediction accuracy. Experimental results indicate that, compared with the original HTM-16.2 method, the proposed approach reduces encoding time by 48.16% while increasing BDBR by only 0.28%. |
---|---|
ISSN: | 2169-3536 |
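
The two ideas in the summary that lend themselves to a short illustration are the MAD-to-mean ratio test and the parameter saving from depthwise separable convolutions. The Python sketch below is not the authors' implementation. The function names, the `threshold` value of 0.01, and the 64-channel 3x3 layer used in the example are assumptions made for illustration only; the record gives no threshold and no layer sizes. MAD is taken here as the mean absolute deviation of the samples from the block mean, which matches the description "dispersion of pixel values around the mean". Bias terms are ignored in the parameter counts.

```python
def mad_ratio(block):
    """Return MAD / mean for a 2D block of non-negative depth samples."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    if mean == 0:
        # All samples are zero, so there is no dispersion at all.
        return 0.0
    mad = sum(abs(p - mean) for p in pixels) / len(pixels)
    return mad / mean


def is_smooth(block, threshold=0.01):
    """Classify a CU as smooth when its MAD/mean ratio is small.

    The threshold is a hypothetical placeholder, not a value from the paper.
    """
    return mad_ratio(block) <= threshold


def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dsc_params(c_in, c_out, k):
    """Weights in a depthwise separable convolution (bias ignored):
    one k x k filter per input channel, then a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


# A flat 64x64 CU and one split by a sharp vertical depth edge.
flat_cu = [[128] * 64 for _ in range(64)]
edge_cu = [[50] * 32 + [200] * 32 for _ in range(64)]

for name, cu in (("flat", flat_cu), ("edge", edge_cu)):
    label = "smooth (stop splitting)" if is_smooth(cu) else "complex (use the network)"
    print(name, round(mad_ratio(cu), 3), label)

print(conv_params(64, 64, 3), dsc_params(64, 64, 3))
```

In this sketch a smooth CU would skip further quadtree splitting, while a complex CU would be passed to the partition-prediction network. For the assumed 64-to-64-channel 3x3 layer, the separable form needs 4672 weights instead of 36864.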