Abstract
Deep neural networks (DNNs) have been widely applied to coding tree unit (CTU) depth partitioning in High Efficiency Video Coding (HEVC), where they significantly reduce encoding complexity. However, existing DNN-based CTU partition methods neglect the feature correlation between coding units (CUs) at different scales and suffer from the accumulation of classification errors. To address these problems, this paper proposes a Multi-scale-multi-input Complementation Classification Network (MCCN) for faster and more accurate intra-mode HEVC CTU depth partitioning. First, a Multi-scale Multi-input Convolutional Neural Network (MMCNN) is proposed, which establishes the correlation between CUs at different scales by fusing their features and thereby strengthens the representation ability of the network. Second, a Complementary Classification Strategy (CCS) is proposed, which combines binary classification and three-class classification and uses a voting mechanism to determine the final depth of each CU in a CTU; this avoids the error accumulation of existing methods and yields more accurate CTU partitioning. Extensive experiments show that the proposed MCCN reduces HEVC encoding complexity further while partitioning CTUs more accurately: it lowers the average encoding complexity by 71.49% at the cost of only a 3.18% increase in average Bjøntegaard Delta Bit-Rate (BD-BR). In addition, the depth prediction accuracy for 32×32 CUs and 16×16 CUs improves by 0.65%~0.93% and 2.14%~9.27%, respectively.
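The abstract does not say how the binary and three-class outputs are turned into votes, so the following is only a minimal sketch of the general idea of deciding a CU depth by voting. It assumes the common 64×64 CTU (so depths 0 to 3 correspond to 64×64, 32×32, 16×16 and 8×8 CUs) and assumes that each classifier contributes one candidate depth. The names `cu_size` and `vote_depth` and the tie-breaking `fallback` argument are hypothetical and do not come from the paper.

```python
from collections import Counter

CTU_SIZE = 64  # assumption: the common 64x64 HEVC CTU


def cu_size(depth: int) -> int:
    """Edge length of a CU at quadtree depth 0..3 inside a 64x64 CTU."""
    if not 0 <= depth <= 3:
        raise ValueError("depth must be in 0..3")
    return CTU_SIZE >> depth  # 64, 32, 16, 8


def vote_depth(votes: list[int], fallback: int) -> int:
    """Return the depth proposed by most classifiers.

    `votes` holds one candidate depth per classifier (hypothetical
    interface). If the top two candidates tie, `fallback` is returned;
    this tie rule is an illustrative assumption, not the paper's rule.
    """
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return fallback
    return ranked[0][0]
```

For example, `vote_depth([2, 2, 1], fallback=1)` selects depth 2, which `cu_size` maps to a 16×16 CU. Because the candidates are combined in a single vote, one misclassification can be outvoted instead of being carried into later decisions, which is the motivation the abstract gives for the complementary strategy.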
Authors
TANG Shu (唐述)
ZHOU Guangyi (周广义)
XIE Xianzhong (谢显中)
ZHAO Yu (赵瑜)
YANG Shuli (杨书丽)
College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400064, China
Source
Journal of Electronics & Information Technology (《电子与信息学报》)
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2024, Issue 9, pp. 3646-3653 (8 pages)
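Funding
National Natural Science Foundation of China (61601070)
Natural Science Foundation of Chongqing, General Program (CSTB2023NSCQ-MSX0680)
Major Science and Technology Research Project of the Chongqing Municipal Education Commission (KJZD-M202300101)
Doctoral Innovative Talent Project of Chongqing University of Posts and Telecommunications (BYJS202217)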
Keywords
Deep Neural Networks (DNN)
Intra-mode High Efficiency Video Coding (HEVC)
Feature representation
Coding Tree Unit (CTU) partition
Multi-scale multi-input
Complementary classification