
A Fast Combination Algorithm for HEVC Intra-Prediction Based on Neural Network

To improve the real-time performance of High Efficiency Video Coding (HEVC) intra coding, this paper proposes a lightweight convolutional network that predicts the intra partition structure of each Coding Tree Unit (CTU). The network uses convolution kernels with even side lengths and even strides together with a self-attention mechanism, and its prediction spares the encoder the recursive quadtree traversal of the CTU, saving the associated encoding time. In the original encoding strategy, Rough Mode Decision (RMD) speeds up mode selection by using a cost based on the Sum of Absolute Transformed Differences (SATD), that is, the sum of absolute values of the Hadamard-transformed prediction residual, to estimate the rate-distortion cost used in Rate-Distortion Optimization (RDO); even so, RMD still consumes a noticeable amount of encoding time. A sampling search is therefore proposed that reduces the number of modes evaluated in RMD from 35 to 18, cutting the time spent computing estimated costs. The best candidate intra modes found by RMD are then passed to RDO. To reduce the number of candidates that RDO must evaluate, an early-termination decision filters out less likely candidates according to the gap between the estimated costs of adjacent angular intra prediction modes in the candidate list, which shortens the RDO computation. On the test sequences, the proposed algorithm reduces encoding time by 78.15% on average, with a BD-PSNR of -0.168 dB and a BD-rate of 3.49%.
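The SATD cost and the gap-based early termination mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`hadamard`, `satd`, `prune_candidates`), the relative threshold `gap_ratio`, and the cap `max_candidates` are assumptions introduced here, because the abstract does not give the threshold, the list length, or which 18 modes the sampling search evaluates. The sketch assumes square blocks whose side is a power of two, computes only the unnormalized SATD term (no mode-bit term), and takes the estimated costs of whichever modes were evaluated as input.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def satd(original, prediction):
    """Sum of absolute values of the Hadamard-transformed residual.

    Both blocks are square lists of lists with a power-of-two side.
    The result is unnormalized.
    """
    residual = [[o - p for o, p in zip(orow, prow)]
                for orow, prow in zip(original, prediction)]
    h = hadamard(len(residual))
    # The Sylvester matrix is symmetric, so H * R * H equals H * R * H^T.
    transformed = matmul(matmul(h, residual), h)
    return sum(abs(v) for row in transformed for v in row)


def prune_candidates(costs, max_candidates=3, gap_ratio=0.1):
    """Select the modes to hand to rate-distortion optimization.

    costs maps a mode index to its estimated cost. Candidates are ranked
    by ascending cost; the list is cut at the first candidate whose cost
    exceeds the previous one by more than gap_ratio * previous cost.
    Both parameters are hypothetical, chosen only for illustration.
    """
    ranked = sorted(costs.items(), key=lambda kv: kv[1])[:max_candidates]
    if not ranked:
        return []
    kept = [ranked[0][0]]
    for (_, prev_cost), (mode, cost) in zip(ranked, ranked[1:]):
        if cost - prev_cost > gap_ratio * prev_cost:
            break  # large gap: remaining candidates are unlikely to win
        kept.append(mode)
    return kept


# Hypothetical costs for Planar (0), DC (1), vertical (26), horizontal (10).
example_costs = {0: 100, 1: 105, 26: 110, 10: 200}
print(prune_candidates(example_costs, max_candidates=4))  # [0, 1, 26]
```

In this example the horizontal mode is dropped because its estimated cost jumps far above the preceding candidate, so only three modes would go through the full rate-distortion check. A relative threshold is used here so that the cut does not depend on block size; an absolute threshold would be an equally plausible reading of the abstract.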

Keywords: video coding; neural network; intra-frame prediction; fast algorithm

Fan Junyu, Song Lifeng


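School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, Guangdong, China

Huizhou Guangdong University of Technology IoT Collaborative Innovation Institute Co., Ltd., Huizhou 516025, Guangdong, China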


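Funding: Guangdong Provincial Science and Technology Innovation Strategy Special Project (Provincial Key Laboratory Accreditation), Grant No. 2021B1212050003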

2024

Journal of Guangdong University of Technology
Guangdong University of Technology


Impact factor: 0.628
ISSN: 1007-7162
Year, volume (issue): 2024, 41(3)