Space-time video super-resolution based on temporal feature refinement network

Space-time video super-resolution (STVSR) improves video quality along both the temporal and spatial dimensions, so that high-resolution, high-frame-rate video can still be presented in real time when capture devices, transmission, or storage are limited, meeting the demand for ultra-high-definition image quality. Compared with two-stage methods, one-stage methods perform frame interpolation at the feature level rather than the pixel level, and are clearly superior in inference speed and computational complexity. Some existing one-stage STVSR methods use feature interpolation based on pixel hallucination, which struggles to predict fast-moving objects between frames. To address this, a pyramid encoder-decoder network based on optical flow is proposed for temporal feature interpolation. It achieves fast bidirectional optical flow estimation and more realistic, natural texture synthesis, making the network structure more efficient while compensating for the instability that large motions introduce into optical flow estimation. In addition, the spatial module uses sliding-window-based local propagation and recurrent-network-based bidirectional propagation to strengthen frame alignment. The whole network is called the temporal feature refinement network (TFRnet). To further exploit the potential of TFRnet, spatial super-resolution is performed before temporal super-resolution (space-first). Experiments on several widely used benchmark datasets and evaluation metrics demonstrate the excellent performance of the proposed method, TFRnet-sf: the overall peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) improve, and the PSNR and SSIM of the inserted intermediate frames improve as well, which to some extent alleviates the large gap in PSNR and SSIM between inserted intermediate frames and original frames.
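
The abstract reports results in terms of PSNR and the PSNR gap between interpolated and original frames. As a rough illustration of what that gap measures, the following minimal sketch computes PSNR from its standard definition and then the difference in mean PSNR between the two groups of frames. The function names `psnr` and `psnr_gap`, the flat-list frame representation, and the 8-bit peak value of 255 are assumptions made for this example and are not taken from the paper.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized sequences of pixel values.

    Assumes 8-bit intensities by default (peak=255); this is an
    illustrative choice, not a setting reported in the paper.
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_gap(frame_psnrs, interpolated_indices):
    """Mean PSNR of original frames minus mean PSNR of interpolated frames.

    `frame_psnrs` holds one PSNR value per output frame, and
    `interpolated_indices` marks which of those frames were synthesized.
    """
    interpolated = set(interpolated_indices)
    synthesized = [p for i, p in enumerate(frame_psnrs) if i in interpolated]
    original = [p for i, p in enumerate(frame_psnrs) if i not in interpolated]
    if not synthesized or not original:
        raise ValueError("need at least one frame in each group")
    return sum(original) / len(original) - sum(synthesized) / len(synthesized)
```

A smaller value of `psnr_gap` means the synthesized intermediate frames are closer in quality to the frames that existed in the input sequence.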

space-time video super-resolution; pyramid encoder-decoder network; temporal feature interpolation; space-first strategy; deep learning

Yao Xiaojuan, Mu Ke, Pan Pei, Yang Ziyi, Zhao Yufei, Zhu Yonggui

School of Data Science and Media Intelligence, Communication University of China, Beijing 100024, China

2024

Journal of Nantong University (Natural Science Edition)
Nantong University

Impact factor: 0.292
ISSN: 1673-2340
Year, volume (issue): 2024, 23(3)