To address the high computational complexity and limited spatio-temporal feature extraction of 3D convolutional neural networks in video super-resolution tasks, this paper introduces a novel lightweight video super-resolution reconstruction network based on hybrid spatio-temporal convolution. First, a hybrid spatio-temporal convolution module is proposed to strengthen the network's spatio-temporal feature extraction while reducing its computational complexity. Second, a similarity-based selective feature fusion module is proposed to further improve the extraction of relevant features. Finally, an attention-based motion compensation module is designed to partially mitigate the effects of erroneous feature fusion. Experimental results demonstrate that the proposed network achieves a favorable balance between video super-resolution performance and network complexity, reaching 8 frames per second for 4× super-resolution on the benchmark dataset SPMCS-11. The proposed network thus meets the requirements for fast and accurate inference on edge devices.
video super-resolution; deep learning; 3D convolutional neural network; feature fusion