Reconstruction of Video Snapshot Compressive Imaging Based on Triple Self-Attention
Video snapshot compressive imaging (SCI) is a computational imaging technique that achieves efficient imaging through hybrid compression in both the temporal and spatial domains. In video SCI, the sparsity of the signal and its temporal and spatial correlations can be exploited by an appropriate reconstruction algorithm to recover the original video signal. Although recent deep learning-based reconstruction algorithms have achieved state-of-the-art results in many tasks, they still suffer from excessive model complexity and slow reconstruction speed. To address these issues, this research proposes an SCI reconstruction network based on triple self-attention, called SCT-SCI, which employs a multibranch grouped self-attention mechanism to exploit correlations in the spatial and temporal domains. The SCT-SCI model comprises a feature extraction module, a video reconstruction module, and a triple self-attention module called SCT-Block. Each SCT-Block comprises a window self-attention branch, a channel self-attention branch, and a temporal self-attention branch. In addition, the model introduces a spatial fusion module, called SC-2DFusion, and a global fusion module, called SCT-3DFusion, to enhance feature fusion. Experimental results on the simulated video dataset show that the proposed model has an advantage in complexity: it reduces reconstruction time by 31.58% relative to the EfficientSCI model while maintaining similar reconstruction quality, thus improving real-time performance.
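The abstract does not spell out the measurement process, so the following is only a minimal sketch of the forward model commonly assumed in the video SCI literature: T frames are modulated by T masks and summed into a single snapshot, Y = sum_t M_t * X_t. The function names (sci_measure, normalized_measurement), the binary random masks, and the array shapes are illustrative assumptions and are not taken from SCT-SCI. The normalized measurement is a coarse estimate often used as a network input in this literature; whether SCT-SCI uses it is not stated here.

```python
import numpy as np


def sci_measure(frames: np.ndarray, masks: np.ndarray) -> np.ndarray:
    """Compress T frames into one snapshot: Y = sum_t masks[t] * frames[t].

    frames, masks: arrays of shape (T, H, W). Names and shapes are
    hypothetical, chosen only for this illustration.
    """
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must share the shape (T, H, W)")
    # Element-wise modulation, then summation over the temporal axis.
    return (frames * masks).sum(axis=0)


def normalized_measurement(measurement: np.ndarray, masks: np.ndarray) -> np.ndarray:
    """Divide the snapshot by the per-pixel mask sum (0 where no mask is open)."""
    mask_sum = masks.sum(axis=0)
    out = np.zeros_like(measurement, dtype=float)
    # Divide only where at least one mask element is non-zero.
    np.divide(measurement, mask_sum, out=out, where=mask_sum > 0)
    return out


# Example setup: 8 frames of 4x4 pixels with random binary masks (assumed).
rng = np.random.default_rng(0)
frames = rng.random((8, 4, 4))
masks = (rng.random((8, 4, 4)) > 0.5).astype(float)
snapshot = sci_measure(frames, masks)            # shape (4, 4)
coarse = normalized_measurement(snapshot, masks)  # shape (4, 4)
```

A reconstruction network such as the one proposed would then map the snapshot and the masks back to the T frames; the sketch covers only the compression side that makes the inverse problem ill-posed.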