Infrared and Visible Image Fusion Method Based on Information Enhancement and Mask Loss
Aiming at the problems of detail weakening and edge blurring that infrared and visible fusion images suffer in low-light scenes, an image fusion method based on information enhancement and mask loss is proposed. First, because the source images degrade in low-light scenes, guided filtering is employed before fusion to enhance the texture details of the visible image and the edge information of the infrared image. Second, to fully extract and effectively fuse the feature information of images from different modalities, a dual-branch network is constructed for feature extraction. On top of this dual-branch feature extraction network, an interactive enhancement module based on guided filtering is designed; it integrates the complementary information of the two feature branches in a progressive, interactive way to strengthen the representation of texture details and salient targets in the features.

In the fusion stage, an attention guidance module operating along both the spatial and channel dimensions is constructed. Within the attention mechanism, maximum and average pooling scales are used to attend to the feature information; by combining attention guidance across different dimensions and scales, the key information in the features is amplified and redundant information is filtered out, improving the network's ability to perceive crucial features. For the loss function, a method of generating an infrared mask is proposed, and a mask loss built on this mask guides the fusion network to retain salient targets, texture details, and structural information in the target and background regions.

In the training phase, to improve the adaptability of the fusion network and reduce the risk of over-fitting, 1083 image pairs from the MSRS dataset are selected and expanded by cropping, with a crop size of 120 × 120 and a stride of 120; the resulting 21660 pairs of image patches serve as the model's training data. In the test phase, to comprehensively evaluate the fusion performance, comparative experiments are performed on three public datasets: MSRS, TNO, and LLVIP. Nine state-of-the-art fusion methods are selected for qualitative comparison, including the CNN-based SDNet, the GAN-based FusionGAN and GANMcC, the AE-based DenseFuse, RFN-Nest, and PIAFusion, the vision-task-driven SeAFusion and IRFS, and the Transformer-based SwinFusion. Five evaluation metrics are used for quantitative comparison: information entropy (EN), spatial frequency (SF), average gradient (AG), standard deviation (SD), and visual information fidelity (VIF). The experimental results show that the proposed method outperforms the comparison algorithms in both qualitative and quantitative evaluation on all three public datasets, and the generated fusion images exhibit rich texture details, clear salient targets, and good visual perception. Finally, to verify the effectiveness of each proposed component, ablation experiments are conducted on the image pre-enhancement processing, the interactive enhancement module, the mask loss, and the attention guidance module; the qualitative and quantitative results of these experiments confirm the effectiveness of each module in the fusion algorithm.
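To make the pre-enhancement step concrete, the sketch below implements guided-filter detail boosting in NumPy/SciPy: the image is smoothed by a guided filter (using itself as guide), the residual is treated as the texture/edge layer, and that layer is amplified before being added back. The radius, regularization eps, and gain values are illustrative assumptions rather than the paper's settings.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, radius=8, eps=1e-2):
    """Edge-preserving smoothing of src, steered by guide (He et al.)."""
    box = lambda x: uniform_filter(x, size=2 * radius + 1)
    mean_g, mean_s = box(guide), box(src)
    var_g = box(guide * guide) - mean_g * mean_g
    cov_gs = box(guide * src) - mean_g * mean_s
    a = cov_gs / (var_g + eps)        # per-window linear coefficient
    b = mean_s - a * mean_g
    return box(a) * guide + box(b)    # averaged local linear model

def pre_enhance(img, radius=8, eps=1e-2, gain=1.5):
    """Amplify the detail layer that guided-filter smoothing removes."""
    base = guided_filter(img, img, radius, eps)  # structure layer
    detail = img - base                          # texture / edge layer
    return np.clip(img + gain * detail, 0.0, 1.0)
```

Applied to the visible image this strengthens texture detail; applied to the infrared image it sharpens target edges.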
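The abstract does not detail the internals of the interactive enhancement module. One plausible reading, sketched below in PyTorch with hypothetical names, is a differentiable guided filter in feature space: each branch is filtered with the other branch as guide, and the cross-branch result is fused back, so the infrared branch absorbs visible texture while the visible branch absorbs infrared saliency.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def box(x, r):
    """Box filter that keeps spatial size (odd kernel, stride 1)."""
    return F.avg_pool2d(x, kernel_size=2 * r + 1, stride=1, padding=r)

def feature_guided_filter(guide, src, r=2, eps=1e-4):
    """Differentiable guided filter applied channel-wise to feature maps."""
    mean_g, mean_s = box(guide, r), box(src, r)
    var_g = box(guide * guide, r) - mean_g * mean_g
    cov = box(guide * src, r) - mean_g * mean_s
    a = cov / (var_g + eps)
    b = mean_s - a * mean_g
    return box(a, r) * guide + box(b, r)

class InteractiveEnhancement(nn.Module):
    """Sketch: each branch is filtered with the other branch as guide,
    and the cross-branch result is fused back via 1x1 convolutions."""
    def __init__(self, channels):
        super().__init__()
        self.fuse_ir = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.fuse_vi = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, f_ir, f_vi):
        ir_from_vi = feature_guided_filter(f_vi, f_ir)  # IR steered by VI texture
        vi_from_ir = feature_guided_filter(f_ir, f_vi)  # VI steered by IR saliency
        f_ir = self.fuse_ir(torch.cat([f_ir, ir_from_vi], dim=1))
        f_vi = self.fuse_vi(torch.cat([f_vi, vi_from_ir], dim=1))
        return f_ir, f_vi
```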
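The attention guidance module is described as combining the spatial and channel dimensions with maximum and average scales. A minimal CBAM-style sketch under that assumption (the reduction ratio and kernel size are illustrative):

```python
import torch
import torch.nn as nn

class AttentionGuidance(nn.Module):
    """Channel attention from global max/avg pooling, followed by spatial
    attention from per-pixel max/mean over channels (CBAM-style)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel dimension: shared MLP on max- and average-pooled descriptors.
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                          self.mlp(x.amax(dim=(2, 3))))
        x = x * w.view(b, c, 1, 1)
        # Spatial dimension: max and mean maps along the channel axis.
        s = torch.cat([x.amax(dim=1, keepdim=True),
                       x.mean(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```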
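The infrared-mask generation rule is not given in the abstract. A common and plausible choice, used in the sketch below, is to threshold the infrared image at its mean plus a multiple of its standard deviation, then penalize the fused image against the infrared image inside the mask (salient targets) and against the visible image outside it (background texture and structure). Both the threshold rule and the two L1 terms are assumptions.

```python
import torch
import torch.nn.functional as F

def infrared_mask(ir, k=1.0):
    """Hypothetical mask of salient (hot) targets: pixels brighter than
    mean + k * std per image. The paper's exact rule may differ."""
    thr = ir.mean(dim=(2, 3), keepdim=True) + k * ir.std(dim=(2, 3), keepdim=True)
    return (ir > thr).float()

def mask_loss(fused, ir, vi, k=1.0):
    """Pull the fused image toward IR intensity inside the target mask
    and toward visible-image content in the background region."""
    m = infrared_mask(ir, k)
    l_target = F.l1_loss(m * fused, m * ir)                  # salient targets
    l_background = F.l1_loss((1 - m) * fused, (1 - m) * vi)  # texture/structure
    return l_target + l_background
```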
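The training-patch arithmetic is internally consistent: MSRS images are 480 × 640 pixels, so a 120 × 120 window with stride 120 yields 4 × 5 = 20 patches per image, and 1083 × 20 = 21660 patch pairs. A minimal cropping routine:

```python
import numpy as np

def crop_patches(img, size=120, stride=120):
    """Tile an image into size x size patches with the given stride."""
    h, w = img.shape[:2]
    return [img[r:r + size, c:c + size]
            for r in range(0, h - size + 1, stride)
            for c in range(0, w - size + 1, stride)]

img = np.zeros((480, 640))           # one MSRS-sized image
assert len(crop_patches(img)) == 20  # 4 rows x 5 cols
# 1083 image pairs x 20 patches each = 21660 training patch pairs
```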