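Image reconstruction is one of the key steps in optical computational imaging. Current deep-learning-based image reconstruction mainly relies on models such as convolutional neural networks, recurrent neural networks, and generative adversarial networks. Most studies train the model on data of a single modality only, which makes it difficult to guarantee imaging quality while also generalizing across different scenes. To address this problem, a multi-modal image reconstruction model based on the Transformer module (Trans-MIR) is proposed. Experimental results show that Trans-MIR can extract image features from multi-modal data and achieve high-quality image reconstruction: the structural similarity for reconstructing 2D generic face speckle images reaches 0.93, and the mean squared error for super-resolution reconstruction of 3D microtubule structure images is as low as the order of 10⁻⁴. Trans-MIR offers some inspiration for research on multi-modal image reconstruction.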
Multi-modal image reconstruction method based on Trans-MIR model
Image reconstruction is one of the key steps in optical computational imaging. At present, image reconstruction based on deep learning mainly uses convolutional neural networks, recurrent neural networks, and generative adversarial networks. Most models are trained on data of a single modality only, which makes it difficult to ensure imaging quality while also generalizing across different scenes. To solve this problem, a multi-modal image reconstruction model based on the Transformer (Trans-MIR) is proposed in this paper. Experimental results show that Trans-MIR can extract image features from multi-modal data to achieve high-quality image reconstruction. The structural similarity of 2D generic face speckle reconstruction was as high as 0.93, and the mean squared error of 3D microtubule super-resolution reconstruction was as low as the order of 10⁻⁴. The model provides inspiration for the study of multi-modal image reconstruction.
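The two figures of merit quoted above, structural similarity and mean squared error, can be illustrated with a minimal sketch. The abstract does not state how the metrics were computed, so everything here is an assumption: the function names `mse` and `global_ssim` are hypothetical, images are treated as flat sequences of intensities in [0, 1], and the similarity is a simplified single-window (global) form of the standard SSIM formula rather than the usual sliding-window average.

```python
from statistics import fmean


def mse(x, y):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(x) != len(y):
        raise ValueError("images must have the same number of pixels")
    return fmean((a - b) ** 2 for a, b in zip(x, y))


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Simplified SSIM computed over the whole image as a single window.

    Assumption: standard SSIM averages this quantity over local windows;
    this global variant only illustrates the formula.
    """
    if len(x) != len(y):
        raise ValueError("images must have the same number of pixels")
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    mx, my = fmean(x), fmean(y)
    vx = fmean((a - mx) ** 2 for a in x)
    vy = fmean((b - my) ** 2 for b in y)
    cov = fmean((a - mx) * (b - my) for a, b in zip(x, y))
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return numerator / denominator


# Toy data (made up for illustration): every pixel is off by 0.01.
ground_truth = [0.0, 0.5, 1.0, 0.5]
reconstruction = [0.01, 0.49, 0.99, 0.51]
print(mse(ground_truth, reconstruction))
print(global_ssim(ground_truth, reconstruction))
```

In this toy case a per-pixel error of 0.01 on a [0, 1] intensity scale gives a mean squared error on the order of 10⁻⁴, which gives a rough sense of the scale of the reported figure.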