Most existing deep-learning-based image splicing localization methods focus primarily on deep-level features with limited receptive fields and overlook shallow-level features, which degrades localization accuracy. To address this, a novel image splicing localization network, UMTransNet, is proposed, combining an improved U-Net architecture with a multi-scale multi-view Transformer. The U-Net encoder is enhanced by replacing its max-pooling layers with convolutional layers to prevent the loss of shallow-level features. In addition, the multi-scale multi-view Transformer is embedded into the skip connections of the U-Net, enabling effective fusion of the Transformer's output features with the U-Net's upsampled features, thereby balancing deep-level and shallow-level features and improving localization accuracy. Visualized detection maps show that the proposed method localizes spliced tampered regions more accurately.
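To make the encoder modification concrete, the following is a minimal illustrative sketch (not the paper's implementation, and framework-free for brevity) of why replacing 2×2 max pooling with a 2×2 strided convolution can preserve shallow-level detail: max pooling keeps only one value per window and discards the rest, whereas a strided convolution forms a learned weighted sum in which every input value contributes. The weight matrix `w` here is a hypothetical example.

```python
def max_pool_2x2(fm):
    """2x2 max pooling with stride 2: keeps one value per window,
    discarding the other three (shallow detail is lost)."""
    n = len(fm)
    return [[max(fm[i][j], fm[i][j + 1], fm[i + 1][j], fm[i + 1][j + 1])
             for j in range(0, n, 2)] for i in range(0, n, 2)]

def strided_conv_2x2(fm, w):
    """2x2 convolution with stride 2: every input value contributes
    through a learned weight, so fine detail is retained in the output."""
    n = len(fm)
    return [[w[0][0] * fm[i][j] + w[0][1] * fm[i][j + 1]
             + w[1][0] * fm[i + 1][j] + w[1][1] * fm[i + 1][j + 1]
             for j in range(0, n, 2)] for i in range(0, n, 2)]

# A toy 4x4 single-channel feature map.
fm = [[1, 9, 2, 0],
      [3, 1, 4, 4],
      [0, 2, 8, 1],
      [5, 5, 3, 3]]
w = [[0.25, 0.25], [0.25, 0.25]]  # hypothetical learned weights

print(max_pool_2x2(fm))         # [[9, 4], [5, 8]]
print(strided_conv_2x2(fm, w))  # [[3.5, 2.5], [3.0, 3.75]]
```

In a real network the convolution weights are learned end-to-end, so the downsampling operator can adapt to keep whichever shallow features are useful for localization, instead of unconditionally discarding all but the maximum activation.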
Key words
digital image forensics/image splicing localization/U-Net/multi-scale perception/self-attention mechanism/cross-attention mechanism