Modern Electronics Technique (现代电子技术), 2025, Vol. 48, Issue 1: 33-39. DOI: 10.16652/j.issn.1004-373x.2025.01.006

UMTransNet: Image stitching and localization method combining U-Net and multi-scale perception Transformer

Zhang Wei¹, He Yueshun¹, Xie Haohao¹, Yang Anbo¹, Yang Chaowen¹, Lü Xiong¹
Author information

  • 1. School of Information Engineering, East China University of Technology, Nanchang 330013, Jiangxi, China

Abstract

Most current deep-learning-based image stitching (splicing) localization methods focus only on deep-level features and have limited receptive fields. They overlook shallow-level features, which degrades localization accuracy. To address these problems, this paper proposes UMTransNet, an image stitching localization network that combines an improved U-Net with a multi-scale multi-view Transformer. In the U-Net encoder, the max-pooling layers are replaced with convolutional layers to prevent the loss of shallow-level features. The multi-scale multi-view Transformer is embedded in the skip connections of the U-Net, and its output features are fused with the upsampled features of the U-Net to balance deep-level and shallow-level features, thereby improving localization accuracy. Visualized detection results show that the proposed method performs better at locating stitched tampered regions.
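The encoder change described in the abstract (max pooling replaced by convolution) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the 2x2 window, the stride of 2, the averaging weights in `kernel`, and the toy feature maps `a` and `b` are all assumptions chosen for illustration, since the abstract does not give these details. The sketch only shows why a strided convolution can retain low-magnitude (shallow) detail that max pooling discards.

```python
# Illustrative only: window size, stride, and weights are assumptions,
# not values taken from the paper.

def max_pool_2x2(fmap):
    """Downsample by keeping only the largest value in each 2x2 window."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def strided_conv_2x2(fmap, kernel):
    """Downsample with a 2x2 convolution of stride 2 (weights would be learned)."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            sum(
                kernel[i][j] * fmap[r + i][c + j]
                for i in range(2)
                for j in range(2)
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Two toy feature maps that differ only in low-magnitude detail.
a = [[9, 1, 0, 0],
     [1, 1, 0, 8],
     [0, 0, 7, 2],
     [0, 6, 2, 2]]
b = [[9, 3, 0, 0],
     [2, 1, 0, 8],
     [0, 0, 7, 1],
     [0, 6, 0, 2]]

kernel = [[0.25, 0.25], [0.25, 0.25]]  # hypothetical weights (a plain average)

pooled_a, pooled_b = max_pool_2x2(a), max_pool_2x2(b)
conv_a, conv_b = strided_conv_2x2(a, kernel), strided_conv_2x2(b, kernel)

print(pooled_a == pooled_b)  # True: the fine detail is invisible after max pooling
print(conv_a == conv_b)      # False: the strided convolution still reflects it
```

In a trained network the convolution weights are learned rather than fixed, so the downsampling step itself can adapt to whichever low-level cues are useful for localization.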

Key words

digital image forensics / image stitching localization / U-Net / multi-scale perception / self-attention mechanism / cross-attention mechanism

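Publication year

2025
Modern Electronics Technique (现代电子技术)
Shaanxi Electronics Magazine (陕西电子杂志社)

Indexed in: CSTPCD, Peking University Core Journals (北大核心)
Impact factor: 0.417
ISSN: 1004-373X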