Image tampering localization model for intensive post-processing scenarios
Shunquan Tan 1, Guiying Liao 1, Rongxuan Peng 2, Jiwu Huang 3
Author affiliations
- 1. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China; Shenzhen Key Laboratory of Media Information Content Security, Shenzhen 518060, Guangdong, China; Guangdong Laboratory of Intelligent Information Processing, Shenzhen 518060, Guangdong, China
- 2. Shenzhen Key Laboratory of Media Information Content Security, Shenzhen 518060, Guangdong, China; Guangdong Laboratory of Intelligent Information Processing, Shenzhen 518060, Guangdong, China; College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China
- 3. Guangdong Provincial Key Laboratory of Intelligent Perception and Computing, Department of Engineering, Shenzhen MSU-BIT University, Shenzhen 518116, Guangdong, China
Abstract
To address the challenge of tampering traces being blurred or destroyed by lossy operations, such as compression and rescaling, that social platforms like WeChat and Weibo apply to images, an image tampering localization model robust to intensive post-processing was proposed. The model adopted the pyramid vision transformer, built upon the Transformer architecture, as an encoder for extracting tampering features from images, and organized it into an end-to-end encoder-decoder structure reminiscent of the UNet architecture. The pyramid structure and attention mechanism inherent to the pyramid vision transformer allowed the model to flexibly attend to individual image regions; combined with the UNet-like structure, they extracted contextual correlations across multiple scales, giving the model good robustness to heavily post-processed images. Experimental results show that the proposed model substantially outperforms current mainstream tampering localization models against common post-processing operations such as JPEG compression and Gaussian blur, as well as on datasets representing diverse social media dissemination scenarios, demonstrating excellent robustness.
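The abstract describes a pyramid Transformer encoder feeding a UNet-like decoder that predicts a per-pixel tampering mask. A minimal PyTorch sketch of that architectural idea is given below. All names (`PyramidStage`, `PVTUNet`) and hyperparameters are illustrative assumptions, not the paper's implementation: PVT's spatial-reduction attention and overlapping patch embedding are simplified here to strided convolutions followed by standard Transformer layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PyramidStage(nn.Module):
    """One encoder stage: strided-conv downsampling + a Transformer layer
    (a simplified stand-in for PVT's patch embedding + attention block)."""

    def __init__(self, c_in, c_out, stride, heads):
        super().__init__()
        self.down = nn.Conv2d(c_in, c_out, kernel_size=stride, stride=stride)
        self.block = nn.TransformerEncoderLayer(
            d_model=c_out, nhead=heads, dim_feedforward=c_out * 4,
            batch_first=True, norm_first=True)

    def forward(self, x):
        x = self.down(x)                      # B, C, H, W
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)      # B, HW, C: one token per patch
        t = self.block(t)                     # self-attention over all patches
        return t.transpose(1, 2).reshape(b, c, h, w)


class PVTUNet(nn.Module):
    """Hypothetical sketch: 4-stage pyramid encoder + UNet-like decoder
    with skip connections, producing a per-pixel tampering logit map."""

    def __init__(self, channels=(32, 64, 128, 256)):
        super().__init__()
        cs = channels
        self.stages = nn.ModuleList([
            PyramidStage(3, cs[0], stride=4, heads=1),
            PyramidStage(cs[0], cs[1], stride=2, heads=2),
            PyramidStage(cs[1], cs[2], stride=2, heads=4),
            PyramidStage(cs[2], cs[3], stride=2, heads=8),
        ])
        self.ups = nn.ModuleList([
            nn.Conv2d(cs[3] + cs[2], cs[2], 3, padding=1),
            nn.Conv2d(cs[2] + cs[1], cs[1], 3, padding=1),
            nn.Conv2d(cs[1] + cs[0], cs[0], 3, padding=1),
        ])
        self.head = nn.Conv2d(cs[0], 1, kernel_size=1)

    def forward(self, x):
        feats = []
        for stage in self.stages:             # multi-scale feature pyramid
            x = stage(x)
            feats.append(x)
        y = feats[-1]
        for up, skip in zip(self.ups, reversed(feats[:-1])):
            y = F.interpolate(y, size=skip.shape[-2:], mode="bilinear",
                              align_corners=False)
            y = torch.relu(up(torch.cat([y, skip], dim=1)))  # UNet skip fusion
        # restore the mask to the input resolution (stage 1 stride was 4)
        return F.interpolate(self.head(y), scale_factor=4, mode="bilinear",
                             align_corners=False)


model = PVTUNet()
mask_logits = model(torch.randn(1, 3, 64, 64))
print(tuple(mask_logits.shape))  # (1, 1, 64, 64): one logit per pixel
```

The skip connections are what give the decoder access to fine-grained, high-resolution features, which matters when post-processing has weakened the tampering traces that survive only at certain scales.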
Keywords
intensive post-processing scenario / image tampering localization / robustness / pyramid vision transformer
Funding
National Natural Science Foundation of China (62272314)
National Natural Science Foundation of China (U23B2022)
Guangdong Provincial Key Laboratory Project (2023-B1212060076)
Publication year
2024