Image tampering localization model for intensive post-processing scenarios
To address the blurring or destruction of tampering traces caused by lossy operations such as image compression and scaling on social platforms like WeChat and Weibo, an adversarial image tampering localization model was introduced. The pyramid vision transformer, which is built upon the Transformer architecture, was used as the encoder to extract tampering features from images, and an end-to-end encoder-decoder structure resembling the UNet architecture was constructed. The pyramid structure and attention mechanisms inherent to the pyramid vision transformer allowed flexible examination of diverse image regions. Combined with the UNet-like architecture, they enabled multiscale contextual information extraction, thereby strengthening the model's resilience to intense post-processing. Experimental results show that the proposed model substantially outperforms conventional tampering localization models, particularly under common post-processing operations such as JPEG compression and Gaussian blur. Notably, the model remains highly robust on datasets representing diverse social media dissemination scenarios.
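The way a pyramid encoder and a UNet-like decoder combine into a multiscale pipeline can be illustrated with simple shape bookkeeping. The sketch below is not the proposed model: the stage strides (4, 8, 16, 32) and channel widths (64, 128, 320, 512) are assumed values typical of pyramid vision transformer variants, and the decoder rule (upsample by 2, concatenate the skip feature, fuse back to the skip's channel width) is a hypothetical UNet-style choice made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureMap:
    channels: int
    height: int
    width: int


# Assumed values, typical of pyramid vision transformer variants (not from the paper).
STAGE_STRIDES = (4, 8, 16, 32)
STAGE_CHANNELS = (64, 128, 320, 512)


def encoder_shapes(height, width):
    """Shapes of the four pyramid feature maps for a height x width input."""
    coarsest = STAGE_STRIDES[-1]
    if height % coarsest or width % coarsest:
        raise ValueError("this sketch assumes input sizes divisible by 32")
    return [
        FeatureMap(c, height // s, width // s)
        for c, s in zip(STAGE_CHANNELS, STAGE_STRIDES)
    ]


def decoder_shapes(features):
    """Walk from the coarsest map to the finest one, UNet style.

    Each step upsamples by 2, concatenates the encoder skip feature along the
    channel axis, then fuses the result to the skip's channel width
    (a hypothetical fusion layer; the paper does not specify one here).
    """
    current = features[-1]
    trace = [current]
    for skip in reversed(features[:-1]):
        upsampled = FeatureMap(current.channels, current.height * 2, current.width * 2)
        if (upsampled.height, upsampled.width) != (skip.height, skip.width):
            raise ValueError("skip and upsampled maps must share spatial size")
        concatenated = FeatureMap(
            upsampled.channels + skip.channels, skip.height, skip.width
        )
        current = FeatureMap(skip.channels, concatenated.height, concatenated.width)
        trace.append(current)
    return trace


def mask_shape(finest):
    """Single-channel localization mask, upsampled by the finest stride to input size."""
    stride = STAGE_STRIDES[0]
    return FeatureMap(1, finest.height * stride, finest.width * stride)


if __name__ == "__main__":
    pyramid = encoder_shapes(512, 512)
    decoded = decoder_shapes(pyramid)
    for fmap in pyramid:
        print("encoder", fmap)
    for fmap in decoded:
        print("decoder", fmap)
    print("mask   ", mask_shape(decoded[-1]))
```

Under these assumptions, the decoder sees features at four resolutions before predicting a mask at the input resolution. This is one plausible reading of how multiscale context could help when compression or blur has weakened fine-grained traces at any single scale.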