Most vision devices based on deep learning are equipped with an image signal processing (ISP) pipeline that converts RAW data into RGB images, together with an integrated data preprocessing stage for efficient image handling. Taking the combined influence of the ISP and data preprocessing into account, this paper builds an experimental platform for deep-learning vision applications and proposes an image-scaling attack targeting the ISP pipeline: a carefully crafted adversarial RAW input passes through the ISP to produce an attack image that, once scaled to a specific size, presents a completely different appearance. Because the proposed attack is driven by gradients of the ISP pipeline, an equivalent model is constructed to learn the transformation of the target ISP, and the approximate gradients of this model are used to launch the attack. Building the attack platform covers deep learning algorithms, image processing, and adversarial-attack optimization, helping students gain a deeper understanding of how deep-learning vision applications process tasks and where deep learning models are vulnerable, and cultivating their ability to innovate and practice on complex algorithmic problems.
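The core mechanism of an image-scaling attack can be illustrated with a minimal NumPy sketch (an assumption-laden simplification, not the paper's actual pipeline, which crafts RAW data through an ISP): nearest-neighbour downscaling samples only a sparse subset of source pixels, so overwriting exactly those pixels with a small target image leaves the full-size picture largely unchanged while the scaled view becomes the target.

```python
import numpy as np

def nn_downscale(img, out_size):
    """Nearest-neighbour downscaling: keeps one source pixel per output pixel."""
    H, W = img.shape[:2]
    rows = (np.arange(out_size) * H) // out_size
    cols = (np.arange(out_size) * W) // out_size
    return img[rows][:, cols]

def craft_scaling_attack(source, target):
    """Embed `target` pixels at exactly the positions the downscaler samples.

    The full-size result still resembles `source` (only a sparse subset of
    pixels changes), but it downscales exactly to `target`.
    """
    H, W = source.shape[:2]
    h, w = target.shape[:2]
    attack = source.copy()
    rows = (np.arange(h) * H) // h
    cols = (np.arange(w) * W) // w
    attack[np.ix_(rows, cols)] = target
    return attack

rng = np.random.default_rng(0)
source = rng.integers(0, 256, (256, 256), dtype=np.uint8)  # "benign" image
target = rng.integers(0, 256, (32, 32), dtype=np.uint8)    # hidden payload
attack = craft_scaling_attack(source, target)

# The scaled view is exactly the hidden target image.
assert np.array_equal(nn_downscale(attack, 32), target)
print(f"fraction of pixels modified: {np.mean(attack != source):.3%}")
```

Only about 1/64 of the pixels are touched here, which is why such attacks are hard to spot at full resolution; real attacks (and the one proposed in this paper) instead solve an optimization problem so the perturbation also survives smoother interpolation kernels and, in this work, the ISP transformation itself.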
Experimental platform for image-scaling attack against ISP processes in deep learning-based vision applications
[Objective] Deep learning techniques are extensively employed in visual tasks such as autonomous driving and facial recognition. These models often incorporate data preprocessing modules to align images from various sources with their input dimensions. However, recent studies indicate that deep learning models can be vulnerable to image-scaling attacks, in which carefully crafted images appear completely different once scaled, thereby misleading the model. Most vision devices leveraging deep learning are equipped with an image signal processing (ISP) pipeline, which converts RAW data to RGB images and integrates data preprocessing for efficient image handling. Despite the numerous adversarial attacks proposed against deep learning, many fail to consider the combined impact of the ISP and data preprocessing, which undermines their effectiveness against real visual applications. [Methods] We address this gap by accounting for the effects of both the ISP and data preprocessing. We construct an experimental platform for deep-learning-based vision applications and introduce an image-scaling attack targeting the ISP pipeline. The attack crafts adversarial RAW data that, once processed through the ISP and scaled to specific dimensions, exhibits a completely different appearance. The proposed attack relies on gradient information derived from the ISP process; however, obtaining gradients directly from a typically closed ISP pipeline is challenging. We therefore construct an equivalent model to learn the transformation of the target ISP and use its approximate gradients to launch the attack. Specifically, we devise an encoder-decoder architecture for the equivalent model to extract features and reconstruct the corresponding RGB images, and we train it on a dataset of RAW-RGB image pairs generated by the target ISP pipeline. As this model is
trained, it learns to capture the transformation process of the target ISP, providing the gradient approximation needed for the attack. We conducted extensive experiments to demonstrate the effectiveness of the proposed attack. [Results] The results were as follows: (1) Our strategy against the target ISP process achieved a 100% attack success rate, implying that the crafted adversarial RAW data is successfully transformed by the target ISP into an adversarial image that misleads the model's predictions after scaling. (2) The peak signal-to-noise ratio (PSNR) values between the adversarial RAW and the clean RAW were comparable to those between the generated adversarial image and the source image, implying that the perturbations introduced in the adversarial RAW are preserved in the generated attack image. Furthermore, all corresponding PSNR values exceeded 25 dB, underscoring the stealthiness of the attack. [Conclusions] Constructing this attack encompasses deep learning algorithms, image processing, and adversarial-attack optimization. It serves as a valuable educational resource for students to gain a deeper understanding of task processing in deep-learning-based visual applications, as well as of the vulnerabilities of deep learning models, fostering their innovative and practical capabilities in addressing complex algorithmic problems.
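The stealthiness metric reported above, PSNR, can be computed directly from its standard definition, 10·log10(MAX²/MSE). A minimal sketch follows; the images here are random placeholders, not the paper's data, and the perturbation magnitude is an arbitrary choice for illustration:

```python
import numpy as np

def psnr(clean, perturbed, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means the perturbation is less visible."""
    mse = np.mean((clean.astype(np.float64) - perturbed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
clean = rng.integers(0, 256, (64, 64), dtype=np.uint8)
noise = rng.integers(-3, 4, (64, 64))  # small additive perturbation
perturbed = np.clip(clean.astype(int) + noise, 0, 255).astype(np.uint8)

print(f"PSNR: {psnr(clean, perturbed):.1f} dB")
```

A PSNR above roughly 25–30 dB is commonly taken to mean the perturbation is difficult to perceive, which is the threshold the Results section uses to argue that both the adversarial RAW and the resulting attack image remain inconspicuous.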