Chinese Journal of Computers, 2024, Vol. 47, Issue (10): 2401-2416. DOI: 10.11897/SP.J.1016.2024.02401

Weather-Degraded Image Restoration Based on Visual Prompt Learning

Wen Yuanbo (文渊博) 1, Gao Tao (高涛) 2, An Yisheng (安毅生) 1, Li Ziqi (李子琦) 1, Chen Ting (陈婷) 1

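Author information

  • 1. School of Information Engineering, Chang'an University, Xi'an 710064
  • 2. Institute of Data Science and Artificial Intelligence, Chang'an University, Xi'an 710064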

Summary

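Although existing weather-degraded image restoration methods perform well on single-weather removal tasks, they cannot adapt to the changing weather types found in real scenes. To address this, this paper proposes a weather-degraded image restoration algorithm based on visual prompt learning, a new paradigm that combines pre-trained language-image models with the weather-degraded image restoration task. The algorithm first designs a query prompt constrained network (QPC-Net), which uses the text encoder and image encoder of the contrastive language-image pre-training (CLIP) model to directly encode, from a given degraded image, the latent descriptive features of its corresponding ground-truth background. The algorithm also includes an example prompt guided network (EPG-Net), which uses given example images to guide a pre-trained diffusion model in removing the corresponding weather degradation from the query image. Compared with an existing algorithm with a similar setting, the proposed algorithm improves peak signal-to-noise ratio (PSNR) by 2.11 dB on average and structural similarity (SSIM) by 4.74% on average across eight weather-degradation datasets.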
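To make the reported PSNR gain easier to interpret, the following is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE), written in pure Python. It is not taken from the paper: the function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are all assumptions chosen for illustration.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], restored: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as flat sequences of pixel intensities (an assumption
    made for this sketch); max_value is the largest possible intensity.
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel example: every pixel is off by 2 intensity levels.
clean = [0.0, 64.0, 128.0, 255.0]
restored = [2.0, 66.0, 126.0, 253.0]
print(f"{psnr(clean, restored):.2f} dB")
```

Under this definition, a 2.11 dB gain on a single image pair corresponds to the mean squared error shrinking by a factor of 10^(-0.211), roughly 0.62. The figure in the text is an average over datasets, so this is an intuition rather than an exact statement about any one image.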

Abstract

Images captured in real-world scenarios often suffer from weather degradations such as randomly occurring rain, haze and snow, which can occlude details and deteriorate content, thereby reducing the effectiveness of downstream high-level computer vision algorithms. Existing methods for weather-degraded image restoration can be categorized into task-specific, task-aligned and all-in-one types. However, the first two types require separate training for each weather degradation and struggle to adapt to the diverse weather conditions encountered in real-world scenes. Although all-in-one methods achieve competitive performance across adverse weather removal tasks, they also fail to adapt to unseen weather degradations, resulting in poor generalization. To this end, this work proposes a weather-degraded image restoration algorithm based on visual prompt learning, a novel paradigm that integrates a pre-trained language-image model with weather-degraded image restoration. Specifically, even text inputs with similar meanings may yield significantly different latent features when processed by the text encoder of the contrastive language-image pre-training (CLIP) model. The general expectation of image restoration is that, given a degraded image, the model generates its corresponding restored image rather than multiple different reconstructions. Directly using text to guide image reconstruction may therefore lead to an unstable solution space that often fails to meet this expectation. In response, a query prompt constrained network (QPC-Net) is introduced, which uses the text encoder and image encoder from CLIP to directly encode the latent descriptive features of the corresponding ground truth from the given degraded image. These latent features are further embedded into a pre-trained stable diffusion model through the cross-attention mechanism, thereby constraining the reverse sampling process and facilitating content reconstruction. QPC-Net consists of two image encoders, one with frozen parameters and the other trainable. Moreover, many existing weather-degraded image restoration algorithms primarily train strict pixel-level mappings between degraded and clean images and do not explore the knowledge specific to different image restoration tasks. This limitation makes it difficult for them to learn the context of weather-degraded image restoration tasks not covered in the training dataset, so they struggle to adapt to different restoration tasks. To address this issue, an example prompt guided network (EPG-Net) is developed, which uses given example images to guide the pre-trained stable diffusion model in learning the context knowledge of the corresponding restoration task, thereby removing the degradations from query images. Additionally, acquiring suitable example images for complex mixed weather-degraded image restoration tasks is challenging; however, EPG-Net can learn the context knowledge from multiple sets of example images. In experimental evaluations conducted on eight seen weather degradation datasets and seven unseen datasets, the proposed algorithm demonstrates significant improvements. Specifically, on the seen weather-degraded datasets, it achieves average improvements of 2.11 dB in peak signal-to-noise ratio (PSNR), 4.74% in structural similarity (SSIM), 41.08% in learned perceptual image patch similarity (LPIPS) and 24.25% in natural image quality evaluator (NIQE) compared with an existing algorithm with a similar setting. On the unseen weather-degraded datasets, it achieves average improvements of 1.88 dB in PSNR, 5.61% in SSIM, 21.40% in LPIPS and 29.29% in NIQE.
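The abstract states that the latent features from QPC-Net are embedded into the diffusion model through cross-attention. As a rough illustration of that mechanism only, the sketch below implements generic single-head scaled dot-product cross-attention in pure Python. It is not the paper's implementation: the function names are hypothetical, the learned query/key/value projections of a real attention layer are omitted, and tokens are plain lists of floats.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: List[Vector], keys: List[Vector],
                    values: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product cross-attention (no learned projections).

    queries: image-side tokens (e.g. diffusion feature-map positions).
    keys, values: conditioning tokens (e.g. prompt features), one value per key.
    Each output is a weighted average of the value vectors, so the
    conditioning steers what every image-side token receives.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every conditioning token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Mix the value vectors according to the attention weights.
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
    return outputs


# Hypothetical toy input: one image token attending over two prompt tokens.
image_tokens = [[1.0, 0.0]]
prompt_keys = [[10.0, 0.0], [0.0, 10.0]]
prompt_values = [[1.0, 0.0], [0.0, 1.0]]
print(cross_attention(image_tokens, prompt_keys, prompt_values))
```

In this toy input the image token aligns with the first prompt key, so its output is dominated by the first value vector. This is the sense in which conditioning features can constrain what the denoising network reconstructs.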

Key words

computer vision / visual prompt learning / in-context learning / image restoration / diffusion model


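Funding

National Key Research and Development Program (2023YFB2504703)

Shaanxi Province International Science and Technology Cooperation Program (2024GH-YBXM-24)

National Natural Science Foundation of China (52172379)

Fundamental Research Funds for the Central Universities, Chang'an University (300102242901)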

Publication year

2024
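Chinese Journal of Computers
China Computer Federation; Institute of Computing Technology, Chinese Academy of Sciences

Indexing: CSTPCD, Peking University Core Journals
Impact factor: 3.18
ISSN: 0254-4164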