One-shot image domain adaptation based on dual guidance diffusion models

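Abstract (translated from the Chinese): To avoid the loss of content information during inversion and reconstruction in existing one-shot image domain adaptation algorithms, this paper proposes a one-shot image domain adaptation algorithm that achieves content alignment by using CLIP (contrastive language-image pretraining) and ViT (vision transformer) to jointly guide the denoising of a diffusion model. First, a diffusion-based domain inversion algorithm is designed that inverts an image in the target domain to the source domain through a pre-trained diffusion model, yielding an image pair with the same content but different domain information. Second, the image pair is mapped into the latent space of the CLIP model, where a content-dominant direction and a domain-dominant direction account for content information and domain information, respectively; at the same time, the image pair is mapped into the latent space of the ViT model, where content information and domain information are each constrained through contrastive learning. Finally, conditionally guided denoising is used to translate an arbitrary source-domain image to the target domain. The algorithm also applies to translation between unseen domains and to multi-attribute editing. Qualitative and quantitative experiments show that the algorithm improves on other state-of-the-art algorithms by 2% to 27% across multiple performance metrics.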

English title: One-shot image domain adaptation based on dual guidance diffusion models

Abstract (published English version): To avoid the loss of content information during the inversion and reconstruction process of existing one-shot image domain adaptation algorithms, a new approach was proposed that uses contrastive language-image pretraining (CLIP) and a vision transformer (ViT) to jointly guide the denoising of a diffusion model, resulting in a one-shot image domain adaptation algorithm with content alignment. Firstly, a domain inversion algorithm based on a diffusion model was proposed, which inverts images in the target domain to the source domain using a pre-trained diffusion model. Image pairs were thus obtained with the same content but different domain information. Then the image pairs were mapped into the latent space of the CLIP model, where content and domain information were taken into account through a content-dominant direction and a domain-dominant direction, respectively. Additionally, the image pairs were mapped into the latent space of the ViT model, with content and domain information constrained separately through contrastive learning. Finally, a conditionally guided denoising method was used to convert arbitrary source-domain images to the target domain. The proposed algorithm can also be applied to tasks including translation between unseen domains and multi-attribute editing. Qualitative and quantitative experimental results demonstrate that the algorithm improves multiple performance indicators by 2% to 27% compared with other advanced algorithms.

English keywords:

one-shot image domain adaptation; dual guidance diffusion model; content alignment; domain inversion; conditional guided denoising

Authors:

Zhang Yanbo (张研博), Pu Yuanyuan (普园媛), Zhao Zhengpeng (赵征鹏), Yang Qiuxia (阳秋霞), Xu Dan (徐丹), Li Siqi (李思奇)

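Author affiliations:

School of Information Science and Engineering, Yunnan University, Kunming 650504

Yunnan Provincial University Key Laboratory of Internet of Things Technology and Application (Yunnan University), Kunming 650504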

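Keywords (Chinese index terms):

one-shot image domain adaptation (单样本图像域自适应); dual guidance diffusion model (双指导扩散模型); content alignment (内容对齐); domain inversion (域反转); conditional guided denoising (条件化指导去噪)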

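Funding:

National Natural Science Foundation of China; Key Project of the Applied Basic Research Program of the Yunnan Provincial Department of Science and Technology

Grant numbers:

62362070; 202001BB050043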

Year of publication:

2024

China Sciencepaper (中国科技论文)

Science and Technology Development Center, Ministry of Education


Impact factor: 0.466

ISSN: 2095-2783

Year, volume (issue): 2024, 19(2)

Number of references: 28