One-shot image domain adaptation based on dual guidance diffusion models
To avoid the loss of content information that occurs during reverse reconstruction in existing one-shot image domain adaptation algorithms, a new denoising approach was proposed that guides a diffusion model with both contrastive language-image pretraining (CLIP) and a vision transformer (ViT), yielding a content-aligned one-shot image domain adaptation algorithm. Firstly, a domain inversion algorithm based on a diffusion model was proposed, which could invert images in the target domain to the source domain using a pre-trained diffusion model, producing image pairs with the same content but different domain information. Then the image pairs were mapped into the latent space of the CLIP model, where content and domain information were accounted for through a content-dominant direction and a domain-dominant direction, respectively. Additionally, the image pairs were mapped into the latent space of the ViT model, with content and domain information constrained separately through contrastive learning. Finally, a conditionally guided denoising method was used to convert arbitrary source-domain images to the target domain. The proposed algorithm could also be applied to tasks including unseen-domain conversion and multi-attribute editing. Qualitative and quantitative experimental results demonstrate that the algorithm improves multiple performance metrics by 2% to 27% compared with other advanced algorithms.
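
The abstract does not give the exact form of the CLIP guidance terms, so the following is only a minimal sketch of a generic directional loss of the kind commonly used with CLIP guidance, not the paper's own formulation. It assumes that the reference image pair (target-domain image and its inverted source-domain counterpart) defines a direction in embedding space, and that the edit applied to a new source image should point the same way. The function names `cosine` and `directional_loss` are hypothetical, and short toy vectors stand in for real CLIP image embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def directional_loss(src_ref, tgt_ref, src_img, out_img):
    """Return 1 - cos(reference direction, edit direction).

    src_ref, tgt_ref: embeddings of the reference pair (assumed inputs).
    src_img, out_img: embeddings of a source image and its converted output.
    The loss is 0 when both directions are parallel and 2 when opposite.
    """
    ref_dir = [t - s for s, t in zip(src_ref, tgt_ref)]
    out_dir = [o - s for s, o in zip(src_img, out_img)]
    return 1.0 - cosine(ref_dir, out_dir)

# Toy 3-D vectors standing in for image embeddings (illustrative only).
src_ref = [1.0, 0.0, 0.0]
tgt_ref = [1.0, 1.0, 0.0]
src_img = [0.0, 0.0, 1.0]
aligned = [0.0, 2.0, 1.0]      # moves along the reference direction
misaligned = [0.0, -2.0, 1.0]  # moves against the reference direction

loss_aligned = directional_loss(src_ref, tgt_ref, src_img, aligned)
loss_misaligned = directional_loss(src_ref, tgt_ref, src_img, misaligned)
```

In a guided denoising loop, a term of this shape would typically be evaluated on the current denoised estimate at each step and its gradient used to steer sampling; how the content-dominant and domain-dominant directions are weighted is not specified in the abstract and is left out of this sketch.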