Image Generation Method for Cognizing Image Attribute Features from the Perspective of Disentangled Representation Learning

CAI Jianghai (蔡江海)¹, HUANG Chengquan (黄成泉)², WANG Shunxia (王顺霞)³, LUO Senyan (罗森艳)³, YANG Guiyan (杨贵燕)³, ZHOU Lihua (周丽华)³

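Author information

1. Key Laboratory of Pattern Recognition and Intelligent Systems of Guizhou Province, Guizhou Minzu University, Guiyang 550025; School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025
2. Key Laboratory of Pattern Recognition and Intelligent Systems of Guizhou Province, Guizhou Minzu University, Guiyang 550025; School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025; Practical Training Center for Engineering and Technical Talents, Guizhou Minzu University, Guiyang 550025
3. School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025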

Abstract

In generative artificial intelligence, research on disentangled representation learning has advanced image generation methods. However, existing disentanglement methods focus mainly on low-dimensional representations for image generation and overlook the inherent interpretable factors of the target variation image, so the generated images are easily affected by irrelevant attribute features. To address this issue, an image generation method for cognizing image attribute features from the perspective of disentangled representation learning is proposed. First, starting from the latent space of the generative model, candidate traversal directions for the target variation image are obtained by training. Second, an unsupervised semantic decomposition strategy is constructed, and the interpretable directions embedded in the latent space are jointly discovered based on the candidate traversal directions. Finally, a contrast simulator and a variation space are constructed using disentangled encoders and contrastive learning, and the disentangled representations of the target variation image are then extracted from the interpretable directions to generate the image. Experiments on five popular disentanglement datasets show that the proposed method achieves superior performance.
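The idea of traversing a latent space along candidate directions can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the decomposition strategy, so plain Gram-Schmidt orthogonalization is used here as an assumed stand-in, the candidate directions are random rather than learned, and no generator is included. The names `orthonormalize` and `traverse` are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(candidates, eps=1e-10):
    """Gram-Schmidt: turn candidate directions into mutually orthogonal unit vectors.

    Assumed stand-in for the paper's unsupervised semantic decomposition.
    """
    basis = []
    for c in candidates:
        r = list(c)
        for b in basis:
            proj = dot(r, b)
            r = [x - proj * y for x, y in zip(r, b)]
        norm = math.sqrt(dot(r, r))
        if norm > eps:  # skip candidates already spanned by earlier directions
            basis.append([x / norm for x in r])
    return basis


def traverse(z, direction, steps):
    """Shift latent code z along one direction by each step size."""
    return [[zi + s * di for zi, di in zip(z, direction)] for s in steps]


rng = random.Random(0)
dim = 8  # hypothetical latent dimensionality
candidates = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)]
directions = orthonormalize(candidates)

z = [rng.gauss(0, 1) for _ in range(dim)]
steps = [-2.0, -1.0, 0.0, 1.0, 2.0]
codes = traverse(z, directions[0], steps)  # each code would be fed to a generator
```

Because the directions are orthogonal, moving along one of them leaves the coordinates along the others unchanged, which is the property a disentangled traversal aims for.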

Key words

Disentangled Representation Learning / Latent Space / Interpretable Direction / Image Generation / Variation Space

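Funding

National Natural Science Foundation of China (62062024)

Guizhou Provincial Science and Technology Program (Qiankehe Jichu-ZK[2021] General 342)

Key Project of Guizhou Provincial Graduate Education and Teaching Reform (Qianjiaohe YJSJGKT[2021]018)

Natural Science Research Project of the Guizhou Provincial Department of Education (Qianjiaoji[2022]015)

2022 Open Project of the Key Laboratory of Pattern Recognition and Intelligent Systems of Guizhou Province (GZMUKL[2022]KF03)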

Publication year

2024

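Pattern Recognition and Artificial Intelligence (模式识别与人工智能)

Chinese Association of Automation; National Research Center for Intelligent Computing Systems; Hefei Institute of Intelligent Machines, Chinese Academy of Sciences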

Indexed in: CSTPCD, CSCD, Peking University Core Journals

Impact factor: 0.954

ISSN: 1003-6059

Number of references: 2
