Survey of text-to-image synthesis

Text-to-image generation tasks were comprehensively evaluated and categorized. Based on the underlying principle of image generation, the methods were classified into three major categories: those based on the generative adversarial network (GAN) architecture, those based on the autoregressive model architecture, and those based on the diffusion model architecture. The GAN-based methods were further grouped into six subcategories according to the technique they improve: multi-level nested architectures, attention mechanisms, siamese networks, cycle-consistency methods, deep fusion of text features, and improved unconditional models. Building on the analysis of these methods, the evaluation metrics and datasets commonly used for text-to-image generation were summarized and discussed.

AI-generated content; text-to-image; generative adversarial network; autoregressive model; diffusion model

Cao Yin (曹寅), Qin Junping (秦俊平), Ma Qianli (马千里), Sun Hao (孙昊), Yan Kai (闫凯), Wang Lei (王磊), Ren Jiaqi (任家琪)

College of Data Science and Application, Inner Mongolia University of Technology, Hohhot, Inner Mongolia 010000

Inner Mongolia Autonomous Region Engineering Technology Research Center of Big Data-Based Software Service, Hohhot, Inner Mongolia 010000

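National Natural Science Foundation of China (61962044); Natural Science Foundation of Inner Mongolia Autonomous Region (2019MS06005); Science and Technology Major Project of Inner Mongolia Autonomous Region (2021ZD0015); Basic Scientific Research Funds for Universities Directly under the Autonomous Region (JY20220327)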

2024

Journal of Zhejiang University (Engineering Science)
Zhejiang University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.625
ISSN: 1008-973X
Year, volume (issue): 2024, 58(2)