A Method for Semantic Image Synthesis with Global Information Enhancement

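Abstract (translated from Chinese): Semantic image synthesis is an important research and application direction in image translation. It aims to generate realistic images that match the image description from an input semantic image, such as a semantic segmentation map, a map, or a line drawing. Semantic image synthesis based on generative adversarial networks (GANs) lacks global information, which leads to blurry features and poorly correlated texture details in the generated images. To address this, a global-information-enhanced semantic image synthesis method is proposed that builds on the pix2pix network and incorporates an external attention mechanism. First, an external attention mechanism is introduced in the upsampling stage of the U-net generator to strengthen the spatial correlation between pixels of the generated image. Second, deep residual modules are used in the generator's upsampling layers to improve image quality while increasing the diversity of the generated images. Finally, global information is introduced into the discriminator to strengthen its discriminative ability. In experiments on the cityscape, landscape, and edges2shoes datasets, the improved method improves on the baseline model's FID by 57.37, 26.74, and 1.78, respectively. The results show that the model effectively uses global information to strengthen the correlation of texture details in generated images and to improve image quality.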
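The abstract does not specify how the external attention module is implemented. As a rough illustration only, the sketch below follows the commonly published formulation of external attention (two small learnable memories shared across all samples, with a double normalization of the attention map). It is written in NumPy, not taken from the paper, and the function name, array sizes, and random initialization are all hypothetical.

```python
import numpy as np

def external_attention(features, mem_k, mem_v, eps=1e-9):
    """Apply external attention to a flattened feature map (illustrative sketch).

    features: (N, d) array, one row per pixel of the feature map
    mem_k, mem_v: (S, d) arrays, the external key and value memories
    Returns the attended features (N, d) and the attention map (N, S).
    """
    logits = features @ mem_k.T                           # (N, S) pixel-to-memory similarity
    logits = logits - logits.max(axis=0, keepdims=True)   # stabilize the exponentials
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=0, keepdims=True)         # softmax over the pixel axis
    attn = attn / (eps + attn.sum(axis=1, keepdims=True)) # L1-normalize over memory slots
    return attn @ mem_v, attn

# Hypothetical sizes: a 4x4 feature map (N = 16 pixels), d = 8 channels, S = 4 memory slots.
rng = np.random.default_rng(0)
N, d, S = 16, 8, 4
feats = rng.standard_normal((N, d))
mem_k = 0.1 * rng.standard_normal((S, d))  # assumed small random init, for illustration
mem_v = 0.1 * rng.standard_normal((S, d))
out, attn = external_attention(feats, mem_k, mem_v)
print(out.shape, attn.shape)
```

Because the memories are independent of the input, every pixel attends to the same S slots, which is one plausible way for an upsampling layer to pick up dataset-level (global) context at a cost linear in the number of pixels. Where exactly the authors place the module and what sizes they use is not stated in the abstract.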

English title: A Method for Semantic Image Synthesis with Global Information Enhancement

English abstract: Semantic image synthesis is an important application and research direction in the field of image translation. Its aim is to generate realistic images that are consistent with image descriptions from input semantic images, such as semantic segmentation maps, maps, and sketches. Semantic image synthesis based on generative adversarial networks (GANs) suffers from blurry image features and poorly correlated texture details because it lacks global information. To address these problems, this paper proposes a global information-enhanced semantic image synthesis method based on the pix2pix network model, combined with an external attention mechanism. First, an external attention mechanism is introduced in the upsampling stage of the generator with a U-net structure to enhance the spatial correlation between generated image pixels. Second, deep residual modules are used in the upsampling layers of the generator to improve the quality of generated images while enhancing their diversity. Finally, the discriminator incorporates global information to enhance its discrimination ability. Experimental evaluations on the Cityscape, Landscape, and Edges2shoes datasets demonstrate the effectiveness of the proposed model. Compared to the baseline model, the improved method achieves improvements of 57.37, 26.74, and 1.78 in terms of the FID (Fréchet Inception Distance) metric for the Cityscape, Landscape, and Edges2shoes datasets, respectively. The results show that the proposed model can effectively utilize global information to enhance the correlation of texture details in generated images and improve their quality.
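For reference, the FID metric cited in the abstract is usually defined as the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. The symbols below ($\mu_r, \Sigma_r$ for real images, $\mu_g, \Sigma_g$ for generated images) are notation introduced here, not taken from the paper:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Lower FID indicates that the generated distribution is closer to the real one, so the reported improvements of 57.37, 26.74, and 1.78 presumably correspond to reductions in FID relative to the baseline.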

English keywords:

image-to-image translation; semantic image synthesis; generative adversarial network; deep learning; computer vision

Authors:

Liu Yong (刘勇), Li Junqi (李俊岐), Chen Yongqiang (陈永强)

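Author affiliations:

School of Computer Science and Artificial Intelligence, Wuhan Textile University (武汉纺织大学计算机与人工智能学院)

Hubei Provincial Engineering Research Center for Garment Informatization (湖北省服装信息化工程技术研究中心), Wuhan, Hubei 430200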

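Keywords (translated from Chinese):

image translation; semantic image synthesis; generative adversarial network; deep learning; computer vision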

Publication year:

2024

DOI:

10.11907/rjdk.231647

Software Guide (软件导刊)

Hubei Information Society (湖北省信息学会)

Impact factor: 0.524

ISSN: 1672-7800

Year, volume (issue): 2024, 23(10)