Research on food image generation based on diffusion models
Research on food image generation has primarily focused on synthesizing meal images from a given set of ingredients, a task that falls under text-to-image generation. However, owing to the visual complexity of dietary images, efforts to generate realistic food images have not yet been fully successful. Existing methods employ generative adversarial networks (GANs) conditioned on ingredient and cooking information to progressively generate high-quality samples; however, GANs may fail to cover the full data distribution, making it difficult to conditionally generate high-quality images. Diffusion models, a class of likelihood-based models, have recently been shown to generate high-quality images while offering desirable properties such as distribution coverage, a fixed training objective, and ease of scaling. This paper explores cross-modal information association and guidance in diffusion models to generate high-quality food images conditioned on category information. Results on the Recipe1M dataset demonstrate a significant improvement in model performance over baseline methods.
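The abstract does not specify the guidance mechanism in detail; the sketch below is a minimal, hypothetical illustration of one common way to condition a diffusion model on category information, namely classifier-free guidance, where a single network is queried with and without the class label and the two noise predictions are combined. All names (ConditionalDenoiser, cfg_sample), the toy MLP denoiser, and the hyperparameters are illustrative assumptions, not the paper's architecture.

import torch
import torch.nn as nn

NUM_CLASSES = 10          # toy value; a real model would use the dataset's category count
NULL_CLASS = NUM_CLASSES  # extra embedding slot reserved for the unconditional ("null") label
IMG_DIM = 32 * 32 * 3     # flattened toy image; a real model would use a U-Net on 2-D images

class ConditionalDenoiser(nn.Module):
    # Toy noise-predictor conditioned on a category embedding. Stands in for a
    # category-guided diffusion model; the MLP backbone is for illustration only.
    def __init__(self, emb_dim: int = 128):
        super().__init__()
        self.cls_emb = nn.Embedding(NUM_CLASSES + 1, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM + emb_dim + 1, 512), nn.SiLU(),
            nn.Linear(512, IMG_DIM),
        )

    def forward(self, x, t, y):
        # Concatenate noisy image, class embedding, and timestep, then predict the noise.
        h = torch.cat([x, self.cls_emb(y), t[:, None].float()], dim=-1)
        return self.net(h)

@torch.no_grad()
def cfg_sample(model, y, steps=50, guidance=3.0):
    # DDPM-style ancestral sampling with classifier-free guidance: the guided noise
    # estimate interpolates past the unconditional prediction toward the conditional one.
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    abar = torch.cumprod(alphas, dim=0)
    x = torch.randn(y.shape[0], IMG_DIM)
    for t in reversed(range(steps)):
        tt = torch.full((y.shape[0],), t)
        eps_c = model(x, tt, y)                               # conditional prediction
        eps_u = model(x, tt, torch.full_like(y, NULL_CLASS))  # unconditional prediction
        eps = eps_u + guidance * (eps_c - eps_u)              # guided noise estimate
        mean = (x - betas[t] / (1 - abar[t]).sqrt() * eps) / alphas[t].sqrt()
        x = mean + betas[t].sqrt() * torch.randn_like(x) if t > 0 else mean
    return x

# Example: draw two samples conditioned on (hypothetical) category labels 3 and 7.
model = ConditionalDenoiser()
samples = cfg_sample(model, torch.tensor([3, 7]))

In such a scheme the null label is randomly substituted for the true label during training, so a single model learns both the conditional and unconditional score; at sampling time the guidance weight trades off sample fidelity to the category against diversity.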