A Survey of Multimodal Controllable Diffusion Models

Diffusion models have recently emerged as powerful generative models, producing high-fidelity samples across domains. Despite this, they face two key challenges: speeding up the time-consuming iterative generation process, and controlling and steering that process. Existing surveys provide broad overviews of diffusion model advancements. However, they lack comprehensive coverage centered specifically on techniques for controllable generation. This survey seeks to address this gap by providing a comprehensive and coherent review of controllable generation in diffusion models. We provide a detailed taxonomy defining controlled generation for diffusion models. Controllable generation is categorized based on formulation, methodologies, and evaluation metrics. By enumerating the range of methods researchers have developed for enhanced control, we aim to establish controllable diffusion generation as a distinct subfield warranting dedicated focus. With this survey, we contextualize recent results, provide a dedicated treatment of controllable diffusion model generation, and outline limitations and future directions. To demonstrate applicability, we highlight controllable diffusion techniques for major computer vision tasks. By consolidating methods and applications for controllable diffusion models, we hope to catalyze further innovations in reliable and scalable controllable generation.

diffusion model; controllable generation; application; personalization

江锐、郑光聪、李藤、杨天瑞、王井东、李玺


College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, China

Department of Mathematics, Nanjing University, Nanjing 210023, China

Baidu Visual Technology Department, Baidu Inc., Beijing 100085, China

National Science Foundation for Distinguished Young Scholars of China; National Natural Science Foundation of China; Zhejiang Provincial Natural Science Foundation of China; Ng Teng Fong Charitable Foundation in the form of ZJU-SUTD IDEA

62225605; U20A20222; LD24F020016; 188170-11102

2024

Journal of Computer Science and Technology (English edition)
China Computer Federation

CSTPCD
Impact factor: 0.432
ISSN: 1000-9000
Year, volume (issue): 2024, 39(3)