Prompt learning in computer vision: a survey

Prompt learning has attracted broad attention in computer vision since the rise of large pre-trained vision-language models (VLMs). Because VLMs establish a close relationship between visual and language information, prompt learning has become a crucial technique in many important applications, such as artificial intelligence generated content (AIGC). In this survey, we provide a progressive and comprehensive review of visual prompt learning as it relates to AIGC. We begin by introducing VLMs, the foundation of visual prompt learning. We then review visual prompt learning methods and prompt-guided generative models, and discuss how to improve the efficiency of adapting AIGC models to specific downstream tasks. Finally, we outline some promising research directions for prompt learning.

Prompt learning; Visual prompt tuning (VPT); Image generation; Image classification; Artificial intelligence generated content (AIGC)

Yiming Lei, Jingqi Li, Zilong Li, Yuan Cao, Hongming Shan

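Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science and Technology, Fudan University, Shanghai 200438, China

Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China

Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China

Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai 201210, China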

National Natural Science Foundation of China (62306075, 62101136); China Postdoctoral Science Foundation (2022TQ0069); Natural Science Foundation of Shanghai, China (21ZR1403600); Shanghai Municipal of Science and Technology Project, China (20JC1419500); Shanghai Center for Brain Science and Brain-Inspired Technology, China

2024

Frontiers of Information Technology & Electronic Engineering
Zhejiang University

CSTPCD
Impact factor: 0.371
ISSN: 2095-9184
Year, volume (issue): 2024, 25(1)