
Review of Visual Representation Learning

Representation learning is an important component of artificial intelligence algorithms: a well-designed representation can substantially ease downstream tasks. With the development of deep learning in computer vision, visual representation learning has become increasingly important; it aims to transform complex visual information into representations that are easier for artificial intelligence algorithms to learn. This paper surveys widely used approaches to visual representation learning and, according to the degree and type of data dependency, divides them into pre-trained visual representation learning, generative visual representation learning, contrastive visual representation learning, decoupled visual representation learning, and visual representation learning combined with language information. Specifically, pre-trained visual representation learning applies supervised pre-trained models to visual representation learning; generative visual representation learning uses generative models to learn visual representations; and contrastive visual representation learning covers the network frameworks that use contrastive learning to learn visual representations. The paper also presents applications of variational autoencoders (VAE) and generative adversarial networks (GAN) in decoupled visual representation learning, as well as methods that use language information to enhance visual representation learning. Finally, it summarizes the evaluation criteria for visual representation learning and future prospects.

Keywords: Visual representation learning; Artificial intelligence algorithm; Decoupled visual representation learning; Language information

Wang Shuaiwei, Lei Jie, Feng Zunlei, Liang Ronghua


College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023

College of Computer Science and Technology, Zhejiang University, Hangzhou 310027


Funding: National Natural Science Foundation of China (62106226, 62036009); Zhejiang Provincial Natural Science Foundation (LQ22F020013, LDT23F0202)

2024

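Computer Science
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)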

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(11)