A Review of Research Methods for Cross-Modal Retrieval

Cross-modal retrieval is a key area of multimodal learning. Its main goal is to uncover the semantic relationships between different modalities so that semantically similar samples can be retrieved across them. With the development of deep neural networks, cross-modal retrieval has attracted the attention of many researchers, but because the query and the retrieved result belong to different modalities, comparing them consistently remains difficult. This paper first introduces the basic concepts of cross-modal retrieval and summarizes the commonly used methods, including those based on real-valued representations and on binary representations. It then focuses on the application of deep learning models to cross-modal retrieval and on the main datasets and evaluation metrics. Finally, it outlines future research directions and the main outstanding difficulties and challenges in the field, with the aim of providing a reference for researchers working on cross-modal retrieval.

cross-modal retrieval; deep learning; real-valued representation; binary representation

Hou Jiarun, Shi Shuicai, Wang Hongjun

Computer School, Beijing Information Science and Technology University, Beijing 100101, China

TRS Information Technology Co., Ltd., Beijing 100096, China

Software Guide (软件导刊), 2024, 23(5)
Hubei Provincial Information Society (湖北省信息学会)

Impact factor: 0.524
ISSN: 1672-7800