Research on Micro-video Multi-Label Classification Based on Deep Multimodal Association Learning

[Objective] This paper exploits the complementarity of modalities and strengthens the correlations both among modalities and between modalities and labels, with the aim of highly accurate classification. [Methods] We propose a multi-label classification algorithm for micro-videos based on multimodal semantic enhancement and graph convolutional networks, which uses the multimodal information in micro-videos for the multi-label classification task. [Results] Extensive experiments verify the effectiveness of the proposed algorithm: its classification accuracy reaches 87.15%, which is 6.82 percentage points higher than the best baseline algorithm. [Limitations] The information enhanced by modality fusion contains redundancy, which obscures the correlations between modalities; in addition, research on multimodal multi-label classification remains limited. [Conclusions] The proposed algorithm improves the complementarity among modalities, strengthens the correlation between modalities and categories, and improves classification accuracy.

Multimodal Fusion; Semantic Enhancement; Graph Convolutional Network; Micro-video

Li Yun (李云), Lu Zhixiang (卢志翔), Liu Shuyi (刘姝伊), Wang Su (王粟), Lü Zimin (吕梓民), Jing Peiguang (井佩光)

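School of Big Data and Artificial Intelligence, Guangxi University of Finance and Economics, Nanning 530003

School of Artificial Intelligence and Software, Nanning University, Nanning 530200

School of Electrical and Information Engineering, Tianjin University, Tianjin 300072

School of Electronic Information, Guangxi Minzu University, Nanning 530006

School of Computer, Electronics and Information, Guangxi University, Nanning 530004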

National Natural Science Foundation of China (61861014, 61802277, 62361002)

2024

Data Analysis and Knowledge Discovery
National Science Library, Chinese Academy of Sciences

CSTPCD, CSSCI, CHSSCD, Peking University Core (北大核心), EI
Impact factor: 1.452
ISSN:2096-3467
Year, volume (issue): 2024, 8(7)