
Multimodal sentiment analysis based on attention mechanism and contrastive learning

To address the inadequate fusion of information across modalities and the limited modeling of temporal dependencies in existing multimodal sentiment analysis models, a model combining cross-modal attention, global self-attention, and contrastive learning was proposed to deepen sentiment understanding. Specifically, features were first extracted independently from the audio, text, and image modalities and mapped into a unified vector space. Inter-modal data were then modeled and fused effectively using both cross-attention and global attention mechanisms. Meanwhile, contrastive learning tasks based on data, labels, and temporal order were introduced to deepen the model's understanding of the differences among multimodal features. Experimental results on the two public datasets CMU-MOSI and CMU-MOSEI show that, compared with the modality-invariant and -specific representations (MISA) model, the proposed model improves binary classification accuracy by 1.2 and 1.6 percentage points and F1 score by 1.0 and 1.6 percentage points, respectively.
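
The fusion stage the abstract describes, namely per-modality projection into a shared space, cross-modal attention, and global self-attention, can be illustrated with a minimal PyTorch sketch. This is a hypothetical reconstruction, not the paper's implementation: the class name, feature dimensions, head count, and mean pooling are all assumptions.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Hypothetical sketch of the fusion stage described in the abstract."""

    def __init__(self, d_text=768, d_audio=74, d_image=35, d_model=128, n_heads=4):
        super().__init__()
        # Map each modality into the same d_model-dimensional vector space.
        self.proj_t = nn.Linear(d_text, d_model)
        self.proj_a = nn.Linear(d_audio, d_model)
        self.proj_v = nn.Linear(d_image, d_model)
        # Cross-modal attention: text queries attend over audio / image keys.
        self.attn_ta = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.attn_tv = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Global self-attention over the concatenated multimodal sequence.
        self.global_attn = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)

    def forward(self, text, audio, image):
        # text: (B, L_t, d_text), audio: (B, L_a, d_audio), image: (B, L_v, d_image)
        t, a, v = self.proj_t(text), self.proj_a(audio), self.proj_v(image)
        ta, _ = self.attn_ta(t, a, a)          # text enriched with audio cues
        tv, _ = self.attn_tv(t, v, v)          # text enriched with visual cues
        fused = torch.cat([t, ta, tv], dim=1)  # (B, 3*L_t, d_model)
        fused = self.global_attn(fused)        # global dependencies across modalities
        return fused.mean(dim=1)               # pooled utterance-level representation
```

A sentiment regression or classification head would consume the pooled vector. The default input dimensions loosely mirror feature extractors commonly used with CMU-MOSI/CMU-MOSEI (e.g., BERT for text, COVAREP for audio) but are placeholders rather than the paper's settings.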
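Of the three contrastive tasks, the label-based one is the most straightforward to make concrete. The sketch below implements a supervised, SupCon-style contrastive loss under the assumption that samples sharing a sentiment label form positive pairs; the temperature and the positive-pair definition are assumptions, and the data- and temporal-order-based tasks would reuse the same structure with different positive-pair definitions.

```python
import torch
import torch.nn.functional as F

def label_contrastive_loss(z, labels, temperature=0.1):
    """Hypothetical label-based contrastive loss.

    z: (B, d) fused multimodal representations; labels: (B,) sentiment labels.
    """
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature                      # pairwise cosine similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float('-inf'))          # exclude self-pairs
    # Log-probability of each pair against all other pairs of the same anchor.
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Positives: distinct samples that share a sentiment label.
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    pos_counts = pos.sum(dim=1)
    valid = pos_counts > 0                             # anchors with >= 1 positive
    # Average negative log-likelihood over each anchor's positives.
    loss = -log_prob.masked_fill(~pos, 0.0).sum(dim=1)[valid] / pos_counts[valid]
    return loss.mean()
```

In practice the labels here could be, for example, the sign of the continuous sentiment score, so each batch should contain at least two samples per polarity for the loss to be defined.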

Keywords: multimodal sentiment analysis; attention mechanism; contrastive learning

Fang Xudong (方旭东), Wang Xingfen (王兴芬)


School of Computer Science, Beijing Information Science and Technology University, Beijing 102206, China

School of Information Management, Beijing Information Science and Technology University, Beijing 102206, China


Journal of Beijing Information Science and Technology University (Natural Science Edition)
Publisher: Beijing Information Science and Technology University
Impact factor: 0.363
ISSN: 1674-6864
Year, volume (issue): 2024, 39(4)