Multimodal Sentiment Analysis Method Based on a Cross-Modal Emotion Joint Enhancement Network
Wang Zhi 1, Zhang Jue 2
Author information
- 1. Xi'an Aeronautical Polytechnic Institute, Xi'an 710089, Shaanxi, China
- 2. School of Information Engineering, Yulin University, Yulin 719000, Shaanxi, China
Abstract
Multimodal sentiment analysis is one of the important research directions in the field of artificial intelligence; it aims to judge user emotion from multimodal data. Most existing methods ignore the heterogeneity between different modalities, which biases the sentiment analysis results. To address this problem, this paper proposes a multimodal sentiment analysis method based on a cross-modal emotion joint enhancement network. First, three pretrained deep neural network models are used to extract the semantic features of the different modalities, and a bidirectional long short-term memory (BiLSTM) network mines the unimodal contextual temporal information. Second, a cross-modal emotion joint enhancement module is designed: it fuses text and visual modal features to generate emotion-polarity semantic features, fuses text and audio modal information to generate emotion-intensity semantic features, and then jointly enhances the emotional semantics with the polarity serving as the direction and the intensity as the magnitude. Experimental results on two public benchmark datasets, CMU-MOSI and CMU-MOSEI, show that the proposed cross-modal emotion joint enhancement network achieves better performance than related methods.
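The joint-enhancement idea in the abstract (polarity from text+visual features as the direction, intensity from text+audio features as the magnitude) can be illustrated with a minimal sketch. This is not the paper's implementation: the weighted-averaging fusion and the function names below are illustrative assumptions standing in for the actual cross-modal module.

```python
import math

def fuse(score_a: float, score_b: float, w: float = 0.5) -> float:
    """Toy cross-modal fusion by weighted averaging of two per-modality
    scores (an illustrative stand-in for the paper's fusion module)."""
    return w * score_a + (1.0 - w) * score_b

def joint_enhance(polarity: float, intensity: float) -> float:
    """Sketch of the joint enhancement step: polarity in [-1, 1]
    supplies the sign (direction) of the final sentiment score,
    while the non-negative intensity scales its magnitude."""
    direction = math.copysign(1.0, polarity)
    magnitude = abs(polarity) * intensity
    return direction * magnitude

# Example: mildly positive polarity from text+visual,
# moderate intensity from text+audio.
polarity = fuse(0.6, 0.4)                    # text vs. visual score
intensity = fuse(2.0, 3.0)                   # text vs. audio score
score = joint_enhance(polarity, intensity)   # signed, scaled sentiment
```

A negative polarity flips the sign of the result while the intensity still controls its size, which matches the "polarity as direction, intensity as magnitude" description.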
Key words
Cross-modal / Multimodal sentiment analysis / Semantic features / Feature fusion
Publication year
2024