Research and application of multimodality in emotion recognition

To address noise caused by typos, grammatical errors, and Internet slang, this paper studies emotion recognition based on multimodal fusion and proposes an emotion recognition network model built on modal fusion. First, features are extracted from three modalities so that the multimodal data share a unified format and are aligned. Second, to mine the relationships among the modalities, the text, audio, and video features are fused, and the complementary information in the fused features is used to counter the noise. On this basis, an attention mechanism and a bidirectional recurrent neural network capture the fused features and the contextual information across utterances with different emotions, yielding a richer fused feature representation. Finally, a downstream task module is built on this representation to improve emotion recognition performance. Experiments with the proposed model on three datasets show that multimodal input outperforms single-modality input and that the fusion-based emotion recognition network achieves good recognition performance. The conclusions can guide the process of emotion recognition in utterances.
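The pipeline described in the abstract (align, fuse, contextualize, classify) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the fusion operator, the attention form, or any dimensions, so the concatenation fusion, the scaled dot-product self-attention, the feature sizes, the three-class output, and all names below (`GRUCell`, `bigru`, `self_attention`, `classify`) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class GRUCell:
    """Single-direction GRU with random (untrained) weights."""
    def __init__(self, d_in, d_hid):
        s = 1.0 / np.sqrt(d_hid)
        self.W = rng.uniform(-s, s, (3, d_in, d_hid))   # input weights: update, reset, candidate
        self.U = rng.uniform(-s, s, (3, d_hid, d_hid))  # recurrent weights
        self.b = np.zeros((3, d_hid))
        self.d_hid = d_hid

    def run(self, xs):
        """Process a sequence xs of shape (T, d_in); return states of shape (T, d_hid)."""
        h = np.zeros(self.d_hid)
        states = []
        for x in xs:
            z = sigmoid(x @ self.W[0] + h @ self.U[0] + self.b[0])        # update gate
            r = sigmoid(x @ self.W[1] + h @ self.U[1] + self.b[1])        # reset gate
            c = np.tanh(x @ self.W[2] + (r * h) @ self.U[2] + self.b[2])  # candidate state
            h = (1 - z) * h + z * c
            states.append(h)
        return np.stack(states)

def bigru(fwd, bwd, xs):
    """Concatenate forward states with time-realigned backward states."""
    return np.concatenate([fwd.run(xs), bwd.run(xs[::-1])[::-1]], axis=-1)

def self_attention(H):
    """Scaled dot-product self-attention over the utterance sequence (assumed form)."""
    scores = H @ H.T / np.sqrt(H.shape[-1])
    return softmax(scores, axis=-1) @ H

def classify(text, audio, video, fwd, bwd, W_out, b_out):
    # Assumed fusion: concatenate the aligned per-utterance features.
    fused = np.concatenate([text, audio, video], axis=-1)
    context = bigru(fwd, bwd, fused)       # context across utterances
    attended = self_attention(context)     # richer fused representation
    return softmax(attended @ W_out + b_out)  # fully connected layer + softmax

# Hypothetical sizes: 5 aligned utterances, 3 emotion classes.
T, d_text, d_audio, d_video, d_hid, n_classes = 5, 8, 4, 6, 16, 3
text = rng.normal(size=(T, d_text))
audio = rng.normal(size=(T, d_audio))
video = rng.normal(size=(T, d_video))

d_in = d_text + d_audio + d_video
fwd, bwd = GRUCell(d_in, d_hid), GRUCell(d_in, d_hid)
W_out = rng.normal(scale=0.1, size=(2 * d_hid, n_classes))
b_out = np.zeros(n_classes)

probs = classify(text, audio, video, fwd, bwd, W_out, b_out)
print(probs.shape)  # one probability row per utterance
```

Each row of `probs` is a distribution over the assumed emotion classes for one utterance. Concatenation is the simplest fusion choice; a trained model might weight or gate the modalities instead.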

Keywords: deep learning; emotion recognition; multimodal; multimodal fusion; recurrent neural network; bi-directional gated recurrent unit; fully connected neural network; attention mechanism

Wen Peiyu (文培煜), Nie Guohao (聂国豪), Wang Xingmei (王兴梅), Wu Peiran (吴沛然)

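College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China

National Key Laboratory of Underwater Acoustic Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China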

Key Laboratory Open Fund Project

KY10600220048

2024

Applied Science and Technology (应用科技)
Harbin Engineering University

CSTPCD
Impact factor: 0.693
ISSN: 1009-671X
Year, volume (issue): 2024, 51(1)