Research and application of multimodality in emotion recognition
To address noise interference such as typos, grammatical errors, and the special vocabulary of Internet culture, this paper studies emotion recognition based on multi-modal fusion and proposes an emotion recognition network model built on modal fusion. First, features are extracted from the three modalities, and the formats of the multi-modal data are unified and aligned. Then, to mine the relationships between modalities, the text, audio, and video features are fused, so that the complementary information carried by the fused features mitigates the noise interference. On this basis, an attention mechanism and a bidirectional recurrent neural network are used to more fully capture the fused features and the contextual information across different emotional utterances, yielding a richer fused feature representation. Finally, a downstream task module is built that uses this rich fused representation to improve the performance of the downstream emotion recognition task. Experiments with the proposed network model were carried out on three datasets. The results show that the multi-modal model outperforms its single-modal counterparts, and that the emotion recognition network based on modal fusion achieves better recognition performance.
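The pipeline the abstract describes (align the three modalities to a common format, fuse them, then apply attention over utterance context) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the feature dimensions, the random projections used for alignment, and the dot-product attention are all assumptions, and the bidirectional recurrent layer is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-utterance features for the three modalities
# (dimensions are illustrative, not taken from the paper).
T = 5                                    # utterances in one dialogue
text  = rng.standard_normal((T, 100))    # text features
audio = rng.standard_normal((T, 74))     # audio features
video = rng.standard_normal((T, 35))     # video features

def project(x, d=64, seed=1):
    """Align a modality to a common feature dimension d
    via a fixed random linear map (stand-in for a learned layer)."""
    w_rng = np.random.default_rng(seed)
    W = w_rng.standard_normal((x.shape[1], d)) / np.sqrt(x.shape[1])
    return np.tanh(x @ W)

# Step 1: unify/align formats, then fuse by concatenation.
fused = np.concatenate([project(text,  seed=1),
                        project(audio, seed=2),
                        project(video, seed=3)], axis=1)   # (T, 192)

# Step 2: scaled dot-product self-attention over the dialogue,
# letting each utterance attend to its context.
scores  = fused @ fused.T / np.sqrt(fused.shape[1])        # (T, T)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)              # softmax rows
context = weights @ fused                                  # (T, 192)

print(context.shape)  # (5, 192): context-enriched fused representation
```

In a full model, `context` would feed the downstream classification head, and the random projections and attention would be replaced by trained layers alongside the bidirectional recurrent network.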