
Multimodal Optimization Method Based on Lightweight DialogueRNN

Emotion recognition in conversation (ERC) is a highly active research area in natural language processing, and utterance-level emotion classification is widely used in human-computer interaction. Most existing work focuses on modeling speaker and context information and exploits multimodal information through simple feature concatenation, which ignores the dependencies between modalities. To address this problem, this paper uses an attention-based network to fuse multimodal features dynamically, giving more weight to informative, high-quality modalities, and proposes MMRNN (multimodal RNN), a multimodal optimization method based on a lightweight DialogueRNN. First, attention scores are introduced during multimodal fusion so that the model focuses on the more important modalities. Second, the emotion GRU in DialogueRNN is removed. Finally, the model is stacked to increase its depth, and an attention mechanism is applied to the output of each layer to obtain the final emotion output. Simulation experiments on two public datasets show that the proposed method achieves good performance.
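
The attention-based fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, so this is a generic softmax-attention fusion and not the authors' exact formulation. The names `attention_fuse` and `query`, the tanh scoring rule, and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse same-length modality vectors with softmax attention weights.

    features: dict mapping modality name -> feature vector (list of floats)
    query:    hypothetical scoring vector (it would be learned in a real model)
    """
    names = list(features)
    # Score each modality: dot product of the query with tanh(feature).
    scores = [
        sum(q * math.tanh(x) for q, x in zip(query, features[name]))
        for name in names
    ]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the modality vectors replaces plain concatenation.
    fused = [
        sum(w * features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy utterance features; the values are made up for illustration.
utterance = {
    "text": [0.9, 0.1, 0.4],
    "audio": [0.2, 0.3, 0.1],
    "visual": [0.0, 0.5, 0.2],
}
fused, weights = attention_fuse(utterance, query=[1.0, 0.5, -0.5])
```

Unlike concatenation, the fused vector keeps the dimension of a single modality, and the weights show which modality dominated for a given utterance. The same weighted-sum pattern could plausibly be applied to the outputs of stacked layers, though the abstract does not specify how that step is computed.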

Keywords: multimodal fusion; emotion recognition in conversation; attention mechanism; scene modeling; model stacking

LI Chen, LIANG Ping, GU Jinguang (李晨、梁平、顾进广)


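School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, Hubei, China

Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan 430065, Hubei, China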


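Funding: Major Program of the National Social Science Fund of China (11&ZD189); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0108500)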

2024

Computer Technology and Development (计算机技术与发展)
Shaanxi Computer Society (陕西省计算机学会)


CSTPCD
Impact factor: 0.621
ISSN: 1673-629X
Year, volume (issue): 2024, 34(8)