Multimodal conversation emotion recognition based on clustering and group normalization
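Luo Qi 1, Gou Gang 1
Author information
- 1. State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, Guizhou, China; College of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou, China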
Abstract
Confusion between similar emotion categories, which degrades recognition performance, has long been a major challenge in multimodal emotion recognition. To address this problem, a relational graph neural network model based on clustering and group normalization is proposed. First, three different feature extractors are used to extract features from three modalities; speaker encodings are incorporated and the features are concatenated, which enriches the feature representation while preserving the original information. Second, contextual information is extracted with a Transformer. Finally, after the feature nodes are fed into a relational graph convolutional network, the nodes are clustered into groups and each group is normalized independently, which makes similar nodes more similar and alleviates the confusion between similar emotions. Experiments show that the proposed model reaches an F1-score of 86.34% on four-class classification on the IEMOCAP dataset, which verifies the effectiveness of the method. At the time of writing, the model achieves the best reported performance on this dataset.
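The final step of the abstract (cluster the nodes, then normalize each group independently) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the abstract does not name the clustering algorithm or give the normalization formula, so plain k-means and per-feature standardization within each cluster are assumed here, and the function names `kmeans` and `cluster_group_norm`, the number of clusters, and the random input are all hypothetical. Learnable scale and shift parameters, and any connection back to the graph network, are omitted.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns one cluster index per row."""
    rng = np.random.default_rng(seed)
    # Pick k distinct rows as initial centers (fancy indexing returns a copy).
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every node to every center: shape (n, k).
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def cluster_group_norm(x, labels, eps=1e-5):
    """Standardize node features separately inside each cluster."""
    out = np.empty_like(x, dtype=float)
    for j in np.unique(labels):
        mask = labels == j
        group = x[mask]
        mean = group.mean(axis=0, keepdims=True)
        std = group.std(axis=0, keepdims=True)
        out[mask] = (group - mean) / (std + eps)
    return out


# Hypothetical node features: 12 nodes, 4 dimensions, two loose groups.
rng = np.random.default_rng(0)
nodes = np.vstack([rng.normal(0, 1, (6, 4)), rng.normal(5, 1, (6, 4))])
labels = kmeans(nodes, k=2)
normed = cluster_group_norm(nodes, labels)
```

Each cluster is normalized with its own statistics, so the statistics of one group never influence another. How the normalized features are combined with the original node features is not described in the abstract and is left out of this sketch.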
Key words
graph neural network/feature fusion/group normalization/clustering/conversation emotion recognition
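Funding
National Natural Science Foundation of China (62162010)
Guizhou Provincial Science and Technology Support Program (Qiankehe Support [2022] General 267)
Year of publication
2024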