Multimodal conversation emotion recognition based on clustering and group normalization

Abstract: Confusion between similar emotion categories degrades recognition performance and has long been a major challenge in multimodal emotion recognition. To address this problem, a relational graph neural network model based on clustering and group normalization is proposed. First, three different feature extractors are used to extract features for three modalities; speaker encodings are incorporated and the features are concatenated, which enriches the feature representation while preserving the original information. Second, a Transformer extracts contextual information. Finally, the feature nodes are fed into a relational graph convolutional network, after which the nodes are clustered into groups and each group is normalized independently, making similar nodes more similar and alleviating the confusion between similar emotions. In experiments, the model reaches an F1-score of 86.34% on four-class classification on the IEMOCAP dataset, which verifies the effectiveness of the method; the authors report that this is currently the best performance on that dataset.
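The final step of the abstract (cluster the nodes, then normalize each group independently) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the exact normalization formula, so this example assumes plain k-means and per-cluster standardization of each feature; the function names `kmeans` and `cluster_group_norm`, the toy `nodes` data, and the parameters `k`, `iters`, and `eps` are all hypothetical and are not taken from the paper.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means (an assumption; the abstract does not name the algorithm).

    Centroids start at the first k points so the result is deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each node to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_group_norm(points, labels, eps=1e-5):
    """Standardize every feature using only the statistics of the node's own cluster."""
    out = [None] * len(points)
    for c in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        cols = list(zip(*(points[i] for i in idx)))
        means = [sum(col) / len(idx) for col in cols]
        variances = [
            sum((x - m) ** 2 for x in col) / len(idx)
            for col, m in zip(cols, means)
        ]
        for i in idx:
            out[i] = [
                (x - m) / math.sqrt(v + eps)
                for x, m, v in zip(points[i], means, variances)
            ]
    return out


# Toy node features: two visibly separated groups of three nodes each.
nodes = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
         [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels = kmeans(nodes, k=2)
normed = cluster_group_norm(nodes, labels)
print(labels)
```

In this sketch the statistics are computed per cluster rather than over all nodes, so each group is centered and scaled on its own. A learnable scale and shift, as commonly used in normalization layers, is omitted here for brevity.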

Keywords:

graph neural network; feature fusion; group normalization; clustering; conversation emotion recognition

Authors:

Luo Qi (罗奇), Gou Gang (苟刚)

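Author affiliations:

State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, Guizhou

College of Computer Science and Technology, Guizhou University, Guiyang 550025, Guizhou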

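Funding:

National Natural Science Foundation of China; Guizhou Provincial Science and Technology Support Program

Grant numbers:

62162010; 黔科合支撑[2022]一般267 (Qiankehe Support [2022] General 267)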

Publication year:

2024

DOI:

10.6040/j.issn.1671-9352.1.2023.055

Journal of Shandong University (Natural Science)

Shandong University

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.437

ISSN: 1671-9352

Year, volume (issue): 2024, 59(7)

Number of references: 22