Journal of Xinjiang University (Natural Science Edition in Chinese and English), 2024, Vol. 41, Issue 1: 52-58. DOI: 10.13568/j.cnki.651094.651316.2023.07.05.0002

Speech Emotion Recognition with Complementary Feature Learning Framework and Attentional Feature Fusion Module

Huang Peiyao (黄佩瑶) 1, Cheng Huihui (程慧慧) 2, Tang Xiaoyu (唐小煜) 3

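Author information

  • 1. School of Electronics and Information Engineering, Faculty of Engineering, South China Normal University, Foshan, Guangdong 528225
  • 2. School of Physics, South China Normal University, Guangzhou, Guangdong 510006
  • 3. School of Electronics and Information Engineering, Faculty of Engineering, South China Normal University, Foshan, Guangdong 528225; School of Physics, South China Normal University, Guangzhou, Guangdong 510006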

Abstract

To address the problem that deep learning feature extraction methods neither comprehensively extract the emotional features in speech nor fuse them effectively, this paper proposes a speech emotion recognition model that integrates a complementary feature learning framework and an attention feature fusion module. The complementary feature learning framework consists of two independent representation extraction branches and one interactive complementary representation extraction branch, covering both the independent and the complementary representations of emotional features. To further improve model performance, an attention feature fusion module is introduced; it assigns weights to the different representations according to their contribution to emotion classification, so that the model attends most to the features that help emotion recognition most. Simulation experiments on two public emotion databases (Emo-DB and IEMOCAP) validate the robustness and effectiveness of the proposed model.
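The fusion idea described in the abstract, weighting several representations by their contribution and combining them, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the weights are computed, so the dot-product scoring, the softmax normalization, the function names `softmax` and `attention_fuse`, the `scoring_vector` parameter, and the toy values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(representations, scoring_vector):
    """Fuse equal-length representations with softmax attention weights.

    Assumption: each representation gets a scalar score from a dot
    product with a (hypothetical, normally learned) scoring_vector.
    Scores are normalized to weights that sum to 1, and the fused
    vector is the weighted sum of the representations.
    """
    scores = [sum(x * w for x, w in zip(rep, scoring_vector))
              for rep in representations]
    weights = softmax(scores)
    dim = len(scoring_vector)
    fused = [sum(wt * rep[i] for wt, rep in zip(weights, representations))
             for i in range(dim)]
    return fused, weights


# Toy stand-ins for the outputs of the two independent branches and the
# interactive complementary branch (values are made up).
branch_a = [1.0, 0.0, 0.0]
branch_b = [0.0, 1.0, 0.0]
interactive = [0.5, 0.5, 1.0]
scoring_vector = [0.2, 0.1, 1.0]

fused, weights = attention_fuse([branch_a, branch_b, interactive],
                                scoring_vector)
```

In this toy setup the interactive representation scores highest against `scoring_vector`, so it receives the largest weight and dominates the fused vector. In a trained model the scoring parameters would be learned jointly with the classifier rather than fixed by hand.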

Key words

speech emotion recognition / deep neural networks / emotional feature representation / feature extractor / feature fusion / attention mechanism / artificial intelligence

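Funding

National Natural Science Foundation of China (62001173)

Guangdong Province Special Project for Cultivating College Students' Scientific and Technological Innovation Talent (pdjh2022a0131)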

Publication year

2024
Journal of Xinjiang University (Natural Science Edition in Chinese and English)
Xinjiang University

CSTPCD
Impact factor: 0.13
ISSN: 2096-7675
Number of references: 25