
Dual-feature speech emotion recognition fusion algorithm based on wavelet scattering transform and MFCC

To fully exploit the emotional information contained in the spectrum of speech signals and improve the accuracy of speech emotion recognition, a fusion algorithm based on the wavelet scattering transform and Mel-frequency cepstral coefficients (MFCC) is proposed, named permutation entropy weighted and bias adjustment rule fusion (PEW-BAR). The algorithm first extracts wavelet scattering features and MFCC-related features from the speech signal. The wavelet scattering features are then expanded along the scale dimension, a support vector machine is used to obtain posterior probabilities for emotion recognition, and the permutation entropy is computed and used to weight those posterior probabilities. Finally, a bias adjustment rule further fuses in the recognition results obtained from the MFCC-related features. Experiments on the EMODB, RAVDESS, and eNTERFACE05 datasets show that, compared with the traditional speech emotion recognition method based on wavelet scattering coefficients, the algorithm improves accuracy (ACC) by 2.82%, 2.85%, and 5.92%, and unweighted average recall (UAR) by 3.40%, 2.87%, and 5.80%, respectively. On the IEMOCAP dataset it achieves an improvement of 6.89%.
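Two quantities named in the abstract have standard textbook definitions that a short sketch can make concrete: permutation entropy (in the Bandt-Pompe sense) and the ACC/UAR evaluation metrics. The Python below is an illustration only, not the authors' implementation. The function names, the `order` and `delay` parameters, and the toy inputs are all assumptions, and the abstract does not say which sequence the entropy is computed over or how it is turned into weights, so that step is deliberately left out.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Bandt-Pompe permutation entropy of a 1-D sequence (hypothetical helper)."""
    n_windows = len(series) - (order - 1) * delay
    if order < 2 or n_windows < 1:
        raise ValueError("series too short for the requested order/delay")
    # Map every window to its ordinal pattern (the argsort of its values).
    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    # Shannon entropy of the pattern distribution, in bits.
    entropy = 0.0
    for count in patterns.values():
        p = count / n_windows
        entropy -= p * math.log2(p)
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy


def accuracy(y_true, y_pred):
    """Overall accuracy (ACC): fraction of correctly labelled samples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """UAR: mean of per-class recalls, so every class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        total = sum(t == label for t in y_true)
        hits = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)
```

On a toy imbalanced case such as `y_true = ["a", "a", "a", "b"]` with every sample predicted as `"a"`, `accuracy` returns 0.75 while `unweighted_average_recall` returns 0.5. This gap is presumably why emotion recognition results are commonly reported with both metrics.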

Keywords: speech emotion recognition; wavelet scattering transform; permutation entropy; MFCC; model fusion

Authors: Ying Na, Wu Shunpeng, Yang Meng, Zou Yujian (应娜、吴顺朋、杨萌、邹雨鉴)


Affiliation: School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China


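Funding: Zhejiang Provincial Natural Science Foundation (LTGS23F010001); Fundamental Research Funds for the Provincial Universities of Zhejiang (GK239909299001-406)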

2024

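Journal: Telecommunications Science (电信科学)
Sponsor and publisher: China Institute of Communications; Posts & Telecom Press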


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.902
ISSN:1000-0801
Year, volume (issue): 2024, 40(5)