Speech Emotion Recognition Based on Voice Rhythm Differences

Speech emotion recognition has promising applications in financial anti-fraud and other fields, but further gains in its accuracy are becoming increasingly hard to achieve. Existing approaches, such as those based on spectrograms, struggle to capture rhythm-difference features, which limits recognition performance. Building on the differences in speech rhythm features, this paper proposes a speech emotion recognition method based on energy frames and time-frequency fusion. The key idea is to filter the spectrum to the high-energy regions of the speech signal and to represent an individual's voice rhythm differences through the distribution of high-energy speech frames and their time-frequency variation. On this basis, an emotion recognition model built on a convolutional neural network (CNN) and a recurrent neural network (RNN) extracts and fuses the time-domain and frequency-domain variation features of the spectrum. Experiments on the public IEMOCAP dataset show that, compared with spectrogram-based methods, the proposed method improves weighted accuracy (WA) and unweighted accuracy (UA) by 1.05% and 1.9% on average, respectively. The results also indicate that individual voice rhythm differences play an important role in improving speech emotion recognition.
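The abstract does not state how high-energy frames are chosen, so the following is only a minimal sketch of the general idea: frame the signal, rank frames by short-time energy, keep the most energetic ones in time order, and compute a spectrum for each kept frame. The function names, the `keep_ratio` parameter, the frame length and hop, and the toy signal are all illustrative assumptions, not the authors' implementation.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def frame_energy(frame):
    """Short-time energy: sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def select_high_energy_frames(frames, keep_ratio=0.5):
    """Return indices, in time order, of the top `keep_ratio` fraction of frames by energy.

    `keep_ratio` is a hypothetical parameter; the paper's actual criterion is not
    given in the abstract.
    """
    n_keep = max(1, int(len(frames) * keep_ratio))
    ranked = sorted(range(len(frames)),
                    key=lambda i: frame_energy(frames[i]),
                    reverse=True)
    return sorted(ranked[:n_keep])


def magnitude_spectrum(frame):
    """Naive DFT magnitude for bins 0..N/2 (fine for short illustrative frames)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


# Toy signal (assumed): silence, a 100 Hz tone burst, silence, sampled at 800 Hz.
sr = 800
tone = [math.sin(2 * math.pi * 100 * t / sr) for t in range(160)]
signal = [0.0] * 160 + tone + [0.0] * 160

frames = frame_signal(signal, frame_len=80, hop=40)
kept = select_high_energy_frames(frames, keep_ratio=0.3)

# Spectra of the kept frames; the positions in `kept` carry the rhythm information,
# while each spectrum carries the frequency content at that position.
spectra = [magnitude_spectrum(frames[i]) for i in kept]
```

In this sketch the indices in `kept` describe where the energetic frames fall in time, and `spectra` holds their frequency content. A downstream model such as the CNN and RNN combination mentioned in the abstract would consume both kinds of information.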

Keywords: Speech emotion recognition; Energy frames; Spectrum; Time-frequency fusion; Voice rhythm difference

Authors: Zhang Jiahao (张家豪), Zhang Zhaohui (章昭辉), Yan Qi (严琦), Wang Pengwei (王鹏伟)

School of Computer Science and Technology, Donghua University, Shanghai 201620, China

Funding: Shanghai Science and Technology Innovation Action Plan, High-Tech Field Project (Grant No. 22511100700)

2024

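Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)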

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(4)