Speech emotion recognition algorithm based on MHA-ResNet
A significant challenge in speech emotion recognition lies in extracting key features from speech signals to improve recognition accuracy. Drawing on existing research, a speech emotion recognition model based on a Multi-Head-Attention Residual Network (MHA-ResNet) is proposed to raise the precision of recognizing emotions conveyed through speech. First, an emotional feature set is extracted from the preprocessed speech data. Then, discriminative emotional information in different states is obtained by exploiting the parallel processing of the multi-head attention mechanism. Finally, deep emotional features are further captured by the residual network, enabling accurate recognition of diverse emotions. To validate the efficacy of this model, experiments are conducted on the CASIA and EmoDB datasets, yielding recognition accuracies of 93.59% and 97.57%, respectively.
speech emotion recognition; multi-head attention mechanism; residual network; emotional feature set
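The core computation the abstract describes combines parallel attention heads with a residual (skip) connection. The following is a minimal numpy sketch of that idea, not the authors' implementation: the head count, feature dimensions, and random stand-in projection weights are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention_residual(x, num_heads, rng):
    """Scaled dot-product self-attention with num_heads parallel heads,
    followed by a ResNet-style residual connection.

    x: (seq_len, d_model) frame-level acoustic features.
    Random projection matrices stand in for learned weights (assumption).
    """
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        # Independent query/key/value projections per head.
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) * d_model ** -0.5
                      for _ in range(3))
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        # Attention weights over the sequence, scaled by sqrt(d_head).
        weights = softmax(q @ k.T / np.sqrt(d_head))
        heads.append(weights @ v)
    # Concatenate heads back to d_model, then add the skip connection.
    out = np.concatenate(heads, axis=-1)
    return x + out

rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 64))  # e.g. 50 frames of 64-dim features
out = multi_head_attention_residual(frames, num_heads=8, rng=rng)
print(out.shape)  # (50, 64)
```

Each head attends over the whole utterance in parallel, which is what lets the model pick out emotional cues in different states, while the residual addition preserves the original features for the deeper network.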