
Speech and Expression Multi-Modal Emotion Recognition Method Using Weighted Fusion

Current multi-modal fusion methods do not fully exploit the complementarity between the speech and facial expression modalities, which results in low recognition rates for multi-modal emotion recognition. To address this problem, a speech and expression multi-modal emotion recognition method based on weighted fusion is proposed. The method first uses a voice activity detection (VAD) algorithm to extract speech keyframes. Then, information entropy is used to model emotion generation as a continuous process and to extract expression keyframes. Next, to fully exploit the complementarity between the speech and expression modalities, a speech and expression keyframe alignment technique is used to compute speech and expression weights. These weights are fed into a feature fusion layer for weighted fusion, which effectively improves the recognition rate of multi-modal emotion recognition. Finally, experimental results on the RML, eNTERFACE05, and BAUM-1s datasets show that the proposed method achieves a higher recognition rate than other benchmark methods.
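The abstract outlines a four-step pipeline: VAD-based speech keyframe extraction, entropy-based expression keyframe extraction, keyframe alignment to derive per-modality weights, and weighted feature-level fusion. The sketch below shows one way such a pipeline could be wired together; the entropy criterion, the alignment tolerance, the count-based weighting rule, and all function names are illustrative assumptions rather than the paper's exact formulation.

```python
# A minimal, illustrative sketch of the weighted-fusion pipeline described
# in the abstract. The entropy-based selection rule, alignment tolerance,
# and weighting scheme are hypothetical stand-ins, not the paper's method.
import numpy as np

def frame_entropy(frame: np.ndarray, bins: int = 32) -> float:
    # Shannon entropy of a grayscale frame's intensity histogram, used here
    # as a stand-in for the paper's information-entropy score.
    hist, _ = np.histogram(frame, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def select_expression_keyframes(frames: np.ndarray, k: int) -> np.ndarray:
    # Keep the k highest-entropy frames, returned in temporal order.
    scores = np.array([frame_entropy(f) for f in frames])
    return np.sort(np.argsort(scores)[-k:])

def modality_weights(speech_idx, expr_idx, tol: int = 2):
    # Count keyframes of each modality that fall within `tol` frames of a
    # keyframe of the other modality, then normalise the counts into fusion
    # weights -- one simple way to turn keyframe alignment into weights.
    s_hits = sum(bool(np.any(np.abs(expr_idx - s) <= tol)) for s in speech_idx)
    e_hits = sum(bool(np.any(np.abs(speech_idx - e) <= tol)) for e in expr_idx)
    total = max(s_hits + e_hits, 1)
    return s_hits / total, e_hits / total

def weighted_fusion(speech_feat, expr_feat, w_s, w_e):
    # Scale each modality's feature vector by its weight and concatenate,
    # mimicking a weighted feature-level fusion layer.
    return np.concatenate([w_s * speech_feat, w_e * expr_feat])

# Toy usage with random stand-in data.
rng = np.random.default_rng(0)
video = rng.random((60, 48, 48))               # 60 grayscale face frames
speech_keyframes = np.array([5, 12, 30, 44])   # e.g. indices from a VAD front end
expr_keyframes = select_expression_keyframes(video, k=6)
w_s, w_e = modality_weights(speech_keyframes, expr_keyframes)
fused = weighted_fusion(rng.random(128), rng.random(128), w_s, w_e)
print(w_s, w_e, fused.shape)
```

In this sketch, the modality whose keyframes align more often with the other modality's keyframes receives the larger fusion weight, so a modality carrying little complementary information is attenuated before fusion; the paper's actual weighting rule may differ in detail.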

Keywords: Emotion recognition; Speech; Expression; Weighted fusion

Jiao Shuang, Chen Guanghui


Central Southern Electric Power Test and Research Institute, China Datang Corporation Science and Technology Research Institute Co., Ltd., Zhengzhou, Henan 450000, China

School of Information Science and Engineering, Southeast University, Nanjing, Jiangsu 210096, China


Funding: National Natural Science Foundation of China (U1504622)

Journal: Computer Simulation (计算机仿真)
Publisher: The 17th Research Institute of China Aerospace Science and Industry Corporation

Indexed in: CSTPCD
Impact factor: 0.518
ISSN: 1006-9348
Year, volume (issue): 2024, 41(7)