Cross-modal Speech Sentiment Classification Based on Knowledge Distillation
This paper proposes a new cross-modal speech sentiment classification task, which aims to leverage text-modal data on the source side to classify speech-modal data on the target side. We design a cross-modal sentiment classification model based on knowledge distillation, which distills the prior pre-training knowledge learned by a text-modal sentiment classification model (the teacher model) into a speech-modal sentiment classification model (the student model). The proposed model is distinguished by its ability to analyze raw speech data directly, without relying on speech recognition technology, which is crucial for large-scale deployment in real-world speech sentiment analysis scenarios. Experimental results show that the proposed method can effectively transfer the experience of text-modal sentiment classification to improve speech-modal sentiment classification.
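For concreteness, the sketch below illustrates a generic teacher-student distillation objective of the kind the abstract describes: a frozen text-modal teacher supplies softened sentiment distributions that a speech-modal student is trained to match, alongside the usual hard-label loss. The function name, temperature, and weighting factor are illustrative assumptions and are not taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Soft-label knowledge distillation objective: a weighted sum of the
    task cross-entropy on the student's predictions and the KL divergence
    between the teacher's and student's temperature-softened distributions.
    (Hyperparameters here are placeholders, not values from the paper.)"""
    # Hard-label task loss computed on the student's own predictions.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label loss: push the student's softened sentiment distribution
    # toward the (detached) teacher's distribution over the same classes.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits.detach() / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * ce + (1.0 - alpha) * kd
```

In such a setup, `teacher_logits` would come from the text-modal classifier applied to the transcript paired with each utterance during training, while `student_logits` come from the speech-modal encoder operating on the raw audio, so the teacher is only needed at training time and inference runs on speech alone.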