基于知识蒸馏的跨模态语音情感分类

Cross-modal Speech Sentiment Classification Based on Knowledge Distillation

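Abstract (translated from Chinese): To address the difficulty of annotating speech data for speech sentiment classification, this paper proposes a new cross-modal speech sentiment classification task, in which text-modality data (the source side) is used to help classify the sentiment of speech-modality data (the target side). On this basis, it proposes a cross-modal sentiment classification model based on knowledge distillation, which aims to distill the pre-trained prior knowledge learned by a text sentiment classification model (the teacher model) into a speech sentiment classification model (the student model). A distinguishing feature of the model is that it does not rely on costly speech recognition at test time and classifies raw speech data directly, which favors large-scale deployment in practical speech sentiment classification applications. Experimental results show that the proposed method can effectively use the experience of text-modality classification to improve classification performance on the speech modality.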

English abstract: This paper proposes a new cross-modal speech sentiment classification task, which aims to leverage text-modality data on the source side to classify speech-modality data on the target side. It designs a cross-modal sentiment classification model based on knowledge distillation, which is intended to distill the prior pre-training knowledge learned by the text-modality sentiment classification model (teacher model) into the speech-modality sentiment classification model (student model). The proposed model is distinguished by its ability to analyze the original speech data directly without relying on speech recognition technology, which is crucial to large-scale deployment in real speech sentiment analysis applications. Experimental results show that the proposed method can effectively use the experience of text-modality sentiment classification to improve the performance of speech-modality sentiment classification.
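
The abstract does not state the training objective, so the following is only a minimal sketch of generic soft-label knowledge distillation, not the authors' model. It assumes a text teacher and a speech student that each output class logits for the same utterance. The function names, the temperature of 2.0, and the mixing weight `alpha` are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic distillation objective (an assumption, not the paper's loss).

    Combines KL(teacher || student) on temperature-softened distributions
    with the usual cross-entropy on the gold label.
    """
    p_teacher = softmax(teacher_logits, temperature)  # e.g. text model output
    p_student = softmax(student_logits, temperature)  # e.g. speech model output
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    hard = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable to the hard term's when the temperature changes.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# Hypothetical logits for one utterance over three sentiment classes.
teacher = [0.2, 0.5, 3.0]
student = [1.0, 0.8, 1.2]
print(distillation_loss(student, teacher, label=2))
```

In this sketch only the student sees speech at test time; the teacher's logits are needed during training alone, which mirrors the abstract's point that no speech recognition step is required at inference.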

English keywords:

cross-modal; knowledge distillation; sentiment classification

Authors:

You Peiwen (尤佩雯), Wang Jingjing (王晶晶), Gao Xiaoya (高晓雅), Li Shoushan (李寿山)

Affiliation:

Natural Language Processing Laboratory, School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006

Chinese keywords:

跨模态 (cross-modal); 知识蒸馏 (knowledge distillation); 情感分类 (sentiment classification)

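Funding:

National Natural Science Foundation of China; National Natural Science Foundation of China; National Natural Science Foundation of China; China Postdoctoral Science Foundation; Priority Academic Program Development of Jiangsu Higher Education Institutions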

Grant numbers:

62006166; 62076175; 62076176; 2019M661930

Publication year:

2024

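Journal of Chinese Information Processing (中文信息学报)

Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences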

Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)

Impact factor: 0.8

ISSN: 1003-0077

Year, volume (issue): 2024, 38(4)

Number of references: 24