[Objective] This paper uses knowledge distillation to improve the performance of a small-parameter model under the guidance of a high-performance large-parameter model when labeled samples are insufficient. It aims to address sample scarcity and reduce the cost of high-performance large-parameter models in natural language processing. [Methods] First, we used noise purification to obtain valuable data from an unlabeled corpus. Then, we assigned pseudo labels to these data to increase the number of labeled samples. Meanwhile, we added a knowledge review mechanism and a teaching assistant model to the traditional distillation model to realize comprehensive knowledge transfer from the large-parameter model to the small-parameter model. [Results] We conducted text classification and sentiment analysis tasks with the proposed model on the IMDB, AG_NEWS, and Yahoo! Answers datasets. With only 5% of the original data labeled, the new model's accuracy was only 1.45%, 2.75%, and 7.28% lower than that of the traditional distillation model trained with the original data. [Limitations] We only examined the new model on text classification and sentiment analysis tasks in natural language processing; other tasks need to be explored in the future. [Conclusions] The proposed method achieves a better distillation effect and improves the performance of the small-parameter model.
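As a rough illustration of the pipeline outlined in the Methods (not the authors' actual implementation), the following PyTorch-style sketch shows how confidence-filtered pseudo-labeled samples and an intermediate guiding model (teacher or teaching assistant) might be combined with a standard softened-logit distillation loss; the function names, confidence threshold, temperature, and loss weighting are all assumptions for illustration.

```python
# Hypothetical sketch of teacher -> teaching assistant -> student distillation
# with pseudo-labeled data; thresholds, temperature, and loss weights are
# illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn.functional as F


def pseudo_label(teacher, unlabeled_loader, threshold=0.9):
    """Keep only unlabeled samples whose teacher prediction is confident
    (a simple stand-in for a noise-purification step)."""
    teacher.eval()
    kept_inputs, kept_labels = [], []
    with torch.no_grad():
        for x in unlabeled_loader:
            probs = F.softmax(teacher(x), dim=-1)
            conf, labels = probs.max(dim=-1)
            mask = conf >= threshold
            if mask.any():
                kept_inputs.append(x[mask])
                kept_labels.append(labels[mask])
    return torch.cat(kept_inputs), torch.cat(kept_labels)


def distill_step(student, guide, x, y, optimizer, temperature=2.0, alpha=0.5):
    """One distillation step: cross-entropy on (possibly pseudo) labels plus
    KL divergence to the softened outputs of the guiding model
    (the teacher or the teaching assistant)."""
    guide.eval()
    student.train()
    with torch.no_grad():
        guide_logits = guide(x)
    student_logits = student(x)
    ce = F.cross_entropy(student_logits, y)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(guide_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    loss = alpha * ce + (1 - alpha) * kd
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Two-stage transfer: the large teacher first distills into a mid-sized
# teaching assistant via distill_step(assistant, teacher, ...), which then
# distills into the small student via distill_step(student, assistant, ...).
# (Full training loops over labeled + pseudo-labeled batches are omitted.)
```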