
Online knowledge distillation with elastic peer

Knowledge distillation is a highly effective method for transferring knowledge from a cumbersome teacher network to a lightweight student network. However, teacher networks are not always available. An alternative method called online knowledge distillation, which applies a group of peer networks to learn collaboratively with each other, has been proposed previously. In this study, we revisit online knowledge distillation and find that the existing training strategy limits the diversity among peer networks. Thus, online knowledge distillation cannot achieve its full potential. To address this, a novel online knowledge distillation with elastic peer (KDEP) strategy is introduced here. The entire training process is divided into two phases by KDEP. In each phase, a specific training strategy is applied to adjust the diversity to an appropriate degree. Extensive experiments have been conducted on four benchmarks, including CIFAR-100, CINIC-10, Tiny ImageNet, and Caltech-UCSD Birds. The results demonstrate the superiority of KDEP. For example, when the peer networks are ShuffleNetV2-1.0 and ShuffleNetV2-0.5, the target peer network ShuffleNetV2-0.5 achieves 57.00% top-1 accuracy on Tiny ImageNet via KDEP. (c) 2021 Elsevier Inc. All rights reserved.
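To make the idea of peer-based online distillation concrete, the following is a minimal sketch of mutual distillation between two peer networks in PyTorch. The network choices, temperature, and loss weighting are illustrative assumptions, and KDEP's two-phase elastic-peer schedule for adjusting diversity is not reproduced here.

```python
# Hedged sketch: online (mutual) knowledge distillation between two peers.
# Hyperparameters and network choices are assumptions, not the paper's setup.
import torch
import torch.nn.functional as F
from torchvision import models

def kd_loss(student_logits, peer_logits, T=3.0):
    """KL divergence between softened predictions; the peer side is detached."""
    p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(peer_logits.detach() / T, dim=1)
    return F.kl_div(p_s, p_t, reduction="batchmean") * (T * T)

# Two peer networks of different capacity, as in the ShuffleNetV2 example above.
peer_a = models.shufflenet_v2_x1_0(num_classes=100)
peer_b = models.shufflenet_v2_x0_5(num_classes=100)
opt = torch.optim.SGD(
    list(peer_a.parameters()) + list(peer_b.parameters()), lr=0.1, momentum=0.9
)

def train_step(images, labels, alpha=1.0):
    logits_a, logits_b = peer_a(images), peer_b(images)
    # Each peer is supervised by the labels and by the other peer's predictions.
    loss_a = F.cross_entropy(logits_a, labels) + alpha * kd_loss(logits_a, logits_b)
    loss_b = F.cross_entropy(logits_b, labels) + alpha * kd_loss(logits_b, logits_a)
    loss = loss_a + loss_b
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```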

Keywords: Neural network compression; Knowledge distillation; Knowledge transfer

Tan, Chao; Liu, Jie


National University of Defense Technology

2022

Information Sciences


EI, SCI
ISSN: 0020-0255
Year, Volume (Issue): 2022, 583