Online knowledge distillation with elastic peer
Full text links: NSTL | Elsevier
Knowledge distillation is a highly effective method for transferring knowledge from a cumbersome teacher network to a lightweight student network. However, teacher networks are not always available. An alternative method called online knowledge distillation, which applies a group of peer networks to learn collaboratively with each other, has been proposed previously. In this study, we revisit online knowledge distillation and find that the existing training strategy limits the diversity among peer networks, so online knowledge distillation cannot achieve its full potential. To address this, a novel online knowledge distillation with elastic peer (KDEP) strategy is introduced here. KDEP divides the entire training process into two phases; in each phase, a specific training strategy is applied to adjust the diversity among peers to an appropriate degree. Extensive experiments have been conducted on four benchmarks: CIFAR-100, CINIC-10, Tiny ImageNet, and Caltech-UCSD Birds. The results demonstrate the superiority of KDEP. For example, when the peer networks are ShuffleNetV2-1.0 and ShuffleNetV2-0.5, the target peer network ShuffleNetV2-0.5 achieves 57.00% top-1 accuracy on Tiny ImageNet via KDEP. (c) 2021 Elsevier Inc. All rights reserved.
Keywords: Neural network compression; Knowledge distillation; Knowledge transfer
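The abstract does not spell out KDEP's phase-specific strategies, but the general online knowledge distillation setup it builds on can be sketched in a few lines. Below is a minimal PyTorch sketch assuming the common mutual-learning formulation (cross-entropy on the labels plus a temperature-scaled KL term toward each peer's softened output). The function names `kd_loss` and `train_step`, the temperature `T=4.0`, the `mutual` flag used to mimic a two-phase schedule, and the toy peer models are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def kd_loss(logits_a, logits_b, T=4.0):
    """Temperature-scaled KL divergence pulling peer A toward peer B's
    softened prediction (B is detached, so gradients only reach A)."""
    log_p_a = F.log_softmax(logits_a / T, dim=1)
    p_b = F.softmax(logits_b.detach() / T, dim=1)
    return F.kl_div(log_p_a, p_b, reduction="batchmean") * (T * T)


def train_step(peers, optimizers, x, y, mutual=True, T=4.0):
    """One online-KD step for a group of peer networks.

    Each peer minimises cross-entropy on the labels plus, when `mutual` is
    True, a KL term toward every other peer's softened output. Toggling
    `mutual` per training phase is only a hypothetical stand-in for KDEP's
    phase-specific strategies for controlling peer diversity."""
    logits = [net(x) for net in peers]
    losses = []
    for i in range(len(peers)):
        loss = F.cross_entropy(logits[i], y)
        if mutual:
            for j in range(len(peers)):
                if j != i:
                    loss = loss + kd_loss(logits[i], logits[j], T)
        losses.append(loss)
    # Update every peer only after all losses are formed, so each peer
    # distils from the others' predictions of the current step.
    for loss, opt in zip(losses, optimizers):
        opt.zero_grad()
        loss.backward()
        opt.step()
    return [l.detach() for l in logits]


# Toy usage (in the paper the peers are e.g. ShuffleNetV2-1.0 and ShuffleNetV2-0.5):
peers = [nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100)) for _ in range(2)]
optimizers = [torch.optim.SGD(p.parameters(), lr=0.1, momentum=0.9) for p in peers]
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 100, (8,))
train_step(peers, optimizers, x, y, mutual=False)  # phase with looser peer coupling
train_step(peers, optimizers, x, y, mutual=True)   # phase with mutual distillation
```

Detaching the other peer's logits keeps each peer's loss graph independent, so the peers can be updated one after another within the same step; how and when the coupling between peers is tightened is exactly what KDEP's two phases control, per the abstract.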