Enhanced Residual Networks via Mixed Knowledge Fraction
Methods such as stimulative training and group-knowledge-based training collect group knowledge from the shallow subnetworks of residual networks for self-distillation, thereby enhancing network performance. However, the group knowledge acquired by these methods suffers from slow updating and is difficult to combine with DataMix techniques. To address these issues, enhanced residual networks via mixed knowledge fraction (MKF) are proposed. The decomposition of the mixed knowledge is modeled as a quadratic program that minimizes a fraction loss, so that high-quality group knowledge is recovered from the mixed knowledge. To improve the robustness and diversity of the knowledge, a compound DataMix technique is proposed to construct a composite data augmentation method. In contrast to high-precision but inefficient optimization algorithms, a simple and efficient linear knowledge fraction technique is designed: the previous group knowledge is taken as a set of knowledge bases, and the mixed knowledge is decomposed over these bases. The enhanced group knowledge is then used to distill sampled subnetworks. Experiments on mainstream residual networks and classification datasets verify the effectiveness of MKF.
Deep Learning, Neural Network, Knowledge Distillation, Network Enhancement, Residual Network
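As a minimal sketch of the knowledge-fraction idea summarized above, the snippet below decomposes a mixed-knowledge logit vector over previous group-knowledge bases by a closed-form least-squares solve of a quadratic fraction-loss objective. The function name `decompose_mixed_knowledge`, the non-negativity and normalization of the fractions, and the NumPy formulation are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def decompose_mixed_knowledge(mixed_logits, knowledge_bases):
    """Hypothetical linear knowledge-fraction step.

    Minimizes the squared fraction loss ||B @ w - m||^2 over mixing
    fractions w, where the columns of B are previous group-knowledge
    logit vectors (the knowledge bases) and m is the mixed knowledge
    produced for a DataMix-augmented sample. Solving the least-squares
    problem in closed form gives a simple, linear estimate of w instead
    of running an iterative QP solver.
    """
    B = np.stack(knowledge_bases, axis=1)      # (num_classes, num_bases)
    m = np.asarray(mixed_logits)               # (num_classes,)
    w, *_ = np.linalg.lstsq(B, m, rcond=None)  # least-squares fractions
    w = np.clip(w, 0.0, None)                  # keep fractions non-negative
    if w.sum() > 0:
        w = w / w.sum()                        # normalize fractions to sum to one
    enhanced = B @ w                           # reconstructed group knowledge
    return w, enhanced

# Toy usage: two knowledge bases mixed roughly 70/30 plus noise.
rng = np.random.default_rng(0)
b1, b2 = rng.normal(size=10), rng.normal(size=10)
mixed = 0.7 * b1 + 0.3 * b2 + 0.01 * rng.normal(size=10)
fractions, group_knowledge = decompose_mixed_knowledge(mixed, [b1, b2])
print(fractions)  # approximately [0.7, 0.3]
```

In this sketch, the recovered `group_knowledge` would play the role of the enhanced group knowledge used to distill sampled subnetworks; how the fractions interact with the compound DataMix augmentation is left unspecified here.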