Weight-Nested Teacher-Student Network Based on Structural Re-parameterization
Teacher-student networks have achieved significant results and become important frameworks in the fields of knowledge distillation, knowledge expansion, domain adaptation, and multi-task learning. In different application scenarios, teacher-student networks have different optimization directions. This paper proposes a weight-nested teacher-student network model for knowledge-expansion tasks in semi-supervised settings. The model borrows from the structural re-parameterization method: the student network is designed as a structurally re-parameterized network, and its backbone branch further serves as the teacher network, achieving the nesting of the teacher and student networks. Since the backbone branches of the teacher and student networks share weights, the proposed framework can significantly reduce the parameter count and memory requirements of traditional teacher-student network models. To verify the effectiveness of the proposed framework under semi-supervised settings, we conducted comparative experiments on the Food-101 dataset. The experimental results show that the nested teacher-student network achieves slightly higher performance than the traditional teacher-student network model under various labeling ratios, while reducing model parameters and memory usage by more than 40%.
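The nesting idea described above can be illustrated with a minimal NumPy sketch. This is a hypothetical toy (a linear layer stands in for a convolution, and the variable names `W_backbone` and `W_branch` are illustrative, not from the paper): the student is a re-parameterized block summing a backbone branch and an extra branch, while the teacher is the backbone branch alone, so the backbone weight is stored only once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Backbone weight, shared by BOTH the teacher and the student (toy linear
# layer standing in for a convolution; shapes are arbitrary for illustration).
W_backbone = rng.standard_normal((16, 8))
# Extra re-parameterization branch, belonging to the student only.
W_branch = rng.standard_normal((16, 8))


def teacher_forward(x):
    # Teacher = the backbone branch of the student, using the shared weight.
    return W_backbone @ x


def student_forward(x):
    # Student = structurally re-parameterized block: backbone branch plus
    # extra branch, summed RepVGG-style.
    return (W_backbone + W_branch) @ x


x = rng.standard_normal(8)
t_out = teacher_forward(x)
s_out = student_forward(x)

# Because the backbone weight is shared, storing teacher + student costs two
# weight matrices instead of the three a separate teacher network would need.
n_params_nested = W_backbone.size + W_branch.size
n_params_separate = 2 * W_backbone.size + W_branch.size
```

In this toy setting the student's output decomposes exactly into the teacher's output plus the extra branch's contribution, which is what makes the weight sharing possible without duplicating the backbone.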