Weight-Nested Teacher-Student Network Based on Structural Re-parameterization
Teacher-student networks have achieved significant results and become important frameworks in the fields of knowledge distillation, knowledge expansion, domain adaptation, and multi-task learning. In different application scenarios, teacher-student networks have different optimization directions. This paper proposes a weight-nested teacher-student network model for knowledge-expansion tasks in semi-supervised settings. The model borrows from the structural re-parameterization method: the student network is designed as a structurally re-parameterized network, and its backbone branch further serves as the teacher network, achieving the nesting of the teacher and student networks. Since the backbone branches of the teacher and student networks share weights, the proposed framework can significantly reduce the parameter count and memory requirements of traditional teacher-student network models. To verify the effectiveness of the proposed framework under semi-supervised settings, we conducted comparative experiments on the Food-101 dataset. The experimental results show that the nested teacher-student network achieves slightly higher performance than the traditional teacher-student network model under various labeling ratios, while reducing model parameters and memory usage by more than 40%.
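The nesting idea described above can be illustrated with a minimal NumPy sketch. This is a hypothetical toy (a linear layer stands in for a convolution, and the variable names `W_backbone` and `W_branch` are illustrative, not from the paper): the student is a re-parameterized block summing a backbone branch and an extra branch, while the teacher is the backbone branch alone, so the backbone weight is stored only once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Backbone weight, shared by BOTH the teacher and the student (toy linear
# layer standing in for a convolution; shapes are arbitrary for illustration).
W_backbone = rng.standard_normal((16, 8))
# Extra re-parameterization branch, belonging to the student only.
W_branch = rng.standard_normal((16, 8))


def teacher_forward(x):
    # Teacher = the backbone branch of the student, using the shared weight.
    return W_backbone @ x


def student_forward(x):
    # Student = structurally re-parameterized block: backbone branch plus
    # extra branch, summed RepVGG-style.
    return (W_backbone + W_branch) @ x


x = rng.standard_normal(8)
t_out = teacher_forward(x)
s_out = student_forward(x)

# Because the backbone weight is shared, storing teacher + student costs two
# weight matrices instead of the three a separate teacher network would need.
n_params_nested = W_backbone.size + W_branch.size
n_params_separate = 2 * W_backbone.size + W_branch.size
```

In this toy setting the student's output decomposes exactly into the teacher's output plus the extra branch's contribution, which is what makes the weight sharing possible without duplicating the backbone.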