Class Attention Knowledge Distillation Based on Channel Correlation
Previous knowledge distillation methods have shown impressive performance in model compression. Among them, Class Attention Transfer Based Knowledge Distillation (CAT-KD) demonstrated that transferring class activation maps enables the student model to acquire and strengthen the ability to recognize the classification-relevant regions of the input, an ability that is key to current mainstream CNN classifiers. CAT-KD improves distillation performance by transferring class activation maps processed with average pooling and L2 normalization. However, this approach ignores the channel-related knowledge contained in the class activation maps, which is crucial for the student model to learn to recognize classification-relevant regions. To address this issue, we propose a class attention transfer method based on channel correlation. Specifically, to extract richer knowledge from the class activation maps, the proposed method considers not only the feature knowledge of the individual channels of each sample's class activation maps, but also the relational knowledge across different samples for each channel of the class activation maps. Experiments show that the proposed method improves on the baseline by 0.96 percentage points on the CIFAR-100 dataset, outperforming the comparison methods.
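The following is a minimal sketch, not the authors' implementation, of the two kinds of knowledge described above, assuming PyTorch and class activation maps of shape (batch, num_classes, H, W); the function names, pooling size, and the use of an MSE loss over Gram-style correlation matrices are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def cat_kd_loss(cam_s, cam_t, pool_size=2):
    # CAT-KD-style baseline (hypothetical form): average-pool and L2-normalize
    # the class activation maps, then match student to teacher.
    s = F.normalize(F.adaptive_avg_pool2d(cam_s, pool_size).flatten(2), dim=-1)
    t = F.normalize(F.adaptive_avg_pool2d(cam_t, pool_size).flatten(2), dim=-1)
    return F.mse_loss(s, t)

def channel_correlation_loss(cam_s, cam_t):
    # Flatten each channel's activation map into a vector and L2-normalize it.
    s = F.normalize(cam_s.flatten(2), dim=-1)   # (B, C, H*W)
    t = F.normalize(cam_t.flatten(2), dim=-1)

    # Intra-sample term: correlations between the channels of each sample's
    # class activation maps, matched between student and teacher.
    intra = F.mse_loss(s @ s.transpose(1, 2), t @ t.transpose(1, 2))   # (B, C, C)

    # Inter-sample term: for each channel, relations between different samples'
    # activation maps, matched between student and teacher.
    s2, t2 = s.transpose(0, 1), t.transpose(0, 1)                      # (C, B, H*W)
    inter = F.mse_loss(s2 @ s2.transpose(1, 2), t2 @ t2.transpose(1, 2))  # (C, B, B)

    return intra + inter
```

In this sketch the intra-sample term captures channel-wise feature relationships within each sample, while the inter-sample term captures how each channel's activations relate across samples; the actual weighting and combination of these terms in the proposed method may differ.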