
Co-history: learning with noisy labels by co-teaching with history losses

Objective Deep neural networks show excellent performance on computer vision classification tasks; however, deep learning models are severely challenged in label-noise environments. Learning algorithms based on co-teaching can effectively alleviate the problem of neural networks learning from noisy-label data, but many shortcomings remain. To this end, this paper proposes Co-history, a label-noise-robust learning method that incorporates historical information into co-teaching. Method First, to address the overfitting caused by the cross-entropy (CE) loss under noisy labels, a correction loss is proposed by analyzing the historical pattern of sample losses, which weakens the overfitting induced by the CE loss during model training. Second, to address the premature convergence of the two networks in the co-teaching algorithm, a difference loss is proposed to maintain the difference between the two networks during training. Finally, following the small-loss selection strategy and incorporating the historical losses of samples, a new sample selection method is proposed that selects clean samples more accurately. Result Comparison experiments are conducted on four simulated-noise datasets, F-MNIST (Fashion-Modified National Institute of Standards and Technology), SVHN (street view house numbers), CIFAR-10 (Canadian Institute for Advanced Research-10), and CIFAR-100, and on one real-world dataset, Clothing1M. Under symmetric noise at a 40% noise rate, the proposed method outperforms the co-teaching algorithm by 3.52%, 4.77%, 6.16%, and 6.96% on F-MNIST, SVHN, CIFAR-10, and CIFAR-100, respectively; on the real-world Clothing1M dataset, the best and last accuracies of the proposed method improve over co-teaching by 0.94% and 1.2%, respectively. Conclusion Extensive experiments demonstrate that the proposed co-teaching-based robust classification algorithm for noisy labels, which considers historical losses, effectively reduces the influence of noisy labels and improves model classification accuracy.
Objective Deep neural networks (DNNs) have been successfully applied in many fields, especially in computer vision, and this success cannot be achieved without large-scale labeled datasets. However, collecting large-scale datasets with accurate labels is difficult in practice, especially in professional fields where labeling requires the involvement of domain experts and thus increases labor and financial costs. To cut costs, researchers have started using datasets built by crowdsourcing annotations, search engine queries, web crawling, and other means. However, these datasets inevitably contain noisy labels that seriously harm the generalization of DNNs because DNNs memorize the noisy labels during training. Learning algorithms based on co-teaching, including Co-teaching+, JoCoR, and CoDis, can effectively alleviate the problem of neural networks learning from noisy-label data, and scholars have put forward different designs for using two networks to handle noisy labels. However, in a noisy-label environment, a deep learning model trained with the cross-entropy (CE) loss is very sensitive to noisy labels, so the model easily fits noisy-label samples and fails to learn the true patterns of the data. In addition, as training progresses, co-teaching causes the parameters of the two networks to gradually become consistent and prematurely converge to the same network, which stops the learning process. As the iterations proceed, the network also inevitably memorizes some of the noisy-label samples and thus fails to distinguish noisy from clean samples accurately on the basis of the CE loss value; in this case, relying solely on the CE loss in a small-loss selection strategy is unreliable. To solve these problems, this paper proposes Co-history, a method for learning with noisy labels by co-teaching with history losses, which considers historical information in collaborative learning. Method First, to solve the overfitting problem of the CE loss in a noisy-label environment, a correction loss is proposed by analyzing the history of each sample's loss. The correction loss adjusts the weight of the CE loss in the current iteration so that the CE loss of a sample remains stable over the historical iterations as a whole. This design conforms to the law that the classifier should be maintained after the noisy samples are separated from the clean ones and thereby reduces the influence of overfitting caused by the CE loss. Second, a difference loss is proposed to address the premature convergence of the two networks in the co-teaching algorithm. Inspired by contrastive loss, the difference loss makes the two networks keep a certain distance between their feature representations of the same sample, which maintains the difference between the networks during training and prevents their degradation into a single network. Because the two networks differ in their parameters, they generate different decision boundaries and filter different types of errors; maintaining this difference therefore benefits collaborative learning. Finally, because of overfitting, samples with noisy labels tend to show larger loss fluctuations than samples with clean labels. By combining the historical loss information of the samples with the small-loss selection strategy, a new sample selection method is proposed to select clean samples accurately. Specifically, samples with low classification losses and low fluctuations in their historical losses are selected as clean samples for training.
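To make the three components above concrete, the following is a minimal PyTorch-style sketch based only on the description in this abstract; it is not the authors' released code, and the function names, the `history` buffer of per-sample past losses, the exponential weighting, and the variance-based selection score are illustrative assumptions.

```python
# Illustrative sketch only (assumed interfaces, not the authors' implementation).
# `history` is a [num_samples, T] tensor holding each sample's last T CE losses.
import torch
import torch.nn.functional as F

def correction_loss(logits, targets, history, idx):
    """Correction loss (assumed form): down-weight samples whose current CE
    loss drifts away from their historical average, keeping each sample's
    loss stable across iterations."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # per-sample CE
    hist_mean = history[idx].mean(dim=1)                     # average past loss
    weight = torch.exp(-(ce.detach() - hist_mean).abs())     # large drift -> small weight
    return (weight * ce).mean()

def difference_loss(feats1, feats2, margin=1.0):
    """Difference loss (contrastive-style): keep the two networks' feature
    representations of the same sample at least `margin` apart."""
    dist = F.pairwise_distance(feats1, feats2)
    return F.relu(margin - dist).mean()

def select_clean(ce, history, idx, keep_ratio):
    """History-aware small-loss selection: prefer samples whose loss is both
    small now and stable (low variance) over past iterations."""
    score = ce + history[idx].var(dim=1)           # low loss and low fluctuation
    num_keep = max(1, int(keep_ratio * ce.numel()))
    return score.topk(num_keep, largest=False).indices
```

In a co-teaching loop, each network would run `select_clean` on its own per-sample losses and hand the selected indices to its peer for the update, with `correction_loss` and `difference_loss` added to the training objective; the margin, the history length T, and the exact weighting function are design choices that this abstract does not specify.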
Result Several experiments are conducted to demonstrate the effectiveness of the Co-history algorithm, including comparison experiments on four standard datasets (F-MNIST, SVHN, CIFAR-10, and CIFAR-100) and one real-world dataset (Clothing1M). Four categories of artificially simulated noise are added to the standard datasets, namely, the symmetric, asymmetric, pairflip, and tridiagonal noise types, with 20% and 40% noise rates for each type. In the real-world dataset, the labels are generated from the text surrounding each image, which already contains label noise, so no additional label noise is added. Under the symmetric noise type at a 20% noise rate, the Co-history algorithm demonstrates improvements of 2.05%, 2.19%, 3.06%, and 2.58% over the co-teaching algorithm on the F-MNIST, SVHN, CIFAR-10, and CIFAR-100 datasets, respectively. At a 40% noise rate, the corresponding improvements are 3.52%, 4.77%, 6.16%, and 6.96%. On the real-world Clothing1M dataset, the best and last accuracies of Co-history improve by 0.94% and 1.2%, respectively, compared with the co-teaching algorithm. The effectiveness of each proposed loss is verified by ablation experiments. Conclusion This paper proposes a correction loss that addresses the overfitting of CE-loss training by exploiting the historical pattern of sample losses and introduces a difference loss that solves the premature convergence of the two networks in co-teaching. Going beyond the traditional small-loss sample selection strategy, the historical pattern of sample losses is fully considered, and a highly accurate sample selection strategy is developed. The proposed Co-history algorithm demonstrates its superiority over existing co-teaching strategies in a large number of experiments; it also shows strong robustness on datasets with noisy labels and is particularly suitable for noisy-label scenarios. The contribution of each improvement is clearly demonstrated in the ablation experiments. Because the algorithm analyzes the historical loss information of every sample, the historical loss values of all samples must be stored, so increasing the number of training samples occupies more memory and raises computing and storage costs. In addition, when the number of sample categories is large, the performance of the proposed algorithm becomes suboptimal in some noisy environments (e.g., the asymmetric noise type at a 40% noise rate and the CIFAR-100 dataset at a 20% noise rate). Future work will focus on developing high-performance solutions under the premise of guaranteed accuracy and on robust classification algorithms for learning with noisy labels.

deep neural network (DNN); classification; noisy labels; co-teaching; historical loss

Dong Yongfeng, Li Jiawei, Wang Zhen, Jia Wenyu


School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401, China

Hebei Province Key Laboratory of Big Data Computing (Hebei University of Technology), Tianjin 300401, China

Hebei Engineering Research Center of Data-Driven Industrial Intelligence (Hebei University of Technology), Tianjin 300401, China


2024

Journal of Image and Graphics
Institute of Remote Sensing Applications, Chinese Academy of Sciences; China Society of Image and Graphics; Institute of Applied Physics and Computational Mathematics, Beijing


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.111
ISSN: 1006-8961
Year, volume (issue): 2024, 29(12)