Long-tailed classification is an unavoidable and challenging task in the real world. Traditional methods usually focus only on the imbalanced distribution between classes, but recent studies have begun to pay attention to the intra-class long-tailed distribution, i.e., within the same class, samples with head attributes far outnumber those with tail attributes. Because attributes are implicit and their combinations are complex, the intra-class imbalance problem is even harder to handle. To this end, this paper proposes Cognisance, a generalized long-tailed classification framework based on leading forests and a multi-center loss, which aims to build a multi-granularity joint solution model for the long-tailed classification problem under the paradigm of invariant feature learning. First, the framework constructs a Coarse-Grained Leading Forest (CLF) via unsupervised learning to better characterize the intra-class distribution of samples over different attributes, and thereby builds distinct environments for the process of invariant risk minimization. Second, a new metric learning loss, the Multi-Center Loss (MCL), is designed to progressively eliminate confounding attributes during feature learning. Moreover, Cognisance does not depend on a specific model structure and can be integrated with other long-tailed classification methods as an independent component. Experimental results on the ImageNet-GLT and MSCOCO-GLT datasets show that the proposed framework achieves the best performance, and existing methods gain a 2%–8% improvement in Top-1 accuracy when integrated with it.
Multi-granular and Generalized Long-tailed Classification Based on Leading Forest
Long-tailed classification is an inevitable and challenging task in the real world. Traditional methods usually focus only on inter-class imbalanced distributions; however, recent studies have begun to emphasize intra-class long-tailed distributions, i.e., within the same class there are far more samples with head attributes than with tail attributes. Due to the implicitness of attributes and the complexity of their combinations, the intra-class imbalance problem is even more difficult to deal with. For this purpose, a generalized long-tailed classification framework (Cognisance) is proposed in this paper, aiming to build a multi-granularity joint solution model for the long-tailed classification problem through invariant feature learning. Firstly, the framework constructs a Coarse-Grained Leading Forest (CLF) through unsupervised learning to better characterize the distribution of samples over different attributes within a class, and thus constructs different environments in the process of invariant risk minimization. Secondly, the framework designs a new metric learning loss, the Multi-Center Loss (MCL), to gradually eliminate confounding attributes during the feature learning process. Additionally, the framework does not depend on a specific model structure and can be integrated with other long-tailed classification methods as an independent component. Experimental results on the ImageNet-GLT and MSCOCO-GLT datasets show that the proposed method achieves the best performance, and existing methods all gain an improvement of 2%–8% in the Top-1 accuracy metric by integrating with this framework.
Keywords: Long-tailed classification; Imbalance learning; Invariant feature learning; Multi-granularity joint problem solving
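The abstract does not give the formula for the Multi-Center Loss. As a rough illustration of the general idea only, the sketch below assumes each class keeps several centers (e.g., one per intra-class attribute cluster) and each sample is pulled toward its nearest same-class center; the function name, the dictionary layout of `centers`, and the squared-distance form are all hypothetical, not the paper's actual definition.

```python
import numpy as np

def multi_center_loss(features, labels, centers):
    """Hypothetical sketch of a multi-center style loss (not the paper's MCL).

    features: (N, D) array of sample embeddings.
    labels:   (N,) array of class ids.
    centers:  dict mapping class id -> (K, D) array of K centers for that class.
    Returns the mean squared distance from each sample to the nearest
    center of its own class.
    """
    total = 0.0
    for x, y in zip(features, labels):
        # Distance from the sample to every center of its class.
        dists = np.linalg.norm(centers[y] - x, axis=1)
        # Pull only toward the nearest center, so different attribute
        # clusters within a class are not collapsed into one point.
        total += dists.min() ** 2
    return total / len(features)
```

A sample sitting exactly on one of its class centers contributes zero, so samples from different intra-class attribute clusters can each stay near their own center instead of a single class mean, which is the intuition behind using multiple centers per class.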