Class Imbalance Multi-label Classification with Common and Label Specific Features

Multi-label classification deals with instances that are associated with multiple class labels. Most existing multi-label methods use the same data representation, built from all features, to discriminate all labels. However, because each label has its own characteristics, a unified feature set cannot fully distinguish the labels, which hurts model training and increases its time cost. Improving classification performance by using the most discriminative features for each label therefore remains a challenge. In addition, class imbalance in real-world data also degrades the performance of multi-label learning models. Motivated by this, we propose a class-imbalance multi-label classification approach with common and label-specific features. First, we find the nearest neighbors of seed instances and use interpolation to obtain the features of synthetic instances, which addresses the class imbalance problem. Second, to find the most representative features for each label, we impose l1-norm and l2,1-norm regularizers on the coefficient matrix to extract label-specific features and common features. Finally, we use label correlation to make the model outputs of correlated labels similar, and instance correlation to ensure that correlated features share the corresponding label distribution information, which improves classification performance. Extensive experiments show that the proposed method performs competitively against other multi-label learning approaches.
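The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact sampling procedure or objective, so the function names, the neighbor count `k`, the uniform interpolation gap, and the convention that rows of the coefficient matrix `W` index features and columns index labels are all assumptions made for illustration. The sketch only synthesizes features; how labels are assigned to synthetic instances is not stated in the abstract and is left out.

```python
import numpy as np


def synthesize_features(X, seed_idx, k=5, rng=None):
    """Create one synthetic feature vector by interpolating between a seed
    instance and one of its k nearest neighbours (SMOTE-style; assumed)."""
    rng = np.random.default_rng(rng)
    k = min(k, len(X) - 1)                 # cannot have more neighbours than other rows
    seed = X[seed_idx]
    dists = np.linalg.norm(X - seed, axis=1)
    dists[seed_idx] = np.inf               # exclude the seed itself
    neighbours = np.argsort(dists)[:k]     # indices of the k closest instances
    neighbour = X[rng.choice(neighbours)]  # pick one neighbour at random
    gap = rng.random()                     # interpolation weight in [0, 1)
    return seed + gap * (neighbour - seed)


def l1_norm(W):
    """Sum of absolute entries; encourages element-wise sparsity, so each
    label (column) can keep its own small set of features."""
    return np.abs(W).sum()


def l21_norm(W):
    """Sum of the Euclidean norms of the rows; encourages whole rows to be
    zero, so the surviving features are shared by all labels."""
    return np.linalg.norm(W, axis=1).sum()
```

Under these assumptions, the l1 term would be the one that selects label-specific features and the l2,1 term the one that selects common features. In a full model the two penalties would be added to a loss over `W` with separate weights, which the abstract does not specify.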

Keywords: multi-label classification; class imbalance; common features; label-specific features; label correlation

Authors: Zhang Haixiang (张海翔), Li Peipei (李培培), Hu Xuegang (胡学钢)

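Information Office, Hefei Second People's Hospital Affiliated to Bengbu Medical College, Hefei, Anhui 230012

Key Laboratory of Knowledge Engineering with Big Data (Ministry of Education), Hefei University of Technology, Hefei, Anhui 230601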

Funding: National Natural Science Foundation of China (61976077, 62076085, 62120106008); Science and Technology Program of Bengbu Medical College (2022byzd225sk)

Journal: Computer Technology and Development (计算机技术与发展)
Shaanxi Computer Society (陕西省计算机学会)
Indexed in: CSTPCD
Impact factor: 0.621
ISSN: 1673-629X
Year, volume (issue): 2024, 34(2)