
A Multi-Label Classifier Chain Algorithm Based on PageRank and Mutual Information

Classifier chains are an effective approach to multi-label classification, and finding a suitable order for the labels in the chain is the key to this class of algorithms. In single-chain mode, an unsuitable label order severely degrades classification performance, while random multi-chain modes sharply increase algorithmic complexity. To address these issues, a multi-label classifier chain algorithm based on PageRank and mutual information is proposed. First, the commonalities between labels and web pages are explored, and the similarity relations between labels are treated as analogous to the links between web pages. Second, to take global correlation into account, mutual information is used to measure the correlation between labels. Finally, based on this correlation information, the labels are ranked using the PageRank idea of measuring web page importance, and the ranking forms the classifier chain. Experiments on 10 public multi-label datasets from different domains show that the algorithm finds a suitable label order for the classifier chain, improving classification accuracy while reducing computational cost.
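The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the mutual-information values are turned into link weights, which damping factor is used, or whether high-ranked labels go first in the chain. The sketch assumes raw pairwise mutual information as symmetric edge weights, a damping factor of 0.85, and descending PageRank score as the chain order. The function names (`mutual_information`, `label_graph`, `pagerank`, `chain_order`) and the toy label matrix `Y` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Empirical mutual information (in nats) between two label columns."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa, pb = Counter(a), Counter(b)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (pa[x] * pb[y]))
    return max(0.0, mi)  # guard against tiny negative rounding errors


def label_graph(Y):
    """Assumed graph: W[i][j] = MI(label i, label j), zero diagonal."""
    q = len(Y[0])
    cols = [[row[j] for row in Y] for j in range(q)]
    W = [[0.0] * q for _ in range(q)]
    for i in range(q):
        for j in range(i + 1, q):
            W[i][j] = W[j][i] = mutual_information(cols[i], cols[j])
    return W


def pagerank(W, damping=0.85, tol=1e-12, max_iter=1000):
    """Weighted PageRank by power iteration; scores sum to 1."""
    q = len(W)
    out = [sum(row) for row in W]  # total outgoing weight per label
    r = [1.0 / q] * q
    for _ in range(max_iter):
        # Labels with no outgoing weight spread their score uniformly.
        dangling = sum(r[i] for i in range(q) if out[i] == 0)
        new = []
        for j in range(q):
            s = sum(r[i] * W[i][j] / out[i] for i in range(q) if out[i] > 0)
            new.append((1 - damping) / q + damping * (s + dangling / q))
        converged = sum(abs(x - y) for x, y in zip(new, r)) < tol
        r = new
        if converged:
            break
    return r


def chain_order(Y, damping=0.85):
    """Label indices sorted by descending PageRank score (an assumption)."""
    scores = pagerank(label_graph(Y), damping)
    return sorted(range(len(scores)), key=lambda j: -scores[j]), scores


# Toy label matrix: labels 0 and 1 always co-occur, label 2 is independent.
Y = [[1, 1, 1], [1, 1, 0], [1, 1, 1], [1, 1, 0],
     [0, 0, 1], [0, 0, 0], [0, 0, 1], [0, 0, 0]]
order, scores = chain_order(Y)
print(order, [round(s, 4) for s in scores])
```

In this toy case the independent label receives no link weight and ends up last in the order. The resulting index list could then be handed to any classifier chain implementation that accepts an explicit label order, for example the `order` parameter of scikit-learn's `ClassifierChain`.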

Keywords: multi-label classification; classifier chains; PageRank; label correlation; mutual information

Ding Jiaman, Li Xinyu, Jia Lianyin, Hu Shuang, Wang Hongbin


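Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China

Yunnan Key Laboratory of Artificial Intelligence, Kunming, Yunnan 650500, China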


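Funding: National Natural Science Foundation of China (62262034, 62262035); Yunnan Provincial Science and Technology "Open Competition" Project (202204BW050001)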

2024

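Journal of Kunming University of Science and Technology (Natural Science Edition)
Kunming University of Science and Technology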


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.516
ISSN: 1007-855X
Year, volume (issue): 2024, 49(3)