Neighborhood-tolerance mutual information selective ensemble classification algorithm for incomplete data sets

To address classification in incomplete mixed information systems, this paper combines the neighborhood-tolerance relation from granular computing with mutual information theory to define neighborhood-tolerance mutual information and, drawing on ensemble learning, proposes a selective ensemble classification algorithm based on it for incomplete data sets. The algorithm first derives information granules from the missing attributes and partitions them into granular layers to build a granular space; on each layer, an ensemble algorithm with BP neural networks as base classifiers is used to construct a new base classifier. Next, for each information granule, the neighborhood-tolerance mutual information with respect to the class attribute is computed from the granule's missing attributes to measure the granule's importance, and the base-classifier weights are redefined from both the prediction accuracy of each base classifier and the neighborhood-tolerance mutual information. Finally, for each sample to be predicted, the base classifiers are combined by weighted ensemble to predict its class, and the results are compared with those of traditional ensemble classification algorithms. On some incomplete mixed data sets, the proposed algorithm effectively improves classification accuracy.
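
The abstract does not give formulas, so the following is only a minimal sketch of two steps it describes: grouping samples into information granules by their missing-attribute pattern, and combining base classifiers by weighted voting. The function names, the use of `None` as the missing-value marker, and the weight rule (accuracy multiplied by a granule importance score, then normalized) are all assumptions made for illustration. They are not the authors' definitions, and the importance scores here are made-up numbers standing in for neighborhood-tolerance mutual information.

```python
from collections import defaultdict


def missing_pattern_granules(rows):
    """Group row indices by the set of attribute positions that are missing.

    Assumption: a missing value is encoded as None, so numeric and
    categorical attributes can be mixed freely in one row.
    """
    granules = defaultdict(list)
    for i, row in enumerate(rows):
        pattern = tuple(j for j, v in enumerate(row) if v is None)
        granules[pattern].append(i)
    return dict(granules)


def classifier_weights(accuracies, importances):
    """Hypothetical weight rule: accuracy * importance, normalized to sum to 1."""
    raw = [a * m for a, m in zip(accuracies, importances)]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the most total weight."""
    scores = defaultdict(float)
    for label, w in zip(predictions, weights):
        scores[label] += w
    return max(scores, key=scores.get)


# Toy mixed data: one numeric, one categorical, one numeric attribute.
rows = [
    [1.0, "a", None],
    [0.5, None, None],
    [3.0, "b", None],
    [2.0, "a", 2.0],
]
granules = missing_pattern_granules(rows)

# Illustrative accuracies and importance scores for three base classifiers.
weights = classifier_weights([0.9, 0.6, 0.8], [0.5, 0.2, 0.3])
label = weighted_vote([1, 0, 0], weights)
```

In this toy run, rows 0 and 2 share the pattern "third attribute missing" and fall into the same granule, and the single classifier voting for class 1 outweighs the two voting for class 0 because its assumed weight is larger.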

incomplete hybrid information system; neighborhood-tolerance mutual information; ensemble learning; classification

Li Lihong (李丽红), Dong Hongyao (董红瑶), Liu Wenjie (刘文杰), Li Baolin (李宝霖), Dai Qi (代琪)

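College of Science, North China University of Science and Technology, Tangshan 063210

Hebei Key Laboratory of Data Science and Application, North China University of Science and Technology, Tangshan 063210

Tangshan Key Laboratory of Engineering Computing, North China University of Science and Technology, Tangshan 063210

Shougang Mining Company School for Employees' Children, Tangshan 064404

College of Artificial Intelligence, North China University of Science and Technology, Tangshan 063210

Department of Automation, China University of Petroleum (Beijing), Beijing 102249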

Project of the Hebei Key Laboratory of Data Science and Application; Project of the Tangshan Key Laboratory of Data Science

1012020110120301

2024

Journal of Nanjing University (Natural Science)
Nanjing University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.756
ISSN: 0469-5097
Year, volume (issue): 2024, 60(1)