
Multilabel Feature Selection Based on Fisher Score with Center Shift and Neighborhood Intuitionistic Fuzzy Entropy

Edge samples in existing multilabel Fisher score models degrade classification performance. Because neighborhood intuitionistic fuzzy entropy has stronger expressive and discriminative power when handling uncertain information, this paper proposes a multilabel feature selection method based on the Fisher score with center shift and neighborhood intuitionistic fuzzy entropy. First, the multilabel universe is partitioned into multiple sample sets according to the labels, and the feature mean of each sample set is taken as the original center of the samples under that label. The distance to the farthest sample is multiplied by a distance coefficient to remove the edge sample set, which defines a new effective sample set. The score of each feature under each label and the feature score over the label set are then computed after the center shift, yielding a multilabel Fisher score model based on center shift that is used to preprocess multilabel data. Second, the multilabel classification margin is introduced as an adaptive fuzzy neighborhood radius parameter, and the fuzzy neighborhood similarity relation and fuzzy neighborhood granule are defined, from which the upper and lower approximations of multilabel fuzzy neighborhood rough sets are constructed. On this basis, the multilabel neighborhood rough intuitionistic membership and non-membership functions are proposed, and the multilabel neighborhood intuitionistic fuzzy entropy is defined. Finally, formulas for the external and internal significance of features are given, and a multilabel feature selection algorithm based on neighborhood intuitionistic fuzzy entropy is designed to select the optimal feature subset. Experimental results on nine multilabel datasets with the multilabel K-nearest neighbor classifier show that the feature subset selected by the proposed algorithm achieves good classification performance.
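
The center-shift preprocessing step can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption rather than the authors' method: Euclidean distance, a per-label split into positive and negative samples, a Fisher score defined as between-class scatter over within-class scatter and summed over labels, the default coefficient `coef=0.8`, and the function names `effective_samples` and `center_shift_fisher_score`.

```python
from math import dist


def column_means(rows):
    """Per-feature mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def effective_samples(rows, coef=0.8):
    """Drop edge samples (assumed rule): keep rows whose Euclidean distance
    to the original center is at most coef * (distance of the farthest row)."""
    center = column_means(rows)
    d = [dist(r, center) for r in rows]
    radius = coef * max(d)
    kept = [r for r, di in zip(rows, d) if di <= radius]
    return kept or rows  # fall back to all rows if the filter removes everything


def center_shift_fisher_score(X, Y, coef=0.8, eps=1e-12):
    """Assumed scoring: for each label, split samples into positive and
    negative groups, trim edge samples, then add between-class scatter
    divided by within-class scatter to each feature's score."""
    n_features = len(X[0])
    scores = [0.0] * n_features
    for j in range(len(Y[0])):
        groups = [[x for x, y in zip(X, Y) if y[j] == v] for v in (1, 0)]
        if not all(groups):
            continue  # a label with a single class carries no Fisher information
        groups = [effective_samples(g, coef) for g in groups]
        # Centers recomputed on the trimmed sets play the role of shifted centers.
        centers = [column_means(g) for g in groups]
        overall = column_means([x for g in groups for x in g])
        for f in range(n_features):
            between = sum(len(g) * (c[f] - overall[f]) ** 2
                          for g, c in zip(groups, centers))
            within = sum((x[f] - c[f]) ** 2
                         for g, c in zip(groups, centers) for x in g)
            scores[f] += between / (within + eps)
    return scores
```

Calling `center_shift_fisher_score(X, Y)` with `X` as a list of feature vectors and `Y` as a list of 0/1 label vectors returns one score per feature. Under these assumptions, a higher score suggests a feature that separates the trimmed classes more cleanly.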

Multilabel learning; Feature selection; Fisher score; Multilabel fuzzy neighborhood rough sets; Neighborhood intuitionistic fuzzy entropy

SUN Lin, MA Tianjiao


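College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457, China

College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007, China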


National Natural Science Foundation of China (62076089, 61772176)

2024

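Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)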


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(7)