Online multi-label feature selection based on sub-correlation features and neighborhood mutual information
Cheng Yuxuan (程雨轩) 1, Mao Yu (毛煜) 1, Zhang Xiaoqing (张小清) 1, Zeng Yixiang (曾艺祥) 1, Lin Yaojin (林耀进) 1
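Author information
- 1. School of Computer Science, Minnan Normal University, Zhangzhou, Fujian 363000, China; Key Laboratory of Data Science and Intelligence Application, Fujian Province University, Zhangzhou, Fujian 363000, China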
Abstract
To fully exploit features that single-metric algorithms overlook but that benefit classification, this paper proposes an online multi-label feature selection algorithm based on sub-correlation features and neighborhood mutual information. The importance and correlation of each newly arrived feature are computed, and the differences in significance are analyzed to divide features into salient features and sub-correlation features. Neighborhood interaction information is used to analyze the redundancy between each newly arrived feature and the selected feature set, and features with low dependency are eliminated, gradually improving the quality of the feature subset. A metric based on global linear and nonlinear relationships is constructed and used to compute the local correlation of features, which effectively mines sub-correlation features. To account for the sub-correlation features present in the feature space, they are separated from the feature set and stored separately, so that during the redundancy analysis stage they are not eliminated because of the salient features' high sensitivity to the metric. A feature selection criterion is established, and features are selected iteratively according to it. Experimental results show that the proposed algorithm is effective and stable.
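The abstract names neighborhood mutual information but does not give its formula. As a rough illustration only, the sketch below follows a definition commonly used in the neighborhood rough set literature, NMI(R; S) = -(1/n) Σ log(|δ_R(x_i)| · |δ_S(x_i)| / (n · |δ_{R∪S}(x_i)|)), where δ(x_i) is the set of samples within radius delta of x_i. The function names, the Chebyshev distance, and the radius value are assumptions made for this example, not the paper's implementation, and the salient/sub-correlation split and the selection criterion are not reproduced here.

```python
import numpy as np

def neighborhood_mask(X, delta):
    """Boolean n x n matrix: entry (i, j) is True when sample j lies within
    Chebyshev distance delta of sample i over the columns of X."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single feature as an n x 1 matrix
    dist = np.abs(X[:, None, :] - X[None, :, :]).max(axis=2)
    return dist <= delta

def neighborhood_entropy(X, delta):
    """NH(X) = -(1/n) * sum_i log(|delta_X(x_i)| / n)."""
    mask = neighborhood_mask(X, delta)
    n = mask.shape[0]
    return -np.mean(np.log(mask.sum(axis=1) / n))

def neighborhood_mutual_information(R, S, delta):
    """NMI(R; S) under the assumed definition. With Chebyshev distance the
    neighborhood in the joint space R ∪ S is the intersection of the two
    per-space neighborhoods, so a logical AND of the masks suffices."""
    mask_r = neighborhood_mask(R, delta)
    mask_s = neighborhood_mask(S, delta)
    joint = mask_r & mask_s  # never empty: each sample neighbors itself
    n = mask_r.shape[0]
    ratio = mask_r.sum(axis=1) * mask_s.sum(axis=1) / (n * joint.sum(axis=1))
    return -np.mean(np.log(ratio))

# Hypothetical toy data: 20 samples, a 2-column feature block and 1 new feature.
rng = np.random.default_rng(0)
selected = rng.random((20, 2))
new_feature = rng.random((20, 1))
score = neighborhood_mutual_information(new_feature, selected, delta=0.2)
```

Under this definition a constant feature shares no information with any other feature (its score is exactly zero), and the score of a feature block with itself equals its neighborhood entropy, which gives two quick sanity checks for an implementation.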
Key words
online feature selection/multi-label learning/neighborhood entropy/neighborhood mutual information/sub-correlation feature
Funding
Natural Science Foundation of Fujian Province (2022J01914)
Publication year
2024