Multi-label online stream feature selection based on high-dimensional correlation
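Zhu Liquan (朱礼全) 1, Lin Yaojin (林耀进) 1, Mao Yu (毛煜) 1, Cheng Yuxuan (程雨轩) 1
Author information
- 1. School of Computer Science, Minnan Normal University, Zhangzhou 363000, Fujian, China; Key Laboratory of Data Science and Intelligence Application, Fujian Province University, Zhangzhou 363000, Fujian, China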
Abstract
This paper proposes a multi-label online streaming feature selection algorithm based on high-dimensional correlation. The algorithm applies an equivalent mapping to the label space and constructs a weighted undirected graph over the resulting high-dimensional label space. Graph information and the Jaccard index are used to measure the high-dimensional weights between labels, and the high-dimensional correlation of the labels is then used to compute the significance of each newly arrived feature. The significance level of a new feature is judged against an iteratively updated mean significance. In addition, an online feature selection algorithm that balances global and local information is designed to dynamically optimize the selected feature subset: irrelevant features are filtered out by considering the global correlation between the selected features and the label space, and redundant features are removed by analyzing the local correlation among the selected features. Comparative experiments against six multi-label feature selection methods validate the effectiveness of the proposed algorithm.
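As a rough illustration of the label-weighting idea only, the sketch below builds a weighted undirected label graph in which each edge weight is the plain pairwise Jaccard index between the sample sets of two labels. This is an assumption made for illustration: the abstract does not give the paper's equivalent mapping or its exact high-dimensional weight formula, and the function name `jaccard_label_graph` and the toy label matrix `Y` are hypothetical.

```python
from itertools import combinations


def jaccard_label_graph(Y):
    """Return a symmetric weight matrix W for a binary label matrix Y.

    Y is a list of rows (one per sample) of 0/1 label indicators.
    W[a][b] is the Jaccard index between the sets of samples that carry
    label a and label b. Illustrative stand-in, not the paper's formula.
    """
    n_labels = len(Y[0])
    # Set of sample indices that carry each label.
    members = [{i for i, row in enumerate(Y) if row[k]} for k in range(n_labels)]
    W = [[0.0] * n_labels for _ in range(n_labels)]
    for a, b in combinations(range(n_labels), 2):
        union = members[a] | members[b]
        if union:  # leave the weight at 0.0 when neither label occurs
            W[a][b] = W[b][a] = len(members[a] & members[b]) / len(union)
    return W


# Hypothetical toy data: 4 samples, 3 labels.
Y = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
]
W = jaccard_label_graph(Y)
```

Here labels 0 and 1 co-occur on two of the four samples that carry either label, so their edge weight is 0.5, while labels 0 and 2 never co-occur and get weight 0.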
Key words
multi-label feature selection/online streaming feature/high-dimensional correlation/label weight
Funding
Natural Science Foundation of Fujian Province (2022J01914)
Publication year
2024