Journal of Shandong University (Natural Science), 2024, Vol. 59, Issue 3: 61-70. DOI: 10.6040/j.issn.1671-9352.7.2023.1073

Hierarchical feature selection algorithm based on instance correlations

Shi Chunyu (史春雨), Mao Yu (毛煜), Liu Haoyang (刘浩阳), Lin Yaojin (林耀进)

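Author information

  • 1. School of Computer Science, Minnan Normal University, Zhangzhou 363000, Fujian, China; Key Laboratory of Data Science and Intelligence Application (Minnan Normal University), Zhangzhou 363000, Fujian, China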

Abstract

A hierarchical feature selection algorithm based on instance correlations (HFSIC) is proposed to further improve the performance of feature selection for hierarchical classification. After a sparse regularization term removes irrelevant features, the parent-child relationships in the class hierarchy are combined with the reconstruction relationships between samples in the feature space to learn the correlations among samples of the categories under the same subtree, and recursive regularization is used to optimize the output feature weight matrix. When measuring sample correlation, the reconstruction coefficient matrix is integrated into the training model, and the l2,1 norm is used to remove irrelevant and redundant features. The resulting optimization problem is solved with the accelerated proximal gradient method, and the proposed algorithm is evaluated under multiple metrics. Experimental results show that the proposed method outperforms the compared algorithms on five datasets, which verifies its effectiveness.
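The abstract names two standard building blocks, the l2,1 norm and the accelerated proximal gradient method, without giving the HFSIC objective itself. As a rough illustration only, the sketch below applies accelerated proximal gradient to a generic stand-in problem, least squares with an l2,1 penalty. It omits the recursive regularization and the reconstruction coefficient matrix, and the function names (`l21_norm`, `prox_l21`, `apg_l21`) and the parameter `lam` are hypothetical, not taken from the paper.

```python
import numpy as np


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def prox_l21(W, t):
    """Proximal operator of t * ||W||_{2,1}: row-wise soft-thresholding.

    Rows whose norm is at most t are set to zero, which is what
    discards a feature across all outputs at once.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * W


def apg_l21(X, Y, lam, n_iter=200):
    """Minimize 0.5 * ||X W - Y||_F^2 + lam * ||W||_{2,1}.

    Generic stand-in objective (an assumption), not the HFSIC model.
    """
    # Lipschitz constant of the gradient of the smooth term.
    L = max(np.linalg.norm(X, 2) ** 2, 1e-12)
    W = np.zeros((X.shape[1], Y.shape[1]))
    V = W.copy()  # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ V - Y)
        W_new = prox_l21(V - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        V = W_new + ((t - 1.0) / t_new) * (W_new - W)
        W, t = W_new, t_new
    return W
```

In a sketch like this, features would be ranked by the row norms of the returned `W`, and rows driven exactly to zero correspond to discarded features.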

Key words

feature selection/hierarchical structure/instance correlation/recursive regularization


Funding

National Natural Science Foundation of China (62076116)

Natural Science Foundation of Fujian Province (2022J01914)

Publication year

2024
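Journal of Shandong University (Natural Science)
Shandong University

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.437
ISSN: 1671-9352
Number of references: 23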