Feature Screening for Ultra-High Dimensional Discriminant Data Based on the Gini Correlation Coefficient

This paper proposes a model-free feature screening method, based on the Gini correlation coefficient, for screening continuous features in ultra-high dimensional discriminant (classification) data. The method is also extended to the case where the response variable is continuous and the predictors are discrete. Under certain regularity conditions, the sure screening property and the ranking consistency property are established. Monte Carlo simulations and a real-data analysis verify the effectiveness of the screening method. The study provides a new approach to feature screening for ultra-high dimensional data and extends the application of the concept of independence in probability and statistics.
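The abstract does not state the screening statistic explicitly, so the following is only a rough sketch under stated assumptions, not the authors' procedure. It assumes one common form of the Gini correlation between a continuous feature X and a categorical label Y, namely (Δ − Σ_k p_k Δ_k) / Δ, where Δ is the Gini mean difference of X and Δ_k is the Gini mean difference within class k. The function names `gini_mean_difference`, `gini_correlation`, and `screen_features`, and the choice to keep the top `d` features, are hypothetical.

```python
import numpy as np


def gini_mean_difference(x):
    """Sample Gini mean difference: mean of |x_i - x_j| over all pairs i != j."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if n < 2:
        return 0.0
    # For sorted data, sum_{i<j} (x_j - x_i) = sum_i (2i - n + 1) * x_i, i = 0..n-1.
    weights = 2.0 * np.arange(n) - n + 1
    return 2.0 * np.dot(weights, x) / (n * (n - 1))


def gini_correlation(x, y):
    """Assumed Gini correlation between continuous x and categorical y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    total = gini_mean_difference(x)
    if total == 0.0:
        return 0.0  # constant feature carries no information about y
    within = 0.0
    for label in np.unique(y):
        group = x[y == label]
        # Weight each within-class dispersion by the class proportion.
        within += (group.size / x.size) * gini_mean_difference(group)
    return (total - within) / total


def screen_features(X, y, d):
    """Rank the columns of X by their score and return the top-d indices and all scores."""
    X = np.asarray(X, dtype=float)
    scores = np.array([gini_correlation(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(-scores)  # descending by score
    return order[:d], scores
```

Each feature is scored marginally, so the cost grows linearly in the number of features, which is what makes this kind of screening feasible when the dimension far exceeds the sample size. How many features to retain (`d`) is left to the user here; the abstract does not say how the paper chooses it.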

Keywords: ultra-high dimensional data; feature screening; Gini correlation coefficient

Song Fengli (宋凤丽), Sun Wei (孙威)

School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044, China

2024

Journal of Applied Statistics and Management (数理统计与管理)
Chinese Association for Applied Statistics (中国现场统计研究会)


Indexed in: CSTPCD, CSSCI, CHSSCD, Peking University Core Journals (北大核心)
Impact factor: 1.114
ISSN: 1002-1566
Year, volume (issue): 2024, 43(6)