A new feature screening method for ultra-high-dimensional survival data based on projection correlation
潘莹丽 1, 葛翔宇 2, 周艳丽 3
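Author information
- 1. School of Mathematics and Statistics, Hubei University; Hubei Key Laboratory of Applied Mathematics, Wuhan 430062
- 2. School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan 430073
- 3. School of Finance, Zhongnan University of Economics and Law, Wuhan 430073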
Abstract
This paper proposes a new feature screening method for ultra-high-dimensional right-censored survival data, based on projection correlation and possessing the sure independent screening property (projection correlation sure independent screening, PC-SIS). On the one hand, the PC-SIS method is model-free, does not require nonparametric estimation of the survival function, and is insensitive to moment conditions and sub-exponential conditions, which makes it suitable for analyzing data with outliers or heavy tails. On the other hand, under certain regularity conditions, the PC-SIS method has the sure screening and rank consistency properties. Simulation and empirical studies show that the PC-SIS method can discard features that are only weakly related to the response variable while retaining all important features, thereby achieving dimension reduction.
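The general screening idea described in the abstract (score each feature by a marginal dependence measure, then keep the top-ranked ones) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the PC-SIS procedure itself: the utility here is the absolute Spearman rank correlation, used as a stand-in for projection correlation; the response is treated as fully observed, so right censoring is ignored; and the cutoff d = floor(n / log n) is a common convention in the screening literature rather than something taken from this page. The names `ranks`, `abs_rank_corr`, and `screen` are hypothetical.

```python
import math
import random


def ranks(values):
    """Return 0-based ranks of `values` (ties broken by position)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0.0] * len(values)
    for rank, idx in enumerate(order):
        r[idx] = float(rank)
    return r


def abs_rank_corr(x, y_ranks):
    """Absolute Spearman correlation; a STAND-IN utility, not projection correlation."""
    rx = ranks(x)
    n = len(rx)
    mean = (n - 1) / 2.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, y_ranks))
    # Both rank vectors are permutations of 0..n-1, so they share this variance term.
    var = sum((a - mean) ** 2 for a in rx)
    return abs(cov / var)


def screen(columns, y, d=None):
    """Rank features by marginal utility and keep the indices of the top d."""
    n = len(y)
    if d is None:
        d = int(n / math.log(n))  # assumed cutoff, a common convention
    y_ranks = ranks(y)
    scores = [abs_rank_corr(col, y_ranks) for col in columns]
    keep = sorted(range(len(columns)), key=lambda j: -scores[j])[:d]
    return keep, scores


# Synthetic, uncensored demo data: two active features and Cauchy (heavy-tailed) noise.
random.seed(0)
n, p = 200, 500
columns = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
noise = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]
y = [3 * columns[0][i] - 3 * columns[1][i] + noise[i] for i in range(n)]

keep, scores = screen(columns, y)
print(len(keep), 0 in keep, 1 in keep)
```

A rank-based utility is used in the sketch because, like the method described above, it does not depend on moment conditions of the response; handling censoring and replacing the utility with projection correlation would be needed to approximate the actual method.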
Key words
projection correlation / rank consistency / sure screening / survival data / ultra-high dimensionality
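Funding
National Natural Science Foundation of China (11901175)
National Natural Science Foundation of China (71974204)
National Natural Science Foundation of China (71901222)
National Social Science Fund of China (20&ZD132)
Fundamental Research Funds for the Central Universities, Zhongnan University of Economics and Law (2722022AK001)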
Publication year
2024