Overview of Research Advances in Knockoff Methods
Panxu Yuan (袁攀旭) 1, Gaorong Li (李高荣) 1
Author information
- 1. School of Statistics, Beijing Normal University, Beijing 100875, China
Abstract
With the rapid development of modern science and technology, the era of big data is upon us. In this setting, the reproducibility of statistical methods is pivotal for improving the rigor of scientific research. The knockoff procedure proposed by Barber and Candès [48] is a general variable selection algorithm that can leverage any feature importance score to discover true effects while rigorously controlling the false discovery rate (FDR). Its main idea is to construct synthetic variables, called knockoffs, that mimic the correlation structure among the original variables. Because it bypasses the computation of p-values entirely, the method has received much attention in recent years and has become one of the most active research areas in statistics and machine learning. This paper introduces recent research advances in the knockoff procedure and briefly discusses some possible future directions.
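To make the selection step concrete, the following is a minimal sketch of the data-dependent "knockoff+" threshold commonly associated with the knockoff filter. It assumes the knockoff variables have already been constructed and that a signed statistic W_j has been computed for each feature, with large positive values suggesting that the original variable is more important than its knockoff. The function name `knockoff_plus_select`, the input vector `W`, and the target level `q` are illustrative assumptions, not notation taken from the paper.

```python
import numpy as np


def knockoff_plus_select(W, q=0.1):
    """Return indices of features selected by the knockoff+ threshold.

    W : signed feature statistics (hypothetical input; one entry per feature).
    q : target FDR level.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds: the distinct nonzero magnitudes of W, ascending.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Estimated false discovery proportion at threshold t:
        # large negative statistics stand in for false positives.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            # Smallest threshold meeting the target: select features above it.
            return np.flatnonzero(W >= t)
    # No threshold meets the target level, so nothing is selected.
    return np.array([], dtype=int)
```

For example, under these assumptions, `knockoff_plus_select([3.0, -1.0, 2.0, 0.5, 4.0], q=0.5)` selects the features at indices 0, 2, 3 and 4, while a stricter level such as `q=0.1` selects nothing for the same input because no threshold brings the estimated proportion below the target.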
Key words
knockoff method/multiple hypothesis testing/false discovery rate/high-dimensional data/sparsity/variable selection/reproducibility
Funding
National Natural Science Foundation of China (12271046)
National Natural Science Foundation of China (12131006)
Publication year
2024