Variable Screening and Selection for Ultra-high Dimensional Additive Quantile Regression with Missing Data
We propose an effective iterative screening method for ultra-high dimensional additive quantile regression with missing data. Specifically, canonical correlation analysis is introduced into the maximum correlation coefficient based on the optimal transformation, and covariates are ranked by their marginal contribution, measured by the maximum correlation coefficient between the optimally transformed covariates and the model residuals. After variable screening, a sparse-smooth penalty is used for further variable selection. The proposed variable selection method has three advantages: (1) the maximum correlation based on the optimal transformation reflects the nonlinear dependence of the response variable on the covariates more comprehensively; (2) during the iterations, the residuals carry information about the current model, which improves the accuracy of variable screening; (3) the variable screening process is separated from model estimation, so that redundant covariates are not included in the regression. Under appropriate conditions, we prove the sure independence screening property of the screening method and the sparsity and consistency of the estimator under the sparse-smooth penalty. Finally, the performance of the proposed method is assessed through Monte Carlo simulations, and rat genome data are used to illustrate its effectiveness.
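The marginal screening step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a cubic polynomial basis in place of whatever basis the paper uses for the optimal transformation, it ignores missing data, and the names `poly_basis`, `max_corr`, and `screen` are hypothetical. The maximum correlation between a covariate and the residual is approximated by the first canonical correlation between their basis expansions.

```python
import numpy as np

def poly_basis(v, degree=3):
    # Assumption: a centered polynomial basis stands in for the
    # transformation basis; the paper's actual choice may differ.
    z = (v - v.mean()) / v.std()
    B = np.column_stack([z ** k for k in range(1, degree + 1)])
    return B - B.mean(axis=0)

def max_corr(x, r, degree=3):
    # First canonical correlation between basis(x) and basis(r):
    # the largest singular value of Qx^T Qr, where Qx and Qr are
    # orthonormal bases of the two centered column spaces.
    qx, _ = np.linalg.qr(poly_basis(x, degree))
    qr_, _ = np.linalg.qr(poly_basis(r, degree))
    s = np.linalg.svd(qx.T @ qr_, compute_uv=False)
    return float(min(s[0], 1.0))  # guard against rounding above 1

def screen(X, resid, keep):
    # Rank covariates by approximate maximum correlation with the
    # current residuals and return the indices of the top `keep`.
    scores = np.array([max_corr(X[:, j], resid) for j in range(X.shape[1])])
    order = np.argsort(scores)[::-1]
    return order[:keep], scores

# Toy data: the response depends on the first covariate only through
# its square, a dependence that linear correlation largely misses.
rng = np.random.default_rng(0)
n, p = 400, 50
X = rng.normal(size=(n, p))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=n)

# First pass: the response itself plays the role of the residual.
top, scores = screen(X, y, keep=5)
```

In an iterative scheme of the kind the abstract describes, one would presumably fit the model on the retained covariates, recompute the residuals, and call `screen` again on the remaining covariates; that refitting step is omitted here.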