A Time Series Data Dimensionality Reduction-Based Death Risk Prediction Model in Sepsis
Most existing death risk prediction models for septic patients require routine blood test data, which results in a large number of input features and a complex data collection process. To solve this problem, an improved wrapper feature selection method and SD2V-XGBoost, a prediction model based on LSTM and XGBoost, are proposed; together they predict the death risk of septic patients from only a small number of real-time clinical features. First, the improved wrapper feature selection method selects the features that are highly correlated with the risk of death. Second, LSTM neurons map each patient's time series data into a vector. Finally, the vector output by the LSTM network and the statistical characteristics of the patient's time series data are used as the input to XGBoost, which predicts the risk of death. Experiments are carried out on the public MIMIC-III dataset. In terms of the number of input features, the SD2V-XGBoost model uses 71% fewer input features than the existing model while maintaining prediction performance. In terms of prediction performance, when only real-time clinical features are used, SD2V-XGBoost achieves an AUROC (area under the receiver operating characteristic curve) of 0.8521, an AUPRC (area under the precision-recall curve) of 0.6320, and a true positive rate of 72.15%, all of which are better than those of the LSTM, XGBoost, and random forest models.
long short-term memory; death risk prediction; time series data processing; feature selection; XGBoost model
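The three evaluation metrics named in the abstract (AUROC, AUPRC, and true positive rate) can be illustrated with a minimal sketch in plain Python. This is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and AUPRC is approximated here by average precision, which is one common estimator and may differ from the one used in the paper.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of the precision values measured at the rank
    of each positive case. Assumes untied scores for simplicity."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("need at least one positive label")
    return sum(precisions) / len(precisions)


def true_positive_rate(labels: Sequence[int], scores: Sequence[float],
                       threshold: float = 0.5) -> float:
    """Fraction of actual positives whose score reaches the threshold."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    if not pos:
        raise ValueError("need at least one positive label")
    return sum(1 for s in pos if s >= threshold) / len(pos)


# Hypothetical toy data: 1 = died, 0 = survived; scores are predicted risks.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]

print(f"AUROC             = {auroc(toy_labels, toy_scores):.4f}")
print(f"Average precision = {average_precision(toy_labels, toy_scores):.4f}")
print(f"TPR at 0.5        = {true_positive_rate(toy_labels, toy_scores):.4f}")
```

Note that AUROC and average precision depend only on how the cases are ranked, whereas the true positive rate depends on the chosen threshold, so a reported true positive rate is only comparable across models when the thresholding rule is the same.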