Interpretable machine learning-based prognostic model for severe chronic obstructive pulmonary disease
Objective To develop a machine learning (ML) model that predicts the risk of death in intensive care unit (ICU) patients with severe chronic obstructive pulmonary disease (COPD), to identify and explain the factors associated with that risk, and to address the "black box" problem of ML models.
Methods A total of 8,088 patients with severe COPD were selected from the eICU Collaborative Research Database (eICU-CRD), a multicenter critical care database from the United States. Data from the first 24 hours of each ICU admission were extracted and randomly split, with 70% used for model training and 30% for model validation. LASSO regression was used to select predictor variables and avoid overfitting. Five ML models were used to predict in-hospital mortality. The predictive performance of the five ML models and the APACHE Ⅳa score was compared using the area under the curve (AUC), and the SHAP (SHapley Additive exPlanations) method was used to explain the predictions of the random forest (RF) model.
Results The RF model performed best among the five ML models and the APACHE Ⅳa scoring system, with an AUC of 0.830 (95% CI 0.806-0.855). The SHAP method identified the 10 most important predictor variables, of which the minimum non-invasive systolic blood pressure ranked as the most important.
Conclusion Using ML to identify risk factors and the SHAP method to interpret the predictions allows early prediction of a patient's risk of death, which can help clinicians make accurate treatment plans and allocate medical resources rationally.
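The reported metric, an AUC with a 95% confidence interval, can be illustrated with a short sketch. The abstract does not say how the interval was obtained, so the percentile bootstrap used here is an assumption, not the authors' method. The function names (auc, bootstrap_auc_ci, make_synthetic) and the synthetic risk scores are hypothetical and serve only as an illustration; no eICU-CRD data are involved.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Equals the probability that a randomly chosen positive case (label 1)
    gets a higher score than a randomly chosen negative case (label 0);
    tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    auc(labels, scores)  # raises early if one class is missing entirely
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, AUC undefined: draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_i = max(0, int((alpha / 2) * n_boot))
    hi_i = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_i], stats[hi_i]


def make_synthetic(n=200, seed=1):
    """Hypothetical data: about 20% deaths, with higher average predicted risk."""
    rng = random.Random(seed)
    labels, scores = [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.2 else 0
        labels.append(y)
        scores.append(rng.gauss(0.6 if y else 0.4, 0.15))
    return labels, scores


if __name__ == "__main__":
    y_true, y_score = make_synthetic()
    point = auc(y_true, y_score)
    low, high = bootstrap_auc_ci(y_true, y_score)
    print(f"AUC {point:.3f} (95% CI {low:.3f}-{high:.3f})")
```

The pairwise comparison is quadratic in the number of cases, which is acceptable for a sketch. For a validation set of the size described above (30% of 8,088 patients), a sort-based AUC or a library routine would be the more practical choice.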
Keywords: Chronic obstructive pulmonary disease; machine learning; eICU Collaborative Research Database; mortality; SHAP (SHapley Additive exPlanations) method