Establishment of machine learning-based risk prediction model for acute kidney injury in acute myocardial infarction patients and compared with traditional model
Ye Nan (叶楠) 1, Zhu Chuang (祝闯) 2, Xu Fengbo (徐丰博) 1, Cheng Hong (程虹) 1, Sun Yuling (孙玉玲)
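Author information
1. Department of Nephrology, Beijing Anzhen Hospital, Capital Medical University, Beijing 100029
2. School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876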
Abstract
Objective To establish a machine learning-based risk prediction model for acute kidney injury (AKI) in patients with acute myocardial infarction (AMI) and to compare it with a traditional logistic regression model.
Methods This was a retrospective study. Demographic data, laboratory results, treatment regimens and medication records were collected for AMI patients at Beijing Anzhen Hospital, Capital Medical University, from July 2011 to December 2016. AKI was diagnosed according to the AKI guideline published by Kidney Disease: Improving Global Outcomes (KDIGO) in 2012. Enrolled AMI patients were divided by simple random sampling into a training set (70%) and an internal test set (30%). SelectFromModel and Lasso regression were used to select important features as predictors of AKI in AMI patients. Risk prediction models were then built with logistic regression (model A) and with a machine learning algorithm (model B). The DeLong method was used to compare the areas under the receiver operating characteristic (ROC) curve (AUC) of model A and model B in the test set and to select the best model.
Results A total of 6 014 AMI patients were included, aged (58.4±11.7) years; 3 414 (80.5%) were male and 674 (11.2%) had AKI. The training set comprised 4 252 patients (70.7%) and the test set 1 762 patients (29.3%). The 12 clinical parameters selected by SelectFromModel and Lasso regression were the number of myocardial infarctions, ST-segment elevation myocardial infarction, ventricular tachycardia, third-degree atrioventricular block, decompensated heart failure at admission, admission serum creatinine, admission blood urea nitrogen, peak creatine kinase isoenzyme, diuretic use, maximum daily dose of diuretics, days of diuretic use and statin use. In the test set, the logistic regression model achieved an AUC of 0.80 (95% CI 0.76-0.84) and the machine learning model an AUC of 0.82 (95% CI 0.78-0.85). The difference in AUC between the two models was not statistically significant (Z=0.858, P=0.363), although the AUC of the machine learning model was slightly higher than that of the traditional logistic regression model.
Conclusions The machine learning-based model for predicting AKI risk in AMI patients performed similarly to the traditional logistic regression model, with a trend toward better performance. Introducing machine learning models may improve the ability to predict AKI risk in AMI patients.
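The workflow described in Methods can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the cohort is replaced by synthetic data from `make_classification`, the Lasso penalty `alpha=0.01` is arbitrary, and `GradientBoostingClassifier` stands in for model B because the abstract does not name the machine learning algorithm. The helper `delong_test` is a hypothetical name for a compact implementation of the DeLong comparison of two correlated AUCs.

```python
import math

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def delong_test(y_true, score_a, score_b):
    """Two-sided DeLong test for two correlated AUCs on the same test set.

    Hypothetical helper; returns (auc_a, auc_b, z, p).
    """
    y_true = np.asarray(y_true)
    aucs, v10, v01 = [], [], []
    for s in (np.asarray(score_a, float), np.asarray(score_b, float)):
        pos, neg = s[y_true == 1], s[y_true == 0]
        # psi[i, j] = 1 if positive i scores above negative j, 0.5 on ties, else 0
        psi = (pos[:, None] > neg[None, :]) + 0.5 * (pos[:, None] == neg[None, :])
        aucs.append(psi.mean())
        v10.append(psi.mean(axis=1))  # one component per positive case
        v01.append(psi.mean(axis=0))  # one component per negative case
    m, n = len(v10[0]), len(v01[0])
    # 2x2 covariance of the two AUC estimates
    cov = np.cov(np.vstack(v10)) / m + np.cov(np.vstack(v01)) / n
    var_diff = cov[0, 0] + cov[1, 1] - 2 * cov[0, 1]
    z = (aucs[0] - aucs[1]) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return aucs[0], aucs[1], z, p


# Synthetic stand-in for the cohort (assumption): about 11% positive cases.
X, y = make_classification(n_samples=6000, n_features=40, n_informative=12,
                           n_clusters_per_class=1, weights=[0.89],
                           random_state=0)

# Simple random 70/30 split into training and internal test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)

# Fit the scaler on the training set only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Lasso shrinks weak coefficients to zero; SelectFromModel keeps the rest.
selector = SelectFromModel(Lasso(alpha=0.01)).fit(X_train_s, y_train)
X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)

# Model A: logistic regression. Model B: gradient boosting (assumed stand-in).
model_a = LogisticRegression(max_iter=1000).fit(X_train_sel, y_train)
model_b = GradientBoostingClassifier(random_state=0).fit(X_train_sel, y_train)

prob_a = model_a.predict_proba(X_test_sel)[:, 1]
prob_b = model_b.predict_proba(X_test_sel)[:, 1]

auc_a, auc_b, z, p = delong_test(y_test, prob_a, prob_b)
print(f"selected features: {int(selector.get_support().sum())}")
print(f"AUC A={auc_a:.3f}, AUC B={auc_b:.3f}, Z={z:.3f}, P={p:.3f}")
```

On synthetic data the printed AUCs and P value carry no clinical meaning. The point is the order of operations: split first, fit the scaler and the feature selector on the training set only, then compare both models on the same held-out test set.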
Key words
Machine learning/Acute kidney injury/Myocardial infarction/Predictive model