Value of Three Machine Learning Algorithms in Predicting Death from Heart Failure

Abstract: Objective: To build heart failure classification models with three machine learning algorithms, compare their accuracy, and identify the features most important for predicting heart failure death events, so as to support early detection and intervention and help improve health and quality of life. Methods: The heart failure dataset published on the Kaggle platform was preprocessed by missing-value imputation, standardization, and SMOTE. Prediction models were built with the random forest, C4.5, and AdaBoost algorithms. Model performance was evaluated with the confusion matrix, ROC curve, root mean square error, and mean absolute error. Results: In the variable importance ranking given by PermutationImportance, serum creatinine level, age, and serum sodium level ranked near the top. The random forest model reached an accuracy of 85%, a precision of 81%, and a recall of 68%; the C4.5 model reached an accuracy of 83%, a precision of 80%, and a recall of 63%; the AdaBoost model reached an accuracy of 80%, a precision of 71%, and a recall of 63%. Conclusion: On the dataset used, the random forest model outperformed the C4.5 and AdaBoost models. A heart failure death risk prediction model can provide a reference for the early prevention, control, and diagnosis of heart failure.
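
The evaluation measures named in the abstract (accuracy, precision, recall, root mean square error, mean absolute error) can all be computed from true and predicted labels. The following is a minimal sketch in plain Python, not the authors' code: the function names and the ten example labels are made up for illustration and do not come from the study's data.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # predicted death, patient died
        elif p == positive:
            fp += 1  # predicted death, patient survived
        elif t == positive:
            fn += 1  # predicted survival, patient died
        else:
            tn += 1  # predicted survival, patient survived
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall derived from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def rmse(y_true, y_pred):
    """Root mean square error between targets and predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# Hypothetical labels for illustration only (1 = death event, 0 = survived).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
print("RMSE:", rmse(y_true, y_pred), "MAE:", mae(y_true, y_pred))
```

Precision and recall share a numerator (true positives) but differ in the denominator, which is why a model can have high precision and noticeably lower recall. When the predictions are hard 0/1 labels, MAE equals the misclassification rate and RMSE is its square root, so both carry the same information as accuracy; they become more informative when computed on predicted probabilities.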

Keywords: Heart failure; Death; Prediction model; C4.5; Random forest; AdaBoost

Authors: Chen Xiaotong (陈晓彤), Cen Zixi (岑梓熹), Tan Jingyi (谭静仪), Luan Ya (栾雅), Peng Shishi (彭师师), Yan Bo (严波), He Zhen (何震)

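Author affiliations:

School of Health, Guangzhou Xinhua University (广州新华学院健康学院), Guangzhou, Guangdong 510310

School of Materials Engineering, Jiangsu University of Science and Technology (江苏科技大学材料工程学院), Zhenjiang, Jiangsu 215699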

Publication year: 2024

DOI: 10.3969/j.issn.1006-1959.2024.11.002

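Journal: 医学信息 (Medical Information)

Informatization Management Leading Group of the Ministry of Health; Chinese Medical Informatics Branch of the Chinese Institute of Electronics; Shaanxi Wenbo Institute of Biological Information Engineering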

Impact factor: 0.161

ISSN: 1006-1959

Year, volume (issue): 2024, 37(11)