Value of Three Machine Learning Algorithms in Predicting Death from Heart Failure
Objective: To build heart failure classification and prediction models with three machine learning algorithms, compare the accuracy of the models, and identify the features most important for heart failure death events, in order to support early detection and intervention and to help improve health and quality of life.
Methods: Using the heart failure data set published on the Kaggle platform, the data were preprocessed by missing-value imputation, standardization, and SMOTE. Heart failure prediction models were built with the random forest, C4.5, and AdaBoost algorithms. Model performance was evaluated with the confusion matrix, ROC curve, root mean square error, and mean absolute error.
Results: In the variable importance ranking given by PermutationImportance, serum creatinine level, age, and serum sodium level ranked highest. The random forest model had an accuracy of 85%, a precision of 81%, and a recall of 68%; the C4.5 model had an accuracy of 83%, a precision of 80%, and a recall of 63%; the AdaBoost model had an accuracy of 80%, a precision of 71%, and a recall of 63%.
Conclusion: On the data set used, the random forest model outperformed the C4.5 and AdaBoost models. The heart failure death risk prediction model can serve as a reference for the early prevention, control, and diagnosis of heart failure.
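The accuracy, precision, and recall figures reported for each model all derive from the confusion matrix. As a minimal sketch of that relationship, the following pure-Python example computes the three metrics from predicted and true labels. The function names `confusion_counts` and `classification_metrics` and the toy labels are illustrative assumptions, not the study's code or data; label 1 is assumed to mark a death event.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # death event correctly predicted
        elif pred == positive:
            fp += 1  # predicted death, patient survived
        elif truth == positive:
            fn += 1  # missed death event
        else:
            tn += 1  # survival correctly predicted
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels for illustration only (hypothetical, not taken from the data set).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recall is the metric that counts missed death events, so a model with high accuracy but low recall, as in the results above, still misses a substantial share of the patients who died.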