Establishment and Evaluation of the Prediction Models of the Relationship between Cardiovascular and Cerebrovascular Diseases and Meteorological Factors
Objective To explore the relationship between the incidence of cardiovascular and cerebrovascular diseases and meteorological factors, and to predict the incidence risk levels of these diseases using machine learning methods, with the aim of providing a scientific basis for disease prevention and control.
Methods Patients with cardiovascular and cerebrovascular diseases, whose records were provided by the Guizhou Center for Disease Control and Prevention, were selected as subjects. The predictive factors were determined through correlation analysis, and prediction models for the risk of cardiovascular and cerebrovascular diseases were constructed with four machine learning methods: support vector machine, extreme gradient boosting, light gradient boosting machine, and random forest. The included patients were divided into a training set and a testing set at a ratio of 8:2. The training set was used for model training and parameter optimization, and the testing set was used to evaluate model performance. The predictive performance of each model was evaluated mainly by accuracy.
Results A total of 16 383 patients over 60 years of age with cardiovascular and cerebrovascular diseases were included in this study, including 6507 women. The daily case counts were unbalanced, and the diagnoses included acute myocardial infarction, stroke, angina pectoris, and sudden cardiac death. The number of daily cases was correlated with 26 meteorological factors in 3 categories (air pressure, air temperature, and humidity); it was positively correlated with air pressure and relative humidity and negatively correlated with air temperature. The GridSearchCV function was used to find the optimal weight ratio, the models were constructed with the machine learning methods, and the resulting model performance indices were verified on the testing set. The light gradient boosting machine model performed best in the prediction task, with an accuracy of 85.68%, a precision of 82.56%, a recall of 85.68%, and an F1 score of 79.56% (all P<0.05). The temperature 72 h before the onset of cardiovascular and cerebrovascular diseases had an INP value of 63 814 and was the most important meteorological factor affecting the number of daily cases. The temperatures 48 h and 24 h before the onset ranked second and third, with INP values of 62 002 and 43 216, respectively.
Conclusions The prediction models of cardiovascular and cerebrovascular diseases based on machine learning methods have high predictive value. Among them, the light gradient boosting machine model showed the best performance.
Cardiovascular and cerebrovascular disease; Meteorological factor; Machine learning; Prediction model
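The four performance indices reported in the Results (accuracy, precision, recall, F1 score) can be illustrated with a minimal sketch. The abstract does not state how the per-class values were averaged across risk levels; the sketch below assumes support-weighted averaging, under which weighted recall always equals accuracy, which would be consistent with the identical 85.68% reported for both. The function name `weighted_metrics`, the three risk labels, and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1.

    Assumption: per-class scores are averaged with weights equal to
    each class's share of the true labels (not stated in the source).
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    pairs = list(zip(y_true, y_pred))

    accuracy = sum(t == p for t, p in pairs) / n
    precision = recall = f1 = 0.0

    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        p_k = tp / predicted if predicted else 0.0  # class precision
        r_k = tp / count                            # class recall
        f_k = 2 * p_k * r_k / (p_k + r_k) if (p_k + r_k) else 0.0
        weight = count / n
        precision += weight * p_k
        recall += weight * r_k
        f1 += weight * f_k

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical, deliberately unbalanced risk levels for ten days.
y_true = ["low"] * 6 + ["medium"] * 3 + ["high"] * 1
y_pred = ["low"] * 6 + ["low", "low", "medium"] + ["low"]

metrics = weighted_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With unbalanced classes, as in the daily case counts described above, the weighted F1 score can fall well below accuracy because rarely predicted classes contribute low per-class scores; this is one reason to read accuracy alongside the other three indices.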