Establishment and validation of a cardiovascular disease (CVD) high-risk prediction model for the physical examination population aged 35 to 75 in Urumqi
Objective To determine the detection rate of individuals at high risk of cardiovascular disease (CVD) among the physical examination population aged 35-75 in Urumqi, to establish a multivariate logistic regression model and a decision tree model for predicting high CVD risk, and to compare the predictive performance and accuracy of the two models, providing a reference for the prevention and control of CVD.
Methods From June 2023 to January 2024, 40 364 examinees aged 35 to 75 years were selected from the physical examination center of a Grade A tertiary hospital in Urumqi to complete a questionnaire survey and a physical examination, and the detection of individuals at high risk of CVD was analyzed. Multivariate logistic regression analysis and a decision tree model were used to establish CVD risk prediction models, and the predictive performance and accuracy of the models were compared.
Results Among the 40 364 examinees aged 35-75 in Urumqi, 10 858 were identified as being at high risk of CVD, a detection rate of 26.9%. Multivariate logistic regression analysis showed that age >40, working in the commercial service industry, being retired, smoking, and excessive or insufficient dietary meat intake were risk factors for being identified as at high risk of CVD among individuals aged 35-75. Female sex, an education level of junior high school or above, working as professional and technical personnel, administrative personnel, agricultural, forestry, animal husbandry and fishery production personnel, or production and transportation workers, and exercising once a week or more were protective factors. Body mass index, systolic blood pressure, diastolic blood pressure, hip circumference, waist circumference, fasting blood glucose, total cholesterol, triglycerides and low-density lipoprotein cholesterol were all higher in the high-risk population than in the non-high-risk population, whereas high-density lipoprotein cholesterol was lower; all differences were statistically significant (all P<0.001). Demographic characteristics (age, smoking habits and eating habits) and physical examination indicators (systolic blood pressure, diastolic blood pressure, fasting blood glucose, alanine aminotransferase, aspartate aminotransferase, total cholesterol, triglycerides, high-density lipoprotein and low-density lipoprotein) were used as independent variables. The multivariate logistic regression model and the decision tree model were each used to establish a CVD high-risk prediction model, and both models had good prediction accuracy. The AUC values of the multivariate logistic regression model and the decision tree model were both 0.867 (95% CI = 0.864-0.871, P<0.001).
Conclusion The detection rate of individuals at high risk of CVD among the physical examination population aged 35-75 in Urumqi is relatively high. Both the multivariate logistic regression model and the visualized decision tree model have good predictive performance. They can predict the likelihood of an individual becoming high-risk for cardiovascular disease in the future, provide a screening tool for identifying individuals at high risk of CVD, and guide the promotion of primordial prevention of cardiovascular disease.
cardiovascular disease (CVD); high-risk population; decision tree
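
As an illustration of the two summary statistics reported in the Results, the sketch below recomputes the detection rate from the stated counts and shows one common way an AUC can be computed, as the proportion of (high-risk, non-high-risk) pairs in which the high-risk individual receives the higher predicted score. This is not the authors' code: the function names are hypothetical, and the labels and scores passed to `auc_rank` are made-up toy values, not study data.

```python
def detection_rate(n_high_risk: int, n_total: int) -> float:
    """Proportion of examinees identified as high-risk."""
    return n_high_risk / n_total


def auc_rank(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    This pairwise form is O(n_pos * n_neg), fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Counts taken from the abstract: 10 858 high-risk among 40 364 examinees.
rate = detection_rate(10858, 40364)
print(f"detection rate: {rate:.1%}")

# Toy example only (hypothetical labels and predicted scores).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_rank(toy_labels, toy_scores)
print(f"toy AUC: {toy_auc:.2f}")
```

In the toy example, three of the four positive/negative pairs are ordered correctly, which gives an AUC of 0.75. The same pairwise definition applies to the predicted probabilities of either a logistic regression model or a decision tree.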