Chinese Journal of Health Statistics (中国卫生统计), 2024, Vol. 41, Issue 1: 28-34. DOI: 10.11783/j.issn.1002-3674.2024.01.006

Study on the Risk Prediction Models of Overweight and Obesity in Medical Students

Chinese title: 基于三种预测模型构建医学生超重肥胖风险因素分析 (Risk Factor Analysis of Overweight and Obesity in Medical Students Based on Three Prediction Models)

Lu Xiaoyu (陆晓宇) 1, Jia Yuanli (贾苑吏) 2, Li Mengmeng (李萌萌) 1, Zhao Zekun (赵泽坤) 1, Cao Xiaoxiao (曹肖肖) 1, Fan Mengting (樊梦婷) 1, Xia Xin (夏鑫) 1, Cheng Li (成丽) 1, Xue Ling (薛玲) 3

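Author information

  • 1. School of Public Health, North China University of Science and Technology (华北理工大学) (postal code 063000)
  • 2. College of Science, North China University of Science and Technology
  • 3. School of Public Health, North China University of Science and Technology (postal code 063000); Hebei Provincial Key Laboratory of Coal Mine Health and Safety (河北省煤矿卫生与安全重点实验室)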

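Abstract (translated from the Chinese)

Objective To build logistic regression, random forest, and SVM models for predicting the factors that influence overweight and obesity in medical students, and to evaluate and compare the models' performance parameters in order to identify the optimal model for assessing overweight and obesity risk. Methods The participants were 1,866 medical students from a city in Hebei Province, surveyed from May to December 2020. Data related to overweight and obesity were collected and screened through a self-administered questionnaire, and the three risk assessment models (logistic regression, random forest, and SVM) were each built in Python. Results The accuracy of the logistic regression, random forest, and SVM models was 96.26%, 98.66%, and 98.13%, respectively; the specificity was 99.77%, 100%, and 99.08%, respectively; and the F1 scores were 0.85, 0.95, and 0.93, respectively. Random forest was the best prediction model. In the random forest model, subjective well-being, negative events, and students' economic status each carried a prediction weight above 10%. Conclusion The level of subjective well-being, the number of negative events, and students' economic status are the main factors influencing the incidence of overweight and obesity among medical students; the random forest model predicts better than logistic regression and SVM.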

Abstract

Objective To construct logistic regression, random forest, and SVM models to predict the influencing factors of overweight and obesity in medical students, and to compare the prediction performance of the three models, so as to obtain the optimal model for the risk assessment of overweight and obesity. Methods Participants were 1,866 medical students from a city in Hebei Province, surveyed from May to December 2020. Data relevant to overweight and obesity screening were collected through a self-administered questionnaire; the logistic regression, random forest, and SVM models were constructed in Python. Results On the test set, the accuracy of the logistic regression, random forest, and SVM models was 96.26%, 98.66%, and 98.13%, respectively; the specificity was 99.77%, 100%, and 99.00%, respectively; and the AUC was 0.88, 0.99, and 0.88, respectively. Random forest was the optimal prediction model; according to the random forest results, subjective well-being, negative events, and students' economic status each accounted for more than 10% of the weight in the model. Conclusion Subjective well-being, negative events, and students' economic status are the main factors affecting the incidence of overweight and obesity in medical students; the prediction performance of the random forest model was better than that of the logistic regression and SVM models.
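
The abstract reports accuracy, specificity, and F1 for each model. As a reading aid, the minimal sketch below shows how these three metrics follow from a binary confusion matrix. The counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the helper names (`ConfusionCounts`, `accuracy`, `specificity`, `f1_score`) are assumptions of this sketch rather than the authors' code.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Binary confusion matrix; 'positive' means overweight or obese."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def accuracy(c: ConfusionCounts) -> float:
    """Share of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of true negatives among all actual negatives."""
    return c.tn / (c.tn + c.fp)


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and recall, written in count form."""
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


# Hypothetical, imbalanced test set (NOT the paper's data):
# 58 positive cases and 442 negative cases.
counts = ConfusionCounts(tp=40, fp=2, tn=440, fn=18)

print(f"accuracy:    {accuracy(counts):.3f}")
print(f"specificity: {specificity(counts):.3f}")
print(f"F1:          {f1_score(counts):.3f}")
```

With these illustrative counts the accuracy is 0.96 while F1 is only 0.80, because the 18 missed positive cases barely move accuracy on an imbalanced sample but pull down recall. This may be one reason the models' reported F1 scores differ more visibly than their accuracies.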

Key words

Medical students/Overweight and obesity/Logistic regression/Random forest/Support vector machine

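Funding

Hebei Province Special Project for People's Livelihood Science and Technology (河北省民生科技专项项目), grant no. 20377718D

Publication year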

2024
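Chinese Journal of Health Statistics (中国卫生统计)
Chinese Health Information Association (中国卫生信息学会); China Medical University (中国医科大学)

Indexed in: CSTPCD, CSCD, Peking University Core Journals (北大核心)
Impact factor: 1.172
ISSN: 1002-3674
Number of references: 24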