Study on the prediction model for asthma risk with integration of various machine learning algorithms

Objective To establish a prediction model for asthma onset risk by integrating four machine learning algorithms, and to provide a basis for health meteorological forecast services and public prevention. Methods Daily visit data for asthma patients at a grade A tertiary hospital in Tianjin from 2012 to 2018 were collected and collated, together with meteorological, environmental, and pollen data for the same period. Principal component analysis was used to select the optimal factors, and the Stacking ensemble learning method was used to integrate four machine learning algorithms: Decision Tree, Random Forest, XGBoost, and LightGBM. Model performance was optimized by tuning the risk-level thresholds, applying time lags, and modeling each season separately. Results Random Forest predicted better than Decision Tree, XGBoost, and LightGBM. Compared with the Random Forest model alone, the ensemble built on the four sub-models improved forecasting ability for the "prone" (易发) and "frequent" (多发) risk grades by about 13%. With a time lag of 2-3 days and separate models for each season, predictive ability improved further. Conclusion A multi-model ensemble method that jointly considers meteorological, environmental, and pollen factors can be applied to operational meteorological forecasting and services for asthma.
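The abstract reports that a time lag of 2-3 days improved prediction, but it does not say how the lagged inputs were constructed. The following is only a minimal sketch of one common way to build such inputs, assuming consecutive daily records with no gaps. The function name `make_lagged_rows`, the field names `temp_c` and `pm25`, and all values are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence


def make_lagged_rows(
    predictors: Sequence[Dict[str, float]],
    visits: Sequence[int],
    lags: Sequence[int] = (2, 3),
) -> List[Dict[str, float]]:
    """Pair each day's visit count with predictor values from earlier days.

    predictors[i] and visits[i] describe the same calendar day, and days are
    assumed to be consecutive with no gaps.
    """
    if len(predictors) != len(visits):
        raise ValueError("predictors and visits must have the same length")
    max_lag = max(lags)
    rows = []
    # The first max_lag days have no complete lag history, so they are skipped.
    for day in range(max_lag, len(visits)):
        row = {"visits": float(visits[day])}
        for lag in lags:
            # Copy every predictor observed `lag` days before the target day.
            for name, value in predictors[day - lag].items():
                row[f"{name}_lag{lag}"] = value
        rows.append(row)
    return rows


# Hypothetical five-day series; the values are made up for illustration.
daily = [
    {"temp_c": 10.0, "pm25": 80.0},
    {"temp_c": 12.0, "pm25": 60.0},
    {"temp_c": 9.0, "pm25": 95.0},
    {"temp_c": 7.0, "pm25": 110.0},
    {"temp_c": 11.0, "pm25": 70.0},
]
visit_counts = [14, 11, 17, 21, 13]

rows = make_lagged_rows(daily, visit_counts, lags=(2, 3))
print(rows[0])
```

Each output row could then be passed to any classifier or ensemble. Dropping the first few days rather than padding them avoids feeding the model invented lag values.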

Keywords: asthma; meteorological factor; environmental factor; machine learning

Zhang Qing (张庆), Duan Liyao (段丽瑶), Liu Yanxiang (柳艳香), Jiang Ping (蒋萍), Chen Zixuan (陈子煊), Liu Bo (刘博)

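Tianjin Meteorological Observatory, Tianjin 300074

Public Meteorological Service Center, China Meteorological Administration

Tianjin First Central Hospital

Tianjin Early Warning Information Release Center for Public Emergencies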

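Funding: Key Project of the Innovation Fund of the Public Meteorological Service Center, China Meteorological Administration (K2021005)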

2024

Journal of Environmental Hygiene (环境卫生学杂志)
Chinese Center for Disease Control and Prevention

CSTPCD
Impact factor: 0.735
ISSN: 2095-1906
Year, volume (issue): 2024, 14(2)