
Construction and validation of a risk prediction model for high altitude de-acclimatization syndrome

Objective To construct risk prediction models for high altitude de-acclimatization syndrome (HADAS) in people returning from high altitude to the plains using different machine learning algorithms, and to validate the predictive performance of these models.

Methods From November 2020 to February 2024, field or online questionnaires were administered to individuals who had ended a period of high-altitude living and returned to the plains. Basic information, chronic mountain sickness (CMS) status, de-acclimatization symptoms and other survey data were collected. After screening against the inclusion and exclusion criteria, 1 095 individuals were enrolled as the modeling group. A positive event was defined as a de-acclimatization symptom score > 5. The modeling group was randomly divided in a 7∶3 ratio into a training set (n=766) and an internal test set (n=329). Least absolute shrinkage and selection operator (LASSO) regression was used to select independent variables. HADAS prediction models were built with 8 machine learning methods: multiple factor logistic regression (LR), decision tree (DT), random forest (RF), eXtreme gradient boosting (XGB), support vector machine (SVM), K-nearest neighbor (KNN), light gradient boosting (LGB) and naïve Bayes (NB). The models were compared and internally tested using receiver operating characteristic (ROC) curves, calibration curves and confusion matrices. The final model was presented using a nomogram or a Shapley additive explanations (SHAP) plot. In August 2024, another 132 individuals who had returned to the plains after high-altitude living were screened and enrolled as the external validation group, on which the model was externally validated.

Results Of the 1 095 respondents, 549 (50.14%) had HADAS. LASSO regression selected CMS score, age and duration of high-altitude residence as predictors. Among the models built with the 8 machine learning algorithms, the LR model performed best: its area under the ROC curve (AUC) was 0.819 (95%CI: 0.789~0.850) in the training set and 0.841 (95%CI: 0.799~0.884) in the internal test set, and its F1 score in the internal test set was 0.801. Both the AUC and the F1 score of the LR model were the largest among the 8 models in the internal test set. The Spiegelhalter Z test of the calibration curve of the LR model gave P=0.703 in the training set and P=0.281 in the internal test set. The AUC of the LR model in the external validation set was 0.867 (95%CI: 0.765~0.969).

Conclusion The LR model built with CMS score, age and duration of high-altitude residence as predictors had the best overall performance in the internal test set and showed good discrimination in the external validation set. The constructed nomogram is convenient to apply.
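The evaluation quantities named in the abstract (the 7∶3 split, AUC, F1 score and the Spiegelhalter Z test) can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function names, the 0.5 classification threshold, the random seed and the toy labels and probabilities are all assumptions made here, and the Spiegelhalter statistic is written in its commonly used form with a two-sided normal P value.

```python
import math
import random


def split_indices(n, train_frac=0.7, seed=0):
    """Randomly split range(n) into training and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary illustrative choice
    n_train = int(n * train_frac)     # floor of the training share
    return idx[:n_train], idx[n_train:]


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true, scores, threshold=0.5):
    """F1 = 2TP / (2TP + FP + FN) at an assumed probability threshold."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def spiegelhalter_z(y_true, probs):
    """Spiegelhalter Z statistic and two-sided P value for calibration.

    A large P value means no evidence of miscalibration was detected.
    """
    num = sum((y - p) * (1 - 2 * p) for y, p in zip(y_true, probs))
    var = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in probs)
    z = num / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# The sample size reported in the abstract reproduces the stated split sizes.
train_idx, test_idx = split_indices(1095)
print(len(train_idx), len(test_idx))  # 766 329

# Toy labels and predicted probabilities (made up, not study data).
y_toy = [0, 0, 1, 1]
p_toy = [0.1, 0.4, 0.35, 0.8]
print(auc(y_toy, p_toy), f1_score(y_toy, p_toy), spiegelhalter_z(y_toy, p_toy))
```

Read this way, the reported Spiegelhalter P values of 0.703 and 0.281 are both well above conventional significance levels, which is why they are taken as consistent with adequate calibration of the LR model.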

Keywords: prediction model; high altitude de-acclimatization syndrome; machine learning; nomogram

Ding Yu (丁宇), Wang Zejun (王泽军), Xie Jiaxin (谢佳新), Zhao Siyu (赵思雨), Zhang Gang (张钢)


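Department of Cold Region Medicine, Faculty of High Altitude Military Medicine, Army Medical University (Third Military Medical University), Chongqing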

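Key Laboratory of Extreme Environmental Medicine, Ministry of Education, Faculty of High Altitude Military Medicine, Army Medical University (Third Military Medical University), Chongqing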

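Department of High Altitude Operational Medicine, Faculty of High Altitude Military Medicine, Army Medical University (Third Military Medical University), Chongqing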


2025

Journal of Army Medical University (陆军军医大学学报)
Third Military Medical University


Indexed in the Peking University Core Journals list (北大核心)
Impact factor: 1.015
ISSN: 2097-0927
Year, volume (issue): 2025, 47(1)