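Machine learning for predicting mortality risk and screening clinical features in children with severe infection: a study based on multicenter cohort data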

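Background: Scientifically and effectively predicting the factors associated with death in children with severe infection is important for reducing child mortality. The relationship between illness severity and death in critically ill children has mostly been predicted with scoring systems (such as PCIS), with suboptimal accuracy. Objective: To use machine learning combined with feature screening to identify sensitive indicators that give early warning of mortality risk in children with severe infection. Design: Cohort study. Methods: Using the pediatric multicenter infectious diseases collaboration network database, which covers 54 PICUs in 20 provincial-level administrative regions of China, we included children aged >28 days to 18 years with confirmed infection and dysfunction of at least one organ, and collected 122 clinical features. With death/deterioration versus cure/improvement at PICU discharge as the outcome, logistic regression (LR), random forest (RF), extreme gradient boosting tree (XGB) and backpropagation neural network (BP) models were built by machine learning, and important clinical features were screened to establish a mortality risk prediction model for children with severe infection. Main outcome measures: The area under the receiver operating characteristic curve (AUROC) of each model and how well each model screened clinical features. Results: From April 1, 2022 to December 31, 2023, 1 738 cases in the network database had severe infection confirmed at PICU admission and complete clinical feature records at PICU admission, at 24 h after PICU admission and at PICU discharge. After data preprocessing (outlier handling, missing value imputation, mandatory value range checks and normalization), all 1 738 records were used for model building. In total, 1 396 children survived or improved and 342 (19.6%) died or deteriorated. The cohort was split 4∶1 into a training set (1 390 records) and a validation set (348 records). The training set contained 1 116 children who survived or improved and 274 who died or deteriorated; the validation set contained 280 and 68, respectively. All 122 clinical features were input into the models in the training set. After model learning and feature screening, under 50 rounds of 5-fold stratified cross-validation, the validation-set AUROC of LR, RF and XGB ranged from 0.74 to 0.78. For LR, RF and XGB, features with importance greater than the mean were selected to form the optimal clinical feature set; there is as yet no good method for measuring feature importance in BP. The optimal feature set built by LR was closer to clinical expectations than those built by RF and XGB. Conclusion: Machine learning performed only moderately in predicting death/deterioration in children with severe infectious diseases, and the clinical features screened by the prediction models still fall short of clinical expectations.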
Mortality risk predicting and clinical feature screening of children with severe infection by machine learning based on multicenter cohort data
Background: It is of great significance to predict mortality in children with severe infection scientifically and effectively. In the past, the relationship between illness and death in critically ill children was mostly predicted by scores such as the Pediatric Critical Illness Score (PCIS), with poor accuracy. Objective: To explore sensitive indicators for the early warning of death in children with severe infection by machine learning combined with feature screening. Design: Cohort study. Methods: We conducted the cohort study based on the pediatric Multi-center Infectious Diseases Collaboration Network database of 54 PICUs in 20 provincial administrative regions of China. In total, 122 clinical features in 11 clinical dimensions were collected from children aged >28 days to 18 years with confirmed infection and at least one organ dysfunction. A risk prediction model for mortality in critically ill children with infections was established by constructing logistic regression (LR), random forest (RF), extreme gradient boosting tree (XGB) and backpropagation neural network (BP) models through machine learning techniques and by screening important clinical features. Main outcome measures: AUROC and the performance of the models in screening clinical features. Results: From April 1, 2022 to December 31, 2023, there were 1 738 cases of severe infection with complete records at PICU admission, at 24 h after PICU admission and at discharge from the PICU, of whom 1 396 survived or improved and 342 (19.6%) died or deteriorated. After data preprocessing (outlier processing, missing value filling, mandatory value range checking and normalization), 1 738 records were entered into machine learning to build the models. At a ratio of 4∶1, 1 390 patients were assigned to the training set and 348 to the validation set. In the training set, 1 116 patients survived or improved and 274 died or deteriorated; in the validation set, 280 patients survived or improved and 68 died or deteriorated. In the training set, a total of 122 clinical features were input. After machine learning and feature screening, the AUROC of LR, RF and XGB ranged from 0.74 to 0.78 in the validation set under 50 rounds of 5-fold stratified cross-validation. Features with importance greater than the mean value were selected to construct the optimal clinical feature sets in the LR, RF and XGB models. At present, there is no good method to measure feature importance in BP. The clinical features selected by the LR model were closer to clinical expectations than those selected by RF and XGB. Conclusion: Machine learning performed only moderately in predicting death from severe infectious diseases in children, and the clinical features screened by the predictive models are still far from clinical expectations.
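
The abstract names three generic steps: a stratified 4∶1 split, selection of features whose importance exceeds the mean, and evaluation by AUROC. The following is a minimal standard-library sketch of those steps, not the authors' code. The function names (`stratified_split`, `select_above_mean`, `auroc`) and the toy importance values are hypothetical and chosen only for illustration. Only the class counts (1 396 and 342) come from the abstract. The rounding rule in the split is an assumption, so the resulting set sizes may differ by a record or so from the reported 1 390 and 348.

```python
import random
from statistics import mean


def stratified_split(labels, valid_fraction=0.2, seed=0):
    """Split indices so that each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_valid = round(len(idx) * valid_fraction)  # assumed rounding rule
        valid.extend(idx[:n_valid])
        train.extend(idx[n_valid:])
    return train, valid


def select_above_mean(importances):
    """Keep the names of features whose importance exceeds the mean importance."""
    threshold = mean(importances.values())
    return [name for name, value in importances.items() if value > threshold]


def auroc(y_true, scores):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Outcome labels with the class counts reported in the abstract:
# 0 = survived or improved, 1 = died or deteriorated.
labels = [0] * 1396 + [1] * 342
train_idx, valid_idx = stratified_split(labels)
valid_event_rate = sum(labels[i] for i in valid_idx) / len(valid_idx)

# Hypothetical importance values, invented for illustration only.
toy_importances = {
    "feature_a": 0.50,
    "feature_b": 0.30,
    "feature_c": 0.15,
    "feature_d": 0.05,
}
selected = select_above_mean(toy_importances)

# Small hand-checkable AUROC example (three of four positive/negative pairs ranked correctly).
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In practice a library such as scikit-learn would normally handle the split, repeated stratified cross-validation and importance-based selection; the plain-Python version is used here only to keep the logic visible.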

Machine learning; Pediatric intensive care unit; Infection; Random forest model; Extreme gradient boosting tree

朱雪梅、陈申成、章莹莹、陆国平、叶琪、阮彤、郑英杰

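Department of Critical Care Medicine, Children's Hospital of Fudan University, Shanghai 201102

School of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237

Department of Epidemiology, School of Public Health, Fudan University, Shanghai 200032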

Machine learning; Pediatric intensive care unit; Infection; Random forest model; Extreme gradient boosting tree

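National Key Research and Development Program of China (2021YFC2701800, 2021YFC2701801, 2021YFC2701805); Shanghai Municipal Health System Key Supported Discipline Construction Project (2023ZDFC0103); Shanghai Municipal Science and Technology Major Project (ZD2021CY001)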

2024

Chinese Journal of Evidence-Based Pediatrics
Fudan University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.949
ISSN: 1673-5501
Year, volume (issue): 2024, 19(1)