Chinese Journal of Critical Care Medicine (Electronic Edition) 2023, Vol. 16, Issue 5: 390-398. DOI: 10.3877/cma.j.issn.1674-6880.2023.05.007

Establishment and evaluation of a decision tree model for postoperative 30-day death risk in patients with cardiovascular events: based on the synthetic minority over-sampling technique algorithm

Chen Yongzhuang (陈永庄) 1, Mo Xiaoqiao (莫小乔) 2, Xie Tian (谢天) 3

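Author information

  • 1. Department of Anesthesiology, Yancheng First Hospital, Affiliated Hospital of Nanjing University Medical School, Yancheng 224000, Jiangsu, China
  • 2. Operating Room, Xinhua Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200092, China
  • 3. Department of General Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200001, China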

Abstract

Objective To establish a decision tree model, based on the synthetic minority over-sampling technique (SMOTE) algorithm, for predicting the risk of death within 30 days after surgery in surgical patients with cardiovascular events.

Methods Ethnic Chinese patients admitted to Singapore General Hospital for surgery from 2012 to 2016 were selected, and 3 086 surgical patients with cardiovascular events (a history of ischemic heart disease and/or congestive heart failure) were enrolled. Basic clinical information, comorbidities and surgery-related scores were extracted. The original data set was reconstructed with the SMOTE algorithm, and predictors were selected by best subset regression. The data set was divided into a training group and a validation group at a ratio of 7:3; the training group was used to build the decision tree risk prediction model and the validation group was used for internal validation.

Results The 30-day postoperative mortality rate was 3.0% (93/3 086), and the ICU admission rate at 24 h after surgery was 4.5% (140/3 086). Best subset regression identified seven risk factors for 30-day postoperative death in patients with cardiovascular events (all P<0.001): age >75 years [odds ratio (OR)=1.033, 95% confidence interval (CI) (1.024, 1.042)], anemia severity [OR=1.368, 95% CI (1.211, 1.546)], chronic kidney disease stage >2 [OR=1.381, 95% CI (1.277, 1.494)], preoperative blood transfusion [OR=4.496, 95% CI (3.268, 6.185)], emergency surgery [OR=3.344, 95% CI (2.752, 4.064)], red blood cell distribution width >15.7% [OR=2.097, 95% CI (1.658, 2.652)] and American Society of Anesthesiologists classification >2 [OR=3.362, 95% CI (2.734, 4.135)]. These seven predictors were used to build the decision tree model. In the training group, the area under the receiver operating characteristic curve was 0.853 [95% CI (0.837, 0.868), P<0.001], with a sensitivity of 0.765 and a specificity of 0.756. In the validation group, the area under the receiver operating characteristic curve was 0.858 [95% CI (0.834, 0.882), P<0.001], with a sensitivity of 0.938 and a specificity of 0.612, indicating good overall discrimination.

Conclusions Death within 30 days after surgery is a rare outcome in patients with cardiovascular events, which makes its prediction an imbalanced data classification problem. This study used the SMOTE algorithm, which is commonly applied to imbalanced data, to avoid the overfitting that arises when modeling rare events. The decision tree model is also visual, convenient and individualized, giving clinicians a practical prediction tool.
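The over-sampling step named in the Methods can be illustrated with a minimal sketch. The abstract does not say which software or settings the authors used, so the function name `smote`, the parameters `n_new`, `k` and `seed`, and the toy data below are all assumptions made for illustration, not the study's implementation. The sketch shows only the core idea of SMOTE: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Illustrative sketch only: handles continuous features, uses Euclidean
    distance, and is not the implementation used in the study.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Toy minority class, NOT the study data: (age in years, RDW in %)
deaths = [(78.0, 16.1), (82.0, 17.0), (69.0, 15.9), (75.0, 18.2)]
new_points = smote(deaths, n_new=6, k=2)
balanced_minority = deaths + new_points
print(len(balanced_minority))
```

Because every synthetic point is an interpolation, it stays inside the range spanned by the observed minority samples. Binary predictors such as emergency surgery or preoperative transfusion cannot be interpolated this way and would need a SMOTE variant designed for nominal features.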

Key words

Synthetic minority over-sampling technique algorithm/Postoperative death/Best subset regression/Predictive models/Decision tree model

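Publication year

2023
Chinese Journal of Critical Care Medicine (Electronic Edition)
Chinese Medical Association

Indexed in: CSTPCD, CSCD
Impact factor: 1.291
ISSN: 1674-6880
Number of references: 2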