A machine learning model to predict the risk of liver dysfunction after hepatectomy in patients with hilar cholangiocarcinoma

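Chinese abstract (translated): Objective: To build, using machine learning algorithms, a risk prediction model for post-hepatectomy liver dysfunction (PHLD) in patients with hilar cholangiocarcinoma (HCCA). Methods: Clinical data of 203 HCCA patients who underwent open radical resection of HCCA combined with hemihepatectomy at Henan University People's Hospital between January 2017 and December 2023 were retrospectively analyzed (112 men and 91 women, aged 63 (55, 69) years). According to the diagnostic criteria for PHLD, patients were divided into a PHLD group (n=45) and a non-PHLD group (n=158). Clinical data including age, sex, neutrophil count (NEU), systemic immune-inflammation index (SII), prognostic nutritional index (PNI), neutrophil-to-lymphocyte ratio (NLR), operative time, and complications were compared between the two groups. Variables that differed significantly between the groups were entered into seven machine learning models: logistic regression, random forest, extreme gradient boosting, light gradient boosting, decision tree, Gaussian naive Bayes, and support vector machine. The best model was selected by the area under the receiver operating characteristic curve (AUC), and Shapley additive explanations (SHAP) were used to analyze and interpret the selected model. Results: The PHLD and non-PHLD groups differed significantly in age, preoperative biliary drainage for jaundice, preoperative albumin, total bilirubin on admission, preoperative aspartate aminotransferase, preoperative NEU, preoperative SII, preoperative PNI, preoperative NLR, operative time, Clavien-Dindo grade ≥ III complications, and the ratio of remnant liver volume to total liver volume (all P<0.05). The extreme gradient boosting model had the best predictive performance; in the test set its AUC was 0.888 (95% CI: 0.776-0.985), accuracy 0.854, sensitivity 0.506, specificity 0.965, F1 score 0.625, and Kappa 0.519. SHAP analysis of the extreme gradient boosting model identified total bilirubin on admission, operative time, Clavien-Dindo grade ≥ III complications, preoperative SII, and preoperative NEU as the five most important features, all positively correlated with the occurrence of PHLD in HCCA patients. Conclusion: The extreme gradient boosting model built in this study predicts PHLD in HCCA patients well, is stable, and shows good interpretability and clinical applicability.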
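The abstract names NLR, SII, and PNI without defining them. The sketch below assumes the conventional definitions found in the wider literature (NLR = neutrophils / lymphocytes; SII = platelets × neutrophils / lymphocytes; PNI = albumin in g/L + 5 × lymphocytes in 10^9/L); the abstract does not confirm that the authors used exactly these formulas. All function names and sample values are hypothetical.

```python
"""Sketch of three blood-count indices named in the abstract.

Assumed (not confirmed by the abstract) conventional definitions:
  NLR = neutrophils / lymphocytes
  SII = platelets * neutrophils / lymphocytes
  PNI = albumin [g/L] + 5 * lymphocytes [10^9/L]
Cell counts are expected in 10^9/L. Names and sample values are hypothetical.
"""


def _require_positive(name: str, value: float) -> None:
    # Guard against division by zero and physiologically impossible input.
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")


def nlr(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-to-lymphocyte ratio."""
    _require_positive("lymphocytes", lymphocytes)
    return neutrophils / lymphocytes


def sii(platelets: float, neutrophils: float, lymphocytes: float) -> float:
    """Systemic immune-inflammation index."""
    _require_positive("lymphocytes", lymphocytes)
    return platelets * neutrophils / lymphocytes


def pni(albumin_g_per_l: float, lymphocytes: float) -> float:
    """Prognostic nutritional index (albumin in g/L, lymphocytes in 10^9/L)."""
    return albumin_g_per_l + 5.0 * lymphocytes


if __name__ == "__main__":
    # Hypothetical preoperative values, not taken from the study.
    print("NLR:", nlr(neutrophils=4.0, lymphocytes=2.0))
    print("SII:", sii(platelets=200.0, neutrophils=4.0, lymphocytes=2.0))
    print("PNI:", pni(albumin_g_per_l=40.0, lymphocytes=2.0))
```

Unit conventions vary between laboratories, so the inputs should be converted to the units stated in the docstring before these helpers are used.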

English abstract: Objective: To establish a machine learning model to predict the risk of post-hepatectomy liver dysfunction (PHLD) in patients with hilar cholangiocarcinoma (HCCA). Methods: Clinical data of 203 patients with HCCA who underwent open radical resection combined with hemihepatectomy at Henan University People's Hospital from January 2017 to December 2023 were retrospectively analyzed, including 112 males and 91 females, aged 63 (55, 69) years. According to the diagnostic criteria for PHLD, patients were divided into two groups: a PHLD group (n=45) and a non-PHLD group (n=158). Clinical data such as age, sex, neutrophil count (NEU), systemic immune-inflammation index (SII), prognostic nutritional index (PNI), neutrophil-to-lymphocyte ratio (NLR), operative time, and complications were compared between the two groups. Variables with statistically significant differences between the two groups were included in seven machine learning models, namely logistic regression, random forest, extreme gradient boosting, light gradient boosting, decision tree, Gaussian naive Bayes, and support vector machine. The best model was selected by the area under the receiver operating characteristic curve (AUC), and Shapley additive explanations (SHAP) were used to analyze and interpret the final model. Results: There were statistically significant differences between the two groups in age, preoperative biliary drainage for jaundice, preoperative albumin, total bilirubin on admission, preoperative aspartate aminotransferase, preoperative NEU, SII, PNI, and NLR, operative time, postoperative complications of Clavien-Dindo grade ≥ III, and the ratio of future liver remnant to total liver volume (FLR/TLV) (all P<0.05). The extreme gradient boosting model had the best predictive performance, with an AUC in the test set of 0.888 (95% CI: 0.776-0.985), an accuracy of 0.854, a sensitivity of 0.506, a specificity of 0.965, an F1 score of 0.625, and a Kappa value of 0.519. SHAP analysis of the extreme gradient boosting model showed that total bilirubin on admission, operative time, postoperative complications of Clavien-Dindo grade ≥ III, SII, and NEU were the five most important features of this model, all of which were positively correlated with the occurrence of PHLD in HCCA patients. Conclusion: The extreme gradient boosting model established in this study has good predictive performance and stability for PHLD in HCCA patients.
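The reported accuracy, sensitivity, specificity, F1 score, and Kappa can all be derived from a single 2×2 confusion matrix. The following is a minimal sketch of those standard formulas, assuming PHLD is the positive class; the function name `binary_metrics` and the counts in the example are hypothetical and are not the study's data.

```python
"""Sketch: classification metrics from a 2x2 confusion matrix.

Assumes PHLD is the positive class. The counts used in the example are
hypothetical and are NOT taken from the study.
"""


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return accuracy, sensitivity, specificity, F1 and Cohen's kappa."""
    n = tp + fp + tn + fn
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the true labels")

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # recall for the positive class
    specificity = tn / (tn + fp)   # recall for the negative class
    f1 = 2 * tp / (2 * tp + fp + fn)

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the row and column marginals.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical test-set counts for illustration only.
    for name, value in binary_metrics(tp=6, fp=2, tn=28, fn=4).items():
        print(f"{name}: {value:.3f}")
```

With an imbalanced outcome such as PHLD (45 of 203 patients), accuracy and specificity can be high while sensitivity stays modest, which is why F1 and Kappa are reported alongside them.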

English keywords:

Bile duct neoplasms; Hilar cholangiocarcinoma; Liver dysfunction; Predictive models; Machine learning

Authors:

唐昌乾、李炳垚、任泳年、朱恒立、郭宇麒、李冬筱、王亚峰、李世朋、李德宇、王连才

Author affiliations:

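Department of Hepatobiliary and Pancreatic Surgery, Henan University People's Hospital, Zhengzhou 450003

Xinxiang Medical University, Xinxiang 453003

Department of Hepatobiliary and Pancreatic Surgery, Henan Provincial People's Hospital, Zhengzhou 450003

Department of Gastroenterology, Henan Provincial People's Hospital, Zhengzhou 450003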

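Chinese keywords:

胆管肿瘤 (bile duct neoplasms); 肝门部胆管癌 (hilar cholangiocarcinoma); 肝功能不全 (liver dysfunction); 预测模型 (predictive models); 机器学习 (machine learning)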

Publication year:

2024

DOI:

10.3760/cma.j.cn113884-20240713-00209

Chinese Journal of Hepatobiliary Surgery (中华肝胆外科杂志)

Chinese Medical Association

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.846

ISSN: 1007-8118

Year, volume (issue): 2024, 30(12)