The Automatic BCLC Staging Model for Hepatocellular Carcinoma

Objective To build, on a big data platform, an automated Barcelona Clinic Liver Cancer (BCLC) staging model for hepatocellular carcinoma (HCC) that supports clinical care and academic research.

Methods Clinical data of HCC patients admitted to Mengchao Hepatobiliary Hospital of Fujian Medical University from January 2020 to December 2022 were collected, and an extract-transform-load (ETL) tool was used to build a standardized full-dimension dataset (700 dimensions per case). A total of 1,076 HCC patients admitted during this period were selected. Following the 2016 BCLC staging criteria, 12 relevant dimensions were extracted from the dataset, including hepatic encephalopathy, ascites, total bilirubin, albumin, prothrombin time, tumor number, tumor diameter, portal vein tumor thrombus, extrahepatic metastasis, and patient performance status. The automated BCLC staging model was built with methods including machine-learning-based natural language processing and the Python XGBoost (eXtreme gradient boosting) module. For the test on existing cases, 191 HCC patients admitted from January 2020 to December 2022 were randomly sampled. For the test on new cases, 180 HCC patients admitted from January 2020 to December 2022 were selected. Two attending hepatobiliary surgeons manually reviewed the staging of the test cases to produce a standard staging used as the reference. The model's automated staging, the staging recorded in the case records, and the standard staging were compared to assess the accuracy and practicality of the model.

Results The automated BCLC staging model for HCC was successfully built using big data methods. On a validation set of 150 cases, its accuracy was 93.33%, indicating that the modeling succeeded. In the test on existing cases, judged against the standard staging, automated staging had an accuracy of 98.43% with 3 errors (1 in stage 0 and 2 in stage A), while recorded staging had an accuracy of 96.33% with 7 errors (2 in stage 0 and 5 in stage A). In the test on new cases, judged against the standard staging, automated staging had an accuracy of 95.56% with 8 errors (1 in stage 0, 1 in stage A, 4 in stage B, 2 in stage C, and 0 in stage D), while recorded staging had an accuracy of 96.11% with 7 errors (2 in stage 0, 1 in stage A, 2 in stage B, 2 in stage C, and 0 in stage D).

Conclusion The automated BCLC staging model for HCC is efficient and accurate and merits clinical adoption, although data standardization still leaves room for improvement.
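The reported accuracies can be checked against the error counts given in the Results. The short sketch below recomputes each percentage from the test-set size and the number of errors, and confirms that the per-stage error breakdowns add up to the stated totals. The function name `accuracy_pct` and the dictionary layout are illustrative choices, not part of the study.

```python
def accuracy_pct(total: int, errors: int) -> float:
    """Percentage of cases whose stage matches the standard staging."""
    return 100 * (total - errors) / total


# (test-set size, per-stage error counts) exactly as stated in the abstract.
reported = {
    "existing cases, automated staging": (191, {"0": 1, "A": 2}),
    "existing cases, recorded staging": (191, {"0": 2, "A": 5}),
    "new cases, automated staging": (180, {"0": 1, "A": 1, "B": 4, "C": 2, "D": 0}),
    "new cases, recorded staging": (180, {"0": 2, "A": 1, "B": 2, "C": 2, "D": 0}),
}

for label, (total, errors_by_stage) in reported.items():
    errors = sum(errors_by_stage.values())
    print(f"{label}: {errors} errors, {accuracy_pct(total, errors):.2f}%")
```

Three of the four figures reproduce exactly (98.43%, 95.56%, 96.11%). For recorded staging on existing cases, 184 of 191 correct is 96.335...%, which rounds to 96.34%; the reported 96.33% is what truncating to two decimals would give.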
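The abstract does not publish the model's decision logic, and the authors' model relies on natural language processing and XGBoost rather than hand-written rules. Purely as a point of reference for how the listed dimensions relate to a stage, the following is a minimal rule-based sketch under stated assumptions: it uses a simplified textbook reading of the Child-Pugh score and of the BCLC stage definitions, and every field name, unit, and cut-off in it is an assumption made for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Case:
    # Hypothetical field names and units, chosen for this sketch only.
    encephalopathy_grade: int      # 0 = none, 1-2 = mild, 3-4 = severe
    ascites: str                   # "none", "mild", or "moderate_severe"
    bilirubin_umol_l: float        # total bilirubin
    albumin_g_l: float
    pt_prolongation_s: float       # prothrombin time prolongation, seconds
    tumor_count: int
    max_diameter_cm: float         # diameter of the largest tumor
    portal_vein_thrombus: bool
    extrahepatic_metastasis: bool
    ecog_ps: int                   # performance status, 0-4


def child_pugh_class(c: Case) -> str:
    """Child-Pugh class from five items, each scored 1-3 (assumed cut-offs)."""
    score = 1 if c.encephalopathy_grade == 0 else 2 if c.encephalopathy_grade <= 2 else 3
    score += {"none": 1, "mild": 2, "moderate_severe": 3}[c.ascites]
    score += 1 if c.bilirubin_umol_l < 34 else 2 if c.bilirubin_umol_l <= 51 else 3
    score += 1 if c.albumin_g_l > 35 else 2 if c.albumin_g_l >= 28 else 3
    score += 1 if c.pt_prolongation_s < 4 else 2 if c.pt_prolongation_s <= 6 else 3
    return "A" if score <= 6 else "B" if score <= 9 else "C"


def bclc_stage(c: Case) -> str:
    """Simplified BCLC stage; checks run from most to least advanced."""
    liver = child_pugh_class(c)
    if liver == "C" or c.ecog_ps >= 3:
        return "D"                                   # terminal stage
    if c.portal_vein_thrombus or c.extrahepatic_metastasis or c.ecog_ps >= 1:
        return "C"                                   # advanced stage
    if c.tumor_count == 1 and c.max_diameter_cm <= 2 and liver == "A":
        return "0"                                   # very early stage
    if c.tumor_count == 1 or (c.tumor_count <= 3 and c.max_diameter_cm <= 3):
        return "A"                                   # early stage
    return "B"                                       # intermediate stage


example = Case(0, "none", 15.0, 42.0, 0.5, 1, 1.8, False, False, 0)
print(bclc_stage(example))
```

A rule table like this only works once each input is already a clean, structured value. Much of the effort described in the Methods (ETL standardization and natural language processing) goes into obtaining such values from clinical records in the first place.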

Keywords: hepatocellular carcinoma; BCLC staging; big data; ETL tools; machine learning; natural language processing; XGBoost

Zhang Bing (张冰), Xu Qingyi (许庆祎)

Department of Hepatobiliary Surgery, Mengchao Hepatobiliary Hospital of Fujian Medical University, Fuzhou, Fujian 350028

2024

China Health Standard Management (中国卫生标准管理)
Editorial Office of China Health Standard Management

Impact factor: 1.374
ISSN: 1674-9316
Year, volume (issue): 2024, 15(5)