Analysis of Influencing Factors of Gastric Cancer Based on a Lasso-Logistic Regression Model

Objective To explore the factors influencing gastric cancer and to construct a clinical prediction model. Methods Clinical data were collected from 1000 patients with gastric tumors who attended Putuo Hospital, Shanghai University of Traditional Chinese Medicine, and Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, between December 2020 and October 2023. After data cleaning and removal of outliers, the patients were divided into a gastric polyp group (n=487) and a gastric cancer group (n=479). Non-parametric tests were used to screen for meaningful indicators, Lasso regression to select the gastric cancer-related features with non-zero coefficients, and stepwise Logistic regression to retain the significantly associated factors, yielding the Lasso-Logistic regression model. Model performance was evaluated with the receiver operating characteristic (ROC) curve, the area under the curve (AUC), and the confusion matrix. Results Multivariate Logistic regression analysis showed that age, white blood cell (WBC) count, monocyte (M) count, alanine aminotransferase (ALT), cancer antigen 724 (CA724), cancer antigen 242 (CA242), cancer antigen 50 (CA50), and carcinoembryonic antigen (CEA) were independent influencing factors for gastric cancer. A risk prediction nomogram for gastric cancer was constructed from these results. On the test set the AUC was 0.91, precision 100%, and recall 100%; on the validation set the AUC was 0.93, precision 93.63%, and recall 74.1%, indicating good predictive performance. Conclusion This study identified 8 common predictors of gastric cancer, and the Lasso-Logistic regression prediction model showed good discrimination; it may support early screening for gastric cancer based on patients' physical examination reports.
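The select-then-refit workflow described in the Methods can be sketched as follows. This is a minimal plain-Python illustration, not the authors' code. The synthetic data, the penalty `lam=0.05`, the step size, and all function names are assumptions made for the example. The stepwise Logistic regression step is replaced here by a simple unpenalized refit on the Lasso-selected features.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink v toward 0 by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logistic(X, y, lam, step=0.5, iters=300):
    # Proximal gradient descent on mean log-loss + lam * ||w||_1.
    # The intercept b is not penalized; lam=0 gives a plain logistic fit.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        w = [soft_threshold(w[j] - step * grad_w[j], step * lam) for j in range(p)]
    return w, b

def predict_proba(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def auc(y_true, scores):
    # Probability that a random positive outranks a random negative (ties count 0.5).
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def confusion(y_true, scores, threshold=0.5):
    # Returns (tp, fp, tn, fn) at the given probability threshold.
    tp = fp = tn = fn = 0
    for t, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and t == 1:
            tp += 1
        elif pred == 1 and t == 0:
            fp += 1
        elif pred == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def precision_recall(tp, fp, tn, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def make_data(n, n_features=6):
    # Synthetic stand-in for standardized lab values: only features 0 and 1 carry signal.
    X, y = [], []
    for _ in range(n):
        x = [random.gauss(0, 1) for _ in range(n_features)]
        X.append(x)
        y.append(1 if random.random() < sigmoid(2.0 * x[0] - 2.0 * x[1]) else 0)
    return X, y

random.seed(0)
X_train, y_train = make_data(200)
X_valid, y_valid = make_data(100)

# Step 1: Lasso-penalized fit; keep features with non-zero coefficients.
w_lasso, _ = fit_l1_logistic(X_train, y_train, lam=0.05)
selected = [j for j, wj in enumerate(w_lasso) if wj != 0.0]

# Step 2: unpenalized logistic refit on the selected features only.
def keep(X):
    return [[row[j] for j in selected] for row in X]

w, b = fit_l1_logistic(keep(X_train), y_train, lam=0.0)

# Step 3: evaluate on held-out data with AUC and the confusion matrix.
valid_scores = predict_proba(keep(X_valid), w, b)
valid_auc = auc(y_valid, valid_scores)
precision, recall = precision_recall(*confusion(y_valid, valid_scores))
print("selected features:", selected)
print("validation AUC:", round(valid_auc, 3))
print("precision:", round(precision, 3), "recall:", round(recall, 3))
```

In practice the penalty strength would normally be chosen by cross-validation rather than fixed, and a statistics library would replace the hand-written optimizer. The sketch only shows how the selection, refit, and evaluation steps fit together.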

Keywords: Gastric cancer; Lasso-Logistic regression; Risk factors; Clinical prediction model

Authors: Guo Jing (郭静), Han Ji (韩吉), Lü Wenqing (吕文清), Wang Jie (王杰)

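Putuo Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 200333

Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 200000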

Funding: National Natural Science Foundation of China (81973625)

2024

Journal of Medical Research (医学研究杂志)
Chinese Academy of Medical Sciences

CSTPCD
Impact factor: 0.702
ISSN:1673-548X
Year, volume (issue): 2024, 53(9)