Inner Mongolia Medical Journal (内蒙古医学杂志) 2024, Vol. 56, Issue (5): 525-532. DOI:10.16096/J.cnki.nmgyxzz.2024.56.05.003

Clinical Indicator-based Risk Assessment Study for Non-small Cell Lung Cancer: A Machine Learning Analysis

Sun Tao (孙涛)¹, Liu Jun (刘俊)², Yan Hui (严辉)¹

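Author information

  • 1. Hematology and Oncology Laboratory, Shaoyang Central Hospital, Shaoyang, Hunan 422000
  • 2. Department of Scientific Research, The First Affiliated Hospital of Shaoyang University, Shaoyang, Hunan 422000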

Abstract

Objective To assess the risk of non-small cell lung cancer (NSCLC) by analyzing clinical indicators with machine learning methods.

Methods The demographic characteristics and laboratory test results of 369 patients at Shaoyang Central Hospital were retrospectively analyzed to investigate the relationship between each clinical indicator and NSCLC. Univariate logistic regression analysis was performed first; in parallel, variable importance analysis with a random forest classifier was used to identify the indicators most influential for early cancer risk assessment. Least absolute shrinkage and selection operator (LASSO) regression was then applied to further screen the variables. Finally, a nomogram was constructed by logistic regression, and the accuracy and reliability of the model were verified with receiver operating characteristic (ROC) curves and decision curve analysis (DCA) in the training and validation sets.

Results LASSO regression and logistic regression analyses identified BMI (P<0.001, 95% CI 1.15-1.35), SKA1 (P<0.001, 95% CI 1.17-1.38), SCC (P<0.001, 95% CI 2.42-7.51), CA242 (P<0.001, 95% CI 1.07-1.28) and gender (P<0.05, 95% CI 0.26-0.91) as important indicators for assessing early cancer risk. In addition, the AUC, calibration curves and DCA curves measured in the training and test sets indicated that the model had high accuracy and clinical applicability.

Conclusion Analyzing clinical indicators with machine learning methods can effectively assess the risk of non-small cell lung cancer. BMI, SKA1, SCC, CA242 and gender were found to have a significant effect on the risk assessment of non-small cell lung cancer and can therefore serve as important references for screening early-stage cancer.
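The two evaluation measures named in the Methods, the area under the ROC curve (AUC) and the net benefit plotted in decision curve analysis, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy data are hypothetical, and the formulas are the standard textbook definitions (AUC as the probability that a case is ranked above a control, and net benefit as TP/n − FP/n × pt/(1 − pt) at threshold probability pt).

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def net_benefit(labels: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit of acting on patients with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels: Sequence[int], threshold: float) -> float:
    """Reference strategy in a decision curve: act on every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = NSCLC, 0 = control; risks are model outputs.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
risks = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print(auc(labels, risks))                 # 0.9375
print(net_benefit(labels, risks, 0.5))    # 0.25
print(net_benefit_treat_all(labels, 0.5)) # 0.0
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing it with the treat-all line and with zero (treat none); a model is clinically useful at thresholds where its net benefit exceeds both.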

Key words

non-small cell lung cancer/machine learning/clinical indicators/risk/predictive modeling

Publication year

2024
Inner Mongolia Medical Journal
Inner Mongolia Autonomous Region Medical Association

Impact factor: 0.537
ISSN: 1004-0951