Research on a gastric cancer lymph node metastasis prediction model based on machine learning algorithms

Objective: To construct and validate prediction models for lymph node metastasis in gastric cancer based on four machine learning (ML) algorithms.

Methods: Clinical data of 531 patients who underwent radical gastrectomy for gastric cancer were collected retrospectively. The patients were randomly divided at a 3:1 ratio into a training set (399 patients) and a test set (132 patients). Univariate analysis was used to screen candidate variables associated with lymph node metastasis. Logistic regression, random forest, K-nearest neighbor, and support vector machine models were then built, and variable importance was ranked for each. All ML models were validated in the test set and receiver operating characteristic (ROC) curves were plotted. The optimal ML model was selected based on the area under the curve (AUC), sensitivity, specificity, and accuracy. A nomogram was constructed from the variable importance ranking of the optimal ML model, and its discrimination, calibration, and clinical applicability were evaluated with ROC curves, calibration curves, and decision curves.

Results: Comparison of the four ML models showed that the random forest model was optimal. In the training set, its accuracy, sensitivity, and specificity were 72.7%, 69.9%, and 75.0%, respectively, with an AUC of 0.803. In the test set, its accuracy, sensitivity, and specificity were 64.4%, 66.7%, and 62.5%, respectively, with an AUC of 0.751. A nomogram was constructed from the variables of the random forest model. ROC curves showed that the AUCs of the nomogram in the training set and test set were 0.721 and 0.776, respectively. Calibration curves and decision curves showed that the nomogram had good calibration and clinical applicability in both the training set and the test set.

Conclusion: The random forest model is the optimal model among the four ML algorithm models. The nomogram based on the random forest model can predict the risk of gastric cancer lymph node metastasis with reasonable accuracy, thereby better guiding clinical diagnosis and treatment decisions.
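The four evaluation metrics named in the abstract (accuracy, sensitivity, specificity, and AUC) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function names `confusion_metrics` and `roc_auc` are hypothetical, the AUC uses the pairwise (Mann-Whitney) formulation rather than any particular library, and the confusion-matrix counts are illustrative values chosen only because they happen to reproduce the reported test-set percentages for 132 patients. The abstract itself does not report a confusion matrix.

```python
from typing import Dict, Sequence


def confusion_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores
    a random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative counts only (NOT taken from the paper): 60 node-positive and
# 72 node-negative patients, 132 in total, matching the test-set size.
example = confusion_metrics(tp=40, fn=20, tn=45, fp=27)

# Toy labels and predicted probabilities, purely for demonstration.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

if __name__ == "__main__":
    for name, value in example.items():
        print(f"{name}: {value:.1%}")
    print(f"toy AUC: {toy_auc:.3f}")
```

The pairwise AUC is quadratic in the number of cases, which is fine for a cohort of a few hundred patients; a rank-based implementation would be preferable for much larger data sets.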

Keywords: gastric cancer; lymph node metastasis; machine learning algorithms; prediction model; random forest; support vector machine algorithm

Shi Haomin (施昊旻), Yan Su (燕速), Qiao Mengmeng (乔梦梦), Yang Huilian (杨惠莲)

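Department of Public Health, Medical College of Qinghai University, Xining, Qinghai 810001, China

Department of Gastrointestinal Surgery, Affiliated Hospital of Qinghai University, Xining, Qinghai 810001, China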

2024

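Journal of Clinical Medicine in Practice (实用临床医药杂志)
Yangzhou University; Society of China University Journals

CSTPCD
Impact factor: 1.543
ISSN: 1672-2353
Year, volume (issue): 2024, 28(1)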