Establishment and value assessment of colon cancer diagnostic models based on multiple variables and different machine learning algorithms
Objective To establish a multivariable diagnostic model for colon cancer using various machine learning algorithms and to assess its clinical application value. Methods Serum samples were collected from 119 colon cancer patients and 125 healthy controls. Serum exosomes were extracted, and miRNA 214-3p (miR-214-3p) levels were measured by RT-qPCR. A receiver operating characteristic (ROC) curve was plotted to evaluate the diagnostic efficiency of miR-214-3p for colon cancer. In addition, results for 30 routine laboratory items were collected from the colon cancer patients and healthy controls. Characteristic variables were screened, and 11 algorithms were used to establish diagnostic models. The optimal model was selected using ROC curves and learning curves. Results The expression level of miR-214-3p was significantly higher in colon cancer patients than in healthy controls (P < 0.001), with an area under the ROC curve (AUC) of 0.820, indicating good diagnostic performance. When the miR-214-3p expression level and the 30 routine laboratory items were entered as candidate variables, 4 characteristic variables were selected to establish the diagnostic model: urea, carcinoembryonic antigen, monocytes, and miR-214-3p. Logistic regression was identified as the optimal algorithm (AUC = 0.93). Conclusion Serum exosomal miR-214-3p is a potential biomarker for colon cancer. The model based on the 4 characteristic variables and the logistic regression algorithm shows excellent performance in diagnosing colon cancer.
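As an illustration of the AUC metric reported above, the following is a minimal sketch in Python of how an area under the ROC curve can be computed for a single marker. It uses the rank-based definition (the probability that a randomly chosen patient scores higher than a randomly chosen control, with ties counted as one half). The function name `roc_auc` and the synthetic marker levels are assumptions made for illustration only. They are not the study's data or the authors' code, and the printed value is not expected to reproduce the reported AUCs.

```python
import random


def roc_auc(scores_pos, scores_neg):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Synthetic, made-up marker levels (hypothetical; NOT the study's measurements).
# Group sizes mirror the abstract only to keep the example concrete.
rng = random.Random(0)
patients = [rng.gauss(1.3, 1.0) for _ in range(119)]
controls = [rng.gauss(0.0, 1.0) for _ in range(125)]

auc = roc_auc(patients, controls)
print(f"AUC on synthetic data: {auc:.3f}")
```

An AUC of 0.5 corresponds to a marker that does not separate the two groups, and 1.0 to perfect separation. The pairwise loop is quadratic in the sample size, which is acceptable for cohorts of this size; a sort-based rank computation would be the usual choice for larger data sets.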