Objective: To develop and validate a machine learning classification model for predicting the prognostic survival time range of patients with non-small cell lung cancer (NSCLC) based on CT radiomics and morphological features. Methods: The Lung1 dataset was downloaded from The Cancer Imaging Archive (TCIA), and 243 eligible patients with peripheral NSCLC were selected and divided into two groups according to a cut-off survival time (group 1: ≤3 years; group 2: >3 years). A total of 1037 radiomics features were extracted from each lesion, and feature screening was performed using the least absolute shrinkage and selection operator (LASSO) algorithm. The morphological characteristics of each lesion were recorded and screened using the t-test and chi-square test. The radiomics and morphological features retained by these two screening steps were combined, and prediction models were built with five machine learning classification methods: logistic regression, random forest classifier, AdaBoost classifier, Gaussian naive Bayes (Gaussian NB), and multilayer perceptron (MLP) classifier. Receiver operating characteristic (ROC) curves were then used to evaluate the performance of the five predictive models, and the optimal model was selected. Finally, external validation was conducted using data from 77 patients collected at The First Affiliated Hospital of Guangzhou University of Chinese Medicine. Results: The Gaussian NB classification model was the best model in this experiment, with relatively good stability. Among all models, the AUC value of this model was relatively high in both the training and validation sets. After external validation, the AUC value of this model in the training set was 0.735, with a sensitivity of 0.685 and a specificity of 0.700, and the AUC value in the test set was 0.771, with a sensitivity of 0.571 and a specificity of 0.898. Conclusion: A machine learning classification model based on CT radiomics combined with morphological features can predict the prognostic survival time range of NSCLC patients with relatively good accuracy.
Key words
Lung cancer/Tomography, X-ray computed/Radiomics/Prognostic survival time/Machine learning/Prediction model
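The workflow described in the Methods (LASSO feature screening, five classifiers, ROC-based comparison) can be illustrated with a minimal sketch. The abstract does not state the implementation, so everything here is an assumption: scikit-learn is used because the classifier names resemble its class names, the data are synthetic stand-ins generated with `make_classification` (not the Lung1 or hospital data), and the 70/30 split, hyperparameters, and the names `models` and `evaluate` are hypothetical. Morphological feature screening with the t-test and chi-square test is not included.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a radiomics feature matrix (assumption, not study data).
# Label 1 plays the role of one survival group, label 0 the other.
X, y = make_classification(
    n_samples=243, n_features=100, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The five classifier families named in the abstract; settings are illustrative.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "adaboost": AdaBoostClassifier(random_state=0),
    "gaussian_nb": GaussianNB(),
    "mlp": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
}


def evaluate(model):
    """Fit scaling + LASSO screening + classifier, then score on the held-out set."""
    pipe = make_pipeline(
        StandardScaler(),
        # Keep features whose cross-validated LASSO coefficient is non-zero.
        SelectFromModel(LassoCV(cv=5, random_state=0)),
        model,
    )
    pipe.fit(X_train, y_train)
    prob = pipe.predict_proba(X_test)[:, 1]
    pred = pipe.predict(X_test)
    tn, fp, fn, tp = confusion_matrix(y_test, pred, labels=[0, 1]).ravel()
    return {
        "auc": roc_auc_score(y_test, prob),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


results = {name: evaluate(model) for name, model in models.items()}
best = max(results, key=lambda name: results[name]["auc"])

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
print("highest held-out AUC:", best)
```

Placing the LASSO step inside the pipeline means feature screening is refit on the training portion only, which avoids leaking held-out information into the selected feature set. Numbers produced on synthetic data have no bearing on the AUC, sensitivity, or specificity values reported in the abstract.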