A multicenter study on the prediction of gamma passing rate based on radiomic features
Objective To construct classification models that predict gamma passing rate using radiomics-based machine learning and data from multiple radiotherapy institutions, and to evaluate the models' performance.
Methods Data from 572 patients treated with volumetric-modulated arc therapy (VMAT) at three radiotherapy institutions (514 for training and 58 for testing) were retrospectively collected. An additional 45 VMAT plans were collected from a single institution as an independent external validation set. For all data, three-dimensional dose verification based on phantom measurements was used, and gamma analysis was performed at the 3%/2 mm criterion with a 10% dose threshold, absolute doses, and global normalization. Radiomic features were extracted from the dose files, and features were then selected using either the random forest (RF) method or RF combined with Shapley Additive exPlanation (SHAP). Feature subsets of varying sizes (10, 20, 30, 40, and 50) were chosen according to the feature rankings and used as inputs to train models with the Extreme Gradient Boosting (XGBoost) algorithm. Finally, the models' classification performance was assessed using the area under the curve (AUC) and the F1-score.
Results Under the 3%/2 mm criterion, all models performed best with the 20-feature subsets. The optimal model built on RF feature selection achieved an AUC and F1-score of 0.88 and 0.89, respectively, on the testing set and 0.82 and 0.90, respectively, on the validation set. The optimal model built on feature selection using RF combined with SHAP achieved an AUC and F1-score of 0.86 and 0.92, respectively, on the testing set and 0.87 and 0.89, respectively, on the validation set, along with superior robustness. The second model therefore held certain advantages over the first.
Conclusions For multicenter dose verification results, it is feasible to construct a machine learning prediction model with high classification performance using radiomic features derived from dose files, combined with feature selection based on SHAP. This approach can help advance the clinical application and implementation of gamma passing rate prediction models.
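The gamma criterion named in the Methods (3%/2 mm, 10% dose threshold, global normalization) can be illustrated with a minimal one-dimensional sketch. It is not the study's three-dimensional, measurement-based workflow. The function name `gamma_passing_rate`, its parameters, the brute-force search without interpolation, and the choice of the maximum reference dose as the global normalization value are all assumptions made for illustration.

```python
import math


def gamma_passing_rate(reference, evaluated, spacing_mm,
                       dose_pct=3.0, dta_mm=2.0, threshold_pct=10.0):
    """Return the 1D global gamma passing rate in percent.

    Illustrative assumptions: both profiles share one grid, the global
    normalization dose is the maximum reference dose, and the search is
    brute force with no interpolation between grid points.
    """
    if len(reference) != len(evaluated):
        raise ValueError("profiles must have the same length")

    max_dose = max(reference)
    dose_tol = dose_pct / 100.0 * max_dose     # e.g. 3% of the global maximum
    cutoff = threshold_pct / 100.0 * max_dose  # e.g. 10% low-dose threshold

    passed = 0
    total = 0
    for i, d_ref in enumerate(reference):
        if d_ref < cutoff:
            continue  # points below the dose threshold are not scored
        total += 1
        # Squared gamma: minimum over evaluated points of the combined
        # distance and dose-difference terms, each scaled by its criterion.
        gamma_sq = min(
            ((j - i) * spacing_mm / dta_mm) ** 2
            + ((d_eval - d_ref) / dose_tol) ** 2
            for j, d_eval in enumerate(evaluated)
        )
        if math.sqrt(gamma_sq) <= 1.0:
            passed += 1

    if total == 0:
        raise ValueError("no reference points above the dose threshold")
    return 100.0 * passed / total


# Hypothetical profiles on a 1 mm grid (arbitrary dose units).
reference = [0.0, 0.5, 1.0, 2.0, 2.0, 1.0, 0.5, 0.0]
shifted = [0.0] + reference[:-1]          # same shape, moved by 1 mm
scaled = [1.10 * d for d in reference]    # uniform 10% overdose

rate_identical = gamma_passing_rate(reference, reference, spacing_mm=1.0)
rate_shifted = gamma_passing_rate(reference, shifted, spacing_mm=1.0)
rate_scaled = gamma_passing_rate(reference, scaled, spacing_mm=1.0)
```

In this toy case the 1 mm shift stays within the 2 mm distance-to-agreement, so every scored point passes, while the uniform 10% overdose fails most points. In a classification setting such as the one described above, each plan's passing rate would be compared against an action limit to produce a pass or fail label; that limit is not stated in the abstract.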