Construction of a multigene model for predicting prostate cancer biochemical recurrence based on machine learning
Objective: To construct a model that predicts prostate cancer recurrence by applying machine learning methods to prostate cancer data from public databases. Methods: Prostate cancer RNA sequencing data and the corresponding clinical data were downloaded, and the gene expression and clinical data were processed to screen for feature genes associated with prostate cancer recurrence. Prediction models were constructed from these genes, and their performance was validated. Random forests, support vector machines (radial, linear, polynomial, and sigmoid kernels), and gradient boosting trees were compared using their default parameters, and the better-performing models were selected for further validation. Results: A total of 148 genes differentially expressed with respect to recurrence were identified, and 5 of them were selected by importance to construct the prediction models. The models built on these genes with the different methods all showed good precision and accuracy. The random forest model performed best, predicting prostate cancer recurrence with an accuracy of 87% and an area under the receiver operating characteristic curve (AUC) of 0.84. Conclusion: Machine learning models built from gene expression data can predict prostate cancer recurrence with good performance.
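The two performance measures reported above, accuracy and AUC, can be illustrated with a minimal sketch. The abstract does not describe the evaluation code, so everything here is an assumption made for illustration: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores, which are invented and are not the study's data. AUC is computed in its pairwise form, as the fraction of (recurrent, non-recurrent) patient pairs in which the recurrent patient receives the higher predicted score, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    (label 1) scores higher than a randomly chosen negative sample
    (label 0); tied scores contribute one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = recurrence, 0 = no recurrence.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical predicted recurrence probabilities from some classifier.
scores = [0.9, 0.2, 0.65, 0.4, 0.55, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy = {acc:.2f}")  # accuracy = 0.75
print(f"AUC = {auc:.4f}")       # AUC = 0.9375
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two figures are reported side by side.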