Construction of a Classification Model for Cyclooxygenase-2 Inhibitors Based on Machine Learning
Objective: This study aims to develop a classification model for cyclooxygenase-2 (COX-2) inhibitors that can be used to screen and optimize COX-2 inhibitors. Methods: Eight machine learning algorithms were used to construct models, and their predictive performance was compared to identify the best model. The optimal model was then tested with the Y-scrambling validation method, and its interpretability was analyzed with the SHapley Additive exPlanations (SHAP) algorithm. Results: Among the eight models compared, the Random Forest model performed best, achieving the highest accuracy, balanced accuracy, Matthews correlation coefficient, area under the ROC curve, and F1 score (0.893, 0.825, 0.673, 0.909, and 0.933, respectively). Y-scrambling validation showed that the predictions of the optimal model were not due to chance. Moreover, the SHAP algorithm was used to identify 20 structural fragments that could affect COX-2 inhibitor activity. Conclusions: This study provides a theoretical basis for developing COX-2 inhibitors, which may help other researchers in this field optimize lead compounds and design new COX-2 inhibitors.
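For readers unfamiliar with the threshold-based metrics named in the Results, the following minimal Python sketch shows how accuracy, balanced accuracy, the Matthews correlation coefficient, and the F1 score can be derived from a binary confusion matrix. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study. The area under the ROC curve is omitted because it requires ranked prediction scores rather than confusion-matrix counts.

```python
import math


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp, fp, tn, fn: counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs).
    """
    sensitivity = _safe_div(tp, tp + fn)   # recall on the active class
    specificity = _safe_div(tn, tn + fp)   # recall on the inactive class
    precision = _safe_div(tp, tp + fp)

    accuracy = _safe_div(tp + tn, tp + fp + tn + fn)
    balanced_accuracy = (sensitivity + specificity) / 2
    f1 = _safe_div(2 * precision * sensitivity, precision + sensitivity)

    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = _safe_div(tp * tn - fp * fn, mcc_denominator)

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
        "f1": f1,
    }


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    for name, value in classification_metrics(tp=40, fp=10, tn=40, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Reporting balanced accuracy and the Matthews correlation coefficient alongside plain accuracy is a common choice when active and inactive compounds are unevenly represented, since both metrics penalize a model that simply predicts the majority class.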