Toxicity prediction of metal oxide nanoparticles with the assistance of machine learning
Because metal oxide nanoparticles (MNPs) are being used ever more widely, it is important to predict the toxicity of untested MNPs rapidly and effectively before they are applied in the nanotechnology industry. In this work, a toxicity dataset of MNPs was established by collecting data from the literature. The target variable was the cytotoxicity of MNPs, expressed as log(1/EC50), and 11 candidate independent variables were selected. Genetic algorithm-support vector regression (GA-SVR) was employed to screen the independent variables, yielding an optimal feature set of three variables. Using the new dataset formed from this optimal feature set, support vector regression with a linear kernel function (SVR-LKF) and with a Gaussian kernel function (SVR-RBF) were used to construct quantitative structure-activity relationship (QSAR) models for predicting the toxicity of MNPs. In terms of the model evaluation metrics, the SVR-RBF model outperformed the SVR-LKF model, and it also outperformed the model reported in the literature. Nevertheless, the SVR-LKF model also showed impressive performance and practical value for toxicity prediction. To explore the toxicity mechanism, the effects of the individual variables on the toxicity of MNPs were also analyzed in a simulation study. The method outlined here can therefore provide valuable hints for predicting the toxicity of untested MNPs with the assistance of machine learning and for studying the toxic mechanism of MNPs.
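To make the difference between the two kernels concrete, the following minimal sketch evaluates a linear kernel and a Gaussian (RBF) kernel on a pair of three-descriptor vectors, and computes the log(1/EC50) endpoint. It is an illustration only, not the authors' implementation: the descriptor values, the `gamma` value, and the function names are hypothetical, and the base-10 logarithm is an assumption, since the abstract specifies neither the logarithm base nor the hyperparameters.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two descriptor vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2).

    gamma is a hypothetical value chosen for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def log_inverse_ec50(ec50):
    """Toxicity endpoint log(1/EC50); base 10 is assumed here.

    A smaller EC50 (more toxic particle) gives a larger endpoint value.
    """
    return math.log10(1.0 / ec50)


# Hypothetical, already-scaled descriptor vectors for two nanoparticles,
# standing in for the three selected features.
np_a = [0.2, 1.0, -0.5]
np_b = [0.1, 0.8, -0.4]

print("linear kernel:", linear_kernel(np_a, np_b))
print("RBF kernel:   ", rbf_kernel(np_a, np_b))
print("RBF self-sim.:", rbf_kernel(np_a, np_a))
print("log(1/EC50):  ", log_inverse_ec50(0.01))
```

The linear kernel grows without bound with the magnitude of the descriptors, whereas the RBF kernel is a similarity in (0, 1] that depends only on the distance between two particles. That locality is one plausible reason an RBF model can capture nonlinear descriptor-toxicity relationships that a linear model cannot.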