Bacterial Signature for Prediction of Disease Type Based on Abundance of Ruminococcus
This study used machine learning models to construct a non-invasive disease evaluation model based on the abundance of Ruminococcus, in order to explore the value of intestinal flora in predicting disease types. Data from different studies were downloaded using an R library. The abundance of Ruminococcus, study condition, disease state, age, sex, antibiotic use, region, smoking status, and other information on the human samples were selected, and disease-screening evaluation models were established using machine learning classifiers such as random forest, decision tree, and AdaBoost. The parameters were tuned with GridSearchCV, and the external validation results were evaluated using a confusion matrix. Three evaluation models were established based on the abundance of Ruminococcus and general sample information such as sex and age. The random forest model had the highest accuracy (0.884). In addition, with n_estimators set to 220, the score rose to 0.892, which made this the best model. The external validation results also showed that the classification algorithm made relatively few prediction errors and that the model performed well. Based on metagenomic data from fecal samples, the random forest algorithm can effectively predict disease types from the abundance of Ruminococcus.
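As a rough illustration of the confusion-matrix evaluation mentioned above, the following minimal sketch tabulates true against predicted labels and derives accuracy from the diagonal. It is not the study's code: the class names (`control`, `disease_A`, `disease_B`), the label lists, and the helper functions `confusion_matrix` and `accuracy` are all hypothetical and invented for illustration only.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a nested dict of counts: matrix[true_label][predicted_label]."""
    pairs = Counter(zip(y_true, y_pred))
    return {t: {p: pairs[(t, p)] for p in labels} for t in labels}


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[label][label] for label in matrix)
    return correct / total if total else 0.0


# Hypothetical external-validation labels (assumption, not study data).
labels = ["control", "disease_A", "disease_B"]
y_true = ["control", "control", "disease_A", "disease_A",
          "disease_B", "disease_B", "control", "disease_A"]
y_pred = ["control", "disease_A", "disease_A", "disease_A",
          "disease_B", "control", "control", "disease_A"]

cm = confusion_matrix(y_true, y_pred, labels)

# Rows are true classes, columns are predicted classes.
for true_label in labels:
    print(true_label, [cm[true_label][p] for p in labels])
print("accuracy:", accuracy(cm))
```

Off-diagonal cells show which classes are confused with each other, which a single accuracy figure hides; this is presumably why a confusion matrix was used for the external validation.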