Objective To construct a pulmonary tuberculosis diagnosis model based on machine learning algorithms applied to routine blood test results, and to analyze its clinical application value.
Methods A total of 469 newly diagnosed patients with pulmonary tuberculosis (pulmonary tuberculosis group) from Shanghai Xuhui Central Hospital from January 2019 to December 2022 were enrolled, and 506 age- and sex-matched healthy subjects were enrolled as the healthy control group. Data on 22 routine blood test items and the demographic parameters of all subjects were collected. Collinearity was analyzed by LASSO regression analysis. The data set was randomly divided into 2 parts: 75% was used as the training set for constructing the machine learning model, and 25% was used as the test set for evaluating model performance. Four machine learning algorithms, distributed random forest (DRF), deep learning, gradient boosting machine and generalized linear model, were used to build the model, and the diagnostic efficiency of the model was verified by 5-fold cross-validation. The diagnostic performance of the model was evaluated by receiver operating characteristic (ROC) curve.
Results Based on logistic regression analysis and LASSO regression analysis, 10 non-collinear indicators were selected. DRF was the optimal machine learning algorithm for constructing the pulmonary tuberculosis diagnosis model. In the training set and test set, the areas under the curve of the DRF model were 0.9921 and 0.8474, the sensitivities were 99.16% and 92.04%, the specificities were 80.91% and 55.22%, and the accuracies were 89.84% and 72.06%, respectively.
Conclusions The pulmonary tuberculosis diagnosis model based on a machine learning algorithm is an effective diagnostic tool, but its clinical application value needs to be further verified.
Machine learning; Diagnostic model; Pulmonary tuberculosis; Blood routine test
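
As a reading aid, the sketch below shows how sensitivity, specificity and accuracy are derived from a confusion matrix. The function name `diagnostic_metrics` and the counts are hypothetical: the abstract does not report a confusion matrix, and the counts were chosen only so that the rounded percentages match the test-set figures quoted in the Results. They should not be read as the study's actual data.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions.

    tp/fn: patients classified as positive/negative.
    tn/fp: healthy controls classified as negative/positive.
    """
    sensitivity = tp / (tp + fn)            # share of patients detected
    specificity = tn / (tn + fp)            # share of controls correctly ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts (an assumption, not reported in the abstract).
sens, spec, acc = diagnostic_metrics(tp=104, fn=9, tn=74, fp=60)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

With counts like these, high sensitivity alongside low specificity means few patients are missed, but a large share of healthy controls are flagged as positive, which is why overall accuracy falls well below sensitivity.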