Construction and evaluation of a machine learning-based predictive model for nosocomial infection after artificial joint replacement surgery
OBJECTIVE To construct a machine learning-based model for predicting nosocomial infection after artificial joint replacement surgery.
METHODS A total of 1,800 patients who underwent artificial joint replacement surgery in a tertiary hospital in Hangzhou, Zhejiang Province, from January 2017 to November 2022 were selected as the study subjects, and the dataset was randomly divided into a training set (1,350 cases) and a test set (450 cases) in a 3:1 ratio. Recursive feature elimination was applied in the training set to select independent variables, and the optimal parameters for five types of models, namely logistic regression, support vector machine (SVM), decision tree (DT), eXtreme Gradient Boosting (XGBoost), and random forest (RF), were determined by grid search. Model performance was evaluated using sensitivity (true positive rate, TPR), positive predictive value (PPV), specificity (true negative rate, TNR), negative predictive value (NPV), F1 score, accuracy, and area under the curve (AUC) to determine the superior machine learning model, and the SHAP (Shapley additive explanations) method was used to explain the importance of variables in the superior model.
RESULTS Infection occurred in 102 of the 1,800 cases, an incidence of 5.67%. The AUCs of the logistic regression, DT, RF, SVM, and XGBoost models were 0.92, 0.89, 0.98, 0.70, and 0.98 in the training set and 0.85, 0.78, 0.86, 0.63, and 0.88 in the test set, respectively; the XGBoost and RF models performed better than the others. SHAP results showed that days of perioperative antimicrobial use, surgery time, age, National Nosocomial Infections Surveillance (NNIS) system score, and blood loss were the more important predictors.
CONCLUSION In this study, we established machine learning-based prediction models of nosocomial infection risk after artificial joint replacement and compared the efficacy of multiple models, among which the overall performance of the XGBoost and RF models was superior. These models can help identify patients at high risk of nosocomial infection after artificial joint replacement in a timely and accurate manner and support the implementation of effective interventions.
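
The evaluation metrics named in the METHODS can be illustrated with a short sketch. This is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and the numbers are unrelated to the study data. The sketch computes TPR, PPV, TNR, NPV, F1 score, and accuracy from confusion-matrix counts, and AUC as the probability that a randomly chosen infected case receives a higher score than a randomly chosen non-infected case.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (1 = infection, 0 = no infection)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            counts["tp"] += 1
        elif t == 0 and p == 1:
            counts["fp"] += 1
        elif t == 0 and p == 0:
            counts["tn"] += 1
        else:  # t == 1 and p == 0, assuming strictly binary labels
            counts["fn"] += 1
    return counts


def _ratio(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Threshold-dependent metrics listed in the abstract."""
    return {
        "TPR": _ratio(tp, tp + fn),            # sensitivity
        "PPV": _ratio(tp, tp + fp),            # positive predictive value
        "TNR": _ratio(tn, tn + fp),            # specificity
        "NPV": _ratio(tn, tn + fn),            # negative predictive value
        "F1": _ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positives and negatives; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not study data).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]
threshold = 0.5  # assumed cut-off; the abstract does not state one
toy_pred = [1 if s >= threshold else 0 for s in toy_scores]

toy_counts = confusion_counts(toy_labels, toy_pred)
toy_metrics = classification_metrics(**toy_counts)
toy_auc = auc(toy_labels, toy_scores)
print(toy_counts, toy_metrics, toy_auc)
```

With an infection incidence near 5.67%, accuracy alone can look high even when few infected cases are detected, which is one reason to read it alongside TPR, PPV, and AUC.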