Predictive models of cervical lymph node metastasis in papillary thyroid carcinoma based on logistic regression and random forest algorithms: a comparative study
Objective To construct predictive models of cervical lymph node metastasis (LNM) in papillary thyroid carcinoma (PTC) based on logistic regression and the random forest algorithm, and to compare their diagnostic efficacy. Methods A total of 156 PTC patients diagnosed and treated in our hospital were selected and divided into a non-metastatic group (n=65) and a metastatic group (n=91) according to the presence of cervical LNM. Ultrasound features, genetic test results, and clinical data were compared between the two groups. Multivariate logistic regression was used to screen for independent influencing factors of cervical LNM in PTC. Predictive models of cervical LNM in PTC were then constructed with logistic regression and with the random forest algorithm, and their diagnostic efficacy was analyzed using receiver operating characteristic (ROC) curves. Results The two groups differed significantly in age, maximum nodule diameter, thyroglobulin antibody (TgAb) level, and the proportions of extrathyroidal extension (ETE), BRAF V600E gene mutation, and microcalcification (all P<0.05). Multivariate logistic regression analysis showed that microcalcification, maximum nodule diameter, ETE, age, TgAb level, and BRAF V600E gene mutation were independent influencing factors for cervical LNM in PTC (all P<0.05). ROC curve analysis showed that the area under the curve (AUC) of the logistic regression model in predicting cervical LNM in PTC was 0.763. The random forest model showed its lowest error rate when the number of trees was 272. The predictors of cervical LNM in PTC, ranked by relative importance, were TgAb level, BRAF V600E gene mutation, microcalcification, age, ETE, and maximum nodule diameter. The AUC of the random forest model in predicting cervical LNM in PTC was 0.856, which was higher than that of the logistic regression model (Z=2.812, P=0.005). Conclusion The predictive model of cervical LNM in PTC based on the random forest algorithm has higher diagnostic efficacy than the model based on logistic regression. Clinicians can develop rational interventions for PTC patients according to the random forest importance ranking of the factors associated with cervical LNM.
Ultrasonography; Papillary thyroid carcinoma; Cervical lymph node metastasis; Logistic regression model; Random forest model
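
The ROC comparison reported in the abstract can be illustrated with a minimal Python sketch. It shows how an AUC can be computed from predicted scores as a rank statistic, and how a Z statistic maps to a two-sided P value. The function names `auc_from_scores` and `two_sided_p_from_z` and the six-case toy data are hypothetical and are not taken from the study. The abstract does not state which test produced Z, so the conversion assumes a two-sided test against a standard normal distribution.

```python
import math


def auc_from_scores(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def two_sided_p_from_z(z):
    """Two-sided P value for a Z statistic, assuming a standard normal
    reference distribution (an assumption; the source does not name the test)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical toy data: 1 = metastatic, 0 = non-metastatic.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auc_from_scores(labels, scores)

# Z value quoted in the abstract for the comparison of the two AUCs.
p_value = two_sided_p_from_z(2.812)

print(f"toy AUC = {toy_auc:.3f}")
print(f"P for Z = 2.812: {p_value:.3f}")
```

Under the stated normal assumption, Z=2.812 rounds to P=0.005, which agrees with the value reported in the abstract. The pairwise loop is quadratic in the number of cases; that is acceptable for a cohort of this size, and a rank-based formulation would be the usual choice for larger data.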