Machine learning and Cox proportional hazards regression model for warning of persistent infection with high-risk HPV type
Objective: To establish a machine learning-based prediction model for human papillomavirus (HPV) and to identify the factors associated with persistent infection with high-risk human papillomavirus (HR-HPV), so as to provide early warning of persistent HR-HPV infection.
Methods: Clinical data were collected from 4 407 women who underwent HPV testing at four health centers in Taizhou City from September 2017 to September 2019 and attended HPV follow-up from September 2020 to September 2022. The demographic characteristics of all 4 407 subjects in this cohort study were used as the input of the machine learning models, and the change in results between the two HPV tests was used as the output. Two prediction models, a random forest and a multi-layer perceptron, were built to predict the HPV follow-up results of the study participants. Univariate and multivariate Cox proportional hazards regression models were used to analyze the 583 cases that were HR-HPV positive at primary screening.
Results: The accuracy of the random forest prediction model was 84.3%, and the accuracy of the multi-layer perceptron was 80.5%. The five categories with the highest persistent-positive rates of HR-HPV were HPV58, multiple infections, HPV31, HPV33, and HPV52. The multivariate Cox regression analysis showed that the conversion risk of HR-HPV infection in those with junior high school education or below was 1.72 times that of those with high school education or above (HR=1.72, 95% CI: 1.03-2.87, P=0.037), and the conversion risk of HR-HPV infection in non-menopausal individuals was 2.11 times that in menopausal individuals (HR=2.11, 95% CI: 1.10-4.06, P=0.025).
Conclusions: Machine learning and Cox regression analysis models can provide early warning for the population at risk of persistent HR-HPV infection, which has important clinical value for the subsequent management of HR-HPV-infected women and for the prevention and control of cervical cancer.
Machine learning; Cox proportional hazards regression model; High-risk human papillomavirus; Persistent infection; Cervical cancer
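The hazard ratios, confidence intervals, and P values reported in the Results are linked by simple arithmetic, which the following minimal sketch illustrates. It assumes (this is not stated in the abstract) that the 95% confidence intervals are Wald intervals, symmetric on the log-hazard scale, and that the P values are two-sided. The function name `wald_check` and the dictionary keys are hypothetical and used only for illustration.

```python
import math


def wald_check(hr, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate SE of log(HR), the z statistic, and a two-sided p value.

    Assumption: the interval is a Wald interval, i.e.
    exp(log(HR) +/- z_crit * SE), so its width on the log scale is 2 * z_crit * SE.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values copied from the Results: (HR, CI lower, CI upper, reported P)
reported = {
    "education (junior high or below)": (1.72, 1.03, 2.87, 0.037),
    "menopausal status (non-menopausal)": (2.11, 1.10, 4.06, 0.025),
}

for name, (hr, lo, hi, p_rep) in reported.items():
    se, z, p = wald_check(hr, lo, hi)
    print(f"{name}: SE={se:.3f}, z={z:.2f}, p={p:.3f} (reported P={p_rep})")
```

Under these assumptions the back-calculated P values should land close to the reported ones; small differences are expected because the published HR and interval bounds are rounded to two decimals.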