Influential factors and predictive analysis on myopia among primary and secondary school students based on hybrid feature selection
Objective: To establish a myopia prediction model and analyze the factors affecting myopia among primary and secondary school students in Xincheng District, Xi'an, so as to provide a scientific basis for developing myopia prevention and control strategies and implementing intervention measures for students.
Methods: Based on the 2022 Common Disease Surveillance Program for Students in Shaanxi Province, visual acuity was examined using a 5-meter standard logarithmic visual acuity chart, and refraction was measured using a desktop computerized optometer. A total of 2 511 students who participated in myopia screening and completed questionnaires were included in the study. Support Vector Machine Recursive Feature Elimination (SVM-RFE), Least Absolute Shrinkage and Selection Operator regression with cross-validation (LASSOCV), χ² test-SelectKBest, Decision Tree-SelectFromModel, and the mutual information approach were each used to screen myopia-influencing factors, and the screened variables were entered into logistic regression and 5 classification models to predict the risk of myopia.
Results: A total of 1 780 students were found to have myopia, an overall myopia rate of 70.89% (1 780/2 511): 69.24% (833/1 203) for boys and 72.40% (947/1 308) for girls. The myopia rates of primary school, middle school, and high school (including vocational high school) students were 54.69% (560/1 024), 78.96% (473/599), and 84.12% (747/888), respectively. A total of 17 variables appeared 3 or more times in the top 15 of the 5 feature selection methods. Age and whether parents were myopic were selected by all 5 methods; whether parents reminded the student to pay attention to reading and writing posture, keeping the chest more than one fist away from the edge of the table when reading and writing, and attending academic cram classes such as English, math, and writing were selected by 4. Logistic regression showed that age (OR=1.329, 95%CI: 1.286-1.373, P<0.0001), parental myopia (father or mother myopic: OR=1.808, 95%CI: 1.453-2.251, P<0.0001; both father and mother myopic: OR=3.566, 95%CI: 2.691-4.726, P<0.0001), parental reminders to pay attention to reading and writing posture (OR=1.349, 95%CI: 1.092-1.666, P=0.006), being outdoors during recess (OR=0.774, 95%CI: 0.636-0.943, P=0.011), watching TV with the eyes more than 3 meters from the screen (often or always: OR=0.792, 95%CI: 0.589-1.064, P=0.122; never or sometimes: OR=1.099, 95%CI: 0.835-1.445, P=0.501), and the average time spent on homework and reading after school each day (OR=1.342, 95%CI: 1.105-1.631, P=0.003) were factors influencing myopia. All 5 prediction models performed better after variable screening than before. SVM-RBF after variable screening achieved the best classification performance (AUC=0.73, accuracy=0.72, F1-score=0.74, precision=0.78, recall=0.72), followed by SVM-POLY after variable screening (AUC=0.73, accuracy=0.71, F1-score=0.73, precision=0.78, recall=0.71). This suggests that including more variables does not necessarily improve a model's predictive performance.
Conclusion: Myopia rates among students increase rapidly with age. In addition to the cumulative effect, this increase is associated with a heavier schoolwork burden and more time spent using electronic devices such as cell phones.
Keywords: Primary and secondary school students; Myopia; Risk prediction; Hybrid feature selection; Machine learning
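As a reading aid for the odds ratios reported in the Results, the sketch below shows how an OR and its 95% confidence interval relate to a logistic regression coefficient. It assumes a Wald-type interval, exp(β ± z·SE), which the abstract does not state explicitly; the helper names `or_with_ci` and `beta_se_from_ci` are illustrative only. The only input taken from the abstract is the interval reported for age (1.286-1.373).

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def or_with_ci(beta, se, z=Z_95):
    """Odds ratio and Wald confidence limits from a log-odds
    coefficient (beta) and its standard error (se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_ci(lower, upper, z=Z_95):
    """Recover (beta, se) on the log-odds scale from reported Wald
    confidence limits of an odds ratio (assumes a symmetric log-scale CI)."""
    log_lo, log_hi = math.log(lower), math.log(upper)
    return (log_lo + log_hi) / 2, (log_hi - log_lo) / (2 * z)


# Interval reported for age in the abstract: 95%CI 1.286-1.373.
beta, se = beta_se_from_ci(1.286, 1.373)
odds_ratio, lo, hi = or_with_ci(beta, se)
print(f"OR={odds_ratio:.3f}, 95%CI: {lo:.3f}-{hi:.3f}")
```

Under this assumption the midpoint of the reported interval on the log scale reproduces the reported OR of 1.329 for age, that is, roughly a 33% increase in the odds of myopia per additional year.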