Study of Adult Smoking Behavior in China Based on the LightGBM Model
Using data from the 2018 adult tobacco survey conducted by the World Health Organization in China, this study explores the factors that influence adult smoking behavior. First, the original data are cleaned, which includes removing irrelevant variables, combining variables into new ones, and other steps. Second, feature selection is performed on the processed dataset by combining the chi-square test, analysis of variance, and the Maximal Information Coefficient (MIC). Then, models are built with the XGBoost and LightGBM algorithms, and the factors affecting adult smoking behavior are ranked and analyzed. Finally, based on the well-performing LightGBM model, variable-combination modeling is performed to further explore the characteristics of smokers. The modeling and analysis identify gender, tobacco environment, attitude towards value-added tax, awareness of low-tar cigarettes, educational background, and age as the factors influencing smoking behavior, in decreasing order of importance.
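As an illustration of the chi-square screening step mentioned above, the following minimal sketch computes the Pearson chi-square statistic for a contingency table of one categorical feature against smoking status. The function name `chi_square_statistic` and the counts in `example_table` are hypothetical and chosen only for illustration; they are not taken from the survey data, and the study's actual implementation is not described in the abstract.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows of observed counts, e.g. rows = feature
    categories, columns = smoker / non-smoker.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of feature and outcome.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts (NOT survey data): rows = two feature categories,
# columns = smoker / non-smoker.
example_table = [[30, 10],
                 [10, 30]]
stat, dof = chi_square_statistic(example_table)
print(stat, dof)  # 20.0 1
```

With one degree of freedom, a statistic above 3.841 corresponds to significance at the 0.05 level, so a feature with counts like these would be retained by a chi-square filter. The other two criteria named in the abstract, analysis of variance and MIC, would be applied in the same screening stage.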