
Comparative study of nomogram and machine learning algorithms for predicting dental caries in middle-aged and elderly people

Objective To compare the performance of a nomogram and different machine learning algorithms in constructing dental caries prediction models for middle-aged and elderly people. Methods Multi-stage stratified random sampling was used to select 510 middle-aged and elderly people from Nanning, Guigang and Chongzuo as study subjects, who completed a questionnaire survey and an oral examination. Univariate analysis and Lasso regression were used to screen candidate variables, and multivariate logistic regression was used to determine the final independent influencing factors. Based on the significant features, a nomogram prediction model was established, and seven machine learning algorithms were used to construct seven caries risk prediction models: linear discriminant analysis (LDA), partial least squares (PLS), regularized discriminant analysis (RDA), generalized linear model (GLM), random forest (RF), support vector machine (SVM) with a radial kernel (SVM-Radial) and SVM with a linear kernel (SVM-Linear). The median area under the receiver operating characteristic (ROC) curve (AUC) was used to evaluate the predictive performance of each model and of models built with different variable screening methods. Results The detection rate of dental caries in middle-aged and elderly people was 71.18%. After feature screening, five predictors were retained: age (OR=0.945, 95%CI: 0.917-0.973), brushing frequency (OR=0.688, 95%CI: 0.475-0.997), teeth cleaning in the past year (OR=0.303, 95%CI: 0.103-0.890), number of remaining teeth (OR=1.062, 95%CI: 1.038-1.087) and oral health assessment tool (OHAT) score (OR=1.363, 95%CI: 1.234-1.505). In the model comparison, the prediction model built with the RF algorithm performed best (median AUC 0.747), followed by the nomogram (median AUC 0.733). Prediction models built on variables screened by univariate analysis + Lasso + multivariate logistic regression (Lasso+logistic) all had higher median AUCs than models built on variables screened by the RF algorithm. Conclusion With variables screened by Lasso+logistic, RF provided more reliable predictive performance for dental caries in middle-aged and elderly people than the nomogram and the other machine learning algorithms.
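The evaluation criterion named in the abstract, the median AUC across resamples, can be illustrated with a minimal Python sketch. It assumes that each resample yields held-out labels and predicted scores; the abstract does not state the resampling scheme, so the function names (`roc_auc`, `median_auc`) and the toy numbers below are hypothetical and are not data or code from the study.

```python
from statistics import median


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win
    (the Mann-Whitney U formulation of the ROC AUC).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def median_auc(resamples):
    """Median AUC over a list of (labels, scores) pairs, one pair per resample."""
    return median(roc_auc(labels, scores) for labels, scores in resamples)


# Hypothetical held-out predictions from three resamples (illustration only,
# not data from the study): 1 = caries present, 0 = caries absent.
toy_resamples = [
    ([1, 1, 0, 0], [0.9, 0.6, 0.7, 0.2]),
    ([1, 0, 1, 0], [0.8, 0.3, 0.4, 0.1]),
    ([1, 1, 0, 0], [0.5, 0.4, 0.6, 0.3]),
]
print(median_auc(toy_resamples))  # 0.75
```

Summarizing by the median rather than the mean makes the comparison less sensitive to a single unusually easy or hard resample, which may matter with a sample of 510 subjects.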

middle-aged and elderly people; dental caries; prediction; machine learning; nomogram

赖丽冲、韦发烨、黄冬妹、曹晓莹、彭捷、冯小玲、黄惠桥


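Department of Nursing, The Second Affiliated Hospital of Guangxi Medical University, Nanning 530007

Department of Urology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021

Department of Rehabilitation Medicine, The Second Affiliated Hospital of Guangxi Medical University, Nanning 530007

Party Committee Office, The Second Affiliated Hospital of Guangxi Medical University, Nanning 530007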



2024

Chongqing Medicine (重庆医学)
Chongqing Health Information Center; Chongqing Medical Association

CSTPCD
Impact factor: 1.797
ISSN:1671-8348
Year, volume (issue): 2024, 53(14)