Research on User Churn Prediction Based on Adaptive GA-RF
Zhao Feng (赵峰) 1, Xu Danhua (徐丹华) 1
Abstract
To address telecom user churn, this paper proposes a prediction model in which an adaptive genetic algorithm optimizes a random forest. First, the telecom data provided on the Kaggle platform undergo data cleaning, feature extraction, and dimensionless scaling. SMOTE oversampling is then applied to address class imbalance, and the recall, F1, and AUC of decision tree, random forest, and other models are compared. Finally, a telecom user churn prediction model based on a random forest optimized by an adaptive genetic algorithm is proposed. The results show that the random forest optimized by the adaptive genetic algorithm outperforms the single classification models.
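The abstract does not state which adaptive scheme the authors use, so the following is only a minimal sketch of one common variant: crossover and mutation probabilities shrink for individuals whose fitness is above the population average and stay at a fixed value otherwise. The search space `BOUNDS`, the stand-in `toy_fitness`, the constants `1.0` and `0.5`, and all function names are assumptions made for illustration. In practice the fitness would be a cross-validated score (for example F1) of a random forest trained with the candidate hyperparameters.

```python
import random

# Hypothetical search space for two random-forest hyperparameters.
BOUNDS = {"n_estimators": (10, 300), "max_depth": (2, 20)}


def toy_fitness(ind):
    """Stand-in for a cross-validated score of a random forest.

    Purely illustrative: peaks at n_estimators=150, max_depth=10.
    """
    n, d = ind["n_estimators"], ind["max_depth"]
    return 1.0 - ((n - 150) / 300) ** 2 - ((d - 10) / 20) ** 2


def adaptive_rate(f, f_max, f_avg, k_above, k_below):
    """Rate falls linearly to 0 as f approaches f_max; fixed below average."""
    if f < f_avg or f_max == f_avg:
        return k_below
    return k_above * (f_max - f) / (f_max - f_avg)


def random_individual(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def tournament(fits, rng, size=3):
    """Return the index of the fittest of `size` randomly drawn individuals."""
    return max(rng.sample(range(len(fits)), size), key=lambda i: fits[i])


def evolve(fitness, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        fits = [fitness(ind) for ind in pop]
        f_max, f_avg = max(fits), sum(fits) / len(fits)
        if f_max > best_fit:
            best, best_fit = dict(pop[fits.index(f_max)]), f_max
        children = [dict(best)]  # elitism: carry the best individual over
        while len(children) < pop_size:
            i, j = tournament(fits, rng), tournament(fits, rng)
            a, b = dict(pop[i]), dict(pop[j])
            # Crossover probability is driven by the fitter parent.
            pc = adaptive_rate(max(fits[i], fits[j]), f_max, f_avg, 1.0, 1.0)
            if rng.random() < pc:
                for k in BOUNDS:  # uniform crossover, gene by gene
                    if rng.random() < 0.5:
                        a[k], b[k] = b[k], a[k]
            # Mutation probability is driven by each child's own parent.
            for child, parent_fit in ((a, fits[i]), (b, fits[j])):
                pm = adaptive_rate(parent_fit, f_max, f_avg, 0.5, 0.5)
                for k, (lo, hi) in BOUNDS.items():
                    if rng.random() < pm:
                        child[k] = rng.randint(lo, hi)
                children.append(child)
        pop = children[:pop_size]
    # Best individual among all evaluated generations.
    return best, best_fit


best_params, best_score = evolve(toy_fitness)
print(best_params, round(best_score, 4))
```

The adaptive rates protect strong candidates from disruption while still perturbing weak ones heavily. Replacing `toy_fitness` with a function that trains and scores a random forest on SMOTE-balanced training folds would connect this loop to the pipeline described in the abstract.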
Keywords
User Churn / Adaptive / Genetic Algorithm / Random Forest / SMOTE (Synthetic Minority Oversampling Technique)
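Funding
National Natural Science Foundation of China (71872002)
Key Research Project of Humanities and Social Sciences in Higher Education Institutions of Anhui Province (SK2019A0072)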
Publication year
2024