
Potential High-Value Passenger Discovery Based on SSOMaj-SMOTE-SSOMin Sampling and Ensemble Learning

Identifying potential high-value passengers raises two problems: the data are highly imbalanced, and passenger features correlate only weakly with value categories. To address them, a potential high-value passenger discovery model based on SSOMaj-SMOTE-SSOMin triple hybrid sampling and ensemble learning is proposed. The RFM (Recency, Frequency, Monetary) method is used to label passenger categories; the SSOMaj-SMOTE-SSOMin method resamples the imbalanced passenger data set; a fusion feature selection (FFS) algorithm selects passenger features; and a gradient boosting decision tree (GBDT) serves as the classifier in a passenger value prediction model that identifies potential high-value passengers. Experimental results on a PNR data set show that, compared with the baseline algorithms, the proposed model achieves better AUC and F1 values and identifies potential high-value passengers well.
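The first two stages named in the abstract (RFM labeling, then oversampling the minority class) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the trip records are invented toy data, the median-threshold labeling rule is one common way to apply RFM and is not taken from the paper, and only the plain SMOTE step is shown. The SSOMaj and SSOMin stages, the FFS feature selection, and the GBDT classifier are not reproduced here.

```python
import random
from datetime import date
from statistics import median


def rfm_table(trips, today):
    """Aggregate (passenger_id, travel_date, fare) rows into (R, F, M) per passenger.

    R = days since the most recent trip, F = number of trips, M = total fare.
    """
    table = {}
    for pid, day, fare in trips:
        r, f, m = table.get(pid, (None, 0, 0.0))
        days_ago = (today - day).days
        r = days_ago if r is None else min(r, days_ago)
        table[pid] = (r, f + 1, m + fare)
    return table


def label_high_value(table):
    """Assumed rule (not from the paper): high value (1) means recency at or
    below the median and both frequency and monetary at or above the median."""
    r_med = median(r for r, _, _ in table.values())
    f_med = median(f for _, f, _ in table.values())
    m_med = median(m for _, _, m in table.values())
    return {pid: int(r <= r_med and f >= f_med and m >= m_med)
            for pid, (r, f, m) in table.items()}


def smote(minority, n_new, k=2, seed=0):
    """Plain SMOTE: each synthetic point is a random interpolation between a
    minority point and one of its k nearest minority neighbours.
    Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Invented toy records: (passenger_id, travel_date, fare).
trips = [
    ("P1", date(2023, 12, 20), 1800.0),
    ("P1", date(2023, 11, 2), 2200.0),
    ("P1", date(2023, 9, 15), 1500.0),
    ("P2", date(2023, 3, 1), 600.0),
    ("P3", date(2023, 12, 1), 2500.0),
    ("P3", date(2023, 10, 10), 1900.0),
    ("P4", date(2023, 6, 5), 450.0),
    ("P5", date(2023, 1, 20), 700.0),
]

table = rfm_table(trips, today=date(2024, 1, 1))
labels = label_high_value(table)

# Oversample the high-value (minority) class until both classes are equal in size.
minority = [table[pid] for pid, y in labels.items() if y == 1]
majority = [table[pid] for pid, y in labels.items() if y == 0]
synthetic = smote(minority, n_new=len(majority) - len(minority))

print(labels)
print(synthetic)
```

In a fuller pipeline, the resampled (R, F, M) vectors would be joined with the other passenger features before feature selection and classifier training. Real data would also call for feature scaling before the nearest-neighbour search, since the monetary value dominates the Euclidean distance here.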

Keywords: Air transportation; SSOMaj-SMOTE-SSOMin; Feature importance ranking; Potential high-value passenger; Imbalanced classification; Ensemble learning

Feng Xia (冯霞), Hu Fang (胡昉)


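College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300

Information Technology Research Base of Civil Aviation Administration of China, Tianjin 300300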


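Funding: National Natural Science Foundation of China; Civil Aviation University of China Research Fund; Key Laboratory of Intelligent Application Technology for Civil Aviation Passenger Services

Grant numbers: 61502499; 2013QD18X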

2024

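Computer Applications and Software (计算机应用与软件)
Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology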


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.615
ISSN: 1000-386X
Year, volume (issue): 2024, 41(1)