
Research on a Credit Risk Assessment Model Based on the RF-FL-LightGBM Algorithm

To address the high-dimensional, sparse customer credit features and the class imbalance found in big-data settings, and thereby improve the accuracy of customer credit evaluation, this paper proposes a credit risk assessment model based on the RF-FL-LightGBM algorithm. First, random forest (RF) is used to rank the high-dimensional features by importance and filter them, removing redundant or uninformative features that tend to cause overfitting. Second, a balanced binary cross-entropy loss, modified on the basis of the Focal Loss function (FL), is used as the loss function of the LightGBM model to counter the drop in accuracy caused by the imbalance between positive and negative samples, thereby improving classification performance. Experiments were run on the historical customer data set of a financial leasing company. The results show that the F1-score and AUC of the RF-FL-LightGBM model are clearly higher than those of the XGBoost and LightGBM models. The RF-FL-LightGBM algorithm not only handles high-dimensional, sparse, imbalanced sample data effectively, but also improves the classification accuracy of customer attributes and runs more efficiently.
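The abstract does not give the exact form of the modified loss, so the following is only a minimal sketch of the standard binary Focal Loss, written in the shape a gradient-boosting custom objective typically needs: a per-sample gradient and Hessian with respect to the raw score `z`. The function names, the class weight `alpha`, the focusing parameter `gamma`, and their default values (0.25 and 2.0, taken from the original Focal Loss formulation) are assumptions for illustration, not values reported by the paper.

```python
import math

EPS = 1e-12  # clamp to keep log() and negative powers finite (assumption)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _p_t_alpha_t(z, y, alpha):
    """Probability and class weight assigned to the true class of one sample."""
    p = sigmoid(z)
    p_t = p if y == 1 else 1.0 - p
    p_t = min(max(p_t, EPS), 1.0 - EPS)
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return p_t, alpha_t


def focal_loss(z, y, alpha=0.25, gamma=2.0):
    """Focal loss for one sample: -alpha_t * (1 - p_t)^gamma * log(p_t)."""
    p_t, alpha_t = _p_t_alpha_t(z, y, alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def focal_grad_hess(z, y, alpha=0.25, gamma=2.0):
    """First and second derivative of focal_loss with respect to the raw score z."""
    p_t, alpha_t = _p_t_alpha_t(z, y, alpha)
    sign = 1.0 if y == 1 else -1.0   # dp_t/dz = sign * p_t * (1 - p_t)
    q = 1.0 - p_t
    u = gamma * p_t * math.log(p_t) + p_t - 1.0
    grad = sign * alpha_t * q ** gamma * u
    du = gamma * math.log(p_t) + gamma + 1.0
    dg = alpha_t * (q ** gamma * du - gamma * q ** (gamma - 1.0) * u)
    hess = dg * p_t * q
    return grad, hess
```

With `gamma = 0` the expressions reduce to class-weighted cross-entropy, whose gradient is `alpha_t * (p - y)`. Larger `gamma` shrinks the contribution of samples that are already classified confidently, which is the mechanism Focal Loss uses against class imbalance. LightGBM accepts custom objectives that return such per-sample gradients and Hessians, so a vectorized version of `focal_grad_hess` could plausibly be plugged in that way; how the authors actually integrated their loss is not stated here.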

Keywords: credit risk assessment; random forest; feature selection; Focal Loss; LightGBM algorithm

Miao Yue (苗月), Wu Chen (吴陈)


School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212000


2024

Computer & Digital Engineering
709th Research Institute, China Shipbuilding Industry Corporation


CSTPCD
Impact factor: 0.355
ISSN:1672-9722
Year, volume (issue): 2024, 52(3)