Machine Learning-based Prediction Models for Credit Card Overdue
Banks and other credit institutions aim to use customers' credit card data to build a model that predicts the overdue behavior of target customers. The focus is on identifying customers who are likely to be 'not overdue'. To address the unreliability of traditional machine learning models in predicting 'overdue' customers, this study presents a random forest model based on the precision-recall (PR) curve. During data preprocessing, the categorical data were quantified using one-hot encoding, and the sample data were balanced using the SMOTE method. The optimal number of features and a threshold of 0.182, selected based on the PR curve, maximize the score and are used to construct the random forest model. The hyperparameters are optimized using grid search. Empirical results show that the proposed model achieves a recall of 0.854 and a reliability of 0.918. Prediction performance is significantly improved compared with traditional machine learning models, making the model more advantageous for banks to evaluate customers in batches and identify high-quality customers.
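The threshold selection step can be illustrated with a minimal sketch. It is not the study's implementation: it assumes that the score being maximized is the F1 score (the abstract does not name it), the function names `pr_points` and `best_threshold` are hypothetical, and the labels and predicted probabilities are made-up toy values, so the result is unrelated to the reported threshold of 0.182.

```python
from typing import List, Tuple


def pr_points(y_true: List[int], scores: List[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for each distinct score used as a cutoff."""
    total_pos = sum(y_true)
    points = []
    for t in sorted(set(scores)):
        # A sample is predicted positive when its score reaches the cutoff.
        pred = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for p, y in zip(pred, y_true) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(pred, y_true) if p == 1 and y == 0)
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / total_pos if total_pos else 0.0
        points.append((t, precision, recall))
    return points


def best_threshold(y_true: List[int], scores: List[float]) -> Tuple[float, float]:
    """Pick the cutoff on the PR curve that maximizes F1 (assumed to be 'the score')."""
    best_t, best_f1 = 0.0, -1.0
    for t, p, r in pr_points(y_true, scores):
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy data (hypothetical): 1 marks the positive class, scores are predicted probabilities.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.45, 0.50, 0.90]

if __name__ == "__main__":
    threshold, f1 = best_threshold(toy_labels, toy_scores)
    print(f"threshold={threshold}, F1={f1:.3f}")
```

In a pipeline like the one described, the scores would be the random forest's predicted probabilities on held-out data, and the chosen cutoff would replace the default of 0.5 when assigning class labels.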