Research on Customer Credit Risk of Small Loan Companies Based on Mixed SMOTE and RF Model
The microfinance industry plays a crucial role in providing financial services to individuals who often lack access to traditional banking systems. However, the inherent risk of small-scale lending, particularly the difficulty of accurately assessing individual creditworthiness, threatens the stability and growth of microloan institutions. Individual credit risk in microloans continues to hinder the healthy and sustainable development of the microfinance industry. Specifically, accurately identifying clients with high default risk remains a significant problem for microfinance companies when conducting credit risk assessments. The theoretical significance of this research lies in proposing a hybrid model that combines the SMOTE and RF algorithms to handle the high-dimensional and imbalanced datasets found in the microloan context. Its practical significance lies in its potential to improve the accuracy of credit risk assessments, providing microfinance companies with more robust tools for making informed lending decisions. The research uses real-world data from J Microfinance Company, based in Jiangsu. To tackle the challenges presented by microloan business data, the study employs a hybrid approach. A Random Forest (RF) model is constructed first, followed by the development and evaluation of the SMOTE-RF and Borderline-SMOTE-RF models. These models integrate oversampling techniques with the predictive capabilities of the Random Forest algorithm, aiming to improve the accuracy of credit risk assessments. A Support Vector Machine (SVM) is selected for comparative experiments to benchmark the performance of the proposed models. The empirical testing shows that the Borderline-SMOTE-RF algorithm outperforms the other models, demonstrating superior classification performance in personal credit risk assessment for microloans. The hybrid approach effectively
addresses the challenges of high dimensionality and data imbalance, providing a robust solution for microfinance companies. Furthermore, based on the importance scores derived from the models, six key indicators influencing personal credit risk are identified. These indicators can serve as a reference for microfinance companies with less mature credit risk management practices, which are encouraged to strengthen the collection and use of this information. The study emphasizes the significance of these indicators in enhancing the precision of credit risk assessments for small-scale loans. Although the Borderline-SMOTE-RF algorithm emerges as the best of the tested models for personal credit risk assessment in microloans, oversampling, and in particular the incorporation of artificially synthesized samples, may bias the ranking of indicators during the selection process. Future research should therefore explore the impact of synthesized virtual samples on indicator importance, focusing on the consistency between classification performance and indicator importance scores in hybrid algorithms. Analyzing the impact of oversampling on the consistency of indicator rankings will be important for ensuring the reliability of the selected key indicators. In conclusion, this research proposes a hybrid algorithm to address the low accuracy in identifying high default-risk clients in personal credit risk assessment within the microloan industry. For high-dimensional and imbalanced credit data, the hybrid Borderline-SMOTE-RF algorithm can efficiently identify minority-class clients with high default risk, helping to protect the cash flow of microfinance companies. In addition, the research scores indicator importance and selects six crucial credit indicators, providing more scientifically informed decision support for the lending operations of microfinance companies.
Keywords: credit risk; random forest; SMOTE; classification model; indicator system
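The oversampling step that the SMOTE-RF and Borderline-SMOTE-RF models rely on can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `smote` and `is_borderline`, the parameter `k`, and the toy coordinates are all hypothetical and chosen only for illustration. The first function interpolates between a minority sample and one of its nearest minority neighbours. The second applies the Borderline-SMOTE "danger" test, under which only minority samples whose neighbourhood is at least half (but not entirely) majority class are used as seeds. In practice one would more likely use the `SMOTE` and `BorderlineSMOTE` classes from imbalanced-learn, apply them to the training split only, and fit scikit-learn's `RandomForestClassifier` on the resampled data.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by basic SMOTE interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def is_borderline(sample, minority, majority, k=5):
    """Borderline-SMOTE 'danger' test for one minority sample.

    True when at least half, but not all, of the k nearest neighbours
    (searched over both classes) belong to the majority class.
    `sample` must be an element of `minority` so it can be excluded.
    """
    labelled = [(math.dist(sample, p), 0) for p in minority if p is not sample]
    labelled += [(math.dist(sample, p), 1) for p in majority]
    labelled.sort(key=lambda t: t[0])
    nearest = labelled[:k]
    n_major = sum(label for _, label in nearest)
    return len(nearest) / 2 <= n_major < len(nearest)


# Hypothetical toy data: 4 minority (default) and 5 majority (non-default) clients.
minority = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
majority = [[2.2, 2.0], [2.0, 2.3], [6.0, 6.0], [6.0, 7.0], [7.0, 6.0]]

# Borderline-SMOTE seeds new samples only from the "danger" points.
danger = [p for p in minority if is_borderline(p, minority, majority, k=3)]
new_samples = smote(minority, n_new=6, k=2, seed=1)
```

Restricting the seeds to `danger` concentrates synthetic samples near the class boundary, which is the intuition behind Borderline-SMOTE as opposed to plain SMOTE. Because every synthetic point is an interpolation rather than an observed client, it is also the mechanism by which oversampling could shift the indicator importance rankings discussed in the abstract.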