Research on financial fraud identification based on the XGBoost model in unbalanced datasets
Given the real-world imbalance between the numbers of fraud and non-fraud samples, this study identifies financial statement fraud by applying over-sampling and under-sampling techniques together with the XGBoost model to 25 financial indicators and 2 non-financial indicators. The results show that combining the SMOTE over-sampling method with the XGBoost model achieves good overall identification performance on the unbalanced dataset, which offers a useful reference for the intelligent identification of financial statement fraud in listed companies.
unbalanced dataset; identification of financial statement fraud; SMOTE; XGBoost
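
To illustrate the over-sampling step named in the abstract, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. It is not the authors' implementation. The function name `smote`, its parameters, and the toy two-indicator data are all hypothetical and chosen only for illustration; a real study would more likely rely on an existing library (for example the `SMOTE` class in imbalanced-learn, followed by an XGBoost classifier) than on hand-written code.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Sort the other minority samples by Euclidean distance to `base`.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two made-up indicator values per company.
fraud = [[0.9, 0.1], [1.0, 0.2], [0.8, 0.3]]
non_fraud = [[0.1, 0.9], [0.2, 0.8], [0.0, 1.0], [0.3, 0.7],
             [0.1, 0.6], [0.2, 1.0], [0.4, 0.9], [0.3, 0.8]]

# Over-sample the fraud class until both classes have the same size.
new_fraud = smote(fraud, n_synthetic=len(non_fraud) - len(fraud), k=2)
balanced_X = non_fraud + fraud + new_fraud
balanced_y = [0] * len(non_fraud) + [1] * (len(fraud) + len(new_fraud))
```

The balanced lists `balanced_X` and `balanced_y` would then serve as training data for the classifier. Over-sampling of this kind should be applied to the training split only, so that the evaluation data keep the original class ratio.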