Research on Financial Fraud Identification Based on the XGBoost Model in Imbalanced Datasets
王琦¹, 熊莎丽娜¹, 詹柔¹, 张露¹, 杨鑫¹, 张健¹
Author information
- 1. School of Mathematics and Physics, Southwest Forestry University, Kunming, Yunnan 650224, China
Abstract
To address the imbalance between the numbers of fraud and non-fraud samples found in practice, this study identifies financial statement fraud using 25 financial indicators and 2 non-financial indicators, combining over-sampling and under-sampling techniques with an XGBoost model. The results show that combining SMOTE over-sampling with the XGBoost model achieves good overall identification performance on the imbalanced dataset, which offers a useful reference for the intelligent identification of financial statement fraud in listed companies.
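The abstract does not show how the over-sampling step works, so the following is only a minimal sketch of SMOTE-style interpolation using the Python standard library. The function name `smote_oversample`, the toy indicator values, and the choice of `k` are assumptions made for illustration, not the authors' data or implementation, and the XGBoost training step is omitted. Each synthetic fraud sample is placed at a random point on the line segment between a real fraud sample and one of its nearest fraud-class neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # New point lies at a random position on the segment base -> neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical): 2 indicators per firm, 4 fraud vs 12 non-fraud samples
fraud = [[0.9, 0.1], [1.0, 0.2], [0.8, 0.15], [0.95, 0.05]]
non_fraud = [[0.1 * i, 0.5 + 0.02 * i] for i in range(12)]

# Generate enough synthetic fraud samples to match the majority class size
new_fraud = smote_oversample(fraud, n_new=len(non_fraud) - len(fraud), k=3)
balanced_fraud = fraud + new_fraud
print(len(balanced_fraud), len(non_fraud))
```

In a study like this one, the balanced training set would then be passed to a gradient-boosted tree classifier such as XGBoost. Synthetic samples are normally generated from the training split only, so that the test set keeps the original class ratio.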
Keywords
imbalanced dataset / identification of financial statement fraud / SMOTE / XGBoost
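Funding
Scientific Research Fund of the Yunnan Provincial Department of Education (2022J0523)
Yunnan Provincial College Students' Innovation and Entrepreneurship Training Program ()
Publication year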
2023