Research on Financial Fraud Identification Based on the XGBoost Model in Imbalanced Datasets

王琦¹, 熊莎丽娜¹, 詹柔¹, 张露¹, 杨鑫¹, 张健¹

Author information

1. School of Mathematics and Physics, Southwest Forestry University, Kunming, Yunnan 650224, China

Abstract

In practice, the numbers of fraud and non-fraud samples are imbalanced. To address this, the study identifies financial statement fraud from 25 financial indicators and 2 non-financial indicators, combining over-sampling and under-sampling techniques with the XGBoost model. The results show that pairing the SMOTE over-sampling method with the XGBoost model gives good overall identification performance on the imbalanced dataset, which offers some reference value for the intelligent identification of financial statement fraud in listed companies.
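The abstract does not give implementation details, so the following is only a minimal sketch of the SMOTE idea it names: each synthetic minority (fraud) sample is placed at a random point on the segment between an existing minority sample and one of its k nearest minority neighbours. The function name `smote`, its parameters, and the toy two-indicator data are assumptions made for illustration and are not the authors' code or data. The variant shown picks base samples at random rather than looping over every minority sample. The balanced set would then be passed to a gradient-boosted classifier such as XGBoost, which is not shown here.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data with two made-up indicators per firm (illustrative values only).
fraud = [[0.9, 0.1], [1.0, 0.2], [0.8, 0.15], [1.1, 0.05]]
non_fraud = [[0.1 * i, 0.5 + 0.01 * i] for i in range(20)]

# Over-sample the fraud class until both classes have the same size.
fraud_balanced = fraud + smote(fraud, n_new=len(non_fraud) - len(fraud), k=3)
```

Because every synthetic point is a convex combination of two real fraud samples, the new points stay inside the region already occupied by the minority class. In a real experiment the over-sampling should be applied to the training split only, so that the test set keeps the original class ratio.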

Keywords

imbalanced dataset / financial statement fraud identification / SMOTE / XGBoost

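Funding

Scientific Research Fund of the Yunnan Provincial Department of Education (2022J0523)

Yunnan Provincial College Students' Innovation and Entrepreneurship Training Program ()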

Publication year

2023

计算机时代 (Computer Era)

Zhejiang Institute of Computing Technology; Zhejiang Computer Society

Impact factor: 0.411

ISSN: 1006-8228

Citations: 1

References: 6
