Detection of Distributed Financial Bad Data of Enterprises Based on Isolated Forest Algorithm

To detect bad data in distributed enterprise financial systems efficiently and accurately, and thereby provide reliable data support for financial security decisions, this paper studies the detection of distributed enterprise financial bad data based on the isolation forest algorithm. First, the enterprise's distributed financial metadata management system is analyzed, and the metadata catalog in the metadata warehouse is mapped to the list of actual distributed financial data in order to extract that data. Second, the data are preprocessed with the Z-score method combined with median interpolation, addressing noise interference and missing values, to ensure data quality. Third, the distribution characteristics of bad data in the preprocessed data are computed from statistics such as variance, standard deviation, skewness, and kurtosis, and the isolation forest algorithm, incorporating the binary tree structure of isolation trees, is applied to detect the bad data. Experimental results show that preprocessing the data set with the proposed method resolves anomalous spatial distributions in the data, fills in missing values, and repairs distortion caused by acquisition noise. The maximum detection time is 5.4 s and the maximum detection accuracy is 0.93, so the method compares favorably in both detection efficiency and detection accuracy.
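The pipeline summarized in the abstract (median filling and Z-score screening, descriptive statistics, then isolation-tree scoring) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`preprocess`, `moments`, `isolation_scores`), the Z-score threshold of 3.0, the choice to replace flagged values with the median, and the tree count and subsample size are all assumptions made for illustration. Only the Python standard library is used, and at least two input rows are assumed.

```python
import math
import random
import statistics


def preprocess(values, z_thresh=3.0):
    """Fill missing entries (None) with the median, then replace entries whose
    |Z-score| exceeds z_thresh with the median (assumed noise handling)."""
    present = [v for v in values if v is not None]
    med = statistics.median(present)
    filled = [med if v is None else v for v in values]
    mean = statistics.fmean(filled)
    std = statistics.pstdev(filled)
    if std == 0:
        return filled
    return [med if abs((v - mean) / std) > z_thresh else v for v in filled]


def moments(values):
    """Return population variance, standard deviation, skewness and kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    kurt = sum((v - mean) ** 4 for v in values) / (n * std ** 4)
    return var, std, skew, kurt


def _c(n):
    """Average path length of an unsuccessful binary-search-tree lookup on n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def _build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random value."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return ("node", dim, split,
            _build_tree(left, depth + 1, max_depth, rng),
            _build_tree(right, depth + 1, max_depth, rng))


def _path_length(tree, row, depth=0):
    """Depth at which row lands, plus the expected extra depth of its leaf."""
    if tree[0] == "leaf":
        return depth + _c(tree[1])
    _, dim, split, left, right = tree
    child = left if row[dim] < split else right
    return _path_length(child, row, depth + 1)


def isolation_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Anomaly score in (0, 1] per row; values near 1 mean easily isolated."""
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))  # assumes len(rows) >= 2
    max_depth = math.ceil(math.log2(psi))
    trees = [_build_tree(rng.sample(rows, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = _c(psi)
    return [2.0 ** (-sum(_path_length(t, r) for t in trees) / (n_trees * norm))
            for r in rows]
```

In use, each numeric column would first pass through `preprocess`, `moments` would summarize the cleaned column, and the rows would then be scored with `isolation_scores`. Rows whose score exceeds a chosen cutoff would be treated as candidate bad data. The cutoff is a tuning decision that the abstract does not specify.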

Keywords: isolation forest algorithm; isolation trees; distributed financial data; data detection

LI Zixia (李自霞), ZHOU Bo (周波)


Finance Office, Xuancheng Vocational and Technical College, Xuancheng, Anhui 242000, China

School of Computer Science and Information Engineering, Hefei University of Technology, Xuancheng, Anhui 242000, China


Funding: Anhui Provincial Quality Engineering Project (2022cxtd171)

2024

Journal of Hubei University of Arts and Science
Hubei University of Arts and Science


CHSSCD
Impact factor: 0.164
ISSN: 2095-4476
Year, volume (issue): 2024, 45(8)