Enterprise Risk Identification Based on Text Analysis and Machine Learning

Chen Youyu (陈友余)¹, Zhao Jinjing (赵金晶)¹, Wang Xin (王欣)¹, Liu Chunxia (刘纯霞)¹

Author information

1. School of Accounting, Hunan University of Finance and Economics, Changsha 410205

Abstract

Building on traditional financial indicators, this paper applies text analysis and natural language processing to reconstruct the enterprise risk identification index system from past, future, and sentiment perspectives. It then introduces machine learning methods to construct an enterprise risk identification model, using the financial data of listed companies and the text of their management discussion and analysis (MD&A) as data sources. The conclusions are as follows: 1) By providing incremental information and refining the risk measurement scale, a three-dimensional risk identification system that combines temporal sensitivity and sentiment insight measures and identifies enterprise risk more comprehensively and accurately. 2) A comparison of the accuracy of the AdaBoost, Hist Gradient Boosting, Random Forest, and Bagging models finds that the AdaBoost model performs best and is the most robust, so it can be used for enterprise risk identification and prediction. 3) Applying machine learning and SHAP methods to rank the importance of enterprise risk features and to analyze the mechanism of risk identification reveals the key factors influencing enterprise risk and the effect of each risk feature on the identification model. The study provides empirical evidence and decision support for designing enterprise risk identification index systems and optimizing risk identification models, and it supports high-quality enterprise development and the security and stability of supply chains.
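As a rough illustration of how text-based indicators for the past, future, and sentiment dimensions might be derived from an MD&A passage, the sketch below counts dictionary terms and turns them into three features. It is not the authors' method: the function name `text_risk_features`, the English word lists, and the feature definitions are all hypothetical assumptions made for this example, and a simple dictionary count stands in for whatever text analysis the paper actually uses.

```python
import re
from collections import Counter

# Hypothetical word lists, for illustration only; the paper's lexicons are not given here.
PAST_TERMS = {"was", "were", "had", "previously", "historically"}
FUTURE_TERMS = {"will", "expect", "plan", "forecast", "outlook"}
NEGATIVE_TERMS = {"loss", "decline", "risk", "uncertainty", "shortage"}
POSITIVE_TERMS = {"growth", "improve", "stable", "profit", "gain"}


def text_risk_features(mdna_text: str) -> dict:
    """Return past share, future share, and net tone for one text passage."""
    tokens = re.findall(r"[a-z]+", mdna_text.lower())
    total = len(tokens)
    if total == 0:
        return {"past_share": 0.0, "future_share": 0.0, "net_tone": 0.0}

    counts = Counter(tokens)

    def hits(terms):
        # Number of tokens that belong to the given word list.
        return sum(counts[t] for t in terms)

    pos, neg = hits(POSITIVE_TERMS), hits(NEGATIVE_TERMS)
    return {
        # Share of tokens that refer to the past or to the future.
        "past_share": hits(PAST_TERMS) / total,
        "future_share": hits(FUTURE_TERMS) / total,
        # Net tone in [-1, 1]; 0.0 when no sentiment word appears.
        "net_tone": (pos - neg) / (pos + neg) if (pos + neg) else 0.0,
    }


sample = "Revenue was stable. We expect supply shortage risk will increase."
features = text_risk_features(sample)
print(features)
```

Features of this kind could be appended to the financial indicators before a classifier is fitted; how the paper weights or combines them is not stated in the abstract.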

Key words

text analysis/machine learning/external risk/supply chain risk/future risk

Publication year

2024

China Journal of Econometrics (计量经济学报)

CSTPCD, CSCD
