Financial distress prediction based on convolutional neural networks: Imaged firms' financial reports
In the past few decades, the prediction and detection of corporate financial distress has been an important research topic in the finance and accounting literature. In an era in which globalization and deglobalization intersect, Chinese listed companies face a more complex external and internal environment than ever before, which has led to an increase in their debt and the corresponding default risk. At the same time, with the further opening up of the Chinese economy and financial system, the financial position of Chinese listed companies is beginning to have a significant impact on international financial markets. In this context, it is increasingly important to build more accurate corporate financial distress prediction models that identify early signals of potential bankruptcy. Using these forecasting models, Chinese regulators can identify financially unhealthy firms in time, which is also of great significance for preventing and defusing systemic financial risks.

In this paper, we use deep convolutional neural networks (CNNs) to predict the financial distress of Chinese listed companies by imaging their fundamental data. This paper focuses on two research questions. First, when using a CNN to predict future financial distress, can a model using financial data from several past years achieve better prediction performance than one using the data of a single year? Second, can a deep learning model that extracts features from raw financial data and makes predictions outperform a combination of accounting-expert-selected features (financial ratios selected based on accounting knowledge) and traditional machine learning models?

To explore these questions, we first convert the raw financial items from firms' financial statements of the past three years into color images that are taken as the input of a deep convolutional neural network. The advantage of imaging the data is not only that the financial data is transformed into image data that a CNN is 
better at processing, but also that images can better represent the relationship between different raw financial items from different years: the position of each pixel in the image represents the correlation between different raw financial items, and the different color channels of the color image represent the different years to which the raw financial items belong. On this basis, this paper uses a CNN to predict the financial distress of Chinese listed companies, relying on the ability of deep learning to learn and extract useful information directly from the raw data. As a comparison, this paper also transforms the financial statement data of the past year into greyscale images to explore whether the financial data of the past three years is more helpful in improving prediction accuracy.

Unlike traditional machine learning models, deep learning models that use raw data as input can learn features and make predictions directly from the raw data, whereas traditional machine learning methods use financial ratios that human experts select based on accounting knowledge as input features. By comparing the two kinds of models, this paper explores whether a deep learning model that learns from raw data can outperform a traditional machine learning model based on expert-selected features.

The results show that the CNN model based on color images (three years of financial data) outperforms the CNN model based on greyscale images (one year of financial data). Moreover, using deep learning models to interpret the raw data, extract features and make predictions yields performance comparable to that of "human-expert-selected financial ratios + traditional machine learning models". In particular, in terms of the most important evaluation metric, recall, the CNN model performs better, which suggests that a "fully automated" deep learning model is able to accurately identify more 
financially distressed firms. In addition, we find that the relationship between the raw-data-based deep learning model and the expert-knowledge-based machine learning model is complementary rather than substitutive: when we combine the two models, the combined model performs better than either of them. In a further analysis of different lengths of financial data (i.e., 1, 2, 3, 4 and 5 years), we also show that the best forecasting performance among these lengths is achieved with 3 years of financial data, and that the time-series information of the financial data does not provide additional information for improving the performance of the financial distress prediction model.

This paper contributes to the existing literature in several ways. First, we develop a method for converting multi-year financial data into color images, and on this basis we extend the application of computer vision methods by introducing CNNs into the analysis of firms' financial data. Second, this paper explores the relationship between artificial intelligence and human expert knowledge, and shows that the two can complement rather than substitute for each other to achieve better model performance. Finally, this paper explores the impact of financial data covering time periods of different lengths on model performance, and finds that the model combining financial data from the past three years performs best on every evaluation metric. In addition, experiments based on recurrent neural networks (RNNs) show that time-series information does not provide more useful information for predicting financial distress.
Deep learning; Convolutional neural networks; Financial distress; Financial ratios
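
The imaging step described in the abstract, with one color channel per fiscal year and one pixel per raw financial item, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (scale_to_pixels, financials_to_image), the per-year min-max scaling to the range 0 to 255, the row-major placement of items on a zero-padded square grid, and the sample figures are all assumptions made for illustration. In particular, the abstract states that pixel positions encode the correlation between items, whereas this sketch simply takes the item order as given.

```python
import math


def scale_to_pixels(values):
    """Min-max scale one year's raw items to integer intensities in [0, 255].

    Assumption: scaling is done within a single year; the source does not
    specify how raw amounts are normalized.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant year carries no contrast, so map it to all zeros.
        return [0 for _ in values]
    return [round(255 * (v - lo) / (hi - lo)) for v in values]


def financials_to_image(items_by_year):
    """Build a side x side x n_years image as nested lists.

    items_by_year: one list of raw financial items per fiscal year, with the
    same items in the same (assumed, pre-arranged) order in every year.
    Three years give a color image; a single year gives a greyscale image.
    """
    n_items = len(items_by_year[0])
    if n_items == 0:
        raise ValueError("at least one financial item is required")
    if any(len(year) != n_items for year in items_by_year):
        raise ValueError("every year must report the same items")

    # Smallest square grid that holds all items; unused pixels stay at zero.
    side = math.isqrt(n_items - 1) + 1
    padding = [0] * (side * side - n_items)
    channels = [scale_to_pixels(year) + padding for year in items_by_year]

    # Pixel (row, col) holds the same item across years, one channel per year.
    return [
        [[channel[row * side + col] for channel in channels] for col in range(side)]
        for row in range(side)
    ]


# Hypothetical raw items for one firm over three fiscal years.
three_years = [
    [0.0, 20.0, 40.0, 60.0, 100.0],   # year t-2
    [100.0, 60.0, 40.0, 20.0, 0.0],   # year t-1
    [5.0, 5.0, 5.0, 5.0, 5.0],        # year t
]
image = financials_to_image(three_years)      # color image: 3 channels
grey = financials_to_image(three_years[:1])   # greyscale image: 1 channel
```

Passing three years yields a three-channel image, while passing a single year yields the one-channel greyscale counterpart used for the comparison in the abstract. Either nested list could then be converted to a tensor and fed to a CNN.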