The authors analyze the sentiment tendency of annual reports using three approaches: a dictionary method, machine learning, and transfer learning. The dictionary method builds a dictionary of positive and negative words and uses the proportion of positive and negative words in each annual report as the basis for sentiment judgment. The machine learning approach uses random forest, support vector regression, and LightGBM (LGB). The authors construct an ordered dictionary of high-frequency words to extract features from the annual reports, producing a word-frequency feature matrix that serves as the model input. The models predict the cumulative excess return in the month after an annual report is disclosed, and the prediction is used as the sentiment tendency index, which forms the sentiment factor. The transfer learning approach uses a Chinese BERT model with word-level granularity. To handle the very long text of annual reports, the authors first construct eight major catalog features and then apply the MemRecall mechanism from CogLTX under each catalog feature to process the long text further. The results show that the sentiment factors generated by transfer learning perform best, followed by those generated by machine learning. On very long texts, the sentiment factors constructed by the dictionary method are less effective.
Sentiment analysis; Dictionary method; Machine learning; Transfer learning; Cumulative abnormal return; Financial text
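
The dictionary method described in the abstract can be illustrated with a minimal sketch. The word lists, the function name `sentiment_proportions`, and the "net tone" score (positive share minus negative share) are assumptions made for illustration only; the abstract does not specify the authors' dictionaries or how they combine the two proportions.

```python
from collections import Counter

# Hypothetical toy lexicons; the authors' actual dictionaries are not given.
POSITIVE_WORDS = {"growth", "profit", "improve"}
NEGATIVE_WORDS = {"loss", "decline", "risk"}


def sentiment_proportions(tokens):
    """Return (positive share, negative share, net tone) for a token list.

    Shares are counts of dictionary words divided by the total token count.
    Net tone (positive share minus negative share) is an assumed summary
    score, not necessarily the one used by the authors.
    """
    total = len(tokens)
    if total == 0:
        return 0.0, 0.0, 0.0
    counts = Counter(tokens)
    pos = sum(counts[w] for w in POSITIVE_WORDS)
    neg = sum(counts[w] for w in NEGATIVE_WORDS)
    pos_share = pos / total
    neg_share = neg / total
    return pos_share, neg_share, pos_share - neg_share


if __name__ == "__main__":
    # A pre-tokenized toy "report"; real annual reports would first need
    # Chinese word segmentation, which is outside this sketch.
    report = ["profit", "growth", "loss", "the"]
    print(sentiment_proportions(report))
```

Because the score depends only on word counts, it ignores context and negation, which is one plausible reason such factors weaken on very long documents.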