Text classification method based on knowledge enhancement
To address inaccurate text classification caused by low-quality data, data imbalance, and small datasets, a text classification algorithm based on knowledge enhancement is proposed. First, the algorithm augments the dataset with external knowledge. Second, the original text and the external knowledge are embedded using GloVe word vectors, and text features are extracted with CNN, LSTM, and BERT models. Third, the features extracted from the original text and from the external knowledge text are fused to obtain the final text features. Finally, the fused features are fed into a multilayer perceptron to obtain the classification result. Experiments on different datasets show that, compared with the baseline models, the classification accuracy of CNN(KB), LSTM(KB), and BERT(KB) improves by 5.01%, 7.92%, and 1.5%, respectively, on the SST-5 dataset; that of LSTM(KB) and BERT(KB) improves by 1.76% and 1.29%, respectively, on the SST-2 dataset; and that of CNN(KB), LSTM(KB), and BERT(KB) improves by 0.97%, 2.87%, and 0.76%, respectively, on the IMDB dataset. These results show that the proposed algorithm effectively improves text classification accuracy and can serve as a reference for text classification applications in different fields.
deep learning; neural networks; text classification; knowledge enhancement; feature extraction
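
The fusion and classification steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the two feature vectors are fused, so concatenation is assumed here. The feature vectors are random placeholders standing in for CNN, LSTM, or BERT encoder outputs, and all names, dimensions, and the untrained two-layer perceptron are hypothetical choices for illustration, not the authors' implementation.

```python
import math
import random


def fuse(text_feat, knowledge_feat):
    """Fuse two feature vectors by concatenation (an assumed fusion rule)."""
    return list(text_feat) + list(knowledge_feat)


def linear(x, weights, bias):
    """Compute weights @ x + bias, with weights stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over a list of scores."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Randomly initialise one dense layer (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(text_feat, knowledge_feat, params):
    """Fuse the two feature vectors, then apply a two-layer perceptron."""
    (w1, b1), (w2, b2) = params
    fused = fuse(text_feat, knowledge_feat)
    hidden = relu(linear(fused, w1, b1))
    return softmax(linear(hidden, w2, b2))


# Hypothetical sizes; NUM_CLASSES = 5 mirrors the five labels of SST-5.
TEXT_DIM, KB_DIM, HIDDEN, NUM_CLASSES = 8, 8, 16, 5

rng = random.Random(0)
params = (init_layer(rng, TEXT_DIM + KB_DIM, HIDDEN),
          init_layer(rng, HIDDEN, NUM_CLASSES))

# Placeholder features standing in for the encoder outputs.
text_feat = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]
kb_feat = [rng.gauss(0, 1) for _ in range(KB_DIM)]

probs = classify(text_feat, kb_feat, params)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
print(predicted_class, [round(p, 3) for p in probs])
```

In a real system the weights would be learned jointly with the encoders. Concatenation keeps both feature sets intact and leaves it to the perceptron to weigh them, but other fusion rules (summation, gating, attention) would fit the same interface.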