To address insufficient text feature extraction and low classification accuracy in current Chinese text classification, this paper proposes a keyword extraction model based on E-TF-IDF (Expand Term Frequency-Inverse Document Frequency) and a text classification model based on CNN-GRU (Convolutional Neural Network-Gated Recurrent Unit). The E-TF-IDF model expands keywords according to the probability that adjacent keywords occur together, which yields better keyword feature extraction. CNN-GRU is well suited to sequence classification and has fewer parameters, which reduces the risk of overfitting on small data sets. Experimental results show that CNN-GRU achieves high classification accuracy, averaging 97.88%.
Key words
Text classification/Feature extraction/E-TF-IDF/CNN-GRU
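The abstract does not give the E-TF-IDF formulas, so the following is only a minimal sketch of the general idea under stated assumptions: plain TF-IDF scoring, followed by a hypothetical expansion step that merges two adjacent keywords into one phrase when the second follows the first with sufficiently high probability. The function names `tf_idf` and `expand_keywords`, the threshold `min_prob`, and the toy English tokens are illustrative assumptions and not the authors' method; Chinese text would first need to be tokenized by a word segmenter.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document (plain TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


def expand_keywords(doc, keywords, min_prob=0.5):
    """Hypothetical expansion step (an assumption, not the paper's formula).

    Merge adjacent keywords a, b into the phrase "a b" when the estimated
    probability that b immediately follows a is at least min_prob.
    """
    first = Counter()  # occurrences of keyword a that have a successor
    pair = Counter()   # occurrences of keyword a directly followed by keyword b
    for a, b in zip(doc, doc[1:]):
        if a in keywords:
            first[a] += 1
            if b in keywords:
                pair[(a, b)] += 1
    return {f"{a} {b}" for (a, b), n in pair.items() if n / first[a] >= min_prob}


# Toy corpus of already-tokenized documents (illustrative data only).
docs = [
    ["neural", "network", "model", "neural", "network"],
    ["stock", "market", "model"],
    ["football", "match", "report"],
]
scores = tf_idf(docs)
# Take the two highest-scoring terms of the first document as its keywords.
top = set(sorted(scores[0], key=scores[0].get, reverse=True)[:2])
print(top, expand_keywords(docs[0], top))
```

In this sketch the expanded phrases would be added to the keyword set as extra features before classification; how the paper weights or selects them is not stated in the abstract.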