Because citizen hotline records are mostly short texts with sparse features, a short text extension method and a text classification model based on dual-channel feature fusion (BERT-BiGRU-TextCNN, BGTC) are proposed for the automatic recognition and classification of citizen hotline texts. First, the TF-IWF model and the LDA topic model are used to construct a core vocabulary. Then, Word2Vec is used to calculate word similarity, extending both the short text content and its word vector features. Finally, the extended texts are classified by the BGTC model, which fuses the feature information of the BERT-TextCNN and BERT-BiGRU-Attention channels. Multiple comparative experiments show that this method performs better on the citizen hotline text classification task, with accuracy and F1 score reaching 85.6% and 85.8%, respectively.
Key words
citizen hotline/short text extension/text classification/feature fusion
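The short text extension step summarized in the abstract can be illustrated with a minimal sketch. It assumes one common formulation of TF-IWF (term frequency multiplied by the log of total corpus word occurrences over the word's own occurrences), and it uses a plain dictionary of toy vectors in place of a trained Word2Vec model. The function names `tf_iwf` and `extend_text`, the `threshold` and `top_k` parameters, and the sample data are all hypothetical and are not taken from the paper; the LDA step and the BGTC classifier are omitted.

```python
import math
from collections import Counter


def tf_iwf(docs):
    """Score every word of every tokenized document with TF-IWF.

    TF  = occurrences of the word in the document / document length
    IWF = log(total word occurrences in corpus / occurrences of the word in corpus)
    This is an assumed formulation; the paper may weight terms differently.
    """
    corpus_counts = Counter(w for doc in docs for w in doc)
    total = sum(corpus_counts.values())
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            w: (c / len(doc)) * math.log(total / corpus_counts[w])
            for w, c in tf.items()
        })
    return scores


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extend_text(doc, core_vocab, vectors, threshold=0.8, top_k=1):
    """Append up to top_k sufficiently similar core-vocabulary words per token.

    `vectors` stands in for Word2Vec embeddings (hypothetical toy data here).
    """
    extended = list(doc)
    for w in doc:
        if w not in vectors:
            continue  # no embedding available, so nothing to compare against
        candidates = [
            (cosine(vectors[w], vectors[c]), c)
            for c in core_vocab
            if c in vectors and c != w and c not in extended
        ]
        candidates.sort(reverse=True)
        extended += [c for s, c in candidates[:top_k] if s >= threshold]
    return extended


# Toy usage: rank words by TF-IWF, then extend a short text.
docs = [["road", "pothole", "repair"],
        ["water", "pipe", "leak"],
        ["road", "noise", "night"]]
scores = tf_iwf(docs)

vectors = {"pothole": [1.0, 0.1], "pavement": [0.9, 0.2],
           "leak": [0.1, 1.0], "plumbing": [0.2, 0.9]}
core_vocab = ["pavement", "plumbing"]
extended = extend_text(["pothole", "repair"], core_vocab, vectors)
```

In this toy run, the rarer word "pothole" scores higher than "road" in the first document, and the short text is extended with "pavement" because only that core word exceeds the similarity threshold. In practice the core vocabulary would come from the TF-IWF and LDA steps, and the vectors from a trained Word2Vec model.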