Short Text Classification Research on Socialized Q&A Communities Based on Keyword Expansion: Taking the Legal Q&A Community as an Example
[Research purpose] This study applies a keyword vector feature extension method to short text classification in socialized Q&A communities, addressing the sparse and ambiguous semantic features of short question texts and improving the information service quality of Q&A communities. [Research method] Keyword features are extended by combining TF-IDF with Word2vec, which enriches the semantic information of short texts. A CNN-BiLSTM-Attention model is then constructed, drawing on the strengths of CNN for feature extraction, BiLSTM for capturing contextual information, and the Attention mechanism for weight allocation. Taking data from "china.findlaw.cn" as an example, the keyword vector features are extended and the CNN-BiLSTM-Attention model is used to classify short texts in the legal Q&A community. [Research conclusion] Empirical research on 8 legal topics shows that classification performance improves after keyword expansion, and that the best performance is achieved when the number of expanded keywords reaches 13. When the CNN-BiLSTM-Attention model is used to classify the expanded legal Q&A short texts, classification accuracy reaches 97.63%, which is 1.08% higher on average than several other classifiers.
keyword feature extension; socialized Q&A community; short text classification; deep learning; CNN-BiLSTM-Attention model
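
The keyword expansion step described in the abstract (TF-IDF to select keywords, word vectors to add semantically close terms) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `tfidf_keywords` and `expand_keywords`, the smoothed IDF formula, the toy two-dimensional vectors standing in for trained Word2vec embeddings, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=2):
    """Return the top_k highest-scoring TF-IDF terms for each tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    result = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1.
        scores = {
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result.append(ranked[:top_k])
    return result


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_keywords(keywords, embeddings, n_expand=1):
    """Append to each keyword its n_expand nearest neighbours in embedding space."""
    expanded = list(keywords)
    for kw in keywords:
        if kw not in embeddings:
            continue  # keyword has no vector, so it cannot be expanded
        candidates = [w for w in embeddings if w != kw and w not in expanded]
        candidates.sort(key=lambda w: (-cosine(embeddings[kw], embeddings[w]), w))
        expanded.extend(candidates[:n_expand])
    return expanded


# Hypothetical tokenized questions and toy vectors (stand-ins for Word2vec output).
docs = [
    ["divorce", "property", "division", "after", "divorce"],
    ["traffic", "accident", "compensation", "after", "injury"],
]
embeddings = {
    "divorce": [0.9, 0.1],
    "marriage": [0.8, 0.2],
    "division": [0.5, 0.5],
    "inheritance": [0.4, 0.6],
    "accident": [0.1, 0.9],
    "collision": [0.2, 0.8],
}

keywords = tfidf_keywords(docs, top_k=2)
expanded_first = expand_keywords(keywords[0], embeddings, n_expand=1)
```

In this sketch the expanded keyword list would be appended to the original short text before it is vectorized and passed to a classifier. The abstract reports that 13 expanded keywords gave the best result on its data; the `n_expand` value here is only a placeholder.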