Semi-supervised Classification for Short Text Based on Multi-grained Graphs and Attention Mechanism
Sparse and fuzzy semantics, insufficient information, and irregular expressions make short text classification challenging. Moreover, existing short text classification methods ignore the interactive information between terms and cannot fully exploit implicit semantics, which limits their classification performance. To address these problems, a semi-supervised short text classification method based on multi-grained graphs and an attention mechanism, named MgGAt, is proposed. Two types of graphs are constructed, one at word granularity and one at text granularity, and their semantic information is mined to perform the classification task. First, the model builds a word-level graph, captures word embeddings, and learns the feature representations of a short text. Specifically, intra-hop and inter-hop attention are introduced on the word-level graph to extract high-order information hidden in the terms from various semantic perspectives and to obtain semantically rich word embeddings. In addition, a pooling strategy designed according to the characteristics of the word embeddings aggregates them into text vectors. Thereafter, a text-level graph is constructed, and, with the help of a partially labeled set of texts, a Graph Neural Network (GNN) performs label propagation and reasoning on the graph to achieve semi-supervised short text classification. Experimental results on four public datasets demonstrate that, compared with baseline models, the classification accuracy and F1 value of the proposed MgGAt increase on average by 1.18 and 1.37 percentage points, respectively.
short text classification; semi-supervised classification; Graph Neural Network (GNN); attention mechanism; multi-grained graph
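
The text-level stage described in the abstract (pool word embeddings into text vectors, connect texts in a graph, spread labels from the labeled subset) can be illustrated with a minimal sketch. This is not the MgGAt model: mean pooling stands in for the paper's pooling strategy, a cosine-similarity threshold stands in for its text-graph construction, and classical label propagation stands in for the GNN and attention layers. All function names, the toy embeddings, and the 0.5 threshold are assumptions made for illustration only.

```python
import math

def mean_pool(word_vectors):
    """Average word embeddings into one text vector (assumed pooling, not the paper's)."""
    dim = len(word_vectors[0])
    return [sum(v[i] for v in word_vectors) / len(word_vectors) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_text_graph(text_vectors, threshold=0.5):
    """Weighted adjacency matrix: connect texts whose similarity reaches the threshold."""
    n = len(text_vectors)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(text_vectors[i], text_vectors[j])
            if sim >= threshold:
                adj[i][j] = adj[j][i] = sim
    return adj

def propagate_labels(adj, labels, num_classes, iterations=20):
    """Classical label propagation: labeled nodes stay fixed, others average their neighbours."""
    n = len(adj)
    scores = [[0.0] * num_classes for _ in range(n)]
    for node, cls in labels.items():
        scores[node][cls] = 1.0
    for _ in range(iterations):
        updated = []
        for i in range(n):
            degree = sum(adj[i])
            if i in labels or degree == 0:
                updated.append(scores[i][:])  # clamp labeled or isolated nodes
                continue
            updated.append([
                sum(adj[i][j] * scores[j][c] for j in range(n)) / degree
                for c in range(num_classes)
            ])
        scores = updated
    return [max(range(num_classes), key=lambda c: row[c]) for row in scores]

# Toy data: four short texts, each a list of 2-dimensional word embeddings.
texts = [
    [[1.0, 0.0], [0.9, 0.1]],  # text 0, labeled as class 0
    [[0.8, 0.2]],              # text 1, unlabeled
    [[0.0, 1.0], [0.1, 0.9]],  # text 2, labeled as class 1
    [[0.2, 0.8]],              # text 3, unlabeled
]
known_labels = {0: 0, 2: 1}

text_vectors = [mean_pool(words) for words in texts]
graph = build_text_graph(text_vectors, threshold=0.5)
predictions = propagate_labels(graph, known_labels, num_classes=2)
print(predictions)
```

In this toy setting only two of the four texts carry labels, and the unlabeled texts receive a class through their graph neighbours, which is the semi-supervised behaviour the abstract attributes to the text-level graph.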