A text classification model combining a knowledge graph and multiple neural networks
Existing text classification methods cannot fully extract the semantic features of Chinese text, which degrades classification performance. To address this problem, a text classification model that combines a knowledge graph with multiple neural networks, KGMNN (knowledge graph and multiple neural network), is proposed. First, the model uses Word2Vec as the embedding layer to represent the text as vectors, and uses multiple neural networks to extract the global and local semantic features of the text. Second, an external knowledge graph is used to obtain a set of concepts related to the text to enrich the text features, and an attention mechanism is introduced to compute a weight for each concept, reducing the influence of irrelevant, noisy concepts on classification. Finally, the text features and their concept features are fused and fed to a Softmax classifier to obtain the classification result. The model is evaluated on the THUCNews short-text and long-text datasets. The experimental results show that the proposed model achieves classification accuracies of 96.67% and 97.57%, respectively, outperforming traditional models.
Keywords: neural network; attention mechanism; knowledge graph; Chinese text classification
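The second and third steps of the abstract (weighting concepts with attention, then fusing text and concept features before the Softmax classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the dot-product scoring function, fusion by concatenation, the hypothetical names `attend_concepts` and `classify`, and the toy vectors, which stand in for the output of the neural feature extractors and for knowledge-graph concept embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend_concepts(text_feat, concept_vecs):
    """Weight each concept by its similarity to the text feature.

    Assumption: dot-product scoring. The abstract only says that an
    attention mechanism computes a weight for each concept.
    """
    weights = softmax([dot(text_feat, c) for c in concept_vecs])
    dim = len(concept_vecs[0])
    # Attention-weighted sum of the concept vectors.
    concept_feat = [
        sum(w * c[i] for w, c in zip(weights, concept_vecs))
        for i in range(dim)
    ]
    return weights, concept_feat


def classify(text_feat, concept_vecs, W, b):
    """Fuse text and concept features, then apply a Softmax classifier.

    Assumption: fusion is plain concatenation followed by one linear layer.
    """
    weights, concept_feat = attend_concepts(text_feat, concept_vecs)
    fused = text_feat + concept_feat  # list concatenation
    logits = [dot(row, fused) + bias for row, bias in zip(W, b)]
    return weights, softmax(logits)


# Toy inputs (hypothetical): a 2-d text feature and three concept vectors,
# one aligned with the text, one orthogonal, one opposed.
text_feat = [1.0, 0.0]
concept_vecs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
W = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]  # 2 classes x 4 fused dims
b = [0.0, 0.0]

weights, probs = classify(text_feat, concept_vecs, W, b)
print("concept weights:", weights)
print("class probabilities:", probs)
```

In this toy setting the concept aligned with the text feature receives the largest weight and the opposed concept the smallest, which is the sense in which attention can damp irrelevant concepts before fusion.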