
Dual-channel text classification model based on feature reuse

Although structures combining convolutional neural networks (CNN) and recurrent neural networks (RNN) have been widely used in text classification, most CNN-RNN text classification models adopt a single-channel mode, which greatly limits their ability to extract text features. Therefore, a dual-channel text classification model based on feature reuse is proposed. First, in the RNN channel, the model uses a long short-term memory (LSTM) network together with a gated recurrent unit (GRU) network to extract the contextual semantic information of the text, while the CNN channel extracts local features of the text. Second, an attention mechanism is introduced into each of the two channels so that the model can focus accurately on the keywords in the text. In addition, the RNN channel is modified to reuse the original features, further strengthening the model's ability to extract global features. Evaluation on the THUCNews dataset shows that the proposed model reaches a classification accuracy of 96.61%, a better classification result than the compared models.
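The architecture described above can be sketched in PyTorch. All layer sizes, the attention formulation, and the way "feature reuse" is wired (concatenating the original embeddings back onto the RNN-channel output before attention) are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualChannelClassifier(nn.Module):
    """Illustrative dual-channel text classifier: an LSTM+GRU channel with
    feature reuse and a multi-kernel CNN channel, each with attention."""
    def __init__(self, vocab_size=5000, embed_dim=128, hidden=64,
                 num_filters=64, kernel_sizes=(3, 5, 7), num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # RNN channel: LSTM followed by GRU, both bidirectional
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.gru = nn.GRU(2 * hidden, hidden, batch_first=True, bidirectional=True)
        # Feature reuse: original embeddings are concatenated back onto the
        # GRU output before attention (one plausible reading of the abstract)
        self.rnn_attn = nn.Linear(2 * hidden + embed_dim, 1)
        # CNN channel: parallel 1-D convolutions over the embeddings
        # (odd kernel sizes with padding k // 2 keep sequence length unchanged)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
            for k in kernel_sizes)
        self.cnn_attn = nn.Linear(num_filters * len(kernel_sizes), 1)
        self.fc = nn.Linear(
            2 * hidden + embed_dim + num_filters * len(kernel_sizes), num_classes)

    def forward(self, x):                       # x: (batch, seq_len) token ids
        e = self.embed(x)                       # (batch, seq, embed_dim)
        # --- RNN channel with feature reuse ---
        h, _ = self.lstm(e)
        h, _ = self.gru(h)                      # (batch, seq, 2 * hidden)
        h = torch.cat([h, e], dim=-1)           # reuse the original embeddings
        w = F.softmax(self.rnn_attn(h), dim=1)  # per-token attention weights
        rnn_vec = (w * h).sum(dim=1)
        # --- CNN channel ---
        c = torch.cat([F.relu(conv(e.transpose(1, 2))) for conv in self.convs],
                      dim=1).transpose(1, 2)    # (batch, seq, filters * n_kernels)
        w = F.softmax(self.cnn_attn(c), dim=1)
        cnn_vec = (w * c).sum(dim=1)
        # --- fuse the two channels and classify ---
        return self.fc(torch.cat([rnn_vec, cnn_vec], dim=-1))
```

A simplified dot-product-style attention is used in both channels for brevity; the paper's exact attention mechanism may differ.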

Keywords: text classification; attention mechanism; dual channel; feature extraction; feature reuse

LIAO Wei, LI Qihang, XU Zhen, MENG Jingwen


School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China

School of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China


Funding: National Natural Science Foundation of China (62001282); Young Eastern Scholar Program at Shanghai Institutions of Higher Learning (QD2017043)


Engineering Journal of Wuhan University
Wuhan University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.621
ISSN: 1671-8844
Year, Volume (Issue): 2024, 57(9)