
A Capsule Neural Network Incorporating Multi-scale Feature Attention and Its Application to Text Classification

In recent years, capsule neural networks (Capsnets) have been successfully applied to text classification due to their powerful text feature learning ability. In previous research, all extracted n-gram features play equal roles in text classification, ignoring that the importance of each n-gram feature corresponding to a word should be determined by the specific context; this directly affects the model's semantic understanding of the whole input text. To address this, this paper proposes a Partially-connected Routings Capsnet with Multi-scale Feature Attention (MulPart-Capsnets), which incorporates multi-scale feature attention into Capsnets. Multi-scale feature attention can automatically select n-gram features from different scales and accurately capture rich n-gram features for each word by weighted summation. In addition, to reduce redundant information transfer between child and parent capsules, the dynamic routing algorithm is also improved. To verify the effectiveness of the proposed model, experiments were conducted on seven well-known text classification datasets. The results demonstrate that the proposed model consistently improves classification performance, captures richer n-gram features of text, and possesses a powerful feature learning ability.
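The abstract describes the attention mechanism only at a high level: n-gram features are extracted at several scales, scored per word, and combined by a weighted sum. As an illustration only, here is a minimal NumPy sketch of that weighted-sum idea. The uniform sliding-window features and the norm-based scale scores are assumptions for the sketch, not the paper's learned convolutions or attention parameters:

```python
import numpy as np

def ngram_features(embeddings, n):
    # Mean over sliding windows of size n (zero-padded so output keeps length L):
    # a stand-in for a learned n-gram convolution.
    L, _ = embeddings.shape
    pad = n // 2
    padded = np.pad(embeddings, ((pad, n - 1 - pad), (0, 0)))
    return np.stack([padded[i:i + n].mean(axis=0) for i in range(L)])

def multiscale_feature_attention(embeddings, scales=(1, 2, 3)):
    # Features from each scale, stacked: (num_scales, L, d).
    feats = np.stack([ngram_features(embeddings, n) for n in scales])
    # Per-word score for each scale (here: feature norm, an assumption),
    # normalized across scales with a softmax.
    scores = np.linalg.norm(feats, axis=-1)                    # (num_scales, L)
    weights = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
    # Weighted sum over scales yields one feature vector per word: (L, d).
    return (weights[..., None] * feats).sum(axis=0)
```

In the actual model the per-scale weights would be produced by a learned attention sub-network rather than by feature norms; the sketch only shows how different context widths are mixed into a single representation per word.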

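The abstract states that the dynamic routing algorithm is improved to reduce redundant child-to-parent information transfer, but gives no detail of the modification. For context, a minimal NumPy sketch of the standard dynamic routing baseline (Sabour et al.'s routing-by-agreement, not the paper's partially-connected variant) looks like this:

```python
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    # Squash non-linearity: keeps direction, maps the norm into [0, 1).
    norm2 = (v ** 2).sum(axis=axis, keepdims=True)
    return (norm2 / (1.0 + norm2)) * v / np.sqrt(norm2 + eps)

def dynamic_routing(u_hat, iterations=3):
    # u_hat: (num_child, num_parent, dim) prediction vectors from child capsules.
    C, P, D = u_hat.shape
    b = np.zeros((C, P))  # routing logits
    for _ in range(iterations):
        # Coupling coefficients: softmax over parents for each child.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)   # weighted sum -> (P, D)
        v = squash(s)                            # parent capsule outputs
        b += (u_hat * v[None]).sum(axis=-1)      # agreement update
    return v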

Wang Chaofan, Ju Shenggen, Sun Jieping, Chen Run


College of Computer Science, Sichuan University, Chengdu 610065, China

capsule neural network; multi-scale feature attention; text classification; routing algorithm; convolutional neural network

Chinese National Conference on Computational Linguistics

Haikou, China

19th Chinese National Conference on Computational Linguistics

719-730

2020