Multi-label text classification model integrating two-channel label semantics

A two-channel label semantic enhancement model was proposed for label semantic representation in multi-label text classification tasks. The model comprised two key components: a graph convolutional network module based on label co-occurrence and a label semantic embedding module based on pre-training. The former used a graph convolutional network to capture semantic associations among labels, thereby enhancing the semantic information of each label. The latter used prior knowledge from pre-trained models to strengthen the semantic representation of labels. Finally, an attention mechanism was employed to fuse and deeply encode the label semantic information from the two channels. Multi-label text classification experiments on two public datasets, AAPD and RCV1-V2, indicated that, compared with mainstream baseline methods, the proposed model achieved significant improvements in precision, recall, and micro-F1.
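
The first channel described in the abstract, a graph convolutional network over a label co-occurrence graph, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the graph is built or normalized, so the binarized co-occurrence edges, the symmetric normalization with self-loops, the single ReLU layer, and the names `cooccurrence_adjacency` and `gcn_layer` are all assumptions made for illustration.

```python
import numpy as np


def cooccurrence_adjacency(Y):
    """Build a normalized label graph from a binary label matrix.

    Y has shape (n_samples, n_labels); Y[s, i] == 1 if sample s carries label i.
    Assumption: two labels are connected if they co-occur at least once.
    Returns D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization.
    """
    Y = np.asarray(Y, dtype=float)
    counts = Y.T @ Y                      # counts[i, j] = samples with both labels
    np.fill_diagonal(counts, 0.0)         # drop self co-occurrence
    A = (counts > 0).astype(float)        # binarize (illustrative choice)
    A_hat = A + np.eye(A.shape[0])        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(A_norm, H, W):
    """One graph convolution step: ReLU(A_norm @ H @ W).

    H holds one embedding per label (n_labels, d_in);
    W is a weight matrix (d_in, d_out), random here, learned in practice.
    """
    return np.maximum(A_norm @ H @ W, 0.0)


# Toy usage: 3 samples, 3 labels, 4-dimensional label embeddings.
rng = np.random.default_rng(0)
Y_toy = [[1, 1, 0],
         [0, 1, 1],
         [1, 1, 0]]
A_norm = cooccurrence_adjacency(Y_toy)
H0 = rng.normal(size=(3, 4))              # initial label embeddings (hypothetical)
W0 = rng.normal(size=(4, 4))
H1 = gcn_layer(A_norm, H0, W0)            # label embeddings mixed with neighbours
```

In this sketch each label's output embedding is a weighted mix of its own embedding and those of the labels it co-occurs with, which is the sense in which the graph channel enriches each label's semantics. A trained model would learn `W0` and typically stack more than one layer.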

Keywords: multi-label text classification; label semantic embedding; pre-trained language model; graph convolutional network

冯心昊、吕学强、马登豪、滕尚志、田晶晶

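Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 102206, China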

China National Institute of Standardization, Beijing 100012, China

2024

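Journal of Beijing Information Science and Technology University (Natural Science Edition)
Beijing Information Science and Technology University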

Impact factor: 0.363
ISSN: 1674-6864
Year, volume (issue): 2024, 39(4)