Multi-label text classification model integrating two-channel label semantics

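Abstract (translated from the Chinese): To improve label semantic representation in multi-label text classification, a two-channel label semantic enhancement model is proposed. The model contains two main modules: a graph convolutional network module based on label co-occurrence and a label semantic embedding module based on pre-training. The former uses a graph convolutional network to capture semantic associations among labels and enrich the semantic information of each label; the latter draws on prior knowledge in a pre-trained model to strengthen the semantic representation of labels. Finally, an attention mechanism fuses and deeply encodes the label semantic information from the two channels. Multi-label text classification experiments on two public datasets, AAPD and RCV1-V2, show that, compared with mainstream baseline methods, the method achieves significant improvements in precision, recall, and micro-F1.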
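
The first channel propagates label information over a label co-occurrence graph. The abstract does not specify how the graph is built or normalized, so the following is only a minimal, dependency-free sketch under stated assumptions: raw pairwise co-occurrence counts as edge weights, the standard symmetric GCN normalization, and a single propagation layer. All function names, the toy label sets, and the weight matrix are hypothetical and are not taken from the paper.

```python
import math


def cooccurrence_adjacency(label_sets, num_labels):
    """Count how often each pair of distinct labels is assigned to the same document."""
    counts = [[0.0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        for i in labels:
            for j in labels:
                if i != j:
                    counts[i][j] += 1.0
    return counts


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (assumed, standard GCN form)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(norm_adj, features, weight):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    hidden = matmul(matmul(norm_adj, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


# Toy example: three documents annotated with labels 0..2 (hypothetical data).
label_sets = [{0, 1}, {0, 1}, {1, 2}]
adj = cooccurrence_adjacency(label_sets, num_labels=3)
norm_adj = normalize_adjacency(adj)

# Hypothetical 2-dimensional initial label features and an identity weight matrix.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
enhanced = gcn_layer(norm_adj, features, weight)
```

After this step each row of `enhanced` mixes a label's own features with those of the labels it co-occurs with. In a trained model `weight` would be learned, and the result would be fused with the pre-trained label embeddings from the second channel.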

English title: Multi-label text classification model integrating two-channel label semantics

English abstract: A two-channel label semantic enhancement model was proposed for label semantic representation in multi-label text classification tasks. The model comprised two key components: a graph convolutional network module based on label co-occurrence and a label semantic embedding module based on pre-training. The former leveraged a graph convolutional network to capture semantic associations among labels, thereby enhancing the semantic information of each label. The latter utilized prior knowledge from pre-trained models to augment the semantic representation of labels. Finally, an attention mechanism was employed to fuse and deeply encode the label semantic information from the two channels. Experimental results for multi-label text classification on two public datasets, AAPD and RCV1-V2, indicated that, compared with mainstream baseline methods, the proposed method achieved significant improvements in precision, recall, and micro-F1.
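
The reported metrics are micro-averaged, meaning that true positives, false positives, and false negatives are pooled over all document-label pairs before precision, recall, and F1 are computed. The sketch below illustrates that definition on made-up data; the function name and the toy label sets are hypothetical and do not reproduce any result from the paper.

```python
def micro_prf(true_sets, pred_sets):
    """Micro-averaged precision, recall, and F1 over per-document label sets."""
    tp = fp = fn = 0
    for true, pred in zip(true_sets, pred_sets):
        tp += len(true & pred)   # labels predicted and correct
        fp += len(pred - true)   # labels predicted but wrong
        fn += len(true - pred)   # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy gold and predicted label sets for three documents (hypothetical data).
true_sets = [{0, 1}, {2}, {1, 2}]
pred_sets = [{0}, {2, 3}, {1, 2}]
precision, recall, f1 = micro_prf(true_sets, pred_sets)
```

Because counts are pooled before the ratios are taken, frequent labels dominate the micro-averaged scores, which matters on datasets with skewed label distributions.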

English keywords:

multi-label text classification; label semantic embedding; pre-trained language model; graph convolutional network

Authors:

Feng Xinhao (冯心昊), Lyu Xueqiang (吕学强), Ma Denghao (马登豪), Teng Shangzhi (滕尚志), Tian Jingjing (田晶晶)

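Affiliations:

Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 102206

China National Institute of Standardization, Beijing 100012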

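Keywords (translated from the Chinese):

multi-label text classification; label semantic embedding; pre-trained language model; graph convolutional network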

Publication year:

2024

DOI:

10.16508/j.cnki.11-5866/n.2024.04.007

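Journal of Beijing Information Science and Technology University (Natural Science Edition)

Beijing Information Science and Technology University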

Impact factor: 0.363

ISSN: 1674-6864

Year, volume (issue): 2024, 39(4)