Multi-label News Text Classification Based on the ALBERT-TextCNN Model

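Abstract (translated from the Chinese): To address the multi-label news text classification task faced by managers of intelligent information push services, this paper proposes a solution based on an ALBERT-CNN model. It combines the ALBERT pre-trained model with the TextCNN convolutional neural network for semantic understanding and feature extraction. ALBERT performs semantic filtering to capture the content and topics of each news text, and its output is passed to TextCNN for classification and label prediction. A Sigmoid function outputs a probability for each label, which yields the multi-label classification. In experiments on 382,688 records from the Toutiao (今日头条) client, the ALBERT-CNN model reaches an F1-Score of 92.05%, a recall of 96.8%, and a precision of 90%. Its F1-Score and recall improve on those of the conventional ALBERT and ALBERT-Dense models, while its precision is slightly lower than that of ALBERT-Dense. The study offers a new approach to improving information push efficiency and reducing the spread of misleading information.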
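The pipeline described in the abstract (per-token features, convolution with max pooling, then one Sigmoid output per label) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the token vectors stand in for ALBERT output, and every weight, the 0.5 threshold, and the function names `conv1d_max_pool` and `predict_labels` are hypothetical.

```python
import math

def conv1d_max_pool(embeddings, kernel, bias=0.0):
    """Slide one filter over the token sequence, apply ReLU, then max-over-time pool.

    embeddings: list of token vectors (a stand-in for ALBERT's per-token output).
    kernel: list of `width` weight vectors, each as long as a token vector.
    """
    width = len(kernel)
    best = 0.0  # ReLU output is never negative, so 0.0 is a safe floor
    for start in range(len(embeddings) - width + 1):
        total = bias
        for offset in range(width):
            token = embeddings[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], token))
        best = max(best, max(0.0, total))
    return best

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_labels(embeddings, kernels, label_weights, label_biases, threshold=0.5):
    """Return (per-label probabilities, indices of labels at or above threshold)."""
    features = [conv1d_max_pool(embeddings, k) for k in kernels]
    probs = []
    for weights, bias in zip(label_weights, label_biases):
        logit = bias + sum(w * f for w, f in zip(weights, features))
        probs.append(sigmoid(logit))  # each label is scored independently
    chosen = [i for i, p in enumerate(probs) if p >= threshold]
    return probs, chosen

# Toy input: 4 tokens with 2-dimensional embeddings (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Two filters, one of width 2 and one of width 3 (hypothetical weights).
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
]
# Three labels, each with one weight per filter (hypothetical weights).
label_weights = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0]]
label_biases = [-1.0, 0.0, -1.0]

probs, chosen = predict_labels(tokens, kernels, label_weights, label_biases)
print([round(p, 3) for p in probs], chosen)
```

Because each label gets its own Sigmoid rather than sharing a softmax, the probabilities need not sum to one, and a text can receive several labels at once. In this toy run labels 0 and 2 both pass the threshold.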

Title as published in English: Multi-label News Text Classification Based on ALBERT-TextCNN Model

Abstract (as published in English): Aiming at the multi-label news text classification task of intelligent information push managers, a solution based on the ALBERT-CNN model is proposed. The ALBERT pre-trained model and the TextCNN convolutional neural network are employed to understand semantics and extract features. Semantic filtering is performed by the ALBERT model to grasp the content and themes of news texts, which are then passed to the TextCNN model for classification and label prediction. The Sigmoid function is used to output the probability of each label, achieving multi-label classification. The experiment uses 382,688 records from the Toutiao client. The F1-Score of the ALBERT-CNN model reaches 92.05%, the recall reaches 96.8%, and the precision reaches 90%. Compared with the traditional ALBERT and ALBERT-Dense models, it improves in F1-Score and recall, while its precision is slightly lower than that of the ALBERT-Dense model. This study provides a new solution for enhancing information push efficiency and reducing the spread of misleading information.
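For readers unfamiliar with how precision, recall, and F1-Score are computed for multi-label output, the sketch below shows one common convention, micro-averaging over all (text, label) decisions. The abstract does not state which averaging the authors used, so micro-averaging is an assumption here, and the function name `micro_prf`, the label names, and the toy data are hypothetical.

```python
def micro_prf(true_sets, pred_sets):
    """Micro-averaged precision, recall and F1 over a list of label sets."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Toy data: three news items with hypothetical topic labels.
truth = [{"tech", "finance"}, {"sports"}, {"tech"}]
pred = [{"tech", "finance", "sports"}, {"sports"}, {"finance"}]

precision, recall, f1 = micro_prf(truth, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

A model that predicts extra labels tends to gain recall and lose precision, which is the same trade-off the abstract reports relative to ALBERT-Dense.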

Keywords (as published in English): multi-label classification; ALBERT; TextCNN; NLP

Authors: Mai Yongxin (麦咏欣), Lin Zhihao (林志豪), Xi Juanxia (葸娟霞)

Affiliation: School of Information Management and Engineering, Neusoft Institute Guangdong (广东东软学院), Foshan, Guangdong 528225, China

Keywords (translated from the Chinese): multi-label classification; ALBERT; TextCNN; natural language processing

Publication year: 2024

DOI: 10.19850/j.cnki.2096-4706.2024.20.008

Modern Information Technology (现代信息科技)

Guangdong Institute of Electronics (广东省电子学会)

ISSN: 2096-4706

Year, volume (issue): 2024, 8(20)