Construction of a Multi-Label Classification System for Academic Papers Based on the Stacking Model

Liu Aiqin (刘爱琴)¹, Guo Shaopeng (郭少鹏)¹

Author information

1. School of Economics and Management, Shanxi University

Abstract

High-quality automatic multi-label classification of academic papers is a key step in advancing academic research. This study uses a Stacking model to fuse five classifiers (random forest, support vector machine, extremely randomized trees, extreme gradient boosting, and a neural network) into a heterogeneous ensemble classifier, and applies that classifier to multi-label classification of academic papers through a multi-binary classification model based on the problem transformation approach. Based on the characteristics of academic papers, a matching paper feature extraction module, TF-IDF weighting module, and data preprocessing module are implemented in turn, yielding a multi-label classification system for academic papers. Simulation experiments show that, on the multi-label classification of academic papers, the system improves to some degree on traditional single-model classifiers and homogeneous ensemble classifiers in generalization ability, stability, and accuracy. 9 figs. 21 refs.
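
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, substitutes `GradientBoostingClassifier` for extreme gradient boosting (XGBoost) so the example needs no extra dependency, picks logistic regression as the meta-learner (the abstract does not name one), and runs on a tiny invented corpus with hypothetical label names. The multi-binary (problem transformation) step is represented by `OneVsRestClassifier`, which trains one stacked binary classifier per label.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import SVC

# Toy corpus and labels: invented for illustration only.
docs = [
    "stacking ensemble classifier random forest",
    "support vector machine text classification",
    "library cataloguing and subject indexing",
    "tf idf weighting for document retrieval",
    "neural network for paper classification in digital library",
    "query expansion search engine ranking",
    "gradient boosting decision trees",
    "library user services and reading promotion",
    "retrieval evaluation in library catalog search",
    "deep neural network training",
    "index construction for document search",
    "academic library collection management",
]
labels = [
    ["machine_learning"],
    ["machine_learning", "information_retrieval"],
    ["library_science"],
    ["information_retrieval"],
    ["machine_learning", "library_science"],
    ["information_retrieval"],
    ["machine_learning"],
    ["library_science"],
    ["information_retrieval", "library_science"],
    ["machine_learning"],
    ["information_retrieval"],
    ["library_science"],
]

# Turn label sets into a 0/1 indicator matrix (one column per label).
binarizer = MultiLabelBinarizer()
Y = binarizer.fit_transform(labels)

# Heterogeneous base learners; gradient boosting stands in for XGBoost.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svm", SVC(kernel="linear")),
    ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)),
]

# Stacking: out-of-fold base predictions feed an assumed meta-learner.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=2,  # small only because the toy corpus is tiny
)

# TF-IDF features, then one stacked binary classifier per label.
model = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(stack))
model.fit(docs, Y)

new_docs = [
    "random forest and neural network ensemble",
    "library catalog search and indexing",
]
pred = model.predict(new_docs)                       # indicator matrix
predicted_labels = binarizer.inverse_transform(pred)  # tuples of label names
print(predicted_labels)
```

With so few training documents the predicted labels are not meaningful; the sketch only shows how the TF-IDF weighting, the stacked heterogeneous ensemble, and the per-label binary decomposition fit together.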

Key words

Paper Classification/Stacking Model/Multi-Label Classification/Multi-Binary Classification Model

Publication year

2024

Journal of the National Library of China (国家图书馆学刊)

National Library of China

Indexed in: CSTPCD, CSSCI, CHSSCD, PKU Core (北大核心)

Impact factor: 1.957

ISSN: 1009-3125

Number of references: 21
