Research on Architectural Text Classification Method Based on Word Vector Fusion

Questions in the architectural domain contain complex and varied domain-specific terms, which makes them difficult for common text classification algorithms to classify. To improve classification performance on such questions, this paper proposes an architectural text classification algorithm based on the fusion of RoBERTa and Word2Vec. Experimental results show that the proposed method reaches an accuracy of 91.59% on the architectural-domain question dataset, indicating good classification performance, and that on general-purpose datasets its accuracy is higher than that of SVM, CNN, and other models.
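The abstract does not say how the RoBERTa and Word2Vec representations are combined. As a rough illustration only, the sketch below assumes one common option: L2-normalize a contextual sentence vector and a mean-pooled static word vector, then concatenate them before classification. The helper names (`mean_pool`, `l2_normalize`, `fuse`), the toy vocabulary, and the random vectors are all hypothetical stand-ins, not the authors' implementation; in practice the two inputs would come from a pretrained RoBERTa encoder and a trained Word2Vec model.

```python
import numpy as np


def mean_pool(tokens, table, dim):
    """Average the static vectors of known tokens; return zeros if none are known."""
    vecs = [table[t] for t in tokens if t in table]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length so neither source dominates the fused vector."""
    return v / max(np.linalg.norm(v), eps)


def fuse(contextual_vec, static_vec):
    """Concatenate the two normalized views (assumed fusion strategy)."""
    return np.concatenate([l2_normalize(contextual_vec), l2_normalize(static_vec)])


# Toy stand-ins: random vectors in place of real RoBERTa / Word2Vec outputs.
rng = np.random.default_rng(0)
static_table = {w: rng.normal(size=4) for w in ["beam", "load", "concrete"]}
contextual_vec = rng.normal(size=8)  # hypothetical sentence vector from an encoder

static_vec = mean_pool(["concrete", "beam", "strength"], static_table, dim=4)
fused = fuse(contextual_vec, static_vec)
print(fused.shape)
```

The fused vector would then be passed to a classification layer. Concatenation keeps both views intact at the cost of a larger input dimension; weighted sums or gating are alternatives the abstract neither confirms nor rules out.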

text classification; pretrained language model; sentence vector; deep learning; question-answering system

Hu Shaoyun (胡少云), Weng Qingxiong (翁清雄)


School of Management, University of Science and Technology of China, Hefei, Anhui 230026, China

National Natural Science Foundation of China, International (Regional) Cooperation and Exchange Program

7191001010

2024

Microcomputer Applications
Shanghai Microcomputer Applications Society

CSTPCD
Impact factor: 0.359
ISSN:1007-757X
Year, volume (issue): 2024, 40(2)