
Research on Multi-class Text Sentiment Recognition Based on the BERT Model and Dynamic Ensemble Selection

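To address the loss of semantic information in text feature vectors extracted by traditional methods, and the fact that some text sentiment recognition tasks are multi-class problems, this paper proposes a new multi-class text sentiment recognition strategy based on BERT (bidirectional encoder representations from transformers) and dynamic ensemble selection. First, BERT is used to vectorize the text, and the one-vs-one (OVO) decomposition strategy splits the multi-class sentiment recognition task into multiple binary classification subtasks. Second, a dynamic ensemble selection strategy is used to build a classifier ensemble for each subtask. Finally, the final prediction is obtained through an aggregation strategy. The proposed method is evaluated empirically on a public movie review dataset. The results show that: (1) compared with the traditional TF-IDF and Word2Vec methods, BERT-based word vectorization helps improve text sentiment recognition accuracy; (2) for each sub-problem in the multi-class sentiment recognition task, the dynamic ensemble selection strategy effectively improves recognition performance; (3) the prediction model built in this paper performs significantly better than other existing sentiment recognition models.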
Research on Multi-class Sentiment Classification Based on BERT and Dynamic Ensemble Selection
To handle the semantic deficiency of text feature vectors extracted by classic methods and the multi-class nature of some text sentiment recognition tasks, a novel multi-class sentiment classification strategy based on Bidirectional Encoder Representations from Transformers (BERT) and dynamic ensemble selection (DES) is proposed. First, BERT is used to vectorize the text. Then, the OVO strategy is used to divide the multi-class sentiment classification problem into multiple binary classification sub-problems. Next, the dynamic ensemble selection strategy is used to construct a binary classifier ensemble for each sub-problem. Finally, the final prediction is obtained with an aggregation strategy. A public movie review dataset is used for the experimental analysis. The experimental results indicate that (1) the BERT model helps improve multi-class sentiment classification performance compared with the traditional methods TF-IDF and Word2Vec, (2) the DES strategy is effective for each sub-problem in multi-class sentiment classification, and (3) the proposed method performs significantly better than existing well-known methods for multi-class sentiment analysis.
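The pipeline in the abstract (vectorize, decompose with OVO, apply dynamic ensemble selection per sub-problem, aggregate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy 2-D vectors stand in for BERT embeddings, the nearest-centroid base learner and the "keep the locally most accurate members" selection rule are hypothetical choices, the training subset doubles as the validation set for brevity, and majority voting is assumed as the aggregation strategy. All function and class names are illustrative.

```python
import math
from collections import Counter
from itertools import combinations


class Centroid:
    """Toy base learner: nearest class centroid on a subset of feature dimensions."""

    def __init__(self, dims):
        self.dims = dims

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [[x[d] for d in self.dims] for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, x):
        p = [x[d] for d in self.dims]
        return min(self.centroids, key=lambda c: math.dist(p, self.centroids[c]))


def des_predict(pool, X_val, y_val, x, k=3):
    """Assumed DES rule: keep the pool members that are most accurate on the
    k validation points nearest to x, then take a majority vote among them."""
    region = sorted(range(len(X_val)), key=lambda i: math.dist(X_val[i], x))[:k]
    scores = [sum(clf.predict(X_val[i]) == y_val[i] for i in region) for clf in pool]
    best = max(scores)
    selected = [clf for clf, s in zip(pool, scores) if s == best]
    return Counter(clf.predict(x) for clf in selected).most_common(1)[0][0]


def fit_ovo(X, y, pool_dims):
    """OVO decomposition: one classifier pool per unordered pair of classes."""
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        idx = [i for i, t in enumerate(y) if t in (a, b)]
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        pool = [Centroid(d).fit(Xs, ys) for d in pool_dims]
        # Simplification: the training subset is reused as the DES validation set.
        models[(a, b)] = (pool, Xs, ys)
    return models


def predict_ovo(models, x, k=3):
    """Aggregation (assumed): majority vote over the pairwise DES predictions."""
    votes = Counter(des_predict(pool, Xv, yv, x, k) for pool, Xv, yv in models.values())
    return votes.most_common(1)[0][0]


# Toy stand-ins for BERT sentence embeddings (three sentiment classes).
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5], [0, 5], [0, 6], [1, 6]]
y = ["neg"] * 3 + ["pos"] * 3 + ["neu"] * 3

# Each pool member sees a different feature subset, so some are locally weak.
models = fit_ovo(X, y, pool_dims=[(0,), (1,), (0, 1)])

for query in ([0.2, 0.3], [5.5, 5.2], [0.3, 5.5]):
    print(query, predict_ovo(models, query))
```

In this toy setup the pool member that only sees the first dimension cannot separate "neg" from "neu", so the selection step discards it whenever the query lies near "neu" examples; that is the behaviour dynamic ensemble selection is meant to provide. In practice the vectors would come from a pretrained BERT encoder and the pool would hold stronger, more diverse classifiers.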

Keywords: text sentiment analysis; BERT; multi-class; dynamic ensemble selection; decomposition strategy

Zhang Zhongliang (张忠良), Fei Qinjun (费秦君), Chen Yuyu (陈愉予), Luo Xinggang (雒兴刚)


School of Management, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China

Keywords: text sentiment recognition; BERT; multi-class; dynamic ensemble selection; decomposition strategy

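Funding: National Natural Science Foundation of China, Youth Program (71801065); Zhejiang Provincial Philosophy and Social Sciences Planning Project (21NDJC072YB); Zhejiang Provincial Natural Science Foundation, Key Program (LZ20G010001); National Natural Science Foundation of China, General Program (71831006)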

2024

Chinese Journal of Management Science (中国管理科学)
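Chinese Society of Optimization, Overall Planning and Economic Mathematics; Institute of Policy and Management, Chinese Academy of Sciences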


Indexed in: CSTPCD; CSSCI; CHSSCD; Peking University Core Journals (北大核心)
Impact factor: 1.938
ISSN:1003-207X
Year, volume (issue): 2024, 32(6)