Short Text Emotion Classification Based on SimCSE and BERT Hybrid Model

To address the problem that the training of the BERT model is degraded by anisotropy in text vectors, this paper combines contrastive learning (SimCSE) with BERT to build a hybrid model (SimCSE-BERT). The classifier not only expands the amount of training data through contrastive learning, but also uses the SimCSE model to obtain text vectors with good "alignment" and "uniformity", which are then used to optimize the base BERT model and improve classification. Experimental results show that, compared with the base BERT model, the hybrid model improves accuracy by 0.562, 0.584, and 0.734 percentage points on the takeout, Ctrip hotel, and Taobao datasets, respectively. The model clearly improves classification on short-text sentiment classification datasets and generalizes well.
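The abstract names "alignment" and "uniformity" without defining them. The following is a minimal sketch, assuming the definitions commonly used in the contrastive-learning literature: alignment as the mean squared distance between normalized embeddings of positive pairs, and uniformity as the log of the mean Gaussian potential over all pairs, with lower values better for both. The function names, the temperature `t=2.0`, and the toy 2-D vectors are illustrative assumptions, not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(pairs):
    """Mean squared distance between normalized positive pairs (lower is better)."""
    return sum(sq_dist(normalize(a), normalize(b)) for a, b in pairs) / len(pairs)


def uniformity(vectors, t=2.0):
    """Log of the mean Gaussian potential over distinct pairs (lower is more uniform)."""
    unit = [normalize(v) for v in vectors]
    terms = [
        math.exp(-t * sq_dist(unit[i], unit[j]))
        for i in range(len(unit))
        for j in range(i + 1, len(unit))
    ]
    return math.log(sum(terms) / len(terms))


# Toy embeddings (assumed data): one set squeezed into a narrow cone,
# which mimics anisotropy, and one set spread around the unit circle.
clustered = [[1.0, 0.0], [0.99, 0.1], [0.98, -0.1], [1.0, 0.05]]
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

print("uniformity (clustered):", uniformity(clustered))
print("uniformity (spread):   ", uniformity(spread))
print("alignment (identical pair):", alignment([([1.0, 0.0], [1.0, 0.0])]))
```

Under these assumed definitions, the spread set scores lower (better) on uniformity than the clustered set, which is the sense in which anisotropic embeddings are considered less useful.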

Emotion classification; Hybrid model; Text vector

Liu Ji, Li Shuaiwen


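School of Statistics and Data Science, Xinjiang University of Finance and Economics, Urumqi, Xinjiang 830012

Xinjiang Research Center for Socio-Economic Statistics and Big Data Applications, Xinjiang University of Finance and Economics, Urumqi, Xinjiang 830012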


National Natural Science Foundation of China (72164034); National Natural Science Foundation of China (71762028)

2024

Computer Simulation (计算机仿真)
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348
Year, volume (issue): 2024, 41(5)