Data Augmentation Method for Sentiment Analysis Based on ChatGLM
Sentiment analysis is one of the most popular tasks in natural language processing. Because annotating training data is difficult and costly, sentiment analysis with limited samples has attracted growing attention. Data augmentation is one of the primary approaches to limited-sample learning. However, traditional data augmentation methods do not take the characteristics of sentiment analysis into account, which can lead to semantic inconsistencies, sentiment bias, and excessive generation in the augmented data. To address these problems, a multi-stage data augmentation strategy based on the ChatGLM model is proposed specifically for sentiment analysis. The strategy starts with simple word-level augmentation using EDA methods, then filters the generated data with a sentiment lexicon, and finally enhances the filtered data at the sentence level using the ChatGLM model. Experimental results show that, compared with the best-performing traditional data augmentation method, this method improves accuracy by 1.9%, 2.1%, and 2.2% on three different datasets, confirming its effectiveness for limited-sample sentiment analysis.
few-shot learning; sentiment analysis; data augmentation; pre-trained models; natural language processing
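
The three stages named in the abstract (word-level EDA, sentiment-lexicon filtering, sentence-level rewriting) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tiny `SYNONYMS` and `LEXICON` tables, the function names, the polarity-matching filter rule, and the de-duplication step are all hypothetical. The ChatGLM stage is represented by a caller-supplied `rewrite` function, since the abstract does not specify the prompt or the model interface.

```python
import random
from typing import Callable, Dict, List

# Toy resources, hypothetical and for illustration only.
SYNONYMS: Dict[str, List[str]] = {"good": ["great", "fine"], "movie": ["film"]}
LEXICON: Dict[str, int] = {"good": 1, "great": 1, "fine": 1, "bad": -1, "awful": -1}


def eda_variants(sentence: str, n: int, rng: random.Random) -> List[str]:
    """Stage 1: word-level edits (synonym replacement, random swap, random deletion)."""
    words = sentence.split()
    variants = []
    for _ in range(n):
        new = list(words)
        op = rng.choice(["replace", "swap", "delete"])
        if op == "replace":
            idxs = [i for i, w in enumerate(new) if w in SYNONYMS]
            if idxs:
                i = rng.choice(idxs)
                new[i] = rng.choice(SYNONYMS[new[i]])
        elif op == "swap" and len(new) > 1:
            i, j = rng.sample(range(len(new)), 2)
            new[i], new[j] = new[j], new[i]
        elif op == "delete" and len(new) > 1:
            del new[rng.randrange(len(new))]
        variants.append(" ".join(new))
    return variants


def lexicon_polarity(sentence: str) -> int:
    """Return +1, -1, or 0 from the summed lexicon scores of the words."""
    score = sum(LEXICON.get(w, 0) for w in sentence.split())
    return (score > 0) - (score < 0)


def lexicon_filter(candidates: List[str], label: int) -> List[str]:
    """Stage 2 (assumed rule): keep candidates whose lexicon polarity matches the label."""
    return [c for c in candidates if lexicon_polarity(c) == label]


def augment(sentence: str, label: int, rewrite: Callable[[str], str],
            n: int = 4, seed: int = 0) -> List[str]:
    """Run the three stages; `rewrite` stands in for the ChatGLM sentence-level step."""
    rng = random.Random(seed)
    kept = lexicon_filter(eda_variants(sentence, n, rng), label)
    unique = list(dict.fromkeys(kept))  # drop duplicates, keep order
    return [rewrite(c) for c in unique]
```

In practice, `rewrite` would wrap a call that asks ChatGLM to rephrase the sentence while preserving its sentiment; passing `lambda s: s` runs only the first two stages, which is convenient for checking the filter in isolation.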