An alert-situation text data augmentation method based on MLM
The performance of deep learning models depends heavily on the quality and quantity of training data, and insufficient training data leads to overfitting. In alert-situation text classification, however, a large amount of training data is usually difficult to obtain. This paper proposes a text data augmentation method based on a masked language model (MLM), aiming to improve the generalization capability of deep learning models by expanding the training data. The method uses a Mask strategy to randomly conceal words in the text, then uses the MLM to predict the masked words from their context and replace them, thereby generating new training data. Three Mask strategies are designed (character-level, word-level, and N-gram), and the performance of each strategy under different Mask ratios is analyzed. The experimental results show that the word-level Mask strategy outperforms the traditional data augmentation method.
deep learning; text data augmentation; masked language model (MLM); alert-situation text classification
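
The mask-then-predict procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `mask_tokens`, `augment`, and `fill` are hypothetical, the `[MASK]` placeholder is borrowed from BERT-style models, and `fill` is a stand-in callable for the MLM prediction step. Character-level masking is approximated by passing a list of characters, word-level masking by passing a list of words, and N-gram masking by setting `span` to N over word tokens. How the text is segmented into words is not stated in the abstract and is left to the caller.

```python
import random

MASK = "[MASK]"  # assumed placeholder, as used by BERT-style models


def mask_tokens(tokens, ratio, rng, span=1):
    """Mask about `ratio` of the tokens, in contiguous runs of up to `span`.

    Returns the masked copy and the sorted list of masked positions.
    span=1 gives single-token masking; span=N approximates N-gram masking.
    """
    n = len(tokens)
    if n == 0:
        return [], []
    # At least one token is masked, and never more than the sequence holds.
    budget = min(n, max(1, round(n * ratio)))
    starts = list(range(n))
    rng.shuffle(starts)  # random choice of span start positions
    masked = set()
    for start in starts:
        for i in range(start, min(start + span, n)):
            if len(masked) >= budget:
                break
            masked.add(i)
        if len(masked) >= budget:
            break
    out = [MASK if i in masked else tok for i, tok in enumerate(tokens)]
    return out, sorted(masked)


def augment(tokens, ratio, fill, rng, span=1):
    """Generate one augmented sample by masking tokens and refilling them.

    `fill(sequence, position)` is a hypothetical hook standing in for the
    MLM: it receives the partially masked sequence and returns a token.
    """
    out, positions = mask_tokens(tokens, ratio, rng, span)
    for i in positions:
        # Positions are filled left to right, so later predictions
        # can see earlier replacements (a design choice of this sketch).
        out[i] = fill(out, i)
    return out
```

In practice `fill` would wrap a pretrained masked language model; here any callable with that shape works, for example `augment(list(text), 0.15, fill, random.Random(0))` for character-level masking or `augment(words, 0.15, fill, random.Random(0), span=2)` for bigram masking, where the ratio 0.15 is only an illustrative value.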