Radar Fault Text Classification Method for the Fusion of Non-core Words of EDA and SSMix
Intelligent classification of radar equipment fault texts helps improve the efficiency of radar equipment support. Radar fault texts are highly specialized, and the available samples are few and imbalanced across classes. To address this, non-core word EDA (easy data augmentation) is used for intra-class data augmentation, which increases the amount of text while keeping the key information unchanged. Because the new samples generated by non-core word EDA are not sufficiently diverse, SSMix (saliency-based span mixup for text classification) is added for inter-class data augmentation, which improves text diversity through nonlinear cross-fusion of the input texts. Experiments show that the proposed method achieves considerably higher accuracy than existing classical baseline classification methods and typical data-augmentation-based classification methods.
radar fault text; non-core word EDA; SSMix; text data augmentation; classification
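The two augmentation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things are assumptions made only for illustration: the function names `non_core_eda` and `ssmix_span_mix`, the choice of random swap and random deletion as the EDA operations, the hand-picked core-word set, the English example tokens, and the saliency scores, which are passed in as precomputed numbers (SSMix derives them from model gradients, which is omitted here).

```python
import random


def non_core_eda(tokens, core_words, p_delete=0.1, n_swaps=1, rng=None):
    """Intra-class augmentation sketch: edit only tokens outside core_words.

    Assumed operations: random swap and random deletion. Core tokens keep
    their positions relative to each other and are never deleted.
    """
    rng = rng or random.Random()
    out = list(tokens)
    # Positions that may be edited, i.e. tokens not in the core vocabulary.
    free = [i for i, tok in enumerate(out) if tok not in core_words]
    # Random swap between two non-core positions.
    for _ in range(n_swaps):
        if len(free) >= 2:
            i, j = rng.sample(free, 2)
            out[i], out[j] = out[j], out[i]
    # Random deletion of non-core tokens with probability p_delete.
    kept = [tok for tok in out if tok in core_words or rng.random() >= p_delete]
    # Fall back to the original text if everything was deleted.
    return kept or list(tokens)


def ssmix_span_mix(tokens_a, sal_a, tokens_b, sal_b, span_len):
    """Inter-class augmentation sketch in the spirit of SSMix.

    Replaces the least salient span of text A with the most salient span of
    text B and returns (mixed_tokens, lam), where lam is the share of tokens
    taken from B. A mixed label would be (1 - lam) * y_a + lam * y_b.
    """
    if not tokens_a or not tokens_b:
        raise ValueError("both texts must be non-empty")
    span_len = min(span_len, len(tokens_a), len(tokens_b))

    def span_score(sal, start):
        return sum(sal[start:start + span_len])

    # Least salient window in A, most salient window in B.
    start_a = min(range(len(tokens_a) - span_len + 1),
                  key=lambda s: span_score(sal_a, s))
    start_b = max(range(len(tokens_b) - span_len + 1),
                  key=lambda s: span_score(sal_b, s))
    mixed = (tokens_a[:start_a]
             + tokens_b[start_b:start_b + span_len]
             + tokens_a[start_a + span_len:])
    lam = span_len / len(mixed)
    return mixed, lam


# Hypothetical example data (not taken from the paper's corpus).
text_a = ["transmitter", "power", "drops", "after", "warm", "up"]
text_b = ["antenna", "servo", "motor", "stalls"]
core = {"transmitter", "power"}

augmented = non_core_eda(text_a, core, p_delete=0.5, rng=random.Random(0))
mixed, lam = ssmix_span_mix(text_a, [0.9, 0.8, 0.7, 0.1, 0.2, 0.3],
                            text_b, [0.2, 0.9, 0.8, 0.1], span_len=2)
```

In this sketch the EDA step never touches the core words, which is one way to read "keeping the key information unchanged", while the span mix produces a sample that combines two classes and carries a soft label weighted by `lam`.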