Text Classification of Civil Aviation Supervision Based on Dilated Convolution and Self-Attention
This paper proposes a text classification method for an imbalanced short text dataset that combines data augmentation, dilated convolution, and ProbSparse self-attention. The proposed method addresses sample imbalance through data augmentation with Roformer-Sim. In the embedding layer, character embedding vectors are obtained using RoBERTa, and the TextRCNN structure is used to extract features from the text. Dilated convolution is used in the pooling layer to prevent the loss of important information, and ProbSparse self-attention is used to obtain weights for the different word embedding vectors. The proposed model reaches an F1 score of 96.31% on the Dataset of Inspection Records of Civil Aviation Regulatory Matters. Comparative experiments against other classic deep learning algorithms show that the proposed model performs well on the short text dataset.
Imbalanced short text; Text classification; Data augmentation; Dilated convolution; ProbSparse self-attention
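
To illustrate why dilated convolution can cover a wide context without discarding positions, the following is a minimal pure-Python sketch of a one-dimensional dilated convolution. It is not the paper's implementation: the function names `dilated_conv1d` and `receptive_field`, the toy feature sequence, and the kernel values are all assumptions chosen for illustration, and the stacked dilation rates 1, 2, 4 are hypothetical rather than the model's configuration.

```python
def dilated_conv1d(xs, kernel, dilation=1):
    """Valid (unpadded) 1D dilated cross-correlation over a list of floats.

    Illustrative helper only; the name and signature are assumptions.
    """
    # Number of input positions spanned by a single output value.
    span = (len(kernel) - 1) * dilation + 1
    if len(xs) < span:
        return []
    return [
        sum(w * xs[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(xs) - span + 1)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolution layers."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Toy feature sequence standing in for one channel of a text representation.
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kernel = [1.0, 0.0, -1.0]

dense = dilated_conv1d(features, kernel, dilation=1)    # each output spans 3 positions
dilated = dilated_conv1d(features, kernel, dilation=2)  # each output spans 5 positions

print(dense)
print(dilated)
print(receptive_field(3, [1, 2, 4]))  # hypothetical stack of three layers
```

With the same three-weight kernel, a dilation of 2 lets each output see five input positions instead of three, so context grows without adding parameters and without the downsampling that max pooling applies.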