Military named entity recognition based on RoBERTa-Span-Attack tag pointer network
Documents in the military field contain a large number of military entities. Identifying these entities is a fundamental task for extracting information from military text and constructing military knowledge graphs. A model for Chinese military named entity recognition, RoBERTa-Span-Attack, is proposed. It combines the robustly optimized BERT pre-training approach (RoBERTa) with a span-based label pointer network and adversarial training. Because RoBERTa adopts a whole word masking pre-training strategy, it learns semantic representations at the whole-word level, which makes it more suitable for recognizing Chinese military named entities. A span-based label pointer network, which recognizes the start and end positions of an entity together with its label, is then adopted to improve model performance. Finally, an adversarial training strategy, in which perturbations are added to generate adversarial samples during training, is employed to improve the robustness of the model. Experimental results on a military-domain dataset demonstrate that the proposed model achieves better recognition accuracy than BERT-CRF, BERT-Softmax and BERT-Span.
military named entity recognition; pre-trained model; span; label pointer network; adversarial training
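The two mechanisms named in the abstract, span-based label pointer decoding and adversarial perturbation, can be illustrated with a minimal sketch. The abstract does not give the decoding rule or the perturbation method, so everything below is an assumption made for illustration: `decode_spans` and `fgm_perturbation` are hypothetical helper names, the nearest-matching-end pairing rule is one common choice for span pointer networks, and the perturbation follows the Fast Gradient Method (FGM), r = ε · g / ‖g‖₂, which may differ from what the authors actually used.

```python
import math


def decode_spans(start_labels, end_labels):
    """Pair each start tag with the nearest following end tag of the same label.

    Both inputs are per-token label ids, where 0 means "not a boundary".
    Returns a list of (start_index, end_index, label_id) tuples.
    This pairing rule is an assumption, not taken from the paper.
    """
    spans = []
    for i, label in enumerate(start_labels):
        if label == 0:
            continue
        # Search from position i onward so single-token entities are allowed.
        for j in range(i, len(end_labels)):
            if end_labels[j] == label:
                spans.append((i, j, label))
                break
    return spans


def fgm_perturbation(grad, epsilon=1.0):
    """Return an FGM-style perturbation: epsilon * grad / ||grad||_2.

    `grad` stands in for the loss gradient with respect to an embedding
    vector; a zero gradient yields a zero perturbation.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]


# Hypothetical label ids: 1 = weapon, 2 = organization.
start = [0, 1, 0, 0, 2, 0]
end = [0, 0, 1, 0, 2, 0]
spans = decode_spans(start, end)

# Perturbation for a toy two-dimensional gradient.
delta = fgm_perturbation([3.0, 4.0], epsilon=0.5)
```

In this toy input, tokens 1 to 2 form one entity with label 1 and token 4 alone forms an entity with label 2. In a training loop, the perturbation would be added to the embedding layer, a second forward and backward pass would be run on the perturbed input, and the embeddings would then be restored before the optimizer step.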