To address the entity sparsity, caused by high-frequency changes and strong randomness, that arises when identifying cybersecurity entities in large-scale, heterogeneous, and unstructured cyberspace security texts, a semantic enhancement-based cybersecurity entity recognition model was proposed. The semantically enhanced input matrix was obtained through both multidimensional linguistic feature enhancement and corpus enhancement. A BiLSTM was used to extract contextual features from the fused input matrix. Attention allocation coefficients for the output features were generated by an attention mechanism, and features from different spaces were aggregated and encoded using an FFNN. The optimal entity tag sequence was then generated by CRF decoding. Experimental results show that the model significantly outperforms general-domain entity recognition models and also achieves better results than other cybersecurity entity recognition models.
Key words
network security/cyber threat intelligence/entity recognition/natural language processing/pre-training/semantic enhancement/attention mechanism
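The final step in the abstract, generating the optimal tag sequence with a CRF, is commonly implemented as Viterbi decoding over per-token emission scores and tag-to-tag transition scores. The following is a minimal sketch of that decoding step only, not the authors' implementation. The tag names (`O`, `B-MAL`, `I-MAL`), the score values, and the function name `viterbi_decode` are hypothetical and chosen for illustration; in a trained model the emissions would come from the BiLSTM/attention/FFNN layers and the transitions would be learned.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag] is the score of `tag` at token t;
    transitions[(prev, cur)] is the score of moving from prev to cur
    (missing pairs default to 0.0).
    """
    if not emissions:
        return []

    # best[tag] = score of the best path that ends in `tag` at the current token
    best = {tag: emissions[0][tag] for tag in tags}
    backpointers: List[Dict[str, str]] = []

    for step in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximises path score into `cur`.
            prev = max(tags, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + step[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace back from the best final tag.
    path = [max(tags, key=lambda tag: best[tag])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a three-token sentence (illustrative values only).
TAGS = ["O", "B-MAL", "I-MAL"]
TRANSITIONS = {("O", "I-MAL"): -1e4}  # an I- tag may not directly follow O
EMISSIONS = [
    {"O": 2.0, "B-MAL": 0.1, "I-MAL": 0.0},
    {"O": 0.5, "B-MAL": 1.0, "I-MAL": 1.2},
    {"O": 0.3, "B-MAL": 0.2, "I-MAL": 1.5},
]
DECODED = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
```

In this toy input, picking the highest emission score at each token independently would place `I-MAL` directly after `O`, which is an invalid tag sequence. The transition penalty lets the decoder choose `B-MAL` for the second token instead, which illustrates why a CRF layer is typically placed on top of the encoder.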