Data Augmentation Method via Large Language Model for Relation Extraction in Cybersecurity
Relation extraction can be used to mine and analyze threat intelligence, providing crucial information for network security defense. However, relation extraction in cybersecurity suffers from a shortage of datasets. In recent years, large language models have shown strong text generation ability, which offers powerful technical support for data augmentation. To compensate for the shortcomings of traditional data augmentation methods in accuracy and diversity, this paper proposes MGDA, a data augmentation method based on large language models for relation extraction in cybersecurity. MGDA uses a large language model to augment the original data at four granularities (words, phrases, grammar, and semantics) so as to preserve accuracy while improving diversity. Experimental results show that the proposed method effectively improves both the performance of relation extraction in cybersecurity and the diversity of the generated data.
cyber security; relation extraction; data augmentation; large language model
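
The four-granularity idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the prompt wording, the function names `build_prompts` and `augment`, the `generate` callable standing in for a large language model, and the entity-retention filter are hypothetical and are not taken from MGDA itself.

```python
from typing import Callable, Dict, List

# Hypothetical prompt templates, one per granularity named in the abstract.
# The wording is an assumption, not the prompts used by MGDA.
TEMPLATES: Dict[str, str] = {
    "word": "Replace some words with synonyms, keeping '{head}' and '{tail}' unchanged: {sentence}",
    "phrase": "Rewrite some phrases, keeping '{head}' and '{tail}' unchanged: {sentence}",
    "grammar": "Change the sentence structure, keeping '{head}' and '{tail}' unchanged: {sentence}",
    "semantics": "Paraphrase the whole sentence, keeping '{head}' and '{tail}' unchanged: {sentence}",
}


def build_prompts(sentence: str, head: str, tail: str) -> Dict[str, str]:
    """Build one prompt per granularity for a single labeled sentence."""
    return {
        granularity: template.format(sentence=sentence, head=head, tail=tail)
        for granularity, template in TEMPLATES.items()
    }


def augment(
    sentence: str,
    head: str,
    tail: str,
    relation: str,
    generate: Callable[[str], str],
) -> List[dict]:
    """Generate augmented samples, keeping the original relation label.

    `generate` is any function that maps a prompt to generated text
    (for example, a wrapper around an LLM client). A candidate is kept
    only if both entity mentions survive and the text actually changed;
    this simple filter is an assumed stand-in for an accuracy check.
    """
    samples = []
    for granularity, prompt in build_prompts(sentence, head, tail).items():
        candidate = generate(prompt).strip()
        if head in candidate and tail in candidate and candidate != sentence:
            samples.append(
                {
                    "text": candidate,
                    "head": head,
                    "tail": tail,
                    "relation": relation,
                    "granularity": granularity,
                }
            )
    return samples
```

In this sketch the relation label is copied unchanged to every kept sample, so accuracy depends on the filter rejecting generations that drop or alter an entity, while diversity comes from issuing a different rewriting instruction at each granularity.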