A Data Augmentation-Based Keywords Extraction Model for Scientific and Technical Literature
[Research purpose] Keywords extraction from scientific and technical literature has significant research value. Existing keywords extraction methods have large errors and can only select keywords that appear verbatim in the text, which makes it difficult to obtain words that better match the core theme of the text on the basis of deep semantic information. This paper focuses on the limitations of keywords extraction caused by inadequate mining of implicit contextual semantics and insufficient attention to key information, and conducts research to address these issues. [Research method] It proposes a keywords extraction model, GPBA (GPT-2 BiLSTM Mul-Attention), which uses a language model for data augmentation and combines it with a BiLSTM+Mul-Attention extraction model that performs multi-feature fusion to understand semantic information. [Research conclusion] The experimental results demonstrate that GPBA, the data-augmented keywords extraction model, outperforms the baseline models and accurately condenses keywords from text.
scientific and technical literature; keywords extraction model; data augmentation; semantic information; evaluation metrics