Journal of Kunming University of Science and Technology (Natural Science Edition), 2024, Vol. 49, Issue 6: 57-63. DOI: 10.16112/j.cnki.53-1223/n.2024.06.331

Extraction of Keywords from Academic Papers Based on Deep Learning and Terminology Bank

Chen Ruoyu¹, Li Yan², Wu Zhuo², Du Zhenlei³

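Author information

  • 1. Institute of Intelligent Information Processing, Beijing Information Science and Technology University, Beijing 100192; School of Computer Science, Beijing Information Science and Technology University, Beijing 102206
  • 2. School of Computer Science, Beijing Information Science and Technology University, Beijing 102206
  • 3. China National Committee for Terminology in Science and Technology, Beijing 100717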

Abstract

Keywords in academic papers play an important role in revealing a paper's topic, improving the accuracy of literature retrieval, and promoting academic communication. To address the problem of non-standard keyword selection in academic papers, the abstracts of a number of Chinese academic papers in computer science were collected with a web crawler, and a dataset mapping paper abstracts to standardized terms was annotated based on the standardized terminology bank approved by the China National Committee for Terminology in Science and Technology. Using this dataset and deep learning techniques, a matching model between paper abstracts and standardized terms was built to enable computer-aided keyword extraction from academic papers. Experiments verified the feasibility of the proposed method. In addition, incremental training and evaluation based on experience replay verified the model's generalization ability under incremental learning.
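The abstract does not say how replayed examples are stored or mixed with new data. As a rough illustration only, the sketch below shows one common way experience replay is set up for incremental training: a fixed-size memory filled by reservoir sampling, from which a share of old examples is drawn and mixed into each new batch. The record layout `(abstract, term, label)`, the names `ReplayBuffer` and `build_incremental_batch`, and the `replay_ratio` parameter are all assumptions made for this example, not details taken from the paper.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical record layout: (abstract text, candidate term, match label).
Example = Tuple[str, str, int]


class ReplayBuffer:
    """Fixed-capacity memory of past examples, filled by reservoir sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.items: List[Example] = []
        self.seen = 0  # total number of examples offered to the buffer
        self.rng = random.Random(seed)

    def add(self, example: Example) -> None:
        """Keep each example seen so far with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k: int) -> List[Example]:
        """Draw up to k stored examples without replacement."""
        return self.rng.sample(self.items, min(k, len(self.items)))


def build_incremental_batch(
    new_examples: Sequence[Example],
    buffer: ReplayBuffer,
    replay_ratio: float = 0.5,
) -> List[Example]:
    """Mix new examples with replayed old ones, then store the new ones."""
    n_replay = int(len(new_examples) * replay_ratio)
    replayed = buffer.sample(n_replay)
    batch = list(new_examples) + replayed
    buffer.rng.shuffle(batch)
    for example in new_examples:
        buffer.add(example)
    return batch
```

In a training loop, each batch returned by `build_incremental_batch` would be passed to whatever matching model is being fine-tuned, so that earlier abstract-term pairs keep appearing alongside newly annotated ones.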

Key words

deep learning / terminology bank / academic papers / keyword extraction


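Publication year

2024
Journal of Kunming University of Science and Technology (Natural Science Edition)
Kunming University of Science and Technology

CSTPCD / Peking University Core Journals
Impact factor: 0.516
ISSN: 1007-855X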