Review of Term Recognition Studies Based on Deep Learning (基于深度学习的术语识别研究综述)

Abstract: [Objective] This paper reviews the current state of research on deep learning models for term recognition and the challenges they face. [Coverage] We searched CNKI (中国知网) with the query 主题="术语识别"+"术语抽取" (topic = "term recognition" + "term extraction") and the Web of Science with the query subject="(extract terms OR term recognition OR technology detection OR relation classification) AND deep learning AND ner", and selected 73 articles for review. [Methods] We reviewed the general framework of deep learning-based term recognition, the choice of models, the advantages and disadvantages of each model, and future development trends. [Results] Deep learning-based term recognition methods fall into three categories: methods that use a single neural network model, methods that use composite neural network models, and term recognition that incorporates deep learning models. In terms of usage, models built on or extending BiLSTM-CRF are the mainstream approach to term recognition; BERT and its optimized variants have been research hotspots in recent years; in specific domains, studies tend to use multi-task models in place of neural network models; and the application of transfer learning and active learning has become a new research direction. [Limitations] We only performed a structured analysis of the models and training results reported in existing studies and did not compare the training performance of different models on the same dataset; this is left for future work. [Conclusion] Future research on deep learning-based term recognition could go deeper into term annotation schemes, the integration of multidimensional term features, term recognition techniques for small or zero datasets, cross-domain model generalization, interpretability of results, and improved evaluation methods.

Keywords:

Term Recognition; Deep Learning; Text Mining

Authors:

Ruan Guangce (阮光册), Zhong Jinghan (钟静涵), Zhang Yidi (张祎笛)

Affiliation:

Department of Information Management, East China Normal University, Shanghai 200062

Keywords (Chinese original):

术语识别 (term recognition); 深度学习 (deep learning); 文本挖掘 (text mining)

Publication year:

2024

DOI:

10.11925/infotech.2096-3467.2023.0158

Data Analysis and Knowledge Discovery (数据分析与知识发现)

National Science Library, Chinese Academy of Sciences (中国科学院文献情报中心)

CSTPCD; CSSCI; CHSSCD; 北大核心 (Peking University Core Journals); EI

Impact factor: 1.452

ISSN: 2096-3467

Year, volume (issue): 2024, 8(4)

Number of references: 77