Named Entity Recognition Algorithm Based on Mixed Transfer Learning

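Abstract (translated from Chinese): Large amounts of labeled data are difficult to obtain for named entity recognition. To address this problem, a named entity recognition algorithm based on mixed transfer learning, MT-NER, is proposed. The distance between samples is used as the criterion for measuring sample similarity, and instance transfer is performed to expand the target-domain samples. Model transfer is then used to build a new named entity recognition network with fine-tuning, and this network is trained on the expanded target-domain dataset. Experiments in the medical domain show that MT-NER achieves the best entity recognition performance on small-sample data, with a precision of 93.31%, a recall of 89.5%, and an F1 value of 0.9317. Compared with the BiLSTM-CRF model, these are improvements of 6.33 percentage points, 3.65 percentage points, and 0.0891, respectively.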

English title: Named Entity Recognition Algorithm Based on Mixed Transfer Learning

English abstract: In named entity recognition, large amounts of labeled data are difficult to obtain. To solve this problem, this paper proposes MT-NER, a named entity recognition algorithm based on mixed transfer learning. The distance between samples is used as the criterion for measuring sample similarity, and instance-based transfer learning is carried out to expand the target-domain samples. A new named entity recognition network with fine-tuning is established through model-based transfer learning, and the expanded target-domain dataset is used to train the network. Taking the medical field as an example, experiments show that MT-NER gives the best entity recognition results on small-sample data, with a precision of 93.31%, a recall of 89.5%, and an F1 value of 0.9317. Compared with the BiLSTM-CRF model, the precision, recall, and F1 value of MT-NER are improved by 6.33, 3.65, and 8.91 percentage points, respectively.
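The instance-transfer step described in the abstract can be illustrated with a minimal sketch. The abstract states only that the distance between samples is used as the similarity criterion; the Euclidean metric, the fixed-length sample vectors, the nearest-neighbor rule, the `threshold` parameter, and the function name `select_transfer_samples` are all assumptions made here for illustration and are not taken from the paper.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_transfer_samples(
    source: Sequence[Vector],
    target: Sequence[Vector],
    threshold: float,
) -> List[int]:
    """Return indices of source samples close enough to the target domain.

    A source sample is kept when its distance to the nearest target sample
    is at most `threshold` (a hypothetical cutoff, not from the paper).
    """
    selected = []
    for i, s in enumerate(source):
        nearest = min(euclidean(s, t) for t in target)
        if nearest <= threshold:
            selected.append(i)
    return selected


# Toy vectors standing in for sample representations (made-up values).
source_vecs = [[0.1, 0.2], [0.9, 0.8], [5.0, 5.0]]
target_vecs = [[0.0, 0.0], [1.0, 1.0]]

picked = select_transfer_samples(source_vecs, target_vecs, threshold=0.5)
print(picked)  # → [0, 1]
```

In this toy run the first two source vectors lie within 0.5 of a target vector and would be added to the target-domain training set, while the third is too far away and is left out.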

English keywords:

Named entity recognition; Transfer learning; Bidirectional LSTM-CRF; Distribution adaptation

Authors:

Yu Xiaosheng (余肖生), Zhang Hehuan (张合欢), Chen Peng (陈鹏)

Affiliation:

College of Computer and Information Technology, China Three Gorges University, Yichang 443002, Hubei, China

Keywords (translated from Chinese):

named entity recognition; transfer learning; bidirectional LSTM-CRF; distribution adaptation

Funding:

National Key Research and Development Program of China

Grant number:

2016YFC0802500

Year of publication:

2024

DOI:

10.3969/j.issn.1000-386x.2024.08.044

Journal: Computer Applications and Software (计算机应用与软件)

Sponsors: Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.615

ISSN: 1000-386X

Year, volume (issue): 2024, 41(8)

Number of references: 7