Method for extracting patent entity relationships based on nearest pairing addressing
Li Chengqi (李成奇)1, Lei Haiwei (雷海卫)1, Li Fan (李帆)1, Hu Wenxiu (呼文秀)1
Author information
- 1. School of Computer Science and Technology, North University of China, Taiyuan, Shanxi 030051, China
Abstract
To address the lack of publicly available datasets in the patent domain, a Chinese patent entity relation extraction dataset, PERD (patent entity relation data-set), was annotated. To carry out the entity relation extraction task, the nearest pair addressing entity relationship extraction model (NPAM) is proposed. An improved method for acquiring entity position information, an attention-mechanism modeling matrix, and a new entity extraction method allow the model to reach an F1 score of 72.74% on PERD, 12.64 percentage points higher than the PRGC model. The experimental results verify the effectiveness of the model.
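The abstract reports its result as an F1 score and as a gain in percentage points. The sketch below illustrates how such figures are commonly computed for relation triples. It assumes exact-match micro-averaged F1 over (subject, relation, object) triples, which the abstract does not confirm. The function names and example triples are hypothetical and are not taken from PERD or NPAM.

```python
from typing import List, Set, Tuple

# Hypothetical triple layout: (subject, relation, object).
Triple = Tuple[str, str, str]


def micro_f1(gold: List[Set[Triple]], pred: List[Set[Triple]]) -> float:
    """Exact-match micro F1 over per-sentence triple sets (an assumed metric)."""
    true_pos = n_pred = n_gold = 0
    for gold_set, pred_set in zip(gold, pred):
        true_pos += len(gold_set & pred_set)  # triples matching exactly
        n_pred += len(pred_set)
        n_gold += len(gold_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


def percentage_point_gain(new_pct: float, baseline_pct: float) -> float:
    """Absolute difference between two scores already expressed in percent."""
    return new_pct - baseline_pct


# Illustrative data only.
gold = [
    {("battery", "contains", "electrode"), ("electrode", "made_of", "graphite")},
    {("valve", "connected_to", "pipe")},
]
pred = [
    {("battery", "contains", "electrode"), ("electrode", "made_of", "copper")},
    {("valve", "connected_to", "pipe")},
]

score = micro_f1(gold, pred)

# Baseline implied by the reported figures: 72.74 - 12.64.
implied_baseline = round(72.74 - 12.64, 2)
```

Read as an absolute difference, the reported numbers imply a PRGC F1 of about 60.10% on PERD. A relative improvement of 12.64% would give a different baseline, which is why the abstract's "percentage points" wording matters.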
Key words
entity relationship extraction/patent field/dataset/nearest pair addressing/attention mechanism/correlation matrix/whole word tag
Funding
Shanxi Province Key Research and Development Program (201903D121166)
Publication year
2024