Computer Simulation, 2024, Vol. 41, Issue 11: 512-516.

A Method for Named Entity Recognition in Annotated Word and Sentence Sequences in a Knowledge Base

Guo Long (1), Liang Can (2), Li Yanli (3)

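Author information

  • 1. CNOOC Information Technology Co., Ltd., Zhanjiang Branch, Haikou, Hainan 570100, China
  • 2. School of Civil Engineering and Architecture, Changzhou Institute of Technology, Changzhou, Jiangsu 213032, China; National Key Laboratory of Petroleum Resources and Engineering, China University of Petroleum (Beijing), Beijing 102249, China
  • 3. CNOOC (China) Limited, Hainan Branch, Haikou, Hainan 570100, China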

Abstract

Network text data is characterized by diverse information types, massive scale, and highly variable form. Traditional sequence-based named entity recognition methods struggle to mine knowledge base text accurately, tend to lose textual information, and recognize entity information poorly. To address these problems, a graph neural network based method for named entity recognition in knowledge bases is proposed. The method represents text through word-sentence fusion so that word and sentence information is not lost during recognition. A forget gate with a sigmoid function then discards irrelevant or weakly relevant word and sentence information and retains the strongly relevant information. A tanh function and a memory cell unit update the word and sentence information, a graph neural network mines the features of words and sentences and the relationships between them, and a conditional random field trained by maximizing the likelihood function labels the word and sentence sequences to determine the named entities. Finally, experiments are used to validate the proposed method. The results show that the method significantly improves named entity recognition accuracy, converges quickly, and performs well in application.
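The gating step described in the abstract can be illustrated with a minimal sketch. The abstract gives no equations, so the code below assumes the standard long short-term memory (LSTM) cell formulation, reduced to scalars for readability. The function name `lstm_cell_step` and the parameter keys (`wf`, `uf`, `bf`, and so on) are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(x, h_prev, c_prev, p):
    """One step of a standard scalar LSTM cell (illustrative assumption).

    x      -- current input feature (e.g. one fused word-sentence value)
    h_prev -- previous hidden state
    c_prev -- previous memory cell state
    p      -- dict of hypothetical weights and biases
    """
    # Forget gate: a sigmoid value near 0 discards old memory, near 1 keeps it.
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])
    # Input gate: decides how much of the new candidate is written.
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])
    # Candidate memory: tanh keeps the proposed update in (-1, 1).
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])
    # Output gate: decides how much of the memory is exposed.
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])
    # Memory cell update: retained old memory plus gated new information.
    c = f * c_prev + i * g
    # Hidden state passed on to later layers.
    h = o * math.tanh(c)
    return h, c, f


def zero_params(forget_bias):
    """Hypothetical parameter set: all zeros except the forget-gate bias."""
    keys = ["wf", "uf", "bf", "wi", "ui", "bi",
            "wg", "ug", "bg", "wo", "uo", "bo"]
    p = {k: 0.0 for k in keys}
    p["bf"] = forget_bias
    return p
```

With a strongly negative forget-gate bias the previous memory is almost entirely discarded, and with a strongly positive one it is almost entirely retained, which is the "remove weakly relevant, keep strongly relevant" behavior the abstract attributes to the forget gate. In a full model the hidden states would feed the graph neural network and conditional random field stages, which this sketch does not cover.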

Keywords

Graph neural network / Knowledge base / Character and word vectors / Named entity / Long short-term memory network

Publication year

2024
Computer Simulation
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348