A Method for Identifying Named Entities in Annotated Word and Sentence Sequences in a Knowledge Base

Web text data is diverse in information type, large in scale, and variable in form. Traditional sequence-based named entity recognition methods struggle to mine knowledge-base text accurately: text information is easily lost, and entity recognition quality suffers. To address these problems, a graph-neural-network-based method for named entity recognition in knowledge bases is proposed. The method represents text with fused word and sentence features, so that word and sentence information is not lost during recognition. A forget gate with a sigmoid function then discards irrelevant or weakly relevant word and sentence information and retains the strongly relevant information, and the retained information is updated through a tanh function and a memory cell unit. A graph neural network mines the features of words and sentences and the relationships between them, and a conditional random field, trained by maximizing the likelihood function, labels the word and sentence sequences to determine the named entities. Finally, experiments are used to verify the advantages of the proposed method. The experimental results show that the method significantly improves named entity recognition accuracy, converges quickly, and performs well in application.
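The gating step described in the abstract can be illustrated with a single step of a standard LSTM cell. This is a minimal sketch, not the authors' implementation: the abstract gives no equations, so the standard LSTM formulation is assumed, scalars are used in place of vectors, and the names `lstm_step` and the parameter dictionary `p` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step (standard formulation, assumed here).

    x      : current input feature (e.g. a fused word/sentence feature)
    h_prev : previous hidden state
    c_prev : previous memory cell state
    p      : hypothetical dict mapping gate name -> (w_x, w_h, b)
    """
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))   # near 0 discards old memory, near 1 keeps it
    i = sigmoid(pre("input"))    # how much new information to admit
    g = math.tanh(pre("cand"))   # candidate content, squashed into (-1, 1)
    o = sigmoid(pre("output"))   # how much of the cell to expose

    c = f * c_prev + i * g       # memory cell update
    h = o * math.tanh(c)         # new hidden state
    return h, c


# Usage: a strongly negative forget bias erases almost all of the old memory.
params = {
    "forget": (0.0, 0.0, -10.0),
    "input": (0.0, 0.0, 0.0),
    "cand": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}
h, c = lstm_step(x=0.5, h_prev=0.0, c_prev=3.0, p=params)
```

In a full model the hidden states would feed the graph neural network and the conditional random field layer mentioned in the abstract. Those stages are not sketched here.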

Keywords: graph neural network; knowledge base; word vector; named entity; long short-term memory network

Guo Long (郭龙), Liang Can (梁灿), Li Yanli (李彦丽)

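Zhanjiang Branch, CNOOC Information Technology Co., Ltd., Haikou, Hainan 570100, China

School of Civil Engineering and Architecture, Changzhou Institute of Technology, Changzhou, Jiangsu 213032, China

National Key Laboratory of Petroleum Resources and Engineering, China University of Petroleum (Beijing), Beijing 102249, China

Hainan Branch, CNOOC (China) Co., Ltd., Haikou, Hainan 570100, China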

2024

Computer Simulation (计算机仿真)
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348
Year, volume (issue): 2024, 41(11)