Chinese Named Entity Recognition Based on BERT with Lexicon Fusion

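Abstract: Named entity recognition (NER) is a very important task in natural language processing: whether the entities in a sentence are identified correctly largely determines whether the sentence itself can be understood correctly. Chinese NER is harder than English NER because Chinese has no boundary marker comparable to the spaces between English words, and because Chinese entities are often nested in complex ways. Most existing Chinese NER methods use features from only a single level. To address this, the proposed method fuses the BERT Chinese pre-trained model with an additional lexicon dataset to strengthen word-level semantics and Chinese contextual relations, uses a BiGRU network to obtain the sequence feature matrix, and uses a conditional random field (CRF) model to generate the globally optimal label sequence, thereby improving entity recognition accuracy. Experimental results show that the method outperforms existing models on public datasets.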
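The boundary problem described in the abstract can be illustrated with a small lexicon-matching sketch. This is not the paper's fusion method, which the abstract does not specify; it only shows how an external lexicon can attach candidate words to each character of an unsegmented sentence. The function name `match_lexicon`, the `max_len` parameter, and the toy lexicon are hypothetical.

```python
def match_lexicon(sentence, lexicon, max_len=4):
    """For every character position, list the lexicon words that cover it."""
    covering = [[] for _ in sentence]
    for start in range(len(sentence)):
        # Try every substring of 2..max_len characters beginning at `start`.
        for end in range(start + 2, min(len(sentence), start + max_len) + 1):
            word = sentence[start:end]
            if word in lexicon:
                for pos in range(start, end):
                    covering[pos].append(word)
    return covering


# Hypothetical toy lexicon; a real system would load a large external word list.
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
sentence = "南京市长江大桥"

matches = match_lexicon(sentence, lexicon)
for char, words in zip(sentence, matches):
    print(char, words)
```

The character 长 is covered by 市长, 长江, and 长江大桥 at the same time, which is the kind of overlapping and nested candidate that a character-only model cannot see directly. In a lexicon-enhanced model, such candidate words would typically be embedded and combined with the character representations; how that combination is done here is left open.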

English title: Chinese Named Entity Recognition Based on Bert Fusion Vocabulary

English abstract: Named entity recognition is a very important task in natural language processing. Correctly identifying the entities in a sentence is crucial to understanding the sentence. Chinese named entity recognition is more difficult than English because Chinese has no boundary markers like the spaces in English, and because it exhibits complex nesting. To address the problem that most existing Chinese named entity recognition methods use only single-level features, a model that fuses the BERT Chinese pre-trained model with an additional vocabulary dataset is used to enhance word meaning and Chinese contextual connections. A BiGRU network is used to obtain the sequence feature matrix, and a conditional random field model is used to generate the globally optimal sequence, thereby improving the accuracy of entity recognition. The experimental results show that this method outperforms existing models on public datasets.
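The final step, in which a conditional random field generates the globally optimal sequence, is commonly implemented with Viterbi decoding. The following is a minimal sketch under stated assumptions: the emission scores stand in for the output of the BiGRU layer, and the transition scores stand in for learned CRF parameters. All numbers, the three-tag set, and the function name `viterbi_decode` are hypothetical and are not taken from the paper.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag path and its score for a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    best = [list(emissions[0])]  # best[t][j]: best score of a path ending in tag j
    backptr = []                 # backptr[t-1][j]: previous tag on that best path
    for t in range(1, len(emissions)):
        scores, pointers = [], []
        for j in range(n_tags):
            cand = [best[-1][i] + transitions[i][j] for i in range(n_tags)]
            i_best = max(range(n_tags), key=cand.__getitem__)
            scores.append(cand[i_best] + emissions[t][j])
            pointers.append(i_best)
        best.append(scores)
        backptr.append(pointers)
    # Follow the back-pointers from the best final tag.
    last = max(range(n_tags), key=best[-1].__getitem__)
    path = [last]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


tags = ["O", "B", "I"]
FORBIDDEN = -10000.0
# Hypothetical transition scores: an "I" tag may not directly follow "O".
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B
    [0.0, 0.0, 0.0],        # from I
]
# Hypothetical per-character emission scores for a three-character input.
emissions = [
    [2.0, 1.5, 0.0],
    [0.0, 0.5, 2.0],
    [1.0, 0.0, 0.2],
]

path, score = viterbi_decode(emissions, transitions)
print([tags[i] for i in path], score)
```

Picking the best tag independently at each position would give O, I, O, which violates the tagging scheme. Scoring whole sequences instead yields B, I, O, which is what makes the CRF output globally rather than locally optimal.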

English keywords:

natural language processing; named entity recognition; word combination; deep learning

Authors:

Song Yu (宋煜), Li Kefeng (李可丰)

Affiliation:

School of Computer and Information Engineering, Shanghai Polytechnic University (上海第二工业大学), Shanghai 201209

Keywords:

natural language processing; named entity recognition; character-word combination; deep learning

Year of publication:

2024

DOI: