
Named Entity Recognition Based on Large Language Model

Although natural language generation (NLG) large language models, represented by ChatGPT, perform well on most natural language processing tasks, their performance on sequence recognition tasks such as named entity recognition (NER) still lags behind that of deep learning models based on bidirectional encoder representations from Transformers (BERT). To address this, this study recasts the existing Chinese NER problem as a machine reading comprehension problem and proposes a new method based on in-context learning and model fine-tuning. The method enables NLG language models to achieve better NER results and, unlike other methods, does not require changing the pre-training parameters of the base model. In addition, because named entities are generated by the model rather than obtained by classifying the original data, there is no boundary problem. To verify the effectiveness of the new framework on NER, experiments are conducted on several Chinese NER datasets. On the Resume and Weibo datasets, the F1 scores reach 96.04% and 67.87%, respectively, improvements of 0.4 and 2.7 percentage points over the state-of-the-art (SOTA) models, which confirms that the new framework can effectively exploit the text generation strengths of NLG language models to perform NER.
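The abstract does not give the prompt format, so the following is only a minimal sketch of how NER might be posed as a machine reading comprehension question with in-context demonstrations. The question wording, the `TYPE_QUESTIONS` table, the `;` separator, and the function names `build_prompt` and `parse_answer` are all assumptions made for illustration, not the authors' implementation. Because the entities are generated as text, the parser simply keeps answers that occur verbatim in the input.

```python
from typing import Dict, List, Tuple

# Hypothetical questions per entity type; the paper's actual wording is not
# stated in the abstract.
TYPE_QUESTIONS: Dict[str, str] = {
    "PER": "Which person names appear in the text?",
    "ORG": "Which organization names appear in the text?",
}


def build_prompt(text: str, entity_type: str,
                 demos: List[Tuple[str, List[str]]]) -> str:
    """Build an MRC-style prompt: solved demonstrations, then the query."""
    question = TYPE_QUESTIONS[entity_type]
    parts = []
    for demo_text, demo_entities in demos:
        # Demonstrations show the model the expected answer format.
        answer = "; ".join(demo_entities) if demo_entities else "None"
        parts.append(f"Text: {demo_text}\nQuestion: {question}\nAnswer: {answer}")
    # The final block is left open for the model to complete.
    parts.append(f"Text: {text}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(parts)


def parse_answer(answer: str, text: str) -> List[str]:
    """Split a generated answer into entities, keeping only exact substrings of text."""
    if answer.strip().lower() == "none":
        return []
    entities: List[str] = []
    for candidate in answer.split(";"):
        candidate = candidate.strip()
        # Discard empty strings, duplicates, and strings absent from the input.
        if candidate and candidate in text and candidate not in entities:
            entities.append(candidate)
    return entities
```

Under these assumptions, `build_prompt("李四加入了阿里巴巴。", "PER", [("张三在中国科学技术大学工作。", ["张三"])])` yields a prompt that ends with an open `Answer:` line, and the model's completion would then be passed to `parse_answer` together with the input sentence.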

named entity recognition (NER); model fine-tuning; machine reading comprehension; in-context learning; large language model (LLM)

Ye Mingwei (叶名玮), Tang Jia (汤嘉), Guo Yan (郭燕), Wu Guixing (吴桂兴)

School of Software Engineering, University of Science and Technology of China, Hefei 230026

Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou 215123

General Program of the Natural Science Foundation of Jiangsu Province

BK20161209

2024

Computer Systems & Applications
Institute of Software, Chinese Academy of Sciences

CSTPCD
Impact factor: 0.449
ISSN:1003-3254
Year, volume (issue): 2024, 33(8)