This paper describes the system our team submitted to the CCL23 evaluation on named entity recognition in ancient books. The task aims to automatically identify important entities that form the basic elements of events in ancient texts, such as person names, book titles, and official titles. It is divided into an open track and a closed track according to whether the model used has more than 10B parameters. For this task, we first continue pre-training and then fine-tune an open-source pre-trained model, which significantly improves the performance of the base model on named entity recognition in ancient books. Second, we propose an untrusted-entity screening algorithm based on pair-wise voting to obtain candidate entities, and apply a context enhancement strategy to correct the recognition of those candidates. In the final evaluation, our system ranked second in the closed track with an F1 score of 95.8727.
named entity recognition; continued pre-training; entity correction
Wang Shiquan (王士权), Shi Lingling (石玲玲), Pu Luwen (蒲璐汶), Fang Ruiyu (方瑞玉), Zhao Yu (赵宇), Song Shuangyong (宋双永)
Digital Intelligence Technology Branch, China Telecom Corporation Limited (中国电信股份有限公司数字智能科技分公司)
Chinese National Conference on Computational Linguistics
Harbin (CN)
22nd Chinese National Conference on Computational Linguistics (CCL 2023): Evaluations