Reducing Human Effort in Named Entity Corpus Construction Based on Ensemble Learning and Annotation Categorization

Annotated named entity corpora play a significant role in many natural language processing applications. However, annotation by humans is time-consuming and costly. In this paper, we propose a high-recall pre-annotator that combines multiple existing named entity taggers through ensemble learning, reducing the number of annotations that humans have to add. In addition, annotations are categorized into normal annotations and candidate annotations according to their estimated confidence, which reduces both the number of human corrective actions and the total annotation time. The experimental results show that our approach outperforms the baseline methods in reducing annotation time, without loss in annotation performance (in terms of F-measure).
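
The two ideas in the abstract (pooling several taggers for high recall, then splitting annotations by confidence) can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' method: the paper uses ensemble learning to combine taggers, whereas this sketch substitutes a plain vote fraction as the confidence estimate, and the names `pre_annotate`, `Span`, and the 0.5 threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical span format: (start token, end token exclusive, entity type).
Span = Tuple[int, int, str]


def pre_annotate(tagger_outputs: List[List[Span]],
                 threshold: float = 0.5) -> Dict[str, List[Span]]:
    """Pool spans from all taggers and split them by estimated confidence.

    Taking the union of all taggers' spans favors recall. Confidence is
    approximated by the fraction of taggers that proposed a span, which is
    a stand-in for a learned ensemble model, not the paper's estimator.
    """
    votes: Counter = Counter()
    for spans in tagger_outputs:
        # Count each tagger at most once per span.
        votes.update(set(spans))

    n_taggers = len(tagger_outputs)
    result: Dict[str, List[Span]] = {"normal": [], "candidate": []}
    for span in sorted(votes):
        confidence = votes[span] / n_taggers
        category = "normal" if confidence >= threshold else "candidate"
        result[category].append(span)
    return result


# Three hypothetical taggers run over the same sentence.
outputs = [
    [(0, 2, "PER"), (5, 6, "LOC")],
    [(0, 2, "PER")],
    [(0, 2, "PER"), (8, 9, "ORG")],
]
result = pre_annotate(outputs)
```

In this sketch, spans that most taggers agree on are shown as normal annotations, while low-agreement spans are kept as candidates for the annotator to accept or ignore, so a wrong low-confidence span does not require an explicit delete action.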

Corpus construction, Named Entity Recognition, Assisted annotation, Ensemble learning

Tingming Lu, Man Zhu, Zhiqiang Gao

Key Lab of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China; School of Computer Science and Engineering, Southeast University, Nanjing, China

School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, China

International Conference on Computer Processing of Oriental Languages; CCF Conference on Natural Language Processing and Chinese Computing

Kunming (CN)

Natural language understanding and intelligent applications

263-274

2016