
Research on Chinese Negative Focus Identification: Dataset and Baseline System

Natural language text contains a large number of negative expressions. Negative focus identification, a finer-grained form of negation semantic analysis, has begun to attract the attention of natural language processing (NLP) researchers in recent years. The task aims to identify the text fragments in a sentence that are modified and emphasized by negative cues, and it is important for downstream NLP tasks such as sentiment analysis and opinion mining. Compared with English, research on negative focus identification for Chinese has progressed slowly, mainly because no Chinese dataset has been available to provide training and test data for models. To address this issue, this paper manually annotates negative focus on the Chinese Negation and Speculation (CNeSp) corpus, takes a first look at how negative focus manifests in Chinese, and constructs a dataset of 5,762 samples. We also propose a baseline system based on a neural network model to provide a reference for subsequent studies.


Sheng Jiaxuan (盛佳璇), Zou Bowei (邹博伟), Shen Longxiang (沈龙骧), Ye Jing (叶静), Hong Yu (洪宇)


1. School of Computer Science and Technology, Soochow University, Suzhou 215000
2. Institute for Infocomm Research, Singapore 138632

Keywords: negative focus, dataset, manual annotation

Chinese National Conference on Computational Linguistics

Haikou(CN)

19th Chinese National Conference on Computational Linguistics

550-560

2020