
Annotation and Analysis of Chinese Interrogative Sentences Based on Abstract Meaning Representation

Syntactic and semantic analysis of interrogative sentences is widely used in search engines, information extraction, and question answering systems. Computational linguistics mostly handles interrogative sentences by combining question classification with syntactic parsing, and neither accuracy nor efficiency is yet satisfactory. Linguistic research on interrogative sentences is rich, covering their structural types, question focus, and the non-interrogative uses of interrogative pronouns, but it lacks a systematic formal representation. To address this problem, we annotate the semantic structure of interrogative sentences with Chinese Abstract Meaning Representation (CAMR), a graph-based representation of whole-sentence meaning for Chinese, so that the question focus and the semantics of the whole sentence are represented together. We then select 2,071 interrogative sentences from a corpus of 20,000 sentences drawn from the web-media portion of the Penn Chinese Treebank (CTB 8.0), elementary school Chinese textbooks, and the Chinese translation of The Little Prince, and summarize their main characteristics. The statistics show that every kind of interrogative pronoun can be represented by combining the interrogative concept amr-unknown with a semantic relation, which fully captures the key information, question focus, and semantic structure of an interrogative sentence. Finally, based on the semantic relations associated with the interrogative pronouns, we compute the probability distribution of the question focus: cause, modifier, and patient account for the largest shares, at 26.53%, 16.73%, and 16.44% respectively. Annotation and analysis of interrogative sentences based on abstract meaning representation can provide a theoretical basis and resources for the study of Chinese interrogative sentences.
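As a rough illustration of how a question focus can be read off such a graph, the sketch below stores each sentence as a list of (source, relation, target) triples and collects the relation that points at amr-unknown. The toy sentences, concept names, role labels, and the helper names `focus_relations` and `focus_distribution` are assumptions made for illustration; they are not annotations or tools from the paper.

```python
from collections import Counter

# Each toy graph is a list of (source, relation, target) triples.
# Sentences, concepts, and role labels are illustrative assumptions,
# not annotations taken from the paper's corpus.
GRAPHS = [
    # "What does the boy want?"
    [("want-01", ":ARG0", "boy"), ("want-01", ":ARG1", "amr-unknown")],
    # "Why did the boy leave?"
    [("leave-01", ":ARG0", "boy"), ("leave-01", ":cause", "amr-unknown")],
    # "Which book did you read?"
    [("read-01", ":ARG0", "you"), ("read-01", ":ARG1", "book"),
     ("book", ":mod", "amr-unknown")],
    # "Why are you crying?"
    [("cry-01", ":ARG0", "you"), ("cry-01", ":cause", "amr-unknown")],
]


def focus_relations(graph):
    """Return the relations whose target is the amr-unknown concept."""
    return [rel for _, rel, tgt in graph if tgt == "amr-unknown"]


def focus_distribution(graphs):
    """Relative frequency of each relation that points at amr-unknown."""
    counts = Counter(rel for g in graphs for rel in focus_relations(g))
    total = sum(counts.values())
    return {rel: n / total for rel, n in counts.items()}


if __name__ == "__main__":
    for rel, share in sorted(focus_distribution(GRAPHS).items()):
        print(f"{rel}\t{share:.2%}")
```

Counting relations in this way is one plausible reading of how a focus distribution like the one reported above could be tallied; the paper's actual procedure may differ.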

interrogative sentences; abstract meaning representation; semantic roles; Chinese information processing

Yan Peiyi (闫培艺), Li Bin (李斌), Huang Tong (黄彤), Huo Kairui (霍凯蕊), Chen Jin (陈瑾), Qu Weiguang (曲维光)


School of Chinese Language and Literature, Nanjing Normal University, Nanjing, Jiangsu

School of Computer Science and Technology, Nanjing Normal University, Nanjing, Jiangsu


Chinese National Conference on Computational Linguistics

Haikou (CN)

19th Chinese National Conference on Computational Linguistics

77-87

2020