国际汉语教学研究 (Journal of International Chinese Teaching), 2024, Issue 1: 81-94. DOI: 10.3969/j.issn.2095-798X.2024.01.010

汉语中介语依存树库偏误标注研究

A Study on Error Annotation of Chinese Interlanguage Dependency Treebank

钱隆 (Qian Long) 1, 王治敏 (Wang Zhimin) 2

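Author information

  • 1. Research Institute of International Chinese Language Education, Beijing Language and Culture University
  • 2. Faculty of Chinese Language and Culture, Guangdong University of Foreign Studies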

Abstract

Error annotation is an important component of annotating a Chinese interlanguage dependency treebank (CIDT). However, existing treebanks of this type have not yet fully integrated error annotation with syntactic annotation. This paper therefore proposes an error annotation scheme for CIDT, aiming to better implement the "basic annotation + error annotation" concept of interlanguage corpora. The scheme specifies the general principles of error annotation and the specific annotation rules. The general principles are the principle of order and the principle of naturalness, which together ensure the authenticity, naturalness, and validity of the corpus material, the dependency relations, and the errors during annotation. The specific annotation rules cover errors at the character, word, and sentence levels, and the scheme provides a corresponding annotation strategy for each level. At the sentence level in particular, the error tags are adapted from the dependency syntactic annotation system, which ensures a degree of consistency and a logical relationship between the coding of correct information and the coding of error information. The study finds that sentence-level errors usually occur in the combination of a modifier and its head, most prominently in dependency relations such as "adverbial", "predicate", "attribute", and "complement". Errors also affect dependency grammar analysis in two directions, vertical and horizontal: the vertical effect is directly related to the error type, while the horizontal effect is closely tied to the combinatorial capacity of words.

Key words

Chinese interlanguage corpus / dependency grammar / dependency treebank / error analysis / error annotation

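Funding

Major Program of the National Social Science Fund of China (2018) (18ZDA295)

Beijing Language and Culture University Graduate Innovation Fund, under the Fundamental Research Funds for the Central Universities (2023) (23YCX162)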

Publication year

2024

CHSSCD
Number of references: 25