Error annotation is a crucial component of the Chinese interlanguage dependency treebank (CIDT). However, existing treebanks have not yet combined error annotation and syntactic annotation effectively. In view of this, this paper proposes a new error annotation scheme for CIDT, aiming to better implement the annotation concept of "basic annotation + error annotation" in the interlanguage corpus. The scheme specifies the general principles of error annotation and the specific annotation rules. The general principles are those of order and naturalness, which together ensure the authenticity, naturalness, and validity of the corpus and of the dependencies and errors recorded during annotation. The specific annotation rules cover errors at the word, phrase, and sentence levels, and the scheme provides a corresponding annotation strategy for each level. In particular, at the sentence level, error annotation is adapted from the dependency syntactic annotation system, which keeps the annotation of correct information and the annotation of error information consistent and logically related. The study finds that sentence-level errors usually occur in the combination of modifiers and heads, especially in dependency relations such as "adverbials", "predicates", "attributes", and "complements". Such errors can affect dependency grammar analysis, with both vertical and horizontal effects. The vertical effects correlate with error types, while the horizontal effects are associated with the combinatorial capacity of words.
关键词
汉语中介语语料库/依存语法/依存树库/偏误分析/偏误标注
Key words
Chinese interlanguage corpus/dependency grammar/dependency treebank/error analysis/error annotation