Resolving Coordinate Structures for Chinese Constituent Parsing
NETL
Springer Nature
Coordinate structures are linguistic structures consisting of two or more conjuncts, which usually combine into a larger constituent that behaves as a single unit. However, the boundary of each conjunct is hard to identify, which makes it difficult to parse both the coordinate structure itself and the larger structures that contain it. In labeled data such as the Penn Chinese Treebank (CTB), coordinate structures are not annotated explicitly, which complicates the problem further. In this paper, we treat resolving coordinate structures as an independent sub-problem of parsing. We first define coordinate structures explicitly and design rules to extract them from labeled CTB data. We then propose a specifically designed grammar for parsing coordinate structures automatically, together with two groups of new features that better model coordinate structures in a shift-reduce parsing framework. Our approach achieves a 15% improvement in F1 score on resolving coordinate structures.
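To make the extraction step more concrete, the following is a minimal sketch of how coordinate structures might be pulled out of a bracketed treebank tree. It does not reproduce the paper's rules, which the abstract does not spell out. It assumes one simple heuristic: a node with a `CC` (coordinating conjunction) child is a coordination, and its remaining non-punctuation children are the conjuncts. The function names `parse_bracketed`, `leaves`, and `find_coordinations`, as well as the example tree, are hypothetical.

```python
from typing import List, Tuple


def parse_bracketed(text: str):
    """Parse one Penn-style bracketed tree into nested (label, children) tuples.

    Leaf children are plain token strings.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read()


def leaves(tree) -> List[str]:
    """Return the words covered by a subtree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]


def find_coordinations(tree) -> List[Tuple[str, List[str]]]:
    """Collect (parent label, conjunct strings) for nodes with a CC child.

    Assumed heuristic only: children tagged CC or PU are treated as
    connectives and every other child as a conjunct.
    """
    found = []
    if isinstance(tree, str):
        return found
    label, children = tree
    subtrees = [c for c in children if not isinstance(c, str)]
    if any(c[0] == "CC" for c in subtrees):
        conjuncts = [" ".join(leaves(c)) for c in subtrees
                     if c[0] not in ("CC", "PU")]
        if len(conjuncts) >= 2:
            found.append((label, conjuncts))
    for child in children:
        found.extend(find_coordinations(child))
    return found


# Hypothetical example tree, not taken from CTB.
example = "(IP (NP (NN 经济) (CC 和) (NN 文化)) (VP (VV 发展)))"
coords = find_coordinations(parse_bracketed(example))
print(coords)
```

A heuristic this simple misses coordinations joined only by punctuation (for example the enumeration comma 、) and does not check that the conjuncts share a category, which is part of why explicit definitions and dedicated extraction rules are needed.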