Automatic Recognition of Chinese Passive Sentences Based on Feature Fusion (基于特征融合的汉语被动句自动识别研究)

Chinese passive sentences can be classified as marked or unmarked according to whether a passive marker is present. Their forms are complex and diverse, which poses significant challenges for natural language understanding, so recognizing them automatically is important for downstream natural language processing tasks. In this paper, we construct a corpus of passive sentences and propose PC-BERT-CNN, a model that fuses part-of-speech and verb argument frame information to identify Chinese passive sentences automatically. Experimental results show that the proposed model identifies Chinese passive sentences accurately, achieving an F1 score of 98.77% on marked passive sentences and 96.72% on unmarked passive sentences.

Keywords: Chinese passive sentences; automatic recognition; feature fusion; corpus

Hu Kang (胡康), Qu Weiguang (曲维光), Wei Tingxin (魏庭新), Zhou Junsheng (周俊生), Li Bin (李斌), Gu Yanhui (顾彦慧)

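Zhongbei College, Nanjing Normal University, Danyang, Jiangsu 212334

School of Computer and Electronic Information / School of Artificial Intelligence, Nanjing Normal University, Nanjing, Jiangsu 210023

School of Chinese Language and Literature, Nanjing Normal University, Nanjing, Jiangsu 210097

International College for Chinese Studies, Nanjing Normal University, Nanjing, Jiangsu 210097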

Funding: National Social Science Fund of China (21&ZD288)

2024

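Journal of Chinese Information Processing (中文信息学报)
Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences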

Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)
Impact factor: 0.8
ISSN: 1003-0077
Year, volume (issue): 2024, 38(8)