
A Study of Dynamic Word Sense Disambiguation Based on Full-sentence Co-occurrence of Node Word

Starting from the observation that word sense disambiguation amounts to returning a word sense to its context, this article proposes a dynamic word sense disambiguation method based on full-sentence co-occurrence with a node word. The method first takes the full sentence as the window that delimits the node word's context of use. It then extracts words semantically related to the node word using statistical measures, including mutual information (MI), the chi-square (χ²) test, and the ratio of relative word rank (RRWR), and builds a semantic category database of the related words with reference to Tongyici Cilin (《同义词词林》, A Dictionary of Synonyms). Finally, with co-occurrence frequency as the weighting coefficient, it disambiguates polysemous words of low and medium co-occurrence frequency by relying on the distribution rate of the semantic clusters of monosemous words. In a test on 1,030 polysemous words with fewer than 7 sense categories that co-occur with "美丽" (meili, "beautiful"), the method achieved an accuracy of 85.2%.
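The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy sentences, the category labels ("season", "tool", "plant"), and every function name are hypothetical, and the normalization used for the distribution rate is an assumption. RRWR and the thresholds for selecting related words are left out because the abstract does not define them.

```python
import math
from collections import Counter

def cooccurrence_counts(sentences, node):
    """Count sentences per word, and sentences shared with `node` (full-sentence window)."""
    word_sents = Counter()  # number of sentences containing each word
    co_sents = Counter()    # number of sentences containing both the word and the node
    for sent in sentences:
        words = set(sent)
        word_sents.update(words)
        if node in words:
            co_sents.update(words - {node})
    return word_sents, co_sents

def mutual_information(n, n_node, n_word, n_both):
    """Pointwise mutual information (log base 2) over n sentences."""
    if n_both == 0:
        return float("-inf")
    return math.log2(n_both * n / (n_node * n_word))

def chi_square(n, n_node, n_word, n_both):
    """Pearson chi-square statistic for the 2x2 sentence contingency table."""
    a = n_both                           # node and word
    b = n_node - n_both                  # node without word
    c = n_word - n_both                  # word without node
    d = n - n_node - n_word + n_both     # neither
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def disambiguate(word, categories, co_sents):
    """Pick a category for a polysemous word from frequency-weighted monosemous neighbours.

    Assumption: each monosemous related word votes for its single category,
    weighted by its co-occurrence frequency with the node word.
    """
    weight = Counter()
    for w, cats in categories.items():
        if len(cats) == 1 and w in co_sents:
            weight[cats[0]] += co_sents[w]
    total = sum(weight.values())
    if total == 0:
        return None
    rates = {c: weight[c] / total for c in categories[word]}
    return max(rates, key=rates.get)

# Hypothetical toy corpus: each sentence is a list of already segmented words.
sentences = [
    ["meili", "spring", "flower"],
    ["meili", "summer", "flower"],
    ["meili", "summer"],
    ["meili", "autumn"],
    ["wrench", "spring"],
    ["meili", "wrench"],
    ["wrench", "hammer"],
    ["hammer", "nail"],
]
# Hypothetical category lexicon standing in for Tongyici Cilin.
categories = {
    "spring": ["season", "tool"],  # polysemous
    "summer": ["season"],
    "autumn": ["season"],
    "wrench": ["tool"],
    "flower": ["plant"],
}

word_sents, co_sents = cooccurrence_counts(sentences, "meili")
chosen = disambiguate("spring", categories, co_sents)
```

In this toy setup the monosemous neighbours of "meili" mostly fall in the "season" category, so that category receives the largest weighted share among the candidates for "spring". A real system would first filter related words by MI, chi-square, and RRWR scores before voting.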

node word; whole sentence co-occurrence; word sense disambiguation; semantic clustering; unsupervised learning

YAN Yaya (闫亚亚), XING Hongbing (邢红兵)


College of Chinese Language and Culture, Jinan University, Guangzhou, Guangdong 510610

Institute of International Student Education Policy and Evaluation, Beijing Language and Culture University, Beijing 100083


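National Natural Science Foundation of China (32271091); 2022 International Chinese Language Education Research Program, Youth Project, Center for Language Education and Cooperation, Ministry of Education (22YH69D)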

2024

Linguistic Sciences (语言科学)
Institute of Linguistic Sciences, Xuzhou Normal University


Indexed in: CSSCI; CHSSCD; PKU Core Journals
Impact factor: 0.583
ISSN: 1671-9484
Year, volume (issue): 2024, 23(4)