An Approach to Identify the Complete Reduplicated Multiword Expressions in Digital Bengali Text
This work presents an approach for recognizing complete reduplication in bi-word multiword expressions in a robust Bengali dataset. Reduplication, the repetition of a linguistic unit, is a crucial aspect of identifying multiword expressions. The proposed method operates in two stages: the first covers pre-processing, and the second identifies bi-gram word pairs using two different methods, followed by a comprehensive validation to measure the accuracy of the proposed system. Using the Levenshtein distance method, the proposed approach achieves an accuracy of 99% for three categories of bi-gram combinations of complete reduplicated multiword expressions, an improvement of 1% over the result of the related work.
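To make the second stage more concrete, the following is a minimal Python sketch of how adjacent word pairs could be screened with Levenshtein distance. It is an illustration under stated assumptions, not the authors' implementation: the function names (`levenshtein`, `tokenize`, `find_complete_reduplications`), the punctuation list, and the `max_distance` threshold are all hypothetical, and the abstract does not specify the actual pre-processing steps, the three bi-gram categories, or the second identification method.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming.

    Operates on Unicode code points, so Bengali vowel signs count as
    separate characters.
    """
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def tokenize(text: str) -> list:
    """Assumed pre-processing: replace punctuation with spaces, then split."""
    punctuation = "।,;:!?\"'()-"  # hypothetical list, includes the Bengali danda
    table = str.maketrans({ch: " " for ch in punctuation})
    return text.translate(table).split()


def find_complete_reduplications(text: str, max_distance: int = 0) -> list:
    """Return adjacent word pairs whose edit distance is within max_distance.

    max_distance=0 keeps only exact repetitions; a larger value is an
    assumption that would also admit near-identical pairs.
    """
    tokens = tokenize(text)
    return [
        (first, second)
        for first, second in zip(tokens, tokens[1:])
        if levenshtein(first, second) <= max_distance
    ]
```

For example, `find_complete_reduplications("সে ধীরে ধীরে হাঁটছে")` flags the pair `("ধীরে", "ধীরে")`. Because this sketch discards punctuation before pairing, two identical words separated only by a comma would also be flagged; a real pipeline would likely need sentence-aware tokenization and part-of-speech information to filter such cases.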
BCRME; Complete reduplication; Multiword expressions; Part-of-speech tagging; Natural language processing
Subrata Pan
Department of Information Technology, Bankura Unnayani Institute of Engineering, Bankura 722146, West Bengal, India
Journal of The Institution of Engineers (India), Series B. Electrical engineering, electronics and telecommunication engineering, computer engineering