Study Findings on Machine Learning Described by Researchers at National University of Uzbekistan (Developing named entity recognition algorithms for Uzbek: Data set insights and implementation)

Abstract

By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News – Investigators publish new report on artificial intelligence. According to news reporting from National University of Uzbekistan by NewsRx journalists, research stated, “This paper presents a dataset and approaches to named entity recognition (NER) in Uzbek language, in a resource-constrained language environment.” Our news reporters obtained a quote from the research from National University of Uzbekistan: “Despite the increase in NLP applications, the Uzbek language is still underrepresented, which underscores the importance of our work. Our dataset includes 1,160 sentences with nearly 19,000 word forms annotated for parts of speech and named entities, making it a valuable resource for linguistic research and machine learning applications in Uzbek. In addition, for practical application and experiments, the authors have developed two algorithms that, using this dictionary, identify named entities in Uzbek language texts. In addition, the authors described the methodology for creating the dataset, the design of the algorithms, and their application to the Uzbek language. This study not only provides an important dataset for future named entity recognition (NER) tasks in the Uzbek language, but also offers a methodological basis for the use of vocabulary-based NER or machine learning NER in other low-resource languages (e.g. Karakalpak).”
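The abstract does not describe how the two algorithms work, so the following is only a minimal sketch of what vocabulary-based NER can look like in general, not the authors' implementation. It assumes a small hand-made lexicon (`GAZETTEER`), a simple regex tokenizer, and greedy longest-match lookup; the entries, the labels (`PER`, `LOC`, `ORG`), and the function names are all hypothetical.

```python
import re

# Hypothetical lexicon: lowercase token tuples mapped to entity labels.
# These entries are illustrative and are not taken from the authors' dataset.
GAZETTEER = {
    ("toshkent",): "LOC",
    ("o'zbekiston",): "LOC",
    ("o'zbekiston", "milliy", "universiteti"): "ORG",
    ("alisher", "navoiy"): "PER",
}
MAX_LEN = max(len(key) for key in GAZETTEER)


def tokenize(text):
    # Keep apostrophes inside tokens, since Uzbek Latin script uses them (o', g').
    return re.findall(r"[\w'ʻ’]+", text)


def find_entities(text):
    """Return (surface form, label) pairs using greedy longest-match lookup."""
    tokens = tokenize(text)
    lowered = [t.lower() for t in tokens]
    entities = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shrink it.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            label = GAZETTEER.get(tuple(lowered[i:i + n]))
            if label is not None:
                entities.append((" ".join(tokens[i:i + n]), label))
                i += n
                break
        else:
            # No lexicon entry starts at this token; move on.
            i += 1
    return entities


sentence = "Alisher Navoiy Toshkent shahrida O'zbekiston Milliy universiteti talabalari bilan uchrashdi."
for surface, label in find_entities(sentence):
    print(label, surface)
```

Longest-match lookup lets the multiword entry "O'zbekiston Milliy universiteti" take precedence over the shorter entry "O'zbekiston". The sketch matches exact surface forms only; because Uzbek is agglutinative, a practical lexicon-based system would likely also need to handle suffixed forms, which is not attempted here.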

Key words

National University of Uzbekistan/Algorithms/Cyborgs/Emerging Technologies/Machine Learning/Named Entity Recognition

Publication year

2024

Robotics & Machine Learning Daily News
