Evaluation methods for multilingual word alignment capabilities in large language models
李洁 1, 李正芳 1, 邹垚 1, 熊大卫 1, 胡建 1
Author information
- 1. School of Computer Science and Artificial Intelligence, Southwest Minzu University, Chengdu 610041, Sichuan, China
Abstract
To address the current lack of evaluations of the multilingual word alignment capabilities of large language models, this paper proposes a method for evaluating these capabilities through a cross-lingual cloze task. The method generates a multidimensional cross-lingual cloze evaluation dataset according to rules such as the length of a word's context, its part of speech, and the number of distractor words, and uses this dataset to test the multilingual word alignment abilities of a range of large language models. Experimental results on Chinese and English with representative large language models show that several models achieve an accuracy of over 80% on the Chinese-English cross-lingual cloze task, with the highest reaching 90.4%, confirming that large language models possess strong multilingual word alignment capabilities. The study not only provides a method and test data for evaluating the multilingual word alignment capabilities of large language models, but also offers model selection recommendations for researchers working on multilingual universality and cross-lingual processing tasks.
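The abstract does not specify the paper's dataset schema or prompt template. The following minimal Python sketch only illustrates how a multiple-choice cross-lingual cloze item might be formatted and scored for accuracy; the ClozeItem record, the prompt wording, and the query_model wrapper are assumptions for illustration, not the authors' released code.

```python
# Illustrative sketch only: item fields, prompt format, and query_model()
# are hypothetical, not the paper's actual evaluation pipeline.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ClozeItem:
    context: str        # source-language sentence with a blank, e.g. "The cat sat on the ____."
    options: List[str]  # target-language candidates: the aligned word plus distractors
    answer: str         # the target-language word aligned with the blanked source word


def build_prompt(item: ClozeItem) -> str:
    """Format one cross-lingual cloze item as a multiple-choice prompt."""
    labeled = "\n".join(f"{chr(ord('A') + i)}. {w}" for i, w in enumerate(item.options))
    return (
        "Fill in the blank with the option whose meaning matches the missing word.\n"
        f"Sentence: {item.context}\nOptions:\n{labeled}\nAnswer with a single letter."
    )


def evaluate(items: List[ClozeItem], query_model: Callable[[str], str]) -> float:
    """Return the accuracy of a model (wrapped as query_model) on the cloze items."""
    correct = 0
    for item in items:
        reply = query_model(build_prompt(item)).strip().upper()
        gold = chr(ord("A") + item.options.index(item.answer))
        correct += int(reply[:1] == gold)
    return correct / len(items)
```

Under this sketch, varying the context length, part of speech, and number of distractors when constructing ClozeItem instances would yield the kind of multidimensional evaluation dimensions the abstract describes.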
Keywords
large language model / multilingual word alignment ability evaluation / cross-lingual cloze
Publication year
2024