Evaluation methods for multilingual word alignment capabilities in large language models
To address the relative lack of evaluations of the multilingual word alignment capabilities of large language models, this paper introduces a method for evaluating these capabilities through a cross-lingual cloze task. The method generates multidimensional evaluation datasets based on rules such as context length, part of speech, and the number of distractor words. These datasets were used to test the multilingual word alignment abilities of various large language models. Experimental results on representative large language models, focusing on Chinese and English, demonstrated that these models achieve an accuracy of over 80% on cross-lingual cloze tasks, with the highest reaching 90.4%, confirming the strong multilingual alignment capabilities of large language models. The study not only offers a methodology and data for evaluating multilingual word alignment capabilities, but also provides model selection recommendations for researchers working on multilingual universality and cross-lingual processing tasks.
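As a rough illustration of how a cross-lingual cloze item of the kind described above might be constructed and scored, the sketch below builds a Chinese-to-English item with configurable distractors and computes accuracy. The `ClozeItem` fields, prompt format, and example sentence are illustrative assumptions, not the paper's actual data schema.

```python
# Hypothetical sketch of a cross-lingual cloze evaluation item; the field
# names and prompt layout are assumptions for illustration only.
from dataclasses import dataclass


@dataclass
class ClozeItem:
    context: str            # English sentence with a blank for the target word
    answer: str             # correct Chinese translation of the removed word
    distractors: list       # wrong Chinese candidates (count is a test dimension)
    pos: str                # part of speech of the target word


def build_prompt(item: ClozeItem) -> str:
    """Render the item as a multiple-choice cloze prompt for a model."""
    options = sorted([item.answer] + item.distractors)
    letters = "ABCDEFGH"
    lines = [f"{letters[i]}. {opt}" for i, opt in enumerate(options)]
    return ("Fill in the blank with the correct Chinese word:\n"
            f"{item.context}\n" + "\n".join(lines))


def accuracy(predictions: list, items: list) -> float:
    """Fraction of items where the model's chosen word matches the answer."""
    correct = sum(p == it.answer for p, it in zip(predictions, items))
    return correct / len(items)


item = ClozeItem(
    context="She drank a cup of ___ this morning.",
    answer="茶",
    distractors=["山", "跑"],
    pos="noun",
)
print(build_prompt(item))
print(accuracy(["茶"], [item]))  # 1.0 when the model picks the right word
```

Varying the context length, the part of speech of the blanked word, and the number of distractors would yield the multidimensional dataset the abstract describes.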
Keywords: large language model; multilingual word alignment ability evaluation; cross-lingual cloze