
Research on the Evaluation of the Chinese Stylistic Competence of Large Language Models

Stylistic competence is an important pragmatic competence, and it must be thoroughly evaluated and studied before large language models (LLMs) can function effectively in everyday language use. This paper defines stylistic competence as the ability to communicate in an appropriate style within a specific register. On this basis, three tasks are designed, namely stylistic classification, stylistic generation, and stylistic transformation, to evaluate the Chinese stylistic competence of LLMs such as ChatGPT. The results show that different LLMs have their own strengths and limitations across tasks and styles. GPT-4 demonstrates the most comprehensive Chinese stylistic competence; ChatGPT 3.5 and ERNIE Bot perform relatively well; ChatGLM-6B and SparkDesk perform weakly and unstably. In addition, the prose, fiction, and other texts generated by the models are overly formal and lack literary flair, and problems such as consistency errors, normative errors, factual errors, illogicality, disfluent sentences, and obvious traces of machine translation are prominent. The study also offers a methodological reference for training and testing human stylistic competence, and is of reference value for improving language competence in fields such as Chinese language teaching and international Chinese language education.

Keywords: Large Language Model; stylistic competence; language resource

Zhou Liwei (周立炜), Rao Gaoqi (饶高琦)


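Research Institute of International Chinese Language Education (国际中文教育研究院), Beijing Language and Culture University, Beijing 100083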

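Research Center for Chinese Language and Script Norms and Standards (中国语言文字规范标准研究中心), Beijing Language and Culture University, Beijing 100083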


Funding: Major Project of the National Social Science Fund of China (21&ZD289)

2024

Journal: Applied Linguistics (语言文字应用)
Publisher: Institute of Applied Linguistics, Ministry of Education


Indexed in: CSTPCD; CSSCI; CHSSCD; PKU Core (北大核心)
Impact factor: 1.215
ISSN: 1003-5397
Year, volume (issue): 2024, (1)