Practical Exploration of Large Language Models in Chinese Automated Essay Evaluation
This study evaluates the performance of large language models on two typical intelligent writing assessment tasks: automated essay scoring and intelligent commentary generation. Focusing on learners of Chinese as a second language, the research applied three prompting strategies, namely standard prompts, chain-of-thought prompts, and self-consistency chain-of-thought prompts, to verify the effectiveness of large language models in automated essay scoring and automated feedback generation. The results show that although large language models demonstrate potential in automated essay scoring, their stability and reliability still need improvement. However, continuously refining these prompting strategies can significantly enhance the models' ability to handle essay scoring and commentary generation. Moreover, different prompts yield different results, and assessing the performance of large language models through prompts involves a degree of subjectivity. Thus, at this stage they cannot fully replace teachers' independent assessment, but they can serve as auxiliary tools that improve the efficiency of teachers' essay evaluation. The findings provide strong support for applying large language models to the intelligent assessment of Chinese writing, emphasizing their potential value in enhancing the performance of assessment systems, and serve as a reference for developing more efficient and accurate intelligent assessment systems for Chinese writing in the future.
Keywords: automated essay evaluation; automated essay scoring; intelligent commentary generation; large language model; ChatGLM
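To make the three prompting strategies named in the abstract concrete, the following is a minimal sketch of how standard, chain-of-thought, and self-consistency chain-of-thought prompting could be wired up for automated essay scoring. The prompt wording, the score range, and the `query_model` helper are hypothetical illustrations, not the authors' published prompts or ChatGLM's actual API; only the self-consistency scheme (sampling several chain-of-thought responses and taking the majority score) follows the standard technique.

```python
# Sketch of three prompting strategies for automated essay scoring.
# All prompt text, the score range, and query_model are assumptions
# made for illustration; they are not taken from the paper.
import re
from collections import Counter


def query_model(prompt: str) -> str:
    """Placeholder for a call to an LLM such as ChatGLM.

    In practice this would invoke the model's chat API with a
    nonzero sampling temperature so repeated calls can differ.
    """
    raise NotImplementedError


STANDARD_PROMPT = (
    "Score the following Chinese learner essay from 0 to 100.\n"
    "{essay}\nScore:"
)

COT_PROMPT = (
    "Score the following Chinese learner essay from 0 to 100. "
    "First analyze its content, organization, grammar, and vocabulary "
    "step by step, then give the final score on the last line as "
    "'Score: <number>'.\n{essay}"
)


def extract_score(reply: str) -> int | None:
    """Pull the last integer from the model's reply, if any."""
    numbers = re.findall(r"\d+", reply)
    return int(numbers[-1]) if numbers else None


def score_standard(essay: str) -> int | None:
    """Standard prompting: ask for a score directly."""
    return extract_score(query_model(STANDARD_PROMPT.format(essay=essay)))


def score_cot(essay: str) -> int | None:
    """Chain-of-thought prompting: request step-by-step analysis first."""
    return extract_score(query_model(COT_PROMPT.format(essay=essay)))


def score_self_consistency(essay: str, n_samples: int = 5) -> int | None:
    """Self-consistency: sample several chain-of-thought responses
    and return the majority (most frequent) score."""
    scores = [score_cot(essay) for _ in range(n_samples)]
    valid = [s for s in scores if s is not None]
    return Counter(valid).most_common(1)[0][0] if valid else None
```

The design reason for the self-consistency variant is visible in the last function: because a single chain-of-thought sample can fluctuate, aggregating several sampled scores by majority vote trades extra model calls for the improved stability the abstract reports as the main weakness of single-shot scoring.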