Based on a sample of 45 English essays written by non-English-major university students, this article compares the scoring validity of ChatGPT and three typical domestic Automated Essay Scoring (AES) systems. It also uses case analysis to compare ChatGPT with a representative AES system (Pigai) in terms of overall evaluation, corrective feedback, discourse structure, and logical reasoning assessment. The research reveals that both ChatGPT and the three AES systems assign lower average scores than the teacher, with ChatGPT scoring lowest. ChatGPT shows a moderate positive correlation with the teacher's grading, but its consistency with manual grading is lower than that of the three AES systems. ChatGPT demonstrates clear advantages in providing feedback on essay content: its overall evaluation is more comprehensive and personalized, its error identification is more accurate, its revision suggestions are more direct, and its ability to assess discourse structure and content perspective surpasses that of AES systems. However, ChatGPT has limitations: unstable scoring, difficulty in understanding prompts, and occasional human-like laziness. Understanding these strengths and limitations contributes to a more scientifically informed application of ChatGPT in English writing instruction.
Key words
College English writing/ChatGPT/automated essay scoring (AES)/scoring validity/feedback on content