Comparison of Feedback Validity in College English Writing Between ChatGPT and AES Systems
Using 45 English essays written by non-English-major university students as samples, this article compares the scoring validity of ChatGPT with that of three typical domestic Automated Essay Scoring (AES) systems. Through case analysis, it also compares the differences between ChatGPT and a representative AES system (Pigai) in overall evaluation, corrective feedback, discourse structure, and logical reasoning assessment. The research reveals that both ChatGPT and the three AES systems produce lower average scores than the teacher's grading, with ChatGPT scoring lowest. ChatGPT shows a moderate positive correlation with the teacher's grading, but its consistency with manual grading is lower than that of the three AES systems. ChatGPT demonstrates clear advantages in providing feedback on essay content: its overall evaluation is more comprehensive and personalized, its error identification is more accurate, its revision suggestions are more direct, and its ability to assess discourse structure and content perspective surpasses that of the AES systems. However, ChatGPT has limitations, including unstable scoring, difficulty in understanding prompts, and occasional human-like laziness. Understanding these strengths and limitations contributes to a more scientifically informed application of ChatGPT in English writing instruction.
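The scoring-validity comparison rests on agreement statistics between machine scores and teacher grading. As a minimal sketch of how such a comparison can be computed (the score lists and the 0-100 scale below are hypothetical placeholders, not the study's data), one might report the mean-score gap and the Pearson correlation:

```python
# Illustrative sketch only: comparing automated scores with teacher grading
# via mean difference and Pearson correlation. All scores are hypothetical.
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical scores for five essays on an assumed 0-100 scale.
teacher = [78, 85, 72, 90, 66]
chatgpt = [70, 74, 69, 80, 58]

print(f"mean gap vs. teacher: {mean(teacher) - mean(chatgpt):.1f}")
print(f"Pearson r: {pearson_r(teacher, chatgpt):.2f}")
```

A lower mean than the teacher's with a positive r would correspond to the pattern the study reports: systematically stricter scoring that still tracks the teacher's rank ordering.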
Keywords: College English writing; ChatGPT; automated essay scoring (AES); scoring validity; feedback on content