
Validating Task-Based Human-Machine Collaborative Scoring of Speaking Performance: The Case of CET-SET

Based on the interactionalist speaking construct, this study draws on the intelligent scoring system of the national College English Test-Spoken English Test to score test takers' speaking performances both analytically and holistically at the task level, and validates human-machine collaborative scoring of open-ended speaking test tasks. The findings show that the machine can score according to the features and testing purposes of different speaking tasks, that the intelligent scoring system achieves a high level of accuracy at the task level, and that, under the human-machine collaborative scoring mode, factors related to task-level speaking abilities explain most of the score variance. In applying intelligent scoring to large-scale speaking assessment, task-based analytic scoring can be used at the expert benchmarking stage, and task-based holistic human-machine collaborative scoring at the large-scale rating stage, so as to enhance the interpretability of test results and improve rating efficiency.
Based on the interactionalist speaking construct, this study validates the human-machine collaboration in rating open-ended speaking tasks with an automated scoring system for the College English Test-Spoken English Test, in which test takers' performances are scored both analytically and holistically on rating scales at the task level. The findings show that the automated scoring system gives scores in relation to task features and test purposes, demonstrating a high level of rating accuracy. In the mode of human-machine collaboration, quite a large portion of the score variance could be attributed to the speaking-ability-related factors deemed essential for task completion. When applying automated scoring to the large-scale rating of speaking tests, it is suggested that task-based analytic scoring be used in setting gold standards for machine learning, and that task-based holistic scoring be adopted for human-machine collaboration in large-scale rating sessions, in order to facilitate score interpretation and ensure rating efficiency.

College English Test; speaking test; automated scoring; interactionalist construct theory

张晓艺、王伟、杨浩然


College of Foreign Languages and Literature, Fudan University, Shanghai 200433

Education Examinations Authority of the Ministry of Education, Beijing 100084

School of Foreign Languages, Shanghai Jiao Tong University, Shanghai 200240

College English Test Band 4 and Band 6; speaking test; automated scoring; interactionalist construct theory

Youth Fund Project of the Humanities and Social Sciences Research Program of the Ministry of Education, 2022

22YJC740102

2024

外语界 (Foreign Language World)
Shanghai International Studies University


Indexed in: CSTPCD; CSSCI; CHSSCD; PKU Core Journals
Impact factor: 6.117
ISSN:1004-5112
Year, volume (issue): 2024 (2)