The Validity of Automated Scoring Systems for English Subjective Tests from Social Dimension: Its Theoretical Framework and Empirical Studies
Bai Lifang (白丽芳)1
Author information
- 1. School of Foreign Languages, Hainan University, Haikou, Hainan 570228
Abstract
This article reviews, from a sociological perspective, the validation framework for automated scoring (AS) of subjective test items. Taking writing as an example, it introduces the methods used to validate evaluation, generalization, explanation, extrapolation and utilization, together with the main findings for each. Existing studies show that the validity of evaluation and generalization for writing AS is generally high; findings on extrapolation vary with the validation method used; explanation remains untested for lack of an appropriate method; and utilization is still controversial. Human and machine scoring each have strengths and weaknesses. Before machines are used on a large scale to assist human scoring, more investigation into the social impact of AS is needed, so that scoring is transparent and fair and humans and machines can each play to their strengths.
Key words
automated scoring for subjective tests/validity/reflection/social justice
Year of publication
2024