Probabilistic Graphical Models for Peer Assessment Based on Student Grading Ability in a Spot-Checking Setting

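With the rise of numerous Chinese MOOC platforms, grading open-ended assignments submitted by large numbers of students has become a pressing problem in educational research. Peer assessment, which asks students to act as peer graders and grade their classmates' assignments, is the mainstream approach to this challenge. In recent years, researchers have used probabilistic graphical models to model peer graders' grading reliability and bias, effectively improving the accuracy with which the true scores of open-ended assignments are estimated from peer grades. However, existing probabilistic graphical models consider only how a student's score on the current assignment affects that student's grading reliability. They do not model the student's scoring deviation, which measures a grader's reliability directly, and this is a limitation. To address it, this paper uses teacher spot-checking to quantify graders' grading ability from their scoring deviation, and on that basis proposes two novel probabilistic graphical models for peer assessment: RPG6 (reliability-aware peer grading 6) and RPG7 (reliability-aware peer grading 7). Building on existing probabilistic graphical models, the two models add a scoring-deviation-aware grading ability to the modeling of students' grading reliability in order to estimate the true scores of assignments more accurately. Experiments in real classrooms show that RPG6 and RPG7 estimate the true scores of assignments in peer assessment activities more accurately, reducing the root mean square error by 11.75% on average compared with the best existing technique.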
Probabilistic Graph Models for Peer Assessment Based on Student Grading Ability Under the Setting of Spot-checking
With the proliferation of many MOOC platforms, grading open-ended assignments submitted by many students presents a significant challenge in educational research. Peer assessment, which requires students to act as peer graders and evaluate their peers' submissions of assignments, is the mainstream solution to address this issue. Researchers have recently proposed various probabilistic graph models to evaluate peer graders' grading reliability and bias, effectively improving the estimated actual scores of assignments based on peer grades. However, the existing probabilistic graph models consider only the impact of students' scores on the current assignment regarding their grading reliability, failing to account for their scoring deviation, which directly measures their reliability. This limitation affects the performance of these models. Therefore, this study proposes two novel probabilistic graph models, RPG6 and RPG7, which incorporate the peer graders' grading ability, quantified based on their score deviation within a small proportion of submissions being spot-checked by teachers. These models, constructed on the foundation of two existing probabilistic graph models, represent the grading reliability of peer graders as a variable dependent on their scoring deviation-aware grading ability rather than their scores for the current assignment. This approach enhances the estimation of the true scores of assignments. Real classroom experiments demonstrated that the proposed RPG6 and RPG7 models achieve greater accuracy in estimating the true scores of assignments in peer assessment activities. Specifically, the RMSE values of RPG6 and RPG7 are, on average, 11.75% lower than those of the state-of-the-art method.
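The abstract does not give the formulas used by RPG6 and RPG7, so the following is only a minimal sketch of the two quantities it mentions: a grader's scoring deviation on teacher spot-checked submissions, and the RMSE used to compare estimated and true scores. The function names, the `max_score` parameter, and the mapping from deviation to an ability value in [0, 1] are assumptions made for illustration, not the authors' definitions.

```python
from math import sqrt
from statistics import mean


def grading_ability(peer_scores, teacher_scores, max_score=100.0):
    """Hypothetical ability measure: 1 minus the normalized mean absolute
    deviation between one grader's scores and the teacher's scores,
    computed only over submissions the teacher spot-checked.

    peer_scores:    {submission_id: score given by this grader}
    teacher_scores: {submission_id: score given by the teacher}
    """
    deviations = [
        abs(score - teacher_scores[sid])
        for sid, score in peer_scores.items()
        if sid in teacher_scores  # only spot-checked submissions count
    ]
    if not deviations:
        return None  # this grader graded no spot-checked submission
    return 1.0 - mean(deviations) / max_score


def rmse(estimated, true):
    """Root mean square error over the submissions with a known true score."""
    return sqrt(mean((estimated[sid] - true[sid]) ** 2 for sid in true))


# One grader scored submissions "a" and "b"; the teacher spot-checked "a", "b", "c".
ability = grading_ability({"a": 80, "b": 60}, {"a": 90, "b": 60, "c": 70})
error = rmse({"a": 83, "b": 64}, {"a": 80, "b": 60})
```

In a model of the kind the abstract describes, a value such as `ability` would feed into the grader's reliability variable in place of the grader's own assignment score. How that dependency is parameterized is not stated here.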

Keywords: peer assessment; probabilistic graph model; true score estimation; scoring deviation; grading ability; spot-checking

XU Jia (许嘉), YANG Panyuan (杨攀原), LÜ Pin (吕品), LIU Heng (刘恒)


School of Cyberspace Security, Guangzhou University, Guangzhou 510006, Guangdong, China

School of Computer, Electronics and Information, Guangxi University, Nanning 530004, Guangxi, China

School of Information and Management, Guangxi Medical University, Nanning 530021, Guangxi, China

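Keywords: peer assessment; probabilistic graphical model; true score estimation; scoring deviation; grading ability; spot-checking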

2025

Advanced Engineering Sciences (工程科学与技术)
Sichuan University


Indexed in: PKU Core Journals (北大核心)
Impact factor: 0.913
ISSN:2096-3246
Year, volume (issue): 2025, 57(1)