eCrowd: An Embedding-Representation-Based Crowdsourcing Ensemble Model

Jiang Hemeng (姜贺萌)¹, Huang Xiaoyu (黄晓宇)¹

Author information

1. Department of Electronic Commerce, South China University of Technology, Guangzhou 510006

Abstract

Crowdsourcing is one of the most widely used methods for acquiring data labels. Because crowdsourcing is open to anyone, the quality of workers' labels varies considerably. Most existing crowdsourcing algorithms take the crowdsourced labels as input and estimate the true labels directly; a major drawback of these approaches is that they cannot make full use of the feature information of the crowdsourced objects. In this work, we propose eCrowd, an embedding-representation-based crowdsourcing model. eCrowd adopts a two-stage strategy. In the first stage, it learns embedding representations of both workers and tasks from the collected labels, then uses the learned representations to predict certain missing "worker-task" labels. In the second stage, it estimates the true labels of all tasks from both the collected and the predicted labels. For evaluation, we conduct crowdsourcing prediction experiments on four real datasets with eCrowd and three comparison algorithms; the results show that the proposed algorithm significantly outperforms the comparison algorithms.
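The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary labels in {-1, +1}, uses plain gradient-descent matrix factorization for the embeddings, fills in every missing worker-task label (the abstract says only some specific ones are predicted), and uses a simple majority vote as the second stage. The function names `fit_embeddings` and `two_stage_aggregate` and all hyperparameters are hypothetical.

```python
import numpy as np

def fit_embeddings(labels, mask, dim=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Stage 1a (assumed form): factorize the observed worker-by-task
    label matrix as W @ T.T with regularized gradient descent."""
    rng = np.random.default_rng(seed)
    n_workers, n_tasks = labels.shape
    W = 0.1 * rng.standard_normal((n_workers, dim))  # worker embeddings
    T = 0.1 * rng.standard_normal((n_tasks, dim))    # task embeddings
    for _ in range(epochs):
        err = mask * (W @ T.T - labels)  # error on observed entries only
        grad_W = err @ T + reg * W
        grad_T = err.T @ W + reg * T
        W -= lr * grad_W
        T -= lr * grad_T
    return W, T

def two_stage_aggregate(labels, mask, **kwargs):
    """Stage 1b: predict missing labels from the embeddings.
    Stage 2 (assumed form): majority vote over collected + predicted labels."""
    W, T = fit_embeddings(labels, mask, **kwargs)
    predicted = np.sign(W @ T.T)
    completed = np.where(mask, labels, predicted)  # keep collected labels as-is
    votes = completed.sum(axis=0)                  # one vote total per task
    return np.where(votes >= 0, 1, -1)

# Toy data: 5 workers x 4 tasks, 0 marks a missing label.
# Workers 0-3 are reliable; worker 4 always answers incorrectly.
labels = np.array([
    [ 1, -1,  1,  0],
    [ 1, -1,  0, -1],
    [ 1,  0,  1, -1],
    [ 0, -1,  1, -1],
    [-1,  1, -1,  1],
], dtype=float)
mask = labels != 0

estimated = two_stage_aggregate(labels, mask)
print(estimated.tolist())  # [1, -1, 1, -1]
```

In this toy example each task already has three correct votes against one incorrect vote, so the filled-in labels cannot change the outcome. On sparser data the predicted labels would carry more weight, which is where the quality of the learned embeddings matters.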

Key words

crowdsourcing / matrix factorization / neural network / embedding representation

Publication year

2024

Modern Computer (现代计算机)

中大控股

Impact factor: 0.292

ISSN: 1007-1423