When Crowdsourcing Meets Data Markets: A Fair Data Value Metric for Data Trading

Large quantities of high-quality data are critical to the success of machine learning in diverse applications. Faced with data silos, in which data is difficult to circulate, emerging data markets attempt to break the impasse by facilitating data exchange on the Internet. Crowdsourcing, on the other hand, is an important method for efficiently collecting large amounts of high-value data in data markets. In this paper, we investigate the joint problem of efficient data acquisition and fair budget distribution across crowdsourcing and data markets. We propose a new metric of data value, defined as the reduction in uncertainty of a Bayesian machine learning model when the data is integrated into model training. Guided by this data value metric, we design a mechanism called Shapley Value Mechanism with Individual Rationality (SV-IR). SV-IR combines a greedy algorithm with a constant approximation ratio, which selects the most cost-efficient data brokers, with a fair compensation determination rule based on the Shapley value that respects the individual rationality constraints. We further propose a fair reward distribution method for the data holders with various effort levels under the charge of a data broker. We demonstrate the fairness of the compensation determination rule and the reward distribution rule by evaluating our mechanisms on two real-world datasets. The evaluation results also show that the selection algorithm in SV-IR approaches the optimal solution and outperforms state-of-the-art methods.
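
The abstract does not spell out SV-IR itself, so the following is only a minimal sketch of the three ideas it names, under loudly stated assumptions: data value is taken to be the entropy reduction of a Gaussian posterior over a single scalar parameter, brokers are chosen greedily by marginal value per unit cost under a budget, and the selected brokers are scored by exact Shapley values. The constants, the broker names, and the functions `value`, `greedy_select`, and `shapley` are all hypothetical and are not taken from the paper; the individual rationality adjustment is omitted.

```python
import math
from itertools import permutations

PRIOR_PRECISION = 1.0  # assumed prior precision of the unknown scalar
NOISE_VARIANCE = 4.0   # assumed observation noise variance


def value(coalition, samples):
    """Entropy reduction (in nats) of a Gaussian posterior over a scalar mean."""
    n = sum(samples[b] for b in coalition)
    posterior_precision = PRIOR_PRECISION + n / NOISE_VARIANCE
    return 0.5 * math.log(posterior_precision / PRIOR_PRECISION)


def greedy_select(samples, costs, budget):
    """Repeatedly add the affordable broker with the best marginal value per cost."""
    chosen, spent = [], 0.0
    remaining = set(samples)
    while True:
        affordable = [b for b in remaining if spent + costs[b] <= budget]
        if not affordable:
            return chosen
        base = value(chosen, samples)
        best = max(
            affordable,
            key=lambda b: (value(chosen + [b], samples) - base) / costs[b],
        )
        chosen.append(best)
        spent += costs[best]
        remaining.remove(best)


def shapley(samples):
    """Exact Shapley values: average marginal gain over all broker orderings."""
    brokers = list(samples)
    totals = {b: 0.0 for b in brokers}
    orderings = list(permutations(brokers))
    for order in orderings:
        chosen = []
        for b in order:
            before = value(chosen, samples)
            chosen.append(b)
            totals[b] += value(chosen, samples) - before
    return {b: t / len(orderings) for b, t in totals.items()}


# Hypothetical brokers: number of samples offered and asking cost.
samples = {"A": 40, "B": 40, "C": 10}
costs = {"A": 5.0, "B": 8.0, "C": 1.0}

selected = greedy_select(samples, costs, budget=6.0)
phi = shapley({b: samples[b] for b in selected})
```

Exact enumeration over orderings is only feasible for a handful of brokers; a larger market would need sampling-based approximation. By construction, the Shapley values of the selected brokers sum to the value of the whole selected set, which is the sense in which the total is split without leftover.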

data trading; crowdsourcing; mechanism design; Shapley value

刘洋溯、郑臻哲、吴帆、陈贵海

Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

National Key Research and Development Program of China (2020YFB1707900); National Natural Science Foundation of China (U2268204, 62322206, 62132018, 62025204, 62272307, 62372296)

2024

Journal of Computer Science and Technology (English edition)
China Computer Federation

CSTPCD
Impact factor: 0.432
ISSN: 1000-9000
Year, volume (issue): 2024, 39(3)