Data Annotation Governance: Backstage Risks and Governance Shifts for Trustworthy AI
Before artificial intelligence (AI) models can be trained, data must be manually identified and annotated. The "backstage" process of data annotation is therefore one of the most important stages at which risks of hallucination and bias in the "frontstage" of AI can arise. In recent years, data annotators, previously hidden in AI's background, have gained visibility through policy documents and media reports, prompting the academic community to reflect on the opaque nature of technological innovation. From the perspective of risk governance, however, the many actors involved in data annotation remain in a disordered state, with ambiguous rights and obligations, which impedes the goal of trustworthy AI. This paper traces and compares data annotation governance in the leading economies of the global AI industry. It finds that current governance targets emphasize mainly "AI service providers" and that there is a tendency to leave data annotation to private ordering. Risk governance oriented toward trustworthy AI urgently needs to expand its scope from "providers" to the entire "data supply chain" and to establish a collaborative governance system in which multiple stakeholders participate. This would allow a deeper examination of the interests of the groups involved in the AI industry and provide tangible social protections for the precarious data-annotation workforce.
Keywords: Data Annotation; Data Governance; Hallucination; Bias; Ghost Work