The Legal Tone and Development of Generative Artificial Intelligence Data Training

Chen Bing (陈兵)¹, Fu Xiaoou (傅小鸥)¹

Author information

1. School of Law, Nankai University, Tianjin 300350, China

Abstract

Data training is central to deploying artificial intelligence applications at high quality. As generative AI large-model products come into wide use, the data training process implicates users' basic data, the behavioral traces of various actors, and complex shifts in the rights and interests of multiple parties, which may adversely affect market competition, enterprise innovation, and even national security. It is therefore necessary to ensure the legality of data sources and improve the credibility of data quality, and to map the basic requirements of "lawful data sources → credible data quality → release of data value" onto the different stages of data training. In the data computation and application stage, attention should be paid to the training-data contamination and runtime-data anomalies caused by deep synthesis technology. In the data opening and sharing stage, the risks to personal information protection and the risks of intellectual property infringement should be examined comprehensively and accurately; the laws of scientific and technological development should be respected; the relationship among technical accessibility, practical feasibility, and value legitimacy should be balanced; and room should be left for innovative development on a solid foundation of secure development. To this end, data training for generative AI should take security as its bottom line, promote innovative development while optimizing the collaborative regulatory framework and its methods, and take into account the legitimate rights and interests of multiple parties.

Keywords

generative artificial intelligence / data training / data quality / safe development / innovative development / standardized development

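Funding

Major Judicial Research Project of the Supreme People's Court (ZGFYZDKT202317-03)

Major Project of the Key Research Institutes of Humanities and Social Sciences, Ministry of Education (19JJD820009)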

Publication year

2024

Journal of Liaoning Normal University (Social Science Edition)

Liaoning Normal University

CHSSCD

Impact factor: 0.736

ISSN: 1000-1751

Number of references: 21
