Data Copyright Dilemmas Facing High-Quality AI Data Systems: An Analysis of Coping Strategies and a Study of Implementation Paths

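[Purpose/Significance] The Resolution of the Third Plenary Session of the 20th CPC Central Committee explicitly calls for improving the policies and governance systems that promote the development of strategic industries such as artificial intelligence. In recent years, lawsuits and disputes over copyrighted data in artificial intelligence have been frequent worldwide, and the difficulty of protecting copyright in AI training data has become a key bottleneck and a practical problem in building a high-quality AI data system. [Method/Process] Drawing on a review of academic research and industry practice on copyright protection for AI data, this study systematically summarizes six representative approaches to the data copyright dilemma and compares their advantages, disadvantages, and applicability. [Results/Conclusions] The AI data copyright dilemma is that there is currently no optimal solution that both promotes the supply of copyrighted data for AI and protects data copyright. In response, drawing on the analysis of the six representative approaches and on four unique advantages that China possesses, this study proposes an overall implementation path for systematically and properly resolving the dilemma and consolidating a high-quality AI data system: building a national integrated comprehensive service platform for AI data copyright; exploring and advancing comprehensive data copyright reform pilots adapted to AI development; and establishing and improving legislation on AI data copyright while promoting industry self-regulation. The aim is to provide a useful reference for increasing the supply of copyrighted data for AI in China, formulating relevant policies, and advancing related work.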
Copyright Data Dilemma of Building High-Quality Data System for AI: Present Situation, Coping Strategies, and Implementation Path
[Purpose/Significance] The resolution of the Third Plenary Session of the 20th Central Committee of the Communist Party of China explicitly proposed improving the policy and governance systems that promote the development of strategic industries such as artificial intelligence. In recent years, the conflict between AI companies' demand for copyrighted data and copyright holders' protection of that data has become increasingly apparent, and lawsuits and disputes over AI-related copyright infringement have arisen around the world. The dilemma of copyright protection for AI training data has become a bottleneck that urgently needs to be resolved in developing a high-quality data system for AI. [Method/Process] Based on academic research and industrial practice on the copyright protection of AI data, this study systematically summarizes six representative approaches to the copyright dilemma of AI training data and provides a comparative analysis of their advantages, disadvantages, and applicability. The six representative approaches are: signing a license agreement between both parties, initiating special plans or forming alliances, introducing a copyright notice mechanism, introducing a copyright risk guarantee mechanism, substituting synthetic data, and applying copyright detection tools to large language models. At present, there is no optimal solution to the copyright dilemma of AI training data that can both encourage the supply of copyrighted training data for AI and protect the copyright of that data. [Results/Conclusions] To provide a helpful reference for increasing the supply of copyrighted data for AI, formulating relevant policies, and promoting related work, this study proposes a general implementation path for building a high-quality data system for AI and resolving the copyright dilemma of AI training data, based on the comparative analysis of the six representative approaches above and on China's four unique advantages. The path comprises: 1) Integrating existing platforms to build a national-level integrated service platform for copyright data for AI, with state-owned enterprises (SOEs) under the direct administration of the central government taking the lead in establishing a national copyright data alliance and connecting copyright data to the platform. 2) Collaborating with local pilots of data intellectual property rights to explore and promote comprehensive reform pilot programs for copyright data adapted to the development of AI, and continuously strengthening cooperation, and the willingness to cooperate, between AI enterprises and copyright holders. 3) Focusing on principled or critical issues, establishing and improving legislation related to copyright data for AI, and promoting industry self-regulation.

artificial intelligence; data system for AI; copyright protection; copyright data; data elements

张何灿、易成岐、郭鹏、黄倩倩、靳晓锟

Big Data Development Department, State Information Center, Beijing 100045

Strategic Research Center, 深圳数聚湾区大数据研究院, Shenzhen 518048

School of Information Resource Management, Renmin University of China, Beijing 100872

Institutes of Science and Development, Chinese Academy of Sciences, Beijing 100190

artificial intelligence; AI data system; copyright protection; data copyright; data elements

2024

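Journal of Library and Information Science in Agriculture (农业图书情报学报)
Agricultural Information Institute, Chinese Academy of Agricultural Sciences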

Impact factor: 0.48
ISSN: 1002-1248
Year, volume (issue): 2024, 36(9)