Research on Fair Use Rules for Generative Artificial Intelligence Data Training
曹新明¹, 范晔¹
Abstract
Training generative AI involves accessing and using works, materials, documents, and other data, which may give rise to copyright infringement. To avoid infringement, AI developers must either obtain a licence in advance or qualify for a statutory exemption from infringement. However, the traditional licensing model cannot in practice meet the needs of learning from massive volumes of data, and a statutory licensing scheme also faces high transaction and administrative costs. An analysis from a game-theoretic perspective suggests that fair use is the preferable path for allocating work data resources and a rational choice for reconciling the conflicting interests of copyright owners and AI developers. Accordingly, it is proposed to use the catch-all clause in Article 24(1)(13) of the Copyright Law as an interface and to introduce, in the Implementing Regulations of the Copyright Law, a dedicated fair use exception for generative AI data training. The provision should appropriately relax the conditions for application while being constrained by the last two steps of the "three-step test", thereby making its application more flexible.
Keywords
generative artificial intelligence / data training / fair use / copyright / game theory
Publication year
2024