Large Model Safety and Practice from Multiple Perspectives (多视角看大模型安全及实践)

Wang Xiaochen (王笑尘)¹, Zhang Kun (张坤)¹, Zhang Peng (张鹏)¹

Author information

1. Beijing Zhipu Huazhang Technology Co., Ltd., Beijing 100086

Abstract

With the widespread application of large models in artificial intelligence, the security of large models, especially large language models (LLMs), has received broad attention. Because large models are an emerging technology, both the analysis of their security situation and the construction of security systems for them remain to be explored. We analyze the overall trend of large model security from two perspectives: social relations and technology application. Based on the characteristics of large models themselves, we outline practical approaches to building large model security capabilities, providing a reference plan for constructing a security system for large model development and large model applications. The security practice plan introduced in this article comprises three parts: security evaluation benchmark construction, model values alignment methods, and security system construction for online model services.

Key words

large model/large model online service/security system/artificial intelligence ethics/large model security situation analysis

Year of publication

2024

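Journal of Computer Research and Development (计算机研究与发展)

Institute of Computing Technology, Chinese Academy of Sciences; China Computer Federation

Indexed in: CSTPCD, CSCD, Peking University Core Journals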

Impact factor: 2.649

ISSN: 1000-1239

Number of references: 36
