
Survey on Large Language Model Alignment Research

With the rapid development of artificial intelligence technology, large language models have been widely applied in numerous fields. However, large language models may generate inaccurate, misleading, or even harmful content, which raises concerns about their reliability; adopting alignment techniques to ensure that the behavior of large language models is consistent with human values has therefore become an urgent problem. This paper surveys recent research progress on alignment techniques for large language models. It introduces common methods for collecting instruction data and human preference datasets, reviews work on supervised tuning and alignment tuning, discusses the datasets and methods commonly used for model evaluation, and concludes with a summary and an outlook on future research directions.

large language model; alignment technique; tuning; reinforcement learning

刘昆麟、屈新纪、谭芳、康红辉、赵少伟、施嵘


ZTE Corporation, Shenzhen, Guangdong 518057, China


2024

Telecommunications Science (电信科学)
China Institute of Communications; Posts & Telecom Press


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.902
ISSN:1000-0801
Year, volume (issue): 2024, 40(6)