
MOSS:An Open Conversational Large Language Model

Conversational large language models (LLMs) such as ChatGPT and GPT-4 have recently exhibited remarkable capabilities across various domains, capturing widespread attention from the public. To facilitate this line of research, in this paper, we report the development of MOSS, an open-sourced conversational LLM that contains 16B parameters and can perform a variety of instructions in multi-turn interactions with humans. The base model of MOSS is pre-trained on large-scale unlabeled English, Chinese, and code data. To optimize the model for dialogue, we generate 1.1M synthetic conversations based on user prompts collected through the APIs of our earlier model versions. We then perform preference-aware training on preference data annotated from AI feedback. Evaluation results on real-world use cases and academic benchmarks demonstrate the effectiveness of the proposed approaches. In addition, we present an effective practice to augment MOSS with several external tools. Through the development of MOSS, we have established a complete technical roadmap for large language models, from pre-training and supervised fine-tuning to alignment, verifying the feasibility of ChatGPT-style models under resource-limited conditions and providing a reference for both the academic and industrial communities. Model weights and code are publicly available at https://github.com/OpenMOSS/MOSS.
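The abstract describes fine-tuning on 1.1M synthetic multi-turn conversations. As a minimal sketch of what such data preparation can look like, the snippet below flattens a list of (role, text) turns into a single training string; the role markers and end-of-turn tag are illustrative placeholders, not MOSS's actual serialization format.

```python
# Sketch: serializing a multi-turn conversation into one supervised
# fine-tuning (SFT) training string. The <|Role|> markers and <eot>
# end-of-turn tag are hypothetical, chosen here only for illustration.

def format_conversation(turns, system_prompt=""):
    """Flatten [(role, text), ...] into a single training string."""
    parts = [system_prompt] if system_prompt else []
    for role, text in turns:
        parts.append(f"<|{role}|>: {text}<eot>")
    return "\n".join(parts)

dialog = [
    ("Human", "What is MOSS?"),
    ("MOSS", "MOSS is an open-source conversational language model."),
    ("Human", "How many parameters does it have?"),
    ("MOSS", "It has 16 billion parameters."),
]
print(format_conversation(dialog))
```

In practice, a loss mask would typically accompany this string so that only the assistant turns contribute to the training loss, but that detail is omitted here for brevity.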

Large language models; natural language processing; pre-training; alignment; ChatGPT; MOSS

Tianxiang Sun, Xiaotian Zhang, Zhengfu He, Peng Li, Qinyuan Cheng, Xiangyang Liu, Hang Yan, Yunfan Shao, Qiong Tang, Shiduo Zhang, Xingjian Zhao, Ke Chen, Yining Zheng, Zhejian Zhou, Ruixiao Li, Jun Zhan, Yunhua Zhou, Linyang Li, Xiaogui Yang, Lingling Wu, Zhangyue Yin, Xuanjing Huang, Yu-Gang Jiang, Xipeng Qiu


Fudan University, Shanghai 200438, China

National Natural Science Foundation of China

62022027

2024

Machine Intelligence Research
Institute of Automation, Chinese Academy of Sciences


CSTPCD, EI
Impact Factor: 0.49
ISSN: 2731-538X
Year, Volume (Issue): 2024, 21(5)