A Dedicated Hardware Architecture for Large Language Models Based on Integrated Compute-in-Memory Chiplets
He Siqi¹, Mu Chen¹, Chen Chixiao¹
Abstract
Artificial intelligence (AI) models represented by ChatGPT are showing exponential growth in parameter size and system computing power requirements. This paper studies a dedicated hardware architecture for large models, providing a detailed analysis of the bandwidth bottleneck that large models face during deployment and of the significant impact this challenge has on current data centers. To address the issue, a solution based on integrated compute-in-memory chiplets is proposed, aiming to alleviate data transmission pressure and improve the energy efficiency of large-model inference. In addition, the possibility of co-designing model lightweighting with in-memory compression under the compute-in-memory architecture is studied, in order to achieve dense mapping of sparse networks onto compute-in-memory hardware, thereby significantly improving storage density and computational energy efficiency.
Key words
large language model / compute-in-memory / chiplet / in-memory compression
Funding
National Natural Science Foundation of China (62322404)
Fudan University-ZTE Joint Laboratory for Strong Computing Architecture Research, Compute-in-Memory Architecture Research Project
Publication year
2024