
Large Language Model Specific Hardware Architecture Based on Integrated Compute-in-Memory Chips

Artificial intelligence (AI) large models, represented by ChatGPT, are showing exponential growth in parameter scale and system computing-power requirements. This paper studies dedicated hardware architectures for large models, providing a detailed analysis of the bandwidth bottleneck faced during large-model deployment and of the significant impact this challenge has on current data centers. To address the issue, a solution based on integrated compute-in-memory chiplets is proposed, aiming to relieve data-transfer pressure while improving the energy efficiency of large-model inference. In addition, the paper explores lightweight/in-memory-compression co-design under the compute-in-memory architecture, enabling dense mapping of sparse networks onto compute-in-memory hardware and thereby significantly improving storage density and computational energy efficiency.
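The "dense mapping of sparse networks" idea mentioned in the abstract can be sketched with a toy NumPy example. This is an illustration under assumed conventions, not the authors' actual design: it prunes a small weight matrix to a 2:4 structured-sparsity pattern, packs the surviving weights into a half-width dense array together with their column indices (as a compressed crossbar would store them), and verifies that a matrix-vector product over the packed form matches the uncompressed result. The 2:4 pattern, matrix sizes, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weight matrix with 2:4 structured sparsity along each row:
# in every group of 4 consecutive weights, only the 2 largest survive.
rows, cols, group, keep = 4, 8, 4, 2
W = rng.standard_normal((rows, cols))
for r in range(rows):
    for g in range(0, cols, group):
        block = W[r, g:g + group]               # view into W
        drop = np.argsort(np.abs(block))[:group - keep]
        block[drop] = 0.0                       # prune the smallest weights

# "In-memory compression": store only the nonzeros densely, plus their
# column indices, halving the crossbar columns the weights occupy.
idx = np.array([np.flatnonzero(W[r]) for r in range(rows)])  # (rows, cols//2)
W_dense = np.take_along_axis(W, idx, axis=1)                 # (rows, cols//2)

x = rng.standard_normal(cols)
# Dense-mapped MVM: gather the matching activations per row via the index
# map, then take a dense dot product (what the compressed array computes).
y_compressed = np.einsum('ij,ij->i', W_dense, x[idx])
y_reference = W @ x
assert np.allclose(y_compressed, y_reference)
```

The index map plays the role of the peripheral routing logic that a compute-in-memory macro would need so that the right activations reach the densely packed weight columns.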

large language model; compute-in-memory; chiplet; in-memory compression

何斯琪、穆琛、陈迟晓


Fudan University, Shanghai 200433, China


Supported by the National Natural Science Foundation of China and the Compute-in-Memory Architecture Research Project of the Fudan University-ZTE Joint Laboratory for Strong Computing Architecture Research

Grant No. 62322404

2024

ZTE Technology Journal
ZTE Corporation; Anhui Institute of Scientific and Technical Information


Indexed by: CSTPCD; Peking University Core Journals
Impact factor: 1.272
ISSN: 1009-6868
Year, volume (issue): 2024, 30(2)