Efficient Parallel FPGA Optimization of the Crystal-Kyber Algorithm
Lü Shunsen (吕顺森) 1, Li Bin (李斌) 2, Zhai Jiaqi (翟嘉琪) 1, Li Songqi (李松岐) 1, Zhou Qinglei (周清雷) 2
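Author information
- 1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan 450001; State Key Laboratory of Digital Engineering and Advanced Computing, Zhengzhou, Henan 450001
- 2. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan 450001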
Abstract
Polynomial multiplication limits the practical deployment of lattice-based post-quantum cryptography. To improve the performance of the post-quantum Crystal-Kyber algorithm, shorten its running time, and reduce the cost of polynomial multiplication, this paper designs a new butterfly unit that optimizes the Kyber scheme with prime modulus q = 3329. First, the algorithm is executed by scheduling 16 of the new butterfly units in parallel, which shortens the computation cycle. Second, the new butterfly unit is designed and implemented using pipelining and an improved K2RED algorithm, which reduces resource consumption. Finally, data is stored across multiple RAMs, and the RAMs are optimized with multiple channels so that data can be stored alternately among them, which improves resource reuse. Experimental results show that the optimized number theoretic transform (NTT), inverse NTT (INTT), and point-wise multiplication (PWM) reach an operating frequency of 200 MHz, and the combined Kyber execution reaches 175 MHz, outperforming the other schemes compared.
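To make the reduction step in the abstract more concrete, the sketch below shows the baseline K2RED idea as it is commonly described for q = 3329 = 13 · 2^8 + 1, followed by a Cooley-Tukey butterfly built on it. It is not the improved variant or the butterfly unit developed in the paper, whose details are not given in this abstract. The names `k_red`, `k2_red`, `K2_INV`, and `ct_butterfly` are hypothetical, and the final `% Q` stands in for the conditional correction a hardware design would use.

```python
# Software model of K2RED-style reduction for q = 3329 (illustrative only).
Q = 3329   # Kyber prime modulus, Q = K * 2**M + 1
K = 13
M = 8


def k_red(c: int) -> int:
    """One reduction step; the result is congruent to K * c (mod Q).

    Since K * 2**M = Q - 1 is congruent to -1 (mod Q), writing
    c = c_lo + 2**M * c_hi gives K * c congruent to K * c_lo - c_hi.
    """
    c_lo = c & ((1 << M) - 1)  # low M bits
    c_hi = c >> M              # remaining bits (arithmetic shift)
    return K * c_lo - c_hi


def k2_red(c: int) -> int:
    """Two reduction steps; the result is congruent to K**2 * c (mod Q)."""
    return k_red(k_red(c))


# Inverse of K**2 mod Q via Fermat's little theorem (Q is prime).
# Assumption: twiddle factors are pre-scaled by this constant so that
# the K**2 factor introduced by k2_red cancels out.
K2_INV = pow(K * K, Q - 2, Q)


def ct_butterfly(a: int, b: int, w_scaled: int) -> tuple:
    """Cooley-Tukey butterfly: returns (a + b*w, a - b*w) mod Q.

    w_scaled must equal w * K2_INV mod Q for the twiddle factor w.
    """
    t = k2_red(b * w_scaled) % Q  # software shortcut for the final correction
    return (a + t) % Q, (a - t) % Q


if __name__ == "__main__":
    w = 17  # primitive 256th root of unity used by Kyber
    print(ct_butterfly(5, 7, w * K2_INV % Q))
```

The reduction uses only shifts, a bit mask, a multiplication by the small constant 13, and subtractions, which is why this style of reduction maps well onto FPGA logic. In a 16-way parallel design, 16 such butterflies would be evaluated per cycle on independent coefficient pairs.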
Key words
post-quantum cryptography / Crystal-Kyber / K2RED / butterfly operation / polynomial multiplication / hardware efficiency
Publication year
2024