Efficient and Scalable FPGA Optimization of Dilithium
To improve the operational efficiency of Dilithium in practical applications, an efficient field-programmable gate array (FPGA) implementation of the Dilithium algorithm is proposed. The design is optimized in several respects. First, the Karatsuba-Ofman algorithm (KOA) is combined with a fast modular reduction algorithm to build a fast modular multiplication unit, which accelerates the extensive polynomial multiplications implemented with the number theoretic transform (NTT). Second, multiple RAM accesses are employed for polynomial coefficient operations, and a coefficient reading strategy tailored to the characteristics of the Dilithium algorithm is designed so that polynomial coefficients are read from RAM quickly and correctly. Third, for the sampling and hashing tasks in the scheme, the characteristics of the SHAKE family of algorithms are analyzed, leading to a low-latency, scalable Keccak hardware architecture that executes different SHAKE algorithms according to an input signal. Experimental results demonstrate that the operating frequency of the proposed implementation is increased by 60.7% to 131.9%, while balancing hardware resource consumption and execution efficiency.
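The idea of pairing a Karatsuba-Ofman multiplier with a fast reduction step can be illustrated in software. The following is a minimal sketch, not the hardware datapath of the proposed design: it assumes a single Karatsuba level with a 12-bit operand split, and it reduces by folding high bits using the identity 2^23 ≡ 2^13 − 1 (mod q), which follows from the Dilithium modulus q = 8380417 = 2^23 − 2^13 + 1. The function names `karatsuba_mul`, `fast_reduce`, and `mod_mul` are hypothetical, and a hardware unit would use a fixed number of folding stages rather than a loop.

```python
Q = 8380417  # Dilithium modulus, q = 2^23 - 2^13 + 1


def karatsuba_mul(a: int, b: int, half_bits: int = 12) -> int:
    """One-level Karatsuba-Ofman product (assumed 12-bit split, illustrative)."""
    mask = (1 << half_bits) - 1
    a_lo, a_hi = a & mask, a >> half_bits
    b_lo, b_hi = b & mask, b >> half_bits
    low = a_lo * b_lo                                   # partial product 1
    high = a_hi * b_hi                                  # partial product 2
    mid = (a_lo + a_hi) * (b_lo + b_hi) - low - high    # partial product 3
    return (high << (2 * half_bits)) + (mid << half_bits) + low


def fast_reduce(x: int) -> int:
    """Reduce a non-negative x modulo Q using only shifts, adds and subtracts."""
    # Fold the bits above position 23 back in: hi * 2^23 == hi * (2^13 - 1) (mod Q).
    while x >> 23:
        hi, lo = x >> 23, x & ((1 << 23) - 1)
        x = (hi << 13) - hi + lo
    # Now x < 2^23 < 2 * Q, so one conditional subtraction suffices.
    return x - Q if x >= Q else x


def mod_mul(a: int, b: int) -> int:
    """Modular multiplication a * b mod Q for a, b in [0, Q)."""
    return fast_reduce(karatsuba_mul(a, b))


# (-1) * (-1) == 1 (mod Q)
assert mod_mul(Q - 1, Q - 1) == 1
```

The Karatsuba step replaces four half-width multiplications with three, and the folding step avoids a general-purpose division, which is why this combination is attractive for an NTT butterfly unit.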