High-speed Parallel Implementation of Lattice-based Cryptography Based on AVX512
The rapid development of quantum computing seriously threatens the security of widely used public-key cryptography. Lattice-based cryptography occupies an essential position in Post-Quantum Cryptography (PQC) owing to its strong resistance to quantum attacks and its high computational efficiency. In July 2022, the National Institute of Standards and Technology (NIST) announced four PQC algorithms selected for standardization, three of which are lattice-based, including Kyber. With the post-quantum standards now identified, the need for efficient implementations of them is growing. This study presents an optimized, high-speed parallel implementation of the Kyber algorithm based on Advanced Vector eXtensions 512 (AVX512). It uses lazy reduction, optimized Montgomery modular reduction, and an optimized Number-Theoretic Transform (NTT) to eliminate unnecessary modular reductions, and it improves the efficiency and parallelism of polynomial computations by making full use of the available storage space. It also employs a redundant-bit technique to improve bit-level parallelism during polynomial sampling. The 512-bit width of AVX512 is used to perform 8-way parallel hash operations, and the resulting pseudo-random bit strings are scheduled so that the parallel performance is fully exploited. Finally, this study implements Kyber's polynomial computations and sampling in high-speed parallel form using the AVX512 instruction set, and on that basis implements the entire Kyber public-key encryption scheme. Performance tests indicate that the key generation and encryption algorithms in this study run 10 to 16 times faster than the C implementation provided in the standard documentation, while the decryption algorithm runs approximately 56 times faster.
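To make the Montgomery reduction and lazy reduction named above concrete, the following is a minimal scalar sketch in C. It is not the AVX512 code of this study. The modulus q = 3329 and the Montgomery radix 2^16 are Kyber's public parameters; the function names and the eight-term accumulation bound are illustrative assumptions. A vectorized version would presumably apply the same arithmetic to the 32 16-bit lanes of a 512-bit register.

```c
#include <assert.h>
#include <stdint.h>

#define KYBER_Q 3329     /* Kyber modulus q */
#define QINV    (-3327)  /* q^-1 mod 2^16, as a signed 16-bit value */
#define MONT_R2 1353     /* 2^32 mod q */

/* Montgomery reduction: returns a * 2^-16 mod q, in (-q, q),
 * for any input with |a| < q * 2^15. */
static int16_t montgomery_reduce(int32_t a)
{
    /* t = low 16 bits of a * q^-1, so that a - t*q is a multiple of 2^16 */
    int16_t t = (int16_t)((uint32_t)a * (uint32_t)QINV);
    /* the division is exact, so it is safe for negative values too */
    return (int16_t)((a - (int32_t)t * KYBER_Q) / 65536);
}

/* Product of two coefficients followed by one Montgomery reduction. */
static int16_t fqmul(int16_t a, int16_t b)
{
    return montgomery_reduce((int32_t)a * b);
}

/* Lazy reduction (illustrative): accumulate up to 8 products in 32 bits
 * and reduce once at the end instead of once per product.
 * Assumes every a[i] and b[i] lies in (-q, q), so |acc| < 8*q^2 < q*2^15.
 * Returns sum(a[i]*b[i]) * 2^-16 mod q. */
static int16_t dot_lazy(const int16_t *a, const int16_t *b, int n)
{
    assert(n >= 0 && n <= 8);
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)a[i] * b[i];
    return montgomery_reduce(acc);
}

/* Map a value in (-q, q) to its canonical representative in [0, q). */
static int16_t canon(int16_t r)
{
    return (int16_t)(((r % KYBER_Q) + KYBER_Q) % KYBER_Q);
}
```

Each reduction leaves a factor of 2^-16 in the result. Multiplying by `MONT_R2` with `fqmul` cancels it, so `canon(fqmul(dot_lazy(a, b, n), MONT_R2))` yields the plain inner product modulo q. In this sketch the saving from lazy reduction is one reduction per inner product rather than one per term.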