[Resource Topic] 2022/1154: Efficient Constant-Time Implementation of SM4 with Intel GFNI instruction set extension and Arm NEON coprocessor

Welcome to the resource topic for 2022/1154

Title:
Efficient Constant-Time Implementation of SM4 with Intel GFNI instruction set extension and Arm NEON coprocessor

Authors: Weiji Guo

Abstract:

The efficiency of constant-time SM4 implementation has been lagging behind that of AES for most internet traffic and applicable data encryption scenarios. The best performance before our works was 3.77 cpb for x86 platform (AESNI + AVX2), and 8.62 cpb for Arm platform (NEON). Meanwhile the state of art constant-time AES implementation could reach 0.63 cpb. Dedicated SM4 instruction set extensions like those optionally available in Armv8.2, could achieve comparable cpb to AES. But they are only available in limited processors, therefore does not impact much to real-world uses. To fill the gap we explored some novel techniques with Intel GFNI instruction set extension and Arm NEON coprocessor. We achieved 1.51 cpb with GFNI + AVX512 and 2.62 cpb with GFNI + AVX2 for Intel processors; we also achieved 6.74 cpb with NEON. In addition, we simplified the algebraic expression of SM4 S-Box. And our technique to exploit L1 cache could also be applied to other applications and hardware platforms if the circumstances apply.

ePrint: https://eprint.iacr.org/2022/1154

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics .