Welcome to the resource topic for 2025/1939
Title:
Efficient Polynomial Multiplication for HQC on ARM Cortex-M4
Authors: Jihoon Jang, Myeonghoon Lee, Donggeun Kwon, Seokhie Hong, Suhri Kim, Sangjin Lee
Abstract: HQC, a code-based KEM selected by NIST for post-quantum standardization in March 2025, relies on fast polynomial multiplication over \mathbb{F}_2[x] on embedded targets. On ARM Cortex-M4, where a carry-less multiply instruction is unavailable, prior work has focused on two approaches: Frobenius Additive FFT (FAFFT) and a radix-16 method that computes \mathbb{F}_2[x] multiplications via 32-bit integer multiplications.
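To make the idea of computing \mathbb{F}_2[x] products with ordinary integer multiplications concrete, here is a minimal Python sketch of one well-known variant, the masked "bits with holes" trick. Each operand is split into four parts that keep only every fourth bit, so the carries of an integer multiplication stay inside a 4-bit digit and the lowest bit of each digit holds the XOR sum. The function names are hypothetical, and the paper's radix-16 formulation may differ in its details.

```python
# Masks that keep every fourth bit, one mask per residue class mod 4.
MASKS32 = (0x11111111, 0x22222222, 0x44444444, 0x88888888)
MASKS64 = (0x1111111111111111, 0x2222222222222222,
           0x4444444444444444, 0x8888888888888888)


def clmul_schoolbook(x: int, y: int) -> int:
    """Reference product in F_2[x]; bit i is the coefficient of x^i."""
    acc, shift = 0, 0
    while y:
        if y & 1:
            acc ^= x << shift
        y >>= 1
        shift += 1
    return acc


def clmul32_radix16(x: int, y: int) -> int:
    """32x32 -> 64-bit carry-less product using 16 integer multiplications.

    Every masked part has at most 8 set bits, so each 4-bit digit of an
    integer product sums at most 8 ones and never overflows into the next
    digit. The low bit of each digit is therefore the parity we need.
    """
    xs = [x & m for m in MASKS32]
    ys = [y & m for m in MASKS32]
    z = 0
    for k in range(4):
        acc = 0
        for i in range(4):
            j = (k - i) % 4          # bit positions satisfy i + j = k (mod 4)
            acc ^= xs[i] * ys[j]     # fits in 64 bits for 32-bit inputs
        z |= acc & MASKS64[k]        # keep only the parity bits
    return z
```

On a Cortex-M4, each `xs[i] * ys[j]` would presumably map to a 32x32 to 64-bit multiply instruction; that mapping is an assumption of this sketch, not a statement about the authors' code.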
In this paper, we propose improved variants of FAFFT and the radix-16 method that optimize HQC on Cortex-M4. Regarding FAFFT, applying it directly to a polynomial length n=2^{k}+r with small r, as in hqc-128 and hqc-192, requires operating at length 2^{k+2}\approx 4n. To address this overhead, we combine FAFFT with 2-Karatsuba: we split at 2^{k}, evaluate two subproducts of size 2^k with FAFFT at length 2^{k+1}, and handle the residual of size r via radix-16 multiplication. We further optimize the FAFFT butterfly by shortening the XOR sequences that result from fixed-coefficient multiplications expressed as matrix-vector multiplications, and by a schedule that fully utilizes all 14 general-purpose registers. In the final 5 layers, we implement bit swaps between registers with SWAPMOVE operations.
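The split described above can be sketched in a few lines of Python, with integers standing in for polynomials (bit i is the coefficient of x^i). For hqc-128, n = 17669 = 2^{14} + 1285, so a direct transform would need length 2^{16}, while the split needs two size-2^{14} products and one size-1285 product. The parameters `mul_big` and `mul_small` are hypothetical stand-ins for FAFFT and the radix-16 routine; here both default to a schoolbook product. The `swapmove` helper shows the standard SWAPMOVE bit exchange between two registers and is not tied to the paper's butterfly layout.

```python
def clmul(a: int, b: int) -> int:
    """Schoolbook product in F_2[x]; a stand-in for the fast multipliers."""
    acc, shift = 0, 0
    while b:
        if b & 1:
            acc ^= a << shift
        b >>= 1
        shift += 1
    return acc


def split_karatsuba(a: int, b: int, k: int,
                    mul_big=clmul, mul_small=clmul) -> int:
    """One Karatsuba level with the split point fixed at 2^k coefficients.

    a = a0 + x^(2^k) * a1 and likewise for b, where a1 and b1 hold the
    r residual coefficients (r <= 2^k is assumed).
    """
    m = 1 << k
    mask = (1 << m) - 1
    a0, a1 = a & mask, a >> m
    b0, b1 = b & mask, b >> m
    lo = mul_big(a0, b0)               # size 2^k
    hi = mul_small(a1, b1)             # residual of size r
    mid = mul_big(a0 ^ a1, b0 ^ b1)    # size 2^k, since r <= 2^k
    mid ^= lo ^ hi                     # subtraction is XOR in F_2[x]
    return lo ^ (mid << m) ^ (hi << (2 * m))


def swapmove(a: int, b: int, mask: int, n: int):
    """Swap the bits of b selected by mask with the bits of a at mask << n."""
    t = ((a >> n) ^ b) & mask
    return a ^ (t << n), b ^ t
```

For example, `swapmove(0xFFFF0000, 0, 0x0000FFFF, 16)` exchanges the high half of the first word with the low half of the second.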
For the radix-16 method, we build a cost model based on operation counts to explore Karatsuba and Toom-Cook combinations, producing a small set of optimal candidates that we evaluate on the target device. We compare with the gf2x library using the same algorithm set. For hqc-128 and -192 the resulting combinations are identical, while for hqc-256 our combination achieves 21.7% fewer cycles.
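As a rough illustration of what an operation-count cost model can look like, the following Python sketch searches recursively over schoolbook, Karatsuba, and a 5-multiplication Toom-3 step. All constants are hypothetical placeholders, the slight operand growth of Toom-Cook over \mathbb{F}_2[x] is ignored, and the algorithm set is much smaller than what the paper or gf2x consider. It is meant only to show the shape of such a search.

```python
from functools import lru_cache

# Hypothetical costs, not measurements from any device.
BASE_MUL = 40   # one word-by-word base multiplication
WORD_XOR = 1    # one word-wide XOR during splitting or recombination


@lru_cache(maxsize=None)
def best(n: int):
    """Return (cost, plan) for multiplying two n-word polynomials."""
    candidates = [(n * n * BASE_MUL, "school")]
    if n >= 2:
        sub_cost, sub_plan = best((n + 1) // 2)
        # Karatsuba: 3 half-size products plus an assumed 4n XORs.
        candidates.append((3 * sub_cost + 4 * n * WORD_XOR,
                           f"K2({sub_plan})"))
    if n >= 3:
        sub_cost, sub_plan = best((n + 2) // 3)
        # Toom-3: 5 third-size products plus an assumed 12n XORs.
        candidates.append((5 * sub_cost + 12 * n * WORD_XOR,
                           f"T3({sub_plan})"))
    return min(candidates)
```

Calling `best(n)` for the word count of a given parameter set returns the cheapest plan under these assumed costs; in practice the few cheapest candidates would still need to be timed on the target device, as the abstract describes.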
On a NUCLEO-L4R5ZI board with a Cortex-M4 microcontroller, our implementations reduce polynomial-multiplication cycles by 11.3% (hqc-128) and 9.2% (hqc-192) with FAFFT, and by 24.5% (hqc-128) and 3.2% (hqc-192) with the radix-16 method. Overall, we achieve cycle reductions of 16.4%, 15.9%, and 14.7% for key generation, encapsulation, and decapsulation in hqc-128, and 5.8%, 5.8%, and 5.5% in hqc-192.
ePrint: https://eprint.iacr.org/2025/1939
See all topics related to this paper.
Feel free to post resources that are related to this paper below.
Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.
For more information, see the rules for Resource Topics.