Welcome to the resource topic for 2022/1621
cuXCMP: CUDA-Accelerated Private Comparison Based on Homomorphic Encryption
Authors: Hao Yang, Shiyu Shen, Zhe Liu, Yunlei ZhaoAbstract:
Private comparison schemes constructed on homomorphic encryption oﬀer the noninteractive, output expressive and parallelizable features, and have advantages in communication bandwidth and performance. In this paper, we propose cuXCMP, which allows negative and ﬂoat inputs, oﬀers fully output expressive feature, and is more extensible and practical compared to XCMP (AsiaCCS 2018). Meanwhile, we introduce several memory-centric optimizations of the constant term extraction kernel tailored for CUDA-enabled GPUs. Firstly, we fully utilize the shared memory and present compact GPU implementations of NTT and INTT using a single block; Secondly, we fuse multiple kernels into one AKS kernel, which conducts the automorphism and key switching operation, and reduce the grid dimension for better resource usage, data access rate and synchronization. Thirdly, we precisely measure the IO latency and choose an appropriate number of CUDA streams to enable concurrent execution of independent operations, yielding a constant term extraction kernel with perfect latency hide, i.e., CTX. Combining these approaches, we boost the overall execution time to optimum level and the speedup ratio increases with the comparison scales. For one comparison, we speedup the AKS by 23.71×, CTX by 15.58×, and scheme by 1.83× (resp., 18.29×, 11.75×, and 1.42×) compared to C (resp., AVX512) baselines, respectively. For 32 comparisons, our CTX and scheme implementations outperform the C (resp., AVX512) baselines by 112.00× and 1.99× (resp., 81.53× and 1.51×).
Feel free to post resources that are related to this paper below.
Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.
For more information, see the rules for Resource Topics .