[Resource Topic] 2024/136: Secure Transformer Inference Made Non-interactive

Welcome to the resource topic for 2024/136

Title:
Secure Transformer Inference Made Non-interactive

Authors: Jiawen Zhang, Jian Liu, Xinpeng Yang, Yinghao Wang, Kejia Chen, Xiaoyang Hou, Kui Ren, Xiaohu Yang

Abstract:

Secure transformer inference has emerged as a prominent research topic following the proliferation of ChatGPT. Existing solutions are typically interactive, involving substantial communication load and numerous interaction rounds between the client and the server.

In this paper, we propose NEXUS, the first non-interactive protocol for secure transformer inference, where the client is only required to submit an encrypted input and await the encrypted result from the server. Central to NEXUS are two innovative techniques: SIMD ciphertext compression/decompression, and SIMD slots folding. Consequently, our approach achieves a speedup of 2.8$\times$ and a remarkable bandwidth reduction of 368.6$\times$, compared to the state-of-the-art solution presented in S&P '24.

ePrint: https://eprint.iacr.org/2024/136

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics.