[Resource Topic] 2023/1147: CipherGPT: Secure Two-Party GPT Inference

Welcome to the resource topic for 2023/1147

Title:
CipherGPT: Secure Two-Party GPT Inference

Authors: Xiaoyang Hou, Jian Liu, Jingyu Li, Wen-jie Lu, Cheng Hong, Kui Ren

Abstract:

ChatGPT is recognized as a significant revolution in the field of artificial intelligence, but it raises serious concerns regarding user privacy, as the data submitted by users may contain sensitive information.
Existing solutions for secure inference face significant challenges in supporting GPT-like models due to the enormous number of model parameters and complex activation functions.

In this paper, we develop CipherGPT, the first framework for secure two-party GPT inference, building upon a series of innovative protocols.
First, we propose a secure matrix multiplication that is customized for GPT inference, achieving up to 2.5$\times$ speedup and 11.2$\times$ bandwidth reduction over SOTA.
We also propose a novel protocol for securely computing GELU, surpassing SOTA by 4.2$\times$ in runtime, 3.4$\times$ in communication and 10.9$\times$ in precision.
Furthermore, we come up with the first protocol for top-k sampling.

We provide a full-fledged implementation and comprehensive benchmark for CipherGPT.
In particular, we measure the runtime and communication for each individual operation, along with their corresponding proportions.
We believe this can serve as a reference for future research in this area.
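
For readers unfamiliar with the two non-linear operations named in the abstract, the following is a minimal plaintext sketch of what GELU and top-k sampling compute. It is not the paper's protocol: there is no secret sharing or cryptography here, the function names `gelu` and `top_k_sample` are illustrative, and temperature scaling is omitted. It only shows the functionality that a secure two-party protocol would need to evaluate on hidden inputs.

```python
import math
import random


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def top_k_sample(logits, k, rng=random):
    """Sample one index from the k largest logits, weighted by softmax.

    `rng` is anything with a `choices` method (the `random` module or a
    `random.Random` instance); this parameter is an assumption made for
    reproducible testing.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest logits, largest first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to the top-k entries, shifted by the max for stability.
    m = logits[top[0]]
    weights = [math.exp(logits[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k = 1` the sampler reduces to greedy decoding (argmax); larger `k` trades determinism for diversity in the generated text.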

ePrint: https://eprint.iacr.org/2023/1147

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics.