[Resource Topic] 2023/1678: BumbleBee: Secure Two-party Inference Framework for Large Transformers

Welcome to the resource topic for 2023/1678

BumbleBee: Secure Two-party Inference Framework for Large Transformers

Authors: Wen-jie Lu, Zhicong Huang, Zhen Gu, Jingyu Li, Jian Liu, Kui Ren, Cheng Hong, Tao Wei, WenGuang Chen


Large transformer-based models have achieved state-of-the-art performance on many real-world tasks such as natural language processing and computer vision. However, with the increasing sensitivity of the data and tasks they handle, privacy has become a major concern during model deployment. In this work, we focus on private inference in two-party settings, where one party holds private inputs and the other holds the model. We introduce BumbleBee, a fast and communication-friendly two-party private transformer inference system. Our contributions are three-fold: Firstly, we present optimized homomorphic encryption-based protocols that enable the multiplication of large matrices with 80–90% less communication cost than existing methods. Secondly, we offer a general method for designing efficient and accurate protocols for non-linear activation functions in transformers. Our activation protocols are faster and reduce communication overhead by 80–95% compared with two existing methods. Finally, we conducted extensive benchmarks on several large transformer models. Results show that BumbleBee is more than one order of magnitude faster than Iron (NeurIPS 2022).
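To give a flavor of why non-linear activations are the hard part in secure two-party inference: protocols typically replace a transcendental activation such as GELU with an approximation built from cheap, MPC-friendly pieces (clipping plus a low-degree polynomial). The sketch below is purely illustrative and is not BumbleBee's actual protocol; it compares exact GELU against the well-known tanh-based approximation, with the tails clipped to constants, to show that such substitutions can stay numerically close to the original function.

```python
import math

def gelu(x):
    # Exact GELU via the Gaussian CDF: x * Phi(x).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_approx(x):
    # Toy MPC-friendly stand-in (NOT the paper's protocol):
    # clip the tails, where GELU is essentially 0 or the identity,
    # and use the tanh-based approximation in between.
    if x <= -4.0:
        return 0.0
    if x >= 4.0:
        return x
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

# Maximum absolute error over a dense grid on [-6, 6].
max_err = max(abs(gelu(t / 100.0) - gelu_approx(t / 100.0))
              for t in range(-600, 601))
```

In a real protocol the polynomial and comparison steps would be evaluated on secret-shared or encrypted values; the point here is only that the plaintext approximation error can be kept small while the function becomes cheap to compute with secure building blocks.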

ePrint: https://eprint.iacr.org/2023/1678

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics.