Welcome to the resource topic for 2023/1678
Title: BumbleBee: Secure Two-party Inference Framework for Large Transformers
Authors: Wen-jie Lu, Zhicong Huang, Zhen Gu, Jingyu Li, Jian Liu, Kui Ren, Cheng Hong, Tao Wei, WenGuang Chen
Abstract: Large transformer-based models have achieved state-of-the-art performance on many real-world tasks, such as natural language processing and computer vision. However, with the increasing sensitivity of the data and tasks they handle, privacy has become a major concern during model deployment. In this work, we focus on private inference in the two-party setting, where one party holds private inputs and the other holds the model. We introduce BumbleBee, a fast and communication-friendly two-party private transformer inference system. Our contributions are three-fold: Firstly, we present optimized homomorphic encryption-based protocols that enable the multiplication of large matrices with 80–90% less communication cost than existing methods. Secondly, we offer a general method for designing efficient and accurate protocols for the non-linear activation functions in transformers. Our activation protocols are faster and reduce the communication overhead by 80–95% compared with two existing methods. Finally, we conducted intensive benchmarks on several large transformer models. Results show that BumbleBee is more than one order of magnitude faster than Iron (NeurIPS 2022).
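A common recipe in this line of work for making activations like GELU cheap in secure two-party computation is to replace the transcendental function with a piecewise low-degree polynomial: the tails are handled exactly (near zero for very negative inputs, near the identity for large positive inputs), and only a middle interval needs a polynomial fit. The plaintext sketch below illustrates that idea; the breakpoints, polynomial degree, and least-squares fitting are illustrative choices for this post, not the paper's actual parameters or protocol.

```python
import numpy as np
from math import erf, sqrt

def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return np.array([0.5 * v * (1.0 + erf(v / sqrt(2.0))) for v in np.asarray(x, dtype=float)])

# Illustrative breakpoints: GELU is ~0 below a and ~x above b,
# so only [a, b] needs a polynomial approximation.
a, b = -5.0, 3.0
xs = np.linspace(a, b, 4096)
coeffs = np.polyfit(xs, gelu(xs), deg=6)  # low-degree least-squares fit on the middle segment

def gelu_piecewise(x):
    """Piecewise surrogate: 0 on (-inf, a), polynomial on [a, b], identity on (b, inf)."""
    x = np.asarray(x, dtype=float)
    mid = np.polyval(coeffs, x)
    return np.where(x < a, 0.0, np.where(x > b, x, mid))

# Check the worst-case approximation error on a test grid.
test = np.linspace(-8.0, 8.0, 10001)
err = np.max(np.abs(gelu_piecewise(test) - gelu(test)))
print(f"max |error| = {err:.2e}")
```

In an actual two-party protocol the polynomial would be evaluated over secret-shared fixed-point values, with a comparison protocol selecting the active segment, which is where the communication savings come from relative to evaluating the exact function.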
ePrint: https://eprint.iacr.org/2023/1678
See all topics related to this paper.
Feel free to post resources that are related to this paper below.
Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.
For more information, see the rules for Resource Topics.