[Resource Topic] 2024/1429: Powerformer: Efficient Privacy-Preserving Transformer with Batch Rectifier-Power Max Function and Optimized Homomorphic Attention

Welcome to the resource topic for 2024/1429

Title:
Powerformer: Efficient Privacy-Preserving Transformer with Batch Rectifier-Power Max Function and Optimized Homomorphic Attention

Authors: Dongjin Park, Eunsang Lee, Joon-Woo Lee

Abstract:

We propose an efficient non-interactive privacy-preserving Transformer inference architecture called Powerformer. Since softmax is a non-algebraic operation, previous studies have attempted to modify it to be HE-friendly, but these methods have suffered from accuracy degradation or prolonged execution times due to the use of multiple bootstrappings. We propose replacing softmax with a new ReLU-based function, the *Batch Rectifier-Power max* (BRPmax) function, which avoids unstable approximation methods, outperforms even the original BERT performance within the BERT-Large model, and requires fewer levels, allowing it to operate with only a single bootstrapping. We also present matrix multiplication algorithms specialized for the attention block that reduce the number of key-switchings by 35% to 91% compared to existing state-of-the-art methods. We design a clear end-to-end HE-based implementation of a private Transformer model; our implementation of Powerformer on the BERT-tiny model using RNS-CKKS takes 503 seconds on a single-threaded CPU, and to the best of our knowledge, this is the first end-to-end non-interactive Transformer implementation using HE.
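The abstract does not spell out the BRPmax formula, so as a starting point for discussion, here is a minimal plaintext NumPy sketch of one plausible reading of a ReLU-based softmax replacement (rectify the attention scores, raise them to a power, then normalize). The function names, the exponent `p`, and the normalization details below are assumptions for illustration, not the paper's definition, and an actual HE evaluation under RNS-CKKS would need to handle the normalization with HE-friendly operations.

```python
import numpy as np

def brpmax(scores, p=3, eps=1e-6):
    """Hypothetical sketch of a ReLU-based softmax replacement in the
    spirit of BRPmax: rectify the scores, raise them to a power, and
    normalize row-wise. The paper's exact form and parameters may differ."""
    r = np.maximum(scores, 0.0) ** p
    return r / (r.sum(axis=-1, keepdims=True) + eps)

def attention_with_brpmax(Q, K, V, p=3):
    """Plaintext attention with softmax swapped for the BRPmax-style
    function above (illustrative only; not the paper's HE implementation)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    return brpmax(scores, p=p) @ V

# Toy usage with random matrices
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = attention_with_brpmax(Q, K, V)
print(out.shape)  # (4, 8)
```

Corrections or a pointer to the authors' actual BRPmax definition and code are very welcome below.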

ePrint: https://eprint.iacr.org/2024/1429

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics.