[Resource Topic] 2024/2024: Hash-Prune-Invert: Improved Differentially Private Heavy-Hitter Detection in the Two-Server Model

Welcome to the resource topic for 2024/2024

Title:
Hash-Prune-Invert: Improved Differentially Private Heavy-Hitter Detection in the Two-Server Model

Authors: Borja Balle, James Bell, Albert Cheu, Adria Gascon, Jonathan Katz, Mariana Raykova, Phillipp Schoppmann, Thomas Steinke

Abstract:

Differentially private (DP) heavy-hitter detection is an important primitive for data analysis. Given a threshold $t$ and a dataset of $n$ items from a domain of size $d$, such detection algorithms ignore items occurring fewer than $t$ times while identifying items occurring more than $t+\Delta$ times; we call $\Delta$ the error margin. In the central model where a curator holds the entire dataset, $(\varepsilon,\delta)$-DP algorithms can achieve error margin $\Theta(\frac{1}{\varepsilon} \log \frac{1}{\delta})$, which is optimal when $d \gg 1/\delta$.
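To make the threshold/error-margin setup concrete, here is a minimal sketch of central-model DP heavy-hitter detection via noisy thresholding. This is an illustrative toy, not the paper's protocol: the function name, the specific noise mechanism (Laplace noise as a difference of exponentials), and the margin constant are assumptions chosen for clarity.

```python
import math
import random
from collections import Counter

def dp_heavy_hitters(items, t, epsilon, delta):
    """Toy central-model heavy-hitter detection (illustrative only).

    Reports items whose Laplace-noised count clears a threshold padded
    by a margin of order (1/epsilon) * log(1/delta), mirroring the
    error-margin guarantee described in the abstract. Assumes each user
    contributes a single item, so each count has sensitivity 1.
    """
    counts = Counter(items)
    # Pad the threshold so items occurring fewer than t times are
    # released only with very small probability (margin constant is
    # a simplified choice, not a tight analysis).
    margin = (2.0 / epsilon) * math.log(1.0 / delta)
    hitters = []
    for item, c in counts.items():
        # Laplace(scale = 1/epsilon) noise, sampled as the difference
        # of two exponentials with rate epsilon.
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        if c + noise > t + margin:
            hitters.append(item)
    return hitters
```

An item occurring far more than $t + \Delta$ times is reported with overwhelming probability, while items below $t$ are almost never released; the gap between these regimes is exactly the error margin the abstract is concerned with.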

Several works, e.g., Poplar (S&P 2021), have proposed protocols in which two or more non-colluding servers jointly compute the heavy hitters from inputs held by $n$ clients. Unfortunately, existing protocols suffer from an undesirable dependence on $\log d$ in terms of both server efficiency (computation, communication, and round complexity) and accuracy (i.e., error margin), making them unsuitable for large domains (e.g., when items are kB-long strings, $\log d \approx 10^4$).

We present hash-prune-invert (HPI), a technique for compiling any heavy-hitter protocol with the $\log d$ dependencies mentioned above into a new protocol with improvements across the board: computation, communication, and round complexity depend (roughly) on $\log n$ rather than $\log d$, and the error margin is independent of $d$. Our transformation preserves privacy against an active adversary corrupting at most one of the servers and any number of clients. We apply HPI to an improved version of Poplar, also introduced in this work, that improves Poplar's error margin by roughly a factor of $\sqrt{n}$ (regardless of $d$). Our experiments confirm that the resulting protocol improves efficiency and accuracy for large $d$.

ePrint: https://eprint.iacr.org/2024/2024

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics.