Fingerprinting Codes Meet Geometry: Improved Lower Bounds for Private Query Release and Adaptive Data Analysis
Fingerprinting codes are a crucial tool for proving lower bounds in differential privacy. They have been used to prove tight lower bounds for several fundamental questions, especially in the “low accuracy” regime. Unlike reconstruction/discrepancy approaches, however, they are best suited to proving worst-case lower bounds, for query sets that arise naturally from the fingerprinting-code construction. In this work, we propose a general framework for proving fingerprinting-type lower bounds that allows us to tailor the technique to the geometry of the query set.
Our approach allows us to prove several new results.

First, we show that any (sample- and population-)accurate algorithm for answering $Q$ arbitrary adaptive counting queries over a universe $\mathcal{X}$ to accuracy $\alpha$ needs $\Omega\left(\frac{\sqrt{\log |\mathcal{X}|}\cdot \log Q}{\alpha^3}\right)$ samples. This shows that the approaches based on differential privacy are optimal for this question, and improves significantly on the previously known lower bounds of $\frac{\log Q}{\alpha^2}$ and $\min(\sqrt{Q}, \sqrt{\log |\mathcal{X}|})/\alpha^2$.
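For readers comparing the bounds at a glance, the first result can be typeset as a display equation (this merely restates the formulas above; $n$ denotes the sample complexity, and $Q$, $\mathcal{X}$, $\alpha$ are as in the abstract):

```latex
% New lower bound for Q adaptive counting queries over universe X at accuracy alpha:
n = \Omega\!\left(\frac{\sqrt{\log |\mathcal{X}|}\cdot \log Q}{\alpha^{3}}\right),
% improving on the previously known lower bounds
n = \Omega\!\left(\frac{\log Q}{\alpha^{2}}\right)
\quad\text{and}\quad
n = \Omega\!\left(\frac{\min\bigl(\sqrt{Q},\,\sqrt{\log |\mathcal{X}|}\bigr)}{\alpha^{2}}\right).
```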
Secondly, we show that any $(\varepsilon,\delta)$-DP algorithm for answering $Q$ counting queries to accuracy $\alpha$ needs $\Omega\left( \frac{\sqrt{d \log(1/\delta)}\, \log Q}{\varepsilon \alpha^2} \right)$ samples. Our framework allows for directly proving this bound, and improves by a $\sqrt{\log(1/\delta)}$ factor on the bound proved by Bun, Ullman and Vadhan (2013) using composition. Thirdly, we characterize the sample complexity of answering a set of random 0-1 queries under approximate differential privacy. To achieve this, we give new upper and lower bounds that, combined with existing bounds, allow us to complete the picture.

Figure 1: Behavior of the sample complexity vs. error trade-off for $d$ random linear queries (left) and worst-case queries (right) over a universe $\mathcal{X}$ (log-log scale). The sample complexity for random queries is discontinuous at $\alpha \approx \frac{\sqrt{\log |\mathcal{X}|}}{\sqrt{d}}$. The dependence on the privacy parameters and $\log d$ terms is suppressed for clarity.

