# A Smooth Binary Mechanism for Efficient Private Continual Observation

1 code implementation, 16 Jun 2023

We address the efficiency problem by presenting a simple alternative to the binary mechanism in which 1) generating the noise takes constant average time per value, 2) the variance is reduced by a factor of about 4 compared to the binary mechanism, and 3) the noise distribution at each step is identical.
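For context, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the classic binary mechanism, not the smooth variant from the paper: each prefix sum is assembled from noisy sums over dyadic intervals. The function name `binary_mechanism`, the use of Gaussian noise, and the `sigma` parameter are assumptions made for illustration, and calibrating `sigma` to a privacy budget is not shown.

```python
import random


def binary_mechanism(stream, sigma, seed=None):
    """Return noisy prefix sums of `stream` (classic binary mechanism sketch).

    Assumption: `sigma` is the standard deviation of the Gaussian noise added
    to every dyadic-interval sum; choosing it for a target privacy level is
    outside the scope of this sketch.
    """
    rng = random.Random(seed)

    # Exact prefix sums, used only to compute exact interval sums.
    prefix = [0]
    for x in stream:
        prefix.append(prefix[-1] + x)

    noisy_node = {}  # (start, length) -> noisy sum over [start, start + length)
    outputs = []
    for t in range(1, len(stream) + 1):
        total = 0.0
        start = 0
        # The binary representation of t splits [0, t) into dyadic intervals.
        for bit in reversed(range(t.bit_length())):
            length = 1 << bit
            if t & length:
                key = (start, length)
                if key not in noisy_node:
                    exact = prefix[start + length] - prefix[start]
                    noisy_node[key] = exact + rng.gauss(0.0, sigma)
                # Noise for a node is drawn once and reused by later steps.
                total += noisy_node[key]
                start += length
        outputs.append(total)
    return outputs
```

In this sketch the number of noise terms in the output at step `t` equals the number of 1-bits in `t`, so the noise distribution varies from step to step. That is the property item 3) of the abstract refers to when it says the new mechanism has identical noise at each step.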


# PLAN: Variance-Aware Private Mean Estimation

Under a concentration assumption on $\mathcal{D}$, we show how to exploit skew in the vector $\boldsymbol{\sigma}$, obtaining a (zero-concentrated) differentially private mean estimate with $\ell_2$ error proportional to $\|\boldsymbol{\sigma}\|_1$.

# Daisy Bloom Filters

We refer to our parameterization of the weighted Bloom filter as a $\textit{Daisy Bloom filter}$.

# Infinitely Divisible Noise in the Low Privacy Regime

1 code implementation, 13 Oct 2021

Federated learning, in which training data is distributed among users and never shared, has emerged as a popular approach to privacy-preserving machine learning.


# DEANN: Speeding up Kernel-Density Estimation using Approximate Nearest Neighbor Search

We present an algorithm called Density Estimation from Approximate Nearest Neighbors (DEANN) where we apply Approximate Nearest Neighbor (ANN) algorithms as a black box subroutine to compute an unbiased KDE.
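One way to combine a nearest-neighbor subroutine with random sampling into an unbiased kernel density estimate is sketched below. It is a hedged illustration in the spirit of the sentence above, and the paper's exact estimator may differ. The names `gaussian_kernel` and `kde_from_neighbors`, the Gaussian kernel, and the parameters `k`, `m` and `bandwidth` are assumptions, and an exact sort stands in for the ANN black box.

```python
import math
import random


def gaussian_kernel(q, x, bandwidth):
    """Gaussian kernel value between query q and data point x."""
    return math.exp(-math.dist(q, x) ** 2 / (2 * bandwidth ** 2))


def kde_from_neighbors(data, q, k, m, bandwidth=1.0, seed=None):
    """Estimate (1/n) * sum_x K(q, x) from k near neighbors plus m samples.

    Assumption: exact k-nearest-neighbor search by sorting stands in for an
    ANN library. Requires m >= 1 whenever k < len(data).
    """
    rng = random.Random(seed)
    n = len(data)
    order = sorted(range(n), key=lambda i: math.dist(q, data[i]))
    near, rest = order[:k], order[k:]

    # Contributions of the returned neighbors are computed exactly.
    near_sum = sum(gaussian_kernel(q, data[i], bandwidth) for i in near)

    # The remaining points are estimated by uniform sampling with replacement,
    # rescaled by len(rest) so the expectation equals their true total.
    rest_sum = 0.0
    if rest:
        sample = [rng.choice(rest) for _ in range(m)]
        sampled = sum(gaussian_kernel(q, data[i], bandwidth) for i in sample)
        rest_sum = len(rest) * sampled / m

    return (near_sum + rest_sum) / n
```

The estimate stays unbiased whichever points the neighbor search returns, as long as sampling is uniform over exactly the points it did not return. The quality of the neighbor search only affects the variance.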


# CountSketches, Feature Hashing and the Median of Three

For $t > 1$, the estimator takes the median of $2t-1$ independent estimates, and the probability that the estimate is off by more than $2 \|v\|_2/\sqrt{s}$ is exponentially small in $t$.
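A median-of-rows CountSketch estimator of this kind can be sketched as below, with `rows` playing the role of $2t-1$ and `s` the number of buckets per row. This is an illustrative sketch only: the class name `CountSketch` and its methods are hypothetical, and a seeded pseudorandom generator stands in for the hash functions, which is an assumption and not the hashing scheme analyzed in the paper.

```python
import random
import statistics


class CountSketch:
    """Illustrative CountSketch with `rows` independent rows of `s` buckets."""

    def __init__(self, rows, s, seed=0):
        self.rows = rows
        self.s = s
        self.seed = seed
        self.table = [[0.0] * s for _ in range(rows)]

    def _bucket_and_sign(self, row, key):
        # Assumption: a PRNG seeded by (seed, row, key) replaces real hashing.
        rng = random.Random(f"{self.seed}:{row}:{key}")
        return rng.randrange(self.s), rng.choice((-1, 1))

    def update(self, key, delta=1.0):
        """Add `delta` to the coordinate of the vector indexed by `key`."""
        for row in range(self.rows):
            bucket, sign = self._bucket_and_sign(row, key)
            self.table[row][bucket] += sign * delta

    def estimate(self, key):
        """Median over rows of the signed bucket value for `key`."""
        per_row = []
        for row in range(self.rows):
            bucket, sign = self._bucket_and_sign(row, key)
            per_row.append(sign * self.table[row][bucket])
        return statistics.median(per_row)
```

For example, with $t = 3$ one would create `CountSketch(rows=5, s=16)`, call `update` for each stream item, and read coordinates back with `estimate`.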

# Sampling a Near Neighbor in High Dimensions -- Who is the Fairest of Them All?

Given a set of points $S$ and a radius parameter $r>0$, the $r$-near neighbor ($r$-NN) problem asks for a data structure that, given any query point $q$, returns a point $p$ within distance at most $r$ from $q$.
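The query semantics can be made concrete with a linear-scan baseline. The sketch below is not the data structure from the paper; it only pins down what a correct answer looks like, and what a uniformly random answer looks like. The function names and the choice of Euclidean distance are assumptions for illustration.

```python
import math
import random


def near_neighbors(points, q, r):
    """All points of `points` within Euclidean distance r of q."""
    return [p for p in points if math.dist(p, q) <= r]


def query_r_nn(points, q, r):
    """Return some point within distance r of q, or None if there is none."""
    for p in points:
        if math.dist(p, q) <= r:
            return p
    return None


def sample_near_neighbor(points, q, r, rng=random):
    """Return a uniformly random point within distance r of q, or None."""
    candidates = near_neighbors(points, q, r)
    return rng.choice(candidates) if candidates else None
```

`query_r_nn` always returns the first qualifying point in scan order, so some near neighbors are never reported. `sample_near_neighbor` gives every near neighbor the same chance, at the cost of a full scan per query.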


# WOR and $p$'s: Sketches for $\ell_p$-Sampling Without Replacement

We design novel composable sketches for WOR $\ell_p$ sampling, weighted sampling of keys according to a power $p\in[0, 2]$ of their frequency (or for signed data, sum of updates).

# Advances and Open Problems in Federated Learning

FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches.


# Private Aggregation from Fewer Anonymous Messages

Using a reduction of Balle et al. (2019), our improved analysis of the protocol of Ishai et al. yields, in the same model, an $\left(\varepsilon, \delta\right)$-differentially private protocol for aggregation that, for any constant $\varepsilon > 0$ and any $\delta = \frac{1}{\mathrm{poly}(n)}$, incurs only a constant error and requires only a constant number of messages per party.

Cryptography and Security, Data Structures and Algorithms

# The space complexity of inner product filters

No code implementations, 24 Sep 2019

Motivated by the problem of filtering candidate pairs in inner product similarity joins we study the following inner product estimation problem: Given parameters $d\in {\bf N}$, $\alpha>\beta\geq 0$ and unit vectors $x, y\in {\bf R}^{d}$ consider the task of distinguishing between the cases $\langle x, y\rangle\leq\beta$ and $\langle x, y\rangle\geq \alpha$ where $\langle x, y\rangle = \sum_{i=1}^d x_i y_i$ is the inner product of vectors $x$ and $y$.

# Oblivious Sketching of High-Degree Polynomial Kernels

Oblivious sketching has emerged as a powerful approach to speeding up numerical linear algebra over the past decade, but our understanding of oblivious sketching solutions for kernel matrices has remained quite limited, suffering from the aforementioned exponential dependence on input parameters.

Data Structures and Algorithms


# On the Power of Multiple Anonymous Messages

- Protocols in the multi-message shuffled model with $\mathrm{poly}(\log B, \log n)$ bits of communication per user and $\mathrm{poly}\log B$ error, which provide an exponential improvement on the error compared to what is possible with single-message algorithms.

# PUFFINN: Parameterless and Universally Fast FInding of Nearest Neighbors

We describe a novel synthetic data set that is difficult for almost all existing nearest neighbor search approaches, and for which PUFFINN significantly outperforms previous methods.

Data Structures and Algorithms, Computational Geometry


# Scalable and Differentially Private Distributed Aggregation in the Shuffled Model

No code implementations, 19 Jun 2019

Federated learning promises to make machine learning feasible on distributed, private datasets by implementing gradient descent using secure aggregation methods.

# Fair Near Neighbor Search: Independent Range Sampling in High Dimensions

There are several variants of the similarity search problem, and one of the most relevant is the $r$-near neighbor ($r$-NN) problem: given a radius $r>0$ and a set of points $S$, construct a data structure that, for any given query point $q$, returns a point $p$ within distance at most $r$ from $q$.


# Space-efficient Feature Maps for String Alignment Kernels

We present novel space-efficient feature maps (SFMs) of random Fourier features (RFFs) that reduce space from $O(dD)$ for the original feature maps (FMs) to $O(d)$ for SFMs, with a theoretical guarantee with respect to concentration bounds.

# On the Complexity of Inner Product Similarity Join

* New upper and lower bounds for (A)LSH-based algorithms.
