Search Results for author: Arturs Backurs

Found 16 papers, 7 papers with code

Efficiently Computing Similarities to Private Datasets

no code implementations • 13 Mar 2024 • Arturs Backurs, Zinan Lin, Sepideh Mahabadi, Sandeep Silwal, Jakub Tarnawski

We abstract out this common subroutine and study the following fundamental algorithmic problem: Given a similarity function $f$ and a large high-dimensional private dataset $X \subset \mathbb{R}^d$, output a differentially private (DP) data structure which approximates $\sum_{x \in X} f(x, y)$ for any query $y$.

Density Estimation · Dimensionality Reduction
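
As a rough illustration of the query problem stated above, here is a minimal baseline (my own sketch, not the paper's data structure): answer a single kernel-sum query with the Laplace mechanism, using the fact that for a Gaussian kernel each point contributes at most 1 to the sum, so the sum has add/remove sensitivity 1.

```python
import numpy as np

def dp_similarity_query(X, y, epsilon, bandwidth=1.0, rng=None):
    """Naive Laplace-mechanism baseline (hypothetical, not the paper's
    construction): answer one kernel-sum query with epsilon-DP.
    Note that repeated queries would compose and consume more budget,
    which is why the paper builds a reusable DP data structure instead.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gaussian kernel similarities of every private point to the query y
    sims = np.exp(-np.sum((X - y) ** 2, axis=1) / (2 * bandwidth ** 2))
    # each point contributes at most 1, so the sum has sensitivity 1
    noise = rng.laplace(scale=1.0 / epsilon)
    return sims.sum() + noise
```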

Differentially Private Synthetic Data via Foundation Model APIs 2: Text

1 code implementation • 4 Mar 2024 • Chulin Xie, Zinan Lin, Arturs Backurs, Sivakanth Gopi, Da Yu, Huseyin A Inan, Harsha Nori, Haotian Jiang, Huishuai Zhang, Yin Tat Lee, Bo Li, Sergey Yekhanin

Lin et al. (2024) recently introduced the Private Evolution (PE) algorithm to generate DP synthetic images with only API access to diffusion models.

Privacy Preserving
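
For orientation, here is a heavily simplified sketch of the DP voting step at the heart of Private Evolution; the embedding representation and noise calibration below are assumptions on my part, not details taken from the paper.

```python
import numpy as np

def private_evolution_step(private_emb, synth_emb, sigma, rng=None):
    """One simplified PE voting step (hypothetical details): each private
    point votes for its nearest synthetic candidate, and Gaussian noise
    makes the resulting histogram differentially private. The noisy
    counts are then used to resample/evolve the candidate pool via the
    generation API."""
    rng = np.random.default_rng() if rng is None else rng
    # squared distances: (num_private, num_synthetic)
    d = ((private_emb[:, None, :] - synth_emb[None, :, :]) ** 2).sum(-1)
    # nearest-neighbor votes, one per private point
    votes = np.bincount(d.argmin(axis=1), minlength=len(synth_emb))
    # Gaussian noise for DP; sigma is calibrated to the privacy budget
    return votes + rng.normal(scale=sigma, size=votes.shape)
```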

Privately Aligning Language Models with Reinforcement Learning

no code implementations • 25 Oct 2023 • Fan Wu, Huseyin A. Inan, Arturs Backurs, Varun Chandrasekaran, Janardhan Kulkarni, Robert Sim

Positioned between pre-training and user deployment, aligning large language models (LLMs) through reinforcement learning (RL) has emerged as a prevailing strategy for training instruction-following models such as ChatGPT.

Instruction Following · Privacy Preserving · +3

Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping

no code implementations • 3 Dec 2022 • Jiyan He, Xuechen Li, Da Yu, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, Jiang Bian

To reduce the compute time overhead of private learning, we show that \emph{per-layer clipping}, where the gradient of each neural network layer is clipped separately, allows clipping to be performed in conjunction with backpropagation in differentially private optimization.

Computational Efficiency
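
A minimal sketch of the per-layer clipping idea described above, assuming per-example gradients have already been materialized; real implementations fuse this with backpropagation so each layer is clipped as soon as its gradient is produced, which is the source of the compute savings.

```python
import torch

def per_layer_clip_(per_example_grads, clip_norms):
    """Hypothetical sketch of per-layer clipping for DP optimization:
    each layer's per-example gradient is clipped to its own threshold,
    independently of all other layers.

    per_example_grads: dict layer_name -> tensor (batch, *param_shape)
    clip_norms:        dict layer_name -> float threshold for that layer
    """
    for name, g in per_example_grads.items():
        flat = g.reshape(g.shape[0], -1)
        # per-example gradient norm, computed over this layer only
        norms = flat.norm(dim=1, keepdim=True)
        scale = (clip_norms[name] / norms).clamp(max=1.0)
        # clip in place; broadcasting the per-example scale factor
        g.mul_(scale.view(-1, *([1] * (g.dim() - 1))))
```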

Unveiling Transformers with LEGO: a synthetic reasoning task

1 code implementation • 9 Jun 2022 • Yi Zhang, Arturs Backurs, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Tal Wagner

We study how the trained models eventually succeed at the task, and in particular, we manage to understand some of the attention heads as well as how the information flows in the network.

Learning to Execute

Differentially Private Model Compression

no code implementations • 3 Jun 2022 • FatemehSadat Mireshghallah, Arturs Backurs, Huseyin A Inan, Lukas Wutschitz, Janardhan Kulkarni

Recent papers have shown that large pre-trained language models (LLMs) such as BERT and GPT-2 can be fine-tuned on private data to achieve performance comparable to non-private models for many downstream Natural Language Processing (NLP) tasks while simultaneously guaranteeing differential privacy.

Model Compression

Differentially Private Fine-tuning of Language Models

2 code implementations • ICLR 2022 • Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $\epsilon = 6.7$.

Text Generation

Faster Kernel Matrix Algebra via Density Estimation

no code implementations • 16 Feb 2021 • Arturs Backurs, Piotr Indyk, Cameron Musco, Tal Wagner

In particular, we consider estimating the sum of kernel matrix entries, along with its top eigenvalue and eigenvector.

Density Estimation
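
As a naive point of comparison (my own assumption, not the paper's density-estimation-based algorithm), the kernel-sum quantity above can be estimated by sampling random pairs of points and rescaling:

```python
import numpy as np

def kernel_sum_estimate(X, num_samples=10_000, bandwidth=1.0, rng=None):
    """Monte Carlo baseline for estimating sum_{i,j} k(x_i, x_j) with a
    Gaussian kernel: sample uniformly random (i, j) pairs, average the
    kernel values, and scale by the n^2 total number of pairs."""
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    i = rng.integers(0, n, size=num_samples)
    j = rng.integers(0, n, size=num_samples)
    vals = np.exp(-np.sum((X[i] - X[j]) ** 2, axis=1) / (2 * bandwidth ** 2))
    return n * n * vals.mean()
```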

Data-to-text Generation by Splicing Together Nearest Neighbors

1 code implementation • EMNLP 2021 • Sam Wiseman, Arturs Backurs, Karl Stratos

We propose to tackle data-to-text generation tasks by directly splicing together retrieved segments of text from "neighbor" source-target pairs.

Conditional Text Generation · Data-to-Text Generation
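
A retrieval-only simplification of the idea above (mine; the paper splices *segments* from several neighbors rather than copying one whole target):

```python
def retrieve_neighbor_target(source, train_pairs, sim):
    """Nearest-neighbor baseline for data-to-text generation: return the
    target text of the training pair whose source record is most similar
    to the query source. `sim` is any source-similarity function the
    caller supplies (a placeholder here)."""
    best_source, best_target = max(train_pairs, key=lambda p: sim(source, p[0]))
    return best_target
```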

Impossibility Results for Grammar-Compressed Linear Algebra

no code implementations • NeurIPS 2020 • Amir Abboud, Arturs Backurs, Karl Bringmann, Marvin Künnemann

In this paper we consider lossless compression schemes, and ask if we can run our computations on the compressed data as efficiently as if the original data was that small.

Active Local Learning

no code implementations • 31 Aug 2020 • Arturs Backurs, Avrim Blum, Neha Gupta

In particular, the number of label queries should be independent of the complexity of $H$, and the function $h$ should be well-defined, independent of $x$.

Space and Time Efficient Kernel Density Estimation in High Dimensions

1 code implementation • NeurIPS 2019 • Arturs Backurs, Piotr Indyk, Tal Wagner

We instantiate our framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property.

Density Estimation · Vocal Bursts Intensity Prediction
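
For reference, the two kernels named above, in a minimal sketch (the bandwidth parameter `sigma` is a generic placeholder):

```python
import numpy as np

def laplacian_kernel(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||_1 / sigma)
    return np.exp(-np.abs(x - y).sum() / sigma)

def exponential_kernel(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||_2 / sigma)
    return np.exp(-np.linalg.norm(x - y) / sigma)
```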

Scalable Nearest Neighbor Search for Optimal Transport

1 code implementation • ICML 2020 • Arturs Backurs, Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner

Our extensive experiments, on real-world text and image datasets, show that Flowtree improves over various baselines and existing methods in either running time or accuracy.

Data Structures and Algorithms

Scalable Fair Clustering

1 code implementation • 10 Feb 2019 • Arturs Backurs, Piotr Indyk, Krzysztof Onak, Baruch Schieber, Ali Vakilian, Tal Wagner

In the fair variant of $k$-median, the points are colored, and the goal is to minimize the same average distance objective while ensuring that all clusters have an "approximately equal" number of points of each color.

Clustering · Fairness
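
One way to make the "approximately equal" constraint above concrete is a per-cluster balance score; this sketch only illustrates the constraint, not the paper's scalable algorithm:

```python
from collections import Counter

def cluster_balance(labels, colors):
    """For each cluster, the ratio of the least- to the most-represented
    color among colors present in that cluster. A perfectly balanced
    clustering scores 1.0 in every cluster."""
    per_cluster = {}
    for c in set(labels):
        counts = Counter(col for lab, col in zip(labels, colors) if lab == c)
        per_cluster[c] = min(counts.values()) / max(counts.values())
    return per_cluster
```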

Improving Viterbi is Hard: Better Runtimes Imply Faster Clique Algorithms

no code implementations • ICML 2017 • Arturs Backurs, Christos Tzamos

The classic algorithm of Viterbi computes the most likely path in a Hidden Markov Model (HMM) that results in a given sequence of observations.
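
The classic dynamic program in question, in its standard O(T·k²) form; this textbook sketch is the baseline whose runtime the paper's hardness result addresses.

```python
import numpy as np

def viterbi(obs, log_pi, log_A, log_B):
    """Most likely hidden state path of an HMM given observations.

    obs:    observation indices, length T
    log_pi: (k,)   log initial state probabilities
    log_A:  (k, k) log transitions, log_A[i, j] = log P(j | i)
    log_B:  (k, m) log emissions,   log_B[i, o] = log P(o | i)
    """
    T, k = len(obs), log_pi.shape[0]
    dp = np.empty((T, k))
    back = np.empty((T, k), dtype=int)
    dp[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        # scores[i, j]: best log-prob of being in state i at t-1, then moving to j
        scores = dp[t - 1][:, None] + log_A
        back[t] = scores.argmax(axis=0)
        dp[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # backtrack from the best final state
    path = [int(dp[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```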

On the Fine-Grained Complexity of Empirical Risk Minimization: Kernel Methods and Neural Networks

no code implementations • NeurIPS 2017 • Arturs Backurs, Piotr Indyk, Ludwig Schmidt

We also give similar hardness results for computing the gradient of the empirical loss, which is the main computational burden in many non-convex learning tasks.
