no code implementations • NAACL (PrivateNLP) 2022 • Natalia Ponomareva, Jasmijn Bastings, Sergei Vassilvitskii
We focus on T5 and show that, by using recent advances in JAX and XLA, we can train models with DP that suffer neither a large drop in pre-training utility nor a slowdown in training, and can still be fine-tuned to high accuracy on downstream tasks.
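A minimal sketch of the kind of training step this JAX/XLA stack enables: per-example gradients via `jax.vmap(jax.grad(...))` in a single compiled program, clipped and noised as in DP-SGD. This is an illustrative reconstruction, not the paper's code; the `loss_fn(params, example)` signature and all hyperparameter names are assumptions.

```python
import jax
import jax.numpy as jnp

def dp_sgd_grads(loss_fn, params, examples, clip_norm, noise_mult, key):
    # Per-example gradients in one XLA program: vmap over the batch axis.
    # Assumes loss_fn(params, example) -> scalar loss (illustrative signature).
    grads = jax.vmap(jax.grad(loss_fn), in_axes=(None, 0))(params, examples)
    leaves, treedef = jax.tree_util.tree_flatten(grads)
    batch = leaves[0].shape[0]
    # Global L2 norm of each example's gradient across all parameter leaves.
    norms = jnp.sqrt(sum(jnp.sum(l.reshape(batch, -1) ** 2, axis=1) for l in leaves))
    scale = jnp.minimum(1.0, clip_norm / jnp.maximum(norms, 1e-12))
    clipped = [l * scale.reshape((batch,) + (1,) * (l.ndim - 1)) for l in leaves]
    # Average the clipped gradients and add Gaussian noise calibrated to the clip.
    keys = jax.random.split(key, len(clipped))
    noisy = [jnp.mean(l, axis=0)
             + jax.random.normal(k, l.shape[1:]) * clip_norm * noise_mult / batch
             for l, k in zip(clipped, keys)]
    return jax.tree_util.tree_unflatten(treedef, noisy)
```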
no code implementations • 16 Jul 2024 • Kareem Amin, Alex Bie, Weiwei Kong, Alexey Kurakin, Natalia Ponomareva, Umar Syed, Andreas Terzis, Sergei Vassilvitskii
In the private prediction framework, we only require the output synthetic data to satisfy differential privacy guarantees.
no code implementations • 28 May 2024 • Sami Davies, Sergei Vassilvitskii, Yuyan Wang
In practice, Push-Relabel runs even faster than its theoretical guarantees promise, in part because of good heuristics for seeding and updating the iterative algorithm.
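For reference, a bare-bones generic push-relabel (no seeding, no heuristics) looks like the sketch below; the paper's contribution concerns warm-starting such an algorithm with learned predictions, which this plain baseline deliberately does not include. The `capacity`-dict graph encoding is an assumption for illustration.

```python
from collections import defaultdict

def push_relabel_max_flow(capacity, source, sink, n):
    # capacity: dict mapping (u, v) -> edge capacity; nodes are 0..n-1.
    flow = defaultdict(int)
    height, excess = [0] * n, [0] * n
    height[source] = n
    for v in range(n):  # saturate every edge out of the source
        c = capacity.get((source, v), 0)
        if c:
            flow[(source, v)], flow[(v, source)] = c, -c
            excess[v] += c

    def residual(u, v):
        return capacity.get((u, v), 0) - flow[(u, v)]

    active = [v for v in range(n) if v not in (source, sink) and excess[v] > 0]
    while active:
        u = active.pop()
        while excess[u] > 0:  # discharge u fully before moving on
            pushed = False
            for v in range(n):
                if residual(u, v) > 0 and height[u] == height[v] + 1:
                    delta = min(excess[u], residual(u, v))
                    flow[(u, v)] += delta
                    flow[(v, u)] -= delta
                    excess[u] -= delta
                    excess[v] += delta
                    if v not in (source, sink) and excess[v] == delta:
                        active.append(v)  # v just became active
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:  # relabel: lift u just above its lowest residual neighbor
                height[u] = 1 + min(height[v] for v in range(n) if residual(u, v) > 0)
    return sum(flow[(source, v)] for v in range(n))
```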
no code implementations • 6 Feb 2024 • Berivan Isik, Natalia Ponomareva, Hussein Hazimeh, Dimitris Paparas, Sergei Vassilvitskii, Sanmi Koyejo
With sufficient alignment, both downstream cross-entropy and BLEU score improve monotonically with more pretraining data.
3 code implementations • 12 Apr 2023 • CJ Carey, Travis Dick, Alessandro Epasto, Adel Javanmard, Josh Karlin, Shankar Kumar, Andres Munoz Medina, Vahab Mirrokni, Gabriel Henrique Nunes, Sergei Vassilvitskii, Peilin Zhong
In this work, we present a new theoretical framework to measure re-identification risk in such user representations.
1 code implementation • 1 Mar 2023 • Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, Abhradeep Thakurta
However, while there has been some adoption of DP in industry, attempts to apply it to real-world, complex ML models are still few and far between.
1 code implementation • 22 Oct 2022 • Michael Dinitz, Sungjin Im, Thomas Lavastida, Benjamin Moseley, Sergei Vassilvitskii
For each of these problems we introduce new algorithms that take advantage of multiple predictors, and prove bounds on the resulting performance.
1 code implementation • 20 Oct 2022 • Mikhail Khodak, Kareem Amin, Travis Dick, Sergei Vassilvitskii
When applying differential privacy to sensitive data, we can often improve performance using external information such as other sensitive data, public data, or human priors.
no code implementations • 15 Aug 2022 • Kareem Amin, Matthew Joseph, Mónica Ribero, Sergei Vassilvitskii
In this paper, we study an algorithm which uses the exponential mechanism to select a model with high Tukey depth from a collection of non-private regression models.
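The selection primitive here is the standard exponential mechanism; a generic sketch follows, where in the paper's setting each candidate model would be scored by its (approximate) Tukey depth. The `scores` interface is an assumption; the depth computation itself is not reproduced.

```python
import numpy as np

def exponential_mechanism(scores, epsilon, sensitivity, rng=None):
    # Sample index i with probability proportional to exp(eps * score_i / (2 * sens)).
    rng = rng or np.random.default_rng()
    logits = epsilon * np.asarray(scores, dtype=float) / (2.0 * sensitivity)
    logits -= logits.max()  # shift for numerical stability
    probs = np.exp(logits)
    return rng.choice(len(scores), p=probs / probs.sum())
```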
1 code implementation • 17 Jun 2022 • Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi, Vahab Mirrokni, Andres Munoz, David Saulpic, Chris Schwiegelshohn, Sergei Vassilvitskii
We study the private $k$-median and $k$-means clustering problem in $d$ dimensional Euclidean space.
no code implementations • 18 Feb 2022 • Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar, Sergei Vassilvitskii
A burgeoning paradigm in algorithm design is the field of algorithms with predictions, in which algorithms can take advantage of a possibly-imperfect prediction of some aspect of the problem.
no code implementations • NeurIPS 2021 • Silvio Lattanzi, Benjamin Moseley, Sergei Vassilvitskii, Yuyan Wang, Rudy Zhou
In correlation clustering we are given a set of points along with recommendations as to whether each pair of points should be placed in the same cluster or in separate clusters.
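As a baseline for what correlation clustering asks, the classic Pivot algorithm of Ailon, Charikar, and Newman (an expected 3-approximation on complete +/- instances) is sketched below; the paper's robust algorithm is different and not reproduced here.

```python
import random

def pivot_correlation_clustering(nodes, positive_pairs, seed=0):
    # positive_pairs: set of frozenset({u, v}) recommended to share a cluster.
    # Nodes are assumed sortable so the random pivot choice is reproducible.
    rng, remaining, clusters = random.Random(seed), set(nodes), []
    while remaining:
        pivot = rng.choice(sorted(remaining))
        cluster = {pivot} | {v for v in remaining if v != pivot
                             and frozenset((pivot, v)) in positive_pairs}
        clusters.append(cluster)
        remaining -= cluster
    return clusters
```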
no code implementations • NeurIPS Workshop LatinX_in_AI 2021 • Andres Munoz Medina, Robert Istvan Busa-Fekete, Umar Syed, Sergei Vassilvitskii
We complement the negative results with a non-parametric estimator for the true privacy loss, and apply our techniques on large-scale benchmark data to demonstrate how to achieve a desired privacy protection.
no code implementations • 5 Oct 2021 • Hossein Esfandiari, Vahab Mirrokni, Umar Syed, Sergei Vassilvitskii
We present new mechanisms for \emph{label differential privacy}, a relaxation of differentially private machine learning that only protects the privacy of the labels in the training set.
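The simplest label-DP mechanism, and a useful reference point for this entry, is randomized response applied to labels only; the paper's mechanisms improve on this baseline and are not reproduced here.

```python
import math, random

def randomized_response_label(label, num_classes, epsilon, rng=random):
    # Keep the true label w.p. e^eps / (e^eps + k - 1); otherwise emit a
    # uniformly random *other* label. This satisfies eps-label-DP.
    keep_prob = math.exp(epsilon) / (math.exp(epsilon) + num_classes - 1)
    if rng.random() < keep_prob:
        return label
    other = rng.randrange(num_classes - 1)
    return other if other < label else other + 1
```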
no code implementations • NeurIPS 2021 • Michael Dinitz, Sungjin Im, Thomas Lavastida, Benjamin Moseley, Sergei Vassilvitskii
Second, once the duals are feasible, they may not be optimal, so we show that they can be used to quickly find an optimal solution.
no code implementations • 2 Jul 2020 • Andrés Muñoz Medina, Umar Syed, Sergei Vassilvitskii, Ellen Vitercik
We also prove a lower bound demonstrating that the difference between the objective value of our algorithm's solution and the optimal solution is tight up to logarithmic factors among all differentially private algorithms.
no code implementations • NeurIPS 2020 • Sara Ahmadian, Alessandro Epasto, Marina Knittel, Ravi Kumar, Mohammad Mahdian, Benjamin Moseley, Philip Pham, Sergei Vassilvitskii, Yuyan Wang
As machine learning has become more prevalent, researchers have begun to recognize the necessity of ensuring that machine learning systems are fair.
1 code implementation • NeurIPS 2020 • Michele Borassi, Alessandro Epasto, Silvio Lattanzi, Sergei Vassilvitskii, Morteza Zadimoghaddam
The sliding window model of computation captures scenarios in which data is arriving continuously, but only the latest $w$ elements should be used for analysis.
Data Structures and Algorithms
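A toy illustration of the sliding window model itself (not the paper's submodular algorithms): maintaining the maximum of the latest $w$ elements with a monotone deque, in O(1) amortized time per arrival.

```python
from collections import deque

def sliding_window_max(stream, w):
    # Deque holds (index, value) pairs with strictly decreasing values;
    # the front is always the max of the latest w elements seen.
    dq = deque()
    for i, x in enumerate(stream):
        while dq and dq[-1][1] <= x:
            dq.pop()  # x dominates everything smaller that arrived earlier
        dq.append((i, x))
        if dq[0][0] <= i - w:
            dq.popleft()  # front has slid out of the window
        yield dq[0][1]
```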
no code implementations • NeurIPS 2019 • Kareem Amin, Travis Dick, Alex Kulesza, Andres Munoz, Sergei Vassilvitskii
The covariance matrix of a dataset is a fundamental statistic that can be used for calculating optimum regression weights as well as in many other learning and data analysis settings.
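For orientation, the baseline such work improves on is the Gaussian mechanism applied directly to $X^\top X$; a hedged sketch follows, assuming add/remove neighboring datasets and rows clipped to bounded norm. The paper's mechanisms are more refined and not reproduced here.

```python
import numpy as np

def dp_covariance(X, epsilon, delta, clip_norm=1.0, rng=None):
    rng = rng or np.random.default_rng()
    # Clip rows to L2 norm <= clip_norm so adding/removing one row changes
    # X^T X by at most clip_norm**2 in Frobenius norm.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xc = X * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Classic (epsilon, delta) Gaussian-mechanism calibration (valid for eps <= 1).
    sigma = clip_norm**2 * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    d = X.shape[1]
    upper = np.triu(rng.normal(0.0, sigma, size=(d, d)))
    noise = upper + np.triu(upper, 1).T  # mirror into a symmetric noise matrix
    return Xc.T @ Xc + noise
```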
no code implementations • NeurIPS 2018 • Jennifer A. Gillenwater, Alex Kulesza, Sergei Vassilvitskii, Zelda E. Mariet
In this paper we advocate an alternative framework for applying DPPs to recommender systems.
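A common computational core when applying DPPs to recommendation is greedy MAP inference for a diverse subset; a generic sketch is below. This is the textbook heuristic, not the specific framework the paper advocates.

```python
import numpy as np

def greedy_dpp_map(L, k):
    # Greedily add the item that maximizes log det of the PSD kernel L
    # restricted to the selected set; a standard diverse-selection heuristic.
    selected = []
    for _ in range(k):
        best, best_val = None, -np.inf
        for i in range(L.shape[0]):
            if i in selected:
                continue
            S = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(S, S)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        if best is None:
            break  # no item keeps the submatrix positive definite
        selected.append(best)
    return selected
```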
2 code implementations • NeurIPS 2017 • Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Sergei Vassilvitskii
We show that any fair clustering problem can be decomposed into first finding good fairlets, and then using existing machinery for traditional clustering algorithms.
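A toy sketch of that decomposition for the perfectly balanced (1, 1) case: pair each red point with a blue point (the paper uses a min-cost matching; the greedy nearest-neighbor pairing here is a simplification), then hand one representative per fairlet to any vanilla clustering routine, so every resulting cluster inherits balanced fairlets.

```python
import numpy as np

def fairlets_11(red, blue):
    # Assumes len(red) == len(blue); each fairlet is one red-blue pair.
    unmatched = list(range(len(blue)))
    fairlets, reps = [], []
    for i, r in enumerate(red):
        j = unmatched.pop(int(np.argmin(
            [np.linalg.norm(r - blue[b]) for b in unmatched])))
        fairlets.append((i, j))
        reps.append((r + blue[j]) / 2)  # fairlet representative: the midpoint
    return fairlets, np.stack(reps)

# Cluster `reps` with any standard k-median/k-means routine and assign each
# fairlet wholly to its representative's cluster; balance is preserved.
```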
no code implementations • ICML 2018 • Thodoris Lykouris, Sergei Vassilvitskii
Traditional online algorithms encapsulate decision making under uncertainty, and give ways to hedge against all possible future events, while guaranteeing a nearly optimal solution as compared to an offline optimum.
no code implementations • 14 Feb 2018 • Andrés Muñoz Medina, Sergei Vassilvitskii, Dong Yin
The rollout of a new version of a feature in modern applications is a manual, multi-stage process: the feature is released to ever-larger groups of users while its performance is carefully monitored.
no code implementations • ICML 2017 • Silvio Lattanzi, Sergei Vassilvitskii
The study of online algorithms and competitive analysis provides a solid foundation for studying the quality of irrevocable decision making when the data arrives in an online manner.
no code implementations • NeurIPS 2017 • Andrés Muñoz Medina, Sergei Vassilvitskii
In the context of advertising auctions, finding good reserve prices is a notoriously challenging learning problem.
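The underlying empirical objective, in its simplest posted-price form, is easy to state; a naive ERM sketch follows. The paper's point is precisely that this discontinuous objective is hard to learn well from samples, so treat this as the problem statement rather than a solution.

```python
def best_empirical_reserve(bids):
    # Empirical revenue of reserve r against sampled bids is r * #{b >= r};
    # a maximizer always lies at one of the observed bid values.
    return max(set(bids), key=lambda r: r * sum(b >= r for b in bids))
```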
no code implementations • NeurIPS 2017 • Eric Balkanski, Umar Syed, Sergei Vassilvitskii
We first show that when cost functions come from the family of submodular functions with bounded curvature, $\kappa$, the Shapley value can be approximated from samples up to a $\sqrt{1 - \kappa}$ factor, and that the bound is tight.
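For contrast with the paper's sample-complexity results, the standard Monte Carlo estimator of the Shapley value (averaging marginal contributions over random permutations) is sketched below; the `cost` interface is an assumption.

```python
import random

def shapley_from_samples(players, cost, num_samples, seed=0):
    # cost: maps a frozenset of players to a real; cost(frozenset()) is defined.
    rng, est = random.Random(seed), {p: 0.0 for p in players}
    for _ in range(num_samples):
        order = list(players)
        rng.shuffle(order)
        coalition, prev = set(), cost(frozenset())
        for p in order:
            coalition.add(p)
            cur = cost(frozenset(coalition))
            est[p] += cur - prev  # p's marginal contribution in this order
            prev = cur
    return {p: v / num_samples for p, v in est.items()}
```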
no code implementations • NeurIPS 2016 • Rishi Gupta, Ravi Kumar, Sergei Vassilvitskii
We study the problem of reconstructing a mixture of Markov chains from the trajectories generated by random walks through the state space.
3 code implementations • 29 Mar 2012 • Bahman Bahmani, Benjamin Moseley, Andrea Vattani, Ravi Kumar, Sergei Vassilvitskii
The recently proposed k-means++ initialization algorithm achieves this, obtaining an initial set of centers that is provably close to the optimum solution.
Databases
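The sequential $D^2$ seeding that this work parallelizes is short enough to sketch; k-means|| replaces the inherently sequential loop below with a few rounds of oversampling, which this sketch does not show.

```python
import numpy as np

def kmeans_pp_seed(X, k, rng=None):
    # D^2 sampling: each next center is drawn with probability proportional
    # to its squared distance from the nearest center chosen so far.
    rng = rng or np.random.default_rng()
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.stack(centers)
```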
no code implementations • 16 Mar 2012 • Vijay Bharadwaj, Peiji Chen, Wenjing Ma, Chandrashekhar Nagarajan, John Tomlin, Sergei Vassilvitskii, Erik Vee, Jian Yang
Motivated by the problem of optimizing allocation in guaranteed display advertising, we develop an efficient, lightweight method of generating a compact {\em allocation plan} that can be used to guide ad server decisions.
Data Structures and Algorithms
no code implementations • 16 Mar 2012 • Peiji Chen, Wenjing Ma, Srinath Mandalapu, Chandrashekhar Nagarajan, Jayavel Shanmugasundaram, Sergei Vassilvitskii, Erik Vee, Manfai Yu, Jason Zien
A large fraction of online display advertising is sold via guaranteed contracts: a publisher guarantees to the advertiser a certain number of user visits satisfying the targeting predicates of the contract.
Data Structures and Algorithms