Search Results for author: Mehryar Mohri

Found 85 papers, 8 papers with code

FedBoost: A Communication-Efficient Algorithm for Federated Learning

no code implementations ICML 2020 Jenny Hamer, Mehryar Mohri, Ananda Theertha Suresh

We provide communication-efficient ensemble algorithms for federated learning, where per-round communication cost is independent of the size of the ensemble.

Density Estimation Federated Learning +2

Online Learning with Dependent Stochastic Feedback Graphs

no code implementations ICML 2020 Corinna Cortes, Giulia Desalvo, Claudio Gentile, Mehryar Mohri, Ningshan Zhang

A general framework for online learning with partial information is one where feedback graphs specify which losses can be observed by the learner.

online learning

Strategizing against Learners in Bayesian Games

no code implementations 17 May 2022 Yishay Mansour, Mehryar Mohri, Jon Schneider, Balasubramanian Sivan

We study repeated two-player games where one of the players, the learner, employs a no-regret learning strategy, while the other, the optimizer, is a rational utility maximizer.

$\mathscr{H}$-Consistency Estimation Error of Surrogate Loss Minimizers

no code implementations 16 May 2022 Pranjal Awasthi, Anqi Mao, Mehryar Mohri, Yutao Zhong

We also show that previous excess error bounds can be recovered as special cases of our general results.

Differentially Private Learning with Margin Guarantees

no code implementations 21 Apr 2022 Raef Bassily, Mehryar Mohri, Ananda Theertha Suresh

For the family of linear hypotheses, we give a pure DP learning algorithm that benefits from relative deviation margin guarantees, as well as an efficient DP learning algorithm with margin guarantees.

Model Selection

On the Existence of the Adversarial Bayes Classifier (Extended Version)

no code implementations 3 Dec 2021 Pranjal Awasthi, Natalie S. Frank, Mehryar Mohri

In this work, we study a fundamental question regarding Bayes optimality for adversarial robustness.

Adversarial Robustness

A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning

no code implementations NeurIPS 2021 Christoph Dann, Mehryar Mohri, Tong Zhang, Julian Zimmert

Thompson Sampling is one of the most effective methods for contextual bandits and has been generalized to posterior sampling for certain MDP settings.

Multi-Armed Bandits reinforcement-learning

Boosting with Multiple Sources

no code implementations NeurIPS 2021 Corinna Cortes, Mehryar Mohri, Dmitry Storcheus, Ananda Theertha Suresh

We study the problem of learning accurate ensemble predictors, in particular boosting, in the presence of multiple source domains.

Federated Learning

Breaking the centralized barrier for cross-device federated learning

no code implementations NeurIPS 2021 Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian U. Stich, Ananda Theertha Suresh

Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients which gives rise to the client drift phenomenon.

Federated Learning

Adapting to Misspecification in Contextual Bandits

no code implementations NeurIPS 2020 Dylan J. Foster, Claudio Gentile, Mehryar Mohri, Julian Zimmert

Given access to an online oracle for square loss regression, our algorithm attains optimal regret and, in particular, optimal dependence on the misspecification level, with no prior knowledge.

Multi-Armed Bandits

Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning

no code implementations NeurIPS 2021 Christoph Dann, Teodor V. Marinov, Mehryar Mohri, Julian Zimmert

Our results show that optimistic algorithms cannot achieve the information-theoretic lower bounds even in deterministic MDPs unless there is a unique optimal policy.

reinforcement-learning

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

no code implementations NeurIPS 2021 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

In this work, we consider the more realistic setting of agnostic RL with rich observation spaces and a fixed class of policies $\Pi$ that may not contain any near-optimal policy.

reinforcement-learning

A Finer Calibration Analysis for Adversarial Robustness

no code implementations 4 May 2021 Pranjal Awasthi, Anqi Mao, Mehryar Mohri, Yutao Zhong

Moreover, our calibration results, combined with the previous study of consistency by Awasthi et al. (2021), also lead to more general $H$-consistency results covering common hypothesis sets.

Adversarial Robustness Robust classification

Calibration and Consistency of Adversarial Surrogate Losses

no code implementations NeurIPS 2021 Pranjal Awasthi, Natalie Frank, Anqi Mao, Mehryar Mohri, Yutao Zhong

We then give a characterization of H-calibration and prove that some surrogate losses are indeed H-calibrated for the adversarial loss, with these hypothesis sets.

Adversarial Robustness

Communication-Efficient Agnostic Federated Averaging

no code implementations 6 Apr 2021 Jae Ro, Mingqing Chen, Rajiv Mathews, Mehryar Mohri, Ananda Theertha Suresh

We propose a communication-efficient distributed algorithm called Agnostic Federated Averaging (or AgnosticFedAvg) to minimize the domain-agnostic objective proposed in Mohri et al. (2019), which is amenable to other private mechanisms such as secure aggregation.

Federated Learning Language Modelling

Learning with User-Level Privacy

no code implementations NeurIPS 2021 Daniel Levy, Ziteng Sun, Kareem Amin, Satyen Kale, Alex Kulesza, Mehryar Mohri, Ananda Theertha Suresh

We show that for high-dimensional mean estimation, empirical risk minimization with smooth losses, stochastic convex optimization, and learning hypothesis classes with finite metric entropy, the privacy cost decreases as $O(1/\sqrt{m})$ as users provide more samples.

PAC-Bayes Learning Bounds for Sample-Dependent Priors

no code implementations NeurIPS 2020 Pranjal Awasthi, Satyen Kale, Stefani Karp, Mehryar Mohri

We present a series of new PAC-Bayes learning guarantees for randomized algorithms with sample-dependent priors.

Agnostic Learning with Multiple Objectives

no code implementations NeurIPS 2020 Corinna Cortes, Mehryar Mohri, Javier Gonzalvo, Dmitry Storcheus

We further implement the algorithm in a popular symbolic gradient computation framework and empirically demonstrate on a number of datasets the benefits of the ALMO framework versus learning with a fixed mixture weights distribution.

Beyond Individual and Group Fairness

no code implementations 21 Aug 2020 Pranjal Awasthi, Corinna Cortes, Yishay Mansour, Mehryar Mohri

In the adversarial setting, we design efficient algorithms with competitive ratio guarantees.

Fairness

Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning

1 code implementation 8 Aug 2020 Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh

Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients which gives rise to the client drift phenomenon.

Federated Learning

On the Rademacher Complexity of Linear Hypothesis Sets

no code implementations 21 Jul 2020 Pranjal Awasthi, Natalie Frank, Mehryar Mohri

Linear predictors form a rich class of hypotheses used in a variety of learning algorithms.

A Theory of Multiple-Source Adaptation with Limited Target Labeled Data

no code implementations 19 Jul 2020 Yishay Mansour, Mehryar Mohri, Jae Ro, Ananda Theertha Suresh, Ke Wu

We present a theoretical and algorithmic study of the multiple-source domain adaptation problem in the common scenario where the learner has access only to a limited amount of labeled target data, but has at its disposal a large amount of labeled data from multiple source domains.

Domain Adaptation Model Selection

Relative Deviation Margin Bounds

no code implementations 26 Jun 2020 Corinna Cortes, Mehryar Mohri, Ananda Theertha Suresh

We present a series of new and more favorable margin-based learning guarantees that depend on the empirical margin loss of a predictor.

Generalization Bounds

Corralling Stochastic Bandit Algorithms

no code implementations 16 Jun 2020 Raman Arora, Teodor V. Marinov, Mehryar Mohri

We study the problem of corralling stochastic bandit algorithms, that is, combining multiple bandit algorithms designed for a stochastic environment, with the goal of devising a corralling algorithm that performs almost as well as the best base algorithm.

Reinforcement Learning with Feedback Graphs

no code implementations NeurIPS 2020 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

We study episodic reinforcement learning in Markov decision processes when the agent receives additional feedback per step in the form of several transition observations.

reinforcement-learning

Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks

no code implementations ICML 2020 Pranjal Awasthi, Natalie Frank, Mehryar Mohri

We give upper and lower bounds for the adversarial empirical Rademacher complexity of linear hypotheses with adversarial perturbations measured in $l_r$-norm for an arbitrary $r \geq 1$.

Adversarial Robustness

Adaptive Region-Based Active Learning

no code implementations ICML 2020 Corinna Cortes, Giulia Desalvo, Claudio Gentile, Mehryar Mohri, Ningshan Zhang

We present a new active learning algorithm that adaptively partitions the input space into a finite number of regions, and subsequently seeks a distinct predictor for each region, both phases actively requesting labels.

Active Learning

Regularized Gradient Boosting

no code implementations NeurIPS 2019 Corinna Cortes, Mehryar Mohri, Dmitry Storcheus

We fill this gap by deriving data-dependent learning guarantees for Gradient Boosting used with regularization, expressed in terms of the Rademacher complexities of the constrained families of base predictors.

Generalization Bounds

Learning GANs and Ensembles Using Discrepancy

no code implementations NeurIPS 2019 Ben Adlam, Corinna Cortes, Mehryar Mohri, Ningshan Zhang

Generative adversarial networks (GANs) generate data based on minimizing a divergence between two distributions.

Domain Adaptation

SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

4 code implementations ICML 2020 Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh

We obtain tight convergence rates for FedAvg and prove that it suffers from 'client-drift' when the data is heterogeneous (non-iid), resulting in unstable and slow convergence.

Distributed Optimization Federated Learning

Bandits with Feedback Graphs and Switching Costs

no code implementations NeurIPS 2019 Raman Arora, Teodor V. Marinov, Mehryar Mohri

We give a new algorithm whose regret guarantee depends only on the domination number of the graph.

AdaNet: A Scalable and Flexible Framework for Automatically Learning Ensembles

1 code implementation 30 Apr 2019 Charles Weill, Javier Gonzalvo, Vitaly Kuznetsov, Scott Yang, Scott Yak, Hanna Mazzawi, Eugen Hotaj, Ghassen Jerfel, Vladimir Macko, Ben Adlam, Mehryar Mohri, Corinna Cortes

AdaNet is a lightweight TensorFlow-based (Abadi et al., 2015) framework for automatically learning high-quality ensembles with minimal expert intervention.

Neural Architecture Search

Hypothesis Set Stability and Generalization

no code implementations NeurIPS 2019 Dylan J. Foster, Spencer Greenberg, Satyen Kale, Haipeng Luo, Mehryar Mohri, Karthik Sridharan

Our main result is a generalization bound for data-dependent hypothesis sets expressed in terms of a notion of hypothesis set stability and a notion of Rademacher complexity for data-dependent hypothesis sets that we introduce.

Agnostic Federated Learning

6 code implementations 1 Feb 2019 Mehryar Mohri, Gary Sivek, Ananda Theertha Suresh

A key learning scenario in large-scale applications is that of federated learning, where a centralized model is trained based on data originating from a large number of clients.

Domain Adaptation Fairness +2

Efficient Gradient Computation for Structured Output Learning with Rational and Tropical Losses

no code implementations NeurIPS 2018 Corinna Cortes, Vitaly Kuznetsov, Mehryar Mohri, Dmitry Storcheus, Scott Yang

In this paper, we design efficient gradient computation algorithms for two broad families of structured prediction loss functions: rational and tropical losses.

Structured Prediction

Policy Regret in Repeated Games

no code implementations NeurIPS 2018 Raman Arora, Michael Dinitz, Teodor V. Marinov, Mehryar Mohri

We revisit the notion of policy regret and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play that achieves a favorable regret with respect to one definition must do poorly with respect to the other.

online learning

Algorithms and Theory for Multiple-Source Adaptation

no code implementations NeurIPS 2018 Judy Hoffman, Mehryar Mohri, Ningshan Zhang

This work includes a number of novel contributions for the multiple-source adaptation problem.

Online Non-Additive Path Learning under Full and Partial Information

no code implementations 18 Apr 2018 Corinna Cortes, Vitaly Kuznetsov, Mehryar Mohri, Holakou Rahmanian, Manfred K. Warmuth

We study the problem of online path learning with non-additive gains, which is a central problem appearing in several applications, including ensemble structured prediction.

Structured Prediction

Logistic Regression: The Importance of Being Improper

no code implementations 25 Mar 2018 Dylan J. Foster, Satyen Kale, Haipeng Luo, Mehryar Mohri, Karthik Sridharan

Starting with the simple observation that the logistic loss is $1$-mixable, we design a new efficient improper learning algorithm for online logistic regression that circumvents the aforementioned lower bound with a regret bound exhibiting a doubly-exponential improvement in dependence on the predictor norm.

Theory and Algorithms for Forecasting Time Series

no code implementations 15 Mar 2018 Vitaly Kuznetsov, Mehryar Mohri

We present data-dependent learning bounds for the general scenario of non-stationary non-mixing stochastic processes.

Time Series Time Series Forecasting

Parameter-free online learning via model selection

no code implementations NeurIPS 2017 Dylan J. Foster, Satyen Kale, Mehryar Mohri, Karthik Sridharan

We introduce an efficient algorithmic framework for model selection in online learning, also known as parameter-free online learning.

Model Selection online learning

Online Learning with Transductive Regret

no code implementations NeurIPS 2017 Mehryar Mohri, Scott Yang

A by-product of our study is an algorithm for swap regret, which, under mild assumptions, is more efficient than existing ones, and a substantially more efficient algorithm for time selection swap regret.

online learning

Discriminative State Space Models

no code implementations NeurIPS 2017 Vitaly Kuznetsov, Mehryar Mohri

In this paper, we introduce and analyze Discriminative State-Space Models for forecasting non-stationary time series.

Time Series

Multiple-Source Adaptation for Regression Problems

no code implementations 14 Nov 2017 Judy Hoffman, Mehryar Mohri, Ningshan Zhang

We present a detailed theoretical analysis of the problem of multiple-source adaptation in the general stochastic scenario, extending known results that assume a single target labeling function.

Sentiment Analysis

Discrepancy-Based Algorithms for Non-Stationary Rested Bandits

no code implementations 29 Oct 2017 Corinna Cortes, Giulia Desalvo, Vitaly Kuznetsov, Mehryar Mohri, Scott Yang

We show that the notion of discrepancy can be used to design very general algorithms and a unified framework for the analysis of multi-armed rested bandit problems with non-stationary rewards.

Online Learning with Automata-based Expert Sequences

no code implementations 29 Apr 2017 Mehryar Mohri, Scott Yang

We consider a general framework of online learning with expert advice where regret is defined with respect to sequences of experts accepted by a weighted automaton.

online learning

Online Learning with Abstention

no code implementations ICML 2018 Corinna Cortes, Giulia Desalvo, Claudio Gentile, Mehryar Mohri, Scott Yang

In the stochastic setting, we first point out a bias problem that limits the straightforward extension of algorithms such as UCB-N to time-varying feedback graphs, as needed in this context.

online learning

Boosting with Abstention

no code implementations NeurIPS 2016 Corinna Cortes, Giulia Desalvo, Mehryar Mohri

We present a new boosting algorithm for the key scenario of binary classification with abstention where the algorithm can abstain from predicting the label of a point, at the price of a fixed cost.

Optimistic Bandit Convex Optimization

no code implementations NeurIPS 2016 Scott Yang, Mehryar Mohri

We introduce the general and powerful scheme of predicting information re-use in optimization algorithms.

Generalization Bounds for Weighted Automata

no code implementations 25 Oct 2016 Borja Balle, Mehryar Mohri

We present new data-dependent generalization guarantees for learning weighted automata expressed in terms of the Rademacher complexity of these families.

Generalization Bounds

Structured Prediction Theory Based on Factor Graph Complexity

no code implementations NeurIPS 2016 Corinna Cortes, Mehryar Mohri, Vitaly Kuznetsov, Scott Yang

We give new data-dependent margin guarantees for structured prediction for a very wide family of loss functions and a general family of hypotheses, with an arbitrary factor graph decomposition.

Structured Prediction

Revenue Optimization against Strategic Buyers

no code implementations NeurIPS 2015 Mehryar Mohri, Andres Munoz

We present a revenue optimization algorithm for posted-price auctions when facing a buyer with random valuations who seeks to optimize his $\gamma$-discounted surplus.

Learning Theory and Algorithms for Forecasting Non-stationary Time Series

no code implementations NeurIPS 2015 Vitaly Kuznetsov, Mehryar Mohri

We present data-dependent learning bounds for the general scenario of non-stationary non-mixing stochastic processes.

Learning Theory Time Series +1

Foundations of Coupled Nonlinear Dimensionality Reduction

no code implementations 29 Sep 2015 Mehryar Mohri, Afshin Rostamizadeh, Dmitry Storcheus

The generalization error bound is based on a careful analysis of the empirical Rademacher complexity of the relevant hypothesis set.

Generalization Bounds Supervised dimensionality reduction

Accelerating Optimization via Adaptive Prediction

no code implementations 18 Sep 2015 Mehryar Mohri, Scott Yang

We present a powerful general framework for designing data-dependent optimization algorithms, building upon and unifying recent techniques in adaptive regularization, optimistic gradient predictions, and problem-dependent randomization.

Voted Kernel Regularization

no code implementations 14 Sep 2015 Corinna Cortes, Prasoon Goyal, Vitaly Kuznetsov, Mehryar Mohri

This paper presents an algorithm, Voted Kernel Regularization, that provides the flexibility of using potentially very complex kernel functions such as predictors based on much higher-degree polynomial kernels, while benefiting from strong learning guarantees.

General Classification

Non-parametric Revenue Optimization for Generalized Second Price Auctions

no code implementations 8 Jun 2015 Mehryar Mohri, Andres Munoz Medina

To our knowledge, this is the first attempt to apply learning algorithms to the problem of reserve price optimization in GSP auctions.

Density Estimation

Optimal Regret Minimization in Posted-Price Auctions with Strategic Buyers

no code implementations NeurIPS 2014 Mehryar Mohri, Andres Munoz

We study revenue optimization learning algorithms for posted-price auctions with strategic buyers.

Multi-Class Deep Boosting

no code implementations NeurIPS 2014 Vitaly Kuznetsov, Mehryar Mohri, Umar Syed

We give new data-dependent learning bounds for convex ensembles in the multi-class classification setting expressed in terms of the Rademacher complexities of the sub-families composing the base classifier set, and the mixture weight assigned to each sub-family.

Ensemble Learning General Classification +1

Conditional Swap Regret and Conditional Correlated Equilibrium

no code implementations NeurIPS 2014 Mehryar Mohri, Scott Yang

We introduce a natural extension of the notion of swap regret, conditional swap regret, that allows for action modifications conditioned on the player’s action history.

Revenue Optimization in Posted-Price Auctions with Strategic Buyers

no code implementations 23 Nov 2014 Mehryar Mohri, Andres Muñoz Medina

We study revenue optimization learning algorithms for posted-price auctions with strategic buyers.

Adaptation Algorithm and Theory Based on Generalized Discrepancy

no code implementations 7 May 2014 Corinna Cortes, Mehryar Mohri, Andres Muñoz Medina

We present a new algorithm for domain adaptation improving upon a discrepancy minimization algorithm previously shown to outperform a number of algorithms for this task.

Domain Adaptation

Relative Deviation Learning Bounds and Generalization with Unbounded Loss Functions

no code implementations 22 Oct 2013 Corinna Cortes, Spencer Greenberg, Mehryar Mohri

We present an extensive analysis of relative deviation bounds, including detailed proofs of two-sided inequalities and their implications.

Generalization Bounds

Learning Theory and Algorithms for Revenue Optimization in Second-Price Auctions with Reserve

no code implementations 21 Oct 2013 Mehryar Mohri, Andres Muñoz Medina

Second-price auctions with reserve play a critical role for modern search engines and popular online sites since the revenue of these companies often directly depends on the outcome of such auctions.

Learning Theory

Tight Lower Bound on the Probability of a Binomial Exceeding its Expectation

no code implementations 6 Jun 2013 Spencer Greenberg, Mehryar Mohri

We give the proof of a tight lower bound on the probability that a binomial random variable exceeds its expected value.

Generalization Bounds Learning Theory

Perceptron Mistake Bounds

no code implementations 1 May 2013 Mehryar Mohri, Afshin Rostamizadeh

We present a brief survey of existing mistake bounds and introduce novel bounds for the Perceptron or the kernel Perceptron algorithm.

Accuracy at the Top

no code implementations NeurIPS 2012 Stephen Boyd, Corinna Cortes, Mehryar Mohri, Ana Radovanovic

We introduce a new notion of classification accuracy based on the top $\tau$-quantile values of a scoring function, a relevant criterion in a number of problems arising for search engines.

General Classification

Algorithms for Learning Kernels Based on Centered Alignment

no code implementations 2 Mar 2012 Corinna Cortes, Mehryar Mohri, Afshin Rostamizadeh

Our theoretical results include a novel concentration bound for centered alignment between kernel matrices, the proof of the existence of effective predictors for kernels with high alignment, both for classification and for regression, and the proof of stability-based generalization bounds for a broad family of algorithms for learning kernels based on centered alignment.

General Classification Generalization Bounds

Learning Bounds for Importance Weighting

no code implementations NeurIPS 2010 Corinna Cortes, Yishay Mansour, Mehryar Mohri

This paper presents an analysis of importance weighting for learning from finite samples and gives a series of theoretical and algorithmic results.

Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models

no code implementations NeurIPS 2009 Ryan McDonald, Mehryar Mohri, Nathan Silberman, Dan Walker, Gideon S. Mann

Training conditional maximum entropy models on massive data requires significant time and computational resources.

Polynomial Semantic Indexing

no code implementations NeurIPS 2009 Bing Bai, Jason Weston, David Grangier, Ronan Collobert, Kunihiko Sadamasa, Yanjun Qi, Corinna Cortes, Mehryar Mohri

We present a class of nonlinear (polynomial) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score.

Ensemble Nystrom Method

no code implementations NeurIPS 2009 Sanjiv Kumar, Mehryar Mohri, Ameet Talwalkar

A crucial technique for scaling kernel methods to very large data sets reaching or exceeding millions of instances is based on low-rank approximation of kernel matrices.

Learning Non-Linear Combinations of Kernels

no code implementations NeurIPS 2009 Corinna Cortes, Mehryar Mohri, Afshin Rostamizadeh

This paper studies the general problem of learning kernels based on a polynomial combination of base kernels.

Domain Adaptation: Learning Bounds and Algorithms

no code implementations 19 Feb 2009 Yishay Mansour, Mehryar Mohri, Afshin Rostamizadeh

This motivates our analysis of the problem of minimizing the empirical discrepancy for various loss functions for which we also give novel algorithms.

Domain Adaptation Generalization Bounds

Rademacher Complexity Bounds for Non-I.I.D. Processes

no code implementations NeurIPS 2008 Mehryar Mohri, Afshin Rostamizadeh

In particular, they are data-dependent and measure the complexity of a class of hypotheses based on the training sample.

Generalization Bounds

Domain Adaptation with Multiple Sources

no code implementations NeurIPS 2008 Yishay Mansour, Mehryar Mohri, Afshin Rostamizadeh

The problem consists of combining these hypotheses to derive a hypothesis with small error with respect to the target domain.

Domain Adaptation

Stability Bounds for Non-i.i.d. Processes

no code implementations NeurIPS 2007 Mehryar Mohri, Afshin Rostamizadeh

We also illustrate their application in the case of several general classes of learning algorithms, including Support Vector Regression and Kernel Ridge Regression.

Generalization Bounds Learning Theory +2
