no code implementations • 24 Jun 2022 • Aldo Pacchiano, Ofir Nachum, Nilesh Tripuraneni, Peter Bartlett

In contrast with previous work that has studied multitask RL in other function approximation models, we show that in the presence of a bilinear optimization oracle and finite state-action spaces, there exists a computationally efficient algorithm for multitask MatrixRL via a reduction to quadratic programming.

no code implementations • 16 Jun 2022 • Peter Bartlett, Piotr Indyk, Tal Wagner

Our techniques are general, and provide generalization bounds for many other recently proposed data-driven algorithms in numerical linear algebra, covering both sketching-based and multigrid-based methods.

no code implementations • 8 Mar 2022 • Juan C. Perdomo, Akshay Krishnamurthy, Peter Bartlett, Sham Kakade

Offline policy evaluation is a fundamental statistical problem in reinforcement learning that involves estimating the value function of some decision-making policy given data collected by a potentially different policy.

no code implementations • 8 Nov 2021 • Aldo Pacchiano, Peter Bartlett, Michael I. Jordan

We study the problem of information sharing and cooperation in Multi-Player Multi-Armed bandits.

no code implementations • 21 May 2021 • Jeffrey Chan, Aldo Pacchiano, Nilesh Tripuraneni, Yun S. Song, Peter Bartlett, Michael I. Jordan

Standard approaches to decision-making under uncertainty focus on sequential exploration of the space of decisions.

no code implementations • 19 Mar 2021 • Juan C. Perdomo, Max Simchowitz, Alekh Agarwal, Peter Bartlett

We study the problem of adaptive control of the linear quadratic regulator for systems in very high, or even infinite dimension.

no code implementations • NeurIPS 2021 • Aldo Pacchiano, Jonathan Lee, Peter Bartlett, Ofir Nachum

Since its introduction a decade ago, relative entropy policy search (REPS) has demonstrated successful policy learning on a number of simulated and real-world robotic domains, not to mention providing algorithmic components used by many recently proposed reinforcement learning (RL) algorithms.

no code implementations • 24 Dec 2020 • Aldo Pacchiano, Christoph Dann, Claudio Gentile, Peter Bartlett

Finally, unlike recent efforts in model selection for linear stochastic bandits, our approach is versatile enough to also cover cases where the context information is generated by an adversarial environment, rather than a stochastic one.

no code implementations • ICML 2020 • Jonathan N. Lee, Aldo Pacchiano, Peter Bartlett, Michael I. Jordan

Maximum a posteriori (MAP) inference in discrete-valued Markov random fields is a fundamental problem in machine learning that involves identifying the most likely configuration of random variables given a distribution.

no code implementations • 17 Jun 2020 • Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, Heinrich Jiang

We propose an upper-confidence bound algorithm for this problem, called optimistic pessimistic linear bandit (OPLB), and prove an $\widetilde{\mathcal{O}}(\frac{d\sqrt{T}}{\tau-c_0})$ bound on its $T$-round regret, where the denominator is the difference between the constraint threshold and the cost of a known feasible action.

no code implementations • ICLR 2020 • Raman Arora, Peter Bartlett, Poorya Mianjy, Nathan Srebro

In deep learning, we show that the data-dependent regularizer due to dropout directly controls the Rademacher complexity of the underlying class of deep neural networks.

no code implementations • 4 Feb 2019 • Yi-An Ma, Niladri Chatterji, Xiang Cheng, Nicolas Flammarion, Peter Bartlett, Michael I. Jordan

We formulate gradient-based Markov chain Monte Carlo (MCMC) sampling as optimization on the space of probability measures, with Kullback-Leibler (KL) divergence as the objective functional.

1 code implementation • 29 Oct 2018 • Dong Yin, Kannan Ramchandran, Peter Bartlett

For binary linear classifiers, we prove tight bounds for the adversarial Rademacher complexity, and show that it is never smaller than its natural counterpart and has an unavoidable dimension dependence unless the weight vector has bounded $\ell_1$ norm.

no code implementations • ICML 2018 • Peter Bartlett, Dave Helmbold, Philip Long

We provide polynomial bounds on the number of iterations for gradient descent to approximate the least squares matrix $\Phi$, in the case where the initial hypothesis $\Theta_1 = ... = \Theta_L = I$ has excess loss bounded by a small enough constant.

no code implementations • 14 Jun 2018 • Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett

In this setting, the Byzantine machines may create fake local minima near a saddle point that is far away from any true local minimum, even when robust gradient estimators are used.

1 code implementation • ICML 2018 • Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett

In particular, these algorithms are shown to achieve order-optimal statistical error rates for strongly convex losses.

no code implementations • NeurIPS 2017 • Peter Bartlett, Dylan J. Foster, Matus Telgarsky

This paper presents a margin-based multiclass generalization bound for neural networks that scales with their margin-normalized "spectral complexity": their Lipschitz constant, meaning the product of the spectral norms of the weight matrices, times a certain correction factor.
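The Lipschitz term mentioned in this abstract, the product of the spectral norms of the weight matrices, can be illustrated with a minimal sketch. It covers only that product and not the correction factor, which the abstract does not specify. The function names `spectral_norm` and `lipschitz_upper_bound` are hypothetical, and power iteration is an assumed choice of method for estimating each spectral norm; it is not taken from the paper.

```python
import math

def spectral_norm(W, iters=200):
    """Estimate the largest singular value of W (a list of rows)
    by power iteration on W^T W. Illustrative helper, not from the paper."""
    n_rows, n_cols = len(W), len(W[0])
    v = [1.0] * n_cols  # assumed start vector; must not be orthogonal to the top singular vector
    for _ in range(iters):
        # u = W v
        u = [sum(W[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        # v = W^T u
        v = [sum(W[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in v]
    # ||W v|| for the (approximately) top right singular vector v
    u = [sum(W[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    return math.sqrt(sum(x * x for x in u))

def lipschitz_upper_bound(weights):
    """Product of the spectral norms of the layer weight matrices."""
    bound = 1.0
    for W in weights:
        bound *= spectral_norm(W)
    return bound
```

For example, `lipschitz_upper_bound([[[2.0, 0.0], [0.0, 1.0]], [[3.0, 0.0], [0.0, 0.5]]])` multiplies the spectral norms 2 and 3 of the two diagonal matrices.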

no code implementations • 18 Jun 2017 • Dong Yin, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, Peter Bartlett

It has been experimentally observed that distributed implementations of mini-batch stochastic gradient descent (SGD) algorithms exhibit speedup saturation and decaying generalization ability beyond a particular batch-size.

no code implementations • 25 May 2017 • Xiang Cheng, Peter Bartlett

Langevin diffusion is a commonly used tool for sampling from a given distribution.
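As a rough illustration of how Langevin diffusion is used for sampling, the sketch below applies a standard Euler discretization (often called the unadjusted Langevin algorithm) to a one-dimensional target. This is a generic textbook scheme and not necessarily the method analyzed in the paper. The names `ula_samples` and `grad_log_std_normal`, the step size, and the standard normal target are all assumptions made for the example.

```python
import math
import random

def ula_samples(grad_log_density, x0, step, n_steps, seed=0):
    """Unadjusted Langevin iteration in one dimension:
    x <- x + step * grad log p(x) + sqrt(2 * step) * N(0, 1)."""
    rng = random.Random(seed)  # seeded for reproducibility
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * grad_log_density(x) + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples

def grad_log_std_normal(x):
    """Gradient of log p(x) for p = N(0, 1), up to an additive constant in log p."""
    return -x
```

Calling `ula_samples(grad_log_std_normal, 0.0, 0.1, 20000)` yields a chain whose empirical mean is close to 0 and whose variance is close to 1. The discretization introduces a small bias that shrinks as the step size decreases.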

no code implementations • 19 May 2013 • Peter Bartlett, Peter Grunwald, Peter Harremoes, Fares Hedayati, Wojciech Kotlowski

Keywords: SNML Exchangeability, Exponential Family, Online Learning, Logarithmic Loss, Bayesian Strategy, Jeffreys Prior, Fisher Information

no code implementations • 12 Apr 2013 • Yevgeny Seldin, Peter Bartlett, Koby Crammer

Advice-efficient prediction with expert advice (in analogy to label-efficient prediction) is a variant of the prediction with expert advice game in which, on each round, we are allowed to ask for the advice of only a limited number $M$ out of $N$ experts.
