Search Results for author: Peter Bartlett

Found 24 papers, 3 papers with code

Contextual Bandits with Stage-wise Constraints

no code implementations 15 Jan 2024 Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett

In the setting that the constraint is in expectation, we further specialize our results to multi-armed bandits and propose a computationally efficient algorithm for this setting with regret analysis.

Multi-Armed Bandits

Can a Transformer Represent a Kalman Filter?

no code implementations 12 Dec 2023 Gautam Goel, Peter Bartlett

We revisit the problem of Kalman Filtering in linear dynamical systems and show that Transformers can approximate the Kalman Filter in a strong sense.

Joint Representation Training in Sequential Tasks with Shared Structure

no code implementations 24 Jun 2022 Aldo Pacchiano, Ofir Nachum, Nilesh Tripuraneni, Peter Bartlett

In contrast with previous work that has studied multitask RL in other function approximation models, we show that in the presence of a bilinear optimization oracle and finite state-action spaces there exists a computationally efficient algorithm for multitask MatrixRL via a reduction to quadratic programming.

Multi-Armed Bandits Reinforcement Learning (RL)

Generalization Bounds for Data-Driven Numerical Linear Algebra

no code implementations 16 Jun 2022 Peter Bartlett, Piotr Indyk, Tal Wagner

Our techniques are general, and provide generalization bounds for many other recently proposed data-driven algorithms in numerical linear algebra, covering both sketching-based and multigrid-based methods.

Generalization Bounds PAC learning

A Complete Characterization of Linear Estimators for Offline Policy Evaluation

no code implementations 8 Mar 2022 Juan C. Perdomo, Akshay Krishnamurthy, Peter Bartlett, Sham Kakade

Offline policy evaluation is a fundamental statistical problem in reinforcement learning that involves estimating the value function of some decision-making policy given data collected by a potentially different policy.

Decision Making reinforcement-learning +1

Parallelizing Contextual Bandits

no code implementations 21 May 2021 Jeffrey Chan, Aldo Pacchiano, Nilesh Tripuraneni, Yun S. Song, Peter Bartlett, Michael I. Jordan

Standard approaches to decision-making under uncertainty focus on sequential exploration of the space of decisions.

Decision Making Decision Making Under Uncertainty +1

Towards a Dimension-Free Understanding of Adaptive Linear Control

no code implementations 19 Mar 2021 Juan C. Perdomo, Max Simchowitz, Alekh Agarwal, Peter Bartlett

We study the problem of adaptive control of the linear quadratic regulator for systems in very high, or even infinite dimension.

Near Optimal Policy Optimization via REPS

no code implementations NeurIPS 2021 Aldo Pacchiano, Jonathan Lee, Peter Bartlett, Ofir Nachum

Since its introduction a decade ago, relative entropy policy search (REPS) has demonstrated successful policy learning on a number of simulated and real-world robotic domains, not to mention providing algorithmic components used by many recently proposed reinforcement learning (RL) algorithms.

Reinforcement Learning (RL)

Regret Bound Balancing and Elimination for Model Selection in Bandits and RL

no code implementations 24 Dec 2020 Aldo Pacchiano, Christoph Dann, Claudio Gentile, Peter Bartlett

Finally, unlike recent efforts in model selection for linear stochastic bandits, our approach is versatile enough to also cover cases where the context information is generated by an adversarial environment, rather than a stochastic one.

Model Selection valid

Accelerated Message Passing for Entropy-Regularized MAP Inference

no code implementations ICML 2020 Jonathan N. Lee, Aldo Pacchiano, Peter Bartlett, Michael I. Jordan

Maximum a posteriori (MAP) inference in discrete-valued Markov random fields is a fundamental problem in machine learning that involves identifying the most likely configuration of random variables given a distribution.

Stochastic Bandits with Linear Constraints

no code implementations 17 Jun 2020 Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, Heinrich Jiang

We propose an upper-confidence bound algorithm for this problem, called optimistic pessimistic linear bandit (OPLB), and prove an $\widetilde{\mathcal{O}}(\frac{d\sqrt{T}}{\tau-c_0})$ bound on its $T$-round regret, where the denominator is the difference between the constraint threshold and the cost of a known feasible action.

Multi-Armed Bandits

Dropout: Explicit Forms and Capacity Control

no code implementations ICLR 2020 Raman Arora, Peter Bartlett, Poorya Mianjy, Nathan Srebro

In deep learning, we show that the data-dependent regularizer due to dropout directly controls the Rademacher complexity of the underlying class of deep neural networks.

BIG-bench Machine Learning Matrix Completion

Is There an Analog of Nesterov Acceleration for MCMC?

no code implementations 4 Feb 2019 Yi-An Ma, Niladri Chatterji, Xiang Cheng, Nicolas Flammarion, Peter Bartlett, Michael I. Jordan

We formulate gradient-based Markov chain Monte Carlo (MCMC) sampling as optimization on the space of probability measures, with Kullback-Leibler (KL) divergence as the objective functional.

Rademacher Complexity for Adversarially Robust Generalization

1 code implementation 29 Oct 2018 Dong Yin, Kannan Ramchandran, Peter Bartlett

For binary linear classifiers, we prove tight bounds for the adversarial Rademacher complexity, and show that the adversarial Rademacher complexity is never smaller than its natural counterpart, and it has an unavoidable dimension dependence, unless the weight vector has bounded $\ell_1$ norm.

BIG-bench Machine Learning Test

Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks

no code implementations ICML 2018 Peter Bartlett, Dave Helmbold, Philip Long

We provide polynomial bounds on the number of iterations for gradient descent to approximate the least squares matrix $\Phi$, in the case where the initial hypothesis $\Theta_1 = \cdots = \Theta_L = I$ has excess loss bounded by a small enough constant.

Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning

no code implementations 14 Jun 2018 Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett

In this setting, the Byzantine machines may create fake local minima near a saddle point that is far away from any true local minimum, even when robust gradient estimators are used.

Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates

1 code implementation ICML 2018 Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett

In particular, these algorithms are shown to achieve order-optimal statistical error rates for strongly convex losses.

Spectrally-normalized margin bounds for neural networks

1 code implementation NeurIPS 2017 Peter Bartlett, Dylan J. Foster, Matus Telgarsky

This paper presents a margin-based multiclass generalization bound for neural networks that scales with their margin-normalized "spectral complexity": their Lipschitz constant, meaning the product of the spectral norms of the weight matrices, times a certain correction factor.

Gradient Diversity: a Key Ingredient for Scalable Distributed Learning

no code implementations 18 Jun 2017 Dong Yin, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, Peter Bartlett

It has been experimentally observed that distributed implementations of mini-batch stochastic gradient descent (SGD) algorithms exhibit speedup saturation and decaying generalization ability beyond a particular batch-size.


Convergence of Langevin MCMC in KL-divergence

no code implementations 25 May 2017 Xiang Cheng, Peter Bartlett

Langevin diffusion is a commonly used tool for sampling from a given distribution.

Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families

no code implementations 19 May 2013 Peter Bartlett, Peter Grunwald, Peter Harremoes, Fares Hedayati, Wojciech Kotlowski

Keywords: SNML Exchangeability, Exponential Family, Online Learning, Logarithmic Loss, Bayesian Strategy, Jeffreys Prior, Fisher Information

Advice-Efficient Prediction with Expert Advice

no code implementations 12 Apr 2013 Yevgeny Seldin, Peter Bartlett, Koby Crammer

Advice-efficient prediction with expert advice (in analogy to label-efficient prediction) is a variant of the prediction with expert advice game in which, on each round, we are allowed to ask for the advice of only a limited number $M$ out of $N$ experts.
