Search Results for author: Noah Golowich

Found 28 papers, 0 papers with code

Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning

no code implementations · 4 Apr 2024 · Noah Golowich, Ankur Moitra, Dhruv Rohatgi

We also show that there is no computationally efficient algorithm for reward-directed RL in block MDPs, even when given access to an oracle for this regression problem.

Tasks: regression, Reinforcement Learning (RL)

From External to Swap Regret 2.0: An Efficient Reduction and Oblivious Adversary for Large Action Spaces

no code implementations · 30 Oct 2023 · Yuval Dagan, Constantinos Daskalakis, Maxwell Fishelson, Noah Golowich

We provide a novel reduction from swap-regret minimization to external-regret minimization, which improves upon the classical reductions of Blum-Mansour [BM07] and Stolz-Lugosi [SL05] in that it does not require finiteness of the space of actions.
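
For context, the classical Blum-Mansour construction that this reduction improves upon can be sketched for a finite set of $n$ actions: run one external-regret minimizer per action, play the stationary distribution of the matrix of their recommendations, and charge each copy the loss scaled by the probability mass it controls. This is a minimal illustrative sketch (the class names and the choice of Hedge as the external-regret minimizer are ours), not the paper's new reduction, which removes the finiteness requirement:

```python
import numpy as np

def stationary_distribution(Q, iters=1000):
    """Fixed point p = p @ Q of a row-stochastic matrix, via power iteration."""
    p = np.full(Q.shape[0], 1.0 / Q.shape[0])
    for _ in range(iters):
        p = p @ Q
    return p / p.sum()

class Hedge:
    """External-regret minimizer (multiplicative weights) over n actions."""
    def __init__(self, n, eta=0.1):
        self.w = np.ones(n)
        self.eta = eta
    def distribution(self):
        return self.w / self.w.sum()
    def update(self, loss):
        self.w *= np.exp(-self.eta * np.asarray(loss))

class BlumMansour:
    """Swap-regret minimizer built from n copies of an external-regret minimizer."""
    def __init__(self, n, eta=0.1):
        self.experts = [Hedge(n, eta) for _ in range(n)]
    def play(self):
        # Row i is the recommendation of the copy responsible for action i.
        Q = np.stack([e.distribution() for e in self.experts])
        self.p = stationary_distribution(Q)
        return self.p
    def update(self, loss):
        # Copy i is charged the loss scaled by the mass p_i it controls.
        for i, e in enumerate(self.experts):
            e.update(self.p[i] * np.asarray(loss))
```

Playing the stationary distribution is what converts each copy's external-regret guarantee into a swap-regret guarantee for the combined algorithm; the finiteness of the action space enters through the one-copy-per-action construction.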

Smooth Nash Equilibria: Algorithms and Complexity

no code implementations · 21 Sep 2023 · Constantinos Daskalakis, Noah Golowich, Nika Haghtalab, Abhishek Shetty

We show that both weak and strong $\sigma$-smooth Nash equilibria have superior computational properties to Nash equilibria: when $\sigma$ as well as an approximation parameter $\epsilon$ and the number of players are all constants, there is a constant-time randomized algorithm to find a weak $\epsilon$-approximate $\sigma$-smooth Nash equilibrium in normal-form games.

Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles

no code implementations · 18 Sep 2023 · Noah Golowich, Ankur Moitra, Dhruv Rohatgi

The key assumption underlying linear Markov Decision Processes (MDPs) is that the learner has access to a known feature map $\phi(x, a)$ that maps state-action pairs to $d$-dimensional vectors, and that the rewards and transitions are linear functions in this representation.

Tasks: feature selection, Learning Theory, +1
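
As a concrete illustration of the linear MDP assumption described above, both rewards and next-state probabilities are inner products with the feature vector $\phi(x, a)$. A minimal sketch (the function names and the explicit normalization are ours, not part of the paper):

```python
import numpy as np

def linear_mdp_reward(phi, theta):
    """In a linear MDP, the reward is linear in the d-dimensional features:
    r(x, a) = <phi(x, a), theta>."""
    return float(phi @ theta)

def linear_mdp_next_state_probs(phi, Mu):
    """Transitions are also linear: P(x' | x, a) = <phi(x, a), mu(x')>,
    where row x' of Mu is the measure vector mu(x')."""
    p = Mu @ phi
    return p / p.sum()  # normalize, in case phi and Mu are only approximately valid
```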

Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games

no code implementations · 22 Mar 2023 · Dylan J. Foster, Noah Golowich, Sham M. Kakade

They are proven via lower bounds for a simpler problem we refer to as SparseCCE, in which the goal is to compute a coarse correlated equilibrium that is sparse in the sense that it can be represented as a mixture of a small number of product policies.

Tasks: Computational Efficiency, Multi-agent Reinforcement Learning

Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient

no code implementations · 19 Jan 2023 · Dylan J. Foster, Noah Golowich, Yanjun Han

Recently, Foster et al. (2021) introduced the Decision-Estimation Coefficient (DEC), a measure of statistical complexity which leads to upper and lower bounds on the optimal sample complexity for a general class of problems encompassing bandits and reinforcement learning with function approximation.

Tasks: Decision Making, reinforcement-learning, +1

STay-ON-the-Ridge: Guaranteed Convergence to Local Minimax Equilibrium in Nonconvex-Nonconcave Games

no code implementations · 18 Oct 2022 · Constantinos Daskalakis, Noah Golowich, Stratis Skoulakis, Manolis Zampetakis

In particular, our method is not designed to decrease some potential function, such as the distance of its iterate from the set of local min-max equilibria or the projected gradient of the objective, but is designed to satisfy a topological property that guarantees the avoidance of cycles and implies its convergence.

Learning in Observable POMDPs, without Computationally Intractable Oracles

no code implementations · 7 Jun 2022 · Noah Golowich, Ankur Moitra, Dhruv Rohatgi

Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement.

Tasks: Learning Theory, Reinforcement Learning (RL)

The Complexity of Markov Equilibrium in Stochastic Games

no code implementations · 8 Apr 2022 · Constantinos Daskalakis, Noah Golowich, Kaiqing Zhang

Previous work for learning Markov CCE policies all required exponential time and sample complexity in the number of players.

Tasks: Multi-agent Reinforcement Learning, reinforcement-learning, +1

Smoothed Online Learning is as Easy as Statistical Learning

no code implementations · 9 Feb 2022 · Adam Block, Yuval Dagan, Noah Golowich, Alexander Rakhlin

We then prove a lower bound on the oracle complexity of any proper learning algorithm, which matches the oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the existence of a statistical-computational gap in smooth online learning.

Tasks: Learning Theory, Multi-Armed Bandits

Planning in Observable POMDPs in Quasipolynomial Time

no code implementations · 12 Jan 2022 · Noah Golowich, Ankur Moitra, Dhruv Rohatgi

Our main result is a quasipolynomial-time algorithm for planning in (one-step) observable POMDPs.

Differentially Private Nonparametric Regression Under a Growth Condition

no code implementations · 24 Nov 2021 · Noah Golowich

Inspired by recent results for the related setting of binary classification (Alon et al., 2019; Bun et al., 2020), where it was shown that online learnability of a binary class is necessary and sufficient for its private learnability, Jung et al. (2020) showed that in the setting of regression, online learnability of $\mathcal{H}$ is necessary for private learnability.

Tasks: Binary Classification, regression

Fast Rates for Nonparametric Online Learning: From Realizability to Learning in Games

no code implementations · 17 Nov 2021 · Constantinos Daskalakis, Noah Golowich

Our contributions are two-fold; in the realizable setting of nonparametric online regression with the absolute loss, we propose a randomized proper learning algorithm that achieves near-optimal cumulative loss in terms of the sequential fat-shattering dimension of the hypothesis class.

Tasks: regression

Near-Optimal No-Regret Learning for Correlated Equilibria in Multi-Player General-Sum Games

no code implementations · 11 Nov 2021 · Ioannis Anagnostides, Constantinos Daskalakis, Gabriele Farina, Maxwell Fishelson, Noah Golowich, Tuomas Sandholm

Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS'21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is $O(\textrm{polylog}(T))$ after $T$ repetitions of the game.

Can Q-Learning be Improved with Advice?

no code implementations · 25 Oct 2021 · Noah Golowich, Ankur Moitra

In this paper we address the question of whether worst-case lower bounds for regret in online learning of Markov decision processes (MDPs) can be circumvented when information about the MDP, in the form of predictions about its optimal $Q$-value function, is given to the algorithm.

Tasks: Q-Learning, reinforcement-learning, +2

Near-Optimal No-Regret Learning in General Games

no code implementations · NeurIPS 2021 · Constantinos Daskalakis, Maxwell Fishelson, Noah Golowich

We show that Optimistic Hedge -- a common variant of multiplicative-weights-updates with recency bias -- attains ${\rm poly}(\log T)$ regret in multi-player general-sum games.
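
The Optimistic Hedge update itself is simple to state: it is ordinary Hedge (multiplicative weights) run on the cumulative loss plus one extra copy of the most recent loss vector, the "recency bias". The following is a minimal sketch of that update rule (the function name and numerical stabilization are ours), not the paper's regret analysis:

```python
import numpy as np

def optimistic_hedge(losses, eta=0.1):
    """Optimistic Hedge: the play at round t is p_t proportional to
    exp(-eta * (L_{t-1} + l_{t-1})), where L is the cumulative loss so far
    and l is the most recent loss vector (counted twice, as the prediction
    for the coming round)."""
    losses = np.asarray(losses, dtype=float)
    T, n = losses.shape
    cum = np.zeros(n)   # L_{t-1}: cumulative loss
    last = np.zeros(n)  # l_{t-1}: most recent loss (the recency bias)
    plays = []
    for t in range(T):
        logits = -eta * (cum + last)
        w = np.exp(logits - logits.max())  # stabilized softmax
        plays.append(w / w.sum())
        cum += losses[t]
        last = losses[t]
    return np.array(plays)
```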

Littlestone Classes are Privately Online Learnable

no code implementations · NeurIPS 2021 · Noah Golowich, Roi Livni

Specifically, we show that if the class $\mathcal{H}$ has constant Littlestone dimension then, given an oblivious sequence of labelled examples, there is a private learner that makes in expectation at most $O(\log T)$ mistakes -- comparable to the optimal mistake bound in the non-private case, up to a logarithmic factor.

Deep Learning with Label Differential Privacy

no code implementations · NeurIPS 2021 · Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi, Chiyuan Zhang

The Randomized Response (RR) algorithm is a classical technique to improve robustness in survey aggregation, and has been widely adopted in applications with differential privacy guarantees.

Tasks: Multi-class Classification
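
The Randomized Response primitive mentioned above is easy to state for $k$ classes: keep the true label with probability $e^\epsilon / (e^\epsilon + k - 1)$, and otherwise output one of the other $k - 1$ labels uniformly at random, which satisfies $\epsilon$-differential privacy for the label. A minimal sketch (the function name is ours):

```python
import math, random

def randomized_response(label, num_classes, epsilon, rng=random):
    """Epsilon-DP randomized response over num_classes labels: keep the true
    label with probability e^eps / (e^eps + k - 1), otherwise output one of
    the other k - 1 labels uniformly at random."""
    k = num_classes
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return label
    # Sample uniformly from the k - 1 labels other than the true one.
    other = rng.randrange(k - 1)
    return other if other < label else other + 1
```

The privacy guarantee follows because the ratio between the probability of any output under two different true labels is at most $e^\epsilon$.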

Independent Policy Gradient Methods for Competitive Reinforcement Learning

no code implementations · NeurIPS 2020 · Constantinos Daskalakis, Dylan J. Foster, Noah Golowich

We obtain global, non-asymptotic convergence guarantees for independent learning algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum stochastic games).

Tasks: Policy Gradient Methods, reinforcement-learning, +1

Sample-efficient proper PAC learning with approximate differential privacy

no code implementations · 7 Dec 2020 · Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi

In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension $d$ with approximate differential privacy is $\tilde O(d^6)$, ignoring privacy and accuracy parameters.

Tasks: PAC learning

Tight last-iterate convergence rates for no-regret learning in multi-player games

no code implementations · NeurIPS 2020 · Noah Golowich, Sarath Pattathil, Constantinos Daskalakis

We also show that the $O(1/\sqrt{T})$ rate is tight for all $p$-SCLI algorithms, which includes OG as a special case.

Near-tight closure bounds for Littlestone and threshold dimensions

no code implementations · 7 Jul 2020 · Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi

We study closure properties for the Littlestone and threshold dimensions of binary hypothesis classes.

On the Power of Multiple Anonymous Messages

no code implementations · 29 Aug 2019 · Badih Ghazi, Noah Golowich, Ravi Kumar, Rasmus Pagh, Ameya Velingker

Protocols in the multi-message shuffled model with $\mathrm{poly}(\log B, \log n)$ bits of communication per user and $\mathrm{polylog}(B)$ error, which provide an exponential improvement in error compared to what is possible with single-message algorithms.

A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks

no code implementations · ICLR 2019 · Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu

We analyze speed of convergence to global optimum for gradient descent training a deep linear neural network (parameterized as $x \mapsto W_N W_{N-1} \cdots W_1 x$) by minimizing the $\ell_2$ loss over whitened data.
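
The setup in this abstract can be reproduced in a few lines: over whitened data, the $\ell_2$ loss reduces to the matrix objective $\frac{1}{2}\|W_N \cdots W_1 - \Phi\|_F^2$, and gradient descent updates each layer with the exact gradient. A minimal sketch under our own choices of near-identity initialization, step size, and a small positive-definite target (not the paper's exact experimental setup):

```python
import numpy as np

def train_deep_linear(Phi, depth=3, lr=0.05, steps=2000, seed=0):
    """Gradient descent on L(W_1, ..., W_N) = 0.5 * ||W_N ... W_1 - Phi||_F^2,
    the l2 loss of a deep linear network over whitened data, with square
    layers initialized near the identity."""
    d = Phi.shape[0]
    rng = np.random.default_rng(seed)
    Ws = [np.eye(d) + 0.01 * rng.standard_normal((d, d)) for _ in range(depth)]
    for _ in range(steps):
        # B[j]: product of the layers below layer j; A[j]: product of the layers above.
        B = [np.eye(d)]
        for j in range(1, depth):
            B.append(Ws[j - 1] @ B[j - 1])
        A = [None] * depth
        A[depth - 1] = np.eye(d)
        for j in range(depth - 2, -1, -1):
            A[j] = A[j + 1] @ Ws[j + 1]
        E = A[0] @ Ws[0] @ B[0] - Phi  # residual of the end-to-end map
        # Exact gradient of the factored objective: dL/dW_j = A_j^T E B_j^T.
        grads = [A[j].T @ E @ B[j].T for j in range(depth)]
        for j in range(depth):
            Ws[j] -= lr * grads[j]
    P = np.eye(d)
    for W in Ws:
        P = W @ P
    return Ws, 0.5 * np.linalg.norm(P - Phi) ** 2
```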

Theory of Deep Learning IIb: Optimization Properties of SGD

no code implementations · 7 Jan 2018 · Chiyuan Zhang, Qianli Liao, Alexander Rakhlin, Brando Miranda, Noah Golowich, Tomaso Poggio

In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolutional networks by Stochastic Gradient Descent.

Size-Independent Sample Complexity of Neural Networks

no code implementations · 18 Dec 2017 · Noah Golowich, Alexander Rakhlin, Ohad Shamir

We study the sample complexity of learning neural networks, by providing new bounds on their Rademacher complexity assuming norm constraints on the parameter matrix of each layer.
