Search Results for author: Noah Golowich

Found 35 papers, 0 papers with code

Breaking the $T^{2/3}$ Barrier for Sequential Calibration

no code implementations • 19 Jun 2024 • Yuval Dagan, Constantinos Daskalakis, Maxwell Fishelson, Noah Golowich, Robert Kleinberg, Princewill Okoroafor

We then give an improved \emph{upper bound} for the SPR game, which implies, via our equivalence, a forecasting algorithm with calibration error $O(T^{2/3 - \varepsilon})$ for some $\varepsilon > 0$, improving Foster & Vohra's upper bound for the first time.

Linear Bellman Completeness Suffices for Efficient Online Reinforcement Learning with Few Actions

no code implementations • 17 Jun 2024 • Noah Golowich, Ankur Moitra

One of the most natural approaches to reinforcement learning (RL) with function approximation is value iteration, which inductively generates approximations to the optimal value function by solving a sequence of regression problems.

regression, Reinforcement Learning (RL)

Is Efficient PAC Learning Possible with an Oracle That Responds 'Yes' or 'No'?

no code implementations • 17 Jun 2024 • Constantinos Daskalakis, Noah Golowich

In this paper, we investigate the question of whether the ability to perform ERM, which computes a hypothesis minimizing empirical risk on a given dataset, is necessary for efficient learning: in particular, is there a weaker oracle than ERM which can nevertheless enable learnability?

Binary Classification, PAC learning

The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation

no code implementations • 17 Jun 2024 • Noah Golowich, Ankur Moitra

Our main structural assumption is that the MDP has low inherent Bellman error, which stipulates that linear value functions have linear Bellman backups with respect to the greedy policy.

Offline RL

Near-Optimal Learning and Planning in Separated Latent MDPs

no code implementations • 12 Jun 2024 • Fan Chen, Constantinos Daskalakis, Noah Golowich, Alexander Rakhlin

We study computational and statistical aspects of learning Latent Markov Decision Processes (LMDPs).

Edit Distance Robust Watermarks for Language Models

no code implementations • 4 Jun 2024 • Noah Golowich, Ankur Moitra

We aim for watermarks which satisfy: (a) undetectability, a cryptographic notion introduced by Christ, Gunn & Zamir (2024) which stipulates that it is computationally hard to distinguish watermarked language model outputs from the model's actual output distribution; and (b) robustness to channels which introduce a constant fraction of adversarial insertions, substitutions, and deletions to the watermarked text.

Language Modelling

Online Control in Population Dynamics

no code implementations • 3 Jun 2024 • Noah Golowich, Elad Hazan, Zhou Lu, Dhruv Rohatgi, Y. Jennifer Sun

The study of population dynamics originated with early sociological works but has since extended into many fields, including biology, epidemiology, evolutionary game theory, and economics.

Epidemiology

Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning

no code implementations • 4 Apr 2024 • Noah Golowich, Ankur Moitra, Dhruv Rohatgi

We also show that there is no computationally efficient algorithm for reward-directed RL in block MDPs, even when given access to an oracle for this regression problem.

regression, Reinforcement Learning (RL)

From External to Swap Regret 2.0: An Efficient Reduction and Oblivious Adversary for Large Action Spaces

no code implementations • 30 Oct 2023 • Yuval Dagan, Constantinos Daskalakis, Maxwell Fishelson, Noah Golowich

We provide a novel reduction from swap-regret minimization to external-regret minimization, which improves upon the classical reductions of Blum-Mansour [BM07] and Stoltz-Lugosi [SL05] in that it does not require finiteness of the space of actions.

Smooth Nash Equilibria: Algorithms and Complexity

no code implementations • 21 Sep 2023 • Constantinos Daskalakis, Noah Golowich, Nika Haghtalab, Abhishek Shetty

We show that both weak and strong $\sigma$-smooth Nash equilibria have superior computational properties to Nash equilibria: when $\sigma$ as well as an approximation parameter $\epsilon$ and the number of players are all constants, there is a constant-time randomized algorithm to find a weak $\epsilon$-approximate $\sigma$-smooth Nash equilibrium in normal-form games.

Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles

no code implementations • 18 Sep 2023 • Noah Golowich, Ankur Moitra, Dhruv Rohatgi

The key assumption underlying linear Markov Decision Processes (MDPs) is that the learner has access to a known feature map $\phi(x, a)$ that maps state-action pairs to $d$-dimensional vectors, and that the rewards and transitions are linear functions in this representation.

feature selection, Learning Theory, +1

Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games

no code implementations • 22 Mar 2023 • Dylan J. Foster, Noah Golowich, Sham M. Kakade

They are proven via lower bounds for a simpler problem we refer to as SparseCCE, in which the goal is to compute a coarse correlated equilibrium that is sparse in the sense that it can be represented as a mixture of a small number of product policies.

Computational Efficiency, Multi-agent Reinforcement Learning

Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient

no code implementations • 19 Jan 2023 • Dylan J. Foster, Noah Golowich, Yanjun Han

Recently, Foster et al. (2021) introduced the Decision-Estimation Coefficient (DEC), a measure of statistical complexity which leads to upper and lower bounds on the optimal sample complexity for a general class of problems encompassing bandits and reinforcement learning with function approximation.

Decision Making, reinforcement-learning, +1

STay-ON-the-Ridge: Guaranteed Convergence to Local Minimax Equilibrium in Nonconvex-Nonconcave Games

no code implementations • 18 Oct 2022 • Constantinos Daskalakis, Noah Golowich, Stratis Skoulakis, Manolis Zampetakis

In particular, our method is not designed to decrease some potential function, such as the distance of its iterate from the set of local min-max equilibria or the projected gradient of the objective, but is designed to satisfy a topological property that guarantees the avoidance of cycles and implies its convergence.

Learning in Observable POMDPs, without Computationally Intractable Oracles

no code implementations • 7 Jun 2022 • Noah Golowich, Ankur Moitra, Dhruv Rohatgi

Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement.

Learning Theory, Reinforcement Learning (RL)

The Complexity of Markov Equilibrium in Stochastic Games

no code implementations • 8 Apr 2022 • Constantinos Daskalakis, Noah Golowich, Kaiqing Zhang

Previous work for learning Markov CCE policies all required exponential time and sample complexity in the number of players.

Multi-agent Reinforcement Learning, reinforcement-learning, +2

Smoothed Online Learning is as Easy as Statistical Learning

no code implementations • 9 Feb 2022 • Adam Block, Yuval Dagan, Noah Golowich, Alexander Rakhlin

We then prove a lower bound on the oracle complexity of any proper learning algorithm, which matches the oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the existence of a statistical-computational gap in smooth online learning.

Learning Theory, Multi-Armed Bandits

Planning in Observable POMDPs in Quasipolynomial Time

no code implementations • 12 Jan 2022 • Noah Golowich, Ankur Moitra, Dhruv Rohatgi

Our main result is a quasipolynomial-time algorithm for planning in (one-step) observable POMDPs.

Differentially Private Nonparametric Regression Under a Growth Condition

no code implementations • 24 Nov 2021 • Noah Golowich

Inspired by recent results for the related setting of binary classification (Alon et al., 2019; Bun et al., 2020), where it was shown that online learnability of a binary class is necessary and sufficient for its private learnability, Jung et al. (2020) showed that in the setting of regression, online learnability of $\mathcal{H}$ is necessary for private learnability.

Binary Classification, regression

Fast Rates for Nonparametric Online Learning: From Realizability to Learning in Games

no code implementations • 17 Nov 2021 • Constantinos Daskalakis, Noah Golowich

Our contributions are two-fold: - In the realizable setting of nonparametric online regression with the absolute loss, we propose a randomized proper learning algorithm which gets a near-optimal cumulative loss in terms of the sequential fat-shattering dimension of the hypothesis class.

regression

Near-Optimal No-Regret Learning for Correlated Equilibria in Multi-Player General-Sum Games

no code implementations • 11 Nov 2021 • Ioannis Anagnostides, Constantinos Daskalakis, Gabriele Farina, Maxwell Fishelson, Noah Golowich, Tuomas Sandholm

Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS'21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is $O(\textrm{polylog}(T))$ after $T$ repetitions of the game.

Can Q-Learning be Improved with Advice?

no code implementations • 25 Oct 2021 • Noah Golowich, Ankur Moitra

In this paper we address the question of whether worst-case lower bounds for regret in online learning of Markov decision processes (MDPs) can be circumvented when information about the MDP, in the form of predictions about its optimal $Q$-value function, is given to the algorithm.

Q-Learning, reinforcement-learning, +3

Near-Optimal No-Regret Learning in General Games

no code implementations • NeurIPS 2021 • Constantinos Daskalakis, Maxwell Fishelson, Noah Golowich

We show that Optimistic Hedge -- a common variant of multiplicative-weights-updates with recency bias -- attains ${\rm poly}(\log T)$ regret in multi-player general-sum games.
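
For orientation, a minimal sketch of the usual textbook form of Optimistic Hedge: each round plays weights proportional to the exponential of the negated cumulative loss, with the most recent loss vector counted a second time (the recency bias). The function name `optimistic_hedge`, the step size `eta`, and the list-of-lists loss format are illustrative assumptions, not taken from the paper.

```python
import math


def optimistic_hedge(loss_vectors, eta):
    """Return the mixed strategy played in each round (illustrative sketch).

    loss_vectors: list of per-round loss lists, one entry per action.
    eta: step size (an assumed, user-chosen parameter).
    """
    n = len(loss_vectors[0])
    cumulative = [0.0] * n  # sum of all losses observed so far
    last = [0.0] * n        # most recent loss, used as the optimistic prediction
    plays = []
    for loss in loss_vectors:
        # Score = -eta * (cumulative loss + predicted next loss).
        scores = [-eta * (c + l) for c, l in zip(cumulative, last)]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        plays.append([w / total for w in weights])
        # The loss is revealed only after the strategy is committed.
        cumulative = [c + l for c, l in zip(cumulative, loss)]
        last = list(loss)
    return plays
```

Setting `last` to zeros in every round would recover plain Hedge, which makes the role of the extra prediction term easy to see.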

Littlestone Classes are Privately Online Learnable

no code implementations • NeurIPS 2021 • Noah Golowich, Roi Livni

Specifically, we show that if the class $\mathcal{H}$ has constant Littlestone dimension then, given an oblivious sequence of labelled examples, there is a private learner that makes in expectation at most $O(\log T)$ mistakes -- comparable to the optimal mistake bound in the non-private case, up to a logarithmic factor.

Deep Learning with Label Differential Privacy

no code implementations • NeurIPS 2021 • Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi, Chiyuan Zhang

The Randomized Response (RR) algorithm is a classical technique to improve robustness in survey aggregation, and has been widely adopted in applications with differential privacy guarantees.

Multi-class Classification
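
As a point of reference, classical K-ary randomized response applied to a class label can be sketched as follows. This is the standard textbook mechanism, not the algorithm proposed in the paper; the function name `randomized_response` and its parameters are illustrative assumptions.

```python
import math
import random


def randomized_response(label, num_classes, epsilon, rng=random):
    """Classical K-ary randomized response on a single label (sketch).

    Keeps the true label with probability e^eps / (e^eps + K - 1);
    otherwise returns one of the other K - 1 labels uniformly at random.
    """
    keep_prob = math.exp(epsilon) / (math.exp(epsilon) + num_classes - 1)
    if rng.random() < keep_prob:
        return label
    # Draw uniformly from the K - 1 labels different from the true one.
    other = rng.randrange(num_classes - 1)
    return other if other < label else other + 1
```

With `epsilon = 0` the output is uniform over all classes and reveals nothing about the label; as `epsilon` grows, the true label is returned almost always.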

Independent Policy Gradient Methods for Competitive Reinforcement Learning

no code implementations • NeurIPS 2020 • Constantinos Daskalakis, Dylan J. Foster, Noah Golowich

We obtain global, non-asymptotic convergence guarantees for independent learning algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum stochastic games).

Policy Gradient Methods, reinforcement-learning, +2

Sample-efficient proper PAC learning with approximate differential privacy

no code implementations • 7 Dec 2020 • Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi

In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension $d$ with approximate differential privacy is $\tilde O(d^6)$, ignoring privacy and accuracy parameters.

PAC learning

Tight last-iterate convergence rates for no-regret learning in multi-player games

no code implementations • NeurIPS 2020 • Noah Golowich, Sarath Pattathil, Constantinos Daskalakis

We also show that the $O(1/\sqrt{T})$ rate is tight for all $p$-SCLI algorithms, which includes OG as a special case.

Near-tight closure bounds for Littlestone and threshold dimensions

no code implementations • 7 Jul 2020 • Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi

We study closure properties for the Littlestone and threshold dimensions of binary hypothesis classes.

On the Power of Multiple Anonymous Messages

no code implementations • 29 Aug 2019 • Badih Ghazi, Noah Golowich, Ravi Kumar, Rasmus Pagh, Ameya Velingker

Protocols in the multi-message shuffled model with $\mathrm{poly}(\log B, \log n)$ bits of communication per user and $\mathrm{polylog}(B)$ error, which provide an exponential improvement on the error compared to what is possible with single-message algorithms.

A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks

no code implementations • ICLR 2019 • Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu

We analyze speed of convergence to global optimum for gradient descent training a deep linear neural network (parameterized as $x \mapsto W_N W_{N-1} \cdots W_1 x$) by minimizing the $\ell_2$ loss over whitened data.

Theory of Deep Learning IIb: Optimization Properties of SGD

no code implementations • 7 Jan 2018 • Chiyuan Zhang, Qianli Liao, Alexander Rakhlin, Brando Miranda, Noah Golowich, Tomaso Poggio

In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolutional networks by Stochastic Gradient Descent.

Size-Independent Sample Complexity of Neural Networks

no code implementations • 18 Dec 2017 • Noah Golowich, Alexander Rakhlin, Ohad Shamir

We study the sample complexity of learning neural networks, by providing new bounds on their Rademacher complexity assuming norm constraints on the parameter matrix of each layer.
