Search Results for author: Eric Mazumdar

Found 21 papers, 1 paper with code

Rethinking Scaling Laws for Learning in Strategic Environments

no code implementations • 12 Feb 2024 • Tinashe Handina, Eric Mazumdar

We find that strategic interactions can break the conventional view of scaling laws: performance does not necessarily improve monotonically as models become larger and/or more expressive (even with infinite data).

Model Selection • Multi-agent Reinforcement Learning

Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games

no code implementations • 8 Dec 2023 • Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, Adam Wierman

Specifically, through a change of variable, we show that the update equation of the slow-timescale iterates resembles the classical smoothed best-response dynamics, where the regularized Nash gap serves as a valid Lyapunov function.

Q-Learning
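
The smoothed best-response dynamics mentioned above admit a tiny concrete illustration. A hedged sketch in a zero-sum matrix game (matching pennies), with illustrative temperature and step size; this is the classical tabular picture, not the paper's function-approximation setting:

```python
import numpy as np

def softmax(z, tau):
    z = z / tau
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

# Matching pennies: the row player's payoff matrix (zero-sum game).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])

x = np.array([0.9, 0.1])  # row player's mixed strategy
y = np.array([0.2, 0.8])  # column player's mixed strategy
tau, step = 1.0, 0.1      # entropy regularization and step size (illustrative)

for _ in range(2000):
    # Each player moves a small step toward a smoothed (softmax)
    # best response to the opponent's current strategy.
    br_x = softmax(A @ y, tau)
    br_y = softmax(-A.T @ x, tau)
    x = (1 - step) * x + step * br_x
    y = (1 - step) * y + step * br_y

# By symmetry, the regularized (quantal response) equilibrium is uniform.
```

Here the regularized Nash gap shrinks along the trajectory, which is the Lyapunov-function role the abstract alludes to.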

Algorithmic Collective Action in Machine Learning

no code implementations • 8 Feb 2023 • Moritz Hardt, Eric Mazumdar, Celestine Mendler-Dünner, Tijana Zrnic

We initiate a principled study of algorithmic collective action on digital platforms that deploy machine learning algorithms.

Language Modelling

Follower Agnostic Methods for Stackelberg Games

no code implementations • 2 Feb 2023 • Chinmay Maheshwari, James Cheng, S. Shankar Sastry, Lillian Ratliff, Eric Mazumdar

In this paper, we present an efficient algorithm for solving online Stackelberg games with multiple followers in a follower-agnostic manner.
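
A minimal sketch of the follower-agnostic idea under strong simplifying assumptions: the leader treats the follower response as a black box and estimates the gradient of its own loss by finite differences. The single follower and quadratic losses below are hypothetical, not the paper's algorithm:

```python
def follower_response(x):
    # Black box: the leader never sees the follower's utility,
    # only the realized response (here, the best response to x).
    return x / 2.0

def leader_loss(x):
    y = follower_response(x)
    return (x - 1.0) ** 2 + y ** 2

# Zeroth-order leader update: a central-difference gradient taken
# *through* the opaque follower response.
x, delta, eta = 0.0, 1e-4, 0.1
for _ in range(500):
    g = (leader_loss(x + delta) - leader_loss(x - delta)) / (2 * delta)
    x -= eta * g

# For these losses the Stackelberg optimum is x* = 0.8.
```

The point of the structure is that the leader's update never differentiates the follower's problem explicitly.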

Langevin Monte Carlo for Contextual Bandits

1 code implementation • 22 Jun 2022 • Pan Xu, Hongkai Zheng, Eric Mazumdar, Kamyar Azizzadenesheli, Anima Anandkumar

Existing Thompson sampling-based algorithms need to construct a Laplace approximation (i.e., a Gaussian distribution) of the posterior distribution, which is inefficient to sample from in high-dimensional applications with general covariance matrices.

Multi-Armed Bandits • Thompson Sampling
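
The Langevin alternative to the Laplace approximation can be sketched in a few lines: run noisy gradient ascent (unadjusted Langevin) on the log posterior to draw an approximate posterior sample. The Bayesian linear-reward model below is a toy stand-in, not the paper's general setting:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Bayesian linear-reward model: r = x . theta + Gaussian noise,
# with a standard normal prior on theta.
d, n, sigma = 2, 200, 0.1
theta_true = np.array([1.0, -0.5])
X = rng.normal(size=(n, d))
r = X @ theta_true + sigma * rng.normal(size=n)

def grad_log_post(theta):
    # Gradient of the log posterior: Gaussian likelihood term + prior term.
    return X.T @ (r - X @ theta) / sigma**2 - theta

# Unadjusted Langevin algorithm: noisy gradient steps on the log posterior;
# no Gaussian (Laplace) approximation of the posterior is ever formed.
theta = np.zeros(d)
step = 1e-5
for _ in range(5000):
    theta = theta + step * grad_log_post(theta) \
        + np.sqrt(2 * step) * rng.normal(size=d)

# theta is now an approximate posterior sample; Thompson sampling would
# pull the arm whose feature vector maximizes x . theta.
```

With enough data the posterior concentrates near `theta_true`, so the Langevin sample lands close to it.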

Decentralized, Communication- and Coordination-free Learning in Structured Matching Markets

no code implementations • 6 Jun 2022 • Chinmay Maheshwari, Eric Mazumdar, Shankar Sastry

We study the problem of online learning in competitive settings in the context of two-sided matching markets.

Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games

no code implementations • NeurIPS 2021 • Tanner Fiez, Lillian Ratliff, Eric Mazumdar, Evan Faulkner, Adhyyan Narang

For the class of nonconvex-PL zero-sum games, we exploit timescale separation to construct a potential function that, when combined with the stability characterization and an asymptotic saddle-avoidance result, gives a global asymptotic almost-sure convergence guarantee to the set of strict local minmax equilibria.
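
The timescale-separation mechanism can be illustrated on a toy quadratic saddle; the paper's setting is nonconvex (and nonconvex-PL), so this convex-concave example only shows the fast max-player / slow min-player structure:

```python
# f(x, y) = 2*x**2 + x*y - y**2: the min player controls x, the max player y.
def grads(x, y):
    return 4 * x + y, x - 2 * y  # (df/dx, df/dy)

x, y = 1.0, -1.0
eta_x, eta_y = 0.01, 0.2  # slow descent on x, fast ascent on y

for _ in range(3000):
    gx, gy = grads(x, y)
    x -= eta_x * gx  # slow timescale: the min player sees a nearly settled y
    y += eta_y * gy  # fast timescale: y tracks the best response y*(x) = x/2

# The unique (local) minmax equilibrium of this toy problem is (0, 0).
```

Because y equilibrates between x-updates, x effectively descends the envelope f(x, y*(x)), which is the intuition behind the potential-function construction.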

Who Leads and Who Follows in Strategic Classification?

no code implementations • NeurIPS 2021 • Tijana Zrnic, Eric Mazumdar, S. Shankar Sastry, Michael I. Jordan

In particular, by generalizing the standard model to allow both players to learn over time, we show that a decision-maker that makes updates faster than the agents can reverse the order of play, meaning that the agents lead and the decision-maker follows.

Classification

Zeroth-Order Methods for Convex-Concave Minmax Problems: Applications to Decision-Dependent Risk Minimization

no code implementations • 16 Jun 2021 • Chinmay Maheshwari, Chih-Yuan Chiu, Eric Mazumdar, S. Shankar Sastry, Lillian J. Ratliff

Min-max optimization is emerging as a key framework for analyzing problems of robustness to strategically and adversarially generated data.
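
A minimal sketch of the zeroth-order ingredient: gradient descent-ascent in which both players use only function evaluations (central differences here; the paper analyzes randomized estimators with guarantees). The toy convex-concave objective is illustrative:

```python
def f(x, y):
    # Convex in x, concave in y; saddle point at (0, 0).
    return x**2 + 2 * x * y - y**2

def zo_dx(x, y, d=1e-4):
    # Two-point (central-difference) estimate of df/dx from function values.
    return (f(x + d, y) - f(x - d, y)) / (2 * d)

def zo_dy(x, y, d=1e-4):
    return (f(x, y + d) - f(x, y - d)) / (2 * d)

x, y, eta = 1.0, 1.0, 0.05
for _ in range(1000):
    gx, gy = zo_dx(x, y), zo_dy(x, y)
    x -= eta * gx  # descent using only function evaluations
    y += eta * gy  # ascent using only function evaluations
```

Neither player ever queries a gradient oracle, which is the regime relevant to decision-dependent data.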

Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization

no code implementations • 27 Apr 2021 • Yaodong Yu, Tianyi Lin, Eric Mazumdar, Michael I. Jordan

Distributionally robust supervised learning (DRSL) is emerging as a key paradigm for building reliable machine learning systems for real-world applications -- reflecting the need for classifiers and predictive models that are robust to the distribution shifts that arise from phenomena such as selection bias or nonstationarity.

BIG-bench Machine Learning • Selection bias
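
One piece of the min-max structure of DRSL has a closed form worth noting: for a KL-constrained adversary, the inner maximization over example weights is solved by exponential tilting. The losses and temperature below are illustrative, and this is not the paper's variance-reduced algorithm:

```python
import numpy as np

# Per-example losses; the adversary reweights them within a KL ball
# around the uniform distribution.
losses = np.array([0.1, 0.2, 2.0, 0.15])
tau = 0.5  # temperature (dual variable of the KL constraint)

# Inner maximization in closed form: weights proportional to exp(loss / tau).
w = np.exp(losses / tau)
w /= w.sum()
dro_loss = float(w @ losses)

# The robust loss upweights the hard example and exceeds the average loss.
```

The outer minimization over model parameters then targets `dro_loss` instead of the empirical mean, which is what makes the overall problem a min-max.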

Expert Selection in High-Dimensional Markov Decision Processes

no code implementations • 26 Oct 2020 • Vicenc Rubies-Royo, Eric Mazumdar, Roy Dong, Claire Tomlin, S. Shankar Sastry

In this work we present a multi-armed bandit framework for online expert selection in Markov decision processes and demonstrate its use in high-dimensional settings.
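
The bandit-over-experts framework can be sketched with a standard UCB1 rule standing in for the paper's algorithm; the expert returns below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)

# Each arm is a pre-trained expert policy; a "pull" runs it for an episode
# and observes the return (simulated here as a noisy draw around its mean).
true_returns = np.array([0.2, 0.5, 0.8])  # hypothetical expert quality
n_experts = len(true_returns)
counts = np.zeros(n_experts)
means = np.zeros(n_experts)

for t in range(1, 2001):
    if t <= n_experts:
        arm = t - 1  # initialize: run each expert once
    else:
        ucb = means + np.sqrt(2 * np.log(t) / counts)
        arm = int(np.argmax(ucb))
    reward = true_returns[arm] + 0.1 * rng.normal()
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]

# The best expert (index 2) ends up selected the vast majority of the time.
```

Treating whole policies as arms sidesteps the dimensionality of the underlying MDP, which is the point of the framework.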

Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

no code implementations • 6 Apr 2020 • Tyler Westenbroek, Eric Mazumdar, David Fridovich-Keil, Valmik Prabhu, Claire J. Tomlin, S. Shankar Sastry

This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules.

Reinforcement Learning (RL)
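
The structure of learning a linearizing controller for an unknown plant can be sketched on a scalar system; a least-squares fit stands in for the paper's model-free policy-gradient update, and the plant and gains below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)

# Unknown scalar plant x+ = a*x + b*u; the controller never sees (a, b).
a_true, b_true = 0.9, 0.5

# Excite the plant with random inputs and fit (a, b) by least squares.
# (A stand-in for the paper's discrete-time policy-gradient update.)
xs, us, xns = [], [], []
x = 0.0
for _ in range(200):
    u = rng.normal()
    x_next = a_true * x + b_true * u + 0.01 * rng.normal()
    xs.append(x); us.append(u); xns.append(x_next)
    x = x_next
a_hat, b_hat = np.linalg.lstsq(
    np.column_stack([xs, us]), np.array(xns), rcond=None)[0]

# "Linearizing" control: cancel the estimated dynamics and impose the
# desired closed loop x+ = 0.5 * x, which drives x to the origin.
x = 1.0
for _ in range(50):
    u = (0.5 * x - a_hat * x) / b_hat
    x = a_true * x + b_true * u
```

Once the estimated dynamics are cancelled, any linear tracking controller can be wrapped around the resulting (approximately) linear system.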

On Thompson Sampling with Langevin Algorithms

no code implementations • ICML 2020 • Eric Mazumdar, Aldo Pacchiano, Yi-An Ma, Peter L. Bartlett, Michael I. Jordan

The resulting approximate Thompson sampling algorithm has logarithmic regret and its computational complexity does not scale with the time horizon of the algorithm.

Thompson Sampling

Feedback Linearization for Unknown Systems via Reinforcement Learning

no code implementations • 29 Oct 2019 • Tyler Westenbroek, David Fridovich-Keil, Eric Mazumdar, Shreyas Arora, Valmik Prabhu, S. Shankar Sastry, Claire J. Tomlin

We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics.

Reinforcement Learning (RL)

Convergence Analysis of Gradient-Based Learning with Non-Uniform Learning Rates in Non-Cooperative Multi-Agent Settings

no code implementations • 30 May 2019 • Benjamin Chasnov, Lillian J. Ratliff, Eric Mazumdar, Samuel A. Burden

Considering a class of gradient-based multi-agent learning algorithms in non-cooperative settings, we provide local convergence guarantees to a neighborhood of a stable local Nash equilibrium.

On Gradient-Based Learning in Continuous Games

no code implementations • 16 Apr 2018 • Eric Mazumdar, Lillian J. Ratliff, S. Shankar Sastry

We formulate a general framework for competitive gradient-based learning that encompasses a wide breadth of multi-agent learning algorithms, and analyze the limiting behavior of competitive gradient-based learning algorithms using dynamical systems theory.

Multi-agent Reinforcement Learning
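
The dynamical-systems view can be made concrete: simultaneous gradient play follows the negative of the game's vector field, and the spectrum of the game Jacobian (complex eigenvalues from its asymmetric part) explains the rotational convergence typical of competitive settings. The quadratic costs below are illustrative:

```python
import numpy as np

# Player 1 minimizes f1(x, y) = x**2 + 5*x*y; player 2 minimizes
# f2(x, y) = y**2 - 5*x*y. Gradient play follows z+ = z - h * omega(z).
def omega(x, y):
    return np.array([2 * x + 5 * y, 2 * y - 5 * x])  # (df1/dx, df2/dy)

# Game Jacobian at the joint critical point (0, 0): the asymmetric
# (game) part yields complex eigenvalues 2 +/- 5j, i.e. a stable spiral.
J = np.array([[2.0, 5.0], [-5.0, 2.0]])
eigs = np.linalg.eigvals(J)

# Discrete gradient play with a small enough step converges, but rotates.
z = np.array([1.0, 1.0])
h = 0.02
for _ in range(3000):
    z = z - h * omega(z[0], z[1])
```

Positive real parts of the Jacobian's eigenvalues certify local stability of the flow; the imaginary parts are the game-specific rotation absent from single-objective gradient descent.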

A Multi-Armed Bandit Approach for Online Expert Selection in Markov Decision Processes

no code implementations • 18 Jul 2017 • Eric Mazumdar, Roy Dong, Vicenç Rúbies Royo, Claire Tomlin, S. Shankar Sastry

We formulate a multi-armed bandit (MAB) approach to choosing expert policies online in Markov decision processes (MDPs).

Systems and Control

Inverse Risk-Sensitive Reinforcement Learning

no code implementations • 29 Mar 2017 • Lillian J. Ratliff, Eric Mazumdar

We address the problem of inverse reinforcement learning in Markov decision processes where the agent is risk-sensitive.

Decision Making • Reinforcement Learning +1
