Search Results for author: Eric Mazumdar

Found 21 papers, 1 paper with code

Rethinking Scaling Laws for Learning in Strategic Environments

no code implementations 12 Feb 2024 Tinashe Handina, Eric Mazumdar

We find that strategic interactions can break the conventional view of scaling laws – meaning that performance does not necessarily monotonically improve as models get larger and/or more expressive (even with infinite data).

Model Selection Multi-agent Reinforcement Learning

Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games

no code implementations 8 Dec 2023 Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, Adam Wierman

Specifically, through a change of variable, we show that the update equation of the slow-timescale iterates resembles the classical smoothed best-response dynamics, where the regularized Nash gap serves as a valid Lyapunov function.

Q-Learning valid

Algorithmic Collective Action in Machine Learning

no code implementations 8 Feb 2023 Moritz Hardt, Eric Mazumdar, Celestine Mendler-Dünner, Tijana Zrnic

We initiate a principled study of algorithmic collective action on digital platforms that deploy machine learning algorithms.

Language Modelling

Follower Agnostic Methods for Stackelberg Games

no code implementations 2 Feb 2023 Chinmay Maheshwari, S. Shankar Sastry, Lillian Ratliff, Eric Mazumdar

We propose an algorithm to solve a class of Stackelberg games (possibly with multiple followers) in a follower-agnostic manner.

Langevin Monte Carlo for Contextual Bandits

1 code implementation 22 Jun 2022 Pan Xu, Hongkai Zheng, Eric Mazumdar, Kamyar Azizzadenesheli, Anima Anandkumar

Existing Thompson sampling-based algorithms need to construct a Laplace approximation (i.e., a Gaussian distribution) of the posterior distribution, which is inefficient to sample in high dimensional applications for general covariance matrices.

Multi-Armed Bandits Thompson Sampling

Decentralized, Communication- and Coordination-free Learning in Structured Matching Markets

no code implementations 6 Jun 2022 Chinmay Maheshwari, Eric Mazumdar, Shankar Sastry

We study the problem of online learning in competitive settings in the context of two-sided matching markets.

Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games

no code implementations NeurIPS 2021 Tanner Fiez, Lillian Ratliff, Eric Mazumdar, Evan Faulkner, Adhyyan Narang

For the class of nonconvex-PL zero-sum games, we exploit timescale separation to construct a potential function that, when combined with the stability characterization and an asymptotic saddle avoidance result, gives a global asymptotic almost-sure convergence guarantee to a set of the strict local minmax equilibrium.

Who Leads and Who Follows in Strategic Classification?

no code implementations NeurIPS 2021 Tijana Zrnic, Eric Mazumdar, S. Shankar Sastry, Michael I. Jordan

In particular, by generalizing the standard model to allow both players to learn over time, we show that a decision-maker that makes updates faster than the agents can reverse the order of play, meaning that the agents lead and the decision-maker follows.

Classification

Zeroth-Order Methods for Convex-Concave Minmax Problems: Applications to Decision-Dependent Risk Minimization

no code implementations 16 Jun 2021 Chinmay Maheshwari, Chih-Yuan Chiu, Eric Mazumdar, S. Shankar Sastry, Lillian J. Ratliff

Min-max optimization is emerging as a key framework for analyzing problems of robustness to strategically and adversarially generated data.

Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization

no code implementations 27 Apr 2021 Yaodong Yu, Tianyi Lin, Eric Mazumdar, Michael I. Jordan

Distributionally robust supervised learning (DRSL) is emerging as a key paradigm for building reliable machine learning systems for real-world applications -- reflecting the need for classifiers and predictive models that are robust to the distribution shifts that arise from phenomena such as selection bias or nonstationarity.

BIG-bench Machine Learning Selection bias

Expert Selection in High-Dimensional Markov Decision Processes

no code implementations 26 Oct 2020 Vicenc Rubies-Royo, Eric Mazumdar, Roy Dong, Claire Tomlin, S. Shankar Sastry

In this work we present a multi-armed bandit framework for online expert selection in Markov decision processes and demonstrate its use in high-dimensional settings.

Vocal Bursts Intensity Prediction

Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

no code implementations 6 Apr 2020 Tyler Westenbroek, Eric Mazumdar, David Fridovich-Keil, Valmik Prabhu, Claire J. Tomlin, S. Shankar Sastry

This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules.

reinforcement-learning Reinforcement Learning (RL)

On Thompson Sampling with Langevin Algorithms

no code implementations ICML 2020 Eric Mazumdar, Aldo Pacchiano, Yi-An Ma, Peter L. Bartlett, Michael I. Jordan

The resulting approximate Thompson sampling algorithm has logarithmic regret and its computational complexity does not scale with the time horizon of the algorithm.

Thompson Sampling

Feedback Linearization for Unknown Systems via Reinforcement Learning

no code implementations 29 Oct 2019 Tyler Westenbroek, David Fridovich-Keil, Eric Mazumdar, Shreyas Arora, Valmik Prabhu, S. Shankar Sastry, Claire J. Tomlin

We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics.

reinforcement-learning Reinforcement Learning (RL)

Convergence Analysis of Gradient-Based Learning with Non-Uniform Learning Rates in Non-Cooperative Multi-Agent Settings

no code implementations 30 May 2019 Benjamin Chasnov, Lillian J. Ratliff, Eric Mazumdar, Samuel A. Burden

Considering a class of gradient-based multi-agent learning algorithms in non-cooperative settings, we provide local convergence guarantees to a neighborhood of a stable local Nash equilibrium.

On Gradient-Based Learning in Continuous Games

no code implementations 16 Apr 2018 Eric Mazumdar, Lillian J. Ratliff, S. Shankar Sastry

We formulate a general framework for competitive gradient-based learning that encompasses a wide breadth of multi-agent learning algorithms, and analyze the limiting behavior of competitive gradient-based learning algorithms using dynamical systems theory.

Multi-agent Reinforcement Learning

A Multi-Armed Bandit Approach for Online Expert Selection in Markov Decision Processes

no code implementations 18 Jul 2017 Eric Mazumdar, Roy Dong, Vicenç Rúbies Royo, Claire Tomlin, S. Shankar Sastry

We formulate a multi-armed bandit (MAB) approach to choosing expert policies online in Markov decision processes (MDPs).

Systems and Control

Inverse Risk-Sensitive Reinforcement Learning

no code implementations 29 Mar 2017 Lillian J. Ratliff, Eric Mazumdar

We address the problem of inverse reinforcement learning in Markov decision processes where the agent is risk-sensitive.

Decision Making reinforcement-learning
