Search Results for author: Eric Mazumdar

Found 17 papers, 1 paper with code

Langevin Monte Carlo for Contextual Bandits

1 code implementation · 22 Jun 2022 · Pan Xu, Hongkai Zheng, Eric Mazumdar, Kamyar Azizzadenesheli, Anima Anandkumar

Existing Thompson sampling-based algorithms need to construct a Laplace approximation (i.e., a Gaussian distribution) of the posterior distribution, which is inefficient to sample from in high-dimensional applications with general covariance matrices.
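
The idea can be illustrated with a minimal sketch (not the paper's exact algorithm): draw an approximate posterior sample of the bandit parameter with a few unadjusted Langevin steps, then pull the arm that looks best under that sample. The linear-Gaussian reward model, step size, number of Langevin steps, and prior precision below are illustrative assumptions.

    import numpy as np

    def langevin_posterior_sample(X, y, theta0, step=1e-3, n_steps=200, prior_prec=1.0):
        """Approximate sample from p(theta | X, y) via unadjusted Langevin dynamics."""
        theta = theta0.copy()
        for _ in range(n_steps):
            # Gradient of the log posterior for a Gaussian likelihood and Gaussian prior.
            grad = X.T @ (y - X @ theta) - prior_prec * theta
            theta = theta + step * grad + np.sqrt(2 * step) * np.random.randn(*theta.shape)
        return theta

    def thompson_step(arm_contexts, X_hist, y_hist, theta_prev):
        """Sample a parameter, then play the arm with the highest sampled reward."""
        theta = langevin_posterior_sample(X_hist, y_hist, theta_prev)
        return int(np.argmax(arm_contexts @ theta)), theta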

Multi-Armed Bandits

Decentralized, Communication- and Coordination-free Learning in Structured Matching Markets

no code implementations · 6 Jun 2022 · Chinmay Maheshwari, Eric Mazumdar, Shankar Sastry

We study the problem of online learning in competitive settings in the context of two-sided matching markets.

online learning

Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games

no code implementations · NeurIPS 2021 · Tanner Fiez, Lillian Ratliff, Eric Mazumdar, Evan Faulkner, Adhyyan Narang

For the class of nonconvex-PL zero-sum games, we exploit timescale separation to construct a potential function that, when combined with the stability characterization and an asymptotic saddle-avoidance result, gives a global asymptotic almost-sure convergence guarantee to the set of strict local minmax equilibria.
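
For context, a generic two-timescale (timescale-separated) gradient descent-ascent update for min_x max_y f(x, y) looks as follows; the step-size schedule and exact update used in the paper may differ.

    % Two-timescale gradient descent-ascent: the minimizing player moves on the
    % slower timescale, so the maximizer effectively tracks its best response.
    \begin{aligned}
    x_{k+1} &= x_k - \gamma_k \, \nabla_x f(x_k, y_k), \\
    y_{k+1} &= y_k + \eta_k \, \nabla_y f(x_k, y_k), \qquad \gamma_k / \eta_k \to 0 .
    \end{aligned}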

Who Leads and Who Follows in Strategic Classification?

no code implementations · NeurIPS 2021 · Tijana Zrnic, Eric Mazumdar, S. Shankar Sastry, Michael I. Jordan

In particular, by generalizing the standard model to allow both players to learn over time, we show that a decision-maker that makes updates faster than the agents can reverse the order of play, meaning that the agents lead and the decision-maker follows.
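
A minimal sketch of the relative-update-speed phenomenon, with hypothetical loss gradients grad_dm and grad_agents: the decision-maker updates every round while the strategic agents respond only every agent_period rounds. This illustrates the setting, not the paper's exact model.

    def repeated_play(grad_dm, grad_agents, theta, z, rounds=1000,
                      lr_dm=0.1, lr_agents=0.1, agent_period=10):
        """Gradient play where the decision-maker updates faster than the agents."""
        for t in range(rounds):
            # Decision-maker (e.g., classifier parameters) moves every round.
            theta = theta - lr_dm * grad_dm(theta, z)
            # Strategic agents (e.g., feature manipulations) move less frequently.
            if t % agent_period == 0:
                z = z - lr_agents * grad_agents(theta, z)
        return theta, z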

Classification

Zeroth-Order Methods for Convex-Concave Minmax Problems: Applications to Decision-Dependent Risk Minimization

no code implementations · 16 Jun 2021 · Chinmay Maheshwari, Chih-Yuan Chiu, Eric Mazumdar, S. Shankar Sastry, Lillian J. Ratliff

Min-max optimization is emerging as a key framework for analyzing problems of robustness to strategically and adversarially generated data.
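
As a rough illustration of the zeroth-order setting, the sketch below plugs a two-point random-direction gradient estimate into plain gradient descent-ascent for min_x max_y f(x, y); the smoothing radius, step sizes, and averaging choices are illustrative, not the paper's.

    import numpy as np

    def zo_partial_grad(f, x, y, wrt="x", mu=1e-2):
        """Two-point random-direction estimate of a partial gradient of f."""
        if wrt == "x":
            u = np.random.randn(*x.shape)
            return (f(x + mu * u, y) - f(x - mu * u, y)) / (2 * mu) * u
        u = np.random.randn(*y.shape)
        return (f(x, y + mu * u) - f(x, y - mu * u)) / (2 * mu) * u

    def zo_gradient_descent_ascent(f, x, y, steps=1000, lr=1e-2):
        """Gradient descent on x and ascent on y using only function evaluations."""
        for _ in range(steps):
            x = x - lr * zo_partial_grad(f, x, y, wrt="x")
            y = y + lr * zo_partial_grad(f, x, y, wrt="y")
        return x, y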

Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization

no code implementations · 27 Apr 2021 · Yaodong Yu, Tianyi Lin, Eric Mazumdar, Michael I. Jordan

Distributionally robust supervised learning (DRSL) is emerging as a key paradigm for building reliable machine learning systems for real-world applications -- reflecting the need for classifiers and predictive models that are robust to the distribution shifts that arise from phenomena such as selection bias or nonstationarity.
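
The min-max structure referred to here is the usual distributionally robust objective; the specific ambiguity set and variance-reduction scheme in the paper may differ from this generic form.

    % Generic DRSL objective: minimize the worst-case expected loss over
    % distributions Q close to the empirical distribution P_n.
    \min_{\theta} \; \max_{Q \,:\, D(Q \,\|\, P_n) \le \rho} \;
    \mathbb{E}_{(x, y) \sim Q} \big[ \ell(\theta; x, y) \big]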

BIG-bench Machine Learning · Selection bias

Expert Selection in High-Dimensional Markov Decision Processes

no code implementations · 26 Oct 2020 · Vicenc Rubies-Royo, Eric Mazumdar, Roy Dong, Claire Tomlin, S. Shankar Sastry

In this work we present a multi-armed bandit framework for online expert selection in Markov decision processes and demonstrate its use in high-dimensional settings.
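
A minimal sketch of the framework, assuming a hypothetical run_episode(k) routine that rolls out expert k in the MDP and returns its episodic return: each expert policy is treated as an arm and selected with a UCB rule. The paper's algorithm and confidence bounds may differ.

    import numpy as np

    def ucb_expert_selection(run_episode, n_experts, horizon=500, c=2.0):
        """Treat each expert policy as a bandit arm and select via UCB on returns."""
        counts = np.zeros(n_experts)
        means = np.zeros(n_experts)
        for t in range(1, horizon + 1):
            if t <= n_experts:
                k = t - 1  # play every expert once to initialize
            else:
                k = int(np.argmax(means + c * np.sqrt(np.log(t) / counts)))
            ret = run_episode(k)          # roll out expert k for one episode
            counts[k] += 1
            means[k] += (ret - means[k]) / counts[k]
        return int(np.argmax(means))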

Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

no code implementations · 6 Apr 2020 · Tyler Westenbroek, Eric Mazumdar, David Fridovich-Keil, Valmik Prabhu, Claire J. Tomlin, S. Shankar Sastry

This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules.
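
A highly simplified sketch of one such parameter update, with hypothetical callables for the unknown plant (env_step), the learned linearizing terms (alpha_hat, beta_hat and their parameter gradient d_u_d_theta), and a fixed linear tracking law; the paper's update rules are more involved.

    import numpy as np

    def policy_gradient_update(env_step, alpha_hat, beta_hat, d_u_d_theta,
                               linear_law, theta, x, x_ref, lr=1e-3, sigma=0.1):
        """One REINFORCE-style update of a learned feedback-linearizing controller."""
        v = linear_law(x, x_ref)                        # virtual input for tracking
        u_mean = alpha_hat(x, theta) + beta_hat(x, theta) * v
        u = u_mean + sigma * np.random.randn()          # Gaussian exploration noise
        x_next, cost = env_step(x, u, x_ref)            # unknown plant; tracking cost
        # Likelihood-ratio gradient: push theta toward actions with lower cost.
        grad_log_prob = (u - u_mean) / sigma**2 * d_u_d_theta(x, v, theta)
        return theta - lr * cost * grad_log_prob, x_next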

reinforcement-learning

On Thompson Sampling with Langevin Algorithms

no code implementations · ICML 2020 · Eric Mazumdar, Aldo Pacchiano, Yi-An Ma, Peter L. Bartlett, Michael I. Jordan

The resulting approximate Thompson sampling algorithm has logarithmic regret and its computational complexity does not scale with the time horizon of the algorithm.

Feedback Linearization for Unknown Systems via Reinforcement Learning

no code implementations · 29 Oct 2019 · Tyler Westenbroek, David Fridovich-Keil, Eric Mazumdar, Shreyas Arora, Valmik Prabhu, S. Shankar Sastry, Claire J. Tomlin

We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics.

reinforcement-learning

Policy-Gradient Algorithms Have No Guarantees of Convergence in Linear Quadratic Games

no code implementations · 8 Jul 2019 · Eric Mazumdar, Lillian J. Ratliff, Michael I. Jordan, S. Shankar Sastry

In such games, the state and action spaces are continuous and global Nash equilibria can be found by solving coupled Riccati equations.
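
For reference, one standard two-player linear-quadratic game formulation (the paper's exact setup may differ): linear dynamics driven by both players' inputs, per-player quadratic costs, and Nash equilibria in linear feedback policies characterized by coupled Riccati-type equations.

    % Dynamics and costs of a two-player LQ game (illustrative form).
    x_{t+1} = A x_t + B_1 u_{1,t} + B_2 u_{2,t}, \qquad
    J_i = \sum_{t \ge 0} \big( x_t^\top Q_i x_t + u_{i,t}^\top R_i u_{i,t} \big), \quad i \in \{1, 2\}.
    % Linear feedback Nash policies u_{i,t} = -K_i x_t solve coupled
    % Riccati-type fixed-point equations, e.g. for player 1:
    K_1 = (R_1 + B_1^\top P_1 B_1)^{-1} B_1^\top P_1 (A - B_2 K_2), \qquad
    P_1 = Q_1 + K_1^\top R_1 K_1 + (A - B_1 K_1 - B_2 K_2)^\top P_1 (A - B_1 K_1 - B_2 K_2),
    % with the symmetric pair of equations for player 2.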

reinforcement-learning

Convergence Analysis of Gradient-Based Learning with Non-Uniform Learning Rates in Non-Cooperative Multi-Agent Settings

no code implementations · 30 May 2019 · Benjamin Chasnov, Lillian J. Ratliff, Eric Mazumdar, Samuel A. Burden

Considering a class of gradient-based multi-agent learning algorithms in non-cooperative settings, we provide local convergence guarantees to a neighborhood of a stable local Nash equilibrium.
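
The update being analyzed has the generic form below, with an agent-specific learning rate gamma_i for each cost f_i; the exact deterministic or stochastic variant studied in the paper may differ. The ratios between the gamma_i enter the local stability analysis around an equilibrium.

    % Gradient-based learning with non-uniform (agent-specific) learning rates.
    x_i^{k+1} = x_i^k - \gamma_i \, \nabla_{x_i} f_i\big(x_1^k, \ldots, x_n^k\big),
    \qquad i = 1, \ldots, n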

On Gradient-Based Learning in Continuous Games

no code implementations · 16 Apr 2018 · Eric Mazumdar, Lillian J. Ratliff, S. Shankar Sastry

We formulate a general framework for competitive gradient-based learning that encompasses a wide breadth of multi-agent learning algorithms, and analyze the limiting behavior of competitive gradient-based learning algorithms using dynamical systems theory.
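
The limiting behavior referred to is that of the combined gradient dynamics below, where each agent descends its own cost; this generic form is consistent with the abstract, though the paper's notation may differ. Because the Jacobian of omega is not symmetric in general, the dynamics are not a gradient flow of any single function, which is what makes the analysis game-theoretic.

    % Combined individual-gradient vector field and its continuous-time limit.
    \omega(x) = \big( \nabla_{x_1} f_1(x), \ldots, \nabla_{x_n} f_n(x) \big), \qquad
    \dot{x} = -\omega(x)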

Multi-agent Reinforcement Learning

A Multi-Armed Bandit Approach for Online Expert Selection in Markov Decision Processes

no code implementations · 18 Jul 2017 · Eric Mazumdar, Roy Dong, Vicenç Rúbies Royo, Claire Tomlin, S. Shankar Sastry

We formulate a multi-armed bandit (MAB) approach to choosing expert policies online in Markov decision processes (MDPs).

Systems and Control

Inverse Risk-Sensitive Reinforcement Learning

no code implementations · 29 Mar 2017 · Lillian J. Ratliff, Eric Mazumdar

We address the problem of inverse reinforcement learning in Markov decision processes where the agent is risk-sensitive.
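
For orientation, one common way to make an RL objective risk-sensitive is through an exponential (entropic) utility, shown below; the paper's risk model may differ, and the inverse problem is then to recover the reward and risk parameters that explain observed behavior.

    % Entropic risk-sensitive value of a policy \pi with risk parameter \beta.
    V_\beta^\pi(s) = \frac{1}{\beta} \log \,
    \mathbb{E}^{\pi} \Big[ \exp \Big( \beta \sum_{t \ge 0} \gamma^t r_t \Big) \,\Big|\, s_0 = s \Big]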

Decision Making · reinforcement-learning
