Search Results for author: Adam Wierman

Found 60 papers, 12 papers with code

Online Budgeted Matching with General Bids

no code implementations 6 Nov 2024 Jianyi Yang, Pengfei Li, Adam Wierman, Shaolei Ren

In this paper, we remove the FLM assumption and tackle the open problem of OBM with general bids.

Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data

1 code implementation 6 Nov 2024 Chengrui Qu, Laixi Shi, Kishan Panaganti, Pengcheng You, Adam Wierman

However, with prior information on the degree of the dynamics shift, we design HySRL, a transfer algorithm that achieves problem-dependent sample complexity and outperforms pure online RL.

Reinforcement Learning (RL) Transfer Reinforcement Learning

Breaking the Curse of Multiagency in Robust Multi-Agent Reinforcement Learning

no code implementations 30 Sep 2024 Laixi Shi, Jingchu Gai, Eric Mazumdar, Yuejie Chi, Adam Wierman

A notorious yet open challenge is whether RMGs can escape the curse of multiagency, where the sample complexity scales exponentially with the number of agents.

Multi-agent Reinforcement Learning

End-to-End Conformal Calibration for Optimization Under Uncertainty

1 code implementation 30 Sep 2024 Christopher Yeh, Nicolas Christianson, Alan Wu, Adam Wierman, Yisong Yue

However, ensuring robustness guarantees requires well-calibrated uncertainty estimates, which can be difficult to achieve in high-capacity prediction models such as deep neural networks.

Conformal Prediction Decision Making +3

Last-Iterate Convergence of Payoff-Based Independent Learning in Zero-Sum Stochastic Games

no code implementations 2 Sep 2024 Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, Adam Wierman

In this paper, we consider two-player zero-sum matrix and stochastic games and develop learning dynamics that are payoff-based, convergent, rational, and symmetric between the two players.

CarbonClipper: Optimal Algorithms for Carbon-Aware Spatiotemporal Workload Management

no code implementations 14 Aug 2024 Adam Lechowicz, Nicolas Christianson, Bo Sun, Noman Bashir, Mohammad Hajiesmaili, Adam Wierman, Prashant Shenoy

We formalize this as an online problem called spatiotemporal online allocation with deadline constraints ($\mathsf{SOAD}$), in which an online player completes a workload (e.g., a batch compute job) by moving and scheduling the workload across a network subject to a deadline $T$.

Management Scheduling

Distributionally Robust Constrained Reinforcement Learning under Strong Duality

no code implementations 22 Jun 2024 Zhengfei Zhang, Kishan Panaganti, Laixi Shi, Yanan Sui, Adam Wierman, Yisong Yue

We study the problem of Distributionally Robust Constrained RL (DRC-RL), where the goal is to maximize the expected reward subject to environmental distribution shifts and constraints.

Car Racing reinforcement-learning +1

Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation

no code implementations 31 May 2024 Shangding Gu, Laixi Shi, Yuhao Ding, Alois Knoll, Costas Spanos, Adam Wierman, Ming Jin

Safe reinforcement learning (RL) is crucial for deploying RL agents in real-world applications, as it aims to maximize long-term rewards while satisfying safety constraints.

reinforcement-learning Reinforcement Learning +2

Approximate Global Convergence of Independent Learning in Multi-Agent Systems

no code implementations 30 May 2024 Ruiyang Jin, Zaiwei Chen, Yiheng Lin, Jie Song, Adam Wierman

Independent learning (IL), despite being a popular approach in practice to achieve scalability in large-scale multi-agent systems, usually lacks global convergence guarantees.

Q-Learning

Carbon Connect: An Ecosystem for Sustainable Computing

no code implementations 22 May 2024 Benjamin C. Lee, David Brooks, Arthur van Benthem, Udit Gupta, Gage Hills, Vincent Liu, Benjamin Pierce, Christopher Stewart, Emma Strubell, Gu-Yeon Wei, Adam Wierman, Yuan YAO, Minlan Yu

For embodied carbon, we must re-think conventional design strategies -- over-provisioned monolithic servers, frequent hardware refresh cycles, custom silicon -- and adopt life-cycle design strategies that more effectively reduce, reuse and recycle hardware at scale.

Management

Model-Free Robust $φ$-Divergence Reinforcement Learning Using Both Offline and Online Data

no code implementations 8 May 2024 Kishan Panaganti, Adam Wierman, Eric Mazumdar

To the best of our knowledge, we provide the first improved out-of-data-distribution assumption in large-scale problems with general function approximation under the hybrid robust $\phi$-regularized reinforcement learning framework.

reinforcement-learning Reinforcement Learning

Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty

no code implementations 29 Apr 2024 Laixi Shi, Eric Mazumdar, Yuejie Chi, Adam Wierman

To overcome the sim-to-real gap in reinforcement learning (RL), learned policies must maintain robustness against environmental uncertainties.

Multi-agent Reinforcement Learning Reinforcement Learning (RL)

Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games

no code implementations 8 Dec 2023 Zaiwei Chen, Kaiqing Zhang, Eric Mazumdar, Asuman Ozdaglar, Adam Wierman

Specifically, through a change of variable, we show that the update equation of the slow-timescale iterates resembles the classical smoothed best-response dynamics, where the regularized Nash gap serves as a valid Lyapunov function.

Q-Learning

Adversarial Attacks on Cooperative Multi-agent Bandits

no code implementations 3 Nov 2023 Jinhang Zuo, Zhiyao Zhang, Xuchuang Wang, Cheng Chen, Shuai Li, John C. S. Lui, Mohammad Hajiesmaili, Adam Wierman

Cooperative multi-agent multi-armed bandits (CMA2B) consider the collaborative efforts of multiple agents in a shared multi-armed bandit game.

Multi-Armed Bandits

Best of Both Worlds Guarantees for Smoothed Online Quadratic Optimization

no code implementations 31 Oct 2023 Neelkamal Bhuyan, Debankur Mukherjee, Adam Wierman

We provide the online optimal algorithm when the minimizers of the hitting cost function evolve as a general stochastic process, which, in the case of a martingale process, takes the form of a distribution-agnostic dynamic interpolation algorithm (LAI).

Management

Online Conversion with Switching Costs: Robust and Learning-Augmented Algorithms

1 code implementation 31 Oct 2023 Adam Lechowicz, Nicolas Christianson, Bo Sun, Noman Bashir, Mohammad Hajiesmaili, Adam Wierman, Prashant Shenoy

We introduce competitive (robust) threshold-based algorithms for both the minimization and maximization variants of this problem, and show they are optimal among deterministic online algorithms.
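As background on threshold-based online conversion (this is the classic reservation-price rule for online max-search with prices in $[m, M]$, not this paper's algorithm, whose setting additionally has switching costs; function names are illustrative), a minimal sketch:

```python
import math

def reservation_price(m, M):
    """Classic reservation price for online max-search with prices
    known to lie in [m, M]: accept the first price >= sqrt(m*M)."""
    return math.sqrt(m * M)

def online_search(prices, m, M):
    """Sell at the first price meeting the threshold; if no such
    price arrives before the deadline, sell at the final price."""
    threshold = reservation_price(m, M)
    for p in prices:
        if p >= threshold:
            return p
    return prices[-1]  # forced sale at the deadline
```

With $m = 1$ and $M = 16$ the threshold is 4; this simple rule achieves the well-known $\sqrt{M/m}$ competitive ratio for the basic (switching-cost-free) problem.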

Online Algorithms with Uncertainty-Quantified Predictions

no code implementations 17 Oct 2023 Bo Sun, Jerry Huang, Nicolas Christianson, Mohammad Hajiesmaili, Adam Wierman, Raouf Boutaba

The burgeoning field of algorithms with predictions studies the problem of using possibly imperfect machine learning predictions to improve online algorithm performance.

Uncertainty Quantification

Online learning for robust voltage control under uncertain grid topology

1 code implementation 29 Jun 2023 Christopher Yeh, Jing Yu, Yuanyuan Shi, Adam Wierman

In this work, we combine a nested convex body chasing algorithm with a robust predictive controller to achieve provably finite-time convergence to safe voltage limits in the online setting where there is uncertainty in both the network topology as well as load and generation variations.

Towards Environmentally Equitable AI via Geographical Load Balancing

1 code implementation 20 Jun 2023 Pengfei Li, Jianyi Yang, Adam Wierman, Shaolei Ren

The results demonstrate that existing GLB approaches may amplify environmental inequity while our proposed equity-aware GLB can significantly reduce the regional disparity in terms of carbon and water footprints.

Learning-Augmented Decentralized Online Convex Optimization in Networks

no code implementations 16 Jun 2023 Pengfei Li, Jianyi Yang, Adam Wierman, Shaolei Ren

This paper studies decentralized online convex optimization in a networked multi-agent system and proposes a novel algorithm, Learning-Augmented Decentralized Online optimization (LADO), for individual agents to select actions only based on local online information.

Contextual Combinatorial Bandits with Probabilistically Triggered Arms

no code implementations 30 Mar 2023 Xutong Liu, Jinhang Zuo, Siwei Wang, John C. S. Lui, Mohammad Hajiesmaili, Adam Wierman, Wei Chen

We study contextual combinatorial bandits with probabilistically triggered arms (C$^2$MAB-T) under a variety of smoothness conditions that capture a wide range of applications, such as contextual cascading bandits and contextual influence maximization bandits.

Convergence Rates for Localized Actor-Critic in Networked Markov Potential Games

1 code implementation 8 Mar 2023 Zhaoyi Zhou, Zaiwei Chen, Yiheng Lin, Adam Wierman

The algorithm is scalable since each agent uses only local information and does not need access to the global state.

Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning

no code implementations 30 Nov 2022 Yizhou Zhang, Guannan Qu, Pan Xu, Yiheng Lin, Zaiwei Chen, Adam Wierman

In particular, we show that, despite restricting each agent's attention to only its $\kappa$-hop neighborhood, the agents are able to learn a policy with an optimality gap that decays polynomially in $\kappa$.

Multi-agent Reinforcement Learning reinforcement-learning +1

Stability Constrained Reinforcement Learning for Decentralized Real-Time Voltage Control

1 code implementation 16 Sep 2022 Jie Feng, Yuanyuan Shi, Guannan Qu, Steven H. Low, Anima Anandkumar, Adam Wierman

In this paper, we propose a stability-constrained reinforcement learning (RL) method for real-time voltage control, that guarantees system stability both during policy learning and deployment of the learned policy.

reinforcement-learning Reinforcement Learning +1

Robust Online Voltage Control with an Unknown Grid Topology

1 code implementation 29 Jun 2022 Christopher Yeh, Jing Yu, Yuanyuan Shi, Adam Wierman

Voltage control generally requires accurate information about the grid's topology in order to guarantee network stability.

Chasing Convex Bodies and Functions with Black-Box Advice

no code implementations 23 Jun 2022 Nicolas Christianson, Tinashe Handina, Adam Wierman

We consider the problem of convex function chasing with black-box advice, where an online decision-maker aims to minimize the total cost of making and switching between decisions in a normed vector space, aided by black-box advice such as the decisions of a machine-learned algorithm.

KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems

no code implementations 3 Jun 2022 Sahin Lale, Yuanyuan Shi, Guannan Qu, Kamyar Azizzadenesheli, Adam Wierman, Anima Anandkumar

However, current reinforcement learning (RL) methods lack stabilization guarantees, which limits their applicability for the control of safety-critical systems.

reinforcement-learning Reinforcement Learning (RL)

Interface Networks for Failure Localization in Power Systems

no code implementations 12 May 2022 Chen Liang, Alessandro Zocca, Steven H. Low, Adam Wierman

Transmission power systems usually consist of interconnected sub-grids that are operated relatively independently.

Near-Optimal Distributed Linear-Quadratic Regulator for Networked Systems

1 code implementation 12 Apr 2022 Sungho Shin, Yiheng Lin, Guannan Qu, Adam Wierman, Mihai Anitescu

This paper studies the trade-off between the degree of decentralization and the performance of a distributed controller in a linear-quadratic control setting.

Online Adversarial Stabilization of Unknown Networked Systems

no code implementations 5 Mar 2022 Jing Yu, Dimitar Ho, Adam Wierman

We investigate the problem of stabilizing an unknown networked linear system under communication constraints and adversarial disturbances.

Smoothed Online Optimization with Unreliable Predictions

no code implementations 7 Feb 2022 Daan Rutten, Nico Christianson, Debankur Mukherjee, Adam Wierman

The goal of the decision maker is to exploit the predictions if they are accurate, while guaranteeing performance that is not much worse than the hindsight optimal sequence of decisions, even when predictions are inaccurate.

Online Optimization with Feedback Delay and Nonlinear Switching Cost

no code implementations 29 Oct 2021 Weici Pan, Guanya Shi, Yiheng Lin, Adam Wierman

We study a variant of online optimization in which the learner receives $k$-round $\textit{delayed feedback}$ about hitting cost and there is a multi-step nonlinear switching cost, i.e., costs depend on multiple previous actions in a nonlinear manner.

Stability Constrained Reinforcement Learning for Real-Time Voltage Control

no code implementations 30 Sep 2021 Yuanyuan Shi, Guannan Qu, Steven Low, Anima Anandkumar, Adam Wierman

Deep reinforcement learning (RL) has been recognized as a promising tool to address the challenges in real-time control of power systems.

reinforcement-learning Reinforcement Learning +1

Pareto-Optimal Learning-Augmented Algorithms for Online Conversion Problems

no code implementations NeurIPS 2021 Bo Sun, Russell Lee, Mohammad Hajiesmaili, Adam Wierman, Danny H. K. Tsang

This paper leverages machine-learned predictions to design competitive algorithms for online conversion problems with the goal of improving the competitive ratio when predictions are accurate (i.e., consistency), while also guaranteeing a worst-case competitive ratio regardless of the prediction quality (i.e., robustness).
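The consistency/robustness terminology above can be illustrated with the classic learning-augmented ski-rental algorithm of Purohit et al. (NeurIPS 2018), a simpler problem than online conversion; this sketch is background, not this paper's algorithm:

```python
import math

def ski_rental_cost(x, y, b, lam):
    """Learning-augmented ski rental: rent at 1/day, buy once for b.
    x: true number of ski days, y: predicted days,
    lam in (0, 1] trades consistency (1 + lam when y is accurate)
    for robustness (1 + 1/lam in the worst case)."""
    # trust the prediction: buy early if it says the season is long
    buy_day = math.ceil(lam * b) if y >= b else math.ceil(b / lam)
    if x >= buy_day:
        return (buy_day - 1) + b  # rented buy_day - 1 days, then bought
    return x                      # season ended before the buy day

def offline_opt(x, b):
    """Hindsight-optimal cost: rent every day or buy immediately."""
    return min(x, b)
```

For example, with $b = 10$ and $\lambda = 0.5$: an accurate long-season prediction gives cost 14 against OPT 10 (ratio $1.4 \le 1 + \lambda$), while a badly wrong short-season prediction gives cost 29 (ratio $2.9 \le 1 + 1/\lambda$).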

Robustness and Consistency in Linear Quadratic Control with Untrusted Predictions

no code implementations NeurIPS 2021 Tongxin Li, Ruixiao Yang, Guannan Qu, Guanya Shi, Chenkai Yu, Adam Wierman, Steven H. Low

Motivated by online learning methods, we design a self-tuning policy that adaptively learns the trust parameter $\lambda$ with a competitive ratio that depends on $\varepsilon$ and the variation of system perturbations and predictions.

Stable Online Control of Linear Time-Varying Systems

no code implementations 29 Apr 2021 Guannan Qu, Yuanyuan Shi, Sahin Lale, Anima Anandkumar, Adam Wierman

In this work, we propose an efficient online control algorithm, COvariance Constrained Online Linear Quadratic (COCO-LQ) control, that guarantees input-to-state stability for a large class of LTV systems while also minimizing the control cost.

Learning-Based Predictive Control via Real-Time Aggregate Flexibility

no code implementations 21 Dec 2020 Tongxin Li, Bo Sun, Yue Chen, Zixin Ye, Steven H. Low, Adam Wierman

To be used effectively, an aggregator must be able to communicate the available flexibility of the loads it controls, known as the aggregate flexibility, to a system operator.

Optimization and Control Systems and Control

The Power of Predictions in Online Control

no code implementations NeurIPS 2020 Chenkai Yu, Guanya Shi, Soon-Jo Chung, Yisong Yue, Adam Wierman

We study the impact of predictions in online Linear Quadratic Regulator control with both stochastic and adversarial disturbances in the dynamics.

Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward

no code implementations NeurIPS 2020 Guannan Qu, Yiheng Lin, Adam Wierman, Na Li

It has long been recognized that multi-agent reinforcement learning (MARL) faces significant scalability issues due to the fact that the size of the state and action spaces are exponentially large in the number of agents.

Multi-agent Reinforcement Learning reinforcement-learning +1

Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems

no code implementations L4DC 2020 Guannan Qu, Adam Wierman, Na Li

We study reinforcement learning (RL) in a setting with a network of agents whose states and actions interact in a local manner where the objective is to find localized policies such that the (discounted) global reward is maximized.

reinforcement-learning Reinforcement Learning (RL)

Line Failure Localization of Power Networks Part II: Cut Set Outages

no code implementations 22 May 2020 Linqi Guo, Chen Liang, Alessandro Zocca, Steven H. Low, Adam Wierman

Transmission line failures in power systems propagate non-locally, making the control of the resulting outages extremely difficult.

Adaptive Network Response to Line Failures in Power Systems

no code implementations 22 May 2020 Chen Liang, Linqi Guo, Alessandro Zocca, Steven H. Low, Adam Wierman

Transmission line failures in power systems propagate and cascade non-locally.

Line Failure Localization of Power Networks Part I: Non-cut Outages

no code implementations 20 May 2020 Linqi Guo, Chen Liang, Alessandro Zocca, Steven H. Low, Adam Wierman

Transmission line failures in power systems propagate non-locally, making the control of the resulting outages extremely difficult.

Online Optimization with Memory and Competitive Control

1 code implementation NeurIPS 2020 Guanya Shi, Yiheng Lin, Soon-Jo Chung, Yisong Yue, Adam Wierman

This paper presents competitive algorithms for a novel class of online optimization problems with memory.

Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning

no code implementations 1 Feb 2020 Guannan Qu, Adam Wierman

We consider a general asynchronous Stochastic Approximation (SA) scheme featuring a weighted infinity-norm contractive operator, and prove a bound on its finite-time convergence rate on a single trajectory.

Q-Learning
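As background on the asynchronous setting analyzed above (each step updates only the single state-action entry actually visited along the trajectory), a textbook tabular Q-learning loop on a toy two-state MDP might look as follows; the MDP, step-size schedule, and uniform exploration policy are illustrative choices, not this paper's construction:

```python
import random

def q_learning(num_steps=20000, gamma=0.9, seed=0):
    """Tabular asynchronous Q-learning on a toy deterministic MDP
    with 2 states and 2 actions, updated along a single trajectory."""
    rng = random.Random(seed)
    next_state = [[0, 1], [0, 1]]      # next_state[s][a]
    reward = [[0.0, 1.0], [2.0, 0.0]]  # reward[s][a]
    Q = [[0.0, 0.0], [0.0, 0.0]]
    s = 0
    for t in range(1, num_steps + 1):
        a = rng.randrange(2)           # behavior policy: uniform exploration
        s2, r = next_state[s][a], reward[s][a]
        alpha = 1.0 / t ** 0.75        # diminishing step size
        # asynchronous update: only the visited entry (s, a) changes
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
    return Q
```

On this instance the optimal policy cycles between the two states (take action 1 in state 0, action 0 in state 1), and the learned Q-table recovers that ordering.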

Scalable Reinforcement Learning for Multi-Agent Networked Systems

no code implementations 5 Dec 2019 Guannan Qu, Adam Wierman, Na Li

We study reinforcement learning (RL) in a setting with a network of agents whose states and actions interact in a local manner where the objective is to find localized policies such that the (discounted) global reward is maximized.

reinforcement-learning Reinforcement Learning +1

Online Optimization with Predictions and Non-convex Losses

no code implementations 10 Nov 2019 Yiheng Lin, Gautam Goel, Adam Wierman

In this work, we give two general sufficient conditions that specify a relationship between the hitting and movement costs which guarantees that a new algorithm, Synchronized Fixed Horizon Control (SFHC), provides a $1+O(1/w)$ competitive ratio, where $w$ is the number of predictions available to the learner.

Beyond Online Balanced Descent: An Optimal Algorithm for Smoothed Online Optimization

no code implementations NeurIPS 2019 Gautam Goel, Yiheng Lin, Haoyuan Sun, Adam Wierman

We prove a new lower bound on the competitive ratio of any online algorithm in the setting where the costs are $m$-strongly convex and the movement costs are the squared $\ell_2$ norm.

Transparency and Control in Platforms for Networked Markets

no code implementations 11 Mar 2019 John Pang, Weixuan Lin, Hu Fu, Jack Kleeman, Eilyan Bitar, Adam Wierman

In this paper, we analyze the worst case efficiency loss of online platform designs under a networked Cournot competition model.

Computer Science and Game Theory

Smoothed Online Optimization for Regression and Control

no code implementations 23 Oct 2018 Gautam Goel, Adam Wierman

We consider Online Convex Optimization (OCO) in the setting where the costs are $m$-strongly convex and the online learner pays a switching cost for changing decisions between rounds.

regression
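The hitting-cost/switching-cost trade-off at the heart of this setting can be seen in a toy one-dimensional instance; the sketch below only evaluates candidate decision sequences under quadratic hitting costs and an $\ell_1$ switching cost, and is not this paper's algorithm:

```python
def soco_cost(xs, targets, beta, x0=0.0):
    """Total cost of a decision sequence in smoothed online convex
    optimization: hitting cost (x_t - y_t)^2 at each round plus a
    switching cost beta * |x_t - x_{t-1}| for changing decisions."""
    cost, prev = 0.0, x0
    for x, y in zip(xs, targets):
        cost += (x - y) ** 2 + beta * abs(x - prev)
        prev = x
    return cost

targets = [0.0, 1.0, 0.0, 1.0]   # oscillating per-round minimizers
greedy = list(targets)           # ignore switching: chase each minimizer
smoothed = [0.5] * 4             # hold a single compromise point
```

With switching weight `beta = 2.0`, chasing the minimizers costs 6.0 (all switching), while holding the compromise point costs 2.0: when movement is expensive, the best online decisions deliberately miss the per-round minimizers.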

Newton Polytopes and Relative Entropy Optimization

no code implementations 3 Oct 2018 Riley Murray, Venkat Chandrasekaran, Adam Wierman

When specialized to the context of polynomials, we obtain analysis and computational tools that only depend on the particular monomials that constitute a sparse polynomial.

Optimization and Control

Smoothed Online Convex Optimization in High Dimensions via Online Balanced Descent

no code implementations 28 Mar 2018 Niangjun Chen, Gautam Goel, Adam Wierman

We demonstrate the generality of the OBD framework by showing how, with different choices of "balance," OBD can improve upon state-of-the-art performance guarantees for both competitive ratio and regret. In particular, OBD is the first algorithm to achieve a dimension-free competitive ratio, $3 + O(1/\alpha)$, for locally polyhedral costs, where $\alpha$ measures the "steepness" of the costs.

A Parallelizable Acceleration Framework for Packing Linear Programs

no code implementations 17 Nov 2017 Palma London, Shai Vardi, Adam Wierman, Hanling Yi

This paper presents an acceleration framework for packing linear programming problems where the amount of data available is limited, i.e., where the number of constraints $m$ is small compared to the variable dimension $n$. The framework can be used as a black box to speed up linear programming solvers dramatically, by two orders of magnitude in our experiments.

Online Convex Optimization Using Predictions

no code implementations 25 Apr 2015 Niangjun Chen, Anish Agarwal, Adam Wierman, Siddharth Barman, Lachlan L. H. Andrew

Making use of predictions is a crucial, but under-explored, area of online algorithms.
