Search Results for author: Michael M. Zavlanos

Found 23 papers, 2 papers with code

Risk-averse Learning with Non-Stationary Distributions

no code implementations • 3 Apr 2024 • Siyi Wang, Zifan Wang, Xinlei Yi, Michael M. Zavlanos, Karl H. Johansson, Sandra Hirche

Considering non-stationary environments in online optimization enables a decision-maker to adapt effectively to changes and improve its performance over time.

Path Signatures and Graph Neural Networks for Slow Earthquake Analysis: Better Together?

no code implementations • 5 Feb 2024 • Hans Riess, Manolis Veveakis, Michael M. Zavlanos

The path signature, having enjoyed recent success in the machine learning community, is a theoretically-driven method for engineering features from irregular paths.

Earthquake prediction

Policy Evaluation in Distributional LQR

no code implementations • 23 Mar 2023 • Zifan Wang, Yulong Gao, Siyi Wang, Michael M. Zavlanos, Alessandro Abate, Karl H. Johansson

Distributional reinforcement learning (DRL) enhances the understanding of the effects of the randomness in the environment by letting agents learn the distribution of a random return, rather than its expected value as in standard RL.

Distributional Reinforcement Learning

Risk-Averse Multi-Armed Bandits with Unobserved Confounders: A Case Study in Emotion Regulation in Mobile Health

no code implementations • 9 Sep 2022 • Yi Shen, Jessilyn Dunn, Michael M. Zavlanos

In this paper, we consider a risk-averse multi-armed bandit (MAB) problem where the goal is to learn a policy that minimizes the risk of low expected return, as opposed to maximizing the expected return itself, which is the objective in the usual approach to risk-neutral MAB.

Multi-Armed Bandits, Transfer Learning

A Zeroth-Order Momentum Method for Risk-Averse Online Convex Games

no code implementations • 6 Sep 2022 • Zifan Wang, Yi Shen, Zachary I. Bell, Scott Nivison, Michael M. Zavlanos, Karl H. Johansson

Specifically, the agents use the conditional value at risk (CVaR) as a risk measure and rely on bandit feedback in the form of the cost values of the selected actions at every episode to estimate their CVaR values and update their actions.

Risk-Averse No-Regret Learning in Online Convex Games

no code implementations • 16 Mar 2022 • Zifan Wang, Yi Shen, Michael M. Zavlanos

To address this challenge, we propose a new online risk-averse learning algorithm that relies on one-point zeroth-order estimation of the CVaR gradients computed using CVaR values that are estimated by appropriately sampling the cost functions.

Failing with Grace: Learning Neural Network Controllers that are Boundedly Unsafe

no code implementations • 22 Jun 2021 • Panagiotis Vlantis, Leila J. Bridgeman, Michael M. Zavlanos

As a result, our method can learn a safe vector field for the closed-loop system and, at the same time, provide worst-case bounds on safety violation over the whole configuration space, defined by the overlap between the over-approximation of the forward reachable set of the closed-loop system and the set of unsafe states.

Formal Verification of Stochastic Systems with ReLU Neural Network Controllers

no code implementations • 8 Mar 2021 • Shiqi Sun, Yan Zhang, Xusheng Luo, Panagiotis Vlantis, Miroslav Pajic, Michael M. Zavlanos

Using this abstraction, we propose a method to compute tight bounds on the safety probabilities of nodes in this graph, despite possible over-approximations of the transition probabilities between these nodes.

Robot Navigation

Learning Optimal Strategies for Temporal Tasks in Stochastic Games

no code implementations • 8 Feb 2021 • Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, Miroslav Pajic

By deriving distinct rewards and discount factors from the acceptance condition of the DPA, we reduce the maximization of the worst-case probability of satisfying the LTL specification into the maximization of a discounted reward objective in the product game; this enables the use of model-free RL algorithms to learn an optimal controller strategy.

Reinforcement Learning (RL)

Temporal Logic Task Allocation in Heterogeneous Multi-Robot Systems

no code implementations • 14 Jan 2021 • Xusheng Luo, Michael M. Zavlanos

To obtain a scalable solution to this complex temporal logic task allocation problem, we propose a hierarchical approach that first allocates specific robots to tasks using the information about the tasks contained in the Nondeterministic Büchi Automaton (NBA) that captures the LTL specification, and then designs low-level executable plans for the robots that respect the high-level assignment.


Plane Wave Elastography: A Frequency-Domain Ultrasound Shear Wave Elastography Approach

no code implementations • 8 Dec 2020 • Reza Khodayi-mehr, Matthew W. Urban, Michael M. Zavlanos, Wilkins Aquino

Currently, commercial methods for SWE rely on directional filtering based on the prior knowledge of the wave propagation direction, to remove complicated wave patterns formed due to reflection and refraction.

Boosting One-Point Derivative-Free Online Optimization via Residual Feedback

no code implementations • 14 Oct 2020 • Yan Zhang, Yi Zhou, Kaiyi Ji, Michael M. Zavlanos

As a result, our regret bounds are much tighter compared to existing regret bounds for ZO with conventional one-point feedback, which suggests that ZO with residual feedback can better track the optimizer of online optimization problems.

Cooperative Multi-Agent Reinforcement Learning with Partial Observations

no code implementations • 18 Jun 2020 • Yan Zhang, Michael M. Zavlanos

The advantage of the proposed zeroth-order policy optimization method is that it allows the agents to compute the local policy gradients needed to update their local policy functions using local estimates of the global accumulated rewards that depend on partial state and action information only and can be obtained using consensus.

Multi-agent Reinforcement Learning, reinforcement-learning, +1

A New One-Point Residual-Feedback Oracle For Black-Box Learning and Control

no code implementations • 18 Jun 2020 • Yan Zhang, Yi Zhou, Kaiyi Ji, Michael M. Zavlanos

When optimizing a deterministic Lipschitz function, we show that the query complexity of ZO with the proposed one-point residual feedback matches that of ZO with the existing two-point schemes.

Transfer Reinforcement Learning under Unobserved Contextual Information

no code implementations • 9 Mar 2020 • Yan Zhang, Michael M. Zavlanos

Then, the goal is to transfer this experience, excluding the underlying contextual information, to a learner agent that does not have access to the environmental context, so that they can learn a control policy using fewer samples.

Motion Planning, Q-Learning, +3

VarNet: Variational Neural Networks for the Solution of Partial Differential Equations

1 code implementation • L4DC 2020 • Reza Khodayi-mehr, Michael M. Zavlanos

In this paper we propose a new model-based unsupervised learning method, called VarNet, for the solution of partial differential equations (PDEs) using deep neural networks (NNs).

A Distributed Online Convex Optimization Algorithm with Improved Dynamic Regret

no code implementations • 12 Nov 2019 • Yan Zhang, Robert J. Ravier, Michael M. Zavlanos, Vahid Tarokh

In this paper, we consider the problem of distributed online convex optimization, where a network of local agents aims to jointly optimize a convex function over a period of multiple time steps.

Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning

2 code implementations • 16 Sep 2019 • Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, Miroslav Pajic

We present a reinforcement learning (RL) framework to synthesize a control policy from a given linear temporal logic (LTL) specification in an unknown stochastic environment that can be modeled as a Markov Decision Process (MDP).

Motion Planning, reinforcement-learning, +1

Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus

no code implementations • 21 Mar 2019 • Yan Zhang, Michael M. Zavlanos

In this paper, we propose a distributed off-policy actor-critic method to solve multi-agent reinforcement learning problems.

Multi-agent Reinforcement Learning, reinforcement-learning, +1

Deep Learning for Robotic Mass Transport Cloaking

no code implementations • 11 Dec 2018 • Reza Khodayi-mehr, Michael M. Zavlanos

Unlike passive cloaking methods that use metamaterials to steer the mass flux, our method is the first to use mobile robots to actively control the concentration levels and create safe zones independent of environmental conditions.
