Search Results for author: Michael M. Zavlanos

Found 23 papers, 2 papers with code

Risk-averse Learning with Non-Stationary Distributions

no code implementations • 3 Apr 2024 • Siyi Wang, Zifan Wang, Xinlei Yi, Michael M. Zavlanos, Karl H. Johansson, Sandra Hirche

Considering non-stationary environments in online optimization enables a decision-maker to adapt effectively to changes and improve its performance over time.

Path Signatures and Graph Neural Networks for Slow Earthquake Analysis: Better Together?

no code implementations • 5 Feb 2024 • Hans Riess, Manolis Veveakis, Michael M. Zavlanos

The path signature, having enjoyed recent success in the machine learning community, is a theoretically-driven method for engineering features from irregular paths.

Earthquake prediction

Wasserstein Distributionally Robust Policy Evaluation and Learning for Contextual Bandits

no code implementations • 15 Sep 2023 • Yi Shen, Pan Xu, Michael M. Zavlanos

To overcome these limitations, we propose a novel DRO approach that employs the Wasserstein distance instead.

Multi-Armed Bandits • Off-policy evaluation

Policy Evaluation in Distributional LQR

no code implementations • 23 Mar 2023 • Zifan Wang, Yulong Gao, Siyi Wang, Michael M. Zavlanos, Alessandro Abate, Karl H. Johansson

Distributional reinforcement learning (DRL) enhances the understanding of the effects of the randomness in the environment by letting agents learn the distribution of a random return, rather than its expected value as in standard RL.

Distributional Reinforcement Learning

Risk-Averse Multi-Armed Bandits with Unobserved Confounders: A Case Study in Emotion Regulation in Mobile Health

no code implementations • 9 Sep 2022 • Yi Shen, Jessilyn Dunn, Michael M. Zavlanos

In this paper, we consider a risk-averse multi-armed bandit (MAB) problem where the goal is to learn a policy that minimizes the risk of low expected return, as opposed to maximizing the expected return itself, which is the objective in the usual approach to risk-neutral MAB.

Multi-Armed Bandits • Transfer Learning

A Zeroth-Order Momentum Method for Risk-Averse Online Convex Games

no code implementations • 6 Sep 2022 • Zifan Wang, Yi Shen, Zachary I. Bell, Scott Nivison, Michael M. Zavlanos, Karl H. Johansson

Specifically, the agents use the conditional value at risk (CVaR) as a risk measure and rely on bandit feedback in the form of the cost values of the selected actions at every episode to estimate their CVaR values and update their actions.
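
As a rough illustration of estimating CVaR from sampled cost values, the sketch below averages the worst (1 - alpha) fraction of the observed costs. The function name `empirical_cvar`, the tail convention, and the sample data are assumptions made for this example; the paper's own estimator may differ.

```python
import math


def empirical_cvar(costs, alpha):
    """Average of the worst (1 - alpha) fraction of the sampled costs.

    Hypothetical helper: treats larger costs as worse and uses a simple
    sort-and-average estimate, which is one common convention.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if not costs:
        raise ValueError("need at least one cost sample")
    ordered = sorted(costs, reverse=True)  # largest (worst) costs first
    # Number of tail samples to average; always keep at least one.
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:k]) / k


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(empirical_cvar(samples, 0.5))  # mean of the five largest costs
```

With `alpha = 0` the estimate reduces to the plain sample mean, which corresponds to the risk-neutral case.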

Risk-Averse No-Regret Learning in Online Convex Games

no code implementations • 16 Mar 2022 • Zifan Wang, Yi Shen, Michael M. Zavlanos

To address this challenge, we propose a new online risk-averse learning algorithm that relies on one-point zeroth-order estimation of the CVaR gradients computed using CVaR values that are estimated by appropriately sampling the cost functions.

Failing with Grace: Learning Neural Network Controllers that are Boundedly Unsafe

no code implementations • 22 Jun 2021 • Panagiotis Vlantis, Leila J. Bridgeman, Michael M. Zavlanos

As a result, our method can learn a safe vector field for the closed-loop system and, at the same time, provide worst-case bounds on safety violation over the whole configuration space, defined by the overlap between the over-approximation of the forward reachable set of the closed-loop system and the set of unsafe states.

Learning without Knowing: Unobserved Context in Continuous Transfer Reinforcement Learning

no code implementations • 7 Jun 2021 • Chenyu Liu, Yan Zhang, Yi Shen, Michael M. Zavlanos

We assume that this context is not accessible to a learner agent who can only observe the expert data.

Autonomous Driving • Imitation Learning +3

Formal Verification of Stochastic Systems with ReLU Neural Network Controllers

no code implementations • 8 Mar 2021 • Shiqi Sun, Yan Zhang, Xusheng Luo, Panagiotis Vlantis, Miroslav Pajic, Michael M. Zavlanos

Using this abstraction, we propose a method to compute tight bounds on the safety probabilities of nodes in this graph, despite possible over-approximations of the transition probabilities between these nodes.

Robot Navigation

Learning Optimal Strategies for Temporal Tasks in Stochastic Games

no code implementations • 8 Feb 2021 • Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, Miroslav Pajic

By deriving distinct rewards and discount factors from the acceptance condition of the DPA, we reduce the maximization of the worst-case probability of satisfying the LTL specification into the maximization of a discounted reward objective in the product game; this enables the use of model-free RL algorithms to learn an optimal controller strategy.

Reinforcement Learning (RL)

Temporal Logic Task Allocation in Heterogeneous Multi-Robot Systems

no code implementations • 14 Jan 2021 • Xusheng Luo, Michael M. Zavlanos

To obtain a scalable solution to this complex temporal logic task allocation problem, we propose a hierarchical approach that first allocates specific robots to tasks using the information about the tasks contained in the Nondeterministic Büchi Automaton (NBA) that captures the LTL specification, and then designs low-level executable plans for the robots that respect the high-level assignment.

Robotics

Plane Wave Elastography: A Frequency-Domain Ultrasound Shear Wave Elastography Approach

no code implementations • 8 Dec 2020 • Reza Khodayi-mehr, Matthew W. Urban, Michael M. Zavlanos, Wilkins Aquino

Currently, commercial methods for SWE rely on directional filtering based on the prior knowledge of the wave propagation direction, to remove complicated wave patterns formed due to reflection and refraction.

Boosting One-Point Derivative-Free Online Optimization via Residual Feedback

no code implementations • 14 Oct 2020 • Yan Zhang, Yi Zhou, Kaiyi Ji, Michael M. Zavlanos

As a result, our regret bounds are much tighter compared to existing regret bounds for ZO with conventional one-point feedback, which suggests that ZO with residual feedback can better track the optimizer of online optimization problems.

A New One-Point Residual-Feedback Oracle For Black-Box Learning and Control

no code implementations • 18 Jun 2020 • Yan Zhang, Yi Zhou, Kaiyi Ji, Michael M. Zavlanos

When optimizing a deterministic Lipschitz function, we show that the query complexity of ZO with the proposed one-point residual feedback matches that of ZO with the existing two-point schemes.
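
A minimal sketch of a one-point residual-feedback update, assuming the gradient estimate is formed from the difference between the current perturbed query and the previous iteration's perturbed query, scaled by dimension over smoothing radius. The function name, step size, smoothing radius, and the quadratic test function are illustrative assumptions, not values from the paper.

```python
import numpy as np


def residual_feedback_descent(f, x0, steps=3000, delta=0.1, lr=0.005, seed=0):
    """Zeroth-order descent that queries f once per iteration.

    Hypothetical sketch: the estimate reuses the value queried at the
    previous iteration in place of a second query at the current one.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    prev_value = None
    for _ in range(steps):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)      # uniform direction on the unit sphere
        value = f(x + delta * u)    # the only function query this iteration
        if prev_value is not None:
            # Residual between consecutive perturbed queries.
            grad_est = (d / delta) * (value - prev_value) * u
            x = x - lr * grad_est
        prev_value = value
    return x


if __name__ == "__main__":
    quadratic = lambda z: float(np.sum(z ** 2))
    print(residual_feedback_descent(quadratic, [1.0, -1.0]))
```

Each iteration costs one function evaluation, whereas a two-point scheme would evaluate `f` twice at the current iterate.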

Cooperative Multi-Agent Reinforcement Learning with Partial Observations

no code implementations • 18 Jun 2020 • Yan Zhang, Michael M. Zavlanos

The advantage of the proposed zeroth-order policy optimization method is that it allows the agents to compute the local policy gradients needed to update their local policy functions using local estimates of the global accumulated rewards that depend on partial state and action information only and can be obtained using consensus.

Multi-agent Reinforcement Learning • reinforcement-learning +1

Transfer Reinforcement Learning under Unobserved Contextual Information

no code implementations • 9 Mar 2020 • Yan Zhang, Michael M. Zavlanos

Then, the goal is to transfer this experience, excluding the underlying contextual information, to a learner agent that does not have access to the environmental context, so that they can learn a control policy using fewer samples.

Motion Planning • Q-Learning +3

VarNet: Variational Neural Networks for the Solution of Partial Differential Equations

1 code implementation • L4DC 2020 • Reza Khodayi-mehr, Michael M. Zavlanos

In this paper we propose a new model-based unsupervised learning method, called VarNet, for the solution of partial differential equations (PDEs) using deep neural networks (NNs).

A Distributed Online Convex Optimization Algorithm with Improved Dynamic Regret

no code implementations • 12 Nov 2019 • Yan Zhang, Robert J. Ravier, Michael M. Zavlanos, Vahid Tarokh

In this paper, we consider the problem of distributed online convex optimization, where a network of local agents aims to jointly optimize a convex function over a period of multiple time steps.

Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning

2 code implementations • 16 Sep 2019 • Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, Miroslav Pajic

We present a reinforcement learning (RL) framework to synthesize a control policy from a given linear temporal logic (LTL) specification in an unknown stochastic environment that can be modeled as a Markov Decision Process (MDP).

Motion Planning • reinforcement-learning +1

Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus

no code implementations • 21 Mar 2019 • Yan Zhang, Michael M. Zavlanos

In this paper, we propose a distributed off-policy actor-critic method to solve multi-agent reinforcement learning problems.

Multi-agent Reinforcement Learning • reinforcement-learning +1

Deep Learning for Robotic Mass Transport Cloaking

no code implementations • 11 Dec 2018 • Reza Khodayi-mehr, Michael M. Zavlanos

Unlike passive cloaking methods that use metamaterials to steer the mass flux, our method is the first to use mobile robots to actively control the concentration levels and create safe zones independent of environmental conditions.

Physics-Based Learning for Robotic Environmental Sensing

no code implementations • 10 Dec 2018 • Reza Khodayi-mehr, Michael M. Zavlanos

We propose a physics-based method to learn environmental fields (EFs) using a mobile robot.

Bayesian Inference • Gaussian Processes +1
