Search Results for author: Ufuk Topcu

Found 123 papers, 22 papers with code

Task-Oriented Active Perception and Planning in Environments with Partially Known Semantics

no code implementations ICML 2020 Mahsa Ghasemi, Erdem Bulgur, Ufuk Topcu

Furthermore, as new data arrive, the belief over the atomic propositions evolves and, subsequently, the planning strategy adapts accordingly.
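The belief-evolution step described above can be sketched as a Bayesian update. This is a hedged illustration only: a Bernoulli belief over a single atomic proposition, updated from a noisy detector whose true-positive and false-positive rates are hypothetical stand-ins for the paper's perception model.

```python
# Hedged sketch: Bayes' rule update of a Bernoulli belief over one atomic
# proposition, given a noisy detection. The sensor rates are hypothetical.
def update_belief(prior, observation, p_true_pos=0.9, p_false_pos=0.2):
    like_true = p_true_pos if observation else 1.0 - p_true_pos
    like_false = p_false_pos if observation else 1.0 - p_false_pos
    return like_true * prior / (like_true * prior + like_false * (1.0 - prior))

belief = 0.5  # maximally uncertain prior
for obs in [True, True, False, True]:  # detections arriving over time
    belief = update_belief(belief, obs)
print(round(belief, 3))
```

As data arrive, the belief sharpens toward the truth, which is exactly what lets the planning strategy adapt online.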

Auto-Encoding Bayesian Inverse Games

no code implementations 14 Feb 2024 Xinjie Liu, Lasse Peters, Javier Alonso-Mora, Ufuk Topcu, David Fridovich-Keil

When multiple agents interact in a common environment, each agent's actions impact others' future decisions, and noncooperative dynamic games naturally capture this coupling.

Motion Planning

Online Foundation Model Selection in Robotics

no code implementations 13 Feb 2024 Po-han Li, Oyku Selin Toprak, Aditya Narayanan, Ufuk Topcu, Sandeep Chinchali

We thus formulate a user-centric online model selection problem and propose a novel solution that combines an open-source encoder to output context and an online learning algorithm that processes this context.

Model Selection
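The online-learning component of model selection can be sketched as a bandit problem. This is a hedged sketch only: an epsilon-greedy bandit choosing between candidate models, where the per-query mean "quality" scores are hypothetical stand-ins; the paper's algorithm additionally conditions each choice on an encoder-produced context.

```python
import random

# Hedged sketch of online model selection as an epsilon-greedy bandit.
# QUALITY holds hypothetical mean per-query quality scores per model.
random.seed(0)
QUALITY = {"small-model": 0.4, "large-model": 0.8}

counts = {m: 0 for m in QUALITY}
values = {m: 0.0 for m in QUALITY}  # running mean reward per model

def select(eps=0.1):
    if min(counts.values()) == 0 or random.random() < eps:
        return random.choice(sorted(counts))  # explore
    return max(values, key=values.get)        # exploit the best-so-far model

for _ in range(3000):
    m = select()
    reward = 1.0 if random.random() < QUALITY[m] else 0.0
    counts[m] += 1
    values[m] += (reward - values[m]) / counts[m]

print(counts["large-model"] > counts["small-model"])
```

Over many queries the learner routes most traffic to the better model while still exploring occasionally.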

Using Large Language Models to Automate and Expedite Reinforcement Learning with Reward Machine

no code implementations 11 Feb 2024 Shayan Meshkat Alsadat, Jean-Raphael Gaglione, Daniel Neider, Ufuk Topcu, Zhe Xu

Our method uses large language models (LLMs) to obtain high-level, domain-specific knowledge through prompt engineering, instead of requiring an expert to encode the high-level knowledge as an automaton and provide it directly to the reinforcement learning algorithm.

Language Modelling Large Language Model +2

News Source Credibility Assessment: A Reddit Case Study

no code implementations 7 Feb 2024 Arash Amini, Yigit Ege Bayiz, Ashwin Ram, Radu Marculescu, Ufuk Topcu

In the era of social media platforms, identifying the credibility of online content is crucial to combat misinformation.

Binary Classification Misinformation

Zero-Shot Reinforcement Learning via Function Encoders

1 code implementation 30 Jan 2024 Tyler Ingebrand, Amy Zhang, Ufuk Topcu

Although reinforcement learning (RL) can solve many challenging sequential decision making problems, achieving zero-shot transfer across related tasks remains a challenge.

Decision Making reinforcement-learning +2

Noise-Aware and Equitable Urban Air Traffic Management: An Optimization Approach

no code implementations 1 Jan 2024 Zhenyu Gao, Yue Yu, Qinshuang Wei, Ufuk Topcu, John-Paul Clarke

Urban air mobility (UAM), a transformative concept for the transport of passengers and cargo, faces several integration challenges in complex urban environments.

Fairness Management

A Multifidelity Sim-to-Real Pipeline for Verifiable and Compositional Reinforcement Learning

no code implementations 2 Dec 2023 Cyrus Neary, Christian Ellis, Aryaman Singh Samyal, Craig Lennon, Ufuk Topcu

We propose and demonstrate a compositional framework for training and verifying reinforcement learning (RL) systems within a multifidelity sim-to-real pipeline, in order to deploy reliable and adaptable RL policies on physical hardware.

reinforcement-learning Reinforcement Learning (RL)

Formal Methods for Autonomous Systems

no code implementations 2 Nov 2023 Tichakorn Wongpiromsarn, Mahsa Ghasemi, Murat Cubuktepe, Georgios Bakirtzis, Steven Carr, Mustafa O. Karabag, Cyrus Neary, Parham Gohari, Ufuk Topcu

Formal methods refer to rigorous, mathematical approaches to system development and have played a key role in establishing the correctness of safety-critical systems.

Fine-Tuning Language Models Using Formal Methods Feedback

no code implementations 27 Oct 2023 Yunhao Yang, Neel P. Bhatt, Tyler Ingebrand, William Ward, Steven Carr, Zhangyang Wang, Ufuk Topcu

Although pre-trained language models encode generic knowledge beneficial for planning and control, they may fail to generate appropriate control policies for domain-specific tasks.

Autonomous Driving

Algorithmic Robustness

no code implementations 17 Oct 2023 David Jensen, Brian LaMacchia, Ufuk Topcu, Pamela Wisniewski

Algorithmic robustness refers to the sustained performance of a computational system in the face of change in the nature of the environment in which that system operates or in the task that the system is meant to perform.


Encouraging Inferable Behavior for Autonomy: Repeated Bimatrix Stackelberg Games with Observations

no code implementations 30 Sep 2023 Mustafa O. Karabag, Sophia Smith, David Fridovich-Keil, Ufuk Topcu

As a converse result, we also provide a game where the required number of interactions is lower bounded by a function of the desired inferability loss.

Decision Making

Specification-Driven Video Search via Foundation Models and Formal Verification

no code implementations 18 Sep 2023 Yunhao Yang, Jean-Raphaël Gaglione, Sandeep Chinchali, Ufuk Topcu

The increasing abundance of video data enables users to search for events of interest, e.g., emergency incidents.

Autonomous Driving

Privacy-Engineered Value Decomposition Networks for Cooperative Multi-Agent Reinforcement Learning

no code implementations 13 Sep 2023 Parham Gohari, Matthew Hale, Ufuk Topcu

Accordingly, we propose Privacy-Engineered Value Decomposition Networks (PE-VDN), a Co-MARL algorithm that models multi-agent coordination while provably safeguarding the confidentiality of the agents' environment interaction data.

Privacy Preserving reinforcement-learning +2

Verifiable Reinforcement Learning Systems via Compositionality

no code implementations 9 Sep 2023 Cyrus Neary, Aryaman Singh Samyal, Christos Verginis, Murat Cubuktepe, Ufuk Topcu

We propose a framework for verifiable and compositional reinforcement learning (RL) in which a collection of RL subsystems, each of which learns to accomplish a separate subtask, are composed to achieve an overall task.

reinforcement-learning Reinforcement Learning (RL)

Simulator-Driven Deceptive Control via Path Integral Approach

no code implementations 27 Aug 2023 Apurva Patil, Mustafa O. Karabag, Takashi Tanaka, Ufuk Topcu

We study the deception problem in the continuous-state discrete-time stochastic dynamics setting and, using motivations from hypothesis testing theory, formulate a Kullback-Leibler control problem for the synthesis of deceptive policies.

Active Inverse Learning in Stackelberg Trajectory Games

no code implementations 15 Aug 2023 Yue Yu, Jacob Levy, Negar Mehr, David Fridovich-Keil, Ufuk Topcu

We formulate an inverse learning problem in a Stackelberg game between a leader and a follower, where each player's action is the trajectory of a dynamical system.

Multimodal Pretrained Models for Sequential Decision-Making: Synthesis, Verification, Grounding, and Perception

no code implementations 10 Aug 2023 Yunhao Yang, Cyrus Neary, Ufuk Topcu

We develop an algorithm that utilizes the knowledge from pretrained models to construct and verify controllers for sequential decision-making tasks, and to ground these controllers to task environments through visual observations.

Decision Making Robot Manipulation +1

How to Learn and Generalize From Three Minutes of Data: Physics-Constrained and Uncertainty-Aware Neural Stochastic Differential Equations

no code implementations 10 Jun 2023 Franck Djeumou, Cyrus Neary, Ufuk Topcu

We present a framework and algorithms to learn controlled dynamics models using neural stochastic differential equations (SDEs) -- SDEs whose drift and diffusion terms are both parametrized by neural networks.

Inductive Bias Model-based Reinforcement Learning +1

Risk-aware Urban Air Mobility Network Design with Overflow Redundancy

1 code implementation 8 Jun 2023 Qinshuang Wei, Zhenyu Gao, John-Paul Clarke, Ufuk Topcu

In our methodology, we first model how disruptions to a given UAM network might impact the nominal traffic flow and how this flow might be re-accommodated on an extended network with reserve capacity.


Reinforcement Learning With Reward Machines in Stochastic Games

no code implementations 27 May 2023 Jueming Hu, Jean-Raphael Gaglione, Yanze Wang, Zhe Xu, Ufuk Topcu, Yongming Liu

We develop an algorithm called Q-learning with reward machines for stochastic games (QRM-SG), to learn the best-response strategy at Nash equilibrium for each agent.

Multi-agent Reinforcement Learning Q-Learning +1
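The core mechanic behind reward-machine-guided learning can be illustrated in a few lines. This is a hedged, single-agent sketch of Q-learning over the product of an environment state and a reward-machine state, not the paper's QRM-SG algorithm, which learns best-response strategies in stochastic games; the toy reward machine below is hypothetical.

```python
from collections import defaultdict

# Toy reward machine: from u0, label "a" moves to u1; from u1, label "b"
# moves to the accepting state "acc" and emits reward 1.
RM_DELTA = {("u0", "a"): ("u1", 0.0), ("u1", "b"): ("acc", 1.0)}

def rm_step(u, label):
    # unknown (state, label) pairs leave the machine where it is, reward 0
    return RM_DELTA.get((u, label), (u, 0.0))

Q = defaultdict(float)  # Q-values indexed by (env state, RM state, action)
ACTIONS = (0, 1)

def q_update(s, u, action, s_next, label, alpha=0.5, gamma=0.95):
    u_next, r = rm_step(u, label)
    target = r + gamma * max(Q[(s_next, u_next, a)] for a in ACTIONS)
    Q[(s, u, action)] += alpha * (target - Q[(s, u, action)])
    return u_next, r

u_next, r = q_update(0, "u1", 0, 1, "b")
print(u_next, r)
```

The key point is that the agent's value function is defined over the product state, so non-Markovian task rewards become Markovian again.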

Reward-Machine-Guided, Self-Paced Reinforcement Learning

1 code implementation 25 May 2023 Cevahir Koprulu, Ufuk Topcu

Self-paced reinforcement learning (RL) aims to improve the data efficiency of learning by automatically creating sequences, namely curricula, of probability distributions over contexts.

reinforcement-learning Reinforcement Learning (RL)

Dynamic Routing in Stochastic Urban Air Mobility Networks: A Markov Decision Process Approach

no code implementations 11 May 2023 Qinshuang Wei, Yue Yu, Ufuk Topcu

Urban air mobility (UAM) is an emerging concept in short-range aviation transportation, where the aircraft will take off, land, and charge their batteries at a set of vertistops, and travel only through a set of flight corridors connecting these vertistops.

Decision Making

Efficient Sensitivity Analysis for Parametric Robust Markov Chains

no code implementations 1 May 2023 Thom Badings, Sebastian Junges, Ahmadreza Marandi, Ufuk Topcu, Nils Jansen

As our main contribution, we present an efficient method to compute these partial derivatives.

On the Sample Complexity of Vanilla Model-Based Offline Reinforcement Learning with Dependent Samples

no code implementations 7 Mar 2023 Mustafa O. Karabag, Ufuk Topcu

Under no assumption of independent samples, we provide a high-probability, polynomial sample complexity bound for vanilla model-based off-policy evaluation that requires partial or uniform coverage.

Offline RL Off-policy evaluation +1

Differential Privacy in Cooperative Multiagent Planning

1 code implementation 20 Jan 2023 Bo Chen, Calvin Hawkins, Mustafa O. Karabag, Cyrus Neary, Matthew Hale, Ufuk Topcu

We synthesize policies that are robust to privacy by reducing the value of the total correlation.

Decision Making

Physics-Informed Kernel Embeddings: Integrating Prior System Knowledge with Data-Driven Control

no code implementations 9 Jan 2023 Adam J. Thorpe, Cyrus Neary, Franck Djeumou, Meeko M. K. Oishi, Ufuk Topcu

Our proposed approach incorporates prior knowledge of the system dynamics as a bias term in the kernel learning problem.

Task-Guided IRL in POMDPs that Scales

1 code implementation 30 Dec 2022 Franck Djeumou, Christian Ellis, Murat Cubuktepe, Craig Lennon, Ufuk Topcu

First, they require an excessive amount of data due to the information asymmetry between the expert and the learner.


Automaton-Based Representations of Task Knowledge from Generative Language Models

no code implementations 4 Dec 2022 Yunhao Yang, Jean-Raphaël Gaglione, Cyrus Neary, Ufuk Topcu

However, the textual outputs from GLMs cannot be formally verified or used for sequential decision-making.

Decision Making

Learning Temporal Logic Properties: an Overview of Two Recent Methods

no code implementations 2 Dec 2022 Jean-Raphaël Gaglione, Rajarshi Roy, Nasim Baharisangari, Daniel Neider, Zhe Xu, Ufuk Topcu

Learning linear temporal logic (LTL) formulas from examples labeled as positive or negative has found applications in inferring descriptions of system behavior.


Compositional Learning of Dynamical System Models Using Port-Hamiltonian Neural Networks

1 code implementation 1 Dec 2022 Cyrus Neary, Ufuk Topcu

Toward the objective of learning composite models of such systems from data, we present i) a framework for compositional neural networks, ii) algorithms to train these models, iii) a method to compose the learned models, iv) theoretical results that bound the error of the resulting composite models, and v) a method to learn the composition itself, when it is not known a priori.

Inductive Bias

Sensor Placement for Online Fault Diagnosis

no code implementations 21 Nov 2022 Dhananjay Raju, Georgios Bakirtzis, Ufuk Topcu

Fault diagnosis is the problem of determining a set of faulty system components that explain discrepancies between observed and expected behavior.

Differentially Private Timeseries Forecasts for Networked Control

no code implementations 1 Oct 2022 Po-han Li, Sandeep P. Chinchali, Ufuk Topcu

We analyze a cost-minimization problem in which the controller relies on an imperfect timeseries forecast.

Online Poisoning Attacks Against Data-Driven Predictive Control

no code implementations 19 Sep 2022 Yue Yu, Ruihan Zhao, Sandeep Chinchali, Ufuk Topcu

Data-driven predictive control (DPC) is a feedback control method for systems with unknown dynamics.

Learning Interpretable Temporal Properties from Positive Examples Only

1 code implementation 6 Sep 2022 Rajarshi Roy, Jean-Raphaël Gaglione, Nasim Baharisangari, Daniel Neider, Zhe Xu, Ufuk Topcu

To learn meaningful models from positive examples only, we design algorithms that rely on conciseness and language minimality of models as regularizers.

Categorical semantics of compositional reinforcement learning

no code implementations 29 Aug 2022 Georgios Bakirtzis, Michail Savvas, Ufuk Topcu

However, generating compositional models requires the characterization of minimal assumptions for the robustness of the compositional feature.

reinforcement-learning Reinforcement Learning (RL)

Non-Parametric Neuro-Adaptive Formation Control

no code implementations 17 Jul 2022 Christos K. Verginis, Zhe Xu, Ufuk Topcu

Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees.

Adversarial Examples for Model-Based Control: A Sensitivity Analysis

no code implementations 14 Jul 2022 Po-han Li, Ufuk Topcu, Sandeep P. Chinchali

We propose a method to attack controllers that rely on external timeseries forecasts as task parameters.

Adversarial Attack

Deceptive Planning for Resource Allocation

no code implementations 2 Jun 2022 Shenghui Chen, Yagiz Savas, Mustafa O. Karabag, Brian M. Sadler, Ufuk Topcu

We consider a team of autonomous agents that navigate in an adversarial environment and aim to achieve a task by allocating their resources over a set of target locations.


Additive Logistic Mechanism for Privacy-Preserving Self-Supervised Learning

no code implementations 25 May 2022 Yunhao Yang, Parham Gohari, Ufuk Topcu

We study the privacy risks that are associated with training a neural network's weights with self-supervised learning algorithms.

Privacy Preserving Self-Supervised Learning

Joint Learning of Reward Machines and Policies in Environments with Partially Known Semantics

no code implementations 20 Apr 2022 Christos Verginis, Cevahir Koprulu, Sandeep Chinchali, Ufuk Topcu

We develop a reinforcement-learning algorithm that infers a reward machine that encodes the underlying task while learning how to execute it, despite the uncertainties of the propositions' truth values.

Q-Learning reinforcement-learning +1

Safe Reinforcement Learning via Shielding under Partial Observability

no code implementations 2 Apr 2022 Steven Carr, Nils Jansen, Sebastian Junges, Ufuk Topcu

Safe exploration is a common problem in reinforcement learning (RL) that aims to prevent agents from making disastrous decisions while exploring their environment.

reinforcement-learning Reinforcement Learning (RL) +2

Safely: Safe Stochastic Motion Planning Under Constrained Sensing via Duality

no code implementations 5 Mar 2022 Michael Hibbard, Abraham P. Vinod, Jesse Quattrociocchi, Ufuk Topcu

We introduce the Safely motion planner, a receding-horizon control framework that simultaneously synthesizes both a trajectory for the robot to follow and a sensor selection strategy prescribing which trajectory-relevant obstacles to measure at each time step while respecting the robot's sensing constraints.

Motion Planning

No-Regret Learning in Dynamic Stackelberg Games

no code implementations 10 Feb 2022 Niklas Lauffer, Mahsa Ghasemi, Abolfazl Hashemi, Yagiz Savas, Ufuk Topcu

The regret of the proposed learning algorithm is independent of the size of the state space and polynomial in the rest of the parameters of the game.


Planning Not to Talk: Multiagent Systems that are Robust to Communication Loss

1 code implementation 17 Jan 2022 Mustafa O. Karabag, Cyrus Neary, Ufuk Topcu

In this work, we develop joint policies for cooperative multiagent systems that are robust to potential losses in communication.

Taylor-Lagrange Neural Ordinary Differential Equations: Toward Fast Training and Evaluation of Neural ODEs

1 code implementation 14 Jan 2022 Franck Djeumou, Cyrus Neary, Eric Goubault, Sylvie Putot, Ufuk Topcu

Neural ordinary differential equations (NODEs) -- parametrizations of differential equations using neural networks -- have shown tremendous promise in learning models of unknown continuous-time dynamical systems from data.

Density Estimation Image Classification +1
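The fixed-order Taylor-expansion idea behind fast NODE training can be sketched on a known linear vector field. This is a hedged illustration only: a second-order Taylor integration step applied to a harmonic oscillator with exact derivatives; the paper applies the expansion, together with a learned estimate of the Lagrange remainder, to neural vector fields, which is not reproduced here.

```python
import numpy as np

def taylor2_step(x, f, jac, h):
    # One second-order Taylor step: x + h f(x) + (h^2/2) J_f(x) f(x)
    fx = f(x)
    return x + h * fx + 0.5 * h * h * (jac(x) @ fx)

A = np.array([[0.0, 1.0], [-1.0, 0.0]])  # harmonic oscillator dx/dt = A x
f = lambda x: A @ x
jac = lambda x: A  # exact Jacobian of a linear field

x = np.array([1.0, 0.0])
h, T = 0.01, 2 * np.pi
for _ in range(int(T / h)):
    x = taylor2_step(x, f, jac, h)
print(np.allclose(x, [1.0, 0.0], atol=1e-2))  # back near the start after one period
```

Because each step is a fixed-order expansion rather than an adaptive solve, the cost per step is constant, which is the source of the claimed speedup.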

Non-Parametric Neuro-Adaptive Coordination of Multi-Agent Systems

no code implementations 11 Oct 2021 Christos K. Verginis, Zhe Xu, Ufuk Topcu

Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees.

On the Privacy Risks of Deploying Recurrent Neural Networks in Machine Learning Models

no code implementations 6 Oct 2021 Yunhao Yang, Parham Gohari, Ufuk Topcu

Additionally, we study the effectiveness of two prominent mitigation methods for preempting MIAs, namely weight regularization and differential privacy.

BIG-bench Machine Learning Image Classification +1

Deceptive Decision-Making Under Uncertainty

no code implementations 14 Sep 2021 Yagiz Savas, Christos K. Verginis, Ufuk Topcu

We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks in stochastic, complex environments.

Decision Making Decision Making Under Uncertainty

Neural Networks with Physics-Informed Architectures and Constraints for Dynamical Systems Modeling

1 code implementation 14 Sep 2021 Franck Djeumou, Cyrus Neary, Eric Goubault, Sylvie Putot, Ufuk Topcu

The physics-informed constraints are enforced via the augmented Lagrangian method during the model's training.

Inductive Bias

Simultaneous Perception-Action Design via Invariant Finite Belief Sets

no code implementations 10 Sep 2021 Michael Hibbard, Takashi Tanaka, Ufuk Topcu

Although perception is an increasingly dominant portion of the overall computational cost for autonomous systems, only a fraction of the information perceived is likely to be relevant to the current task.

Convex Optimization for Parameter Synthesis in MDPs

1 code implementation 30 Jun 2021 Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu

The parameter synthesis problem is to compute an instantiation of these unspecified parameters such that the resulting MDP satisfies the temporal logic specification.

Collision Avoidance

On the Benefits of Inducing Local Lipschitzness for Robust Generative Adversarial Imitation Learning

no code implementations 30 Jun 2021 Farzan Memarian, Abolfazl Hashemi, Scott Niekum, Ufuk Topcu

We explore methodologies to improve the robustness of generative adversarial imitation learning (GAIL) algorithms to observation noise.

Imitation Learning

Probabilistic Control of Heterogeneous Swarms Subject to Graph Temporal Logic Specifications: A Decentralized and Scalable Approach

no code implementations 29 Jun 2021 Franck Djeumou, Zhe Xu, Murat Cubuktepe, Ufuk Topcu

Specifically, we study a setting in which the agents move along the nodes of a graph, and the high-level task specifications for the swarm are expressed in a recently-proposed language called graph temporal logic (GTL).

Learning to Reach, Swim, Walk and Fly in One Trial: Data-Driven Control with Scarce Data and Side Information

no code implementations 19 Jun 2021 Franck Djeumou, Ufuk Topcu

We develop a learning-based control algorithm for unknown dynamical systems under very severe data limitations.

Model Predictive Control

Robust Training in High Dimensions via Block Coordinate Geometric Median Descent

2 code implementations 16 Jun 2021 Anish Acharya, Abolfazl Hashemi, Prateek Jain, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu

Geometric median (\textsc{Gm}) is a classical method in statistics for achieving a robust estimation of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5.

Ranked #19 on Image Classification on MNIST (Accuracy metric)

Image Classification
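The robustness of the geometric median can be demonstrated directly with Weiszfeld's classical fixed-point iteration. This is a hedged sketch of the \textsc{Gm} primitive only, not the paper's block-coordinate descent scheme.

```python
import numpy as np

def geometric_median(points, iters=100, eps=1e-9):
    """Weiszfeld's fixed-point iteration for the geometric median."""
    y = points.mean(axis=0)  # initialize at the centroid
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(points - y, axis=1), eps)  # avoid /0
        w = 1.0 / d
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < eps:
            break
        y = y_new
    return y

# Robustness demo: one gross outlier drags the mean far away but barely
# moves the geometric median.
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(50, 2))
corrupted = np.vstack([clean, [[1000.0, 1000.0]]])
gm = geometric_median(corrupted)
mean = corrupted.mean(axis=0)
print(np.linalg.norm(gm) < 1.0, np.linalg.norm(mean) > 10.0)
```

This contrast between the mean and the median under a single corrupted sample is the intuition behind the 0.5 breakdown point cited above.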

Verifiable and Compositional Reinforcement Learning Systems

1 code implementation 7 Jun 2021 Cyrus Neary, Christos Verginis, Murat Cubuktepe, Ufuk Topcu

We propose a framework for verifiable and compositional reinforcement learning (RL) in which a collection of RL subsystems, each of which learns to accomplish a separate subtask, are composed to achieve an overall task.

reinforcement-learning Reinforcement Learning (RL)

Task-Guided Inverse Reinforcement Learning Under Partial Information

no code implementations 28 May 2021 Franck Djeumou, Murat Cubuktepe, Craig Lennon, Ufuk Topcu

Nevertheless, the resulting formulation is still nonconvex due to the intrinsic nonconvexity of the so-called forward problem, i.e., computing an optimal policy given a reward function, in POMDPs.

reinforcement-learning Reinforcement Learning (RL)

Uncertainty-Aware Signal Temporal Logic Inference

1 code implementation 24 May 2021 Nasim Baharisangari, Jean-Raphaël Gaglione, Daniel Neider, Ufuk Topcu, Zhe Xu

In this paper, we first investigate the uncertainties associated with trajectories of a system and represent such uncertainties in the form of interval trajectories.

Safety-Constrained Learning and Control using Scarce Data and Reciprocal Barriers

no code implementations 13 May 2021 Christos K. Verginis, Franck Djeumou, Ufuk Topcu

We develop a control algorithm that ensures the safety, in terms of confinement in a set, of a system with unknown, 2nd-order nonlinear dynamics.

Efficient Strategy Synthesis for MDPs with Resource Constraints

no code implementations 5 May 2021 František Blahoudek, Petr Novotný, Melkior Ornik, Pranay Thangeda, Ufuk Topcu

We consider qualitative strategy synthesis for the formalism called consumption Markov decision processes.

Polynomial-Time Algorithms for Multi-Agent Minimal-Capacity Planning

no code implementations 4 May 2021 Murat Cubuktepe, František Blahoudek, Ufuk Topcu

We develop an algorithm that solves this graph problem in time that is \emph{polynomial} in the number of agents, target locations, and size of the consumption Markov decision process.

Learning Linear Temporal Properties from Noisy Data: A MaxSAT Approach

no code implementations 30 Apr 2021 Jean-Raphaël Gaglione, Daniel Neider, Rajarshi Roy, Ufuk Topcu, Zhe Xu

Our first algorithm infers minimal LTL formulas by reducing the inference problem to a problem in maximum satisfiability and then using off-the-shelf MaxSAT solvers to find a solution.
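The inference problem can be made concrete with a brute-force stand-in for the MaxSAT reduction: enumerate a tiny, hypothetical candidate space of formulas and return the first one consistent with labeled finite traces. The paper instead encodes this search as a (maximum-)satisfiability problem, which handles noisy labels and far richer formula classes.

```python
# Candidate formulas: atoms p, q under "now" (holds initially),
# F (eventually holds), and G (always holds), evaluated on finite traces.
def evaluate(formula, trace):
    op, atom = formula
    vals = [atom in step for step in trace]
    if op == "now":
        return vals[0]
    if op == "F":
        return any(vals)   # eventually
    return all(vals)       # G: always

CANDIDATES = [(op, atom) for op in ("now", "F", "G") for atom in ("p", "q")]

def infer(positive, negative):
    # return the first candidate accepted by all positives and no negative
    for f in CANDIDATES:
        if all(evaluate(f, t) for t in positive) and \
           not any(evaluate(f, t) for t in negative):
            return f

# Traces are lists of sets of atomic propositions holding at each step.
pos = [[{"p"}, {"p", "q"}, {"p"}], [{"p"}, {"p"}]]
neg = [[{"p"}, set(), {"p"}]]
print(infer(pos, neg))
```

Here the separating formula is G p ("p always holds"), since the negative trace violates p at its second step.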

Temporal-Logic-Based Intermittent, Optimal, and Safe Continuous-Time Learning for Trajectory Tracking

no code implementations 6 Apr 2021 Aris Kanellopoulos, Filippos Fotiadis, Chuangchuang Sun, Zhe Xu, Kyriakos G. Vamvoudakis, Ufuk Topcu, Warren E. Dixon

In this paper, we develop safe reinforcement-learning-based controllers for systems tasked with accomplishing complex missions that can be expressed as linear temporal logic specifications, similar to those required by search-and-rescue missions.

Reinforcement Learning (RL) Safe Reinforcement Learning

Self-Supervised Online Reward Shaping in Sparse-Reward Environments

1 code implementation 8 Mar 2021 Farzan Memarian, Wonjoon Goo, Rudolf Lioutikov, Scott Niekum, Ufuk Topcu

We introduce Self-supervised Online Reward Shaping (SORS), which aims to improve the sample efficiency of any RL algorithm in sparse-reward environments by automatically densifying rewards.

Generalization Bounds for Sparse Random Feature Expansions

2 code implementations 4 Mar 2021 Abolfazl Hashemi, Hayden Schaeffer, Robert Shi, Ufuk Topcu, Giang Tran, Rachel Ward

In particular, we provide generalization bounds for functions in a certain class (that is dense in a reproducing kernel Hilbert space) depending on the number of samples and the distribution of features.

BIG-bench Machine Learning Compressive Sensing +1
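A random feature expansion of the kind analyzed can be sketched numerically: approximate a smooth target by ridge regression on random cosine features. This is a hedged illustration only; the paper's analysis concerns sparse expansions (few selected features) and their generalization bounds, while plain ridge is used here for brevity and the target and feature distribution are arbitrary choices.

```python
import numpy as np

# Fit f(x) = sin(3x) with random Fourier features + ridge regression.
rng = np.random.default_rng(0)
n, D = 200, 300
x = rng.uniform(-np.pi, np.pi, size=(n, 1))
y = np.sin(3 * x[:, 0])

w = rng.normal(0.0, 2.0, size=(1, D))     # random frequencies
b = rng.uniform(0.0, 2 * np.pi, size=D)   # random phases
phi = np.sqrt(2.0 / D) * np.cos(x @ w + b)

lam = 1e-6                                # small ridge regularizer
c = np.linalg.solve(phi.T @ phi + lam * np.eye(D), phi.T @ y)
mse = np.mean((phi @ c - y) ** 2)
print(mse < 0.05)  # far better than predicting the mean (variance ~ 0.5)
```

The feature map approximates a reproducing-kernel Hilbert space, which is the function class the stated generalization bounds are phrased over.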

Physical-Layer Security via Distributed Beamforming in the Presence of Adversaries with Unknown Locations

no code implementations 28 Feb 2021 Yagiz Savas, Abolfazl Hashemi, Abraham P. Vinod, Brian M. Sadler, Ufuk Topcu

In such a setting, we develop a periodic transmission strategy, i.e., a sequence of joint beamforming gain and artificial noise pairs, that prevents the adversaries from decreasing their uncertainty on the information sequence by eavesdropping on the transmission.

Privacy-Preserving Kickstarting Deep Reinforcement Learning with Privacy-Aware Learners

no code implementations 18 Feb 2021 Parham Gohari, Bo Chen, Bo Wu, Matthew Hale, Ufuk Topcu

We then develop a kickstarted deep reinforcement learning algorithm for the student that is privacy-aware because we calibrate its objective with the parameters of the teacher's privacy mechanism.

Privacy Preserving reinforcement-learning +1

On Controllability and Persistency of Excitation in Data-Driven Control: Extensions of Willems' Fundamental Lemma

no code implementations 5 Feb 2021 Yue Yu, Shahriar Talebi, Henk J. van Waarde, Ufuk Topcu, Mehran Mesbahi, Behçet Açıkmeşe

Willems' fundamental lemma asserts that all trajectories of a linear time-invariant system can be obtained from a finite number of measured ones, assuming that controllability and a persistency of excitation condition hold.

LEMMA Model Predictive Control
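The fundamental lemma can be illustrated numerically: build a Hankel matrix from one measured trajectory of a toy system and check that a fresh trajectory lies in its column span. This is a hedged sketch under simplifying assumptions (a scalar system with full state measured, and a generic random input standing in for a formal persistency-of-excitation check).

```python
import numpy as np

a, b = 0.9, 1.0  # scalar LTI system: x+ = a x + b u

def simulate(u, x0=0.0):
    xs = [x0]
    for uk in u:
        xs.append(a * xs[-1] + b * uk)
    return np.array(xs[:-1])  # states aligned with inputs

def hankel(sig, L):
    # columns are length-L sliding windows of the signal
    return np.column_stack([sig[i:i + L] for i in range(len(sig) - L + 1)])

rng = np.random.default_rng(1)
u_data = rng.normal(size=20)          # one measured, exciting input sequence
x_data = simulate(u_data)

L = 4
H = np.vstack([hankel(u_data, L), hankel(x_data, L)])  # stacked Hankel matrix

# A fresh length-L trajectory of the same system (different initial state)...
u_new = rng.normal(size=L)
x_new = simulate(u_new, x0=0.5)
w_new = np.concatenate([u_new, x_new])

# ...is a linear combination of the Hankel columns.
g, *_ = np.linalg.lstsq(H, w_new, rcond=None)
print(np.linalg.norm(H @ g - w_new) < 1e-8)
```

This data-as-model viewpoint is what the paper's extensions build on for data-driven control.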

Safe Multi-Agent Reinforcement Learning via Shielding

no code implementations 27 Jan 2021 Ingy Elsayed-Aly, Suda Bharadwaj, Christopher Amato, Rüdiger Ehlers, Ufuk Topcu, Lu Feng

Multi-agent reinforcement learning (MARL) has been increasingly used in a wide range of safety-critical applications, which require guaranteed safety (e.g., no unsafe states are ever visited) during the learning process. Unfortunately, current MARL methods do not have safety guarantees.

Multi-agent Reinforcement Learning reinforcement-learning +1

Multiple Plans are Better than One: Diverse Stochastic Planning

no code implementations 31 Dec 2020 Mahsa Ghasemi, Evan Scope Crafts, Bo Zhao, Ufuk Topcu

In planning problems, it is often challenging to fully model the desired specifications.

Faster Non-Convex Federated Learning via Global and Local Momentum

no code implementations 7 Dec 2020 Rudrajit Das, Anish Acharya, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu

We propose \texttt{FedGLOMO}, a novel federated learning (FL) algorithm with an iteration complexity of $\mathcal{O}(\epsilon^{-1.5})$ to converge to an $\epsilon$-stationary point (i.e., $\mathbb{E}[\|\nabla f(\bm{x})\|^2] \leq \epsilon$) for smooth non-convex functions -- under arbitrary client heterogeneity and compressed communication -- compared to the $\mathcal{O}(\epsilon^{-2})$ complexity of most prior works.

Federated Learning

Towards Online Monitoring and Data-driven Control: A Study of Segmentation Algorithms for Laser Powder Bed Fusion Processes

no code implementations 18 Nov 2020 Alexander Nettekoven, Scott Fish, Joseph Beaman, Ufuk Topcu

The identified algorithms can be readily applied to the laser powder bed fusion machines to address each of the above limitations and thus, significantly improve process control.


On-The-Fly Control of Unknown Systems: From Side Information to Performance Guarantees through Reachability

no code implementations 11 Nov 2020 Franck Djeumou, Abraham P. Vinod, Eric Goubault, Sylvie Putot, Ufuk Topcu

Moreover, $\texttt{DaTaControl}$ achieves near-optimal control and is suitable for real-time control of such systems.

Assured Autonomy: Path Toward Living With Autonomous Systems We Can Trust

no code implementations 27 Oct 2020 Ufuk Topcu, Nadya Bliss, Nancy Cooke, Missy Cummings, Ashley Llorens, Howard Shrobe, Lenore Zuck

The second workshop, held in February 2020, focused on existing capabilities, current research, and research trends that could address the challenges and problems identified in the first workshop.

On-The-Fly Control of Unknown Smooth Systems from Limited Data

no code implementations 27 Sep 2020 Franck Djeumou, Abraham P. Vinod, Eric Goubault, Sylvie Putot, Ufuk Topcu

We investigate the problem of data-driven, on-the-fly control of systems with unknown nonlinear dynamics where data from only a single finite-horizon trajectory and possibly side information on the dynamics are available.

Robust Finite-State Controllers for Uncertain POMDPs

no code implementations 24 Sep 2020 Murat Cubuktepe, Nils Jansen, Sebastian Junges, Ahmadreza Marandi, Marnix Suilen, Ufuk Topcu

(3) We linearize this dual problem and (4) solve the resulting finite linear program to obtain locally optimal solutions to the original problem.

Collision Avoidance Motion Planning

Constrained Active Classification Using Partially Observable Markov Decision Processes

no code implementations 10 Aug 2020 Bo Wu, Niklas Lauffer, Mohamadreza Ahmadi, Suda Bharadwaj, Zhe Xu, Ufuk Topcu

The proposed framework relies on assigning a classification belief (a probability distribution) to the attributes of interest.

Attribute Classification +1

Byzantine-Resilient Distributed Hypothesis Testing With Time-Varying Network Topology

no code implementations 1 Aug 2020 Bo Wu, Steven Carr, Suda Bharadwaj, Zhe Xu, Ufuk Topcu

We study the problem of distributed hypothesis testing over a network of mobile agents with limited communication and sensing ranges to infer the true hypothesis collaboratively.

Near-Optimal Reactive Synthesis Incorporating Runtime Information

no code implementations 31 Jul 2020 Suda Bharadwaj, Abraham P. Vinod, Rayna Dimitrova, Ufuk Topcu

We consider the problem of optimal reactive synthesis: compute a strategy that satisfies a mission specification in a dynamic environment and optimizes a performance metric.

Management Motion Planning

On the Complexity of Sequential Incentive Design

1 code implementation 16 Jul 2020 Yagiz Savas, Vijay Gupta, Ufuk Topcu

We model the agent's behavior as a Markov decision process, express its intrinsic motivation as a reward function, which belongs to a finite set of possible reward functions, and consider the incentives as additional rewards offered to the agent.

Optimization and Control

Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks

no code implementations 3 Jul 2020 Yuqian Jiang, Sudarshanan Bharadwaj, Bo Wu, Rishi Shah, Ufuk Topcu, Peter Stone

Reward shaping is a common approach for incorporating domain knowledge into reinforcement learning in order to speed up convergence to an optimal policy.

reinforcement-learning Reinforcement Learning (RL)
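The reward-shaping idea can be made concrete with the classical potential-based scheme, in which the shaped reward r + γΦ(s') − Φ(s) provably preserves optimal policies. This is a hedged sketch of that classical building block, not the paper's temporal-logic-based variant for continuing tasks; the chain MDP and potential below are hypothetical.

```python
# Potential-based reward shaping on a toy 4-state chain with goal state 3.
GAMMA = 0.9
POTENTIAL = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}  # grows toward the goal

def shaped_reward(r, s, s_next, gamma=GAMMA):
    # shaped reward preserves optimal policies (Ng, Harada, Russell, 1999)
    return r + gamma * POTENTIAL[s_next] - POTENTIAL[s]

# With a sparse reward (0 until the goal), shaping densifies the feedback:
step_bonus = shaped_reward(0.0, 0, 1)  # moving toward the goal earns a bonus
print(step_bonus)
```

Moving away from the goal yields a negative shaped reward, so the agent receives directional feedback at every step, which speeds convergence as described above.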

Reward Machines for Cooperative Multi-Agent Reinforcement Learning

2 code implementations 3 Jul 2020 Cyrus Neary, Zhe Xu, Bo Wu, Ufuk Topcu

In cooperative multi-agent reinforcement learning, a collection of agents learns to interact in a shared environment to achieve a common goal.

Multi-agent Reinforcement Learning Q-Learning +2

Active Finite Reward Automaton Inference and Reinforcement Learning Using Queries and Counterexamples

no code implementations 28 Jun 2020 Zhe Xu, Bo Wu, Aditya Ojha, Daniel Neider, Ufuk Topcu

We compare our algorithm with the state-of-the-art RL algorithms for non-Markovian reward functions, such as Joint Inference of Reward machines and Policies for RL (JIRP), Learning Reward Machine (LRM), and Proximal Policy Optimization (PPO2).

Active Learning Q-Learning +2

Collaborative Beamforming Under Localization Errors: A Discrete Optimization Approach

no code implementations27 Mar 2020 Erfaun Noorani, Yagiz Savas, Alec Koppel, John Baras, Ufuk Topcu, Brian M. Sadler

In particular, we formulate a discrete optimization problem to choose only a subset of agents to transmit the message signal so that the variance of the signal-to-noise ratio (SNR) received by the base station is minimized while the expected SNR exceeds a desired threshold.
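
The formulation above is a constrained subset-selection problem. An illustrative brute-force version, with made-up per-agent SNR statistics (the paper's actual method is a discrete optimization approach, not enumeration):

```python
# Brute-force illustration: choose a subset of agents to transmit so that
# the variance of the received SNR is minimized while the expected SNR
# meets a threshold. All numbers below are invented for the example.

from itertools import combinations

means = [2.0, 1.5, 1.0, 0.8]        # expected SNR contribution per agent
variances = [0.5, 0.1, 0.05, 0.45]  # variance due to localization error
threshold = 3.0                     # required expected SNR at the base station

best = None
for k in range(1, len(means) + 1):
    for subset in combinations(range(len(means)), k):
        exp_snr = sum(means[i] for i in subset)
        var_snr = sum(variances[i] for i in subset)  # assuming independent errors
        if exp_snr >= threshold and (best is None or var_snr < best[1]):
            best = (subset, var_snr)

print(best)  # feasible subset with the smallest SNR variance
```

Enumeration is exponential in the number of agents, which is precisely why a structured discrete-optimization formulation matters at scale.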

Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints

no code implementations13 Feb 2020 Steven Carr, Nils Jansen, Ufuk Topcu

Recurrent neural networks (RNNs) have emerged as an effective representation of control policies in sequential decision-making problems.

Decision Making

Adaptive Teaching of Temporal Logic Formulas to Learners with Preferences

no code implementations27 Jan 2020 Zhe Xu, Yuxin Chen, Ufuk Topcu

In the context of teaching temporal logic formulas, an exhaustive search even for a myopic solution takes exponential time (with respect to the time span of the task).

Learning and Planning for Time-Varying MDPs Using Maximum Likelihood Estimation

no code implementations29 Nov 2019 Melkior Ornik, Ufuk Topcu

This paper proposes a formal approach to online learning and planning for agents operating in a priori unknown, time-varying environments.

Online Synthesis for Runtime Enforcement of Safety in Multi-Agent Systems

no code implementations23 Oct 2019 Dhananjay Raju, Suda Bharadwaj, Ufuk Topcu

In this approach, which is fundamentally decentralized, the shield on every agent has two components: a pathfinder that corrects the behavior of the agent and an ordering mechanism that dynamically modifies the priority of the agent.

Collision Avoidance Pathfinder +1

Online Active Perception for Partially Observable Markov Decision Processes with Limited Budget

no code implementations4 Oct 2019 Mahsa Ghasemi, Ufuk Topcu

In applications in which the agent does not have prior knowledge about the available information sources, it is crucial to synthesize active perception strategies at runtime.

Identifying Sparse Low-Dimensional Structures in Markov Chains: A Nonnegative Matrix Factorization Approach

no code implementations27 Sep 2019 Mahsa Ghasemi, Abolfazl Hashemi, Haris Vikalo, Ufuk Topcu

We formulate the task of representation learning as that of mapping the state space of the model to a low-dimensional state space, called the kernel space.

Representation Learning

Joint Inference of Reward Machines and Policies for Reinforcement Learning

no code implementations12 Sep 2019 Zhe Xu, Ivan Gavran, Yousef Ahmad, Rupak Majumdar, Daniel Neider, Ufuk Topcu, Bo Wu

The experiments show that learning high-level knowledge in the form of reward machines can lead to fast convergence to optimal policies in RL, while standard RL methods such as Q-learning and hierarchical RL methods fail to converge to optimal policies even after a substantial number of training steps in many tasks.

Q-Learning reinforcement-learning +1
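
For reference, the Q-learning baseline mentioned above is the standard tabular update Q(s,a) += alpha * (r + gamma * max Q(s',·) - Q(s,a)). A generic sketch on a toy chain (this is the baseline, not the JIRP algorithm itself):

```python
# Minimal tabular Q-learning on a 4-state chain with the goal at state 3.
# Epsilon-greedy exploration; all hyperparameters are illustrative.

import random

random.seed(0)
n_states, n_actions = 4, 2       # action 0: left, action 1: right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, eps = 0.5, 0.9, 0.2

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

for _ in range(500):                 # episodes
    s = 0
    for _ in range(20):              # steps per episode
        a = random.randrange(n_actions) if random.random() < eps \
            else max(range(n_actions), key=lambda a: Q[s][a])
        s_next, r = step(s, a)
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

policy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
print(policy)  # greedy policy moves right toward the goal in every state
```

With a non-Markovian reward (as in the papers above), this flat update fails because the optimal action depends on history, not just the current state, which motivates augmenting the state with a reward-machine state.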

Transfer of Temporal Logic Formulas in Reinforcement Learning

no code implementations10 Sep 2019 Zhe Xu, Ufuk Topcu

Transferring high-level knowledge from a source task to a target task is an effective way to expedite reinforcement learning (RL).

reinforcement-learning Reinforcement Learning (RL) +1

An Encoder-Decoder Based Approach for Anomaly Detection with Application in Additive Manufacturing

no code implementations26 Jul 2019 Baihong Jin, Yingshui Tan, Alexander Nettekoven, Yuxin Chen, Ufuk Topcu, Yisong Yue, Alberto Sangiovanni Vincentelli

We show that the encoder-decoder model is able to identify the injected anomalies in a modern manufacturing process in an unsupervised fashion.

Anomaly Detection

Synthesis of Provably Correct Autonomy Protocols for Shared Control

no code implementations15 May 2019 Murat Cubuktepe, Nils Jansen, Mohammed Alsiekh, Ufuk Topcu

We design the autonomy protocol to ensure that the resulting robot behavior satisfies given safety and performance specifications in probabilistic temporal logic.

Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes

no code implementations ICLR 2019 Mahsa Ghasemi, Ufuk Topcu

However, in a variety of real-world scenarios the agent has an active role in its perception by selecting which observations to receive.

Decision Making

Reward-Based Deception with Cognitive Bias

no code implementations25 Apr 2019 Bo Wu, Murat Cubuktepe, Suda Bharadwaj, Ufuk Topcu

In this paper, we consider deceiving adversaries with bounded rationality and in terms of expected rewards.

Counterexample-Guided Strategy Improvement for POMDPs Using Recurrent Neural Networks

no code implementations20 Mar 2019 Steven Carr, Nils Jansen, Ralf Wimmer, Alexandru C. Serban, Bernd Becker, Ufuk Topcu

The particular problem is to determine strategies that provably adhere to (probabilistic) temporal logic constraints.

Distributed Synthesis of Surveillance Strategies for Mobile Sensors

no code implementations6 Feb 2019 Suda Bharadwaj, Rayna Dimitrova, Ufuk Topcu

We study the problem of synthesizing strategies for a mobile sensor network to conduct surveillance in partnership with static alarm triggers.

Algorithms for Fairness in Sequential Decision Making

1 code implementation24 Jan 2019 Min Wen, Osbert Bastani, Ufuk Topcu

It has recently been shown that if feedback effects of decisions are ignored, then imposing fairness constraints such as demographic parity or equality of opportunity can actually exacerbate unfairness.

Decision Making Fairness

Constrained Cross-Entropy Method for Safe Reinforcement Learning

no code implementations NeurIPS 2018 Min Wen, Ufuk Topcu

We show that the asymptotic behavior of the proposed algorithm can be almost-surely described by that of an ordinary differential equation.

reinforcement-learning Reinforcement Learning (RL) +1

The Partially Observable Games We Play for Cyber Deception

no code implementations28 Sep 2018 Mohamadreza Ahmadi, Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu

Then, the deception problem is to compute a strategy for the deceiver that minimizes the expected cost of deception against all strategies of the infiltrator.

Entropy Maximization for Markov Decision Processes Under Temporal Logic Constraints

no code implementations9 Jul 2018 Yagiz Savas, Melkior Ornik, Murat Cubuktepe, Mustafa O. Karabag, Ufuk Topcu

Such a policy minimizes the predictability of the paths it generates, or dually, maximizes the exploration of different paths in an MDP while ensuring the satisfaction of a temporal logic specification.

Motion Planning

Deception in Optimal Control

no code implementations8 May 2018 Melkior Ornik, Ufuk Topcu

In this paper, we consider an adversarial scenario where one agent seeks to achieve an objective and its adversary seeks to learn the agent's intentions and prevent the agent from achieving its objective.

Counterexamples for Robotic Planning Explained in Structured Language

no code implementations23 Mar 2018 Lu Feng, Mahsa Ghasemi, Kai-Wei Chang, Ufuk Topcu

Automated techniques such as model checking have been used to verify models of robotic mission plans based on Markov decision processes (MDPs) and generate counterexamples that may help diagnose requirement violations.

Synthesis in pMDPs: A Tale of 1001 Parameters

no code implementations5 Mar 2018 Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu

This paper considers parametric Markov decision processes (pMDPs) whose transitions are equipped with affine functions over a finite set of parameters.

Verification of Markov Decision Processes with Risk-Sensitive Measures

no code implementations28 Feb 2018 Murat Cubuktepe, Ufuk Topcu

We develop a method for computing policies in Markov decision processes with risk-sensitive measures subject to temporal logic constraints.

Human-in-the-Loop Synthesis for Partially Observable Markov Decision Processes

no code implementations27 Feb 2018 Steven Carr, Nils Jansen, Ralf Wimmer, Jie Fu, Ufuk Topcu

The efficient verification of this MC gives quantitative insights into the quality of the inferred human strategy by proving or disproving given system specifications.

Sensor Synthesis for POMDPs with Reachability Objectives

no code implementations29 Sep 2017 Krishnendu Chatterjee, Martin Chmelik, Ufuk Topcu

Partially observable Markov decision processes (POMDPs) are widely used in probabilistic planning problems in which an agent interacts with an environment using noisy and imprecise sensors.

Control-Oriented Learning on the Fly

no code implementations14 Sep 2017 Melkior Ornik, Arie Israel, Ufuk Topcu

This paper focuses on developing a strategy for control of systems whose dynamics are almost entirely unknown.

Safe Reinforcement Learning via Shielding

1 code implementation29 Aug 2017 Mohammed Alshiekh, Roderick Bloem, Ruediger Ehlers, Bettina Könighofer, Scott Niekum, Ufuk Topcu

In the first one, the shield acts each time the learning agent is about to make a decision and provides a list of safe actions.

reinforcement-learning Reinforcement Learning (RL) +1
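
The first shielding mode described above, where the shield hands the learner a list of safe actions before each decision, can be sketched as a simple filter (a hedged toy version; the paper computes the safe set from a formal safety specification, whereas `is_safe` here is a hand-written stand-in):

```python
# Toy preemptive shield: restrict the learner's action set to actions
# verified safe in the current state, before the learner chooses.

def shield(state, actions, is_safe):
    """Return the subset of actions the shield deems safe in this state."""
    safe = [a for a in actions if is_safe(state, a)]
    return safe if safe else actions  # degenerate fallback if nothing is safe

# Toy corridor: positions 0..4 with a hazard at 4, so moving right
# from position 3 is unsafe.
def is_safe(pos, action):
    return not (pos == 3 and action == "right")

allowed = shield(3, ["left", "right"], is_safe)
print(allowed)  # the learner may only pick from this list
```

The learning algorithm itself is unchanged; it simply selects among `allowed`, which is what makes shielding compatible with off-the-shelf RL methods.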

Strategy Synthesis in POMDPs via Game-Based Abstractions

no code implementations14 Aug 2017 Leonore Winterer, Sebastian Junges, Ralf Wimmer, Nils Jansen, Ufuk Topcu, Joost-Pieter Katoen, Bernd Becker

We study synthesis problems with constraints in partially observable Markov decision processes (POMDPs), where the objective is to compute a strategy for an agent that is guaranteed to satisfy certain safety and performance specifications.

Motion Planning

Environment-Independent Task Specifications via GLTL

no code implementations14 Apr 2017 Michael L. Littman, Ufuk Topcu, Jie Fu, Charles Isbell, Min Wen, James Macglashan

We propose a new task-specification language for Markov decision processes that is designed to be an improvement over reward functions by being environment independent.

reinforcement-learning Reinforcement Learning (RL)

Synthesis of Shared Control Protocols with Provable Safety and Performance Guarantees

no code implementations26 Oct 2016 Nils Jansen, Murat Cubuktepe, Ufuk Topcu

We formalize synthesis of shared control protocols with correctness guarantees for temporal logic specifications.

An Automaton Learning Approach to Solving Safety Games over Infinite Graphs

no code implementations7 Jan 2016 Daniel Neider, Ufuk Topcu

We propose a method to construct finite-state reactive controllers for systems whose interactions with their adversarial environment are modeled by infinite-duration two-player games over (possibly) infinite graphs.

Motion Planning

Correct-by-synthesis reinforcement learning with temporal logic constraints

no code implementations5 Mar 2015 Min Wen, Ruediger Ehlers, Ufuk Topcu

We establish both correctness (with respect to the temporal logic specifications) and optimality (with respect to the a priori unknown performance criterion) of this two-step technique for a fragment of temporal logic specifications.

Motion Planning Q-Learning +2

Integrating active sensing into reactive synthesis with temporal logic constraints under partial observations

no code implementations1 Oct 2014 Jie Fu, Ufuk Topcu

We show that by alternating between the observation-based strategy and the active sensing strategy, under a mild technical assumption of the set of sensors in the system, the given temporal logic specification can be satisfied with probability 1.

Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

no code implementations28 Apr 2014 Jie Fu, Ufuk Topcu

We model the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities.
