no code implementations • ICML 2020 • Mahsa Ghasemi, Erdem Bulgur, Ufuk Topcu
Furthermore, as new data arrive, the belief over the atomic propositions evolves and, subsequently, the planning strategy adapts accordingly.
no code implementations • 3 Feb 2025 • Shenghui Chen, Ruihan Zhao, Sandeep Chinchali, Ufuk Topcu
We introduce CoNav-Maze, a simulated robotics environment where a robot navigates using local perception while a human operator provides guidance based on an inaccurate map.
1 code implementation • 31 Jan 2025 • Mustafa O. Karabag, Ufuk Topcu
The empirical results show that while non-chameleon LLM agents identify the chameleon, they fail to conceal the secret from the chameleon, and their winning probability is far from the levels of even trivial strategies.
no code implementations • 30 Jan 2025 • Yerin Kim, Alexander Benvenuti, Bo Chen, Mustafa Karabag, Abhishek Kulkarni, Nathaniel D. Bastian, Ufuk Topcu, Matthew Hale
Autonomous systems are increasingly expected to operate in the presence of adversaries, though an adversary may infer sensitive information simply by observing a system, without even needing to interact with it.
no code implementations • 30 Jan 2025 • Tyler Ingebrand, Adam J. Thorpe, Ufuk Topcu
A central challenge in transfer learning is designing algorithms that can quickly adapt and generalize to new tasks without retraining.
no code implementations • 15 Jan 2025 • Surya Murthy, John-Paul Clarke, Ufuk Topcu, Zhenyu Gao
Urban air mobility (UAM) is a transformative system that operates various small aerial vehicles in urban environments to reshape urban transportation.
1 code implementation • 15 Dec 2024 • Cyrus Neary, Nathan Tsao, Ufuk Topcu
Towards developing deep learning models for constrained dynamical systems, we introduce neural port-Hamiltonian differential algebraic equations (N-PHDAEs), which use neural networks to parametrize unknown terms in both the differential and algebraic components of a port-Hamiltonian DAE.
no code implementations • 2 Dec 2024 • Cevahir Koprulu, Po-han Li, Tianyu Qiu, Ruihan Zhao, Tyler Westenbroek, David Fridovich-Keil, Sandeep Chinchali, Ufuk Topcu
Many continuous control problems can be formulated as sparse-reward reinforcement learning (RL) tasks.
no code implementations • 15 Nov 2024 • Po-han Li, Yunhao Yang, Mohammad Omama, Sandeep Chinchali, Ufuk Topcu
Autonomous agents perceive and interpret their surroundings by integrating multimodal inputs, such as vision, audio, and LiDAR.
no code implementations • 3 Nov 2024 • Neel P. Bhatt, Yunhao Yang, Rohan Siva, Daniel Milan, Ufuk Topcu, Zhangyang Wang
To quantify each type of uncertainty, we propose methods tailored to the unique properties of perception and decision-making: we use conformal prediction to calibrate perception uncertainty and introduce Formal-Methods-Driven Prediction (FMDP) to quantify decision uncertainty, leveraging formal verification techniques for theoretical guarantees.
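As a rough illustration of the calibration step mentioned above, the following is a minimal sketch of generic split conformal prediction for a classifier. It is not the paper's implementation: the function names, the nonconformity score (one minus the predicted class probability), and the sample numbers are all assumptions chosen for illustration.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Return the split-conformal score threshold for miscoverage level alpha.

    calibration_scores are nonconformity scores computed on held-out data
    (here assumed to be 1 - probability assigned to the true label).
    """
    n = len(calibration_scores)
    # Finite-sample corrected rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every label must be kept.
        return math.inf
    return sorted(calibration_scores)[k - 1]


def prediction_set(class_probs, threshold):
    """Keep every label whose nonconformity score does not exceed the threshold."""
    return [label for label, p in class_probs.items() if 1.0 - p <= threshold]


# Hypothetical calibration scores and a hypothetical perception output.
calibration_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
threshold = conformal_threshold(calibration_scores, alpha=0.25)
labels = prediction_set({"car": 0.7, "truck": 0.25, "bike": 0.05}, threshold)
```

Under the usual exchangeability assumption, sets built this way contain the true label with probability at least 1 - alpha; a larger set signals higher perception uncertainty.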
no code implementations • 23 Oct 2024 • Shenghui Chen, Ruihan Zhao, Sandeep Chinchali, Ufuk Topcu
To synthesize cooperative policies for the agent in this extended game, we propose an approach featuring a memory module for a running probabilistic belief of the environment dynamics and an online planning algorithm called IntentMCTS.
no code implementations • 18 Oct 2024 • Yunhao Yang, William Ward, Zichao Hu, Joydeep Biswas, Ufuk Topcu
Given a high-level task description in natural language, the proposed method queries a language model to generate plans in the form of executable robot programs.
no code implementations • 18 Oct 2024 • Yunhao Yang, Leonard Berthellemy, Ufuk Topcu
We develop a method that integrates the tree of thoughts and multi-agent framework to enhance the capability of pre-trained language models in solving complex, unfamiliar games.
no code implementations • 10 Oct 2024 • Po-han Li, Sandeep P. Chinchali, Ufuk Topcu
CSA maps unimodal features into a multimodal space, using a new similarity score to retain only the multimodal information.
no code implementations • 2 Oct 2024 • Yunhao Yang, Yuxin Hu, Mao Ye, Zaiwei Zhang, Zhichao Lu, Yi Xu, Ufuk Topcu, Ben Snyder
Multimodal foundation models offer promising advancements for enhancing driving perception systems, but their high computational and financial costs pose challenges.
1 code implementation • 30 Sep 2024 • Tyler Ingebrand, Adam J. Thorpe, Somdatta Goswami, Krishna Kumar, Ufuk Topcu
We present Basis-to-Basis (B2B) operator learning, a novel approach for learning operators on Hilbert spaces of functions based on the foundational ideas of function encoders.
2 code implementations • 30 Sep 2024 • Kevin Wang, Junbo Li, Neel P. Bhatt, Yihan Xi, Qiang Liu, Ufuk Topcu, Zhangyang Wang
Recent advancements in Large Language Models (LLMs) have showcased their ability to perform complex reasoning tasks, but their effectiveness in planning remains underexplored.
no code implementations • 19 Sep 2024 • Shenghui Chen, Shufang Zhu, Giuseppe De Giacomo, Ufuk Topcu
We demonstrate how an autonomous agent can learn to cooperate by interpreting its partner's actions, which are used to hint at its intents.
no code implementations • 23 Aug 2024 • Georgios Bakirtzis, Michail Savvas, Ruihan Zhao, Sandeep Chinchali, Ufuk Topcu
In reinforcement learning, conducting task composition by forming cohesive, executable sequences from multiple tasks remains challenging.
1 code implementation • 20 Aug 2024 • Surya Murthy, Mustafa O. Karabag, Ufuk Topcu
The offering agent makes trade offers to improve its utility without knowing the responding agent's utility function, and the responding agent only accepts offers that improve its utility.
no code implementations • 16 Aug 2024 • Maris F. L. Galesloot, Marnix Suilen, Thiago D. Simão, Steven Carr, Matthijs T. J. Spaan, Ufuk Topcu, Nils Jansen
To compute such robust memory-based policies, we propose the pessimistic iterative planning (PIP) framework, which alternates between two main steps: (1) selecting a pessimistic (non-robust) POMDP via worst-case probability instances from the uncertainty sets; and (2) computing a finite-state controller (FSC) for this pessimistic POMDP.
no code implementations • 16 Aug 2024 • Georgios Bakirtzis, Andrea Aler Tubella, Andreas Theodorou, David Danks, Ufuk Topcu
Sociotechnical requirements shape the governance of artificially intelligent (AI) systems.
1 code implementation • 23 May 2024 • Shenghui Chen, Daniel Fried, Ufuk Topcu
To solve this problem, we propose a communication-based approach comprising a language module and a planning module.
no code implementations • 1 Apr 2024 • Lisong C. Sun, Neel P. Bhatt, Jonathan C. Liu, Zhiwen Fan, Zhangyang Wang, Todd E. Humphreys, Ufuk Topcu
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
no code implementations • 25 Mar 2024 • Kevin S. Miller, Adam J. Thorpe, Ufuk Topcu
We present an active learning algorithm for learning dynamics that leverages side information by explicitly incorporating prior domain knowledge into the sampling process.
no code implementations • 14 Feb 2024 • Xinjie Liu, Lasse Peters, Javier Alonso-Mora, Ufuk Topcu, David Fridovich-Keil
When multiple agents interact in a common environment, each agent's actions impact others' future decisions, and noncooperative dynamic games naturally capture this coupling.
no code implementations • 13 Feb 2024 • Po-han Li, Oyku Selin Toprak, Aditya Narayanan, Ufuk Topcu, Sandeep Chinchali
We thus formulate a user-centric online model selection problem and propose a novel solution that combines an open-source encoder to output context and an online learning algorithm that processes this context.
no code implementations • 11 Feb 2024 • Shayan Meshkat Alsadat, Jean-Raphael Gaglione, Daniel Neider, Ufuk Topcu, Zhe Xu
Our method uses large language models (LLMs) to obtain high-level domain-specific knowledge through prompt engineering, instead of providing the reinforcement learning algorithm directly with the high-level knowledge, which requires an expert to encode the automaton.
no code implementations • 7 Feb 2024 • Arash Amini, Yigit Ege Bayiz, Ashwin Ram, Radu Marculescu, Ufuk Topcu
In the era of social media platforms, identifying the credibility of online content is crucial to combat misinformation.
2 code implementations • 30 Jan 2024 • Tyler Ingebrand, Amy Zhang, Ufuk Topcu
Although reinforcement learning (RL) can solve many challenging sequential decision making problems, achieving zero-shot transfer across related tasks remains a challenge.
no code implementations • 1 Jan 2024 • Zhenyu Gao, Yue Yu, Qinshuang Wei, Ufuk Topcu, John-Paul Clarke
Urban air mobility (UAM), a transformative concept for the transport of passengers and cargo, faces several integration challenges in complex urban environments.
no code implementations • 2 Dec 2023 • Cyrus Neary, Christian Ellis, Aryaman Singh Samyal, Craig Lennon, Ufuk Topcu
We propose and demonstrate a compositional framework for training and verifying reinforcement learning (RL) systems within a multifidelity sim-to-real pipeline, in order to deploy reliable and adaptable RL policies on physical hardware.
no code implementations • 2 Nov 2023 • Tichakorn Wongpiromsarn, Mahsa Ghasemi, Murat Cubuktepe, Georgios Bakirtzis, Steven Carr, Mustafa O. Karabag, Cyrus Neary, Parham Gohari, Ufuk Topcu
Formal methods refer to rigorous, mathematical approaches to system development and have played a key role in establishing the correctness of safety-critical systems.
no code implementations • 27 Oct 2023 • Yunhao Yang, Neel P. Bhatt, Tyler Ingebrand, William Ward, Steven Carr, Zhangyang Wang, Ufuk Topcu
Although pre-trained language models encode generic knowledge beneficial for planning and control, they may fail to generate appropriate control policies for domain-specific tasks.
no code implementations • 17 Oct 2023 • David Jensen, Brian LaMacchia, Ufuk Topcu, Pamela Wisniewski
Algorithmic robustness refers to the sustained performance of a computational system in the face of change in the nature of the environment in which that system operates or in the task that the system is meant to perform.
no code implementations • 30 Sep 2023 • Mustafa O. Karabag, Sophia Smith, Negar Mehr, David Fridovich-Keil, Ufuk Topcu
We also analyze bimatrix Stackelberg games and identify a set of games where the leader's near-optimal strategy may suffer from a large inferability gap.
no code implementations • 18 Sep 2023 • Yunhao Yang, Jean-Raphaël Gaglione, Sandeep Chinchali, Ufuk Topcu
The increasing abundance of video data enables users to search for events of interest, e.g., emergency incidents.
no code implementations • 13 Sep 2023 • Parham Gohari, Matthew Hale, Ufuk Topcu
Accordingly, we propose Privacy-Engineered Value Decomposition Networks (PE-VDN), a Co-MARL algorithm that models multi-agent coordination while provably safeguarding the confidentiality of the agents' environment interaction data.
no code implementations • 9 Sep 2023 • Cyrus Neary, Aryaman Singh Samyal, Christos Verginis, Murat Cubuktepe, Ufuk Topcu
We propose a framework for verifiable and compositional reinforcement learning (RL) in which a collection of RL subsystems, each of which learns to accomplish a separate subtask, are composed to achieve an overall task.
no code implementations • 27 Aug 2023 • Apurva Patil, Mustafa O. Karabag, Takashi Tanaka, Ufuk Topcu
We study the deception problem in the continuous-state discrete-time stochastic dynamics setting and, using motivations from hypothesis testing theory, formulate a Kullback-Leibler control problem for the synthesis of deceptive policies.
no code implementations • 15 Aug 2023 • William Ward, Yue Yu, Jacob Levy, Negar Mehr, David Fridovich-Keil, Ufuk Topcu
We formulate an inverse learning problem in a Stackelberg game between a leader and a follower, where each player's action is the trajectory of a dynamical system.
no code implementations • 10 Aug 2023 • Yunhao Yang, Cyrus Neary, Ufuk Topcu
We develop an algorithm that utilizes the knowledge from pretrained models to construct and verify controllers for sequential decision-making tasks, and to ground these controllers to task environments through visual observations with formal guarantees.
no code implementations • 23 Jun 2023 • Yash Paliwal, Rajarshi Roy, Jean-Raphaël Gaglione, Nasim Baharisangari, Daniel Neider, Xiaoming Duan, Ufuk Topcu, Zhe Xu
We study a class of reinforcement learning (RL) tasks where the objective of the agent is to accomplish temporally extended goals.
no code implementations • 10 Jun 2023 • Franck Djeumou, Jonathan Y. M. Goh, Ufuk Topcu, Avinash Balachandran
Near the limits of adhesion, the forces generated by a tire are nonlinear and intricately coupled.
no code implementations • 10 Jun 2023 • Franck Djeumou, Cyrus Neary, Ufuk Topcu
We present a framework and algorithms to learn controlled dynamics models using neural stochastic differential equations (SDEs) -- SDEs whose drift and diffusion terms are both parametrized by neural networks.
1 code implementation • 8 Jun 2023 • Qinshuang Wei, Zhenyu Gao, John-Paul Clarke, Ufuk Topcu
In our methodology, we first model how disruptions to a given UAM network might impact the nominal traffic flow and how this flow might be re-accommodated on an extended network with reserve capacity.
no code implementations • 27 May 2023 • Jueming Hu, Jean-Raphael Gaglione, Yanze Wang, Zhe Xu, Ufuk Topcu, Yongming Liu
We develop an algorithm called Q-learning with reward machines for stochastic games (QRM-SG), to learn the best-response strategy at Nash equilibrium for each agent.
1 code implementation • 25 May 2023 • Cevahir Koprulu, Ufuk Topcu
Self-paced reinforcement learning (RL) aims to improve the data efficiency of learning by automatically creating sequences, namely curricula, of probability distributions over contexts.
1 code implementation • NeurIPS 2023 • Po-han Li, Sravan Kumar Ankireddy, Ruihan Zhao, Hossein Nourkhiz Mahjoub, Ehsan Moradi-Pari, Ufuk Topcu, Sandeep Chinchali, Hyeji Kim
A decoder at the central node decompresses and passes the data to a pre-trained machine learning-based task to generate the final output.
no code implementations • 11 May 2023 • Qinshuang Wei, Yue Yu, Ufuk Topcu
Urban air mobility (UAM) is an emerging concept in short-range aviation transportation, where the aircraft will take off, land, and charge their batteries at a set of vertistops, and travel only through a set of flight corridors connecting these vertistops.
no code implementations • 1 May 2023 • Thom Badings, Sebastian Junges, Ahmadreza Marandi, Ufuk Topcu, Nils Jansen
As our main contribution, we present an efficient method to compute these partial derivatives.
1 code implementation • 31 Mar 2023 • Shenghui Chen, Yue Yu, David Fridovich-Keil, Ufuk Topcu
Markov games model interactions among multiple players in a stochastic, dynamic environment.
no code implementations • 7 Mar 2023 • Mustafa O. Karabag, Ufuk Topcu
Under no assumption of independent samples, we provide a high-probability, polynomial sample complexity bound for vanilla model-based off-policy evaluation that requires partial or uniform coverage.
1 code implementation • 20 Jan 2023 • Bo Chen, Calvin Hawkins, Mustafa O. Karabag, Cyrus Neary, Matthew Hale, Ufuk Topcu
We synthesize policies that are robust to privacy by reducing the value of the total correlation.
no code implementations • 9 Jan 2023 • Adam J. Thorpe, Cyrus Neary, Franck Djeumou, Meeko M. K. Oishi, Ufuk Topcu
Our proposed approach incorporates prior knowledge of the system dynamics as a bias term in the kernel learning problem.
1 code implementation • 30 Dec 2022 • Franck Djeumou, Christian Ellis, Murat Cubuktepe, Craig Lennon, Ufuk Topcu
First, they require an excessive amount of data due to the information asymmetry between the expert and the learner.
no code implementations • 4 Dec 2022 • Yunhao Yang, Jean-Raphaël Gaglione, Cyrus Neary, Ufuk Topcu
However, the textual outputs from GLMs cannot be formally verified or used for sequential decision-making.
no code implementations • 2 Dec 2022 • Jean-Raphaël Gaglione, Rajarshi Roy, Nasim Baharisangari, Daniel Neider, Zhe Xu, Ufuk Topcu
Learning linear temporal logic (LTL) formulas from examples labeled as positive or negative has found applications in inferring descriptions of system behavior.
1 code implementation • 1 Dec 2022 • Cyrus Neary, Ufuk Topcu
Toward the objective of learning composite models of such systems from data, we present i) a framework for compositional neural networks, ii) algorithms to train these models, iii) a method to compose the learned models, iv) theoretical results that bound the error of the resulting composite models, and v) a method to learn the composition itself, when it is not known a priori.
no code implementations • 21 Nov 2022 • Dhananjay Raju, Georgios Bakirtzis, Ufuk Topcu
Fault diagnosis is the problem of determining a set of faulty system components that explain discrepancies between observed and expected behavior.
no code implementations • 1 Oct 2022 • Po-han Li, Sandeep P. Chinchali, Ufuk Topcu
We analyze a cost-minimization problem in which the controller relies on an imperfect timeseries forecast.
no code implementations • 19 Sep 2022 • Yue Yu, Ruihan Zhao, Sandeep Chinchali, Ufuk Topcu
Data-driven predictive control (DPC) is a feedback control method for systems with unknown dynamics.
1 code implementation • 6 Sep 2022 • Rajarshi Roy, Jean-Raphaël Gaglione, Nasim Baharisangari, Daniel Neider, Zhe Xu, Ufuk Topcu
To learn meaningful models from positive examples only, we design algorithms that rely on conciseness and language minimality of models as regularizers.
no code implementations • 29 Aug 2022 • Georgios Bakirtzis, Michail Savvas, Ufuk Topcu
However, generating compositional models requires the characterization of minimal assumptions for the robustness of the compositional feature.
no code implementations • 17 Jul 2022 • Christos K. Verginis, Zhe Xu, Ufuk Topcu
Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees.
no code implementations • 14 Jul 2022 • Po-han Li, Ufuk Topcu, Sandeep P. Chinchali
We propose a method to attack controllers that rely on external timeseries forecasts as task parameters.
no code implementations • 2 Jun 2022 • Shenghui Chen, Yagiz Savas, Mustafa O. Karabag, Brian M. Sadler, Ufuk Topcu
We consider a team of autonomous agents that navigate in an adversarial environment and aim to achieve a task by allocating their resources over a set of target locations.
no code implementations • 25 May 2022 • Yunhao Yang, Parham Gohari, Ufuk Topcu
We study the privacy risks that are associated with training a neural network's weights with self-supervised learning algorithms.
no code implementations • 20 Apr 2022 • Christos Verginis, Cevahir Koprulu, Sandeep Chinchali, Ufuk Topcu
We develop a reinforcement-learning algorithm that infers a reward machine that encodes the underlying task while learning how to execute it, despite the uncertainties of the propositions' truth values.
no code implementations • 2 Apr 2022 • Steven Carr, Nils Jansen, Sebastian Junges, Ufuk Topcu
Safe exploration is a common problem in reinforcement learning (RL) that aims to prevent agents from making disastrous decisions while exploring their environment.
no code implementations • 5 Mar 2022 • Michael Hibbard, Abraham P. Vinod, Jesse Quattrociocchi, Ufuk Topcu
We introduce the Safely motion planner, a receding-horizon control framework, that simultaneously synthesizes both a trajectory for the robot to follow as well as a sensor selection strategy that prescribes trajectory-relevant obstacles to measure at each time step while respecting the sensing constraints of the robot.
no code implementations • 10 Feb 2022 • Niklas Lauffer, Mahsa Ghasemi, Abolfazl Hashemi, Yagiz Savas, Ufuk Topcu
The regret of the proposed learning algorithm is independent of the size of the state space and polynomial in the rest of the parameters of the game.
no code implementations • 26 Jan 2022 • Chenyu You, Ruihan Zhao, Fenglin Liu, Siyuan Dong, Sandeep Chinchali, Ufuk Topcu, Lawrence Staib, James S. Duncan
In this work, we present CASTformer, a novel type of adversarial transformer, for 2D medical image segmentation.
1 code implementation • 17 Jan 2022 • Mustafa O. Karabag, Cyrus Neary, Ufuk Topcu
In this work, we develop joint policies for cooperative multiagent systems that are robust to potential losses in communication.
1 code implementation • 14 Jan 2022 • Franck Djeumou, Cyrus Neary, Eric Goubault, Sylvie Putot, Ufuk Topcu
Neural ordinary differential equations (NODEs) -- parametrizations of differential equations using neural networks -- have shown tremendous promise in learning models of unknown continuous-time dynamical systems from data.
no code implementations • 11 Oct 2021 • Christos K. Verginis, Zhe Xu, Ufuk Topcu
Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees.
no code implementations • 6 Oct 2021 • Yunhao Yang, Parham Gohari, Ufuk Topcu
Additionally, we study the effectiveness of two prominent mitigation methods for preempting MIAs, namely weight regularization and differential privacy.
no code implementations • 14 Sep 2021 • Yagiz Savas, Christos K. Verginis, Ufuk Topcu
We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks in stochastic, complex environments.
1 code implementation • 14 Sep 2021 • Franck Djeumou, Cyrus Neary, Eric Goubault, Sylvie Putot, Ufuk Topcu
The physics-informed constraints are enforced via the augmented Lagrangian method during the model's training.
no code implementations • 10 Sep 2021 • Michael Hibbard, Takashi Tanaka, Ufuk Topcu
Although perception is an increasingly dominant portion of the overall computational cost for autonomous systems, only a fraction of the information perceived is likely to be relevant to the current task.
1 code implementation • 30 Jun 2021 • Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
The parameter synthesis problem is to compute an instantiation of these unspecified parameters such that the resulting MDP satisfies the temporal logic specification.
no code implementations • 30 Jun 2021 • Farzan Memarian, Abolfazl Hashemi, Scott Niekum, Ufuk Topcu
We explore methodologies to improve the robustness of generative adversarial imitation learning (GAIL) algorithms to observation noise.
no code implementations • 29 Jun 2021 • Franck Djeumou, Zhe Xu, Murat Cubuktepe, Ufuk Topcu
Specifically, we study a setting in which the agents move along the nodes of a graph, and the high-level task specifications for the swarm are expressed in a recently-proposed language called graph temporal logic (GTL).
no code implementations • 19 Jun 2021 • Franck Djeumou, Ufuk Topcu
We develop a learning-based control algorithm for unknown dynamical systems under very severe data limitations.
2 code implementations • 16 Jun 2021 • Anish Acharya, Abolfazl Hashemi, Prateek Jain, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu
Geometric median (GM) is a classical method in statistics for achieving a robust estimation of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5.
Ranked #24 on Image Classification on MNIST (Accuracy metric)
1 code implementation • 7 Jun 2021 • Cyrus Neary, Christos Verginis, Murat Cubuktepe, Ufuk Topcu
We propose a framework for verifiable and compositional reinforcement learning (RL) in which a collection of RL subsystems, each of which learns to accomplish a separate subtask, are composed to achieve an overall task.
no code implementations • 28 May 2021 • Franck Djeumou, Murat Cubuktepe, Craig Lennon, Ufuk Topcu
Nevertheless, the resulting formulation is still nonconvex due to the intrinsic nonconvexity of the so-called forward problem, i.e., computing an optimal policy given a reward function, in POMDPs.
1 code implementation • 24 May 2021 • Nasim Baharisangari, Jean-Raphaël Gaglione, Daniel Neider, Ufuk Topcu, Zhe Xu
In this paper, we first investigate the uncertainties associated with trajectories of a system and represent such uncertainties in the form of interval trajectories.
no code implementations • 13 May 2021 • Christos K. Verginis, Franck Djeumou, Ufuk Topcu
We develop a control algorithm that ensures the safety, in terms of confinement in a set, of a system with unknown, 2nd-order nonlinear dynamics.
no code implementations • 5 May 2021 • František Blahoudek, Petr Novotný, Melkior Ornik, Pranay Thangeda, Ufuk Topcu
We consider qualitative strategy synthesis for the formalism called consumption Markov decision processes.
no code implementations • 4 May 2021 • Murat Cubuktepe, František Blahoudek, Ufuk Topcu
We develop an algorithm that solves this graph problem in time that is polynomial in the number of agents, target locations, and size of the consumption Markov decision process.
no code implementations • 30 Apr 2021 • Jean-Raphaël Gaglione, Daniel Neider, Rajarshi Roy, Ufuk Topcu, Zhe Xu
Our first algorithm infers minimal LTL formulas by reducing the inference problem to a problem in maximum satisfiability and then using off-the-shelf MaxSAT solvers to find a solution.
no code implementations • 6 Apr 2021 • Aris Kanellopoulos, Filippos Fotiadis, Chuangchuang Sun, Zhe Xu, Kyriakos G. Vamvoudakis, Ufuk Topcu, Warren E. Dixon
In this paper, we develop safe reinforcement-learning-based controllers for systems tasked with accomplishing complex missions that can be expressed as linear temporal logic specifications, similar to those required by search-and-rescue missions.
1 code implementation • 8 Mar 2021 • Farzan Memarian, Wonjoon Goo, Rudolf Lioutikov, Scott Niekum, Ufuk Topcu
We introduce Self-supervised Online Reward Shaping (SORS), which aims to improve the sample efficiency of any RL algorithm in sparse-reward environments by automatically densifying rewards.
2 code implementations • 4 Mar 2021 • Abolfazl Hashemi, Hayden Schaeffer, Robert Shi, Ufuk Topcu, Giang Tran, Rachel Ward
In particular, we provide generalization bounds for functions in a certain class (that is dense in a reproducing kernel Hilbert space) depending on the number of samples and the distribution of features.
no code implementations • 28 Feb 2021 • Yagiz Savas, Abolfazl Hashemi, Abraham P. Vinod, Brian M. Sadler, Ufuk Topcu
In such a setting, we develop a periodic transmission strategy, i.e., a sequence of joint beamforming gain and artificial noise pairs, that prevents the adversaries from decreasing their uncertainty on the information sequence by eavesdropping on the transmission.
no code implementations • 18 Feb 2021 • Parham Gohari, Bo Chen, Bo Wu, Matthew Hale, Ufuk Topcu
We then develop a kickstarted deep reinforcement learning algorithm for the student that is privacy-aware because we calibrate its objective with the parameters of the teacher's privacy mechanism.
no code implementations • 5 Feb 2021 • Yue Yu, Shahriar Talebi, Henk J. van Waarde, Ufuk Topcu, Mehran Mesbahi, Behçet Açıkmeşe
Willems' fundamental lemma asserts that all trajectories of a linear time-invariant system can be obtained from a finite number of measured ones, assuming that controllability and a persistency of excitation condition hold.
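The persistency of excitation condition in the lemma is commonly stated as a rank condition on a Hankel matrix built from the measured input. The sketch below checks that condition for a scalar input signal using exact rational arithmetic. The function names and example signals are illustrative assumptions, and the sketch covers only the rank test, not the paper's results.

```python
from fractions import Fraction


def hankel(signal, depth):
    """Build the depth-row Hankel matrix H[i][j] = signal[i + j]."""
    cols = len(signal) - depth + 1
    if cols < 1:
        raise ValueError("signal is too short for the requested depth")
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]


def rank(matrix):
    """Compute the rank by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_persistently_exciting(signal, order):
    """A scalar signal is persistently exciting of the given order if its
    Hankel matrix of that depth has full row rank."""
    return rank(hankel(signal, order)) == order


rich_input = [1, 0, 0, 1, 0, 0, 0]
constant_input = [1, 1, 1, 1, 1, 1, 1]
rich_ok = is_persistently_exciting(rich_input, 3)
constant_ok = is_persistently_exciting(constant_input, 3)
```

A constant input fails the test because every row of its Hankel matrix is identical, so it carries too little information to span the system's trajectories.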
no code implementations • 27 Jan 2021 • Ingy Elsayed-Aly, Suda Bharadwaj, Christopher Amato, Rüdiger Ehlers, Ufuk Topcu, Lu Feng
Multi-agent reinforcement learning (MARL) has been increasingly used in a wide range of safety-critical applications, which require guaranteed safety (e.g., no unsafe states are ever visited) during the learning process. Unfortunately, current MARL methods do not have safety guarantees.
no code implementations • 31 Dec 2020 • Mahsa Ghasemi, Evan Scope Crafts, Bo Zhao, Ufuk Topcu
In planning problems, it is often challenging to fully model the desired specifications.
no code implementations • 7 Dec 2020 • Rudrajit Das, Anish Acharya, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu
We propose FedGLOMO, a novel federated learning (FL) algorithm with an iteration complexity of $\mathcal{O}(\epsilon^{-1.5})$ to converge to an $\epsilon$-stationary point (i.e., $\mathbb{E}[\|\nabla f(\bm{x})\|^2] \leq \epsilon$) for smooth non-convex functions -- under arbitrary client heterogeneity and compressed communication -- compared to the $\mathcal{O}(\epsilon^{-2})$ complexity of most prior works.
no code implementations • 18 Nov 2020 • Alexander Nettekoven, Scott Fish, Joseph Beaman, Ufuk Topcu
The identified algorithms can be readily applied to the laser powder bed fusion machines to address each of the above limitations and thus, significantly improve process control.
no code implementations • 11 Nov 2020 • Franck Djeumou, Abraham P. Vinod, Eric Goubault, Sylvie Putot, Ufuk Topcu
Besides, DaTaControl achieves near-optimal control and is suitable for real-time control of such systems.
no code implementations • 27 Oct 2020 • Ufuk Topcu, Nadya Bliss, Nancy Cooke, Missy Cummings, Ashley Llorens, Howard Shrobe, Lenore Zuck
The second workshop, held in February 2020, focused on existing capabilities, current research, and research trends that could address the challenges and problems identified in the first workshop.
no code implementations • 27 Sep 2020 • Franck Djeumou, Abraham P. Vinod, Eric Goubault, Sylvie Putot, Ufuk Topcu
We investigate the problem of data-driven, on-the-fly control of systems with unknown nonlinear dynamics where data from only a single finite-horizon trajectory and possibly side information on the dynamics are available.
no code implementations • 24 Sep 2020 • Murat Cubuktepe, Nils Jansen, Sebastian Junges, Ahmadreza Marandi, Marnix Suilen, Ufuk Topcu
(3) We linearize this dual problem and (4) solve the resulting finite linear program to obtain locally optimal solutions to the original problem.
no code implementations • 10 Aug 2020 • Bo Wu, Niklas Lauffer, Mohamadreza Ahmadi, Suda Bharadwaj, Zhe Xu, Ufuk Topcu
The proposed framework relies on assigning a classification belief (a probability distribution) to the attributes of interest.
no code implementations • 1 Aug 2020 • Bo Wu, Steven Carr, Suda Bharadwaj, Zhe Xu, Ufuk Topcu
We study the problem of distributed hypothesis testing over a network of mobile agents with limited communication and sensing ranges to infer the true hypothesis collaboratively.
no code implementations • 31 Jul 2020 • Suda Bharadwaj, Abraham P. Vinod, Rayna Dimitrova, Ufuk Topcu
We consider the problem of optimal reactive synthesis: computing a strategy that satisfies a mission specification in a dynamic environment and optimizes a performance metric.
1 code implementation • 16 Jul 2020 • Yagiz Savas, Vijay Gupta, Ufuk Topcu
We model the agent's behavior as a Markov decision process, express its intrinsic motivation as a reward function, which belongs to a finite set of possible reward functions, and consider the incentives as additional rewards offered to the agent.
no code implementations • 3 Jul 2020 • Yuqian Jiang, Sudarshanan Bharadwaj, Bo Wu, Rishi Shah, Ufuk Topcu, Peter Stone
Reward shaping is a common approach for incorporating domain knowledge into reinforcement learning in order to speed up convergence to an optimal policy.
2 code implementations • 3 Jul 2020 • Cyrus Neary, Zhe Xu, Bo Wu, Ufuk Topcu
In cooperative multi-agent reinforcement learning, a collection of agents learns to interact in a shared environment to achieve a common goal.
no code implementations • 28 Jun 2020 • Zhe Xu, Bo Wu, Aditya Ojha, Daniel Neider, Ufuk Topcu
We compare our algorithm with the state-of-the-art RL algorithms for non-Markovian reward functions, such as Joint Inference of Reward machines and Policies for RL (JIRP), Learning Reward Machine (LRM), and Proximal Policy Optimization (PPO2).
no code implementations • 27 Mar 2020 • Erfaun Noorani, Yagiz Savas, Alec Koppel, John Baras, Ufuk Topcu, Brian M. Sadler
In particular, we formulate a discrete optimization problem to choose only a subset of agents to transmit the message signal so that the variance of the signal-to-noise ratio (SNR) received by the base station is minimized while the expected SNR exceeds a desired threshold.
no code implementations • 13 Feb 2020 • Steven Carr, Nils Jansen, Ufuk Topcu
Recurrent neural networks (RNNs) have emerged as an effective representation of control policies in sequential decision-making problems.
no code implementations • 27 Jan 2020 • Zhe Xu, Yuxin Chen, Ufuk Topcu
In the context of teaching temporal logic formulas, an exhaustive search even for a myopic solution takes exponential time (with respect to the time span of the task).
no code implementations • 29 Nov 2019 • Melkior Ornik, Ufuk Topcu
This paper proposes a formal approach to online learning and planning for agents operating in a priori unknown, time-varying environments.
no code implementations • 23 Oct 2019 • Dhananjay Raju, Suda Bharadwaj, Ufuk Topcu
In this approach, which is fundamentally decentralized, the shield on every agent has two components: a pathfinder that corrects the behavior of the agent and an ordering mechanism that dynamically modifies the priority of the agent.
no code implementations • 4 Oct 2019 • Mahsa Ghasemi, Ufuk Topcu
In applications in which the agent does not have prior knowledge about the available information sources, it is crucial to synthesize active perception strategies at runtime.
no code implementations • 27 Sep 2019 • Mahsa Ghasemi, Abolfazl Hashemi, Haris Vikalo, Ufuk Topcu
We formulate the task of representation learning as that of mapping the state space of the model to a low-dimensional state space, called the kernel space.
no code implementations • 12 Sep 2019 • Zhe Xu, Ivan Gavran, Yousef Ahmad, Rupak Majumdar, Daniel Neider, Ufuk Topcu, Bo Wu
The experiments show that learning high-level knowledge in the form of reward machines can lead to fast convergence to optimal policies in RL, while standard RL methods such as Q-learning and hierarchical RL methods fail to converge to optimal policies after a substantial number of training steps in many tasks.
no code implementations • 10 Sep 2019 • Zhe Xu, Ufuk Topcu
Transferring high-level knowledge from a source task to a target task is an effective way to expedite reinforcement learning (RL).
no code implementations • 26 Jul 2019 • Baihong Jin, Yingshui Tan, Alexander Nettekoven, Yuxin Chen, Ufuk Topcu, Yisong Yue, Alberto Sangiovanni Vincentelli
We show that the encoder-decoder model is able to identify the injected anomalies in a modern manufacturing process in an unsupervised fashion.
no code implementations • 15 May 2019 • Murat Cubuktepe, Nils Jansen, Mohammed Alshiekh, Ufuk Topcu
We design the autonomy protocol to ensure that the resulting robot behavior satisfies given safety and performance specifications in probabilistic temporal logic.
no code implementations • ICLR 2019 • Mahsa Ghasemi, Ufuk Topcu
However, in a variety of real-world scenarios the agent has an active role in its perception by selecting which observations to receive.
no code implementations • 25 Apr 2019 • Bo Wu, Murat Cubuktepe, Suda Bharadwaj, Ufuk Topcu
In this paper, we consider deceiving adversaries with bounded rationality and in terms of expected rewards.
no code implementations • 20 Mar 2019 • Steven Carr, Nils Jansen, Ralf Wimmer, Alexandru C. Serban, Bernd Becker, Ufuk Topcu
The particular problem is to determine strategies that provably adhere to (probabilistic) temporal logic constraints.
no code implementations • 6 Feb 2019 • Suda Bharadwaj, Rayna Dimitrova, Ufuk Topcu
We study the problem of synthesizing strategies for a mobile sensor network to conduct surveillance in partnership with static alarm triggers.
1 code implementation • 24 Jan 2019 • Min Wen, Osbert Bastani, Ufuk Topcu
It has recently been shown that if feedback effects of decisions are ignored, then imposing fairness constraints such as demographic parity or equality of opportunity can actually exacerbate unfairness.
no code implementations • NeurIPS 2018 • Min Wen, Ufuk Topcu
We show that the asymptotic behavior of the proposed algorithm can almost surely be described by that of an ordinary differential equation.
no code implementations • 28 Sep 2018 • Mohamadreza Ahmadi, Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
Then, the deception problem is to compute a strategy for the deceiver that minimizes the expected cost of deception against all strategies of the infiltrator.
no code implementations • 9 Jul 2018 • Yagiz Savas, Melkior Ornik, Murat Cubuktepe, Mustafa O. Karabag, Ufuk Topcu
Such a policy minimizes the predictability of the paths it generates, or dually, maximizes the exploration of different paths in an MDP while ensuring the satisfaction of a temporal logic specification.
no code implementations • 8 May 2018 • Melkior Ornik, Ufuk Topcu
In this paper, we consider an adversarial scenario where one agent seeks to achieve an objective and its adversary seeks to learn the agent's intentions and prevent the agent from achieving its objective.
no code implementations • 23 Mar 2018 • Lu Feng, Mahsa Ghasemi, Kai-Wei Chang, Ufuk Topcu
Automated techniques such as model checking have been used to verify models of robotic mission plans based on Markov decision processes (MDPs) and generate counterexamples that may help diagnose requirement violations.
no code implementations • 5 Mar 2018 • Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
This paper considers parametric Markov decision processes (pMDPs) whose transitions are equipped with affine functions over a finite set of parameters.
no code implementations • 28 Feb 2018 • Murat Cubuktepe, Ufuk Topcu
We develop a method for computing policies in Markov decision processes with risk-sensitive measures subject to temporal logic constraints.
no code implementations • 27 Feb 2018 • Steven Carr, Nils Jansen, Ralf Wimmer, Jie Fu, Ufuk Topcu
The efficient verification of this MC gives quantitative insights into the quality of the inferred human strategy by proving or disproving given system specifications.
no code implementations • 29 Sep 2017 • Krishnendu Chatterjee, Martin Chmelik, Ufuk Topcu
Partially observable Markov decision processes (POMDPs) are widely used in probabilistic planning problems in which an agent interacts with an environment using noisy and imprecise sensors.
no code implementations • 14 Sep 2017 • Melkior Ornik, Arie Israel, Ufuk Topcu
This paper focuses on developing a strategy for control of systems whose dynamics are almost entirely unknown.
1 code implementation • 29 Aug 2017 • Mohammed Alshiekh, Roderick Bloem, Ruediger Ehlers, Bettina Könighofer, Scott Niekum, Ufuk Topcu
In the first one, the shield acts each time the learning agent is about to make a decision and provides a list of safe actions.
no code implementations • 14 Aug 2017 • Leonore Winterer, Sebastian Junges, Ralf Wimmer, Nils Jansen, Ufuk Topcu, Joost-Pieter Katoen, Bernd Becker
We study synthesis problems with constraints in partially observable Markov decision processes (POMDPs), where the objective is to compute a strategy for an agent that is guaranteed to satisfy certain safety and performance specifications.
no code implementations • 14 Apr 2017 • Michael L. Littman, Ufuk Topcu, Jie Fu, Charles Isbell, Min Wen, James MacGlashan
We propose a new task-specification language for Markov decision processes that is designed to be an improvement over reward functions by being environment independent.
1 code implementation • 28 Oct 2016 • Sebastian Junges, Nils Jansen, Joost-Pieter Katoen, Ufuk Topcu
Probabilistic model checking is used to predict the human's behavior.
no code implementations • 26 Oct 2016 • Nils Jansen, Murat Cubuktepe, Ufuk Topcu
We formalize synthesis of shared control protocols with correctness guarantees for temporal logic specifications.
no code implementations • 7 Jan 2016 • Daniel Neider, Ufuk Topcu
We propose a method to construct finite-state reactive controllers for systems whose interactions with their adversarial environment are modeled by infinite-duration two-player games over (possibly) infinite graphs.
no code implementations • 5 Mar 2015 • Min Wen, Ruediger Ehlers, Ufuk Topcu
We establish both correctness (with respect to the temporal logic specifications) and optimality (with respect to the a priori unknown performance criterion) of this two-step technique for a fragment of temporal logic specifications.
no code implementations • 1 Oct 2014 • Jie Fu, Ufuk Topcu
We show that by alternating between the observation-based strategy and the active sensing strategy, under a mild technical assumption on the set of sensors in the system, the given temporal logic specification can be satisfied with probability 1.
no code implementations • 28 Apr 2014 • Jie Fu, Ufuk Topcu
We model the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities.