2 code implementations • 12 Apr 2024 • Haoran Qiu, Weichao Mao, Archit Patke, Shengkun Cui, Saurabh Jha, Chen Wang, Hubertus Franke, Zbigniew T. Kalbarczyk, Tamer Başar, Ravishankar K. Iyer
Large language models (LLMs) have been driving a new wave of interactive AI applications across numerous domains.
no code implementations • 3 Apr 2024 • Xiangyuan Zhang, Weichao Mao, Haoran Qiu, Tamer Başar
Closed-loop control of nonlinear dynamical systems with partial-state observability demands expert knowledge of a diverse, less standardized set of theoretical tools.
no code implementations • 2 Feb 2024 • Weichao Mao, Haoran Qiu, Chen Wang, Hubertus Franke, Zbigniew Kalbarczyk, Tamer Başar
No-regret learning has a long history of being closely connected to game theory.
1 code implementation • 30 Nov 2023 • Xiangyuan Zhang, Weichao Mao, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar
This project serves the learning for dynamics & control (L4DC) community by exploring key questions: the convergence of RL algorithms in learning control policies; the stability and robustness of learning-based controllers; and the scalability of RL algorithms to high- and potentially infinite-dimensional systems.
no code implementations • 11 Jan 2023 • Weichao Mao, Ruta Desai, Michael Louis Iuzzolino, Nitin Kamra
Given video demonstrations and paired narrations of an at-home procedural task such as changing a tire, we present an approach to extract the underlying task structure -- relevant actions and their temporal dependencies -- via action-centric task graphs.
no code implementations • 12 Oct 2021 • Weichao Mao, Lin F. Yang, Kaiqing Zhang, Tamer Başar
Multi-agent reinforcement learning (MARL) algorithms often suffer from an exponential sample complexity dependence on the number of agents, a phenomenon known as "the curse of multiagents".
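A standard back-of-the-envelope illustration of where this exponential dependence comes from (not drawn from the paper itself): with $m$ agents, each choosing from at most $A$ individual actions, the joint action space that a naive centralized learner must explore satisfies

```latex
% Joint action space of m agents, each with at most A individual actions:
\left| \mathcal{A}_1 \times \mathcal{A}_2 \times \cdots \times \mathcal{A}_m \right| \;\le\; A^m
```

so any method whose sample complexity scales with the joint action space pays a cost exponential in $m$.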
no code implementations • 12 Oct 2021 • Weichao Mao, Tamer Başar
We show that the agents can find an $\epsilon$-approximate CCE in at most $\widetilde{O}(H^6 S A/\epsilon^2)$ episodes, where $S$ is the number of states, $A$ is the size of the largest individual action space, and $H$ is the length of an episode.
no code implementations • 29 Sep 2021 • Weichao Mao, Tamer Başar, Lin F. Yang, Kaiqing Zhang
Many real-world applications of multi-agent reinforcement learning (RL), such as multi-robot navigation and decentralized control of cyber-physical systems, involve the cooperation of agents as a team with aligned objectives.
no code implementations • 7 Oct 2020 • Weichao Mao, Kaiqing Zhang, Ruihao Zhu, David Simchi-Levi, Tamer Başar
We consider model-free reinforcement learning (RL) in non-stationary Markov decision processes.
no code implementations • 28 Sep 2020 • Weichao Mao, Kaiqing Zhang, Ruihao Zhu, David Simchi-Levi, Tamer Başar
We consider model-free reinforcement learning (RL) in non-stationary Markov decision processes (MDPs).
no code implementations • NeurIPS 2020 • Weichao Mao, Kaiqing Zhang, Qiaomin Xie, Tamer Başar
Monte-Carlo planning, as exemplified by Monte-Carlo Tree Search (MCTS), has demonstrated remarkable performance in applications with finite spaces.
1 code implementation • 2 Apr 2020 • Weichao Mao, Kaiqing Zhang, Erik Miehling, Tamer Başar
To enable the development of tractable algorithms, we introduce the concept of an information state embedding that serves to compress agents' histories.