2 code implementations • 12 Apr 2024 • Haoran Qiu, Weichao Mao, Archit Patke, Shengkun Cui, Saurabh Jha, Chen Wang, Hubertus Franke, Zbigniew T. Kalbarczyk, Tamer Başar, Ravishankar K. Iyer
Large language models (LLMs) have been driving a new wave of interactive AI applications across numerous domains.
no code implementations • 3 Apr 2024 • Xiangyuan Zhang, Weichao Mao, Haoran Qiu, Tamer Başar
Closed-loop control of nonlinear dynamical systems with partial-state observability demands expert knowledge of a diverse, less standardized set of theoretical tools.
no code implementations • 2 Feb 2024 • Weichao Mao, Haoran Qiu, Chen Wang, Hubertus Franke, Zbigniew Kalbarczyk, Tamer Başar
No-regret learning has a long history of being closely connected to game theory.
1 code implementation • 30 Nov 2023 • Xiangyuan Zhang, Weichao Mao, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar
This project serves the learning for dynamics & control (L4DC) community by exploring key questions: the convergence of RL algorithms in learning control policies; the stability and robustness of learning-based controllers; and the scalability of RL algorithms to high- and potentially infinite-dimensional systems.
no code implementations • 11 Jan 2023 • Weichao Mao, Ruta Desai, Michael Louis Iuzzolino, Nitin Kamra
Given video demonstrations and paired narrations of an at-home procedural task such as changing a tire, we present an approach to extract the underlying task structure -- relevant actions and their temporal dependencies -- via action-centric task graphs.
no code implementations • 12 Oct 2021 • Weichao Mao, Lin F. Yang, Kaiqing Zhang, Tamer Başar
Multi-agent reinforcement learning (MARL) algorithms often suffer from an exponential sample complexity dependence on the number of agents, a phenomenon known as "the curse of multiagents".
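A standard back-of-the-envelope illustration of where this exponential dependence comes from (not drawn from the paper itself): with $m$ agents, each choosing from at most $A$ individual actions, the joint action space that a naive centralized learner must explore satisfies

```latex
% Joint action space of m agents, each with at most A individual actions:
\left| \mathcal{A}_1 \times \mathcal{A}_2 \times \cdots \times \mathcal{A}_m \right| \;\le\; A^m
```

so any method whose sample complexity scales with the joint action space pays a cost exponential in $m$.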
no code implementations • 12 Oct 2021 • Weichao Mao, Tamer Başar
We show that the agents can find an $\epsilon$-approximate CCE in at most $\widetilde{O}(H^6 S A/\epsilon^2)$ episodes, where $S$ is the number of states, $A$ is the size of the largest individual action space, and $H$ is the length of an episode.
no code implementations • 29 Sep 2021 • Weichao Mao, Tamer Başar, Lin F. Yang, Kaiqing Zhang
Many real-world applications of multi-agent reinforcement learning (RL), such as multi-robot navigation and decentralized control of cyber-physical systems, involve the cooperation of agents as a team with aligned objectives.
no code implementations • 7 Oct 2020 • Weichao Mao, Kaiqing Zhang, Ruihao Zhu, David Simchi-Levi, Tamer Başar
We consider model-free reinforcement learning (RL) in non-stationary Markov decision processes.
no code implementations • 28 Sep 2020 • Weichao Mao, Kaiqing Zhang, Ruihao Zhu, David Simchi-Levi, Tamer Başar
We consider model-free reinforcement learning (RL) in non-stationary Markov decision processes (MDPs).
no code implementations • NeurIPS 2020 • Weichao Mao, Kaiqing Zhang, Qiaomin Xie, Tamer Başar
Monte-Carlo planning, as exemplified by Monte-Carlo Tree Search (MCTS), has demonstrated remarkable performance in applications with finite spaces.
1 code implementation • 2 Apr 2020 • Weichao Mao, Kaiqing Zhang, Erik Miehling, Tamer Başar
To enable the development of tractable algorithms, we introduce the concept of an information state embedding that serves to compress agents' histories.