Search Results for author: Siliang Zeng

Found 7 papers, 2 papers with code

When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning

1 code implementation • NeurIPS 2023 • Siliang Zeng, Chenliang Li, Alfredo Garcia, Mingyi Hong

Offline inverse reinforcement learning (Offline IRL) aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
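As a hedged sketch (not taken from the paper itself), offline IRL of this kind is commonly posed as a maximum-likelihood bilevel problem: fit a reward parameter $\theta$ so that the entropy-regularized optimal policy under an estimated world model best explains the demonstrations:

```latex
\max_{\theta} \; L(\theta) \;=\; \mathbb{E}_{\tau \sim \mathcal{D}} \Big[ \textstyle\sum_{t} \log \pi_{\theta}(a_t \mid s_t) \Big],
\quad \text{s.t.} \quad
\pi_{\theta} \;=\; \arg\max_{\pi} \; \mathbb{E}_{\pi, \widehat{P}} \Big[ \textstyle\sum_{t} \gamma^{t} \big( r_{\theta}(s_t, a_t) + \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \big) \Big],
```

where $\mathcal{D}$ is the fixed demonstration set, $\widehat{P}$ the dynamics model estimated from the data (the "generative world model"), and $\mathcal{H}$ an entropy regularizer; the exact objective used in the paper may differ.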

Autonomous Driving • Continuous Control • +2

Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees

no code implementations • 4 Oct 2022 • Siliang Zeng, Mingyi Hong, Alfredo Garcia

Other approaches in the inverse reinforcement learning (IRL) literature emphasize policy estimation at the expense of reward estimation accuracy.

Imitation Learning

Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees

no code implementations • 4 Oct 2022 • Siliang Zeng, Chenliang Li, Alfredo Garcia, Mingyi Hong

To reduce the computational burden of a nested loop, novel methods such as SQIL [1] and IQ-Learn [2] emphasize policy estimation at the expense of reward estimation accuracy.

Counterfactual • Imitation Learning • +2

Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees

no code implementations • 11 Oct 2021 • Siliang Zeng, Tianyi Chen, Alfredo Garcia, Mingyi Hong

The flexibility in our design allows the proposed MARL-CAC algorithm to be used in a {\it fully decentralized} setting, where the agents can only communicate with their neighbors, as well as a {\it federated} setting, where the agents occasionally communicate with a server while optimizing their (partially personalized) local models.
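The two communication patterns contrasted above can be illustrated with a minimal sketch (hypothetical code, not the MARL-CAC algorithm itself): in the decentralized setting each agent gossip-averages with its neighbors through a mixing matrix, while in the federated setting a server averages all agents' parameters and broadcasts the mean.

```python
# Hypothetical illustration of the two communication patterns; the actual
# MARL-CAC update also involves local actor-critic steps not shown here.

def decentralized_round(params, weights):
    """Each agent mixes its parameter with its neighbors' (gossip averaging)."""
    n = len(params)
    return [sum(weights[i][j] * params[j] for j in range(n)) for i in range(n)]

def federated_round(params):
    """A server collects all agents' parameters and broadcasts their mean."""
    avg = sum(params) / len(params)
    return [avg] * len(params)

# Three agents with a doubly-stochastic mixing matrix (each row sums to 1).
W = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]
params = [0.0, 3.0, 6.0]
for _ in range(50):
    params = decentralized_round(params, W)
# Repeated mixing drives every agent toward the global average, 3.0.
```

Gossip averaging reaches the same consensus value as a single federated round, only gradually, which is the trade-off between the two settings.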

Multi-agent Reinforcement Learning

On the Divergence of Decentralized Non-Convex Optimization

no code implementations • 20 Jun 2020 • Mingyi Hong, Siliang Zeng, Junyu Zhang, Haoran Sun

However, by constructing some counter-examples, we show that when certain local Lipschitz conditions (LLC) on the local function gradient $\nabla f_i$'s are not satisfied, most of the existing decentralized algorithms diverge, even if the global Lipschitz condition (GLC) is satisfied, where the sum function $f$ has Lipschitz gradient.
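A one-line illustration of the gap between the two conditions (not the paper's actual counter-example): with two local functions

```latex
f_1(x) = x^3, \qquad f_2(x) = -x^3, \qquad f(x) = f_1(x) + f_2(x) \equiv 0,
```

the sum $f$ trivially satisfies the GLC since $\nabla f \equiv 0$, yet $\nabla f_1(x) = 3x^2$ is not globally Lipschitz on $\mathbb{R}$, so the LLC on the individual $f_i$'s fails.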

Open-Ended Question Answering
