Search Results for author: Yuhao Ding

Found 9 papers, 0 papers with code

Scalable Multi-Agent Reinforcement Learning with General Utilities

no code implementations15 Feb 2023 Donghao Ying, Yuhao Ding, Alec Koppel, Javad Lavaei

The objective is to find a localized policy that maximizes the average of the team's local utility functions without requiring full observability of every agent in the team.

Multi-agent Reinforcement Learning reinforcement-learning +1
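The objective described in the snippet above can be sketched schematically (the notation here is assumed, not taken from the paper): with $n$ agents, local state-action occupancy measures $\lambda_i^{\pi}$, and local utility functions $f_i$, the team solves

```latex
\max_{\pi \in \Pi_{\text{local}}} \; \frac{1}{n} \sum_{i=1}^{n} f_i\!\left(\lambda_i^{\pi}\right),
```

where $\Pi_{\text{local}}$ denotes the class of localized policies in which each agent acts only on its own partial observation. "General utilities" refers to the $f_i$ being arbitrary (possibly nonlinear) functions of the occupancy measure rather than linear expected rewards.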

Non-stationary Risk-sensitive Reinforcement Learning: Near-optimal Dynamic Regret, Adaptive Detection, and Separation Design

no code implementations19 Nov 2022 Yuhao Ding, Ming Jin, Javad Lavaei

We study risk-sensitive reinforcement learning (RL) based on an entropic risk measure in episodic non-stationary Markov decision processes (MDPs).

Reinforcement Learning (RL)
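For context, the entropic risk measure underlying this line of work has a standard form: for risk parameter $\beta \neq 0$ and cumulative reward $R$, the risk-sensitive value is

```latex
V_\beta = \frac{1}{\beta} \log \mathbb{E}\!\left[ e^{\beta R} \right],
```

which recovers the risk-neutral objective $\mathbb{E}[R]$ in the limit $\beta \to 0$ and is risk-seeking for $\beta > 0$ and risk-averse for $\beta < 0$. (This is the textbook definition; the paper's exact episodic setup may differ in details.)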

Policy-based Primal-Dual Methods for Convex Constrained Markov Decision Processes

no code implementations22 May 2022 Donghao Ying, Mengzi Amy Guo, Yuhao Ding, Javad Lavaei, Zuo-Jun Max Shen

We study convex Constrained Markov Decision Processes (CMDPs) in which the objective is concave and the constraints are convex in the state-action occupancy measure.

Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints

no code implementations28 Jan 2022 Yuhao Ding, Javad Lavaei

We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision processes (CMDPs) with non-stationary objectives and constraints, which plays a central role in ensuring the safety of RL in time-varying environments.


Reinforcement Learning (RL) Safe Exploration
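A primal-dual scheme of the kind described above typically operates on the Lagrangian of the episodic CMDP; a sketch under assumed notation, with a single constraint for simplicity:

```latex
\mathcal{L}(\pi, \mu) = V_r(\pi) + \mu \left( V_g(\pi) - b \right),
```

where $V_r$ is the reward value, $V_g$ the utility value, $b$ the constraint threshold, and $\mu \ge 0$ the dual variable. Primal updates ascend in $\pi$ while dual updates descend in $\mu$; non-stationarity means $V_r$, $V_g$, and possibly $b$ drift over episodes, which is what the dynamic-regret analysis must handle.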

On the Global Optimum Convergence of Momentum-based Policy Gradient

no code implementations19 Oct 2021 Yuhao Ding, Junzi Zhang, Javad Lavaei

For generic Fisher-non-degenerate policy parametrizations, our result is the first single-loop and finite-batch PG algorithm achieving $\tilde{O}(\epsilon^{-3})$ global optimality sample complexity.
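Momentum-based policy gradient methods of this kind usually employ a recursive (STORM-style) variance-reduced estimator; one common form, not necessarily the paper's exact update, is

```latex
d_t = \widehat{\nabla} J(\theta_t) + (1 - \eta_t)\bigl(d_{t-1} - \widehat{\nabla} J(\theta_{t-1})\bigr),
\qquad \theta_{t+1} = \theta_t + \alpha_t \, d_t,
```

where both gradient estimates at step $t$ are computed from the same trajectory batch, $\eta_t \in (0, 1]$ is the momentum parameter, and the update ascends since $J$ is being maximized. The single-loop, finite-batch property means no restarts or growing batch sizes are needed.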

A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization

no code implementations17 Oct 2021 Donghao Ying, Yuhao Ding, Javad Lavaei

We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization, in which an agent aims to maximize the entropy-regularized value function while satisfying constraints on the expected total utility.

Ontology-Enhanced Slot Filling

no code implementations25 Aug 2021 Yuhao Ding, Yik-Cheung Tam

In a multi-domain task-oriented dialog system, user utterances and system responses may mention multiple named entities and attribute values.

dialog state tracking slot-filling +1
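To make the slot-filling setting concrete, here is a purely illustrative sketch (not the paper's method): a dialog state maps (domain, slot) pairs to values mentioned in an utterance, and an ontology restricts each slot to its valid values. The utterance, ontology contents, and `fill_slots` helper below are all hypothetical.

```python
# Illustrative only: a naive keyword matcher that assigns a slot when one
# of its ontology values appears verbatim in the utterance. Real systems
# (including the paper's) use learned models, not string matching.
utterance = "I need a cheap hotel near the Chinese restaurant downtown."

# Hypothetical ontology: each (domain, slot) pair lists its valid values.
ontology = {
    ("hotel", "pricerange"): {"cheap", "moderate", "expensive"},
    ("restaurant", "food"): {"chinese", "italian", "indian"},
}

def fill_slots(text, ontology):
    """Return the dialog state: slots whose ontology values occur in text."""
    text = text.lower()
    state = {}
    for slot, values in ontology.items():
        for v in values:
            if v in text:
                state[slot] = v
    return state

state = fill_slots(utterance, ontology)
```

The point of ontology enhancement is that the set of legal values constrains prediction; multi-domain utterances like the one above touch several domains at once, so the state spans multiple (domain, slot) pairs.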
