Search Results for author: Yuhao Ding

Found 9 papers, 0 papers with code

Scalable Multi-Agent Reinforcement Learning with General Utilities

no code implementations15 Feb 2023 Donghao Ying, Yuhao Ding, Alec Koppel, Javad Lavaei

The objective is to find a localized policy that maximizes the average of the team's local utility functions without requiring full observability of every agent in the team.

Multi-agent Reinforcement Learning reinforcement-learning +1
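The objective described in the snippet above can be sketched schematically (the notation here is assumed, not taken from the paper): with $n$ agents, local state-action occupancy measures $\lambda_i^{\pi}$, and local utility functions $f_i$, the team solves

```latex
\max_{\pi \in \Pi_{\text{local}}} \; \frac{1}{n} \sum_{i=1}^{n} f_i\!\left(\lambda_i^{\pi}\right),
```

where $\Pi_{\text{local}}$ denotes the class of localized policies in which each agent acts only on its own partial observation. "General utilities" refers to the $f_i$ being arbitrary (possibly nonlinear) functions of the occupancy measure rather than linear expected rewards.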

Non-stationary Risk-sensitive Reinforcement Learning: Near-optimal Dynamic Regret, Adaptive Detection, and Separation Design

no code implementations19 Nov 2022 Yuhao Ding, Ming Jin, Javad Lavaei

We study risk-sensitive reinforcement learning (RL) based on an entropic risk measure in episodic non-stationary Markov decision processes (MDPs).

Reinforcement Learning (RL)
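For context, the entropic risk measure underlying this line of work has a standard form: for risk parameter $\beta \neq 0$ and cumulative reward $R$, the risk-sensitive value is

```latex
V_\beta = \frac{1}{\beta} \log \mathbb{E}\!\left[ e^{\beta R} \right],
```

which recovers the risk-neutral objective $\mathbb{E}[R]$ in the limit $\beta \to 0$ and is risk-seeking for $\beta > 0$ and risk-averse for $\beta < 0$. (This is the textbook definition; the paper's exact episodic setup may differ in details.)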

Policy-based Primal-Dual Methods for Convex Constrained Markov Decision Processes

no code implementations22 May 2022 Donghao Ying, Mengzi Amy Guo, Yuhao Ding, Javad Lavaei, Zuo-Jun Max Shen

We study convex Constrained Markov Decision Processes (CMDPs) in which the objective is concave and the constraints are convex in the state-action occupancy measure.

Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints

no code implementations28 Jan 2022 Yuhao Ding, Javad Lavaei

We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision processes (CMDPs) with non-stationary objectives and constraints, which plays a central role in ensuring the safety of RL in time-varying environments.


Reinforcement Learning (RL) Safe Exploration
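A primal-dual scheme of the kind described above typically operates on the Lagrangian of the episodic CMDP; a sketch under assumed notation, with a single constraint for simplicity:

```latex
\mathcal{L}(\pi, \mu) = V_r(\pi) + \mu \left( V_g(\pi) - b \right),
```

where $V_r$ is the reward value, $V_g$ the utility value, $b$ the constraint threshold, and $\mu \ge 0$ the dual variable. Primal updates ascend in $\pi$ while dual updates descend in $\mu$; non-stationarity means $V_r$, $V_g$, and possibly $b$ drift over episodes, which is what the dynamic-regret analysis must handle.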

On the Global Optimum Convergence of Momentum-based Policy Gradient

no code implementations19 Oct 2021 Yuhao Ding, Junzi Zhang, Javad Lavaei

For generic Fisher-non-degenerate policy parametrizations, our result is the first single-loop and finite-batch PG algorithm achieving $\tilde{O}(\epsilon^{-3})$ global optimality sample complexity.
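Momentum-based policy gradient methods of this kind usually employ a recursive (STORM-style) variance-reduced estimator; one common form, not necessarily the paper's exact update, is

```latex
d_t = \widehat{\nabla} J(\theta_t) + (1 - \eta_t)\bigl(d_{t-1} - \widehat{\nabla} J(\theta_{t-1})\bigr),
\qquad \theta_{t+1} = \theta_t + \alpha_t \, d_t,
```

where both gradient estimates at step $t$ are computed from the same trajectory batch, $\eta_t \in (0, 1]$ is the momentum parameter, and the update ascends since $J$ is being maximized. The single-loop, finite-batch property means no restarts or growing batch sizes are needed.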

A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization

no code implementations17 Oct 2021 Donghao Ying, Yuhao Ding, Javad Lavaei

We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization, in which an agent aims to maximize the entropy-regularized value function while satisfying constraints on the expected total utility.

Ontology-Enhanced Slot Filling

no code implementations25 Aug 2021 Yuhao Ding, Yik-Cheung Tam

In a multi-domain task-oriented dialog system, user utterances and system responses may mention multiple named entities and attribute values.

dialog state tracking slot-filling +1
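To make the slot-filling setting concrete, here is a purely illustrative sketch (not the paper's method): a dialog state maps (domain, slot) pairs to values mentioned in an utterance, and an ontology restricts each slot to its valid values. The utterance, ontology contents, and `fill_slots` helper below are all hypothetical.

```python
# Illustrative only: a naive keyword matcher that assigns a slot when one
# of its ontology values appears verbatim in the utterance. Real systems
# (including the paper's) use learned models, not string matching.
utterance = "I need a cheap hotel near the Chinese restaurant downtown."

# Hypothetical ontology: each (domain, slot) pair lists its valid values.
ontology = {
    ("hotel", "pricerange"): {"cheap", "moderate", "expensive"},
    ("restaurant", "food"): {"chinese", "italian", "indian"},
}

def fill_slots(text, ontology):
    """Return the dialog state: slots whose ontology values occur in text."""
    text = text.lower()
    state = {}
    for slot, values in ontology.items():
        for v in values:
            if v in text:
                state[slot] = v
    return state

state = fill_slots(utterance, ontology)
```

The point of ontology enhancement is that the set of legal values constrains prediction; multi-domain utterances like the one above touch several domains at once, so the state spans multiple (domain, slot) pairs.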
