no code implementations • 1 Apr 2022 • Hengshuai Yao
In the second part of this paper, we propose a new class of methods for accelerating gradient descent that are distinct from existing techniques.
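The snippet does not spell out the proposed methods, but as background, classical (Polyak) momentum is the textbook technique for accelerating plain gradient descent. A minimal sketch on a toy quadratic (all names here are illustrative, not the paper's method):

```python
import numpy as np

def momentum_gd(grad, x0, lr=0.1, beta=0.9, steps=300):
    """Gradient descent with classical (Polyak) momentum."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v - lr * grad(x)  # accumulate a velocity across steps
        x = x + v                    # move along the velocity, not the raw gradient
    return x

# minimize f(x) = ||x||^2, whose gradient is 2x
x_star = momentum_gd(lambda x: 2 * x, x0=[3.0, -4.0])
```

The velocity term lets past gradients keep pushing the iterate, which damps oscillation across the valley and speeds progress along it.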
no code implementations • 21 Dec 2021 • Shahin Atakishiyev, Mohammad Salameh, Hengshuai Yao, Randy Goebel
First, we provide a thorough overview of the state-of-the-art studies on XAI for autonomous driving.
no code implementations • 20 Nov 2021 • Shahin Atakishiyev, Mohammad Salameh, Hengshuai Yao, Randy Goebel
There has been growing interest over the last few years in the development and deployment of autonomous vehicles on roads, encouraged by the empirical successes of powerful artificial intelligence (AI) techniques, especially deep learning and reinforcement learning.
no code implementations • 29 Sep 2021 • Ke Sun, Yi Liu, Yingnan Zhao, Hengshuai Yao, Shangling Jui, Linglong Kong
In real scenarios, state observations that an agent observes may contain measurement errors or adversarial noises, misleading the agent to take suboptimal actions or even collapse while training.
no code implementations • 17 Sep 2021 • Ke Sun, Yi Liu, Yingnan Zhao, Hengshuai Yao, Shangling Jui, Linglong Kong
In real scenarios, state observations that an agent observes may contain measurement errors or adversarial noises, misleading the agent to take suboptimal actions or even collapse while training.
1 code implementation • 21 Jan 2021 • Shangtong Zhang, Hengshuai Yao, Shimon Whiteson
The deadly triad refers to the instability of a reinforcement learning algorithm when it employs off-policy learning, function approximation, and bootstrapping simultaneously.
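The instability can be seen in a two-state toy example in the style of Tsitsiklis and Van Roy (a sketch for intuition, not the paper's construction): combining linear function approximation, bootstrapping, and an off-policy update distribution makes semi-gradient TD(0) diverge even with zero rewards.

```python
# Two states with features phi(s1) = 1, phi(s2) = 2, reward 0 everywhere,
# and a deterministic transition s1 -> s2. Updating only at s1 (an
# off-policy state distribution) with semi-gradient TD(0) and bootstrapping:
def run_td(w=1.0, alpha=0.1, gamma=0.99, steps=50):
    history = [w]
    for _ in range(steps):
        td_error = 0.0 + gamma * (2 * w) - 1 * w  # bootstrap from v(s2) = 2w
        w += alpha * td_error * 1.0               # semi-gradient step at phi(s1) = 1
        history.append(w)
    return history

weights = run_td()
# each step multiplies w by 1 + alpha * (2 * gamma - 1) > 1, so w grows without bound
```

Removing any one leg of the triad (on-policy updates, tabular values, or Monte Carlo targets) restores stability in this example.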
no code implementations • 1 Jan 2021 • Vincent Liu, Adam M White, Hengshuai Yao, Martha White
Catastrophic interference is common in many network-based learning systems, and many proposals exist for mitigating it.
1 code implementation • 28 Sep 2020 • Jincheng Mei, Yangchen Pan, Martha White, Amir-Massoud Farahmand, Hengshuai Yao
The prioritized Experience Replay (ER) method has attracted great attention; however, there is little theoretical understanding of such prioritization strategies and why they help.
no code implementations • 14 Sep 2020 • Daoming Lyu, Qi Qi, Mohammad Ghavamzadeh, Hengshuai Yao, Tianbao Yang, Bo Liu
To achieve variance-reduced off-policy-stable policy optimization, we propose an algorithm family that is memory-efficient, stochastically variance-reduced, and capable of learning from off-policy samples.
no code implementations • 19 Jul 2020 • Jincheng Mei, Yangchen Pan, Amir-Massoud Farahmand, Hengshuai Yao, Martha White
The prioritized Experience Replay (ER) method has attracted great attention; however, there is little theoretical understanding about why it can help and its limitations.
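For context on what is being analyzed: proportional prioritized replay samples stored transitions with probability proportional to a power of their TD error, so high-error transitions are replayed more often. A minimal sketch (illustrative names, not the paper's code):

```python
import random

class PrioritizedReplay:
    """Minimal proportional prioritized replay buffer."""
    def __init__(self, alpha=0.6):
        self.alpha = alpha          # how strongly to prioritize (0 = uniform)
        self.items, self.priorities = [], []

    def add(self, transition, td_error):
        self.items.append(transition)
        # small epsilon keeps zero-error transitions sampleable
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, k):
        # sample with probability proportional to stored priority
        return random.choices(self.items, weights=self.priorities, k=k)

buf = PrioritizedReplay()
buf.add("low-error", td_error=0.01)
buf.add("high-error", td_error=5.0)
batch = buf.sample(1000)
```

In the batch, the high-error transition dominates, which is exactly the behavior whose benefits and limits the paper studies.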
no code implementations • 7 Jul 2020 • Vincent Liu, Adam White, Hengshuai Yao, Martha White
In this work, we provide a definition of interference for control in reinforcement learning.
no code implementations • 26 Jan 2020 • Mennatullah Siam, Naren Doraiswamy, Boris N. Oreshkin, Hengshuai Yao, Martin Jagersand
Our results show that few-shot segmentation benefits from utilizing word embeddings, and that we are able to perform few-shot segmentation using stacked joint visual semantic processing with weak image-level labels.
no code implementations • 18 Dec 2019 • Mennatullah Siam, Naren Doraiswamy, Boris N. Oreshkin, Hengshuai Yao, Martin Jagersand
Conventional few-shot object segmentation methods learn object segmentation from a few labelled support images with strongly labelled segmentation masks.
1 code implementation • ICML 2020 • Shangtong Zhang, Bo Liu, Hengshuai Yao, Shimon Whiteson
With the help of the emphasis critic and the canonical value function critic, we show convergence for COF-PAC, where the critics are linear and the actor can be nonlinear.
no code implementations • 8 Nov 2019 • Jun Jin, Nhat M. Nguyen, Nazmus Sakib, Daniel Graves, Hengshuai Yao, Martin Jagersand
We observe that our method demonstrates time-efficient path planning behavior with high success rate in mapless navigation tasks.
no code implementations • 4 Oct 2019 • Abhishek Naik, Roshan Shariff, Niko Yasui, Hengshuai Yao, Richard S. Sutton
Discounted reinforcement learning is fundamentally incompatible with function approximation for control in continuing tasks.
no code implementations • 3 Oct 2019 • Khurram Javed, Hengshuai Yao, Martha White
Gradient-based meta-learning has proven to be highly effective at learning model initializations, representations, and update rules that allow fast adaptation from a few samples.
no code implementations • 25 Sep 2019 • Chao Gao, Martin Mueller, Ryan Hayward, Hengshuai Yao, Shangling Jui
A three-head network architecture has recently been proposed that can learn a third, action-value head on the same fixed dataset used for the two-head network.
no code implementations • 18 Jun 2019 • Yangchen Pan, Hengshuai Yao, Amir-Massoud Farahmand, Martha White
In this work, we propose to generate such states by using the trajectory obtained from hill climbing (HC) on the current estimate of the value function.
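Greedy hill climbing on a learned value estimate can be sketched as repeatedly moving to the best-valued random perturbation of the current state (a generic sketch under assumed continuous states, not the paper's exact procedure):

```python
import numpy as np

def hill_climb(value_fn, s0, step=0.1, n_steps=50, n_proposals=8, seed=0):
    """Climb a value estimate: move to the best-valued nearby perturbation."""
    rng = np.random.default_rng(seed)
    s = np.asarray(s0, dtype=float)
    trajectory = [s.copy()]
    for _ in range(n_steps):
        proposals = s + step * rng.standard_normal((n_proposals, s.size))
        best = proposals[np.argmax([value_fn(p) for p in proposals])]
        if value_fn(best) > value_fn(s):  # only accept improvements
            s = best
        trajectory.append(s.copy())
    return trajectory

# toy value function peaked at the origin
traj = hill_climb(lambda s: -np.sum(s ** 2), s0=[2.0, 2.0])
```

The resulting trajectory visits progressively higher-valued states, which is the kind of state sequence the method uses for planning.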
no code implementations • 13 May 2019 • Borislav Mavrin, Shangtong Zhang, Hengshuai Yao, Linglong Kong, Kaiwen Wu, Yao-Liang Yu
In distributional reinforcement learning (RL), the estimated distribution of the value function models both the parametric and intrinsic uncertainties.
no code implementations • 20 Mar 2019 • Nazmus Sakib, Hengshuai Yao, Hong Zhang, Shangling Jui
In this paper, we use reinforcement learning for safe driving in adversarial settings.
no code implementations • 18 Mar 2019 • Borislav Mavrin, Hengshuai Yao, Linglong Kong
Further experiments on the losing games show that our decorrelation algorithms can outperform DQN and QR-DQN with a fine-tuned regularization factor.
1 code implementation • 6 Nov 2018 • Shangtong Zhang, Hao Chen, Hengshuai Yao
In this paper, we propose an actor ensemble algorithm, named ACE, for continuous control with a deterministic policy in reinforcement learning.
4 code implementations • 5 Nov 2018 • Shangtong Zhang, Borislav Mavrin, Linglong Kong, Bo Liu, Hengshuai Yao
In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL).
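The core intuition, sketched: with a quantile critic, acting greedily with respect to different quantile regions yields behaviors ranging from pessimistic to optimistic, and QUOTA treats these as options for exploration. A toy illustration (the numbers are made up):

```python
import numpy as np

# estimated return quantiles per action (toy values)
quantiles = np.array([
    [1.0, 2.0, 3.0, 4.0],   # action 0: better mean, limited upside
    [-2.0, 0.0, 2.0, 6.0],  # action 1: worse mean, high upside
])

def greedy_action(quantiles, low, high):
    """Act greedily w.r.t. the mean over a window of quantile levels."""
    return int(np.argmax(quantiles[:, low:high].mean(axis=1)))

mean_greedy = greedy_action(quantiles, 0, 4)  # mean over all quantiles
optimistic = greedy_action(quantiles, 3, 4)   # top quantile only
```

The mean-greedy policy prefers action 0, while the upper-quantile policy prefers the high-upside action 1; switching between such quantile-conditioned policies drives exploration.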
no code implementations • 27 Apr 2018 • Donglai Zhu, Hengshuai Yao, Bei Jiang, Peng Yu
In deep neural networks, the cross-entropy loss function is commonly used for classification.
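For reference, the standard cross-entropy loss for a single example, computed with the log-sum-exp shift for numerical stability:

```python
import numpy as np

def cross_entropy(logits, label):
    """Cross-entropy loss for one example via a stable log-softmax."""
    z = logits - np.max(logits)                 # shift so the largest logit is 0
    log_probs = z - np.log(np.sum(np.exp(z)))   # log-softmax
    return -log_probs[label]

loss = cross_entropy(np.array([2.0, 1.0, 0.1]), label=0)
```

This equals the negative log of the softmax probability assigned to the true class.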
no code implementations • 8 Feb 2018 • Donglai Zhu, Hao Chen, Hengshuai Yao, Masoud Nosrati, Peyman Yadmellat, Yunfei Zhang
Our major finding is that action tiling encoding is the most important factor leading to the remarkable performance of the CDNA model.
no code implementations • NeurIPS 2014 • Hengshuai Yao, Csaba Szepesvari, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar
We prove that the UOM of an option can construct a traditional option model given a reward function, and the option-conditional return is computed directly by a single dot-product of the UOM with the reward function.
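The dot-product claim can be illustrated on a toy chain (a sketch under assumed dynamics, not the paper's construction): if a vector holds the expected discounted state occupancies accumulated while the option executes, then the option-conditional return for any reward function is a single dot-product.

```python
import numpy as np

gamma = 0.9
# deterministic chain s0 -> s1 -> terminate; the option runs for two steps,
# collecting reward at s0 and (discounted) at s1, never reaching s2
occupancy = np.array([1.0, gamma, 0.0])   # discounted visits to s0, s1, s2
rewards = np.array([0.5, 1.0, 2.0])       # an arbitrary reward function

option_return = occupancy @ rewards               # single dot-product
explicit_return = rewards[0] + gamma * rewards[1] # step-by-step rollout
```

Because the occupancy vector is reward-independent, the same model prices the option under any new reward function without re-simulation.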
no code implementations • NeurIPS 2009 • Hengshuai Yao, Shalabh Bhatnagar, Dongcui Diao, Richard S. Sutton, Csaba Szepesvári
We extend the Dyna planning architecture for policy evaluation and control in two significant aspects.