Search Results for author: Hongyao Tang

Found 25 papers, 11 papers with code

Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn

1 code implementation • 7 Sep 2024 • Hongyao Tang, Glen Berseth

Deep neural networks provide Reinforcement Learning (RL) with powerful function approximators to address large-scale decision-making problems.

Deep Reinforcement Learning, Reinforcement Learning (RL)

MFE-ETP: A Comprehensive Evaluation Benchmark for Multi-modal Foundation Models on Embodied Task Planning

1 code implementation • 6 Jul 2024 • Min Zhang, Xian Fu, Jianye Hao, Peilong Han, Hao Zhang, Lei Shi, Hongyao Tang, Yan Zheng

To this end, based on the characteristics of embodied task planning, we first develop a systematic evaluation framework, which encapsulates four crucial capabilities of MFMs: object understanding, spatio-temporal perception, task understanding, and embodied reasoning.

Embodied Question Answering, Question Answering, +1 more

Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey on Hybrid Algorithms

1 code implementation • 22 Jan 2024 • Pengyi Li, Jianye Hao, Hongyao Tang, Xian Fu, Yan Zheng, Ke Tang

Evolutionary Reinforcement Learning (ERL), which integrates Evolutionary Algorithms (EAs) and Reinforcement Learning (RL) for optimization, has demonstrated remarkable performance advancements.

Evolutionary Algorithms, reinforcement-learning, +2 more

The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting

no code implementations • 2 Mar 2023 • Hongyao Tang, Min Zhang, Jianye Hao

On typical MuJoCo and DeepMind Control Suite (DMC) benchmarks, we find common phenomena for TD3 and RAD agents: 1) the activity of policy network parameters is highly asymmetric, and policy networks advance monotonically along very few major parameter directions; 2) severe detours occur in parameter updates, and harmonic-like changes are observed for all minor parameter directions.

Reinforcement Learning (RL)

State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning

no code implementations • 28 Nov 2022 • Chen Chen, Hongyao Tang, Yi Ma, Chao Wang, Qianli Shen, Dong Li, Jianye Hao

The key idea of SA-PP is to leverage discounted stationary state distribution ratios between the learning policy and the offline dataset to modulate the degree of behavior regularization in a state-wise manner, so that pessimism can be implemented in a more appropriate way.

Offline RL, Q-Learning, +3 more

ERL-Re$^2$: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation

1 code implementation • 26 Oct 2022 • Jianye Hao, Pengyi Li, Hongyao Tang, Yan Zheng, Xian Fu, Zhaopeng Meng

The state representation conveys expressive common features of the environment learned by all the agents collectively; the linear policy representation provides a favorable space for efficient policy optimization, where novel behavior-level crossover and mutation operations can be performed.

continuous-control, Continuous Control, +4 more

Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes

no code implementations • 16 Sep 2022 • Min Zhang, Hongyao Tang, Jianye Hao, Yan Zheng

First, we propose a unified policy abstraction theory, containing three types of policy abstraction associated with policy features at different levels.

Decision Making, Metric Learning, +2 more

PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations

1 code implementation • 6 Apr 2022 • Tong Sang, Hongyao Tang, Yi Ma, Jianye Hao, Yan Zheng, Zhaopeng Meng, Boyan Li, Zhen Wang

In the online adaptation phase, with the environment context inferred from a few experiences collected in new environments, the policy is optimized by gradient ascent with respect to the PDVF.

Contrastive Learning, Decision Making, +1 more

PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration

1 code implementation • 16 Mar 2022 • Pengyi Li, Hongyao Tang, Tianpei Yang, Xiaotian Hao, Tong Sang, Yan Zheng, Jianye Hao, Matthew E. Taylor, Wenyuan Tao, Zhen Wang, Fazl Barez

However, we reveal that sub-optimal collaborative behaviors also emerge with strong correlations, and that simply maximizing the MI can, surprisingly, hinder learning towards better collaboration.

Multi-agent Reinforcement Learning, reinforcement-learning, +1 more

Uncertainty-aware Low-Rank Q-Matrix Estimation for Deep Reinforcement Learning

no code implementations • 19 Nov 2021 • Tong Sang, Hongyao Tang, Jianye Hao, Yan Zheng, Zhaopeng Meng

Such a reconstruction exploits the underlying structure of the value matrix to improve the value approximation, thus leading to a more efficient learning process for the value function.

continuous-control, Continuous Control, +3 more

Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain

no code implementations • 14 Sep 2021 • Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang

In addition to algorithmic analysis, we provide a comprehensive and unified empirical comparison of different exploration methods for DRL on a set of commonly used benchmarks.

Autonomous Vehicles, Deep Reinforcement Learning, +5 more

Addressing Action Oscillations through Learning Policy Inertia

no code implementations • 3 Mar 2021 • Chen Chen, Hongyao Tang, Jianye Hao, Wulong Liu, Zhaopeng Meng

We propose Nested Policy Iteration as a general training algorithm for PIC-augmented policies, which ensures monotonically non-decreasing updates under some mild conditions.

Atari Games, Autonomous Driving, +2 more

What About Inputing Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator

1 code implementation • NeurIPS 2021 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang

We study the Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends the conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation.

continuous-control, Continuous Control, +4 more

KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge

no code implementations • 18 Feb 2020 • Peng Zhang, Jianye Hao, Weixun Wang, Hongyao Tang, Yi Ma, Yihai Duan, Yan Zheng

Our framework consists of a fuzzy rule controller to represent human knowledge and a refinement module to fine-tune suboptimal prior knowledge.

Common Sense Reasoning, continuous-control, +4 more

MGHRL: Meta Goal-generation for Hierarchical Reinforcement Learning

no code implementations • 30 Sep 2019 • Haotian Fu, Hongyao Tang, Jianye Hao, Wulong Liu, Chen Chen

Most meta reinforcement learning (meta-RL) methods learn to adapt to new tasks by directly optimizing the parameters of policies over the primitive action space.

Hierarchical Reinforcement Learning, Meta-Learning, +4 more

Efficient meta reinforcement learning via meta goal generation

no code implementations • 25 Sep 2019 • Haotian Fu, Hongyao Tang, Jianye Hao

Meta reinforcement learning (meta-RL) is able to accelerate the acquisition of new tasks by learning from past experience.

Meta-Learning, Meta Reinforcement Learning, +3 more

Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces

1 code implementation • 12 Mar 2019 • Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan

Deep Reinforcement Learning (DRL) has been applied to address a variety of cooperative multi-agent problems with either discrete action spaces or continuous action spaces.

Deep Reinforcement Learning, Multi-agent Reinforcement Learning, +3 more

Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction

no code implementations • 25 Sep 2018 • Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang

In addition, we propose a new experience replay mechanism to alleviate the issues of sparse transitions at the high level of abstraction and the non-stationarity of multiagent learning.

reinforcement-learning, Reinforcement Learning, +1 more

An Optimal Rewiring Strategy for Reinforcement Social Learning in Cooperative Multiagent Systems

no code implementations • 13 May 2018 • Hongyao Tang, Li Wang, Zan Wang, Tim Baarslag, Jianye Hao

Multiagent coordination in cooperative multiagent systems (MASs) has been widely studied in both the fixed-agent repeated interaction setting and the static social learning framework.
