Search Results for author: Zhang-Wei Hong

Found 26 papers, 10 papers with code

Curiosity-driven Red-teaming for Large Language Models

1 code implementation • 29 Feb 2024 • Zhang-Wei Hong, Idan Shenfeld, Tsun-Hsuan Wang, Yung-Sung Chuang, Aldo Pareja, James Glass, Akash Srivastava, Pulkit Agrawal

To probe when an LLM generates unwanted content, the current paradigm is to recruit a red team of human testers to design input prompts (i.e., test cases) that elicit undesirable responses from LLMs.

Reinforcement Learning (RL)

Neuro-Inspired Fragmentation and Recall to Overcome Catastrophic Forgetting in Curiosity

1 code implementation • 26 Oct 2023 • Jaedong Hwang, Zhang-Wei Hong, Eric Chen, Akhilan Boopathy, Pulkit Agrawal, Ila Fiete

Deep reinforcement learning methods exhibit impressive performance on a range of tasks but still struggle on hard exploration tasks in large environments with sparse rewards.

Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation

no code implementations • 24 Jul 2023 • Zechu Li, Tao Chen, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal

This paper presents a Parallel $Q$-Learning (PQL) scheme that outperforms PPO in wall-clock time while maintaining the superior sample efficiency of off-policy learning.

Q-Learning, reinforcement-learning

Grid Cell-Inspired Fragmentation and Recall for Efficient Map Building

1 code implementation • 11 Jul 2023 • Jaedong Hwang, Zhang-Wei Hong, Eric Chen, Akhilan Boopathy, Pulkit Agrawal, Ila Fiete

Agents build and use a local map to predict their observations; high surprisal leads to a "fragmentation event" that truncates the local map.

Clustering, Navigate

TGRL: An Algorithm for Teacher Guided Reinforcement Learning

no code implementations • 6 Jul 2023 • Idan Shenfeld, Zhang-Wei Hong, Aviv Tamar, Pulkit Agrawal

To combine the benefits of these different forms of learning, it is common to train a policy to maximize a combination of reinforcement and teacher-student learning objectives.

counterfactual, Decision Making

Redeeming Intrinsic Rewards via Constrained Optimization

1 code implementation • 14 Nov 2022 • Eric Chen, Zhang-Wei Hong, Joni Pajarinen, Pulkit Agrawal

However, on easy exploration tasks, the agent gets distracted by intrinsic rewards and performs unnecessary exploration even when sufficient task (also called extrinsic) reward is available.

Montezuma's Revenge, Reinforcement Learning (RL)

Model Predictive Control via On-Policy Imitation Learning

no code implementations • 17 Oct 2022 • Kwangjun Ahn, Zakaria Mhammedi, Horia Mania, Zhang-Wei Hong, Ali Jadbabaie

Recent approaches to data-driven MPC have used the simplest form of imitation learning known as behavior cloning to learn controllers that mimic the performance of MPC by online sampling of the trajectories of the closed-loop MPC system.

Imitation Learning, Model Predictive Control

Bilinear value networks

1 code implementation • 28 Apr 2022 • Zhang-Wei Hong, Ge Yang, Pulkit Agrawal

The dominant framework for off-policy multi-goal reinforcement learning involves estimating a goal-conditioned Q-value function.

Multi-Goal Reinforcement Learning

Topological Experience Replay

1 code implementation • ICLR 2022 • Zhang-Wei Hong, Tao Chen, Yen-Chen Lin, Joni Pajarinen, Pulkit Agrawal

State-of-the-art deep Q-learning methods update Q-values using state transition tuples sampled from the experience replay buffer.


Stubborn: A Strong Baseline for Indoor Object Navigation

no code implementations • 14 Mar 2022 • Haokuan Luo, Albert Yue, Zhang-Wei Hong, Pulkit Agrawal

We present a strong baseline that surpasses the performance of previously published methods on the Habitat Challenge task of navigating to a target object in indoor environments.


Toward Synergism in Macro Action Ensembles

1 code implementation • 1 Jan 2021 • Yu Ming Chen, Kuan-Yu Chang, Chien Liu, Tsu-Ching Hsiao, Zhang-Wei Hong, Chun-Yi Lee

Macro actions have been demonstrated to be beneficial for the learning processes of an agent.

Neural Architecture Search

Mixture of Step Returns in Bootstrapped DQN

no code implementations • 16 Jul 2020 • Po-Han Chiang, Hsuan-Kung Yang, Zhang-Wei Hong, Chun-Yi Lee

Nevertheless, integrating step returns into a single target sacrifices the diversity of the advantages offered by different step return targets.

Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning

1 code implementation • 1 Feb 2020 • Zhang-Wei Hong, Prabhat Nagarajan, Guilherme Maeda

PIEKD is a learning framework that uses an ensemble of policies to act in the environment while periodically sharing knowledge amongst policies in the ensemble through knowledge distillation.

Knowledge Distillation, reinforcement-learning

Model-based Lookahead Reinforcement Learning

no code implementations • 15 Aug 2019 • Zhang-Wei Hong, Joni Pajarinen, Jan Peters

Model-based Reinforcement Learning (MBRL) allows data-efficient learning, which is required in real-world applications such as robotics.

Continuous Control, Model-based Reinforcement Learning

A Self-Supervised Method for Mapping Human Instructions to Robot Policies

no code implementations • ICLR 2019 • Hsin-Wei Yu, Po-Yu Wu, Chih-An Tsao, You-An Shen, Shih-Hsuan Lin, Zhang-Wei Hong, Yi-Hsiang Chang, Chun-Yi Lee

In this paper, we propose a modular approach that separates the instruction-to-action mapping procedure into two stages.

Adversarial Active Exploration for Inverse Dynamics Model Learning

no code implementations • ICLR 2019 • Zhang-Wei Hong, Tsu-Jui Fu, Tzu-Yun Shann, Yi-Hsiang Chang, Chun-Yi Lee

Our framework consists of a deep reinforcement learning (DRL) agent and an inverse dynamics model contesting with each other.

Imitation Learning

Diversity-Driven Exploration Strategy for Deep Reinforcement Learning

no code implementations • NeurIPS 2018 • Zhang-Wei Hong, Tzu-Yun Shann, Shih-Yang Su, Yi-Hsiang Chang, Chun-Yi Lee

Efficient exploration remains a challenging research problem in reinforcement learning, especially when an environment contains large state spaces, deceptive local optima, or sparse rewards.

Efficient Exploration, reinforcement-learning

Virtual-to-Real: Learning to Control in Visual Semantic Segmentation

no code implementations • 1 Feb 2018 • Zhang-Wei Hong, Chen Yu-Ming, Shih-Yang Su, Tzu-Yun Shann, Yi-Hsiang Chang, Hsuan-Kung Yang, Brian Hsi-Lin Ho, Chih-Chieh Tu, Yueh-Chuan Chang, Tsu-Ching Hsiao, Hsin-Wei Hsiao, Sih-Pin Lai, Chun-Yi Lee

Collecting training data from the physical world is usually time-consuming and even dangerous for fragile robots, and thus, recent advances in robot learning advocate the use of simulators as the training platform.

Image Segmentation, Segmentation

A Deep Policy Inference Q-Network for Multi-Agent Systems

no code implementations • 21 Dec 2017 • Zhang-Wei Hong, Shih-Yang Su, Tzu-Yun Shann, Yi-Hsiang Chang, Chun-Yi Lee

DPIQN incorporates the learned policy features as a hidden vector into its own deep Q-network (DQN), so that it can predict better Q-values for the controllable agents than state-of-the-art deep reinforcement learning models.

Tactics of Adversarial Attack on Deep Reinforcement Learning Agents

no code implementations • 8 Mar 2017 • Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, Min Sun

In the strategically-timed attack, the adversary aims to minimize the agent's reward by attacking the agent at only a small subset of time steps in an episode.

Adversarial Attack, Atari Games