Search Results for author: Zhongwen Xu

Found 33 papers, 9 papers with code

Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform

1 code implementation 29 Sep 2023 Shengyi Huang, Jiayi Weng, Rujikorn Charakorn, Min Lin, Zhongwen Xu, Santiago Ontañón

Distributed Deep Reinforcement Learning (DRL) aims to leverage more computational resources to train autonomous agents with less training time.

reinforcement-learning

Learning to Optimize for Reinforcement Learning

1 code implementation 3 Feb 2023 Qingfeng Lan, A. Rupam Mahmood, Shuicheng Yan, Zhongwen Xu

Reinforcement learning (RL) is essentially different from supervised learning and in practice these learned optimizers do not work well even in simple RL tasks.

Inductive Bias Meta-Learning +2

Visual Imitation Learning with Patch Rewards

1 code implementation 2 Feb 2023 Minghuan Liu, Tairan He, Weinan Zhang, Shuicheng Yan, Zhongwen Xu

Specifically, we present Adversarial Imitation Learning with Patch Rewards (PatchAIL), which employs a patch-based discriminator to measure the expertise of different local parts from given images and provide patch rewards.

Imitation Learning

Reinforcement Learning from Diverse Human Preferences

no code implementations 27 Jan 2023 Wanqi Xue, Bo An, Shuicheng Yan, Zhongwen Xu

The complexity of designing reward functions has been a major obstacle to the wide application of deep reinforcement learning (RL) techniques.

reinforcement-learning Reinforcement Learning (RL)

Imitation Learning As State Matching via Differentiable Physics

no code implementations CVPR 2023 Siwei Chen, Xiao Ma, Zhongwen Xu

With the physics prior, ILD policies can not only be transferable to unseen environment specifications but also yield higher final performance on a variety of tasks.

Continuous Control Deformable Object Manipulation +1

RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning

no code implementations 18 Oct 2022 Wei Qiu, Xiao Ma, Bo An, Svetlana Obraztsova, Shuicheng Yan, Zhongwen Xu

Despite the recent advancement in multi-agent reinforcement learning (MARL), the MARL agents easily overfit the training environment and perform poorly in the evaluation scenarios where other agents behave differently.

Multi-agent Reinforcement Learning reinforcement-learning +1

Boosting Offline Reinforcement Learning via Data Rebalancing

no code implementations 17 Oct 2022 Yang Yue, Bingyi Kang, Xiao Ma, Zhongwen Xu, Gao Huang, Shuicheng Yan

Therefore, we propose a simple yet effective method to boost offline RL algorithms based on the observation that resampling a dataset keeps the distribution support unchanged.

D4RL Offline RL +2

Mutual Information Regularized Offline Reinforcement Learning

1 code implementation NeurIPS 2023 Xiao Ma, Bingyi Kang, Zhongwen Xu, Min Lin, Shuicheng Yan

In this work, we propose a novel MISA framework to approach offline RL from the perspective of Mutual Information between States and Actions in the dataset by directly constraining the policy improvement direction.

D4RL Offline RL +2

Efficient Offline Policy Optimization with a Learned Model

1 code implementation 12 Oct 2022 Zichen Liu, Siyi Li, Wee Sun Lee, Shuicheng Yan, Zhongwen Xu

Instead of planning with the expensive MCTS, we use the learned model to construct an advantage estimation based on a one-step rollout.

Offline RL

Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning

no code implementations 25 Jun 2022 Yang Yue, Bingyi Kang, Zhongwen Xu, Gao Huang, Shuicheng Yan

Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL.

Contrastive Learning Data Augmentation +5

Imitation Learning via Differentiable Physics

1 code implementation 10 Jun 2022 Siwei Chen, Xiao Ma, Zhongwen Xu

With the physics prior, ILD policies can not only be transferable to unseen environment specifications but also yield higher final performance on a variety of tasks.

Continuous Control Deformable Object Manipulation +1

ETA Prediction with Graph Neural Networks in Google Maps

no code implementations 25 Aug 2021 Austin Derrow-Pinion, Jennifer She, David Wong, Oliver Lange, Todd Hester, Luis Perez, Marc Nunkesser, Seongjae Lee, Xueying Guo, Brett Wiltshire, Peter W. Battaglia, Vishal Gupta, Ang Li, Zhongwen Xu, Alvaro Sanchez-Gonzalez, Yujia Li, Petar Veličković

Travel-time prediction constitutes a task of high importance in transportation networks, with web mapping services like Google Maps regularly serving vast quantities of travel time queries from users and enterprises alike.

Graph Representation Learning

Balancing Constraints and Rewards with Meta-Gradient D4PG

no code implementations ICLR 2021 Dan A. Calian, Daniel J. Mankowitz, Tom Zahavy, Zhongwen Xu, Junhyuk Oh, Nir Levine, Timothy Mann

Deploying Reinforcement Learning (RL) agents to solve real-world applications often requires satisfying complex system constraints.

Reinforcement Learning (RL)

Discovering Reinforcement Learning Algorithms

1 code implementation NeurIPS 2020 Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver

Automating the discovery of update rules from data could lead to more efficient algorithms, or algorithms that are better adapted to specific environments.

Atari Games Meta-Learning +3

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

no code implementations NeurIPS 2020 Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver

In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience with its environment.

Q-Learning reinforcement-learning +1

A Self-Tuning Actor-Critic Algorithm

no code implementations NeurIPS 2020 Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Reinforcement learning algorithms are highly sensitive to the choice of hyperparameters, typically requiring significant manual effort to identify hyperparameters that perform well on a new domain.

Atari Games reinforcement-learning +1

What Can Learned Intrinsic Rewards Capture?

no code implementations ICML 2020 Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

Furthermore, we show that unlike policy transfer methods that capture "how" the agent should behave, the learned reward functions can generalise to other kinds of agents and to changes in the dynamics of the environment by capturing "what" the agent should strive to do.

Discovery of Useful Questions as Auxiliary Tasks

no code implementations NeurIPS 2019 Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions.

Reinforcement Learning (RL)

Meta-Gradient Reinforcement Learning

1 code implementation NeurIPS 2018 Zhongwen Xu, Hado van Hasselt, David Silver

Instead, the majority of reinforcement learning algorithms estimate and/or optimise a proxy for the value function.

Meta-Learning reinforcement-learning +1

An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning

no code implementations 22 Mar 2017 Fan Wu, Zhongwen Xu, Yi Yang

We propose an end-to-end approach to the natural language object retrieval task, which localizes an object within an image according to a natural language description, i.e., referring expression.

Object Referring Expression +1

Few-Shot Object Recognition from Machine-Labeled Web Images

no code implementations CVPR 2017 Zhongwen Xu, Linchao Zhu, Yi Yang

Then, we demonstrate that with our model, machine-labeled image annotations are very effective and abundant resources to perform object recognition on novel categories.

Few-Shot Learning Object +1

Strategies for Searching Video Content with Text Queries or Video Examples

no code implementations 17 Jun 2016 Shoou-I Yu, Yi Yang, Zhongwen Xu, Shicheng Xu, Deyu Meng, Zexi Mao, Zhigang Ma, Ming Lin, Xuanchong Li, Huan Li, Zhenzhong Lan, Lu Jiang, Alexander G. Hauptmann, Chuang Gan, Xingzhong Du, Xiaojun Chang

The large number of user-generated videos uploaded to the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search.

Event Detection Retrieval +1

Uncovering Temporal Context for Video Question and Answering

no code implementations 15 Nov 2015 Linchao Zhu, Zhongwen Xu, Yi Yang, Alexander G. Hauptmann

In this work, we introduce Video Question Answering in temporal domain to infer the past, describe the present and predict the future.

Multiple-choice Question Answering +1

A Discriminative CNN Video Representation for Event Detection

no code implementations CVPR 2015 Zhongwen Xu, Yi Yang, Alexander G. Hauptmann

In this paper, we propose a discriminative video representation for event detection over a large scale video dataset when only limited hardware resources are available.

Event Detection

Event Detection using Multi-Level Relevance Labels and Multiple Features

no code implementations CVPR 2014 Zhongwen Xu, Ivor W. Tsang, Yi Yang, Zhigang Ma, Alexander G. Hauptmann

We address the challenging problem of utilizing related exemplars for complex event detection while multiple features are available.

Event Detection

Complex Event Detection via Multi-source Video Attributes

no code implementations CVPR 2013 Zhigang Ma, Yi Yang, Zhongwen Xu, Shuicheng Yan, Nicu Sebe, Alexander G. Hauptmann

Compared to complex event videos, these external videos contain simple contents such as objects, scenes and actions which are the basic elements of complex events.

Event Detection
