1 code implementation • 29 Sep 2023 • Shengyi Huang, Jiayi Weng, Rujikorn Charakorn, Min Lin, Zhongwen Xu, Santiago Ontañón
Distributed Deep Reinforcement Learning (DRL) aims to leverage more computational resources to train autonomous agents in less training time.
1 code implementation • 3 Feb 2023 • Qingfeng Lan, A. Rupam Mahmood, Shuicheng Yan, Zhongwen Xu
Reinforcement learning (RL) is fundamentally different from supervised learning, and in practice such learned optimizers do not work well even on simple RL tasks.
1 code implementation • 2 Feb 2023 • Minghuan Liu, Tairan He, Weinan Zhang, Shuicheng Yan, Zhongwen Xu
Specifically, we present Adversarial Imitation Learning with Patch Rewards (PatchAIL), which employs a patch-based discriminator to measure the expertise of different local parts of given images and to provide patch rewards.
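A minimal sketch of the patch-reward idea, assuming image observations and a PatchGAN-style fully convolutional discriminator; the network, reward mapping, and aggregation below are illustrative, not the released PatchAIL code:

```python
# Illustrative sketch: a fully convolutional discriminator outputs one score per
# spatial patch; per-patch scores are mapped to rewards and then aggregated.
# Shapes and the reward mapping are assumptions, not the paper's exact code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchDiscriminator(nn.Module):
    def __init__(self, in_ch=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=1, padding=1),   # one logit per patch
        )

    def forward(self, obs):                 # obs: (B, C, H, W) stacked frames
        return self.net(obs)                # (B, 1, H', W') patch logits

disc = PatchDiscriminator()
obs = torch.randn(8, 4, 84, 84)
patch_logits = disc(obs)

# AIL-style reward per patch, e.g. -log(1 - D(s)); the scalar reward aggregates
# the patch rewards (a mean here; other aggregations are possible).
patch_rewards = -F.logsigmoid(-patch_logits)      # = -log(1 - sigmoid(logit))
reward = patch_rewards.mean(dim=(1, 2, 3))        # (B,) one scalar reward per observation
```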
no code implementations • 27 Jan 2023 • Wanqi Xue, Bo An, Shuicheng Yan, Zhongwen Xu
The complexity of designing reward functions has been a major obstacle to the wide application of deep reinforcement learning (RL) techniques.
no code implementations • CVPR 2023 • Siwei Chen, Xiao Ma, Zhongwen Xu
With the physics prior, ILD policies not only transfer to unseen environment specifications but also achieve higher final performance on a variety of tasks.
no code implementations • 18 Oct 2022 • Wei Qiu, Xiao Ma, Bo An, Svetlana Obraztsova, Shuicheng Yan, Zhongwen Xu
Despite recent advances in multi-agent reinforcement learning (MARL), MARL agents easily overfit the training environment and perform poorly in evaluation scenarios where other agents behave differently.
no code implementations • 17 Oct 2022 • Yang Yue, Bingyi Kang, Xiao Ma, Zhongwen Xu, Gao Huang, Shuicheng Yan
Therefore, we propose a simple yet effective method to boost offline RL algorithms based on the observation that resampling a dataset keeps the distribution support unchanged.
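A minimal sketch of this kind of dataset resampling; the return-based weighting below is illustrative rather than the paper's exact scheme:

```python
# Illustrative sketch: resample transition indices with probabilities tied to
# trajectory returns. Because sampling only reweights existing transitions,
# the support of the dataset distribution is unchanged.
import numpy as np

rng = np.random.default_rng(0)
traj_returns = np.array([1.0, 5.0, 20.0, 3.0])                 # return of each trajectory
traj_of_transition = np.repeat(np.arange(4), [10, 7, 12, 9])   # trajectory id of each transition

# weight each transition by a shifted return of its trajectory
w = traj_returns[traj_of_transition]
w = w - w.min() + 1e-3                                          # keep all weights positive
p = w / w.sum()

batch_idx = rng.choice(len(p), size=256, replace=True, p=p)     # indices into the offline dataset
```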
1 code implementation • 12 Oct 2022 • Zichen Liu, Siyi Li, Wee Sun Lee, Shuicheng Yan, Zhongwen Xu
Instead of planning with expensive MCTS, we use the learned model to construct an advantage estimate based on a one-step rollout.
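A minimal sketch of a one-step-rollout advantage estimate; the learned dynamics, reward, and value functions below are placeholders:

```python
# Illustrative sketch: instead of a deep search, estimate the advantage of each
# action from a single imagined step with the learned model:
#   A(s, a) ≈ r_hat(s, a) + gamma * v_hat(g_hat(s, a)) - v_hat(s)
# g_hat, r_hat and v_hat are stand-ins for the learned model components.
import numpy as np

gamma = 0.99
s = np.random.randn(16)                    # latent state from the representation network

def g_hat(s, a):                           # learned dynamics: next latent state
    return s + 0.01 * a

def r_hat(s, a):                           # learned reward model
    return 0.1 * float(a)

def v_hat(s):                              # learned value function
    return float(s.sum())

actions = [0, 1, 2, 3]
advantages = np.array([r_hat(s, a) + gamma * v_hat(g_hat(s, a)) - v_hat(s) for a in actions])
```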
no code implementations • 25 Jun 2022 • Yang Yue, Bingyi Kang, Zhongwen Xu, Gao Huang, Shuicheng Yan
Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL.
3 code implementations • 21 Jun 2022 • Jiayi Weng, Min Lin, Shengyi Huang, Bo Liu, Denys Makoviichuk, Viktor Makoviychuk, Zichen Liu, Yufan Song, Ting Luo, Yukun Jiang, Zhongwen Xu, Shuicheng Yan
EnvPool is open-sourced at https://github.com/sail-sg/envpool.
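A short usage sketch of EnvPool's batched, gym-style interface; the environment id and batch size are arbitrary, and the return signature shown is the classic gym one (it may differ across installed versions):

```python
# Illustrative EnvPool usage with the classic gym-style batched API.
import numpy as np
import envpool

env = envpool.make("Pong-v5", env_type="gym", num_envs=16)
obs = env.reset()                                           # batched observations, shape (16, ...)
for _ in range(10):
    actions = np.random.randint(0, env.action_space.n, size=16).astype(np.int32)
    obs, rewards, dones, info = env.step(actions)           # all results are batched numpy arrays
```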
no code implementations • 25 Aug 2021 • Austin Derrow-Pinion, Jennifer She, David Wong, Oliver Lange, Todd Hester, Luis Perez, Marc Nunkesser, Seongjae Lee, Xueying Guo, Brett Wiltshire, Peter W. Battaglia, Vishal Gupta, Ang Li, Zhongwen Xu, Alvaro Sanchez-Gonzalez, Yujia Li, Petar Veličković
Travel-time prediction constitutes a task of high importance in transportation networks, with web mapping services like Google Maps regularly serving vast quantities of travel time queries from users and enterprises alike.
no code implementations • 21 Jun 2021 • Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt
In this paper, we extend the use of emphatic methods to deep reinforcement learning agents.
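For reference, a minimal sketch of the followon/emphasis weighting from the emphatic-TD literature that such methods scale their updates by; the interest is taken to be 1 at every step here:

```python
# Illustrative sketch of the emphatic weighting used by ETD-style methods:
#   F_t = rho_{t-1} * gamma * F_{t-1} + I_t        (followon trace)
#   M_t = lambda_ * I_t + (1 - lambda_) * F_t      (emphasis)
# Each step's TD update is then scaled by M_t. Interest I_t = 1 throughout.
import numpy as np

gamma, lambda_ = 0.99, 0.95
rhos = np.array([1.0, 0.8, 1.2, 1.0, 0.9])   # importance-sampling ratios pi/mu
F, emphasis = 0.0, []
prev_rho = 1.0
for rho in rhos:
    F = prev_rho * gamma * F + 1.0           # interest I_t = 1
    emphasis.append(lambda_ * 1.0 + (1.0 - lambda_) * F)
    prev_rho = rho
```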
no code implementations • NeurIPS 2021 • Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh
Temporal abstractions in the form of options have been shown to help reinforcement learning (RL) agents learn faster.
no code implementations • ICLR 2021 • Dan A. Calian, Daniel J. Mankowitz, Tom Zahavy, Zhongwen Xu, Junhyuk Oh, Nir Levine, Timothy Mann
Deploying Reinforcement Learning (RL) agents to solve real-world applications often requires satisfying complex system constraints.
1 code implementation • NeurIPS 2020 • Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver
Automating the discovery of update rules from data could lead to more efficient algorithms, or algorithms that are better adapted to specific environments.
no code implementations • NeurIPS 2020 • Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver
In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience with its environment.
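A minimal sketch of the meta-gradient mechanism: a small objective network is trained by differentiating an outer policy-gradient loss through one inner policy update taken with the learned objective. The networks, data, and outer loss below are placeholders, not the paper's agent:

```python
# Illustrative meta-gradient sketch: the learned objective L_eta maps per-step
# quantities to a scalar loss; its parameters are updated by backpropagating an
# outer loss through one inner gradient step on the policy.
import torch
import torch.nn as nn

obs_dim, act_dim, inner_lr = 4, 2, 0.1
policy = nn.Linear(obs_dim, act_dim)                    # tiny policy producing action logits
objective = nn.Sequential(nn.Linear(2, 16), nn.ReLU(),  # learned objective L_eta
                          nn.Linear(16, 1))
meta_opt = torch.optim.Adam(objective.parameters(), lr=1e-3)

def inner_loss(w, b, obs, acts, rets):
    logits = obs @ w.t() + b
    logp = torch.log_softmax(logits, dim=-1).gather(1, acts[:, None]).squeeze(1)
    # the learned objective consumes (log-prob, return) pairs and outputs a scalar
    return objective(torch.stack([logp, rets], dim=-1)).mean()

# placeholder batch of experience
obs  = torch.randn(32, obs_dim)
acts = torch.randint(0, act_dim, (32,))
rets = torch.randn(32)

# inner step: theta' = theta - alpha * dL_eta/dtheta, kept differentiable w.r.t. eta
w, b = policy.weight, policy.bias
g_w, g_b = torch.autograd.grad(inner_loss(w, b, obs, acts, rets), (w, b), create_graph=True)
w2, b2 = w - inner_lr * g_w, b - inner_lr * g_b

# outer step: score the updated policy with an ordinary policy-gradient loss and
# backpropagate into eta through the inner update
logp2 = torch.log_softmax(obs @ w2.t() + b2, dim=-1).gather(1, acts[:, None]).squeeze(1)
outer_loss = -(logp2 * rets).mean()
meta_opt.zero_grad()
outer_loss.backward()
meta_opt.step()
```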
no code implementations • NeurIPS 2020 • Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh
Reinforcement learning algorithms are highly sensitive to the choice of hyperparameters, typically requiring significant manual effort to identify hyperparameters that perform well on a new domain.
no code implementations • ICML 2020 • Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh
Furthermore, we show that unlike policy transfer methods that capture "how" the agent should behave, the learned reward functions can generalise to other kinds of agents and to changes in the dynamics of the environment by capturing "what" the agent should strive to do.
no code implementations • NeurIPS 2019 • Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh
Arguably, intelligent agents ought to be able to discover their own questions so that, in learning answers to them, they acquire unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions.
no code implementations • 8 Jul 2019 • Hado van Hasselt, John Quan, Matteo Hessel, Zhongwen Xu, Diana Borsa, Andre Barreto
We consider a general class of non-linear Bellman equations.
no code implementations • NeurIPS 2018 • Zhongwen Xu, Hado van Hasselt, David Silver
Instead, the majority of reinforcement learning algorithms estimate and/or optimise a proxy for the value function.
no code implementations • NeurIPS 2017 • Zhongwen Xu, Joseph Modayil, Hado P. Van Hasselt, Andre Barreto, David Silver, Tom Schaul
Neural networks have a smooth initial inductive bias, such that small changes in input do not lead to large changes in output.
no code implementations • 22 Mar 2017 • Fan Wu, Zhongwen Xu, Yi Yang
We propose an end-to-end approach to the natural language object retrieval task, which localizes an object within an image according to a natural language description, i.e., a referring expression.
no code implementations • CVPR 2017 • Zhongwen Xu, Linchao Zhu, Yi Yang
Then, we demonstrate that with our model, machine-labeled image annotations are effective and abundant resources for performing object recognition on novel categories.
no code implementations • CVPR 2017 • Linchao Zhu, Zhongwen Xu, Yi Yang
This learning process makes the learned model more capable of dealing with motion speed variance.
no code implementations • 17 Jun 2016 • Shoou-I Yu, Yi Yang, Zhongwen Xu, Shicheng Xu, Deyu Meng, Zexi Mao, Zhigang Ma, Ming Lin, Xuanchong Li, Huan Li, Zhenzhong Lan, Lu Jiang, Alexander G. Hauptmann, Chuang Gan, Xingzhong Du, Xiaojun Chang
The large number of user-generated videos uploaded to the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search.
no code implementations • 15 Nov 2015 • Linchao Zhu, Zhongwen Xu, Yi Yang, Alexander G. Hauptmann
In this work, we introduce Video Question Answering in temporal domain to infer the past, describe the present and predict the future.
no code implementations • CVPR 2016 • Pingbo Pan, Zhongwen Xu, Yi Yang, Fei Wu, Yueting Zhuang
In this paper, we propose a new approach, namely Hierarchical Recurrent Neural Encoder (HRNE), to exploit temporal information of videos.
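A minimal sketch of a two-level recurrent encoder in the spirit of HRNE (a lower RNN summarises short chunks of frame features, an upper RNN encodes the chunk summaries); the sizes and chunk length are assumptions, not the paper's configuration:

```python
# Illustrative two-level recurrent encoder for video frame features.
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    def __init__(self, feat_dim=2048, hidden=512, chunk=8):
        super().__init__()
        self.chunk = chunk
        self.low = nn.GRU(feat_dim, hidden, batch_first=True)   # encodes each short chunk
        self.high = nn.GRU(hidden, hidden, batch_first=True)    # encodes the chunk summaries

    def forward(self, frames):                 # frames: (B, T, feat_dim), T divisible by chunk
        b, t, d = frames.shape
        chunks = frames.view(b * (t // self.chunk), self.chunk, d)
        _, h_low = self.low(chunks)            # (1, B * num_chunks, hidden)
        summaries = h_low.squeeze(0).view(b, t // self.chunk, -1)
        _, h_high = self.high(summaries)       # (1, B, hidden)
        return h_high.squeeze(0)               # video-level encoding, (B, hidden)

enc = HierarchicalEncoder()
video = torch.randn(2, 32, 2048)               # 32 frame features per video
code = enc(video)                              # (2, 512)
```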
no code implementations • CVPR 2015 • Zhongwen Xu, Yi Yang, Alexander G. Hauptmann
In this paper, we propose a discriminative video representation for event detection over a large-scale video dataset when only limited hardware resources are available.
no code implementations • CVPR 2014 • Zhongwen Xu, Ivor W. Tsang, Yi Yang, Zhigang Ma, Alexander G. Hauptmann
We address the challenging problem of utilizing related exemplars for complex event detection while multiple features are available.
no code implementations • CVPR 2013 • Zhigang Ma, Yi Yang, Zhongwen Xu, Shuicheng Yan, Nicu Sebe, Alexander G. Hauptmann
Compared to complex event videos, these external videos contain simpler content, such as objects, scenes and actions, which are the basic elements of complex events.