Search Results for author: Changmin Yu

Found 9 papers, 5 papers with code

Successor-Predecessor Intrinsic Exploration

no code implementations • NeurIPS 2023 • Changmin Yu, Neil Burgess, Maneesh Sahani, Samuel J. Gershman

Here we focus on exploration with intrinsic rewards, where the agent transiently augments the external rewards with self-generated intrinsic rewards.

Atari Games • Efficient Exploration +1
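
A minimal sketch of the intrinsic-reward augmentation described in the abstract above: the agent optimises the external reward plus a self-generated bonus that fades over training. The count-based novelty bonus and the decay schedule here are generic illustrative assumptions, not the successor-predecessor measure the paper proposes.

```python
from collections import defaultdict

class IntrinsicRewardWrapper:
    """Augments external rewards with a transient, self-generated bonus.

    Illustrative only: the count-based bonus and decay schedule are generic
    choices, not the paper's successor-predecessor intrinsic reward.
    """

    def __init__(self, beta0=1.0, decay=0.999):
        self.counts = defaultdict(int)  # state visitation counts
        self.beta = beta0               # intrinsic-reward coefficient
        self.decay = decay              # anneals the bonus over training

    def reward(self, state, r_ext):
        self.counts[state] += 1
        r_int = self.counts[state] ** -0.5   # novelty bonus: rarer states pay more
        r_total = r_ext + self.beta * r_int
        self.beta *= self.decay              # "transient": the bonus fades with experience
        return r_total
```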

Unsupervised representation learning with recognition-parametrised probabilistic models

2 code implementations • 13 Sep 2022 • William I. Walker, Hugo Soulat, Changmin Yu, Maneesh Sahani

We introduce a new approach to probabilistic unsupervised learning based on the recognition-parametrised model (RPM): a normalised semi-parametric hypothesis class for joint distributions over observed and latent variables.

Image Classification • Representation Learning +1
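
Schematically, a recognition-parametrised joint over observations x_{1:J} and a latent z couples a latent prior with per-observation recognition factors, each normalised against the empirical data marginal. This is a hedged reconstruction of the general form named in the abstract, not a substitute for the paper's definitions.

```latex
% General shape of a recognition-parametrised joint (schematic):
% a latent prior times recognition factors f_{\theta,j}, each normalised
% by its marginal under the empirical data distribution P_{0,j}.
P_\theta(x_{1:J}, z) \;=\; p_\theta(z)\,\prod_{j=1}^{J}
    \frac{f_{\theta,j}(z \mid x_j)}{F_{\theta,j}(z)}\, P_{0,j}(x_j),
\qquad
F_{\theta,j}(z) \;=\; \mathbb{E}_{P_{0,j}(x_j)}\!\left[ f_{\theta,j}(z \mid x_j) \right].
```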

Structured Recognition for Generative Models with Explaining Away

1 code implementation • 12 Sep 2022 • Changmin Yu, Hugo Soulat, Neil Burgess, Maneesh Sahani

A key goal of unsupervised learning is to go beyond density estimation and sample generation to reveal the structure inherent within observed data.

Density Estimation • Hippocampus +2

SEREN: Knowing When to Explore and When to Exploit

no code implementations • 30 May 2022 • Changmin Yu, David Mguni, Dong Li, Aivar Sootla, Jun Wang, Neil Burgess

Efficient reinforcement learning (RL) involves a trade-off between "exploitative" actions that maximise expected reward and "explorative" ones that sample unvisited states.

Reinforcement Learning (RL)
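
As a baseline illustration of that trade-off (not SEREN itself, which learns when to switch between the two regimes), epsilon-greedy interleaves exploitative and explorative actions with a fixed probability:

```python
import random

def epsilon_greedy(q_values, actions, epsilon=0.1):
    """Textbook exploration/exploitation trade-off: with probability epsilon
    take a random ("explorative") action, otherwise the reward-maximising
    ("exploitative") one. SEREN replaces this fixed coin-flip with a learned
    switching policy; this is only the classic baseline."""
    if random.random() < epsilon:
        return random.choice(actions)                 # explore: sample an action
    return max(actions, key=lambda a: q_values[a])    # exploit: greedy action
```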

Learning State Representations via Retracing in Reinforcement Learning

1 code implementation • ICLR 2022 • Changmin Yu, Dong Li, Jianye Hao, Jun Wang, Neil Burgess

We propose learning via retracing, a novel self-supervised approach for learning the state representation (and the associated dynamics model) for reinforcement learning tasks.

Continuous Control • Model-based Reinforcement Learning +3
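
A hedged sketch of the "retracing" idea as a cycle-consistency objective: roll the latent dynamics forward through an action, then retrace backwards and penalise mismatch with the starting representation. The module names and the exact loss are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RetracingLoss(nn.Module):
    """Illustrative cycle-consistency ("retracing") objective: encode a state,
    predict forward through the action, retrace backward, and require the
    retraced latent to match the original. Details are assumed."""

    def __init__(self, encoder, forward_model, backward_model):
        super().__init__()
        self.encoder = encoder                # maps observations to latents
        self.forward_model = forward_model    # latent dynamics: (z, a) -> z_next
        self.backward_model = backward_model  # reverse dynamics: (z_next, a) -> z

    def forward(self, obs, action):
        z = self.encoder(obs)
        z_next = self.forward_model(z, action)            # predict one step ahead
        z_retraced = self.backward_model(z_next, action)  # retrace back to the start
        return ((z_retraced - z.detach()) ** 2).mean()    # consistency penalty
```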

DESTA: A Framework for Safe Reinforcement Learning with Markov Games of Intervention

no code implementations • 27 Oct 2021 • David Mguni, Usman Islam, Yaqi Sun, Xiuling Zhang, Joel Jennings, Aivar Sootla, Changmin Yu, Ziyan Wang, Jun Wang, Yaodong Yang

In this paper, we introduce a new generation of RL solvers that learn to minimise safety violations while maximising the task reward to the extent that can be tolerated by the safe policy.

OpenAI Gym • reinforcement-learning +3
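
The usual way to formalise "minimise safety violations while maximising task reward" is a constrained-MDP objective; the generic form below is given for orientation only and is not DESTA's two-player Markov game of intervention.

```latex
% Generic constrained-RL objective (not DESTA's Markov-game formulation):
% maximise expected task return subject to a budget d on expected safety cost.
\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t} \gamma^{t} r_t\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\!\left[\sum_{t} \gamma^{t} c_t\right] \le d .
```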

What About Inputting Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator

1 code implementation • NeurIPS 2021 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang

We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation.

Continuous Control • Contrastive Learning +3
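
A minimal sketch of a policy-extended value function as described in the abstract: a network that consumes both the state and an explicit embedding of the policy, V(s, chi(pi)). The flatten-and-project policy representation here is a placeholder assumption; the paper studies learned representations.

```python
import torch
import torch.nn as nn

class PeVFA(nn.Module):
    """Policy-extended value function approximator (schematic): V(s, chi(pi)),
    where chi embeds the policy. Here chi is a crude linear projection of the
    policy's flattened parameters, purely as a placeholder."""

    def __init__(self, state_dim, policy_param_dim, embed_dim=64, hidden=128):
        super().__init__()
        self.policy_embed = nn.Linear(policy_param_dim, embed_dim)  # chi(pi)
        self.value_head = nn.Sequential(
            nn.Linear(state_dim + embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, policy_params):
        chi = torch.tanh(self.policy_embed(policy_params))        # policy representation
        return self.value_head(torch.cat([state, chi], dim=-1))   # V(s, chi(pi))
```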

Prediction and Generalisation over Directed Actions by Grid Cells

1 code implementation • ICLR 2021 • Changmin Yu, Timothy E. J. Behrens, Neil Burgess

Knowing how the effects of directed actions generalise to new situations (e.g. moving North, South, East and West, or turning left, right, etc.)

Continuous Control • Translation
