Search Results for author: Banafsheh Rafiee

Found 6 papers, 1 paper with code

Auxiliary task discovery through generate-and-test

no code implementations • 25 Oct 2022 • Banafsheh Rafiee, Sina Ghiassian, Jun Jin, Richard Sutton, Jun Luo, Adam White

In this paper, we explore an approach to auxiliary task discovery in reinforcement learning based on ideas from representation learning.

Meta-Learning • Representation Learning
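The excerpt above does not describe the discovery mechanism itself; purely as an illustration of the generic generate-and-test pattern the title refers to, here is a minimal Python sketch in which candidate auxiliary tasks are repeatedly proposed and the lowest-utility ones replaced. The task representation and utility measure (generate_task, utility, feature_usefulness) are hypothetical placeholders, not the paper's method.

```python
import random

# Hypothetical sketch of a generic generate-and-test loop for auxiliary
# task discovery. The task representation and the utility measure are
# illustrative placeholders, not the paper's actual method.

def generate_task(num_features):
    """Propose a random candidate auxiliary task (here: a random feature subset)."""
    return frozenset(random.sample(range(num_features), k=4))

def utility(task, feature_usefulness):
    """Score a candidate task by the usefulness of the features it touches (placeholder)."""
    return sum(feature_usefulness[f] for f in task)

def generate_and_test(num_tasks=8, num_features=32, steps=100, replace_fraction=0.25):
    feature_usefulness = [random.random() for _ in range(num_features)]
    tasks = [generate_task(num_features) for _ in range(num_tasks)]
    for _ in range(steps):
        # Tester: rank the current pool of auxiliary tasks by utility.
        tasks.sort(key=lambda t: utility(t, feature_usefulness))
        # Generator: replace the lowest-utility tasks with fresh candidates.
        n_replace = max(1, int(replace_fraction * num_tasks))
        tasks[:n_replace] = [generate_task(num_features) for _ in range(n_replace)]
    return tasks

if __name__ == "__main__":
    print(generate_and_test())
```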

What makes useful auxiliary tasks in reinforcement learning: investigating the effect of the target policy

no code implementations • 1 Apr 2022 • Banafsheh Rafiee, Jun Jin, Jun Luo, Adam White

Our focus on the role of the auxiliary tasks' target policy is motivated by the fact that the target policy determines both the behavior about which the agent makes predictions and the state-action distribution the agent is trained on, which in turn affects learning on the main task.

Representation Learning
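As a hedged illustration of the two roles described in the excerpt above (what is predicted, and which data the predictor effectively trains on), the sketch below shows a generic off-policy linear TD(0) update for an auxiliary prediction with a target policy. The function name, the cumulant, and the policy representation are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

# Illustrative sketch only: a linear off-policy TD(0) update for an
# auxiliary prediction learned from a behavior policy. The target policy
# enters through the importance-sampling ratio, which both defines the
# quantity being predicted and reweights the behavior data.

def aux_prediction_update(w_aux, x, a, cumulant, x_next, pi_probs, b_probs,
                          alpha=0.1, gamma=0.95):
    """One off-policy linear TD(0) update for an auxiliary prediction.

    pi_probs / b_probs are the target- and behavior-policy probabilities
    over actions in the current state; `a` is the action actually taken.
    """
    rho = pi_probs[a] / b_probs[a]                     # importance-sampling ratio
    delta = cumulant + gamma * np.dot(w_aux, x_next) - np.dot(w_aux, x)
    return w_aux + alpha * rho * delta * x
```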

From Eye-blinks to State Construction: Diagnostic Benchmarks for Online Representation Learning

1 code implementation • 9 Nov 2020 • Banafsheh Rafiee, Zaheer Abbas, Sina Ghiassian, Raksha Kumaraswamy, Richard Sutton, Elliot Ludvig, Adam White

We present three new diagnostic prediction problems inspired by classical-conditioning experiments to facilitate research in online prediction learning.

Continual Learning • Representation Learning
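The diagnostic problems themselves are not spelled out in this excerpt; as a rough, hypothetical sketch of what a classical-conditioning-style online prediction stream can look like, the code below emits a conditioned-stimulus (CS) pulse followed, after a fixed inter-stimulus interval, by an unconditioned stimulus (US) that an online learner would try to anticipate. All names, timings, and the trial structure are assumptions, not the paper's benchmarks.

```python
import random

# Hypothetical trace-conditioning-style stream, loosely in the spirit of
# classical-conditioning experiments: a CS pulse is followed by a US pulse
# after an inter-stimulus interval (ISI). Timings are illustrative only.

def conditioning_stream(steps=200, isi=10, inter_trial=(20, 40), seed=0):
    """Yield (cs, us) binary observations one time step at a time."""
    rng = random.Random(seed)
    t, next_cs, pending_us = 0, rng.randint(*inter_trial), None
    while t < steps:
        cs = 1 if t == next_cs else 0
        if cs:
            pending_us = t + isi                                  # schedule the US
            next_cs = t + isi + rng.randint(*inter_trial)         # schedule next trial
        us = 1 if pending_us == t else 0
        if us:
            pending_us = None
        yield cs, us
        t += 1

if __name__ == "__main__":
    for step, (cs, us) in enumerate(conditioning_stream(steps=60)):
        if cs or us:
            print(step, "CS" if cs else "US")
```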

Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks

no code implementations • 16 Mar 2020 • Sina Ghiassian, Banafsheh Rafiee, Yat Long Lo, Adam White

Unfortunately, the performance of deep reinforcement learning systems is sensitive to hyper-parameter settings and architecture choices.

Reinforcement Learning (RL)

A First Empirical Study of Emphatic Temporal Difference Learning

no code implementations • 11 May 2017 • Sina Ghiassian, Banafsheh Rafiee, Richard S. Sutton

In this paper we present the first empirical study of the emphatic temporal-difference learning algorithm (ETD), comparing it with conventional temporal-difference learning, in particular, with linear TD(0), on on-policy and off-policy variations of the Mountain Car problem.
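For readers unfamiliar with the two update rules being compared, here is a minimal sketch of one linear TD(0) step and one Emphatic TD step with lambda = 0, following the standard formulations of these algorithms; the surrounding agent/environment loop, step sizes, and feature construction are assumed and not taken from the paper.

```python
import numpy as np

# Hedged sketch of the two linear update rules compared in the paper:
# conventional TD(0) and Emphatic TD with lambda = 0 (ETD(0)). Feature
# vectors x, rewards r, and importance-sampling ratios rho are assumed
# to come from an external agent/environment loop (not shown).

def td0_update(w, x, r, x_next, alpha=0.1, gamma=0.99):
    """One on-policy linear TD(0) update."""
    delta = r + gamma * np.dot(w, x_next) - np.dot(w, x)
    return w + alpha * delta * x

def etd0_update(w, x, r, x_next, rho, F_prev, rho_prev,
                alpha=0.1, gamma=0.99, interest=1.0):
    """One off-policy Emphatic TD(0) update.

    rho is the importance-sampling ratio pi(a|s) / b(a|s) for the action
    just taken; the caller carries F and rho forward between steps
    (F_prev = 0 and rho_prev = 0 before the first call).
    """
    F = gamma * rho_prev * F_prev + interest           # follow-on trace
    M = F                                              # emphasis (lambda = 0)
    delta = r + gamma * np.dot(w, x_next) - np.dot(w, x)
    w = w + alpha * rho * M * delta * x
    return w, F
```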
