no code implementations • 25 Oct 2022 • Banafsheh Rafiee, Sina Ghiassian, Jun Jin, Richard Sutton, Jun Luo, Adam White
In this paper, we explore an approach to auxiliary task discovery in reinforcement learning based on ideas from representation learning.
no code implementations • 1 Apr 2022 • Banafsheh Rafiee, Jun Jin, Jun Luo, Adam White
Our focus on the role of the target policy of the auxiliary tasks is motivated by the fact that the target policy determines both the behavior about which the agent makes predictions and the state-action distribution on which the agent is trained, which in turn affects learning on the main task.
1 code implementation • 9 Nov 2020 • Banafsheh Rafiee, Zaheer Abbas, Sina Ghiassian, Raksha Kumaraswamy, Richard Sutton, Elliot Ludvig, Adam White
We present three new diagnostic prediction problems inspired by classical-conditioning experiments to facilitate research in online prediction learning.
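These problems are built around classical-conditioning trial structure. As a minimal illustrative sketch (hypothetical parameters and an easy hand-coded representation, not the paper's actual benchmarks), consider a trace-conditioning-style stream: a conditioned stimulus (CS) arrives at random, an unconditioned stimulus (US) follows after a fixed delay, and a linear TD(0) learner predicts the discounted sum of the future US signal online:

```python
import numpy as np

rng = np.random.default_rng(0)
DELAY = 5            # steps from CS onset to US (an assumed trial structure)
GAMMA = 0.9
ALPHA = 0.1
N_FEAT = DELAY + 2   # one-hot "time since CS" units plus a baseline unit

def make_features(steps_since_cs):
    """One-hot encoding of time since the last CS. This is a deliberately
    easy representation; the benchmark problems make state construction
    the hard part."""
    x = np.zeros(N_FEAT)
    if steps_since_cs <= DELAY:
        x[steps_since_cs] = 1.0
    else:
        x[-1] = 1.0  # inter-trial / baseline unit
    return x

w = np.zeros(N_FEAT)
steps_since_cs = DELAY + 1
x = make_features(steps_since_cs)
for t in range(20000):
    # A CS arrives at random during the inter-trial interval;
    # the US follows exactly DELAY steps later.
    if steps_since_cs > DELAY and rng.random() < 0.02:
        steps_since_cs = 0
    else:
        steps_since_cs += 1
    us = 1.0 if steps_since_cs == DELAY else 0.0
    x_next = make_features(steps_since_cs)
    # Linear TD(0): learn the discounted sum of future US activation.
    delta = us + GAMMA * (w @ x_next) - (w @ x)
    w += ALPHA * delta * x
    x = x_next
```

The learned prediction ramps up as the US approaches and falls back to a small baseline between trials, mirroring the anticipatory responses seen in conditioning experiments.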
no code implementations • 16 Mar 2020 • Sina Ghiassian, Banafsheh Rafiee, Yat Long Lo, Adam White
Unfortunately, the performance of deep reinforcement learning systems is sensitive to hyper-parameter settings and architecture choices.
no code implementations • 18 May 2018 • Sina Ghiassian, Huizhen Yu, Banafsheh Rafiee, Richard S. Sutton
We apply neural nets with ReLU gates in online reinforcement learning.
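As an illustration of the setting (not the authors' architecture or task, which this summary does not specify), a small network with ReLU gates can be trained fully online: one stochastic-gradient step per incoming example, with no batches or replay.

```python
import numpy as np

rng = np.random.default_rng(1)
H = 32                            # hidden ReLU units (an assumed size)
LR = 0.01
w1 = rng.normal(0.0, 0.5, H)      # input-to-hidden weights
b1 = rng.uniform(-1.0, 1.0, H)    # spread the ReLU kinks across the input range
w2 = rng.normal(0.0, 0.1, H)      # hidden-to-output weights
b2 = 0.0

def forward(x):
    """Two-layer net with ReLU gates on the hidden layer."""
    h = np.maximum(0.0, w1 * x + b1)
    return h, h @ w2 + b2

# Online training on a stream: one SGD step per incoming sample.
for t in range(30000):
    x = rng.uniform(-2.0, 2.0)
    y = np.sin(2.0 * x)            # stand-in target function (assumed)
    h, y_hat = forward(x)
    err = y_hat - y
    grad_h = err * w2 * (h > 0.0)  # ReLU passes gradient only where active
    w1 -= LR * grad_h * x
    b1 -= LR * grad_h
    w2 -= LR * err * h
    b2 -= LR * err
```

The sharp on/off behavior of the ReLU gates is what makes such networks interesting to study online: each incoming sample adjusts only the units that are active for it.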
no code implementations • 11 May 2017 • Sina Ghiassian, Banafsheh Rafiee, Richard S. Sutton
In this paper we present the first empirical study of the emphatic temporal-difference learning algorithm (ETD), comparing it with conventional temporal-difference learning, in particular linear TD(0), on on-policy and off-policy variations of the Mountain Car problem.
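The two algorithms differ only in how each update is weighted: ETD scales the TD update by an emphasis derived from the follow-on trace. A minimal on-policy sketch, using a five-state random walk in place of Mountain Car (an assumed toy problem) and ETD(0) with interest 1, where the emphasis reduces to the follow-on trace F:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 5          # random-walk states 0..4; terminate off either end
GAMMA = 1.0
ALPHA = 0.005
TRUE_V = np.arange(1, N + 1) / (N + 1)   # prob. of terminating right: 1/6..5/6

def episode():
    """One random-walk episode: start in the middle, step left or right
    uniformly, reward 1 only on terminating to the right."""
    s, traj = N // 2, []
    while True:
        s2 = s + (1 if rng.random() < 0.5 else -1)
        r = 1.0 if s2 == N else 0.0
        done = s2 < 0 or s2 == N
        traj.append((s, r, s2, done))
        if done:
            return traj
        s = s2

def phi(s):
    x = np.zeros(N)
    x[s] = 1.0
    return x

w_td = np.zeros(N)    # conventional linear TD(0)
w_etd = np.zeros(N)   # ETD(0), interest i=1 (on-policy, so rho=1)
for _ in range(8000):
    F = 0.0           # follow-on trace, reset at episode start
    for s, r, s2, done in episode():
        x, x2 = phi(s), (np.zeros(N) if done else phi(s2))
        # TD(0): w += alpha * delta * x
        delta = r + GAMMA * (w_td @ x2) - (w_td @ x)
        w_td += ALPHA * delta * x
        # ETD(0): F_t = gamma * F_{t-1} + 1; emphasis M_t = F_t when lambda = 0
        F = GAMMA * F + 1.0
        delta_e = r + GAMMA * (w_etd @ x2) - (w_etd @ x)
        w_etd += ALPHA * F * delta_e * x
```

On this on-policy tabular problem both methods converge to the true values; the interesting differences the paper studies arise off-policy, where the follow-on trace also accumulates importance-sampling ratios.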