Search Results for author: Nishanth Anand

Found 3 papers, 2 papers with code

Prediction and Control in Continual Reinforcement Learning

1 code implementation • NeurIPS 2023 • Nishanth Anand, Doina Precup

Temporal difference (TD) learning is often used to update the estimate of the value function, which RL agents use to extract useful policies.
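For context, a minimal sketch of the standard TD(0) value update the abstract refers to; this illustrates generic TD learning, not the continual-learning algorithm proposed in the paper. The chain MDP and all parameter values here are illustrative assumptions.

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99):
    """One TD(0) update: move V[s] toward the bootstrapped target r + gamma * V[s_next]."""
    td_error = r + gamma * V[s_next] - V[s]  # one-step TD-error
    V[s] += alpha * td_error                 # incremental estimate update
    return td_error

# Toy chain MDP: states 0..3, reward 1.0 on the transition into state 3.
V = [0.0, 0.0, 0.0, 0.0]
td0_update(V, s=2, r=1.0, s_next=3)  # V[2] moves from 0.0 to 0.1
```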

Tasks: Continual Learning, General Knowledge, +1

Preferential Temporal Difference Learning

1 code implementation • 11 Jun 2021 • Nishanth Anand, Doina Precup

When the agent lands in a state, that state's value can be used to compute the TD-error, which is then propagated to other states.
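A sketch of the classic mechanism this sentence describes, using standard TD(λ) with accumulating eligibility traces to propagate the TD-error back to previously visited states; this is the textbook baseline, not the preferential weighting the paper itself proposes, and the parameter values are illustrative assumptions.

```python
def td_lambda_step(V, e, s, r, s_next, alpha=0.1, gamma=0.99, lam=0.9):
    """Compute the TD-error at the state just entered and propagate it
    to earlier states in proportion to their eligibility traces."""
    delta = r + gamma * V[s_next] - V[s]  # TD-error at the current state
    e[s] += 1.0                           # mark the current state as eligible
    for i in range(len(V)):
        V[i] += alpha * delta * e[i]      # error propagated via traces
        e[i] *= gamma * lam               # decay traces toward older states
    return delta

# Two transitions on a toy chain: 0 -> 1 (reward 0), then 1 -> 2 (reward 1).
V = [0.0] * 4
e = [0.0] * 4
td_lambda_step(V, e, s=0, r=0.0, s_next=1)
td_lambda_step(V, e, s=1, r=1.0, s_next=2)
# The reward observed at the second step also updates state 0 through its trace.
```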

Recurrent Value Functions

no code implementations • 23 May 2019 • Pierre Thodoroff, Nishanth Anand, Lucas Caccia, Doina Precup, Joelle Pineau

Despite recent successes in Reinforcement Learning, value-based methods often suffer from high variance, hindering performance.

Tasks: Continuous Control, Reinforcement Learning (RL)
