no code implementations • 13 Jun 2024 • Arda Sarp Yenicesu, Furkan B. Mutlu, Suleyman S. Kozat, Ozgur S. Oguz
The experience replay mechanism enables agents to reuse their past experiences multiple times.
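A minimal sketch of such a replay buffer (the class name, capacity, and uniform sampling are illustrative assumptions, not taken from the paper):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity buffer; the oldest transitions are evicted first."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling lets each stored transition be reused many times.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```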
1 code implementation • 1 Sep 2022 • Baturay Saglam, Furkan B. Mutlu, Dogan C. Cicek, Suleyman S. Kozat
A widely studied deep reinforcement learning (RL) technique, Prioritized Experience Replay (PER), allows agents to learn from transitions sampled with non-uniform probability proportional to their temporal-difference (TD) error.
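A rough sketch of proportional prioritized sampling in the spirit of Schaul et al.'s PER (the `alpha`/`beta` hyperparameters and the helper name are assumptions, not the authors' code):

```python
import numpy as np

def sample_proportional(td_errors, batch_size, alpha=0.6, beta=0.4, eps=1e-6):
    """Sample transition indices with probability proportional to |TD error|^alpha."""
    priorities = (np.abs(td_errors) + eps) ** alpha
    probs = priorities / priorities.sum()
    indices = np.random.choice(len(td_errors), size=batch_size, p=probs)
    # Importance-sampling weights correct the bias of non-uniform sampling.
    weights = (len(td_errors) * probs[indices]) ** (-beta)
    weights /= weights.max()
    return indices, weights
```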
1 code implementation • 1 Aug 2022 • Baturay Saglam, Dogan C. Cicek, Furkan B. Mutlu, Suleyman S. Kozat
Compared to on-policy counterparts, off-policy model-free deep reinforcement learning can improve data efficiency by repeatedly reusing previously gathered data.
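A generic off-policy training loop might look like the following sketch; the `agent` and `env` interfaces are placeholders, not any specific library's API:

```python
def train_off_policy(env, agent, buffer, total_steps, updates_per_step=1, batch_size=256):
    """Generic off-policy loop: each stored transition can be replayed many times."""
    state = env.reset()
    for _ in range(total_steps):
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        buffer.push(state, action, reward, next_state, done)
        state = env.reset() if done else next_state
        # Off-policy methods may take several gradient steps per environment step,
        # reusing old data; on-policy methods must discard it after one update.
        for _ in range(updates_per_step):
            if len(buffer) >= batch_size:
                agent.update(buffer.sample(batch_size))
```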
1 code implementation • 27 Jul 2022 • Baturay Saglam, Dogan C. Cicek, Furkan B. Mutlu, Suleyman S. Kozat
Learning in high-dimensional continuous tasks is challenging, especially when the experience replay memory is very limited.
no code implementations • 12 Nov 2021 • Dogan C. Cicek, Enes Duran, Baturay Saglam, Kagan Kaya, Furkan B. Mutlu, Suleyman S. Kozat
We show through the continuous control environments of OpenAI Gym that our algorithm matches or outperforms state-of-the-art off-policy policy-gradient learning algorithms.
no code implementations • 2 Nov 2021 • Dogan C. Cicek, Enes Duran, Baturay Saglam, Furkan B. Mutlu, Suleyman S. Kozat
In addition, experience replay stores transitions generated by the agent's previous policies, which may deviate significantly from its most recent policy.
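One illustrative way to quantify this staleness, purely as a heuristic and not the paper's method, is to compare stored actions against what the current policy would choose for the same states:

```python
import numpy as np

def staleness(batch_states, batch_actions, policy):
    """Mean distance between stored actions and the current policy's actions.
    Illustrative only: large values indicate replayed transitions that deviate
    strongly from the agent's most recent policy."""
    current_actions = np.array([policy(s) for s in batch_states])
    return np.mean(np.linalg.norm(current_actions - np.array(batch_actions), axis=-1))
```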
1 code implementation • 22 Sep 2021 • Baturay Saglam, Enes Duran, Dogan C. Cicek, Furkan B. Mutlu, Suleyman S. Kozat
We show that in deep actor-critic methods that aim to overcome the overestimation bias, a significant underestimation bias arises when the reinforcement signals received by the agent have high variance.
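For context, here is a sketch of the TD3-style clipped double-Q target that such overestimation-countering methods typically use; the minimum over two critics is the mechanism that can undershoot under high-variance signals (this illustrates the setting, not the authors' remedy):

```python
import torch

def clipped_double_q_target(rewards, dones, next_q1, next_q2, gamma=0.99):
    """TD3-style target: the minimum of two critic estimates counters
    overestimation, but when reinforcement signals have high variance the
    min operator systematically undershoots the true value (underestimation)."""
    next_q = torch.min(next_q1, next_q2)
    return rewards + gamma * (1.0 - dones) * next_q
```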