Search Results for author: Baturay Saglam

Found 10 papers, 8 papers with code

Compatible Gradient Approximations for Actor-Critic Algorithms

1 code implementation • 2 Sep 2024 • Baturay Saglam, Dionysis Kalogerias

Deterministic policy gradient algorithms are foundational for actor-critic methods in controlling continuous systems, yet they often encounter inaccuracies due to their dependence on the derivative of the critic's value estimates with respect to input actions.

Deep Reinforcement Learning Based Joint Downlink Beamforming and RIS Configuration in RIS-aided MU-MISO Systems Under Hardware Impairments and Imperfect CSI

2 code implementations • 10 Oct 2022 • Baturay Saglam, Doga Gurgunoglu, Suleyman S. Kozat

We introduce a novel deep reinforcement learning (DRL) approach to jointly optimize transmit beamforming and reconfigurable intelligent surface (RIS) phase shifts in a multiuser multiple-input single-output (MU-MISO) system to maximize the sum downlink rate under the phase-dependent reflection amplitude model.

Deep Intrinsically Motivated Exploration in Continuous Control

2 code implementations • 1 Oct 2022 • Baturay Saglam, Suleyman S. Kozat

In continuous control, exploration is often performed through undirected strategies in which parameters of the networks or selected actions are perturbed by random noise.

Continuous Control reinforcement-learning +1

Actor Prioritized Experience Replay

1 code implementation • 1 Sep 2022 • Baturay Saglam, Furkan B. Mutlu, Dogan C. Cicek, Suleyman S. Kozat

A widely studied deep reinforcement learning (RL) technique known as Prioritized Experience Replay (PER) allows agents to learn from transitions sampled with non-uniform probability proportional to their temporal-difference (TD) error.

Continuous Control Reinforcement Learning (RL)

Mitigating Off-Policy Bias in Actor-Critic Methods with One-Step Q-learning: A Novel Correction Approach

1 code implementation • 1 Aug 2022 • Baturay Saglam, Dogan C. Cicek, Furkan B. Mutlu, Suleyman S. Kozat

Compared to on-policy counterparts, off-policy model-free deep reinforcement learning can improve data efficiency by repeatedly using the previously gathered data.

Continuous Control Q-Learning +2

Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms

1 code implementation • 27 Jul 2022 • Baturay Saglam, Dogan C. Cicek, Furkan B. Mutlu, Suleyman S. Kozat

Learning in high-dimensional continuous tasks is challenging, especially when the experience replay memory is very limited.

Continuous Control OpenAI Gym +1

AWD3: Dynamic Reduction of the Estimation Bias

no code implementations • 12 Nov 2021 • Dogan C. Cicek, Enes Duran, Baturay Saglam, Kagan Kaya, Furkan B. Mutlu, Suleyman S. Kozat

We show through continuous control environments of OpenAI Gym that our algorithm matches or outperforms the state-of-the-art off-policy policy gradient learning algorithms.

Continuous Control OpenAI Gym +1

Off-Policy Correction for Deep Deterministic Policy Gradient Algorithms via Batch Prioritized Experience Replay

no code implementations • 2 Nov 2021 • Dogan C. Cicek, Enes Duran, Baturay Saglam, Furkan B. Mutlu, Suleyman S. Kozat

In addition, experience replay stores transitions generated by the agent's previous policies, which may deviate significantly from its most recent policy.

Computational Efficiency Continuous Control

Parameter-free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients

1 code implementation • 24 Sep 2021 • Baturay Saglam, Furkan Burak Mutlu, Dogan Can Cicek, Suleyman Serdar Kozat

We show that when the reinforcement signals received by the agents have a high variance, deep actor-critic approaches that overcome the overestimation bias lead to a substantial underestimation bias.

Continuous Control Q-Learning +1

Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods

1 code implementation • 22 Sep 2021 • Baturay Saglam, Enes Duran, Dogan C. Cicek, Furkan B. Mutlu, Suleyman S. Kozat

We show that in deep actor-critic methods that aim to overcome the overestimation bias, if the reinforcement signals received by the agent have a high variance, a significant underestimation bias arises.

Continuous Control OpenAI Gym +3
