Search Results for author: Nitsan Soffair

Found 6 papers, 0 papers with code

Markov flow policy -- deep MC

no code implementations · 1 May 2024 · Nitsan Soffair, Gilad Katz

Discounted algorithms often encounter evaluation errors due to their reliance on short-term estimations, which can impede their efficacy on simple, short-term tasks and impose an undesired temporal discount ($\gamma$).
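
As a purely illustrative aside (not taken from the paper), the sketch below shows how a discount factor $\gamma$ down-weights rewards that arrive far in the future, which is the temporal discount the snippet refers to; the function name and reward sequence are made up for the example.

    # Illustration only (not the paper's method): how a discount factor
    # gamma down-weights rewards that arrive later in a trajectory.

    def discounted_return(rewards, gamma):
        """Sum of gamma**t * r_t over a reward sequence."""
        return sum(gamma**t * r for t, r in enumerate(rewards))

    rewards = [0.0] * 99 + [1.0]               # a single reward, 100 steps away
    print(discounted_return(rewards, 0.99))    # ~0.37: heavily discounted
    print(discounted_return(rewards, 1.0))     # 1.0: the undiscounted view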

Conservative DDPG -- Pessimistic RL without Ensemble

no code implementations · 8 Mar 2024 · Nitsan Soffair, Shie Mannor

DDPG is hindered by the overestimation bias problem, wherein its $Q$-estimates tend to overstate the actual $Q$-values.
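
For context, the sketch below shows the standard ensemble-style remedy that the paper's title ("Pessimistic RL without Ensemble") contrasts with: TD3's clipped double-$Q$ target, which takes the minimum over two critics' estimates to counter overestimation. This is not the paper's ensemble-free method, which the snippet does not describe; the function and values are illustrative.

    import numpy as np

    # Illustration only: TD3-style clipped double-Q target, the common
    # ensemble-based fix for overestimation bias. The paper's own
    # ensemble-free pessimistic target is not given in this snippet.

    def clipped_double_q_target(r, q1_next, q2_next, gamma=0.99, done=False):
        """Pessimistic TD target: min over two Q-estimates of the next
        state-action pair, so the more optimistic critic is ignored."""
        q_min = np.minimum(q1_next, q2_next)
        return r + gamma * (1.0 - float(done)) * q_min

    # Two critics disagree; the target uses the lower estimate.
    print(clipped_double_q_target(r=1.0, q1_next=10.0, q2_next=8.5))  # 9.415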

MinMaxMin $Q$-learning

no code implementations · 3 Feb 2024 · Nitsan Soffair, Shie Mannor

MinMaxMin $Q$-learning is a novel optimistic actor-critic algorithm that addresses the overestimation bias problem ($Q$-estimates overstating the real $Q$-values) inherent in conservative RL algorithms.

Q-Learning
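
The snippet does not give the update rule, so the sketch below encodes only one plausible reading of the algorithm's name (an assumption, not confirmed here): start from the conservative minimum over an ensemble of $Q$-estimates and add back an optimism bonus proportional to the ensemble's disagreement. The function name and the coefficient alpha are hypothetical.

    import numpy as np

    # Hedged sketch of one plausible reading of "MinMaxMin" (assumption
    # based on the name only): min over the ensemble, nudged upward by a
    # bonus proportional to the ensemble's disagreement (max - min).

    def optimistic_target(q_estimates, alpha=0.1):
        """q_estimates: array of shape (n_critics,) for one (s, a) pair."""
        q_min = np.min(q_estimates)
        q_max = np.max(q_estimates)
        return q_min + alpha * (q_max - q_min)

    print(optimistic_target(np.array([8.0, 9.0, 10.0])))  # 8.0 + 0.1*2.0 = 8.2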

SQT -- std $Q$-target

no code implementations · 3 Feb 2024 · Nitsan Soffair, Dotan Di-Castro, Orly Avner, Shie Mannor

We implement SQT on top of TD3/TD7 code and test it against the state-of-the-art (SOTA) actor-critic algorithms DDPG, TD3, and TD7 on seven popular MuJoCo and Bullet tasks.

Q-Learning
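
Based only on the title "std $Q$-target" (an assumption, not the paper's verified rule), the sketch below penalizes a TD target by the standard deviation across an ensemble of $Q$-estimates, so critic disagreement acts as an uncertainty penalty on top of a TD3/TD7-style critic; the function name and beta coefficient are hypothetical.

    import numpy as np

    # Hedged sketch suggested by the title "std Q-target" (assumption):
    # subtract the ensemble's standard deviation from the bootstrap value,
    # so higher critic disagreement yields a more conservative target.

    def std_q_target(r, q_next_ensemble, gamma=0.99, beta=0.5, done=False):
        """q_next_ensemble: Q-estimates of the next (s', a') from each critic."""
        q_next = np.asarray(q_next_ensemble)
        penalized = q_next.mean() - beta * q_next.std()
        return r + gamma * (1.0 - float(done)) * penalized

    print(std_q_target(r=1.0, q_next_ensemble=[9.0, 10.0, 11.0]))
    # mean=10.0, std~0.816 -> 1 + 0.99*(10.0 - 0.408) ~ 10.496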

Optimizing Agent Collaboration through Heuristic Multi-Agent Planning

no code implementations · 3 Jan 2023 · Nitsan Soffair

The SOTA algorithms for solving QDec-POMDP problems, QDec-FP and QDec-FPS, cannot effectively handle problems that involve different types of sensing agents.
