Search Results for author: Liran Szlak

Found 6 papers, 1 paper with code

Hierarchical Bias-Driven Stratification for Interpretable Causal Effect Estimation

1 code implementation • 31 Jan 2024 • Lucile Ter-Minassian, Liran Szlak, Ehud Karavani, Chris Holmes, Yishai Shimoni

Interpretability and transparency are essential for incorporating causal effect models from observational data into policy decision-making.

Decision Making

Suboptimal and trait-like reinforcement learning strategies correlate with midbrain encoding of prediction errors

no code implementations • 8 Dec 2021 • Liran Szlak, Kristoffer Aberg, Rony Paz

During probabilistic learning, organisms often apply a sub-optimal "probability-matching" strategy, where selection rates match reward probabilities, rather than engaging in the optimal "maximization" strategy, where the option with the highest reward probability is always selected.
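
The gap between the two strategies can be illustrated with a small calculation. The sketch below is not taken from the paper: the function names and the example reward probabilities are hypothetical, and it assumes "matching" means choosing each option in proportion to its reward probability (normalized to sum to 1).

```python
def expected_reward_matching(reward_probs):
    """Expected reward per trial when selection rates match reward probabilities.

    Assumption: selection rates are the reward probabilities normalized to sum to 1.
    """
    total = sum(reward_probs)
    selection_rates = [p / total for p in reward_probs]
    return sum(rate * p for rate, p in zip(selection_rates, reward_probs))


def expected_reward_maximizing(reward_probs):
    """Expected reward per trial when the best option is always selected."""
    return max(reward_probs)


# Hypothetical two-option task with reward probabilities 0.7 and 0.3.
probs = [0.7, 0.3]
matching = expected_reward_matching(probs)      # 0.7 * 0.7 + 0.3 * 0.3
maximizing = expected_reward_maximizing(probs)  # always pick the 0.7 option
```

With these illustrative numbers, matching earns about 0.58 per trial in expectation, while maximization earns 0.7.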

Reinforcement Learning (RL)

Replay For Safety

no code implementations • 8 Dec 2021 • Liran Szlak, Ohad Shamir

Experience replay (Lin, 1993; Mnih et al., 2015) is a widely used technique to achieve efficient use of data and improved performance in RL algorithms.

Q-Learning

Convergence Results For Q-Learning With Experience Replay

no code implementations • 8 Dec 2021 • Liran Szlak, Ohad Shamir

A commonly used heuristic in RL is experience replay (e.g., Lin, 1993; Mnih et al., 2015), in which a learner stores and re-uses past trajectories as if they were sampled online.
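
The store-and-re-use idea can be sketched as a small buffer paired with a tabular Q-learning update. This is a minimal illustration, not the scheme analyzed in the paper: the class and function names, the step size, and the discount factor are all assumptions, and single transitions are stored rather than whole trajectories to keep the sketch short.

```python
import random
from collections import defaultdict, deque


class ReplayBuffer:
    """Fixed-capacity store of past transitions (hypothetical helper)."""

    def __init__(self, capacity):
        # The oldest transitions are dropped once capacity is reached.
        self.transitions = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.transitions.append((state, action, reward, next_state))

    def sample(self, rng):
        # Draw one stored transition uniformly at random.
        return rng.choice(list(self.transitions))


def q_update(q, transition, actions, step_size=0.5, discount=0.9):
    """Apply one Q-learning update, treating the transition as if sampled online."""
    state, action, reward, next_state = transition
    target = reward + discount * max(q[(next_state, a)] for a in actions)
    q[(state, action)] += step_size * (target - q[(state, action)])


# Usage: store one transition, then replay it.
q_table = defaultdict(float)
buffer = ReplayBuffer(capacity=100)
buffer.store(state=0, action=1, reward=1.0, next_state=1)
q_update(q_table, buffer.sample(random.Random(0)), actions=[0, 1])
```

Uniform sampling is only one possible replay rule; it is used here because it is the simplest to state.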

Q-Learning

Online Learning with Local Permutations and Delayed Feedback

no code implementations • ICML 2017 • Ohad Shamir, Liran Szlak

In this paper, we consider the applicability of this setting to convex online learning with delayed feedback, in which the feedback on the prediction made in round $t$ arrives with some delay $\tau$.

Multi-Player Bandits -- a Musical Chairs Approach

no code implementations • 9 Dec 2015 • Jonathan Rosenski, Ohad Shamir, Liran Szlak

We consider a variant of the stochastic multi-armed bandit problem, where multiple players simultaneously choose from the same set of arms and may collide, receiving no reward.
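
The collision rule can be sketched for a single round as follows. The function name and the sample values are hypothetical, and the sketch assumes the per-arm rewards for the round have already been drawn; it does not implement the paper's algorithm, only the reward rule described above.

```python
from collections import Counter


def round_rewards(choices, arm_rewards):
    """Return each player's reward for one round.

    choices[i] is the arm picked by player i; arm_rewards[k] is the reward
    drawn for arm k this round. Players who pick the same arm collide and
    receive no reward.
    """
    counts = Counter(choices)
    return [arm_rewards[arm] if counts[arm] == 1 else 0.0 for arm in choices]


# Hypothetical round: players 1 and 2 both pick arm 1 and collide.
rewards = round_rewards(choices=[0, 1, 1], arm_rewards=[0.9, 0.5, 0.2])
```

In this illustrative round only player 0, who is alone on arm 0, receives a reward.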
