Search Results for author: Otmane Sakhi

Found 8 papers, 0 papers with code

Fast Slate Policy Optimization: Going Beyond Plackett-Luce

no code implementations • 3 Aug 2023 • Otmane Sakhi, David Rohde, Nicolas Chopin

We compare our method to the commonly adopted Plackett-Luce policy class and demonstrate the effectiveness of our approach on problems with action space sizes in the order of millions.

Information Retrieval, Recommendation Systems, +1

PAC-Bayesian Offline Contextual Bandits With Guarantees

no code implementations • 24 Oct 2022 • Otmane Sakhi, Pierre Alquier, Nicolas Chopin

This paper introduces a new principled approach for off-policy learning in contextual bandits.

Generalization Bounds, Multi-Armed Bandits

Offline Evaluation of Reward-Optimizing Recommender Systems: The Case of Simulation

no code implementations • 18 Sep 2022 • Imad Aouali, Amine Benhalloum, Martin Bompaire, Benjamin Heymann, Olivier Jeunen, David Rohde, Otmane Sakhi, Flavian Vasile

Naturally, the reason for this is that we can directly measure utility metrics that rely on interventions, being the recommendations that are being shown to users.

counterfactual, Recommendation Systems

Probabilistic Rank and Reward: A Scalable Model for Slate Recommendation

no code implementations • 10 Aug 2022 • Imad Aouali, Achraf Ait Sidi Hammou, Sergey Ivanov, Otmane Sakhi, David Rohde, Flavian Vasile

We introduce Probabilistic Rank and Reward (PRR), a scalable probabilistic model for personalized slate recommendation.

Recommendation Systems

Fast Offline Policy Optimization for Large Scale Recommendation

no code implementations • 8 Aug 2022 • Otmane Sakhi, David Rohde, Alexandre Gilotte

Personalised interactive systems such as recommender systems require selecting relevant items from massive catalogs dependent on context.

Recommendation Systems

Improving Offline Contextual Bandits with Distributional Robustness

no code implementations • 13 Nov 2020 • Otmane Sakhi, Louis Faury, Flavian Vasile

Our approach relies on the construction of asymptotic confidence intervals for offline contextual bandits through the DRO framework.

counterfactual, Multi-Armed Bandits, +1

BLOB: A Probabilistic Model for Recommendation that Combines Organic and Bandit Signals

no code implementations • 28 Aug 2020 • Otmane Sakhi, Stephen Bonner, David Rohde, Flavian Vasile

In contrast, the organic signal is typically strong and covers most items, but is not always relevant to the recommendation task.

Recommendation Systems

Reconsidering Analytical Variational Bounds for Output Layers of Deep Networks

no code implementations • 2 Oct 2019 • Otmane Sakhi, Stephen Bonner, David Rohde, Flavian Vasile

The combination of the re-parameterization trick with the use of variational auto-encoders has caused a sensation in Bayesian deep learning, allowing the training of realistic generative models of images and has considerably increased our ability to use scalable latent variable models.

Binary Classification, General Classification, +1
