Search Results for author: Pratik Gajane

Found 13 papers, 3 papers with code

Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards

no code implementations • 13 Nov 2022 • Ronald C. van den Broek, Rik Litjens, Tobias Sagis, Luc Siecker, Nina Verbeeke, Pratik Gajane

In this paper, we introduce a general formulation of how an arm's cumulative reward is distributed across several rounds, called the Beta-spread property.

Multi-Armed Bandits

Survey on Fair Reinforcement Learning: Theory and Practice

no code implementations • 20 May 2022 • Pratik Gajane, Akrati Saxena, Maryam Tavakol, George Fletcher, Mykola Pechenizkiy

In this article, we provide an extensive overview of fairness approaches that have been implemented via a reinforcement learning (RL) framework.

Decision Making, Fairness

The Impact of Batch Learning in Stochastic Linear Bandits

1 code implementation • 14 Feb 2022 • Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

Our main theoretical results show that the impact of batch learning is a multiplicative factor of batch size relative to the regret of online behavior.

Autonomous exploration for navigating in non-stationary CMPs

no code implementations • 18 Oct 2019 • Pratik Gajane, Ronald Ortner, Peter Auer, Csaba Szepesvari

We consider a setting in which the objective is to learn to navigate in a controlled Markov process (CMP) where transition probabilities may abruptly change.

Navigate

A Sliding-Window Algorithm for Markov Decision Processes with Arbitrarily Changing Rewards and Transitions

no code implementations • 25 May 2018 • Pratik Gajane, Ronald Ortner, Peter Auer

We consider reinforcement learning in changing Markov Decision Processes where both the state-transition probabilities and the reward functions may vary over time.

reinforcement-learning

Counterfactual Learning for Machine Translation: Degeneracies and Solutions

no code implementations • 23 Nov 2017 • Carolin Lawrence, Pratik Gajane, Stefan Riezler

Counterfactual learning is a natural scenario to improve web-based machine translation services by offline learning from feedback logged during user interactions.

Machine Translation, Translation

On Formalizing Fairness in Prediction with Machine Learning

no code implementations • 9 Oct 2017 • Pratik Gajane, Mykola Pechenizkiy

Machine learning algorithms for prediction are increasingly being used in critical decisions affecting human lives.

BIG-bench Machine Learning, Fairness

Corrupt Bandits for Preserving Local Privacy

no code implementations • 16 Aug 2017 • Pratik Gajane, Tanguy Urvoy, Emilie Kaufmann

In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of a transformation of these rewards through a stochastic corruption process with known parameters.

Recommendation Systems

A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits

no code implementations • 15 Jan 2016 • Pratik Gajane, Tanguy Urvoy, Fabrice Clérot

We study the K-armed dueling bandit problem, which is a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms.

Information Retrieval, Retrieval

Utility-based Dueling Bandits as a Partial Monitoring Game

no code implementations • 10 Jul 2015 • Pratik Gajane, Tanguy Urvoy

Partial monitoring is a generic framework for sequential decision-making with incomplete feedback.

Decision Making, online learning
