Search Results for author: Pratik Gajane

Found 18 papers, 3 papers with code

Investigating Gender Fairness in Machine Learning-driven Personalized Care for Chronic Pain

no code implementations • 29 Feb 2024 • Pratik Gajane, Sean Newman, Mykola Pechenizkiy, John D. Piette

In this article, we study gender fairness in personalized pain care recommendations using a real-world application of reinforcement learning (Piette et al., 2022a).

Decision Making Fairness +4

Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need

no code implementations • 27 Sep 2023 • Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

We present a new algorithm based on posterior sampling for learning in constrained Markov decision processes (CMDP) in the infinite-horizon undiscounted setting.

Efficient Exploration

Multi-Armed Bandits with Generalized Temporally-Partitioned Rewards

no code implementations • 1 Mar 2023 • Ronald C. van den Broek, Rik Litjens, Tobias Sagis, Luc Siecker, Nina Verbeeke, Pratik Gajane

In some real-world applications, feedback about a decision is delayed and may arrive via partial rewards that are observed with different delays.

Decision Making Multi-Armed Bandits

Curiosity-driven Exploration in Sparse-reward Multi-agent Reinforcement Learning

no code implementations • 21 Feb 2023 • Jiong Li, Pratik Gajane

Sparsity of rewards while applying a deep reinforcement learning method negatively affects its sample-efficiency.

Multi-agent Reinforcement Learning reinforcement-learning +1

Local Differential Privacy for Sequential Decision Making in a Changing Environment

no code implementations • 2 Jan 2023 • Pratik Gajane

We study the problem of preserving privacy while still providing high utility in sequential decision making scenarios in a changing environment.

Decision Making Multi-Armed Bandits

Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards

no code implementations • 13 Nov 2022 • Ronald C. van den Broek, Rik Litjens, Tobias Sagis, Luc Siecker, Nina Verbeeke, Pratik Gajane

In this paper, we introduce a general formulation of how an arm's cumulative reward is distributed across several rounds, called the Beta-spread property.

Multi-Armed Bandits

An Empirical Evaluation of Posterior Sampling for Constrained Reinforcement Learning

1 code implementation • 8 Sep 2022 • Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

We study a posterior sampling approach to efficient exploration in constrained reinforcement learning.

Efficient Exploration reinforcement-learning +1

Survey on Fair Reinforcement Learning: Theory and Practice

no code implementations • 20 May 2022 • Pratik Gajane, Akrati Saxena, Maryam Tavakol, George Fletcher, Mykola Pechenizkiy

In this article, we provide an extensive overview of fairness approaches that have been implemented via a reinforcement learning (RL) framework.

Decision Making Fairness +3

The Impact of Batch Learning in Stochastic Linear Bandits

1 code implementation • 14 Feb 2022 • Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

Our main theoretical results show that the impact of batch learning is a multiplicative factor of batch size relative to the regret of online behavior.

The Impact of Batch Learning in Stochastic Bandits

1 code implementation • 3 Nov 2021 • Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

We consider a special case of bandit problems, namely batched bandits.

Recommendation Systems

Autonomous exploration for navigating in non-stationary CMPs

no code implementations • 18 Oct 2019 • Pratik Gajane, Ronald Ortner, Peter Auer, Csaba Szepesvari

We consider a setting in which the objective is to learn to navigate in a controlled Markov process (CMP) where transition probabilities may abruptly change.

Navigate

Variational Regret Bounds for Reinforcement Learning

no code implementations • 14 May 2019 • Pratik Gajane, Ronald Ortner, Peter Auer

This is the first variational regret bound for the general reinforcement learning setting.

General Reinforcement Learning reinforcement-learning +1

A Sliding-Window Algorithm for Markov Decision Processes with Arbitrarily Changing Rewards and Transitions

no code implementations • 25 May 2018 • Pratik Gajane, Ronald Ortner, Peter Auer

We consider reinforcement learning in changing Markov Decision Processes where both the state-transition probabilities and the reward functions may vary over time.

reinforcement-learning Reinforcement Learning (RL)

Counterfactual Learning for Machine Translation: Degeneracies and Solutions

no code implementations • 23 Nov 2017 • Carolin Lawrence, Pratik Gajane, Stefan Riezler

Counterfactual learning is a natural scenario to improve web-based machine translation services by offline learning from feedback logged during user interactions.

counterfactual Machine Translation +1

On Formalizing Fairness in Prediction with Machine Learning

no code implementations • 9 Oct 2017 • Pratik Gajane, Mykola Pechenizkiy

Machine learning algorithms for prediction are increasingly being used in critical decisions affecting human lives.

BIG-bench Machine Learning Fairness

Corrupt Bandits for Preserving Local Privacy

no code implementations • 16 Aug 2017 • Pratik Gajane, Tanguy Urvoy, Emilie Kaufmann

In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on observing a transformation of these rewards through a stochastic corruption process with known parameters.

Recommendation Systems

A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits

no code implementations • 15 Jan 2016 • Pratik Gajane, Tanguy Urvoy, Fabrice Clérot

We study the K-armed dueling bandit problem, a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms.

Information Retrieval Retrieval

Utility-based Dueling Bandits as a Partial Monitoring Game

no code implementations • 10 Jul 2015 • Pratik Gajane, Tanguy Urvoy

Partial monitoring is a generic framework for sequential decision-making with incomplete feedback.

Decision Making
