Search Results for author: Ciara Pike-Burke

Found 10 papers, 1 paper with code

Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity

no code implementations • 2 Oct 2023 • Emmeran Johnson, Ciara Pike-Burke, Patrick Rebeschini

An algorithm is sample-efficient if it uses a number of queries $n$ to the environment that is polynomial in the dimension $d$ of the problem.

reinforcement-learning

Delayed Feedback in Kernel Bandits

no code implementations • 1 Feb 2023 • Sattar Vakili, Danyal Ahmed, Alberto Bernacchia, Ciara Pike-Burke

An abstraction of the problem can be formulated as a kernel-based bandit problem (also known as Bayesian optimisation), where a learner aims at optimising a kernelized function through sequential noisy observations.

Bayesian Optimisation, Recommendation Systems

Delayed Feedback in Generalised Linear Bandits Revisited

no code implementations • 21 Jul 2022 • Benjamin Howson, Ciara Pike-Burke, Sarah Filippi

However, the stringent requirement for immediate rewards is unmet in many real-world applications where the reward is almost always delayed.

Decision Making

Bandit problems with fidelity rewards

no code implementations • 25 Nov 2021 • Gábor Lugosi, Ciara Pike-Burke, Pierre-André Savalle

The fidelity bandits problem is a variant of the $K$-armed bandit problem in which the reward of each arm is augmented by a fidelity reward that provides the player with an additional payoff depending on how 'loyal' the player has been to that arm in the past.

Optimism and Delays in Episodic Reinforcement Learning

no code implementations • 15 Nov 2021 • Benjamin Howson, Ciara Pike-Burke, Sarah Filippi

In this paper, we study the impact of delayed feedback in episodic reinforcement learning from a theoretical perspective and propose two general-purpose approaches to handling the delays.

reinforcement-learning, Reinforcement Learning (RL)

Local Differential Privacy for Regret Minimization in Reinforcement Learning

no code implementations • NeurIPS 2021 • Evrard Garcelon, Vianney Perchet, Ciara Pike-Burke, Matteo Pirotta

Motivated by this, we study privacy in the context of finite-horizon Markov Decision Processes (MDPs) by requiring information to be obfuscated on the user side.

reinforcement-learning, Reinforcement Learning (RL)

A Unifying View of Optimism in Episodic Reinforcement Learning

no code implementations • NeurIPS 2020 • Gergely Neu, Ciara Pike-Burke

The principle of optimism in the face of uncertainty underpins many theoretically successful reinforcement learning algorithms.

reinforcement-learning, Reinforcement Learning (RL)

Recovering Bandits

1 code implementation • NeurIPS 2019 • Ciara Pike-Burke, Steffen Grünewälder

We study the recovering bandits problem, a variant of the stochastic multi-armed bandit problem where the expected reward of each arm varies according to some unknown function of the time since the arm was last played.

Computational Efficiency, Gaussian Processes

Bandits with Delayed, Aggregated Anonymous Feedback

no code implementations • ICML 2018 • Ciara Pike-Burke, Shipra Agrawal, Csaba Szepesvári, Steffen Grünewälder

In this problem, when the player pulls an arm, a reward is generated; however, it is not immediately observed.
