Search Results for author: Ronan Fruit

Found 9 papers, 3 papers with code

Improved Analysis of UCRL2 with Empirical Bernstein Inequality

no code implementations 10 Jul 2020 Ronan Fruit, Matteo Pirotta, Alessandro Lazaric

We consider the problem of exploration-exploitation in communicating Markov Decision Processes.

Concentration Inequalities for Multinoulli Random Variables

no code implementations 30 Jan 2020 Jian Qian, Ronan Fruit, Matteo Pirotta, Alessandro Lazaric

We investigate concentration inequalities for Dirichlet and Multinomial random variables.

Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs

1 code implementation NeurIPS 2019 Jian Qian, Ronan Fruit, Matteo Pirotta, Alessandro Lazaric

The exploration bonus is an effective approach to manage the exploration-exploitation trade-off in Markov Decision Processes (MDPs).

Regret Bounds for Learning State Representations in Reinforcement Learning

no code implementations NeurIPS 2019 Ronald Ortner, Matteo Pirotta, Alessandro Lazaric, Ronan Fruit, Odalric-Ambrym Maillard

We consider the problem of online reinforcement learning when several state representations (mapping histories to a discrete state space) are available to the learning agent.

Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes

no code implementations 11 Dec 2018 Jian Qian, Ronan Fruit, Matteo Pirotta, Alessandro Lazaric

We introduce and analyse two algorithms for exploration-exploitation in discrete and continuous Markov Decision Processes (MDPs) based on exploration bonuses.

Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes

1 code implementation NeurIPS 2018 Ronan Fruit, Matteo Pirotta, Alessandro Lazaric

While designing the state space of an MDP, it is common to include states that are transient or not reachable by any policy (e.g., in mountain car, the product space of speed and position contains configurations that are not physically reachable).

Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning

1 code implementation ICML 2018 Ronan Fruit, Matteo Pirotta, Alessandro Lazaric, Ronald Ortner

We introduce SCAL, an algorithm designed to perform efficient exploration-exploitation in any unknown weakly-communicating Markov decision process (MDP) for which an upper bound $c$ on the span of the optimal bias function is known.

Regret Minimization in MDPs with Options without Prior Knowledge

no code implementations NeurIPS 2017 Ronan Fruit, Matteo Pirotta, Alessandro Lazaric, Emma Brunskill

The option framework integrates temporal abstraction into the reinforcement learning model through the introduction of macro-actions (i.e., options).

Exploration-Exploitation in MDPs with Options

no code implementations 25 Mar 2017 Ronan Fruit, Alessandro Lazaric

While a large body of empirical results shows that temporally-extended actions and options may significantly affect the learning performance of an agent, the theoretical understanding of how and when options can be beneficial in online reinforcement learning is relatively limited.
