no code implementations • 19 Mar 2024 • Edoardo Cetin, Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric, Yann Ollivier, Ahmed Touati
Offline reinforcement learning algorithms have proven effective on datasets highly connected to the target downstream task.
no code implementations • 31 Oct 2023 • Aymen Al-Marjani, Andrea Tirinzoni, Emilie Kaufmann
In this paper, we propose the first instance-dependent lower bound on the sample complexity required for the PAC identification of a near-optimal policy in any tabular episodic MDP.
no code implementations • 23 Jun 2023 • Aymen Al-Marjani, Andrea Tirinzoni, Emilie Kaufmann
In particular, we obtain a simple algorithm for PAC reward-free exploration with an instance-dependent sample complexity that, in certain MDPs which are "easy to explore", is lower than the minimax one.
no code implementations • 7 Feb 2023 • Liyu Chen, Andrea Tirinzoni, Alessandro Lazaric, Matteo Pirotta
We leverage these results to design Layered Autonomous Exploration (LAE), a novel algorithm for AX that attains a sample complexity of $\tilde{\mathcal{O}}(LS^{\rightarrow}_{L(1+\epsilon)}\Gamma_{L(1+\epsilon)} A \ln^{12}(S^{\rightarrow}_{L(1+\epsilon)})/\epsilon^2)$, where $S^{\rightarrow}_{L(1+\epsilon)}$ is the number of states that are incrementally $L(1+\epsilon)$-controllable, $A$ is the number of actions, and $\Gamma_{L(1+\epsilon)}$ is the branching factor of the transitions over such states.
no code implementations • 19 Dec 2022 • Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric
In contextual linear bandits, the reward function is assumed to be a linear combination of an unknown reward vector and a given embedding of context-arm pairs.
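The linear reward model described above can be sketched in a few lines; this is a minimal illustration, not code from the paper, and the embedding `phi` and the values of `theta` are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

d = 4                          # embedding dimension (hypothetical)
theta = rng.normal(size=d)     # unknown reward vector the learner must estimate

def phi(context, arm):
    """Hypothetical embedding of a context-arm pair into R^d."""
    v = np.zeros(d)
    v[arm % d] = context
    return v

def expected_reward(context, arm):
    # Linear model: the mean reward is the inner product <theta, phi(x, a)>.
    return float(theta @ phi(context, arm))

# The learner only observes noisy rewards centered at this linear mean.
r = expected_reward(context=1.0, arm=2) + rng.normal(scale=0.1)
```

A bandit algorithm such as LinUCB estimates `theta` from these noisy observations while choosing arms.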
no code implementations • 24 Oct 2022 • Andrea Tirinzoni, Matteo Papini, Ahmed Touati, Alessandro Lazaric, Matteo Pirotta
We study the problem of representation learning in stochastic contextual linear bandits.
no code implementations • 10 Oct 2022 • Liyu Chen, Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric
We also initiate the study of learning $\epsilon$-optimal policies without access to a generative model (i.e., the so-called best-policy identification problem), and show that sample-efficient learning is impossible in general.
no code implementations • 12 Jul 2022 • Andrea Tirinzoni, Aymen Al-Marjani, Emilie Kaufmann
Optimistic algorithms have been extensively studied for regret minimization in episodic tabular MDPs, both from a minimax and an instance-dependent view.
1 code implementation • 22 May 2022 • Andrea Tirinzoni, Rémy Degenne
Elimination algorithms for bandit identification, which prune the plausible correct answers sequentially until only one remains, are computationally convenient since they reduce the problem size over time.
no code implementations • 17 Mar 2022 • Andrea Tirinzoni, Aymen Al-Marjani, Emilie Kaufmann
In probably approximately correct (PAC) reinforcement learning (RL), an agent is required to identify an $\epsilon$-optimal policy with probability $1-\delta$.
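The $(\epsilon, \delta)$-PAC criterion above can be made concrete with a small helper; this is an illustrative sketch only (the function name and values are hypothetical), and the probability-$1-\delta$ part refers to the algorithm's randomness, which a single check cannot capture:

```python
def is_epsilon_optimal(policy_value, optimal_value, epsilon):
    """PAC success event for one run: the returned policy's value is
    within epsilon of the optimal value. An (epsilon, delta)-PAC
    algorithm guarantees this holds with probability at least 1 - delta."""
    return policy_value >= optimal_value - epsilon

# Example: a policy with value 0.95 is 0.1-optimal when the optimum is 1.0.
ok = is_epsilon_optimal(0.95, 1.0, 0.1)
```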
1 code implementation • NeurIPS 2021 • Clémence Réda, Andrea Tirinzoni, Rémy Degenne
In this work, we first derive a tractable lower bound on the sample complexity of any $\delta$-correct algorithm for the general Top-m identification problem.
no code implementations • NeurIPS 2021 • Matteo Papini, Andrea Tirinzoni, Aldo Pacchiano, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta
We study the role of the representation of state-action value functions in regret minimization in finite-horizon Markov Decision Processes (MDPs) with linear structure.
no code implementations • 24 Jun 2021 • Andrea Tirinzoni, Matteo Pirotta, Alessandro Lazaric
We derive a novel asymptotic problem-dependent lower-bound for regret minimization in finite-horizon tabular Markov Decision Processes (MDPs).
1 code implementation • 18 May 2021 • Riccardo Poiani, Andrea Tirinzoni, Marcello Restelli
At test time, TRIO tracks the evolution of the latent parameters online, hence reducing the uncertainty over future tasks and obtaining fast adaptation through the meta-learned policy.
no code implementations • 8 Apr 2021 • Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta
We show that the regret is indeed never worse than the regret obtained by running LinUCB on the best representation (up to a $\ln M$ factor).
no code implementations • NeurIPS 2020 • Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric
Finally, we remove forced exploration and build on confidence intervals of the optimization problem to encourage a minimum level of exploration that is better adapted to the problem structure.
no code implementations • ICML 2020 • Andrea Tirinzoni, Riccardo Poiani, Marcello Restelli
We are interested in how to design reinforcement learning agents that provably reduce the sample complexity for learning new tasks by transferring knowledge from previously-solved ones.
no code implementations • 23 May 2020 • Andrea Tirinzoni, Alessandro Lazaric, Marcello Restelli
We study finite-armed stochastic bandits where the rewards of each arm might be correlated to those of other arms.
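One way correlation across arms can arise is through a shared latent parameter; the sketch below is a hypothetical illustration of that setting (the feature values and noise scale are made up), showing why a pull of one arm is informative about the others:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared latent parameter: all arm means depend on it, so their
# rewards are correlated rather than independent.
latent = rng.normal()
arm_features = np.array([0.5, 1.0, 1.5])   # hypothetical per-arm features
arm_means = arm_features * latent

# One noisy sample per arm; observing any arm narrows down `latent`,
# and hence the means of the unobserved arms.
samples = arm_means + rng.normal(scale=0.1, size=3)
```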
no code implementations • 9 Sep 2019 • Pierluca D'Oro, Alberto Maria Metelli, Andrea Tirinzoni, Matteo Papini, Marcello Restelli
In this paper, we introduce a novel model-based policy search approach that exploits the knowledge of the current agent policy to learn an approximate transition model, focusing on the portions of the environment that are most relevant for policy improvement.
1 code implementation • 17 Jul 2019 • Mario Beraha, Alberto Maria Metelli, Matteo Papini, Andrea Tirinzoni, Marcello Restelli
Mutual information has been successfully adopted in filter feature-selection methods to assess both the relevancy of a subset of features in predicting the target variable and the redundancy with respect to other variables.
no code implementations • NeurIPS 2018 • Andrea Tirinzoni, Rafael Rodriguez Sanchez, Marcello Restelli
We consider the problem of transferring value functions in reinforcement learning.
no code implementations • NeurIPS 2018 • Andrea Tirinzoni, Marek Petrik, Xiangli Chen, Brian Ziebart
What policy should be employed in a Markov decision process with uncertain parameters?
no code implementations • ICML 2018 • Andrea Tirinzoni, Andrea Sessa, Matteo Pirotta, Marcello Restelli
In the proposed approach, all the samples are transferred and used by a batch RL algorithm to solve the target task, but their contribution to the learning process is proportional to their importance weight.
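The importance-weighting idea above can be sketched with a weighted estimate over transferred samples; this is a minimal illustration under made-up data, not the paper's algorithm, and the weights here are random placeholders for the learned importance weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical rewards from transferred source-task samples, plus an
# importance weight per sample measuring its relevance to the target task.
rewards = rng.normal(loc=1.0, scale=0.5, size=100)
weights = rng.uniform(0.0, 1.0, size=100)   # placeholder importance weights

# Each sample contributes to the target-task estimate in proportion
# to its importance weight (a weighted average here, for illustration).
weighted_estimate = np.sum(weights * rewards) / np.sum(weights)
```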