no code implementations • 25 Jul 2022 • Riccardo Poiani, Ciprian Stirbu, Alberto Maria Metelli, Marcello Restelli

With the continuous growth of the global economy and markets, resource imbalance has risen to be one of the central issues in real logistic scenarios.

1 code implementation • 8 Jul 2022 • Julen Cestero, Marco Quartulli, Alberto Maria Metelli, Marcello Restelli

Warehouse Management Systems have been evolving and improving thanks to new Data Intelligence techniques.

no code implementations • 3 Jun 2022 • Sancho Salcedo-Sanz, Jorge Pérez-Aracil, Guido Ascenso, Javier Del Ser, David Casillas-Pérez, Christopher Kadow, Dusan Fister, David Barriopedro, Ricardo García-Herrera, Marcello Restelli, Mateo Giuliani, Andrea Castelletti

The accurate prediction, characterization, and attribution of atmospheric extreme events (EEs) is therefore a key research field, in which many groups are currently working by applying different methodologies and computational tools.

no code implementations • 1 Jun 2022 • Giulia Romano, Andrea Agostini, Francesco Trovò, Nicola Gatti, Marcello Restelli

We provide two algorithms to address TP-MAB problems, namely, TP-UCB-FR and TP-UCB-EW, which exploit the partial information disclosed by the reward collected over time.

1 code implementation • 20 May 2022 • Marco Mussi, Davide Lombarda, Alberto Maria Metelli, Francesco Trovò, Marcello Restelli

In this work, we propose a general and flexible framework, namely ARLO: Automated Reinforcement Learning Optimizer, to construct automated pipelines for AutoRL.

no code implementations • 11 May 2022 • Pierre Liotet, Davide Maran, Lorenzo Bisi, Marcello Restelli

When the agent's observations or interactions are delayed, classic reinforcement learning tools usually fail.

no code implementations • ICML Workshop URL 2021 • Mirco Mutti, Stefano Del Col, Marcello Restelli

In this paper, we seek a reward-free compression of the policy space into a finite set of representative policies, such that, given any policy $\pi$, the minimum Rényi divergence between the state-action distributions of the representative policies and the state-action distribution of $\pi$ is bounded.

no code implementations • 14 Feb 2022 • Mirco Mutti, Riccardo De Santi, Emanuele Rossi, Juan Felipe Calderon, Michael Bronstein, Marcello Restelli

In this setting, the agent can take a finite number of reward-free interactions from a subset of these environments.

no code implementations • ICML Workshop URL 2021 • Mirco Mutti, Riccardo De Santi, Marcello Restelli

In the maximum state entropy exploration framework, an agent interacts with a reward-free environment to learn a policy that maximizes the entropy of the expected state visitations it is inducing.

no code implementations • 3 Feb 2022 • Mirco Mutti, Riccardo De Santi, Piersilvio De Bartolomeis, Marcello Restelli

In particular, we show that erroneously optimizing the infinite trials objective in place of the actual finite trials one, as is usually done, can lead to a significant approximation error.

2 code implementations • 16 Dec 2021 • Mirco Mutti, Mattia Mancassola, Marcello Restelli

Along this line, we address the problem of unsupervised reinforcement learning in a class of multiple environments, in which the policy is pre-trained with interactions from the whole class, and then fine-tuned for several tasks in any environment of the class.

no code implementations • 13 Dec 2021 • Pierre Liotet, Francesco Vidaich, Alberto Maria Metelli, Marcello Restelli

This hyper-policy is trained to maximize the estimated future performance, efficiently reusing past data by means of importance sampling, at the cost of introducing a controlled bias.

1 code implementation • NeurIPS 2021 • Alberto Maria Metelli, Alessio Russo, Marcello Restelli

Importance Sampling (IS) is a widely used building block for a large variety of off-policy estimation and learning algorithms.

no code implementations • NeurIPS 2021 • Giorgia Ramponi, Alberto Maria Metelli, Alessandro Concetti, Marcello Restelli

This presupposes that the two actors have the same reward functions.

no code implementations • NeurIPS 2021 • Matteo Papini, Andrea Tirinzoni, Aldo Pacchiano, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta

We study the role of the representation of state-action value functions in regret minimization in finite-horizon Markov Decision Processes (MDPs) with linear structure.

no code implementations • 29 Sep 2021 • Alberto Maria Metelli, Samuele Meta, Marcello Restelli

In this setting, Importance Sampling (IS) is typically employed as a what-if analysis tool, with the goal of estimating the performance of a target policy, given samples collected with a different behavioral policy.
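
The what-if use of importance sampling described above can be illustrated with a minimal sketch. It is not the estimator proposed in the paper: it is the ordinary (unweighted) IS estimator on a hypothetical two-action bandit, and the names `is_estimate`, `behavior_prob`, `target_prob`, and `mean_reward` are assumptions introduced only for this example.

```python
import random


def is_estimate(samples, target_prob, behavior_prob):
    """Ordinary importance sampling estimate of the target policy's expected reward.

    samples: list of (action, reward) pairs collected under the behavioral policy.
    target_prob / behavior_prob: dicts mapping each action to its probability.
    """
    total = 0.0
    for action, reward in samples:
        # Reweight each reward by how much more (or less) likely the target
        # policy is to pick this action than the behavioral policy.
        weight = target_prob[action] / behavior_prob[action]
        total += weight * reward
    return total / len(samples)


# Hypothetical two-action problem with deterministic rewards (illustration only).
behavior_prob = {"a": 0.5, "b": 0.5}
target_prob = {"a": 0.9, "b": 0.1}
mean_reward = {"a": 1.0, "b": 0.0}

# Collect data with the behavioral policy only.
rng = random.Random(0)
samples = []
for _ in range(10_000):
    action = "a" if rng.random() < behavior_prob["a"] else "b"
    samples.append((action, mean_reward[action]))

estimate = is_estimate(samples, target_prob, behavior_prob)
true_value = sum(target_prob[a] * mean_reward[a] for a in target_prob)
print(f"IS estimate: {estimate:.3f}, true value: {true_value:.3f}")
```

The estimate is unbiased as long as the behavioral policy assigns non-zero probability to every action the target policy can take; its variance grows as the two policies diverge.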

no code implementations • ICLR 2022 • Lorenzo Moro, Amarildo Likmeta, Enrico Prati, Marcello Restelli

It has been extended from complex continuous domains through function approximators to bias the search of the planning tree in AlphaZero.

no code implementations • ICML Workshop URL 2021 • Mirco Mutti, Mattia Mancassola, Marcello Restelli

Along this line, we address the problem of learning to explore a class of multiple reward-free environments with a unique general strategy, which aims to provide a universal initialization to subsequent reinforcement learning problems specified over the same class.

no code implementations • ICML Workshop AutoML 2021 • Luca Sabbioni, Francesco Corda, Marcello Restelli

Policy-based algorithms are among the most widely adopted techniques in model-free RL, thanks to their strong theoretical groundings and good properties in continuous action spaces.

1 code implementation • 18 May 2021 • Riccardo Poiani, Andrea Tirinzoni, Marcello Restelli

At test time, TRIO tracks the evolution of the latent parameters online, hence reducing the uncertainty over future tasks and obtaining fast adaptation through the meta-learned policy.

no code implementations • 8 Apr 2021 • Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta

We show that the regret is indeed never worse than the regret obtained by running LinUCB on the best representation (up to a $\ln M$ factor).

no code implementations • 17 Mar 2021 • Conor F. Hayes, Roxana Rădulescu, Eugenio Bargiacchi, Johan Källström, Matthew Macfarlane, Mathieu Reymond, Timothy Verstraeten, Luisa M. Zintgraf, Richard Dazeley, Fredrik Heintz, Enda Howley, Athirai A. Irissappane, Patrick Mannion, Ann Nowé, Gabriel Ramos, Marcello Restelli, Peter Vamplew, Diederik M. Roijers

Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives.

no code implementations • ICLR Workshop SSL-RL 2021 • Mirco Mutti, Mattia Mancassola, Marcello Restelli

Along this line, we address the problem of learning to explore a class of multiple reward-free environments with a unique general strategy, which aims to provide a universal initialization to subsequent reinforcement learning problems specified over the same class.

no code implementations • 15 Dec 2020 • Alberto Maria Metelli, Matteo Papini, Pierluca D'Oro, Marcello Restelli

In this paper, we introduce the notion of mediator feedback that frames PO as an online learning problem over the policy space.

no code implementations • 23 Oct 2020 • Edoardo Vittori, Michele Trapletti, Marcello Restelli

In this paper we show how risk-averse reinforcement learning can be used to hedge options.

no code implementations • NeurIPS 2020 • Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric

Finally, we remove forced exploration and build on confidence intervals of the optimization problem to encourage a minimum level of exploration that is better adapted to the problem structure.

no code implementations • 15 Jul 2020 • Giorgia Ramponi, Marcello Restelli

In this paper, we propose NOHD (Newton Optimization on Helmholtz Decomposition), a Newton-like algorithm for multi-agent learning problems based on the decomposition of the dynamics of the system into its irrotational (Potential) and solenoidal (Hamiltonian) components.

no code implementations • NeurIPS 2020 • Giorgia Ramponi, Gianluca Drappo, Marcello Restelli

Inverse Reinforcement Learning addresses the problem of inferring an expert's reward function from demonstrations.

1 code implementation • 9 Jul 2020 • Mirco Mutti, Lorenzo Pratissoli, Marcello Restelli

In a reward-free environment, what is a suitable intrinsic objective for an agent to pursue so that it can learn an optimal task-agnostic exploration policy?

no code implementations • ICML 2020 • Andrea Tirinzoni, Riccardo Poiani, Marcello Restelli

We are interested in how to design reinforcement learning agents that provably reduce the sample complexity for learning new tasks by transferring knowledge from previously-solved ones.

1 code implementation • ICML Workshop LifelongML 2020 • Mirco Mutti, Lorenzo Pratissoli, Marcello Restelli

In a reward-free environment, what is a suitable intrinsic objective for an agent to pursue so that it can learn an optimal task-agnostic exploration policy?

no code implementations • 26 May 2020 • Giuseppe Canonaco, Andrea Soprani, Manuel Roveri, Marcello Restelli

In most of the transfer learning approaches to reinforcement learning (RL) the distribution over the tasks is assumed to be stationary.

no code implementations • 23 May 2020 • Andrea Tirinzoni, Alessandro Lazaric, Marcello Restelli

We study finite-armed stochastic bandits where the rewards of each arm might be correlated to those of other arms.

1 code implementation • ICLR 2020 • Carlo D'Eramo, Davide Tateo, Andrea Bonarini, Marcello Restelli, Jan Peters

We study the benefit of sharing representations among tasks to enable the effective use of deep neural networks in Multi-Task Reinforcement Learning.

no code implementations • 3 Mar 2020 • Alessandro Nuara, Francesco Trovò, Nicola Gatti, Marcello Restelli

We experimentally evaluate our algorithms with synthetic settings generated from real data from Yahoo!, and we present the results of the adoption of our algorithms in a real-world application with an average daily spend of 1,000 Euros for more than one year.

1 code implementation • ICML 2020 • Alberto Maria Metelli, Flavio Mazzolini, Lorenzo Bisi, Luca Sabbioni, Marcello Restelli

The choice of the control frequency of a system has a relevant impact on the ability of reinforcement learning algorithms to learn a highly performing policy.

2 code implementations • 4 Jan 2020 • Carlo D'Eramo, Davide Tateo, Andrea Bonarini, Marcello Restelli, Jan Peters

MushroomRL is an open-source Python library developed to simplify the process of implementing and running Reinforcement Learning (RL) experiments.

no code implementations • 6 Dec 2019 • Lorenzo Bisi, Luca Sabbioni, Edoardo Vittori, Matteo Papini, Marcello Restelli

In real-world decision-making problems, for instance in the fields of finance, robotics or autonomous driving, keeping uncertainty under control is as important as maximizing expected returns.

1 code implementation • NeurIPS 2019 • Alberto Maria Metelli, Amarildo Likmeta, Marcello Restelli

How does the uncertainty of the value function propagate when performing temporal difference learning?

no code implementations • 9 Sep 2019 • Pierluca D'Oro, Alberto Maria Metelli, Andrea Tirinzoni, Matteo Papini, Marcello Restelli

In this paper, we introduce a novel model-based policy search approach that exploits the knowledge of the current agent policy to learn an approximate transition model, focusing on the portions of the environment that are most relevant for policy improvement.

no code implementations • 9 Sep 2019 • Alberto Maria Metelli, Guglielmo Manneschi, Marcello Restelli

We study the problem of identifying the policy space of a learning agent, having access to a set of demonstrations generated by its optimal policy.

1 code implementation • 17 Jul 2019 • Mario Beraha, Alberto Maria Metelli, Matteo Papini, Andrea Tirinzoni, Marcello Restelli

Mutual information has been successfully adopted in filter feature-selection methods to assess both the relevancy of a subset of features in predicting the target variable and the redundancy with respect to other variables.

no code implementations • 10 Jul 2019 • Mirco Mutti, Marcello Restelli

What is a good exploration strategy for an agent that interacts with an environment in the absence of external rewards?

no code implementations • 8 May 2019 • Matteo Papini, Matteo Pirotta, Marcello Restelli

Policy Gradient (PG) algorithms are among the best candidates for the much-anticipated applications of reinforcement learning to real-world control tasks, such as robotics.

no code implementations • NeurIPS 2018 • Andrea Tirinzoni, Rafael Rodriguez Sanchez, Marcello Restelli

We consider the problem of transferring value functions in reinforcement learning.

2 code implementations • NeurIPS 2018 • Alberto Maria Metelli, Matteo Papini, Francesco Faccio, Marcello Restelli

Policy optimization is an effective reinforcement learning approach to solve continuous control tasks.

1 code implementation • ICML 2018 • Matteo Papini, Damiano Binaghi, Giuseppe Canonaco, Matteo Pirotta, Marcello Restelli

In this paper, we propose a novel reinforcement-learning algorithm consisting of a stochastic variance-reduced version of policy gradient for solving Markov Decision Processes (MDPs).

no code implementations • ICML 2018 • Alberto Maria Metelli, Mirco Mutti, Marcello Restelli

After introducing our approach and deriving some theoretical results, we present the experimental evaluation in two illustrative problems to show the benefits of the environment configurability on the performance of the learned policy.

no code implementations • ICML 2018 • Andrea Tirinzoni, Andrea Sessa, Matteo Pirotta, Marcello Restelli

In the proposed approach, all the samples are transferred and used by a batch RL algorithm to solve the target task, but their contribution to the learning process is proportional to their importance weight.

no code implementations • 9 Dec 2017 • Matteo Pirotta, Marcello Restelli

In this paper, we propose a novel approach to automatically determine the batch size in stochastic gradient descent methods.

no code implementations • NeurIPS 2017 • Matteo Papini, Matteo Pirotta, Marcello Restelli

Policy gradient methods are among the best Reinforcement Learning (RL) techniques to solve complex control problems.

no code implementations • NeurIPS 2017 • Alberto Maria Metelli, Matteo Pirotta, Marcello Restelli

Within this subspace, using a second-order criterion, we search for the reward function that most penalizes a deviation from the expert's policy.

no code implementations • ICML 2017 • Samuele Tosatto, Matteo Pirotta, Carlo D'Eramo, Marcello Restelli

This paper studies B-FQI, an Approximated Value Iteration (AVI) algorithm that exploits a boosting procedure to estimate the action-value function in reinforcement learning problems.

no code implementations • 17 Nov 2016 • Stefano Paladino, Francesco Trovò, Marcello Restelli, Nicola Gatti

We study, to the best of our knowledge, the first Bayesian algorithm for unimodal Multi-Armed Bandit (MAB) problems with graph structure.

no code implementations • NeurIPS 2014 • Daniele Calandriello, Alessandro Lazaric, Marcello Restelli

This is equivalent to assuming that the weight vectors of the task value functions are jointly sparse, i.e., the set of their non-zero components is small and it is shared across tasks.

no code implementations • 13 Jun 2014 • Matteo Pirotta, Simone Parisi, Marcello Restelli

The paper is about learning a continuous approximation of the Pareto frontier in Multi-Objective Markov Decision Problems (MOMDPs).

no code implementations • NeurIPS 2013 • Matteo Pirotta, Marcello Restelli, Luca Bascetta

In the last decade, policy gradient methods have significantly grown in popularity in the reinforcement learning field.

no code implementations • NeurIPS 2011 • Alessandro Lazaric, Marcello Restelli

Transfer reinforcement learning (RL) methods leverage the experience collected on a set of source tasks to speed up RL algorithms.
