Search Results for author: Samuele Tosatto

Found 11 papers, 5 papers with code

Deep Probabilistic Movement Primitives with a Bayesian Aggregator

no code implementations · 11 Jul 2023 · Michael Przystupa, Faezeh Haghverd, Martin Jagersand, Samuele Tosatto

Movement primitives are trainable parametric models that reproduce robotic movements starting from a limited set of demonstrations.
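The core idea, a compact parametric model fit to demonstrations and replayed, can be illustrated with a minimal sketch: a trajectory is encoded as a weighted sum of radial basis functions, and the weight vector is the primitive's parameters. The data, basis choice, and fitting procedure below are illustrative assumptions, not the paper's Bayesian aggregator.

```python
import numpy as np

def rbf_features(t, n_basis=10, width=0.02):
    """Radial-basis features over normalized time t in [0, 1]."""
    centers = np.linspace(0, 1, n_basis)
    phi = np.exp(-(t[:, None] - centers[None, :]) ** 2 / (2 * width))
    return phi / phi.sum(axis=1, keepdims=True)  # normalize rows

# A demonstrated 1-D trajectory (hypothetical data): a smooth reach.
t = np.linspace(0, 1, 100)
demo = np.sin(np.pi * t)

# Fit basis weights by least squares; w is the primitive's parameter vector.
Phi = rbf_features(t)
w, *_ = np.linalg.lstsq(Phi, demo, rcond=None)

# Reproduce the movement from the compact parameter vector w.
reproduced = Phi @ w
print(np.max(np.abs(reproduced - demo)))
```

Ten weights suffice to replay the 100-point demonstration almost exactly, which is the sense in which primitives are "trainable parametric models" learned from limited demonstrations.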

Dynamic Decision Frequency with Continuous Options

1 code implementation · 6 Dec 2022 · Amirmohammad Karimi, Jun Jin, Jun Luo, A. Rupam Mahmood, Martin Jagersand, Samuele Tosatto

In classic reinforcement learning algorithms, agents make decisions at discrete and fixed time intervals.

Continuous Control
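The fixed-interval setting the abstract describes can be contrasted with re-deciding only every k environment steps, an option-like action repeat. The one-dimensional environment and bang-bang policy below are hypothetical, chosen only to show how commitment length changes the return.

```python
def rollout(env_step, policy, horizon, repeat=1):
    """Roll out a policy that re-decides only every `repeat` environment steps."""
    s, total, a = 0.0, 0.0, None
    for t in range(horizon):
        if t % repeat == 0:          # decision point
            a = policy(s)
        s, r = env_step(s, a)
        total += r
    return total

# Hypothetical 1-D environment: actions nudge the state, reward is -|state|.
def env_step(s, a):
    s = s + 0.1 * a
    return s, -abs(s)

policy = lambda s: -1.0 if s > 0 else 1.0

# Classic setting: a decision every step (repeat=1) vs. a coarser frequency.
t1 = rollout(env_step, policy, horizon=50, repeat=1)
t5 = rollout(env_step, policy, horizon=50, repeat=5)
print(t1, t5)
```

With repeat=5 the agent overshoots the origin before it can re-decide, so the return degrades: deciding *when* to decide matters, which is the trade-off the paper's continuous options address.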

A Temporal-Difference Approach to Policy Gradient Estimation

1 code implementation · 4 Feb 2022 · Samuele Tosatto, Andrew Patterson, Martha White, A. Rupam Mahmood

The policy gradient theorem (Sutton et al., 2000) prescribes the usage of a cumulative discounted state distribution under the target policy to approximate the gradient.
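For reference, the theorem the abstract cites can be written out; the discounted state distribution it mentions is the cumulative discounted visitation measure under the target policy:

```latex
\nabla_\theta J(\theta)
  = \sum_{s} d^{\pi}_{\gamma}(s) \sum_{a} \nabla_\theta \pi(a \mid s; \theta)\, q^{\pi}(s, a),
\qquad
d^{\pi}_{\gamma}(s) = \sum_{t=0}^{\infty} \gamma^{t} \Pr(S_t = s \mid \pi),
```

where $q^{\pi}$ is the action-value function of the target policy $\pi$. Approximating $d^{\pi}_{\gamma}$ is what makes the gradient hard to estimate off-policy.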

An Alternate Policy Gradient Estimator for Softmax Policies

1 code implementation · 22 Dec 2021 · Shivam Garg, Samuele Tosatto, Yangchen Pan, Martha White, A. Rupam Mahmood

Policy gradient (PG) estimators are ineffective at dealing with softmax policies that are sub-optimally saturated, i.e., when the policy concentrates its probability mass on sub-optimal actions.
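Saturation can be seen numerically in a two-armed bandit (hypothetical numbers, not from the paper): when the softmax puts almost all mass on the zero-reward arm, the exact expected-reward gradient is of the same tiny order as the probability of the good arm, so vanilla PG updates barely move the policy.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

# Two-armed bandit: action 0 is optimal (reward 1), action 1 pays 0.
r = np.array([1.0, 0.0])

# Saturated logits: nearly all probability mass on the sub-optimal action 1.
theta = np.array([0.0, 14.0])
pi = softmax(theta)

# Exact expected-reward gradient for a softmax policy:
# dJ/dtheta_i = pi_i * (r_i - J), with J = sum_a pi_a r_a.
J = pi @ r
grad = pi * (r - J)
print(pi[0])                 # optimal action almost never taken
print(np.abs(grad).max())    # gradient is vanishingly small
```

Both printed values are on the order of 1e-6, so gradient ascent escapes the saturated region extremely slowly; this is the failure mode an alternate estimator targets.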

Model-free Policy Learning with Reward Gradients

1 code implementation · 9 Mar 2021 · Qingfeng Lan, Samuele Tosatto, Homayoon Farrahi, A. Rupam Mahmood

As a key component in reinforcement learning, the reward function is usually devised carefully to guide the agent.

Continuous Control · Policy Gradient Methods

Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient

no code implementations · 27 Oct 2020 · Samuele Tosatto, João Carvalho, Jan Peters

Off-policy Reinforcement Learning (RL) holds the promise of better data efficiency as it allows sample reuse and potentially enables safe interaction with the environment.

Policy Gradient Methods · Reinforcement Learning · +1

Dimensionality Reduction of Movement Primitives in Parameter Space

no code implementations · 26 Feb 2020 · Samuele Tosatto, Jonas Stadtmueller, Jan Peters

The empirical analysis shows that dimensionality reduction in parameter space is more effective than in configuration space, as it enables representing the movements with significantly fewer parameters.

Dimensionality Reduction

An Upper Bound of the Bias of Nadaraya-Watson Kernel Regression under Lipschitz Assumptions

no code implementations · 29 Jan 2020 · Samuele Tosatto, Riad Akrour, Jan Peters

The Nadaraya-Watson kernel estimator is among the most popular nonparametric regression techniques thanks to its simplicity.

Regression
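The estimator itself is just a kernel-weighted average of the observed targets; a minimal sketch with a Gaussian kernel on synthetic data (the bandwidth, kernel, and data are illustrative assumptions):

```python
import numpy as np

def nadaraya_watson(x_query, x, y, bandwidth=0.1):
    """Nadaraya-Watson estimate: a kernel-weighted average of targets y."""
    d = x_query[:, None] - x[None, :]
    w = np.exp(-d ** 2 / (2 * bandwidth ** 2))   # Gaussian kernel weights
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=200)

xq = np.linspace(0.1, 0.9, 5)
yhat = nadaraya_watson(xq, x, y, bandwidth=0.05)
print(np.max(np.abs(yhat - np.sin(2 * np.pi * xq))))
```

The residual error at the query points is the combined bias and variance of the estimator; the paper's contribution is an upper bound on the bias term under Lipschitz assumptions.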

A Nonparametric Off-Policy Policy Gradient

1 code implementation · 8 Jan 2020 · Samuele Tosatto, João Carvalho, Hany Abdulsamad, Jan Peters

Reinforcement learning (RL) algorithms still suffer from high sample complexity despite outstanding recent successes.

Density Estimation · Policy Gradient Methods · +1

Boosted Fitted Q-Iteration

no code implementations · ICML 2017 · Samuele Tosatto, Matteo Pirotta, Carlo D’Eramo, Marcello Restelli

This paper studies B-FQI, an Approximated Value Iteration (AVI) algorithm that exploits a boosting procedure to estimate the action-value function in reinforcement learning problems.
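The boosting idea, building the action-value function by adding fitted Bellman residuals, can be sketched on a toy 5-state chain MDP. For brevity the weak learner is replaced here by a shrunk exact residual; the environment and shrinkage factor are assumptions for illustration, not the paper's setup.

```python
import numpy as np

gamma, shrink = 0.9, 0.5
n_states, n_actions = 5, 2         # chain 0..4; actions: 0=left, 1=right

def step(s, a):
    s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
    r = 1.0 if (s2 == 4 and s != 4) else 0.0   # reward on reaching the goal
    return s2, r

Q = np.zeros((n_states, n_actions))
for _ in range(200):
    # Boosting round: fit the Bellman residual T Q - Q and add it to the
    # ensemble; here the "weak learner" is the residual itself, shrunk.
    TQ = np.zeros_like(Q)
    for s in range(n_states - 1):  # state 4 is absorbing with value 0
        for a in range(n_actions):
            s2, r = step(s, a)
            TQ[s, a] = r + gamma * Q[s2].max()
    Q += shrink * (TQ - Q)

print(Q[0, 1])   # value of moving right from the start, approx 0.9**3
```

Each round moves Q a fraction of the way toward the Bellman target, so the additive ensemble converges to the same fixed point as exact value iteration, just in smaller steps.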

