Search Results for author: Simone Parisi

Found 7 papers, 4 papers with code

Monitored Markov Decision Processes

1 code implementation 9 Feb 2024 Simone Parisi, Montaser Mohammedalamen, Alireza Kazemipour, Matthew E. Taylor, Michael Bowling

In this paper, we formalize a novel but general RL framework - Monitored MDPs - where the agent cannot always observe rewards.

Reinforcement Learning (RL)

The Unsurprising Effectiveness of Pre-Trained Vision Models for Control

no code implementations 7 Mar 2022 Simone Parisi, Aravind Rajeswaran, Senthil Purushwalkam, Abhinav Gupta

In this context, we revisit and study the role of pre-trained visual representations for control, and in particular representations trained on large-scale computer vision datasets.

Interesting Object, Curious Agent: Learning Task-Agnostic Exploration

1 code implementation NeurIPS 2021 Simone Parisi, Victoria Dean, Deepak Pathak, Abhinav Gupta

In this setup, the agent first learns to explore across many environments without any extrinsic goal in a task-agnostic manner.

Object

Long-Term Visitation Value for Deep Exploration in Sparse Reward Reinforcement Learning

1 code implementation 1 Jan 2020 Simone Parisi, Davide Tateo, Maximilian Hensel, Carlo D'Eramo, Jan Peters, Joni Pajarinen

Empirical results on classic and novel benchmarks show that the proposed approach outperforms existing methods in environments with sparse rewards, especially in the presence of rewards that create suboptimal modes of the objective function.

Benchmarking, reinforcement-learning

TD-Regularized Actor-Critic Methods

1 code implementation 19 Dec 2018 Simone Parisi, Voot Tangkaratt, Jan Peters, Mohammad Emtiyaz Khan

Actor-critic methods can achieve incredible performance on difficult reinforcement learning problems, but they are also prone to instability.

reinforcement-learning, Reinforcement Learning (RL)

Policy Search with High-Dimensional Context Variables

no code implementations 10 Nov 2016 Voot Tangkaratt, Herke van Hoof, Simone Parisi, Gerhard Neumann, Jan Peters, Masashi Sugiyama

A naive application of unsupervised dimensionality reduction methods to the context variables, such as principal component analysis, is insufficient as task-relevant input may be ignored.

Dimensionality Reduction, Vocal Bursts Intensity Prediction
