Search Results for author: Ted Moskovitz

Found 6 papers, 3 papers with code

Minimum Description Length Control

no code implementations • 17 Jul 2022 • Ted Moskovitz, Ta-Chu Kao, Maneesh Sahani, Matthew M. Botvinick

We propose a novel framework for multitask reinforcement learning based on the minimum description length (MDL) principle.

Bayesian Inference • Continuous Control • +1
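The MDL framing treats each task-specific policy as a message encoded relative to a shared default policy, trading return against description length. Below is a minimal sketch of that trade-off in a toy tabular setting, where a KL penalty stands in for description length; the penalty weight, the stand-in task returns, and the way the default is fit are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# Hypothetical setup: 3 tasks, 4 actions, a single shared state for brevity.
rng = np.random.default_rng(0)
task_policies = rng.dirichlet(np.ones(4), size=3)  # per-task action distributions
default_policy = task_policies.mean(axis=0)        # illustrative shared default

beta = 0.1                                # assumed weight on the KL penalty
task_returns = np.array([1.0, 0.8, 1.2])  # stand-in expected returns

# MDL-style objective: maximize return minus the cost of encoding each
# task policy relative to the shared default (measured here by KL).
objective = sum(
    task_returns[i] - beta * kl(task_policies[i], default_policy)
    for i in range(3)
)
print(f"regularized multitask objective: {objective:.3f}")
```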

Towards an Understanding of Default Policies in Multitask Policy Optimization

no code implementations • 4 Nov 2021 • Ted Moskovitz, Michael Arbel, Jack Parker-Holder, Aldo Pacchiano

Much of the recent success of deep reinforcement learning has been driven by regularized policy optimization (RPO) algorithms with strong performance across multiple domains.
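RPO methods regularize the learned policy toward a default policy, typically via a KL penalty. A useful piece of intuition is the well-known closed-form solution of a single KL-regularized improvement step, sketched below; the uniform default policy and the temperature value are illustrative choices, not the paper's analysis.

```python
import numpy as np

def rpo_update(q_values, default_policy, eta=1.0):
    """Closed-form maximizer of  <pi, q> - eta * KL(pi || default).

    This KL-regularized improvement step underlies many RPO methods;
    the default policy and temperature eta here are illustrative.
    """
    logits = np.log(default_policy) + q_values / eta
    logits -= logits.max()           # subtract max for numerical stability
    pi = np.exp(logits)
    return pi / pi.sum()

q = np.array([1.0, 0.2, -0.5])       # stand-in action values
default = np.ones(3) / 3             # assumed uniform default policy
print(rpo_update(q, default, eta=0.5))
```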

A First-Occupancy Representation for Reinforcement Learning

no code implementations • ICLR 2022 • Ted Moskovitz, Spencer R. Wilson, Maneesh Sahani

Both animals and artificial agents benefit from state representations that support rapid transfer of learning across tasks and enable them to efficiently traverse their environments to reach rewarding states.

Reinforcement Learning
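Unlike the successor representation, which accumulates discounted occupancy over every visit to a state, the first-occupancy representation (FR) counts only the first visit. Below is a tabular sketch of the resulting fixed-point iteration; the recursion follows the paper's description, while the toy ring environment and hyperparameters are assumptions for illustration.

```python
import numpy as np

def first_occupancy(P, gamma=0.95, iters=500):
    """Tabular first-occupancy representation for a fixed policy.

    P[s, s'] is the state-to-state transition matrix under the policy.
    F[s, s'] accumulates discount only until the *first* arrival at s',
    unlike the successor representation, which counts every revisit.
    """
    n = P.shape[0]
    I = np.eye(n)
    F = np.zeros((n, n))
    for _ in range(iters):
        # 1 on first arrival at s'; otherwise discount and continue,
        # zeroing the continuation once s == s' (first visit reached).
        F = I + (1.0 - I) * gamma * (P @ F)
    return F

# Toy 3-state ring: s -> s+1 (mod 3) deterministically.
P = np.roll(np.eye(3), 1, axis=1)
print(first_occupancy(P).round(3))
```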

Tactical Optimism and Pessimism for Deep Reinforcement Learning

2 code implementations • NeurIPS 2021 • Ted Moskovitz, Jack Parker-Holder, Aldo Pacchiano, Michael Arbel, Michael I. Jordan

In recent years, deep off-policy actor-critic algorithms have become a dominant approach to reinforcement learning for continuous control.

Continuous Control • Reinforcement Learning
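The core mechanism is to treat the degree of optimism in the critic target as an arm of a multi-armed bandit, adapted online. Below is a heavily simplified sketch with stand-in critic outputs and a synthetic feedback signal; the candidate beta values, the exponential-weights update, and the reward proxy are all illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
betas = np.array([-1.0, 0.0, 1.0])   # candidate pessimism/optimism levels
weights = np.ones_like(betas)        # bandit weights over those levels
eta = 0.1                            # bandit learning rate

def td_target(q1, q2, beta):
    """Value estimate from two critics: mean plus beta times a
    spread-based uncertainty proxy (optimistic for beta > 0)."""
    mean = 0.5 * (q1 + q2)
    std = 0.5 * abs(q1 - q2)
    return mean + beta * std

for step in range(100):
    probs = weights / weights.sum()
    arm = rng.choice(len(betas), p=probs)      # pick an optimism level
    q1, q2 = rng.normal(1.0, 0.3, size=2)      # stand-in critic outputs
    target = td_target(q1, q2, betas[arm])
    reward = -abs(target - 1.0)                # stand-in learning-progress signal
    # Exponential-weights (EXP3-style) update on the chosen arm.
    weights[arm] *= np.exp(eta * reward / probs[arm])

print("final optimism preference:", (weights / weights.sum()).round(3))
```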

Efficient Wasserstein Natural Gradients for Reinforcement Learning

1 code implementation • ICLR 2021 • Ted Moskovitz, Michael Arbel, Ferenc Huszar, Arthur Gretton

A novel optimization approach is proposed for application to policy gradient methods and evolution strategies for reinforcement learning (RL).

Policy Gradient Methods • Reinforcement Learning
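Natural-gradient methods precondition the ordinary gradient with the inverse of a metric on parameter space; the paper's contribution is an efficient estimator of the Wasserstein information matrix for this role. The sketch below shows only the generic preconditioned step with an explicitly supplied stand-in metric, not the paper's estimator.

```python
import numpy as np

def natural_gradient_step(theta, grad, metric, lr=0.1, damping=1e-4):
    """Preconditioned update  theta <- theta - lr * G^{-1} grad,
    where G is a metric on parameter space. Damping keeps the
    solve well-posed when the metric is near-singular."""
    G = metric + damping * np.eye(len(theta))
    return theta - lr * np.linalg.solve(G, grad)

# Illustrative: Gaussian policy parameters (mean, log-std) with an
# assumed diagonal metric; the true Wasserstein metric is estimated
# from samples in the paper, which this sketch does not reproduce.
theta = np.array([0.5, -1.0])
grad = np.array([1.0, 0.2])
metric = np.diag([2.0, 0.5])
print(natural_gradient_step(theta, grad, metric))
```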

First-Order Preconditioning via Hypergradient Descent

1 code implementation • 18 Oct 2019 • Ted Moskovitz, Rui Wang, Janice Lan, Sanyam Kapoor, Thomas Miconi, Jason Yosinski, Aditya Rawal

Standard gradient descent methods are susceptible to a range of issues that can impede training, such as highly correlated parameters and differences in scale across parameter dimensions. These difficulties can be addressed by second-order approaches, which apply a preconditioning matrix to the gradient to improve convergence.
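The first-order alternative: keep the update theta <- theta - alpha * P * g, but learn the preconditioner P itself by gradient descent on the post-step loss (a hypergradient). For a single step, grad_P L(theta') = -alpha * g(theta') g^T, which yields the toy update below; the quadratic objective and step sizes are illustrative assumptions.

```python
import numpy as np

# Toy objective: ill-conditioned quadratic  L(w) = 0.5 * w^T A w.
A = np.diag([100.0, 1.0])
grad = lambda w: A @ w

w = np.array([1.0, 1.0])
P = np.eye(2)              # learned preconditioner, initialized to identity
alpha, rho = 1e-3, 1e-3    # step sizes for w and for P (illustrative values)

for t in range(2000):
    g = grad(w)
    w_new = w - alpha * (P @ g)          # preconditioned first-order step
    g_new = grad(w_new)
    # Hypergradient of the post-step loss w.r.t. P is -alpha * outer(g_new, g),
    # so gradient *descent* on P adds alpha * rho * outer(g_new, g).
    P += rho * alpha * np.outer(g_new, g)
    w = w_new

print("final loss:", 0.5 * w @ A @ w)
print("learned P diagonal:", np.diag(P).round(4))  # grows along the stiff axis
```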
