Search Results for author: Ted Moskovitz

Found 12 papers, 6 papers with code

First-Order Preconditioning via Hypergradient Descent

1 code implementation 18 Oct 2019 Ted Moskovitz, Rui Wang, Janice Lan, Sanyam Kapoor, Thomas Miconi, Jason Yosinski, Aditya Rawal

Standard gradient descent methods are susceptible to a range of issues that can impede training, such as highly correlated parameters and large differences in scale across parameter space. These difficulties can be addressed by second-order approaches that apply a pre-conditioning matrix to the gradient to improve convergence.
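For intuition, the sketch below applies a fixed pre-conditioning matrix to the gradient on a toy ill-conditioned quadratic; the matrix, objective, and step size are illustrative assumptions, and the paper's actual contribution, learning the preconditioner with first-order hypergradients, is not reproduced here.

```python
import numpy as np

# Toy quadratic 0.5 * theta^T A theta with correlated, badly scaled parameters (illustrative only).
A = np.array([[10.0, 3.0],
              [3.0, 1.0]])

def grad(theta):
    return A @ theta  # gradient of the quadratic

theta = np.array([1.0, -1.0])
P = np.linalg.inv(A)  # stand-in preconditioner; the paper instead learns P with hypergradient descent
lr = 0.5

for _ in range(50):
    theta = theta - lr * (P @ grad(theta))  # preconditioned step: move along P g rather than g
```

With P set to the inverse curvature the update behaves like a damped Newton step; the point of the paper, as the title indicates, is to obtain a useful preconditioner using only first-order information.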

Efficient Wasserstein Natural Gradients for Reinforcement Learning

1 code implementation ICLR 2021 Ted Moskovitz, Michael Arbel, Ferenc Huszar, Arthur Gretton

A novel optimization approach is proposed for application to policy gradient methods and evolution strategies for reinforcement learning (RL).

Policy Gradient Methods · reinforcement-learning +1

A First-Occupancy Representation for Reinforcement Learning

no code implementations ICLR 2022 Ted Moskovitz, Spencer R. Wilson, Maneesh Sahani

Both animals and artificial agents benefit from state representations that support rapid transfer of learning across tasks and enable them to traverse their environments efficiently to reach rewarding states.

reinforcement-learning · Reinforcement Learning (RL)
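As a rough picture of what a first-occupancy representation tracks, the tabular sketch below iterates the recursion F(s, s') = 1[s = s'] + (1 - 1[s = s']) * gamma * E[F(s_next, s')], the expected discount accrued when s' is reached for the first time from s; the chain environment and fixed-point loop are my own illustrative assumptions, not code from the paper.

```python
import numpy as np

n_states = 5
# Transition matrix P[s, s'] under a fixed policy: a small chain that drifts right (illustrative).
P = np.zeros((n_states, n_states))
for s in range(n_states - 1):
    P[s, s + 1] = 0.9
    P[s, s] = 0.1
P[-1, -1] = 1.0

gamma = 0.95
F = np.zeros((n_states, n_states))  # F[s, s'] ~ discounted probability of first reaching s' from s
I = np.eye(n_states)

for _ in range(500):                   # fixed-point iteration of the first-occupancy recursion
    F = I + (1 - I) * (gamma * P @ F)  # if s == s', occupancy is immediate; else discount one step
```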

Towards an Understanding of Default Policies in Multitask Policy Optimization

no code implementations 4 Nov 2021 Ted Moskovitz, Michael Arbel, Jack Parker-Holder, Aldo Pacchiano

Much of the recent success of deep reinforcement learning has been driven by regularized policy optimization (RPO) algorithms with strong performance across multiple domains.
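For reference, RPO methods typically maximize return while penalizing divergence from a default policy; the single-state objective below, with a uniform default and penalty weight alpha, is a generic sketch of that setup and not this paper's multitask formulation.

```python
import numpy as np

def rpo_objective(pi, pi_default, q_values, alpha=0.1):
    """KL-regularized objective for one state:
    expected return under pi minus alpha * KL(pi || pi_default)."""
    expected_return = np.sum(pi * q_values)
    kl = np.sum(pi * (np.log(pi) - np.log(pi_default)))
    return expected_return - alpha * kl

pi = np.array([0.7, 0.2, 0.1])          # current policy over three actions
pi_default = np.array([1/3, 1/3, 1/3])  # default policy the agent is regularized toward
q_values = np.array([1.0, 0.5, -0.2])
print(rpo_objective(pi, pi_default, q_values))
```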

Minimum Description Length Control

no code implementations 17 Jul 2022 Ted Moskovitz, Ta-Chu Kao, Maneesh Sahani, Matthew M. Botvinick

We propose a novel framework for multitask reinforcement learning based on the minimum description length (MDL) principle.

Bayesian Inference · Continuous Control +2

A Unified Theory of Dual-Process Control

no code implementations 13 Nov 2022 Ted Moskovitz, Kevin Miller, Maneesh Sahani, Matthew M. Botvinick

We apply a single model based on this observation to findings from research on executive control, reward-based learning, and judgment and decision making, showing that seemingly diverse dual-process phenomena can be understood as domain-specific consequences of a single underlying set of computational principles.

Decision Making

Transfer RL via the Undo Maps Formalism

no code implementations 26 Nov 2022 Abhi Gupta, Ted Moskovitz, David Alvarez-Melis, Aldo Pacchiano

Transferring knowledge across domains is one of the most fundamental problems in machine learning, but doing so effectively in the context of reinforcement learning remains largely an open problem.

Imitation Learning · Transfer Learning

Confronting Reward Model Overoptimization with Constrained RLHF

1 code implementation 6 Oct 2023 Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen Mcaleer

Large language models are typically aligned with human preferences by optimizing reward models (RMs) fitted to human feedback.
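As background on fitting RMs to human feedback, the snippet below shows the standard Bradley-Terry pairwise loss that pushes the score of a preferred response above a rejected one; the linear scorer and toy features are placeholders, and this is the generic fitting step, not the constrained optimization proposed in the paper.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
w = torch.randn(16)  # stand-in reward-model parameters; real RMs are language models with a scalar head

def reward_model(responses):
    return responses @ w  # placeholder scalar score per response

def pairwise_rm_loss(chosen, rejected):
    """Bradley-Terry loss: make the reward of the human-preferred response exceed the rejected one."""
    return -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()

chosen = torch.randn(8, 16)    # toy features for preferred responses
rejected = torch.randn(8, 16)  # toy features for dispreferred responses
loss = pairwise_rm_loss(chosen, rejected)
```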

The Transient Nature of Emergent In-Context Learning in Transformers

2 code implementations NeurIPS 2023 Aaditya K. Singh, Stephanie C. Y. Chan, Ted Moskovitz, Erin Grant, Andrew M. Saxe, Felix Hill

The transient nature of ICL is observed in transformers across a range of model sizes and datasets, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models.

Bayesian Inference · In-Context Learning +1

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation

3 code implementations 10 Apr 2024 Aaditya K. Singh, Ted Moskovitz, Felix Hill, Stephanie C. Y. Chan, Andrew M. Saxe

By clamping subsets of activations throughout training, we then identify three underlying subcircuits that interact to drive IH formation, yielding the phase change.

In-Context Learning
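The clamping intervention mentioned above can be pictured with a forward hook that pins part of a layer's activations to a fixed value during the forward pass; the toy MLP, the clamped dimensions, and the hook below are illustrative assumptions, not the paper's models or circuits.

```python
import torch

# Stand-in network; the study uses transformers, but any module with hookable activations works.
model = torch.nn.Sequential(
    torch.nn.Linear(32, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 32),
)

clamp_value = 0.0
clamp_dims = slice(0, 16)  # the subset of hidden activations to hold fixed

def clamp_hook(module, inputs, output):
    # Overwrite part of this layer's output so downstream computation never sees it vary.
    output = output.clone()
    output[:, clamp_dims] = clamp_value
    return output

handle = model[1].register_forward_hook(clamp_hook)  # clamp the post-ReLU activations
x = torch.randn(4, 32)
y = model(x)      # forward pass runs with the selected activations clamped
handle.remove()   # restore normal behavior
```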
