Search Results for author: Ted Moskovitz

Found 12 papers, 6 papers with code

First-Order Preconditioning via Hypergradient Descent

1 code implementation 18 Oct 2019 Ted Moskovitz, Rui Wang, Janice Lan, Sanyam Kapoor, Thomas Miconi, Jason Yosinski, Aditya Rawal

Standard gradient descent methods are susceptible to a range of issues that can impede training, such as highly correlated parameters and large differences in scale across parameter space. These difficulties can be addressed by second-order approaches that apply a pre-conditioning matrix to the gradient to improve convergence.
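For intuition, the sketch below applies a fixed pre-conditioning matrix to the gradient on a toy ill-conditioned quadratic; the matrix, objective, and step size are illustrative assumptions, and the paper's actual contribution, learning the preconditioner with first-order hypergradients, is not reproduced here.

```python
import numpy as np

# Toy quadratic 0.5 * theta^T A theta with correlated, badly scaled parameters (illustrative only).
A = np.array([[10.0, 3.0],
              [3.0, 1.0]])

def grad(theta):
    return A @ theta  # gradient of the quadratic

theta = np.array([1.0, -1.0])
P = np.linalg.inv(A)  # stand-in preconditioner; the paper instead learns P with hypergradient descent
lr = 0.5

for _ in range(50):
    theta = theta - lr * (P @ grad(theta))  # preconditioned step: move along P g rather than g
```

With P set to the inverse curvature the update behaves like a damped Newton step; the point of the paper, as the title indicates, is to obtain a useful preconditioner using only first-order information.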

Efficient Wasserstein Natural Gradients for Reinforcement Learning

1 code implementation ICLR 2021 Ted Moskovitz, Michael Arbel, Ferenc Huszar, Arthur Gretton

A novel optimization approach is proposed for application to policy gradient methods and evolution strategies for reinforcement learning (RL).

Policy Gradient Methods · reinforcement-learning +1

A First-Occupancy Representation for Reinforcement Learning

no code implementations ICLR 2022 Ted Moskovitz, Spencer R. Wilson, Maneesh Sahani

Both animals and artificial agents benefit from state representations that support rapid transfer of learning across tasks and enable them to traverse their environments efficiently to reach rewarding states.

reinforcement-learning · Reinforcement Learning (RL)
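As a rough picture of what a first-occupancy representation tracks, the tabular sketch below iterates the recursion F(s, s') = 1[s = s'] + (1 - 1[s = s']) * gamma * E[F(s_next, s')], the expected discount accrued when s' is reached for the first time from s; the chain environment and fixed-point loop are my own illustrative assumptions, not code from the paper.

```python
import numpy as np

n_states = 5
# Transition matrix P[s, s'] under a fixed policy: a small chain that drifts right (illustrative).
P = np.zeros((n_states, n_states))
for s in range(n_states - 1):
    P[s, s + 1] = 0.9
    P[s, s] = 0.1
P[-1, -1] = 1.0

gamma = 0.95
F = np.zeros((n_states, n_states))  # F[s, s'] ~ discounted probability of first reaching s' from s
I = np.eye(n_states)

for _ in range(500):                   # fixed-point iteration of the first-occupancy recursion
    F = I + (1 - I) * (gamma * P @ F)  # if s == s', occupancy is immediate; else discount one step
```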

Towards an Understanding of Default Policies in Multitask Policy Optimization

no code implementations 4 Nov 2021 Ted Moskovitz, Michael Arbel, Jack Parker-Holder, Aldo Pacchiano

Much of the recent success of deep reinforcement learning has been driven by regularized policy optimization (RPO) algorithms with strong performance across multiple domains.
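For reference, RPO methods typically maximize return while penalizing divergence from a default policy; the single-state objective below, with a uniform default and penalty weight alpha, is a generic sketch of that setup and not this paper's multitask formulation.

```python
import numpy as np

def rpo_objective(pi, pi_default, q_values, alpha=0.1):
    """KL-regularized objective for one state:
    expected return under pi minus alpha * KL(pi || pi_default)."""
    expected_return = np.sum(pi * q_values)
    kl = np.sum(pi * (np.log(pi) - np.log(pi_default)))
    return expected_return - alpha * kl

pi = np.array([0.7, 0.2, 0.1])          # current policy over three actions
pi_default = np.array([1/3, 1/3, 1/3])  # default policy the agent is regularized toward
q_values = np.array([1.0, 0.5, -0.2])
print(rpo_objective(pi, pi_default, q_values))
```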

Minimum Description Length Control

no code implementations 17 Jul 2022 Ted Moskovitz, Ta-Chu Kao, Maneesh Sahani, Matthew M. Botvinick

We propose a novel framework for multitask reinforcement learning based on the minimum description length (MDL) principle.

Bayesian Inference · Continuous Control +2

A Unified Theory of Dual-Process Control

no code implementations 13 Nov 2022 Ted Moskovitz, Kevin Miller, Maneesh Sahani, Matthew M. Botvinick

We apply a single model based on this observation to findings from research on executive control, reward-based learning, and judgment and decision making, showing that seemingly diverse dual-process phenomena can be understood as domain-specific consequences of a single underlying set of computational principles.

Decision Making

Transfer RL via the Undo Maps Formalism

no code implementations 26 Nov 2022 Abhi Gupta, Ted Moskovitz, David Alvarez-Melis, Aldo Pacchiano

Transferring knowledge across domains is one of the most fundamental problems in machine learning, but doing so effectively in the context of reinforcement learning remains largely an open problem.

Imitation Learning · Transfer Learning

Confronting Reward Model Overoptimization with Constrained RLHF

1 code implementation 6 Oct 2023 Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen Mcaleer

Large language models are typically aligned with human preferences by optimizing reward models (RMs) fitted to human feedback.
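As background on fitting RMs to human feedback, the snippet below shows the standard Bradley-Terry pairwise loss that pushes the score of a preferred response above a rejected one; the linear scorer and toy features are placeholders, and this is the generic fitting step, not the constrained optimization proposed in the paper.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
w = torch.randn(16)  # stand-in reward-model parameters; real RMs are language models with a scalar head

def reward_model(responses):
    return responses @ w  # placeholder scalar score per response

def pairwise_rm_loss(chosen, rejected):
    """Bradley-Terry loss: make the reward of the human-preferred response exceed the rejected one."""
    return -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()

chosen = torch.randn(8, 16)    # toy features for preferred responses
rejected = torch.randn(8, 16)  # toy features for dispreferred responses
loss = pairwise_rm_loss(chosen, rejected)
```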

The Transient Nature of Emergent In-Context Learning in Transformers

2 code implementations NeurIPS 2023 Aaditya K. Singh, Stephanie C. Y. Chan, Ted Moskovitz, Erin Grant, Andrew M. Saxe, Felix Hill

The transient nature of ICL is observed in transformers across a range of model sizes and datasets, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models.

Bayesian Inference · In-Context Learning +1

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation

3 code implementations 10 Apr 2024 Aaditya K. Singh, Ted Moskovitz, Felix Hill, Stephanie C. Y. Chan, Andrew M. Saxe

By clamping subsets of activations throughout training, we then identify three underlying subcircuits that interact to drive IH formation, yielding the phase change.

In-Context Learning
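The clamping intervention mentioned above can be pictured with a forward hook that pins part of a layer's activations to a fixed value during the forward pass; the toy MLP, the clamped dimensions, and the hook below are illustrative assumptions, not the paper's models or circuits.

```python
import torch

# Stand-in network; the study uses transformers, but any module with hookable activations works.
model = torch.nn.Sequential(
    torch.nn.Linear(32, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 32),
)

clamp_value = 0.0
clamp_dims = slice(0, 16)  # the subset of hidden activations to hold fixed

def clamp_hook(module, inputs, output):
    # Overwrite part of this layer's output so downstream computation never sees it vary.
    output = output.clone()
    output[:, clamp_dims] = clamp_value
    return output

handle = model[1].register_forward_hook(clamp_hook)  # clamp the post-ReLU activations
x = torch.randn(4, 32)
y = model(x)      # forward pass runs with the selected activations clamped
handle.remove()   # restore normal behavior
```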
