Search Results for author: Joshua Susskind

Found 8 papers, 4 papers with code

Vanishing Gradients in Reinforcement Finetuning of Language Models

1 code implementation • 31 Oct 2023 • Noam Razin, Hattie Zhou, Omid Saremi, Vimal Thilak, Arwen Bradley, Preetum Nakkiran, Joshua Susskind, Etai Littwin

Pretrained language models are commonly aligned with human preferences and downstream tasks via reinforcement finetuning (RFT), which refers to maximizing a (possibly learned) reward function using policy gradient algorithms.
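
Below is a minimal sketch of the policy-gradient objective that RFT refers to, assuming a toy categorical policy and a stand-in reward function; all names, shapes, and the reward rule are illustrative, not the paper's setup.

```python
# REINFORCE-style policy gradient: ascend grad E[r] = E[r * grad log pi(a|s)].
# "policy", "reward_fn", and all dimensions are illustrative stand-ins.
import torch

vocab_size, hidden = 16, 32
policy = torch.nn.Linear(hidden, vocab_size)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reward_fn(tokens):
    # Stand-in for a (possibly learned) reward model.
    return tokens.float() / vocab_size

state = torch.randn(8, hidden)                 # batch of prompt representations
dist = torch.distributions.Categorical(logits=policy(state))
tokens = dist.sample()                         # sampled "responses"
loss = -(reward_fn(tokens) * dist.log_prob(tokens)).mean()
opt.zero_grad(); loss.backward(); opt.step()
```

Written this way, the title's concern is visible: if the reward is nearly constant across samples from the policy, the reward-weighted log-probability terms cancel in expectation (since E[grad log pi] = 0) and the expected gradient shrinks toward zero.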

Transformers learn through gradual rank increase

no code implementations • NeurIPS 2023 • Enric Boix-Adsera, Etai Littwin, Emmanuel Abbe, Samy Bengio, Joshua Susskind

Our experiments support the theory and also show that the phenomenon can occur in practice without the simplifying assumptions.

Incremental Learning
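
The rank-increase claim suggests a simple diagnostic: track the numerical rank of the difference between the current and initial weights during training. Below is a hedged sketch of that measurement on a toy two-layer linear model; the model, data, learning rate, and the 1% singular-value threshold are all illustrative, and the snippet shows the measurement itself rather than a setting guaranteed to reproduce the phenomenon.

```python
# Track rank(W_t - W_0) for the end-to-end map of a toy deep linear network.
import torch

torch.manual_seed(0)
dim = 64
A = torch.nn.Linear(dim, dim, bias=False)
B = torch.nn.Linear(dim, dim, bias=False)
for p in (A.weight, B.weight):
    torch.nn.init.normal_(p, std=0.05)         # small initialization
W0 = (B.weight @ A.weight).detach().clone()

opt = torch.optim.SGD(list(A.parameters()) + list(B.parameters()), lr=0.5)
x = torch.randn(512, dim)
T = torch.randn(dim, dim) / dim ** 0.5         # synthetic target map
y = x @ T.T

for step in range(1, 2001):
    loss = ((B(A(x)) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 400 == 0:
        diff = (B.weight @ A.weight).detach() - W0
        s = torch.linalg.svdvals(diff)         # singular values, descending
        rank = int((s > 0.01 * s[0]).sum())    # numerical rank, 1% threshold
        print(f"step {step}: rank(W_t - W_0) = {rank}")
```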

Position Prediction as an Effective Pretraining Strategy

1 code implementation • 15 Jul 2022 • Shuangfei Zhai, Navdeep Jaitly, Jason Ramapuram, Dan Busbridge, Tatiana Likhomanenko, Joseph Yitan Cheng, Walter Talbott, Chen Huang, Hanlin Goh, Joshua Susskind

This pretraining strategy, which has been used in BERT models in NLP, Wav2Vec models in Speech, and, recently, in MAE models in Vision, forces the model to learn about relationships between the content in different parts of the input using autoencoding-related objectives.

Position Speech Recognition +1
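
A minimal sketch of a position-prediction objective in the spirit of the snippet: strip positional encodings from the patch embeddings, shuffle them, and train the model to classify each patch's original position from content alone. The tiny transformer and all shapes below are illustrative, not the paper's architecture.

```python
import torch

num_patches, dim, batch = 16, 32, 8
encoder = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
head = torch.nn.Linear(dim, num_patches)        # classify each patch's position
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()))

patches = torch.randn(batch, num_patches, dim)  # content only, no position info
perm = torch.randperm(num_patches)              # random patch order
logits = head(encoder(patches[:, perm]))        # (batch, patches, positions)
target = perm.unsqueeze(0).expand(batch, -1)    # slot i holds patch perm[i]
loss = torch.nn.functional.cross_entropy(
    logits.reshape(-1, num_patches), target.reshape(-1)
)
opt.zero_grad(); loss.backward(); opt.step()
```

Treating positions as labels means the model can only solve the task by relating patch contents to one another, which is the relationship-learning pressure the snippet describes.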

The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon

no code implementations • 10 Jun 2022 • Vimal Thilak, Etai Littwin, Shuangfei Zhai, Omid Saremi, Roni Paiss, Joshua Susskind

While common and easily reproduced in more general settings, the Slingshot Mechanism does not follow from any optimization theory we are aware of and can easily be overlooked without an in-depth examination.

Inductive Bias

Efficient Embedding of Semantic Similarity in Control Policies via Entangled Bisimulation

no code implementations • 28 Jan 2022 • Martin Bertran, Walter Talbott, Nitish Srivastava, Joshua Susskind

Learning generalizable policies from visual input in the presence of visual distractions is a challenging problem in reinforcement learning.

Data Augmentation Reinforcement Learning (RL) +2

Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning

2 code implementations • 17 May 2021 • Yue Wu, Shuangfei Zhai, Nitish Srivastava, Joshua Susskind, Jian Zhang, Ruslan Salakhutdinov, Hanlin Goh

Offline reinforcement learning promises to learn effective policies from previously collected, static datasets without the need for exploration.

Offline RL Q-Learning +2
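
A hedged sketch of what "uncertainty weighted" could look like in a critic update, assuming Monte-Carlo dropout as the uncertainty estimate: Bellman targets whose Q-values disagree across stochastic forward passes get down-weighted. The network, the 10-sample estimate, and the beta/(var + beta) weighting rule are illustrative stand-ins, not the paper's exact formulation.

```python
import torch

obs_dim, act_dim, gamma, beta = 8, 2, 0.99, 1.0
q_net = torch.nn.Sequential(
    torch.nn.Linear(obs_dim + act_dim, 64), torch.nn.ReLU(),
    torch.nn.Dropout(p=0.1),                   # dropout enables MC uncertainty
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(q_net.parameters(), lr=3e-4)

# One batch from a static, previously collected dataset (random stand-ins).
s, a = torch.randn(32, obs_dim), torch.randn(32, act_dim)
r, s2 = torch.randn(32, 1), torch.randn(32, obs_dim)
a2 = torch.randn(32, act_dim)                  # next actions from the policy

with torch.no_grad():
    # Variance of Q across dropout passes as an uncertainty proxy.
    q_samples = torch.stack(
        [q_net(torch.cat([s2, a2], -1)) for _ in range(10)]
    )
    target = r + gamma * q_samples.mean(0)
    w = beta / (q_samples.var(0) + beta)       # down-weight uncertain targets

q = q_net(torch.cat([s, a], -1))
loss = (w * (q - target) ** 2).mean()
opt.zero_grad(); loss.backward(); opt.step()
```

Down-weighting high-variance targets limits bootstrapping through out-of-distribution actions, which is the main failure mode when no further exploration is possible.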

Collegial Ensembles

no code implementations • NeurIPS 2020 • Etai Littwin, Ben Myara, Sima Sabah, Joshua Susskind, Shuangfei Zhai, Oren Golan

Modern neural network performance typically improves as model size increases.
