Search Results for author: Susan Zhang

Found 9 papers, 4 papers with code

LIMA: Less Is More for Alignment

5 code implementations NeurIPS 2023 Chunting Zhou, PengFei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy

Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences.

Language Modelling, Reinforcement Learning

Effective Theory of Transformers at Initialization

no code implementations 4 Apr 2023 Emily Dinan, Sho Yaida, Susan Zhang

We perform an effective-theory analysis of forward-backward signal propagation in wide and deep Transformers, i.e., residual neural networks with multi-head self-attention blocks and multilayer perceptron blocks.

Scaling Laws for Generative Mixed-Modal Language Models

no code implementations 10 Jan 2023 Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer

To better understand the scaling properties of such mixed-modal models, we conducted over 250 experiments using seven different modalities and model sizes ranging from 8 million to 30 billion, trained on 5-100 billion tokens.

Long-Term Planning and Situational Awareness in OpenAI Five

no code implementations 13 Dec 2019 Jonathan Raiman, Susan Zhang, Filip Wolski

Understanding how knowledge about the world is represented within model-free deep reinforcement learning methods is a major challenge given the black box nature of its learning process within high-dimensional observation and action spaces.

Dota 2

Neural Network Surgery with Sets

no code implementations 13 Dec 2019 Jonathan Raiman, Susan Zhang, Christy Dennison

The cost to train machine learning models has been increasing exponentially, making exploration and research into the correct features and architecture a costly or intractable endeavor at scale.

Dota 2
