Search Results for author: Nolan Dey

Found 3 papers, 3 papers with code

Position Interpolation Improves ALiBi Extrapolation

1 code implementation • 18 Oct 2023 • Faisal Al-Khateeb, Nolan Dey, Daria Soboleva, Joel Hestness

Linear position interpolation helps pre-trained models using rotary position embeddings (RoPE) to extrapolate to longer sequence lengths.

Language Modelling • Position • +1
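The abstract snippet above refers to applying linear position interpolation, originally used with RoPE, to extend context length; this paper's title applies the same idea to ALiBi. As a rough illustrative sketch only (not the authors' released code; the function name, slope schedule, and scaling choice are assumptions), the following compresses ALiBi's relative distances by train_len / seq_len at inference time so they stay in the range seen during pre-training:

```python
import numpy as np

def alibi_bias(seq_len, num_heads, train_len=None):
    """ALiBi attention bias with optional linear position interpolation.

    If train_len is given and seq_len exceeds it, relative distances are
    scaled down by train_len / seq_len (the position-interpolation idea),
    so biases at long inference lengths match the training-time range.
    """
    # Standard ALiBi head slopes: geometric sequence 2^(-8*h/num_heads)
    slopes = np.array([2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])

    # Relative distance between query position i and key position j
    pos = np.arange(seq_len)
    dist = (pos[:, None] - pos[None, :]).astype(float)  # (seq_len, seq_len)

    # Linear position interpolation: compress distances at inference time
    if train_len is not None and seq_len > train_len:
        dist *= train_len / seq_len

    # Bias is -slope * distance, restricted to the causal (lower-triangular) part
    bias = -slopes[:, None, None] * dist[None, :, :]
    causal = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    bias = np.where(causal, bias, -np.inf)
    return bias  # (num_heads, seq_len, seq_len), added to attention logits

# Example: a model trained on 2048-token contexts evaluated at 4096 tokens
bias = alibi_bias(seq_len=4096, num_heads=8, train_len=2048)
```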

Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster

2 code implementations • 6 Apr 2023 • Nolan Dey, Gurpreet Gosal, Zhiming Chen, Hemant Khachane, William Marshall, Ribhu Pathria, Marvin Tom, Joel Hestness

We study recent research advances that improve large language models through efficient pre-training and scaling, and open datasets and tools.
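"Compute-optimal" in the title refers to Chinchilla-style training budgets, which Cerebras-GPT follows at roughly 20 tokens per parameter. The back-of-the-envelope sketch below uses the standard rules of thumb (D ≈ 20·N and C ≈ 6·N·D FLOPs); the function name and the example model size are illustrative assumptions, not figures from this listing:

```python
def compute_optimal_budget(n_params, tokens_per_param=20):
    """Chinchilla-style compute-optimal budget (assumed ratio: ~20 tokens/param).

    Returns the training-token count and an approximate pre-training
    compute estimate using the common C ~= 6 * N * D FLOPs rule.
    """
    n_tokens = tokens_per_param * n_params  # D = 20 * N
    flops = 6 * n_params * n_tokens         # C ~= 6 * N * D
    return n_tokens, flops

# Example: a hypothetical 2.7B-parameter model
tokens, flops = compute_optimal_budget(2.7e9)
print(f"{tokens:.2e} tokens, {flops:.2e} FLOPs")
```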
