1 code implementation • 8 Dec 2023 • Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr
The computational difficulties of large language model (LLM) inference remain a significant obstacle to their widespread deployment.
no code implementations • 29 Mar 2023 • Zhiyi Li, Douglas Orr, Valeriu Ohan, Godfrey Da Costa, Tom Murray, Adam Sanders, Deniz Beker, Dominic Masters
Furthermore, static sparsity generally outperforms dynamic sparsity.
2 code implementations • 20 Mar 2023 • Charlie Blake, Douglas Orr, Carlo Luschi
We present unit scaling, a paradigm for designing deep learning models that simplifies the use of low-precision number formats.
1 code implementation • 22 Nov 2022 • Alberto Cattaneo, Daniel Justus, Harry Mellor, Douglas Orr, Jerome Maloberti, Zhenying Liu, Thorin Farnsworth, Andrew Fitzgibbon, Blazej Banaszewski, Carlo Luschi
We present the award-winning submission to the WikiKG90Mv2 track of OGB-LSC@NeurIPS 2022.
no code implementations • 13 Aug 2021 • Anastasia Dietrich, Frithjof Gressmann, Douglas Orr, Ivan Chelombiev, Daniel Justus, Carlo Luschi
Identifying algorithms for computationally efficient unsupervised training of large language models is an important and active area of research.
no code implementations • 10 Jun 2021 • Ivan Chelombiev, Daniel Justus, Douglas Orr, Anastasia Dietrich, Frithjof Gressmann, Alexandros Koliousis, Carlo Luschi
Attention based language models have become a critical component in state-of-the-art natural language processing systems.
no code implementations • 29 Jan 2021 • Osman Ramadan, James Withers, Douglas Orr
It first transforms the counts to log space, approximating the distribution of the public and private data as Gaussian.
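The step described above can be sketched in a few lines: transform the counts to log space and fit a Gaussian (mean, standard deviation) to the public and private data separately. This is a minimal illustration of that first step only; the function name and the example counts are hypothetical, not taken from the paper.

```python
import math
import statistics

def fit_log_gaussian(counts):
    """Fit a Gaussian to counts after a log transform.

    Returns the (mean, std) of log(count), dropping zero counts,
    which cannot be log-transformed.
    """
    logs = [math.log(c) for c in counts if c > 0]
    return statistics.mean(logs), statistics.stdev(logs)

# Hypothetical example: token counts from a public and a private corpus.
public_counts = [120, 45, 300, 8, 60]
private_counts = [95, 50, 210, 12, 40]

pub_mu, pub_sigma = fit_log_gaussian(public_counts)
priv_mu, priv_sigma = fit_log_gaussian(private_counts)
```

Approximating the log-counts as Gaussian is what makes the two distributions directly comparable by their first two moments.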