1 code implementation • 8 Dec 2023 • Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr
The computational difficulties of large language model (LLM) inference remain a significant obstacle to their widespread deployment.
no code implementations • 29 Mar 2023 • Zhiyi Li, Douglas Orr, Valeriu Ohan, Godfrey Da Costa, Tom Murray, Adam Sanders, Deniz Beker, Dominic Masters
Furthermore, static sparsity generally outperforms dynamic sparsity.
2 code implementations • 20 Mar 2023 • Charlie Blake, Douglas Orr, Carlo Luschi
We present unit scaling, a paradigm for designing deep learning models that simplifies the use of low-precision number formats.
1 code implementation • 22 Nov 2022 • Alberto Cattaneo, Daniel Justus, Harry Mellor, Douglas Orr, Jerome Maloberti, Zhenying Liu, Thorin Farnsworth, Andrew Fitzgibbon, Blazej Banaszewski, Carlo Luschi
We present the award-winning submission to the WikiKG90Mv2 track of OGB-LSC@NeurIPS 2022.
no code implementations • 13 Aug 2021 • Anastasia Dietrich, Frithjof Gressmann, Douglas Orr, Ivan Chelombiev, Daniel Justus, Carlo Luschi
Identifying algorithms for computationally efficient unsupervised training of large language models is an important and active area of research.
no code implementations • 10 Jun 2021 • Ivan Chelombiev, Daniel Justus, Douglas Orr, Anastasia Dietrich, Frithjof Gressmann, Alexandros Koliousis, Carlo Luschi
Attention based language models have become a critical component in state-of-the-art natural language processing systems.
no code implementations • 29 Jan 2021 • Osman Ramadan, James Withers, Douglas Orr
It first transforms the counts to log space, approximating the distribution of the public and private data as Gaussian.
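The step described above can be sketched in a few lines: transform the counts to log space and fit a Gaussian (mean, standard deviation) to the public and private data separately. This is a minimal illustration of that first step only; the function name and the example counts are hypothetical, not taken from the paper.

```python
import math
import statistics

def fit_log_gaussian(counts):
    """Fit a Gaussian to counts after a log transform.

    Returns the (mean, std) of log(count), dropping zero counts,
    which cannot be log-transformed.
    """
    logs = [math.log(c) for c in counts if c > 0]
    return statistics.mean(logs), statistics.stdev(logs)

# Hypothetical example: token counts from a public and a private corpus.
public_counts = [120, 45, 300, 8, 60]
private_counts = [95, 50, 210, 12, 40]

pub_mu, pub_sigma = fit_log_gaussian(public_counts)
priv_mu, priv_sigma = fit_log_gaussian(private_counts)
```

Approximating the log-counts as Gaussian is what makes the two distributions directly comparable by their first two moments.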