1 code implementation • 8 Dec 2023 • Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr
The computational difficulties of large language model (LLM) inference remain a significant obstacle to their widespread deployment.
no code implementations • 13 Aug 2021 • Anastasia Dietrich, Frithjof Gressmann, Douglas Orr, Ivan Chelombiev, Daniel Justus, Carlo Luschi
Identifying algorithms for computationally efficient unsupervised training of large language models is an important and active area of research.
no code implementations • 10 Jun 2021 • Ivan Chelombiev, Daniel Justus, Douglas Orr, Anastasia Dietrich, Frithjof Gressmann, Alexandros Koliousis, Carlo Luschi
Attention based language models have become a critical component in state-of-the-art natural language processing systems.
no code implementations • ICLR 2019 • Ivan Chelombiev, Conor Houghton, Cian O'Donnell
Using two improved estimation methods, we show, first, that saturation of the activation function is not required for compression, and that the amount of compression varies between different activation functions.