Search Results for author: Carlo Luschi

Found 14 papers, 7 papers with code

Reducing the Cost of Quantum Chemical Data By Backpropagating Through Density Functional Theory

no code implementations • 6 Feb 2024 • Alexander Mathiasen, Hatem Helal, Paul Balanca, Adam Krzywaniak, Ali Parviz, Frederik Hvilshøj, Blazej Banaszewski, Carlo Luschi, Andrew William Fitzgibbon

For comparison, Schütt et al. (2019) spent 626 hours creating a dataset on which they trained their NN for 160 hours, for a total of 786 hours; our method achieves comparable performance within 31 hours.
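
Spelled out, the quoted timings amount to roughly a 25-fold reduction in end-to-end cost:

$$\frac{626\,\mathrm{h} + 160\,\mathrm{h}}{31\,\mathrm{h}} = \frac{786\,\mathrm{h}}{31\,\mathrm{h}} \approx 25\times$$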

SparQ Attention: Bandwidth-Efficient LLM Inference

1 code implementation • 8 Dec 2023 • Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr

The computational difficulties of large language model (LLM) inference remain a significant obstacle to their widespread deployment.

Language Modelling • Large Language Model
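
As a hypothetical sketch of the bandwidth-saving pattern the title suggests (approximate the attention scores cheaply, then fetch only the top-k keys and values from memory), not the paper's exact algorithm; all names are illustrative:

```python
# Rank positions cheaply from a few query components, then fetch only the
# top-k keys/values. Illustrative sketch, not SparQ's precise procedure.
import numpy as np

def topk_sparse_attention(q, K, V, r=8, k=32):
    """q: (d,); K, V: (n, d). r: query components used; k: keys fetched."""
    d = q.shape[0]
    # Approximate scores using only the r largest-magnitude query components.
    idx = np.argsort(-np.abs(q))[:r]
    approx = K[:, idx] @ q[idx]
    # Transfer full keys/values only for the k most promising positions.
    top = np.argsort(-approx)[:k]
    scores = K[top] @ q / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V[top]

rng = np.random.default_rng(0)
out = topk_sparse_attention(rng.standard_normal(64),
                            rng.standard_normal((512, 64)),
                            rng.standard_normal((512, 64)))
```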

Generating QM1B with PySCF_IPU

2 code implementations • NeurIPS 2023 • Alexander Mathiasen, Hatem Helal, Kerstin Klaser, Paul Balanca, Josef Dean, Carlo Luschi, Dominique Beaini, Andrew Fitzgibbon, Dominic Masters

Similar benefits are yet to be unlocked for quantum chemistry, where the potential of deep learning is constrained by comparatively small datasets with 100k to 20M training examples.

Unit Scaling: Out-of-the-Box Low-Precision Training

2 code implementations • 20 Mar 2023 • Charlie Blake, Douglas Orr, Carlo Luschi

We present unit scaling, a paradigm for designing deep learning models that simplifies the use of low-precision number formats.
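
A minimal sketch, assuming the usual framing of unit scaling (unit-variance initialisation plus a per-op rescaling so activations stay near unit variance, keeping values inside low-precision dynamic range); the NumPy framing is illustrative, and the paper also prescribes separately scaled gradients:

```python
# Scale each op so its output variance matches its input variance.
import numpy as np

def unit_scaled_linear(x, W):
    """x: (batch, fan_in); W: (fan_in, fan_out) drawn from N(0, 1)."""
    fan_in = W.shape[0]
    return (x @ W) / np.sqrt(fan_in)  # output variance stays ~ input variance

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 256))
W = rng.standard_normal((256, 512))
print(unit_scaled_linear(x, W).std())  # ~ 1.0
```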

8-bit Numerical Formats for Deep Neural Networks

no code implementations • 6 Jun 2022 • Badreddine Noune, Philip Jones, Daniel Justus, Dominic Masters, Carlo Luschi

Given the current trend of increasing size and complexity of machine learning architectures, it has become of critical importance to identify new approaches to improve the computational efficiency of model training.

Computational Efficiency • Image Classification
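
To make the design space concrete, a small illustrative helper for the trade-off an 8-bit float must strike between dynamic range (exponent bits) and precision (mantissa bits); it assumes an IEEE-like layout with the all-ones exponent reserved, which real FP8 variants adjust, so this is not taken from the paper:

```python
# With 1 sign bit, more exponent bits widen range; fewer mantissa bits
# coarsen precision. IEEE-like assumption: all-ones exponent is reserved.
def fp8_max_normal(exponent_bits):
    mantissa_bits = 7 - exponent_bits
    bias = 2 ** (exponent_bits - 1) - 1
    top_exponent = (2 ** exponent_bits - 2) - bias
    return (2 - 2 ** -mantissa_bits) * 2.0 ** top_exponent

for e in (4, 5):
    print(f"E{e}M{7 - e}: max normal = {fp8_max_normal(e)}")
# E4M3: 240.0, E5M2: 57344.0 under these assumptions
```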

Towards Structured Dynamic Sparse Pre-Training of BERT

no code implementations • 13 Aug 2021 • Anastasia Dietrich, Frithjof Gressmann, Douglas Orr, Ivan Chelombiev, Daniel Justus, Carlo Luschi

Identifying algorithms for computationally efficient unsupervised training of large language models is an important and active area of research.

Language Modelling
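
A sketch of the generic dynamic-sparse-training step (magnitude pruning plus regrowth on a weight mask) that this line of work builds on; the paper's structured variant operates on blocks of weights rather than single elements, and all names here are illustrative:

```python
# Drop the smallest-magnitude fraction of active weights, regrow elsewhere.
import numpy as np

def prune_and_regrow(W, mask, frac=0.1, rng=None):
    rng = rng or np.random.default_rng(0)
    w, m = W.ravel(), mask.ravel()
    active, inactive = np.flatnonzero(m), np.flatnonzero(~m)
    n_swap = max(1, int(frac * active.size))
    drop = active[np.argsort(np.abs(w[active]))[:n_swap]]    # prune smallest
    m[drop] = False
    grow = rng.choice(inactive, size=n_swap, replace=False)  # regrow elsewhere
    m[grow] = True
    w[grow] = 0.0  # new connections start at zero
    return W, mask

rng = np.random.default_rng(1)
W, mask = rng.standard_normal((64, 64)), rng.random((64, 64)) < 0.25
prune_and_regrow(W, mask)
```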

Parallel Training of Deep Networks with Local Updates

1 code implementation • 7 Dec 2020 • Michael Laskin, Luke Metz, Seth Nabarro, Mark Saroufim, Badreddine Noune, Carlo Luschi, Jascha Sohl-Dickstein, Pieter Abbeel

Deep learning models trained on large data sets have been widely successful in both vision and language domains.
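
A minimal sketch of local updates, assuming the common greedy formulation (each block gets an auxiliary head and loss, with gradients stopped between blocks so the blocks could be updated in parallel); the PyTorch names are illustrative, not the paper's code:

```python
# Per-block losses; detach() stops gradients from crossing block boundaries.
import torch
from torch import nn
from torch.nn import functional as F

blocks = nn.ModuleList(nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(3))
heads = nn.ModuleList(nn.Linear(32, 10) for _ in range(3))  # local classifiers
opts = [torch.optim.SGD([*b.parameters(), *h.parameters()], lr=0.1)
        for b, h in zip(blocks, heads)]

x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
for block, head, opt in zip(blocks, heads, opts):
    x = block(x.detach())               # no gradient flows to earlier blocks
    loss = F.cross_entropy(head(x), y)  # each block trains on its own loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```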

Improving Neural Network Training in Low Dimensional Random Bases

1 code implementation • NeurIPS 2020 • Frithjof Gressmann, Zach Eaton-Rosen, Carlo Luschi

Stochastic Gradient Descent (SGD) has proven to be remarkably effective in optimizing deep neural networks that employ ever-larger numbers of parameters.
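
A toy sketch of the idea suggested by the title: restrict each update to a freshly drawn low-dimensional random basis, so only d coordinates are optimised per step instead of all D parameters. The quadratic toy loss and names are illustrative, not the paper's setup:

```python
# Project the gradient into a random d-dimensional subspace and update there.
import numpy as np

rng = np.random.default_rng(0)
D, d = 10_000, 50
theta = rng.standard_normal(D) * 0.1

def grad(theta):                  # toy quadratic loss 0.5 * ||theta||^2
    return theta

for step in range(100):
    P = rng.standard_normal((D, d)) / np.sqrt(d)  # fresh random basis each step
    c = P.T @ grad(theta)                         # gradient in subspace coords
    theta -= 0.1 * (P @ c)                        # move only within the basis
print(np.linalg.norm(theta))  # shrinks toward the optimum at zero
```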

Revisiting Small Batch Training for Deep Neural Networks

3 code implementations • 20 Apr 2018 • Dominic Masters, Carlo Luschi

Modern deep neural network training is typically based on mini-batch stochastic gradient optimization.
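
For reference, the mini-batch stochastic gradient update in question, with batch size $m$ and learning rate $\eta$:

$$\theta_{t+1} = \theta_t - \eta \, \frac{1}{m} \sum_{i=1}^{m} \nabla_\theta L(x_i, \theta_t)$$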
