Search Results for author: Alexander Borzunov

Found 7 papers, 6 papers with code

Distributed Deep Learning in Open Collaborations

2 code implementations • NeurIPS 2021 • Michael Diskin, Alexey Bukhtiyarov, Max Ryabinin, Lucile Saulnier, Quentin Lhoest, Anton Sinitsin, Dmitry Popov, Dmitry Pyrkin, Maxim Kashirin, Alexander Borzunov, Albert Villanova del Moral, Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko

Modern deep learning applications require increasingly more compute to train state-of-the-art models.

Language Modelling

1,833

Paper
Code

Secure Distributed Training at Scale

3 code implementations • 21 Jun 2021 • Eduard Gorbunov, Alexander Borzunov, Michael Diskin, Max Ryabinin

Training such models requires a lot of computational resources (e. g., HPC clusters) that are not available to small research groups and independent researchers.

Distributed Optimization Image Classification +1

1,833

Paper
Code

Training Transformers Together

1 code implementation • 7 Jul 2022 • Alexander Borzunov, Max Ryabinin, Tim Dettmers, Quentin Lhoest, Lucile Saulnier, Michael Diskin, Yacine Jernite, Thomas Wolf

The infrastructure necessary for training state-of-the-art models is becoming overly expensive, which makes training such models affordable only to large corporations and institutions.

1,833

Paper
Code

Petals: Collaborative Inference and Fine-tuning of Large Models

1 code implementation • 2 Sep 2022 • Alexander Borzunov, Dmitry Baranchuk, Tim Dettmers, Max Ryabinin, Younes Belkada, Artem Chumachenko, Pavel Samygin, Colin Raffel

However, these techniques have innate limitations: offloading is too slow for interactive inference, while APIs are not flexible enough for research that requires access to weights, attention or logits.

Collaborative Inference

8,639

Paper
Code

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

1 code implementation • 27 Jan 2023 • Max Ryabinin, Tim Dettmers, Michael Diskin, Alexander Borzunov

Many deep learning applications benefit from using large models with billions of parameters.

Language Modelling

107

Paper
Code

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

1 code implementation • 5 Jun 2023 • Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh

Recent advances in large language model (LLM) pretraining have led to high-quality LLMs with impressive abilities.

Language Modelling Large Language Model +1

509

Paper
Code

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

no code implementations • NeurIPS 2023 • Alexander Borzunov, Max Ryabinin, Artem Chumachenko, Dmitry Baranchuk, Tim Dettmers, Younes Belkada, Pavel Samygin, Colin Raffel

Large language models (LLMs) are useful in many NLP tasks and become more capable with size, with the best open-source models having over 50 billion parameters.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.