Search Results for author: Georgii Novikov

Found 2 papers, 1 paper with code

Efficient GPT Model Pre-training using Tensor Train Matrix Representation

no code implementations • 5 Jun 2023 • Viktoriia Chekalina, Georgii Novikov, Julia Gusak, Ivan Oseledets, Alexander Panchenko

On the downstream tasks, including language understanding and text summarization, the model performs similarly to the original GPT-2 model.

Language Modelling • Text Summarization
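The entry above describes replacing GPT-2 weight matrices with a Tensor Train (TT) matrix representation. As a rough illustration of where the parameter savings come from, here is a minimal numpy sketch of a two-core TT-matrix (matrix product operator) factorization; the shapes, rank, and random cores are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

# Illustrative sketch (not the paper's setup): a weight matrix of shape
# (m1*m2) x (n1*n2) represented by two TT cores with TT-rank r.
m1, m2, n1, n2, r = 4, 8, 4, 8, 3

rng = np.random.default_rng(0)
core1 = rng.standard_normal((1, m1, n1, r))   # first TT core
core2 = rng.standard_normal((r, m2, n2, 1))   # second TT core

# Contract the cores over the shared rank index to recover the full matrix.
# Index order: (a, i1, j1, r) x (r, i2, j2, b) -> (i1, i2, j1, j2)
W = np.einsum('aijr,rklb->ikjl', core1, core2).reshape(m1 * m2, n1 * n2)

# Parameter counts: the factored form stores far fewer numbers.
full_params = (m1 * m2) * (n1 * n2)   # 32 * 32 = 1024
tt_params = core1.size + core2.size   # 48 + 192 = 240
print(W.shape, full_params, tt_params)
```

In a TT-parameterized layer the full matrix `W` is never materialized; the matrix-vector product is computed core by core, which is what keeps both memory and compute low as the rank `r` is varied.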

Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction

2 code implementations • 1 Feb 2022 • Georgii Novikov, Daniel Bershatsky, Julia Gusak, Alex Shonenkov, Denis Dimitrov, Ivan Oseledets

Every modern neural network model has quite a few pointwise nonlinearities in its architecture, and such operations induce additional memory costs which -- as we show -- can be significantly reduced by quantization of the gradients.

Neural Network Compression • Quantization
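The abstract above says the memory cost of pointwise nonlinearities can be cut by quantizing the gradients. A minimal numpy sketch of the idea, for a GELU-like activation: instead of saving the full-precision input for the backward pass, store only a few-bit code of the activation derivative. The uniform 2-bit quantizer below is an illustrative assumption, not the paper's exact scheme.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def gelu_grad(x, eps=1e-4):
    # numerical derivative; good enough for this sketch
    return (gelu(x + eps) - gelu(x - eps)) / (2 * eps)

def quantize(d, bits=2):
    # uniform quantization of the derivative to 2**bits levels;
    # `codes` is all that would be kept for the backward pass
    levels = 2 ** bits
    lo, hi = d.min(), d.max()
    step = (hi - lo) / (levels - 1)
    codes = np.round((d - lo) / step).astype(np.uint8)
    return codes, lo, step

def dequantize(codes, lo, step):
    return lo + codes * step

x = np.linspace(-3.0, 3.0, 1024)
d = gelu_grad(x)                      # what a standard backward would use
codes, lo, step = quantize(d, bits=2) # 2 bits/element instead of 32
d_hat = dequantize(codes, lo, step)   # approximate derivative at backward time

err = np.abs(d_hat - d).max()
print(err)
```

The backward pass then multiplies the incoming gradient by `d_hat` instead of the exact derivative, trading a bounded quantization error (at most half a quantization step here) for a 16x smaller saved tensor at 2 bits versus float32.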
