Search Results for author: Denis Dimitrov

Found 19 papers, 13 papers with code

Pixel-Level BPE for Auto-Regressive Image Generation

no code implementations • MMMPIE (COLING) 2022 • Anton Razzhigaev, Anton Voronov, Andrey Kaznacheev, Andrey Kuznetsov, Denis Dimitrov, Alexander Panchenko

Pixel-level autoregression with Transformer models (Image GPT or iGPT) is one of the recent approaches to image generation that has not received much attention or elaboration, because the quadratic complexity of attention imposes huge memory requirements and thus restricts the resolution of the generated images.

Image Generation

OmniFusion Technical Report

2 code implementations • 9 Apr 2024 • Elizaveta Goncharova, Anton Razzhigaev, Matvey Mikhalchuk, Maxim Kurkin, Irina Abdullaeva, Matvey Skripkin, Ivan Oseledets, Denis Dimitrov, Andrey Kuznetsov

We propose an OmniFusion model based on a pretrained LLM and adapters for visual modality.

Ranked #39 on Visual Question Answering on MM-Vet

Visual Question Answering

187

MERA: A Comprehensive LLM Evaluation in Russian

1 code implementation • 9 Jan 2024 • Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Denis Dimitrov, Alexander Panchenko, Sergei Markov

To address these issues, we introduce an open Multimodal Evaluation of Russian-language Architectures (MERA), a new instruction benchmark for evaluating foundation models oriented towards the Russian language.

Kandinsky 3.0 Technical Report

1 code implementation • 6 Dec 2023 • Vladimir Arkhipkin, Andrei Filatov, Viacheslav Vasilev, Anastasia Maltseva, Said Azizov, Igor Pavlov, Julia Agafonova, Andrey Kuznetsov, Denis Dimitrov

We focus on the key components that, as we identified through a large number of experiments, had the most significant impact on improving the quality of our model compared to the others.

Text-to-Image Generation

FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline

1 code implementation • 22 Nov 2023 • Vladimir Arkhipkin, Zein Shaheen, Viacheslav Vasilev, Elizaveta Dakhova, Andrey Kuznetsov, Denis Dimitrov

The first stage concerns keyframes synthesis to figure the storyline of a video, while the second one is devoted to interpolation frames generation to make movements of the scene and objects smooth.

SSIM • Text-to-Video Generation • +1

142

The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models

no code implementations • 10 Nov 2023 • Anton Razzhigaev, Matvey Mikhalchuk, Elizaveta Goncharova, Ivan Oseledets, Denis Dimitrov, Andrey Kuznetsov

In this study, we present an investigation into the anisotropy dynamics and intrinsic dimension of embeddings in transformer architectures, focusing on the dichotomy between encoders and decoders.

Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion

1 code implementation • 5 Oct 2023 • Anton Razzhigaev, Arseniy Shakhmatov, Anastasia Maltseva, Vladimir Arkhipkin, Igor Pavlov, Ilya Ryabov, Angelina Kuts, Alexander Panchenko, Andrey Kuznetsov, Denis Dimitrov

Text-to-image generation is a significant domain in modern computer vision and has achieved substantial improvements through the evolution of generative architectures.

Ranked #22 on Text-to-Image Generation on MS COCO

Text-to-Image Generation

MineralImage5k: A benchmark for zero-shot raw mineral visual recognition and description

1 code implementation • Computers and Geosciences 2023 • Sergey Nesteruk, Julia Agafonova, Igor Pavlov, Maxim Gerasimov, Nikolay Latyshev, Denis Dimitrov, Andrey Kuznetsov, Artur Kadurin, Pavel Plechov

On the contrary, in a raw sample, the target mineral can appear in the form of thinly represented inclusions.

Zero-Shot Learning

RusTitW: Russian Language Text Dataset for Visual Text in-the-Wild Recognition

1 code implementation • 29 Mar 2023 • Igor Markov, Sergey Nesteruk, Andrey Kuznetsov, Denis Dimitrov

In this paper, we present a large-scale human-labeled dataset for Russian text recognition in-the-wild.

Eco2AI: carbon emissions tracking of machine learning models as the first step towards sustainable AI

1 code implementation • 31 Jul 2022 • Semen Budennyy, Vladimir Lazarev, Nikita Zakharenko, Alexey Korovin, Olga Plosskaya, Denis Dimitrov, Vladimir Arkhipkin, Ivan Oseledets, Ivan Barsola, Ilya Egorov, Aleksandra Kosterina, Leonid Zhukov

The size and complexity of deep neural networks continue to grow exponentially, significantly increasing energy consumption for training and inference by these models.

215

RuCLIP -- new models and experiments: a technical report

1 code implementation • 22 Feb 2022 • Alex Shonenkov, Andrey Kuznetsov, Denis Dimitrov, Tatyana Shavrina, Daniil Chesakov, Anastasia Maltseva, Alena Fenogenova, Igor Pavlov, Anton Emelyanov, Sergey Markov, Daria Bakshandaeva, Vera Shybaeva, Andrey Chertok

In the report, we propose six new implementations of the ruCLIP model trained on our 240M pairs.

Translation

762

Survey on Large Scale Neural Network Training

no code implementations • 21 Feb 2022 • Julia Gusak, Daria Cherniuk, Alena Shilova, Alexander Katrutsa, Daniel Bershatsky, Xunyi Zhao, Lionel Eyraud-Dubois, Oleg Shlyazhko, Denis Dimitrov, Ivan Oseledets, Olivier Beaumont

Modern Deep Neural Networks (DNNs) require significant memory to store weights, activations, and other intermediate tensors during training.

A new face swap method for image and video domains: a technical report

no code implementations • 7 Feb 2022 • Daniil Chesakov, Anastasia Maltseva, Alexander Groshev, Andrey Kuznetsov, Denis Dimitrov

Deep fake technology has become a hot field of research in the last few years.

Face Swapping • Specificity • +1

Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction

2 code implementations • 1 Feb 2022 • Georgii Novikov, Daniel Bershatsky, Julia Gusak, Alex Shonenkov, Denis Dimitrov, Ivan Oseledets

Every modern neural network model has quite a few pointwise nonlinearities in its architecture, and such operations induce additional memory costs which -- as we show -- can be significantly reduced by quantization of the gradients.

Neural Network Compression • Quantization

Handwritten text generation and strikethrough characters augmentation

no code implementations • 14 Dec 2021 • Alex Shonenkov, Denis Karachev, Max Novopoltsev, Mark Potanin, Denis Dimitrov, Andrey Chertok

We introduce two data augmentation techniques, which, used with a Resnet-BiLSTM-CTC network, significantly reduce Word Error Rate (WER) and Character Error Rate (CER) beyond best-reported results on handwriting text recognition (HTR) tasks.

Data Augmentation • HTR • +1

Emojich -- zero-shot emoji generation using Russian language: a technical report

no code implementations • 4 Dec 2021 • Alex Shonenkov, Daria Bakshandaeva, Denis Dimitrov, Aleksandr Nikolich

This technical report presents a text-to-image neural network "Emojich" that generates emojis using captions in the Russian language as a condition.

Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture

1 code implementation • 22 Nov 2021 • Daria Bakshandaeva, Denis Dimitrov, Vladimir Arkhipkin, Alex Shonenkov, Mark Potanin, Denis Karachev, Andrey Kuznetsov, Anton Voronov, Vera Davydova, Elena Tutubalina, Aleksandr Petiushko

Supporting the current trend in the AI community, we present the AI Journey 2021 Challenge called Fusion Brain, the first competition aimed at creating a universal architecture that can process different modalities (in this case, images, texts, and code) and solve multiple tasks for vision and language.

Handwritten Text Recognition • object-detection • +4

StackMix and Blot Augmentations for Handwritten Text Recognition

1 code implementation • 26 Aug 2021 • Alex Shonenkov, Denis Karachev, Maxim Novopoltsev, Mark Potanin, Denis Dimitrov

This paper proposes a handwritten text recognition (HTR) system that outperforms current state-of-the-art methods.

Ranked #1 on Handwritten Text Recognition on IAM-D

Data Augmentation • Handwritten Text Recognition • +2

Digital Peter: Dataset, Competition and Handwriting Recognition Methods

2 code implementations • 16 Mar 2021 • Mark Potanin, Denis Dimitrov, Alex Shonenkov, Vladimir Bataev, Denis Karachev, Maxim Novopoltsev

This paper presents a new dataset of Peter the Great's manuscripts and describes a segmentation procedure that converts initial images of documents into lines.

BIG-bench Machine Learning • Handwriting Recognition • +1
