Search Results for author: Andrey Zhmoginov

Found 20 papers, 6 papers with code

Continual Few-Shot Learning Using HyperTransformers

no code implementations • 11 Jan 2023 • Max Vladymyrov, Andrey Zhmoginov, Mark Sandler

We focus on the problem of learning without forgetting from multiple tasks arriving sequentially, where each task is defined using a few-shot episode of novel or already seen classes.

Class Incremental Learning continual few-shot learning +2

Paper
Add Code

Training trajectories, mini-batch losses and the curious role of the learning rate

no code implementations • 5 Jan 2023 • Mark Sandler, Andrey Zhmoginov, Max Vladymyrov, Nolan Miller

In particular, for Exponential Moving Average (EMA) and Stochastic Weight Averaging we show that our proposed model matches the observed training trajectories on ImageNet.

Paper
Add Code

Transformers learn in-context by gradient descent

1 code implementation • 15 Dec 2022 • Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, Max Vladymyrov

We start by providing a simple weight construction that shows the equivalence of data transformations induced by 1) a single linear self-attention layer and by 2) gradient-descent (GD) on a regression loss.

In-Context Learning Meta-Learning +1

279

Paper
Code

Decentralized Learning with Multi-Headed Distillation

no code implementations • CVPR 2023 • Andrey Zhmoginov, Mark Sandler, Nolan Miller, Gus Kristiansen, Max Vladymyrov

We study the effects of data and model architecture heterogeneity and the impact of the underlying communication graph topology on learning efficiency and show that our agents can significantly improve their performance compared to learning in isolation.

Paper
Add Code

Fine-tuning Image Transformers using Learnable Memory

1 code implementation • CVPR 2022 • Mark Sandler, Andrey Zhmoginov, Max Vladymyrov, Andrew Jackson

In this paper we propose augmenting Vision Transformer models with learnable memory tokens.

Paper
Code

HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning

1 code implementation • 11 Jan 2022 • Andrey Zhmoginov, Mark Sandler, Max Vladymyrov

In this work we propose a HyperTransformer, a Transformer-based model for supervised and semi-supervised few-shot learning that generates weights of a convolutional neural network (CNN) directly from support samples.

Ranked #5 on Few-Shot Image Classification on OMNIGLOT - 1-Shot, 20-way

Few-Shot Image Classification Few-Shot Learning

32,750

Paper
Code

HyperTransformer: Attention-Based CNN Model Generation from Few Samples

no code implementations • 29 Sep 2021 • Andrey Zhmoginov, Max Vladymyrov, Mark Sandler

In this work we propose a HyperTransformer, a transformer based model that generates all weights of a CNN model directly from the support samples.

Few-Shot Learning

Paper
Add Code

Compositional Models: Multi-Task Learning and Knowledge Transfer with Modular Networks

no code implementations • 23 Jul 2021 • Andrey Zhmoginov, Dina Bashkirova, Mark Sandler

From practical perspective, our approach allows to: (a) reuse existing modules for learning new task by adjusting the computation order, (b) use it for unsupervised multi-source domain adaptation to illustrate that adaptation to unseen data can be achieved by only manipulating the order of pretrained modules, (c) show how our approach can be used to increase accuracy of existing architectures for image classification tasks such as ImageNet, without any parameter increase, by reusing the same block multiple times.

Domain Adaptation Image Classification +1

Paper
Add Code

BasisNet: Two-stage Model Synthesis for Efficient Inference

no code implementations • 7 May 2021 • Mingda Zhang, Chun-Te Chu, Andrey Zhmoginov, Andrew Howard, Brendan Jou, Yukun Zhu, Li Zhang, Rebecca Hwa, Adriana Kovashka

With early termination, the average cost can be further reduced to 198M MAdds while maintaining accuracy of 80. 0% on ImageNet.

Ranked #661 on Image Classification on ImageNet

Efficient Neural Network Image Classification +1

Paper
Add Code

Meta-Learning Bidirectional Update Rules

1 code implementation • 10 Apr 2021 • Mark Sandler, Max Vladymyrov, Andrey Zhmoginov, Nolan Miller, Andrew Jackson, Tom Madams, Blaise Aguera y Arcas

We show that classical gradient-based backpropagation in neural networks can be seen as a special case of a two-state network where one state is used for activations and another for gradients, with update rules derived from the chain rule.

Meta-Learning

32,753

Paper
Code

Large-Scale Generative Data-Free Distillation

no code implementations • 10 Dec 2020 • Liangchen Luo, Mark Sandler, Zi Lin, Andrey Zhmoginov, Andrew Howard

Knowledge distillation is one of the most popular and effective techniques for knowledge transfer, model compression and semi-supervised learning.

Knowledge Distillation Model Compression +1

Paper
Add Code

Image segmentation via Cellular Automata

no code implementations • 11 Aug 2020 • Mark Sandler, Andrey Zhmoginov, Liangcheng Luo, Alexander Mordvintsev, Ettore Randazzo, Blaise Agúera y Arcas

The update rule is applied repeatedly in parallel to a large random subset of cells and after convergence is used to produce segmentation masks that are then back-propagated to learn the optimal update rules using standard gradient descent methods.

Image Segmentation Segmentation +1

Paper
Add Code

Non-discriminative data or weak model? On the relative importance of data and model resolution

no code implementations • 7 Sep 2019 • Mark Sandler, Jonathan Baccash, Andrey Zhmoginov, Andrew Howard

We explore the question of how the resolution of the input image ("input resolution") affects the performance of a neural network when compared to the resolution of the hidden layers ("internal resolution").

Paper
Add Code

Information-Bottleneck Approach to Salient Region Discovery

no code implementations • 22 Jul 2019 • Andrey Zhmoginov, Ian Fischer, Mark Sandler

We propose a new method for learning image attention masks in a semi-supervised setting based on the Information Bottleneck principle.

Paper
Add Code

K For The Price Of 1: Parameter Efficient Multi-task And Transfer Learning

no code implementations • ICLR 2019 • Pramod Kaushik Mudrakarta, Mark Sandler, Andrey Zhmoginov, Andrew Howard

We introduce a novel method that enables parameter-efficient transfer and multitask learning.

Multi-Task Learning

Paper
Add Code

K for the Price of 1: Parameter-efficient Multi-task and Transfer Learning

no code implementations • ICLR 2019 • Pramod Kaushik Mudrakarta, Mark Sandler, Andrey Zhmoginov, Andrew Howard

We introduce a novel method that enables parameter-efficient transfer and multi-task learning with deep neural networks.

Image Classification Multi-Task Learning

Paper
Add Code

MobileNetV2: Inverted Residuals and Linear Bottlenecks

148 code implementations • CVPR 2018 • Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes.

Ranked #7 on Retinal OCT Disease Classification on OCT2017

Image Classification Image Segmentation +4

76,585

Paper
Code

CycleGAN, a Master of Steganography

no code implementations • 8 Dec 2017 • Casey Chu, Andrey Zhmoginov, Mark Sandler

CycleGAN (Zhu et al. 2017) is one recent successful approach to learn a transformation between two image distributions.

Paper
Add Code

The Power of Sparsity in Convolutional Neural Networks

no code implementations • 21 Feb 2017 • Soravit Changpinyo, Mark Sandler, Andrey Zhmoginov

Deep convolutional networks are well-known for their high computational and memory demands.

Paper
Add Code

Inverting face embeddings with convolutional neural networks

1 code implementation • 14 Jun 2016 • Andrey Zhmoginov, Mark Sandler

Deep neural networks have dramatically advanced the state of the art for many areas of machine learning.

Face Transfer

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.