Search Results for author: Alexander Kolesnikov

Found 33 papers, 21 papers with code

Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

no code implementations22 May 2023 Ibrahim Alabdulmohsin, Xiaohua Zhai, Alexander Kolesnikov, Lucas Beyer

Scaling laws have been recently employed to derive compute-optimal model size (number of parameters) for a given compute duration.

Image Classification Visual Question Answering (VQA)
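
The abstract above describes deriving compute-optimal model size from scaling laws. A common recipe for this (sketched below as a minimal illustration; the function names, data, and single-exponent power-law form are assumptions, not this paper's method) is to measure the best model size at several compute budgets and fit a power law in log space:

```python
import numpy as np

def fit_power_law(compute, n_opt):
    """Fit N_opt = a * C^b by least squares in log space.

    compute : array of compute budgets C
    n_opt   : best-performing model size observed at each budget
    """
    logc, logn = np.log(compute), np.log(n_opt)
    b, log_a = np.polyfit(logc, logn, 1)  # slope is the scaling exponent
    return np.exp(log_a), b

def predict(a, b, compute):
    """Extrapolate the compute-optimal size to a new budget."""
    return a * compute ** b
```

The fitted exponent can then be used to extrapolate the optimal parameter count to compute budgets larger than any in the sweep.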

Tuning computer vision models with task rewards

no code implementations16 Feb 2023 André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, Xiaohua Zhai

Misalignment between model predictions and intended usage can be detrimental for the deployment of computer vision models.

Colorization Image Captioning +5

Better plain ViT baselines for ImageNet-1k

2 code implementations3 May 2022 Lucas Beyer, Xiaohua Zhai, Alexander Kolesnikov

It is commonly accepted that the Vision Transformer model requires sophisticated regularization techniques to excel at ImageNet-1k scale data.

Data Augmentation Image Classification

LiT: Zero-Shot Transfer with Locked-image text Tuning

4 code implementations CVPR 2022 Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers, Alexander Kolesnikov, Lucas Beyer

This paper presents contrastive-tuning, a simple method employing contrastive training to align image and text models while still taking advantage of their pre-training.

Image Classification Retrieval +2
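
The contrastive-tuning idea above aligns a locked (frozen) image tower with a trainable text tower via a contrastive objective over paired embeddings. A minimal sketch of such a symmetric contrastive (InfoNCE-style) loss is below; the numpy setup, function names, and temperature value are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired image/text embeddings.

    In a LiT-style setup, img_emb would come from a frozen pre-trained image
    tower and txt_emb from a text tower being trained to align with it.
    """
    # L2-normalize both towers' outputs so logits are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature   # (B, B); matching pairs on the diagonal
    labels = np.arange(len(img))

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # average the image->text and text->image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Perfectly aligned pairs put the largest similarity on the diagonal, so the loss is low; shuffling the pairing raises it.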

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

10 code implementations18 Jun 2021 Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer

Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation.

Data Augmentation Image Classification +5

Scaling Vision Transformers

1 code implementation CVPR 2022 Xiaohua Zhai, Alexander Kolesnikov, Neil Houlsby, Lucas Beyer

As a result, we successfully train a ViT model with two billion parameters, which attains a new state-of-the-art on ImageNet of 90.45% top-1 accuracy.

Ranked #3 on Image Classification on VTAB-1k (using extra training data)

Few-Shot Image Classification Few-Shot Learning

Detecting Visual Relationships Using Box Attention

no code implementations5 Jul 2018 Alexander Kolesnikov, Alina Kuznetsova, Christoph H. Lampert, Vittorio Ferrari

We propose a new model for detecting visual relationships, such as "person riding motorcycle" or "bottle on table".

Object Detection

Probabilistic Image Colorization

1 code implementation11 May 2017 Amelie Royer, Alexander Kolesnikov, Christoph H. Lampert

We develop a probabilistic technique for colorizing grayscale natural images.

Colorization Image Colorization

PixelCNN Models with Auxiliary Variables for Natural Image Modeling

no code implementations ICML 2017 Alexander Kolesnikov, Christoph H. Lampert

We study probabilistic models of natural images and extend the autoregressive family of PixelCNN architectures by incorporating auxiliary variables.

Ranked #13 on Image Generation on ImageNet 64x64 (Bits per dim metric)

Image Generation

iCaRL: Incremental Classifier and Representation Learning

9 code implementations CVPR 2017 Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, Georg Sperl, Christoph H. Lampert

A major open problem on the road to artificial intelligence is the development of incrementally learning systems that learn about more and more concepts over time from a stream of data.

Class Incremental Learning +2

Improving Weakly-Supervised Object Localization By Micro-Annotation

no code implementations18 May 2016 Alexander Kolesnikov, Christoph H. Lampert

Weakly-supervised object localization methods tend to fail for object classes that consistently co-occur with the same background elements, e.g. trains on tracks.

Semantic Segmentation Weakly-Supervised Object Localization

Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation

2 code implementations19 Mar 2016 Alexander Kolesnikov, Christoph H. Lampert

We introduce a new loss function for the weakly-supervised training of semantic image segmentation models based on three guiding principles: to seed with weak localization cues, to expand objects based on the information about which classes can occur in an image, and to constrain the segmentations to coincide with object boundaries.

Image Segmentation Semantic Segmentation
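
Of the three principles named above, the seeding term is the most direct to sketch: supervise the segmentation network only at pixels where weak localization cues give a confident class. The snippet below illustrates just that term (the expand and constrain terms are omitted, and the array layout and names are assumptions, not the paper's code):

```python
import numpy as np

def seed_loss(probs, seeds):
    """Seeding term of an SEC-style weakly-supervised segmentation loss.

    probs : (H, W, C) per-pixel class probabilities from the network
    seeds : (H, W) int array; class index at seed pixels, -1 elsewhere
    """
    mask = seeds >= 0
    if not mask.any():
        return 0.0
    # cross-entropy against the cued class, only at seed locations
    picked = probs[mask, seeds[mask]]
    return float(-np.log(picked + 1e-12).mean())
```

Unseeded pixels contribute nothing, which is what lets the expand and constrain terms shape the rest of the prediction.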

Identifying Reliable Annotations for Large Scale Image Segmentation

no code implementations28 Apr 2015 Alexander Kolesnikov, Christoph H. Lampert

In this work, we present a Gaussian process (GP) based technique for simultaneously identifying which images of a training set have unreliable annotation and learning a segmentation model in which the negative effect of these images is suppressed.

Image Segmentation Semantic Segmentation
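
The abstract above describes a Gaussian-process technique for scoring annotation reliability so that unreliable images can be suppressed. A toy illustration of the GP-regression ingredient is sketched below; the RBF kernel, variable names, and the idea of regressing a hand-checked quality score are assumptions for illustration, not the paper's actual formulation:

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Squared-exponential kernel between two sets of feature vectors."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_reliability(X_train, quality, X_all, noise=0.1):
    """GP-regress an annotation-quality score from a few vetted images.

    X_train : (N, D) features of images whose annotation quality was checked
    quality : (N,)  their quality scores
    X_all   : (M, D) features of all training images
    Returns the GP posterior mean at X_all; low-scoring images can then be
    down-weighted or dropped when training the segmentation model.
    """
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    alpha = np.linalg.solve(K, quality)
    return rbf_kernel(X_all, X_train) @ alpha
```

With a small noise term, the posterior mean reproduces the vetted scores at the vetted images and interpolates smoothly elsewhere.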

Closed-Form Training of Conditional Random Fields for Large Scale Image Segmentation

no code implementations27 Mar 2014 Alexander Kolesnikov, Matthieu Guillaumin, Vittorio Ferrari, Christoph H. Lampert

It is inspired by existing closed-form expressions for the maximum likelihood parameters of a generative graphical model with tree topology.

Image Segmentation Semantic Segmentation
