1 code implementation • 20 May 2022 • Alexander Kolesnikov, André Susano Pinto, Lucas Beyer, Xiaohua Zhai, Jeremiah Harmsen, Neil Houlsby
We introduce UViM, a unified approach capable of modeling a wide range of computer vision tasks.
2 code implementations • 3 May 2022 • Lucas Beyer, Xiaohua Zhai, Alexander Kolesnikov
It is commonly accepted that the Vision Transformer model requires sophisticated regularization techniques to excel at ImageNet-1k scale data.
1 code implementation • CVPR 2022 • Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, Matan Sela, Vincent Sitzmann, Austin Stone, Deqing Sun, Suhani Vora, Ziyu Wang, Tianhao Wu, Kwang Moo Yi, Fangcheng Zhong, Andrea Tagliasacchi
Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details.
no code implementations • 17 Dec 2021 • Wuyang Chen, Xianzhi Du, Fan Yang, Lucas Beyer, Xiaohua Zhai, Tsung-Yi Lin, Huizhong Chen, Jing Li, Xiaodan Song, Zhangyang Wang, Denny Zhou
In this paper, we comprehensively study three architecture design choices on ViT -- spatial reduction, doubled channels, and multiscale features -- and demonstrate that a vanilla ViT architecture can fulfill this goal without handcrafting multiscale features, maintaining the original ViT design philosophy.
3 code implementations • CVPR 2022 • Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers, Alexander Kolesnikov, Lucas Beyer
This paper presents contrastive-tuning, a simple method employing contrastive training to align image and text models while still taking advantage of their pre-training.
no code implementations • ICLR 2022 • Mostafa Dehghani, Anurag Arnab, Lucas Beyer, Ashish Vaswani, Yi Tay
We further present suggestions to improve reporting of efficiency metrics.
8 code implementations • 18 Jun 2021 • Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer
Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation.
3 code implementations • CVPR 2022 • Lucas Beyer, Xiaohua Zhai, Amélie Royer, Larisa Markeeva, Rohan Anil, Alexander Kolesnikov
In particular, we uncover that there are certain implicit design choices, which may drastically affect the effectiveness of distillation.
Ranked #271 on Image Classification on ImageNet
1 code implementation • CVPR 2022 • Xiaohua Zhai, Alexander Kolesnikov, Neil Houlsby, Lucas Beyer
As a result, we successfully train a ViT model with two billion parameters, which attains a new state-of-the-art on ImageNet of 90.45% top-1 accuracy.
Ranked #3 on Image Classification on VTAB-1k (using extra training data)
39 code implementations • NeurIPS 2021 • Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
Convolutional Neural Networks (CNNs) are the go-to model for computer vision.
Ranked #18 on Image Classification on OmniBenchmark
1 code implementation • 9 Apr 2021 • Jessica Yung, Rob Romijnders, Alexander Kolesnikov, Lucas Beyer, Josip Djolonga, Neil Houlsby, Sylvain Gelly, Mario Lucic, Xiaohua Zhai
Before deploying machine learning models it is critical to assess their robustness.
108 code implementations • ICLR 2021 • Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited.
Ranked #1 on Image Classification on CIFAR-10 (using extra training data)
Document Image Classification
Fine-Grained Image Classification
1 code implementation • CVPR 2021 • Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D'Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, Mario Lucic
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
1 code implementation • 12 Jun 2020 • Lucas Beyer, Olivier J. Hénaff, Alexander Kolesnikov, Xiaohua Zhai, Aäron van den Oord
Yes, and no.
6 code implementations • ECCV 2020 • Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby
We conduct detailed analysis of the main components that lead to high transfer performance.
Ranked #1 on Fine-Grained Image Classification on Oxford 102 Flowers (using extra training data)
2 code implementations • arXiv 2020 • Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby
And, how close are we to general visual representations?
Ranked #9 on Image Classification on VTAB-1k (using extra training data)
no code implementations • 25 Sep 2019 • Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby
Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets.
no code implementations • 1 Jul 2019 • Lucas Beyer, Damien Vincent, Olivier Teboul, Sylvain Gelly, Matthieu Geist, Olivier Pietquin
An agent learning through interactions should balance its action selection process between probing the environment to discover new rewards and using the information acquired in the past to adopt useful behaviour.
no code implementations • 30 May 2019 • Gabriel Dulac-Arnold, Neil Zeghidour, Marco Cuturi, Lucas Beyer, Jean-Philippe Vert
We propose a learning algorithm capable of learning from label proportions instead of direct data labels.
1 code implementation • ICCV 2019 • Xiaohua Zhai, Avital Oliver, Alexander Kolesnikov, Lucas Beyer
This work tackles the problem of semi-supervised learning of image classifiers.
Ranked #11 on Semi-Supervised Image Classification on ImageNet - 10% labeled data (Top 5 Accuracy metric)
5 code implementations • CVPR 2019 • Alexander Kolesnikov, Xiaohua Zhai, Lucas Beyer
Unsupervised visual representation learning remains a largely unsolved problem in computer vision research.
Ranked #96 on Self-Supervised Image Classification on ImageNet
Representation Learning
Self-Supervised Image Classification
1 code implementation • 26 Apr 2018 • Stefan Breuers, Lucas Beyer, Umer Rafi, Bastian Leibe
In the past decade many robots were deployed in the wild, and people detection and tracking is an important component of such deployments.
1 code implementation • 6 Apr 2018 • Lucas Beyer, Alexander Hermans, Timm Linder, Kai O. Arras, Bastian Leibe
Detecting humans is a key skill for mobile robots and intelligent vehicles in a large variety of applications.
2 code implementations • 31 May 2017 • Vitaly Kurin, Sebastian Nowozin, Katja Hofmann, Lucas Beyer, Bastian Leibe
Recent progress in Reinforcement Learning (RL), fueled by its combination with Deep Learning, has enabled impressive results in learning to interact with complex virtual environments, yet real-world applications of RL are still scarce.
2 code implementations • 12 May 2017 • Lucas Beyer, Stefan Breuers, Vitaly Kurin, Bastian Leibe
With the rise of end-to-end learning through deep learning, person detectors and re-identification (ReID) models have recently become very strong.
29 code implementations • 22 Mar 2017 • Alexander Hermans, Lucas Beyer, Bastian Leibe
In the past few years, the field of computer vision has gone through a revolution fueled mainly by the advent of large datasets and the adoption of deep convolutional neural networks for end-to-end learning.
Ranked #3 on Person Re-Identification on CUHK03 (Rank-5 metric)
no code implementations • 8 Mar 2016 • Lucas Beyer, Alexander Hermans, Bastian Leibe
We propose a Convolutional Neural Network (CNN) based detector for this task.