Search Results for author: Lucas Beyer

Found 39 papers, 34 papers with code

Image Captioners Are Scalable Vision Learners Too

1 code implementation NeurIPS 2023 Michael Tschannen, Manoj Kumar, Andreas Steiner, Xiaohua Zhai, Neil Houlsby, Lucas Beyer

We further analyze the effect of the model architecture and scale, as well as the pretraining data on the representation quality, and find that captioning exhibits the same or better scaling behavior along these axes.

Image Captioning

Tuning computer vision models with task rewards

1 code implementation 16 Feb 2023 André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, Xiaohua Zhai

Misalignment between model predictions and intended usage can be detrimental for the deployment of computer vision models.

Colorization Image Captioning +5

VeLO: Training Versatile Learned Optimizers by Scaling Up

1 code implementation 17 Nov 2022 Luke Metz, James Harrison, C. Daniel Freeman, Amil Merchant, Lucas Beyer, James Bradbury, Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Sohl-Dickstein

While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers.

Better plain ViT baselines for ImageNet-1k

4 code implementations 3 May 2022 Lucas Beyer, Xiaohua Zhai, Alexander Kolesnikov

It is commonly accepted that the Vision Transformer model requires sophisticated regularization techniques to excel at ImageNet-1k scale data.

Data Augmentation Image Classification

A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation

3 code implementations 17 Dec 2021 Wuyang Chen, Xianzhi Du, Fan Yang, Lucas Beyer, Xiaohua Zhai, Tsung-Yi Lin, Huizhong Chen, Jing Li, Xiaodan Song, Zhangyang Wang, Denny Zhou

In this paper, we comprehensively study three architecture design choices on ViT -- spatial reduction, doubled channels, and multiscale features -- and demonstrate that a vanilla ViT architecture can fulfill this goal without handcrafting multiscale features, maintaining the original ViT design philosophy.

Image Classification Instance Segmentation +6

LiT: Zero-Shot Transfer with Locked-image text Tuning

4 code implementations CVPR 2022 Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers, Alexander Kolesnikov, Lucas Beyer

This paper presents contrastive-tuning, a simple method employing contrastive training to align image and text models while still taking advantage of their pre-training.

Image Classification Retrieval +3

The Efficiency Misnomer

no code implementations ICLR 2022 Mostafa Dehghani, Anurag Arnab, Lucas Beyer, Ashish Vaswani, Yi Tay

We further present suggestions to improve reporting of efficiency metrics.

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

14 code implementations 18 Jun 2021 Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer

Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation.

Data Augmentation Image Classification +5

Scaling Vision Transformers

1 code implementation CVPR 2022 Xiaohua Zhai, Alexander Kolesnikov, Neil Houlsby, Lucas Beyer

As a result, we successfully train a ViT model with two billion parameters, which attains a new state-of-the-art on ImageNet of 90.45% top-1 accuracy.

Ranked #3 on Image Classification on VTAB-1k (using extra training data)

Few-Shot Image Classification Few-Shot Learning

MULEX: Disentangling Exploitation from Exploration in Deep RL

no code implementations 1 Jul 2019 Lucas Beyer, Damien Vincent, Olivier Teboul, Sylvain Gelly, Matthieu Geist, Olivier Pietquin

An agent learning through interactions should balance its action selection process between probing the environment to discover new rewards and using the information acquired in the past to adopt useful behaviour.

Detection-Tracking for Efficient Person Analysis: The DetTA Pipeline

1 code implementation 26 Apr 2018 Stefan Breuers, Lucas Beyer, Umer Rafi, Bastian Leibe

In the past decade many robots were deployed in the wild, and people detection and tracking is an important component of such deployments.

Attribute

Deep Person Detection in 2D Range Data

1 code implementation 6 Apr 2018 Lucas Beyer, Alexander Hermans, Timm Linder, Kai O. Arras, Bastian Leibe

Detecting humans is a key skill for mobile robots and intelligent vehicles in a large variety of applications.

Human Detection

The Atari Grand Challenge Dataset

2 code implementations 31 May 2017 Vitaly Kurin, Sebastian Nowozin, Katja Hofmann, Lucas Beyer, Bastian Leibe

Recent progress in Reinforcement Learning (RL), fueled by its combination with Deep Learning, has enabled impressive results in learning to interact with complex virtual environments, yet real-world applications of RL are still scarce.

Imitation Learning Reinforcement Learning (RL)

Towards a Principled Integration of Multi-Camera Re-Identification and Tracking through Optimal Bayes Filters

2 code implementations 12 May 2017 Lucas Beyer, Stefan Breuers, Vitaly Kurin, Bastian Leibe

With the rise of end-to-end learning through deep learning, person detectors and re-identification (ReID) models have recently become very strong.

In Defense of the Triplet Loss for Person Re-Identification

31 code implementations 22 Mar 2017 Alexander Hermans, Lucas Beyer, Bastian Leibe

In the past few years, the field of computer vision has gone through a revolution fueled mainly by the advent of large datasets and the adoption of deep convolutional neural networks for end-to-end learning.

Ranked #3 on Person Re-Identification on CUHK03 (Rank-5 metric)

General Classification Metric Learning +1

DROW: Real-Time Deep Learning based Wheelchair Detection in 2D Range Data

no code implementations 8 Mar 2016 Lucas Beyer, Alexander Hermans, Bastian Leibe

We propose a Convolutional Neural Network (CNN) based detector for this task.
