Search Results for author: Daniel Keysers

Found 16 papers, 8 papers with code

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

1 code implementation • 13 Oct 2023 • Xi Chen, Xiao Wang, Lucas Beyer, Alexander Kolesnikov, Jialin Wu, Paul Voigtlaender, Basil Mustafa, Sebastian Goodman, Ibrahim Alabdulmohsin, Piotr Padlewski, Daniel Salz, Xi Xiong, Daniel Vlasic, Filip Pavetic, Keran Rong, Tianli Yu, Daniel Keysers, Xiaohua Zhai, Radu Soricut

This paper presents PaLI-3, a smaller, faster, and stronger vision language model (VLM) that compares favorably to similar models that are 10x larger.

Ranked #2 on Temporal/Causal QA on NExT-QA (using extra training data)

Chart Question Answering • Image Classification • +4

117

Video OWL-ViT: Temporally-consistent open-world localization in video

no code implementations • ICCV 2023 • Georg Heigold, Matthias Minderer, Alexey Gritsenko, Alex Bewley, Daniel Keysers, Mario Lučić, Fisher Yu, Thomas Kipf

Our model is end-to-end trainable on video data and enjoys improved temporal consistency compared to tracking-by-detection baselines, while retaining the open-world capabilities of the backbone detector.

Object • Object Localization

PaLI-X: On Scaling up a Multilingual Vision and Language Model

2 code implementations • 29 May 2023 • Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, AJ Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut

We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture.

Ranked #1 on Fine-Grained Image Recognition on OVEN

Chart Question Answering • Document Understanding • +9

Scaling Vision Transformers to 22 Billion Parameters

1 code implementation • 10 Feb 2023 • Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby

The scaling of Transformers has driven breakthrough capabilities for language models.

Ranked #1 on Zero-Shot Transfer Image Classification on ObjectNet

Action Classification • Fairness • +3

192

LiT: Zero-Shot Transfer with Locked-image text Tuning

4 code implementations • CVPR 2022 • Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers, Alexander Kolesnikov, Lucas Beyer

This paper presents contrastive-tuning, a simple method employing contrastive training to align image and text models while still taking advantage of their pre-training.

Ranked #1 on Zero-Shot Transfer Image Classification on ImageNet ReaL

Image Classification • Retrieval • +2

9,238

The Impact of Reinitialization on Generalization in Convolutional Neural Networks

no code implementations • 1 Sep 2021 • Ibrahim Alabdulmohsin, Hartmut Maennel, Daniel Keysers

Recent results suggest that reinitializing a subset of the parameters of a neural network during training can improve generalization, particularly for small training sets.

Generalization Bounds • Image Classification • +1

Continental-Scale Building Detection from High Resolution Satellite Imagery

no code implementations • 26 Jul 2021 • Wojciech Sirko, Sergii Kashubin, Marvin Ritter, Abigail Annkah, Yasser Salah Eddine Bouchareb, Yann Dauphin, Daniel Keysers, Maxim Neumann, Moustapha Cisse, John Quinn

Identifying the locations and footprints of buildings is vital for many practical and scientific purposes.

Instance Segmentation • Semantic Segmentation • +1

A Generalized Lottery Ticket Hypothesis

no code implementations • 3 Jul 2021 • Ibrahim Alabdulmohsin, Larisa Markeeva, Daniel Keysers, Ilya Tolstikhin

We introduce a generalization to the lottery ticket hypothesis in which the notion of "sparsity" is relaxed by choosing an arbitrary basis in the space of parameters.

Scaling Vision with Sparse Mixture of Experts

1 code implementation • NeurIPS 2021 • Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, Neil Houlsby

We present a Vision MoE (V-MoE), a sparse version of the Vision Transformer, that is scalable and competitive with the largest dense networks.

Ranked #1 on Few-Shot Image Classification on ImageNet - 5-shot

Few-Shot Image Classification

507

MLP-Mixer: An all-MLP Architecture for Vision

46 code implementations • NeurIPS 2021 • Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy

Convolutional Neural Networks (CNNs) are the go-to model for computer vision.

Ranked #17 on Image Classification on OmniBenchmark

Image Classification

47,783

Deep Ensembles for Low-Data Transfer Learning

no code implementations • 14 Oct 2020 • Basil Mustafa, Carlos Riquelme, Joan Puigcerver, André Susano Pinto, Daniel Keysers, Neil Houlsby

In the low-data regime, it is difficult to train good supervised models from scratch.

Ranked #6 on Image Classification on VTAB-1k (using extra training data)

Image Classification • Transfer Learning

Scalable Transfer Learning with Expert Models

no code implementations • ICLR 2021 • Joan Puigcerver, Carlos Riquelme, Basil Mustafa, Cedric Renggli, André Susano Pinto, Sylvain Gelly, Daniel Keysers, Neil Houlsby

We explore the use of expert representations for transfer with a simple, yet effective, strategy.

Ranked #11 on Image Classification on VTAB-1k (using extra training data)

Image Classification • Transfer Learning

What Do Neural Networks Learn When Trained With Random Labels?

no code implementations • NeurIPS 2020 • Hartmut Maennel, Ibrahim Alabdulmohsin, Ilya Tolstikhin, Robert J. N. Baldock, Olivier Bousquet, Sylvain Gelly, Daniel Keysers

We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream compared to training from scratch even after accounting for simple effects, such as weight scaling.

Memorization

Predicting Neural Network Accuracy from Weights

1 code implementation • 26 Feb 2020 • Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet, Ilya Tolstikhin

Furthermore, the predictors are able to rank networks trained on different, unobserved datasets and with different architectures.

Measuring Compositional Generalization: A Comprehensive Method on Realistic Data

3 code implementations • ICLR 2020 • Daniel Keysers, Nathanael Schärli, Nathan Scales, Hylke Buisman, Daniel Furrer, Sergii Kashubin, Nikola Momchev, Danila Sinopalnikov, Lukasz Stafiniak, Tibor Tihon, Dmitry Tsarkov, Xiao Wang, Marc van Zee, Olivier Bousquet

We present a large and realistic natural language question answering dataset that is constructed according to this method, and we use it to analyze the compositional generalization ability of three machine learning architectures.

Ranked #5 on Semantic Parsing on CFQ

BIG-bench Machine Learning • Question Answering • +1

32,781

Fast Multi-language LSTM-based Online Handwriting Recognition

no code implementations • 22 Feb 2019 • Victor Carbune, Pedro Gonnet, Thomas Deselaers, Henry A. Rowley, Alexander Daryin, Marcos Calvo, Li-Lun Wang, Daniel Keysers, Sandro Feuz, Philippe Gervais

We describe an online handwriting system that is able to support 102 languages using a deep neural network architecture.

Handwriting Recognition
