Search Results for author: Joan Puigcerver

Found 20 papers, 11 papers with code

Sparse MoEs meet Efficient Ensembles

1 code implementation • 7 Oct 2021 • James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models.

Few-Shot Learning
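
The excerpt above contrasts aggregating submodels at the prediction level versus the activation level. Below is a minimal NumPy sketch of the two aggregation styles; the function names and toy setup are illustrative assumptions, not code from the paper.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def prediction_level_ensemble(logits_per_model):
    """Average the class probabilities predicted by each submodel."""
    return np.mean([softmax(z) for z in logits_per_model], axis=0)

def activation_level_ensemble(activations_per_model, shared_head):
    """Average intermediate activations, then apply one shared head."""
    return shared_head(np.mean(activations_per_model, axis=0))

# Toy usage: three "models" producing logits for 4 examples and 10 classes.
rng = np.random.default_rng(0)
logits = [rng.normal(size=(4, 10)) for _ in range(3)]
print(prediction_level_ensemble(logits).shape)  # (4, 10)
```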

Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints

1 code implementation • 9 Dec 2022 • Aran Komatsuzaki, Joan Puigcerver, James Lee-Thorp, Carlos Riquelme Ruiz, Basil Mustafa, Joshua Ainslie, Yi Tay, Mostafa Dehghani, Neil Houlsby

In this work, we propose sparse upcycling -- a simple way to reuse sunk training costs by initializing a sparsely activated Mixture-of-Experts model from a dense checkpoint.
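
A rough sketch of the initialization idea described in the excerpt: copy a dense checkpoint's MLP weights into every expert of the new MoE layer and start the router from scratch. The function and weight names are assumptions for illustration, not the paper's code.

```python
import numpy as np

def upcycle_dense_mlp(w_in, w_out, num_experts, seed=0):
    """Turn one dense MLP block into an MoE layer: each expert begins as an
    exact copy of the dense weights; the router is freshly initialized."""
    rng = np.random.default_rng(seed)
    expert_w_in = np.stack([w_in.copy() for _ in range(num_experts)])
    expert_w_out = np.stack([w_out.copy() for _ in range(num_experts)])
    router = rng.normal(scale=0.02, size=(w_in.shape[0], num_experts))
    return expert_w_in, expert_w_out, router
```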

From Sparse to Soft Mixtures of Experts

4 code implementations • 2 Aug 2023 • Joan Puigcerver, Carlos Riquelme, Basil Mustafa, Neil Houlsby

Sparse mixture of expert architectures (MoEs) scale model capacity without large increases in training or inference costs.
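
The title points from sparse routing to a fully differentiable "soft" alternative. Here is a minimal NumPy sketch of soft routing as I understand it: each expert slot receives a weighted average of all tokens, and each token output is a weighted average of slot outputs. Parameter names such as `Phi` and the slot layout are assumptions, not the reference implementation.

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe_layer(X, Phi, experts):
    """X: (tokens, d) inputs; Phi: (d, experts * slots) slot parameters;
    experts: one callable per expert, applied to its block of slots."""
    logits = X @ Phi
    D = softmax(logits, axis=0)   # dispatch: weights over tokens for each slot
    C = softmax(logits, axis=1)   # combine: weights over slots for each token
    slots_in = D.T @ X            # every slot is a weighted mix of all tokens
    per_expert = slots_in.shape[0] // len(experts)
    slots_out = np.concatenate([
        f(slots_in[i * per_expert:(i + 1) * per_expert])
        for i, f in enumerate(experts)
    ])
    return C @ slots_out          # each token mixes the slot outputs back
```

Because every weight is a softmax rather than a hard top-k choice, the whole layer stays differentiable, which is the contrast with sparse routing that the excerpt alludes to.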

Which Model to Transfer? Finding the Needle in the Growing Haystack

no code implementations • CVPR 2022 • Cedric Renggli, André Susano Pinto, Luka Rimanic, Joan Puigcerver, Carlos Riquelme, Ce Zhang, Mario Lucic

Transfer learning has been recently popularized as a data-efficient alternative to training models from scratch, in particular for computer vision tasks where it provides a remarkably solid baseline.

Transfer Learning

Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts

no code implementations • 6 Jun 2022 • Basil Mustafa, Carlos Riquelme, Joan Puigcerver, Rodolphe Jenatton, Neil Houlsby

MoEs are a natural fit for a multimodal backbone, since expert layers can learn an appropriate partitioning of modalities.

Contrastive Learning
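
One way to probe the "partitioning of modalities" mentioned in the excerpt is to tally how often each expert is chosen for image versus text tokens. This diagnostic is my own illustration, not a procedure taken from the paper.

```python
import numpy as np

def modality_expert_usage(expert_ids, modality_ids, num_experts):
    """Fraction of image vs. text tokens routed to each expert.

    expert_ids:   (tokens,) expert chosen for each token
    modality_ids: (tokens,) 0 for image tokens, 1 for text tokens
    """
    usage = np.zeros((2, num_experts))
    for m in (0, 1):
        chosen = expert_ids[modality_ids == m]
        if chosen.size:
            ids, counts = np.unique(chosen, return_counts=True)
            usage[m, ids] = counts / counts.sum()
    return usage
```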

Sparsity-Constrained Optimal Transport

no code implementations • 30 Sep 2022 • Tianlin Liu, Joan Puigcerver, Mathieu Blondel

The smoothness of the objectives increases as $k$ increases, giving rise to a trade-off between convergence speed and sparsity of the optimal plan.
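
For context on what $k$ controls, here is a hedged sketch of the kind of cardinality-constrained transport problem the title suggests; the exact formulation (whether rows or columns are constrained, and which regularizer is smoothed) is an assumption here, not quoted from the paper.

```latex
\min_{T \ge 0,\; T\mathbf{1} = a,\; T^{\top}\mathbf{1} = b}
  \ \langle T, C \rangle
  \quad \text{subject to} \quad
  \lVert t_j \rVert_0 \le k \ \text{for every column } t_j \text{ of } T .
```

Under such a constraint, a larger $k$ permits denser transport plans; the trade-off in the excerpt says the smoothed objectives used to solve the problem become easier to optimize as $k$ grows, at the price of a less sparse plan.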

On the Adversarial Robustness of Mixture of Experts

no code implementations • 19 Oct 2022 • Joan Puigcerver, Rodolphe Jenatton, Carlos Riquelme, Pranjal Awasthi, Srinadh Bhojanapalli

We next empirically evaluate the robustness of MoEs on ImageNet using adversarial attacks and show they are indeed more robust than dense models with the same computational cost.

Adversarial Robustness • Open-Ended Question Answering
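
The evaluation in the excerpt relies on adversarial attacks. As a generic illustration only (not the attack protocol used in the paper), here is a single-step FGSM perturbation for a toy linear softmax classifier; the model choice and all names are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fgsm(x, y, W, b, eps):
    """One-step FGSM: move inputs in the direction of the sign of the
    cross-entropy gradient of a linear softmax classifier."""
    probs = softmax(x @ W + b)                       # (n, num_classes)
    grad_x = (probs - np.eye(W.shape[1])[y]) @ W.T   # d(loss)/d(x)
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

def accuracy(x, y, W, b):
    return float(np.mean((x @ W + b).argmax(axis=-1) == y))
```

Comparing `accuracy` on clean inputs against `accuracy` on `fgsm(...)` outputs, at matched compute, is the shape of the comparison the excerpt describes between MoEs and dense models.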

Routers in Vision Mixture of Experts: An Empirical Study

no code implementations • 29 Jan 2024 • Tianlin Liu, Mathieu Blondel, Carlos Riquelme, Joan Puigcerver

Routers for sparse MoEs can be further grouped into two variants: Token Choice, which matches experts to each token, and Expert Choice, which matches tokens to each expert.

Language Modelling
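
A compact NumPy sketch of the two router families named in the excerpt: Token Choice lets every token pick its top-k experts, while Expert Choice lets every expert pick its top-capacity tokens. `router_logits[t, e]` scores token `t` for expert `e`; tie-breaking and capacity handling are simplified assumptions.

```python
import numpy as np

def token_choice(router_logits, k):
    """Token Choice: every token selects its top-k experts."""
    return np.argsort(-router_logits, axis=1)[:, :k]           # (tokens, k)

def expert_choice(router_logits, capacity):
    """Expert Choice: every expert selects its top-`capacity` tokens."""
    return np.argsort(-router_logits, axis=0)[:capacity, :].T  # (experts, capacity)

rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 4))          # 6 tokens, 4 experts
print(token_choice(logits, k=2))          # expert ids chosen per token
print(expert_choice(logits, capacity=3))  # token ids chosen per expert
```

The well-known difference between the two is that Token Choice can leave experts unevenly loaded, whereas Expert Choice fixes each expert's load but may leave some tokens unprocessed or processed by several experts.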
