1 code implementation • 23 Aug 2023 • Roy Hirsch, Mathilde Caron, Regev Cohen, Amir Livne, Ron Shapiro, Tomer Golany, Roman Goldenberg, Daniel Freedman, Ehud Rivlin
To fully exploit the power of SSL, we create sizable unlabeled endoscopic video datasets for training MSNs.
no code implementations • 12 Jul 2023 • Mostafa Dehghani, Basil Mustafa, Josip Djolonga, Jonathan Heek, Matthias Minderer, Mathilde Caron, Andreas Steiner, Joan Puigcerver, Robert Geirhos, Ibrahim Alabdulmohsin, Avital Oliver, Piotr Padlewski, Alexey Gritsenko, Mario Lučić, Neil Houlsby
The ubiquitous and demonstrably suboptimal choice of resizing images to a fixed resolution before processing them with computer vision models has not yet been successfully challenged.
no code implementations • 12 Jun 2023 • Ahmet Iscen, Mathilde Caron, Alireza Fathi, Cordelia Schmid
Remarkably, we show that this can be done with a light-weight, single-layer, fusion transformer on top of a frozen CLIP.
Ranked #3 on Fine-Grained Image Recognition on OVEN
no code implementations • 13 Apr 2023 • Liliane Momeni, Mathilde Caron, Arsha Nagrani, Andrew Zisserman, Cordelia Schmid
Understanding verbs is crucial to modelling how people and objects interact with each other and the environment through space and time.
Ranked #8 on Video Question Answering on NExT-QA (using extra training data)
no code implementations • 10 Feb 2023 • Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby
The scaling of Transformers has driven breakthrough capabilities for language models.
Ranked #1 on Linear-Probe Classification on ImageNet (using extra training data)
3 code implementations • CVPR 2023 • Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, Filip Pavetić
Vision Transformers convert images to sequences by slicing them into patches.
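To illustrate the patch slicing this abstract refers to, here is a minimal sketch in plain Python. It assumes a single-channel image stored as nested lists; `patchify` and the 4x4 toy image are hypothetical names chosen for this example and are not taken from the paper's code. A real Vision Transformer would also apply a learned linear projection and position embeddings to each patch, which are omitted here.

```python
from typing import List

Image = List[List[float]]  # H x W single-channel image (assumption for this sketch)


def patchify(image: Image, patch: int) -> List[List[float]]:
    """Split an H x W image into non-overlapping patch x patch tiles.

    Tiles are visited in row-major order and each one is flattened,
    so the result is a sequence of (H/patch * W/patch) token vectors.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return tokens


# Toy 4x4 image whose pixel value equals its row-major index.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tokens = patchify(image, patch=2)  # 4 tokens, each of length 4
```

Changing `patch` changes the sequence length: a smaller patch size yields more, shorter tokens for the same image.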
1 code implementation • 5 Dec 2022 • Mathilde Caron, Neil Houlsby, Cordelia Schmid
Pixel-level labels are particularly expensive to acquire.
no code implementations • 10 Oct 2022 • Ahmet Iscen, Thomas Bird, Mathilde Caron, Alireza Fathi, Cordelia Schmid
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
2 code implementations • 14 Apr 2022 • Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Florian Bordes, Pascal Vincent, Armand Joulin, Michael Rabbat, Nicolas Ballas
We propose Masked Siamese Networks (MSN), a self-supervised learning framework for learning image representations.
Self-Supervised Image Classification
Self-Supervised Learning
1 code implementation • 16 Feb 2022 • Priya Goyal, Quentin Duval, Isaac Seessel, Mathilde Caron, Ishan Misra, Levent Sagun, Armand Joulin, Piotr Bojanowski
Discriminative self-supervised learning allows training models on any random group of internet images, possibly recovering salient information that helps differentiate between the images.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W (using extra training data)
5 code implementations • 16 Dec 2021 • Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Edouard Grave
In this work, we explore the limits of contrastive learning as a way to train unsupervised dense retrievers and show that it leads to strong performance in various retrieval settings.
no code implementations • 29 Sep 2021 • Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Edouard Grave
By contrast, in many other NLP tasks, conventional self-supervised pre-training based on masking leads to strong generalization with a small number of training examples.
11 code implementations • NeurIPS 2021 • Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.
Ranked #54 on Instance Segmentation on COCO minival
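The "transposed" attention described in this entry can be sketched in plain Python as follows. This is an illustrative reading of the one-sentence description, not the authors' implementation: the function names, the L2 normalization of each channel, the choice of softmax axis, and the fixed `temperature` divisor are all assumptions made for the example. The point it shows is that the attention map has size d x d (channels by channels), so it does not grow with the number of tokens N.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]  # rows are tokens, columns are feature channels


def l2_normalize_columns(m: Matrix) -> Matrix:
    """Scale each column (one channel across all tokens) to unit L2 norm."""
    cols = len(m[0])
    norms = [math.sqrt(sum(row[c] ** 2 for row in m)) or 1.0 for c in range(cols)]
    return [[row[c] / norms[c] for c in range(cols)] for row in m]


def softmax(xs: List[float]) -> List[float]:
    peak = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_covariance_attention(
    q: Matrix, k: Matrix, v: Matrix, temperature: float = 1.0
) -> Tuple[Matrix, Matrix]:
    """Attend across channels instead of tokens (hypothetical helper).

    q, k, v have shape N x d. The attention map has shape d x d and is
    built from the cross-covariance between query and key channels.
    """
    d = len(q[0])
    qn, kn = l2_normalize_columns(q), l2_normalize_columns(k)
    # attn[i][j]: similarity of query channel i and key channel j,
    # summed over all tokens, then normalized over j (assumed axis).
    attn = [
        softmax([
            sum(qr[i] * kr[j] for qr, kr in zip(qn, kn)) / temperature
            for j in range(d)
        ])
        for i in range(d)
    ]
    # Every output channel is a weighted mix of the value channels.
    out = [[sum(attn[i][j] * row[j] for j in range(d)) for i in range(d)] for row in v]
    return out, attn


# Toy input: N = 3 tokens, d = 2 channels.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = cross_covariance_attention(x, x, x)
```

With token-level self-attention the map would be N x N; here it stays 2 x 2 regardless of how many tokens are fed in.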
16 code implementations • NeurIPS 2021 • Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification.
Ranked #5 on Image Classification on ImageNet ReaL (Top 1 Accuracy metric)
23 code implementations • ICCV 2021 • Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin
In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).
Ranked #2 on Visual Place Recognition on Laurel Caverns
4 code implementations • ICCV 2021 • Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Armand Joulin, Nicolas Ballas, Michael Rabbat
This paper proposes a novel method of learning by predicting view assignments with support samples (PAWS).
1 code implementation • 2 Mar 2021 • Priya Goyal, Mathilde Caron, Benjamin Lefaudeux, Min Xu, Pengchao Wang, Vivek Pai, Mannat Singh, Vitaliy Liptchinsky, Ishan Misra, Armand Joulin, Piotr Bojanowski
Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods.
Ranked #6 on Image Classification on Places205
Self-Supervised Image Classification
Self-Supervised Learning
15 code implementations • NeurIPS 2020 • Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, Armand Joulin
In addition, we also propose a new data augmentation strategy, multi-crop, that uses a mix of views with different resolutions in place of two full-resolution views, without increasing the memory or compute requirements much.
Ranked #1 on Contrastive Learning on imagenet-1k
no code implementations • 10 Jan 2020 • Mathilde Caron, Ari Morcos, Piotr Bojanowski, Julien Mairal, Armand Joulin
In this work, we investigate the use of standard pruning methods, developed primarily for supervised learning, for networks trained without labels (i.e., on self-supervised tasks).
no code implementations • 25 Sep 2019 • Mathilde Caron, Ari Morcos, Piotr Bojanowski, Julien Mairal, Armand Joulin
The lottery ticket hypothesis argues that neural networks contain sparse subnetworks, which, if appropriately initialized (the winning tickets), are capable of matching the accuracy of the full network when trained in isolation.
2 code implementations • ICCV 2019 • Mathilde Caron, Piotr Bojanowski, Julien Mairal, Armand Joulin
Our goal is to bridge the performance gap between unsupervised methods trained on curated data, which are costly to obtain, and massive raw datasets that are easily available.
Ranked #63 on Self-Supervised Image Classification on ImageNet (finetuned) (using extra training data)
9 code implementations • ECCV 2018 • Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze
In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features.
Ranked #1 on Image Clustering on CIFAR-100 (Train Set metric, using extra training data)