Search Results for author: Hugo Touvron

Found 21 papers, 18 papers with code

Code Llama: Open Foundation Models for Code

2 code implementations24 Aug 2023 Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Romain Sauvestre, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton Ferrer, Aaron Grattafiori, Wenhan Xiong, Alexandre Défossez, Jade Copet, Faisal Azhar, Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom, Gabriel Synnaeve

We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks.

Code Generation Instruction Following

Co-Training 2L Submodels for Visual Recognition

1 code implementation CVPR 2023 Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou

Given a neural network to be trained, for each sample we implicitly instantiate two altered networks, "submodels", with stochastic depth: i. e. activating only a subset of the layers and skipping others.

Image Classification Semantic Segmentation

Co-training $2^L$ Submodels for Visual Recognition

1 code implementation9 Dec 2022 Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou

We introduce submodel co-training, a regularization method related to co-training, self-distillation and stochastic depth.

Image Classification Semantic Segmentation

DeiT III: Revenge of the ViT

8 code implementations14 Apr 2022 Hugo Touvron, Matthieu Cord, Hervé Jégou

Our evaluations on Image classification (ImageNet-1k with and without pre-training on ImageNet-21k), transfer learning and semantic segmentation show that our procedure outperforms by a large margin previous fully supervised training recipes for ViT.

 Ranked #1 on Image Classification on ImageNet ReaL (Number of params metric)

Data Augmentation Image Classification +3

Three things everyone should know about Vision Transformers

4 code implementations18 Mar 2022 Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek, Hervé Jégou

(2) Fine-tuning the weights of the attention layers is sufficient to adapt vision transformers to a higher resolution and to other classification tasks.

Ranked #8 on Image Classification on CIFAR-10 (using extra training data)

Fine-Grained Image Classification

Are Large-scale Datasets Necessary for Self-Supervised Pre-training?

no code implementations20 Dec 2021 Alaaeldin El-Nouby, Gautier Izacard, Hugo Touvron, Ivan Laptev, Hervé Jegou, Edouard Grave

Our study shows that denoising autoencoders, such as BEiT or a variant that we introduce in this paper, are more robust to the type and size of the pre-training data than popular self-supervised methods trained by comparing image embeddings. We obtain competitive performance compared to ImageNet pre-training on a variety of classification datasets, from different domains.

Denoising Instance Segmentation +1

ResNet strikes back: An improved training procedure in timm

11 code implementations NeurIPS Workshop ImageNet_PPF 2021 Ross Wightman, Hugo Touvron, Hervé Jégou

We share competitive training settings and pre-trained models in the timm open-source library, with the hope that they will serve as better baselines for future work.

Data Augmentation Domain Generalization +2

XCiT: Cross-Covariance Image Transformers

11 code implementations NeurIPS 2021 Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jegou

We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.

Instance Segmentation object-detection +3

Emerging Properties in Self-Supervised Vision Transformers

26 code implementations ICCV 2021 Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin

In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).

Copy Detection Self-Supervised Image Classification +6

Going deeper with Image Transformers

19 code implementations ICCV 2021 Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou

In particular, we investigate the interplay of architecture and optimization of such dedicated transformers.

Ranked #5 on Image Classification on CIFAR-10 (using extra training data)

Image Classification Transfer Learning

ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

9 code implementations19 Mar 2021 Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun

We initialise the GPSA layers to mimic the locality of convolutional layers, then give each attention head the freedom to escape locality by adjusting a gating parameter regulating the attention paid to position versus content information.

Image Classification Inductive Bias

Powers of layers for image-to-image translation

no code implementations13 Aug 2020 Hugo Touvron, Matthijs Douze, Matthieu Cord, Hervé Jégou

We propose a simple architecture to address unpaired image-to-image translation tasks: style or class transfer, denoising, deblurring, deblocking, etc.

 Ranked #1 on Image-to-Image Translation on horse2zebra (Frechet Inception Distance metric)

Deblurring Denoising +2

Fixing the train-test resolution discrepancy: FixEfficientNet

1 code implementation18 Mar 2020 Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou

An EfficientNet-L2 pre-trained with weak supervision on 300M unlabeled images and further optimized with FixRes achieves 88. 5% top-1 accuracy (top-5: 98. 7%), which establishes the new state of the art for ImageNet with a single crop.

Ranked #9 on Image Classification on ImageNet ReaL (using extra training data)

Data Augmentation Image Classification +1

Fixing the train-test resolution discrepancy

3 code implementations NeurIPS 2019 Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou

Conversely, when training a ResNeXt-101 32x48d pre-trained in weakly-supervised fashion on 940 million public images at resolution 224x224 and further optimizing for test resolution 320x320, we obtain a test top-1 accuracy of 86. 4% (top-5: 98. 0%) (single-crop).

Ranked #2 on Fine-Grained Image Classification on Birdsnap (using extra training data)

Data Augmentation Fine-Grained Image Classification +2

Cannot find the paper you are looking for? You can Submit a new open access paper.