1 code implementation • 17 Sep 2024 • Alexandre Défossez, Laurent Mazaré, Manu Orsini, Amélie Royer, Patrick Pérez, Hervé Jégou, Edouard Grave, Neil Zeghidour
Our resulting model is the first real-time full-duplex spoken large language model, with a theoretical latency of 160ms (200ms in practice), and is available at https://github.com/kyutai-labs/moshi.
1 code implementation • 24 May 2024 • Huy V. Vo, Vasil Khalidov, Timothée Darcet, Théo Moutakanni, Nikita Smetanin, Marc Szafraniec, Hugo Touvron, Camille Couprie, Maxime Oquab, Armand Joulin, Hervé Jégou, Patrick Labatut, Piotr Bojanowski
This manual process has some limitations similar to those encountered in supervised learning, e.g., the crowd-sourced selection of data is costly and time-consuming, which prevents scaling up the dataset size.
1 code implementation • 16 Jan 2024 • Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, Hervé Jégou
Vector databases typically manage large collections of embedding vectors.
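The library above is organized around index objects that store and search embedding vectors; a minimal sketch of the basic add/search workflow with the `faiss` Python package, on synthetic data, looks roughly as follows.

```python
import numpy as np
import faiss  # assumes faiss-cpu (or faiss-gpu) is installed

d = 128            # embedding dimensionality
rng = np.random.default_rng(0)
xb = rng.standard_normal((10_000, d)).astype("float32")  # database vectors
xq = rng.standard_normal((5, d)).astype("float32")       # query vectors

index = faiss.IndexFlatL2(d)    # exact L2 search, the simplest index
index.add(xb)                   # store the database vectors
D, I = index.search(xq, 4)      # distances and ids of the 4 nearest neighbors
print(I)
```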
2 code implementations • ICCV 2023 • Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, Teddy Furon
For instance, it detects the origin of an image generated from a text prompt, then cropped to keep $10\%$ of the content, with over $90\%$ accuracy at a false positive rate below $10^{-6}$.
no code implementations • 26 Jan 2023 • Matthew J. Muckley, Alaaeldin El-Nouby, Karen Ullrich, Hervé Jégou, Jakob Verbeek
Lossy image compression aims to represent images in as few bits as possible while maintaining fidelity to the original.
1 code implementation • CVPR 2023 • Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou
Given a neural network to be trained, for each sample we implicitly instantiate two altered networks, "submodels", with stochastic depth: i.e., activating only a subset of the layers and skipping others.
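A minimal PyTorch sketch of this idea, under simplifying assumptions (a toy residual MLP, independent layer dropping, and a symmetric KL term between the two sampled submodels); names such as `cosub_loss` are illustrative and not the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticDepthNet(nn.Module):
    """Toy residual MLP where each block can be skipped at random (stochastic depth)."""
    def __init__(self, dim=64, depth=8, num_classes=10, drop_prob=0.2):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(depth)
        )
        self.head = nn.Linear(dim, num_classes)
        self.drop_prob = drop_prob

    def forward(self, x):
        for block in self.blocks:
            if self.training and torch.rand(()) < self.drop_prob:
                continue                 # skip this layer: one random "submodel"
            x = x + block(x)             # residual connection keeps the path valid
        return self.head(x)

def cosub_loss(model, x, y):
    """Cross-entropy on two random submodels plus a term pulling their predictions together."""
    logits_a, logits_b = model(x), model(x)   # two different random subsets of layers
    ce = F.cross_entropy(logits_a, y) + F.cross_entropy(logits_b, y)
    pa, pb = logits_a.log_softmax(-1), logits_b.log_softmax(-1)
    kl = F.kl_div(pa, pb.exp(), reduction="batchmean") + F.kl_div(pb, pa.exp(), reduction="batchmean")
    return ce + kl

model = StochasticDepthNet()
x, y = torch.randn(16, 64), torch.randint(0, 10, (16,))
loss = cosub_loss(model, x, y)
loss.backward()
```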
no code implementations • 14 Dec 2022 • Alaaeldin El-Nouby, Matthew J. Muckley, Karen Ullrich, Ivan Laptev, Jakob Verbeek, Hervé Jégou
In this work, we attempt to bring these lines of research closer by revisiting vector quantization for image compression.
1 code implementation • 9 Dec 2022 • Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou
We introduce submodel co-training, a regularization method related to co-training, self-distillation and stochastic depth.
Ranked #70 on Image Classification on ImageNet
1 code implementation • 5 Oct 2022 • Pierre Fernandez, Matthijs Douze, Hervé Jégou, Teddy Furon
First, a neural network maps an image to a vector representation that is relatively robust to various transformations of the image.
11 code implementations • 14 Apr 2022 • Hugo Touvron, Matthieu Cord, Hervé Jégou
Our evaluations on Image classification (ImageNet-1k with and without pre-training on ImageNet-21k), transfer learning and semantic segmentation show that our procedure outperforms by a large margin previous fully supervised training recipes for ViT.
Ranked #1 on Image Classification on ImageNet ReaL (Number of params metric)
7 code implementations • 18 Mar 2022 • Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek, Hervé Jégou
(2) Fine-tuning the weights of the attention layers is sufficient to adapt vision transformers to a higher resolution and to other classification tasks; a sketch of this attention-only fine-tuning follows below.
Ranked #9 on Image Classification on CIFAR-10 (using extra training data)
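A minimal sketch of the attention-only fine-tuning mentioned in point (2), assuming a timm ViT whose attention parameters contain "attn" in their names (true for `vit_base_patch16_224`, but worth verifying for other models).

```python
import timm
import torch

# Load a pretrained ViT; the exact model name is an assumption for illustration.
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=100)

for name, param in model.named_parameters():
    # Only attention weights (and the new classification head) stay trainable.
    param.requires_grad = ("attn" in name) or name.startswith("head")

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable tensors, e.g. {trainable[:3]}")

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4, weight_decay=0.05
)
```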
5 code implementations • 27 Dec 2021 • Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Piotr Bojanowski, Armand Joulin, Gabriel Synnaeve, Hervé Jégou
We show how to augment any convolutional network with an attention-based global map to achieve non-local reasoning.
Ranked #40 on Semantic Segmentation on ADE20K val
no code implementations • 17 Dec 2021 • Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou
Modern approaches for fast retrieval of similar vectors on billion-scaled datasets rely on compressed-domain approaches such as binary sketches or product quantization.
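As an illustration of the compressed-domain setting, the sketch below builds a product-quantization index and a binary (Hamming-space) index with faiss on random data; the sizes and parameters are illustrative, not those studied in the paper.

```python
import numpy as np
import faiss

d, nb, nq = 64, 100_000, 10
rng = np.random.default_rng(0)
xb = rng.standard_normal((nb, d)).astype("float32")
xq = rng.standard_normal((nq, d)).astype("float32")

# Product quantization: 8 subquantizers x 8 bits = 8-byte codes per vector.
pq = faiss.IndexPQ(d, 8, 8)
pq.train(xb)
pq.add(xb)
D_pq, I_pq = pq.search(xq, 10)

# Binary sketches: LSH-style binarization to d-bit codes, searched in Hamming space.
lsh = faiss.IndexLSH(d, d)   # d output bits
lsh.train(xb)
lsh.add(xb)
D_bin, I_bin = lsh.search(xq, 10)
```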
1 code implementation • 17 Dec 2021 • Pierre Fernandez, Alexandre Sablayrolles, Teddy Furon, Hervé Jégou, Matthijs Douze
We revisit watermarking techniques based on pre-trained deep networks, in the light of self-supervised approaches.
14 code implementations • NeurIPS Workshop ImageNet_PPF 2021 • Ross Wightman, Hugo Touvron, Hervé Jégou
We share competitive training settings and pre-trained models in the timm open-source library, with the hope that they will serve as better baselines for future work.
Ranked #2 on Medical Image Classification on NCT-CRC-HE-100K
17 code implementations • NeurIPS 2021 • Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification; a sketch of one ResMLP block follows below.
Ranked #1 on Image Classification on Certificate Verification
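A simplified sketch of one ResMLP block (per-channel affine transforms, a linear layer mixing patches, then an MLP mixing channels), omitting details such as layer-scale initialization; dimensions are illustrative.

```python
import torch
import torch.nn as nn

class Affine(nn.Module):
    """Per-channel affine transform used in ResMLP in place of normalization."""
    def __init__(self, dim):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return self.alpha * x + self.beta

class ResMLPBlock(nn.Module):
    def __init__(self, dim=384, num_patches=196, mlp_ratio=4):
        super().__init__()
        self.aff1 = Affine(dim)
        self.cross_patch = nn.Linear(num_patches, num_patches)  # mixes information across patches
        self.aff2 = Affine(dim)
        self.cross_channel = nn.Sequential(                     # standard MLP across channels
            nn.Linear(dim, mlp_ratio * dim), nn.GELU(), nn.Linear(mlp_ratio * dim, dim)
        )

    def forward(self, x):                      # x: (batch, num_patches, dim)
        z = self.aff1(x).transpose(1, 2)       # (batch, dim, num_patches)
        x = x + self.cross_patch(z).transpose(1, 2)
        x = x + self.cross_channel(self.aff2(x))
        return x

x = torch.randn(2, 196, 384)
print(ResMLPBlock()(x).shape)   # torch.Size([2, 196, 384])
```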
31 code implementations • ICCV 2021 • Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin
In this paper, we question whether self-supervised learning provides new properties to Vision Transformers (ViT) that stand out compared to convolutional networks (convnets); a sketch of loading a pretrained DINO backbone follows below.
Ranked #2 on Copy Detection on Copydays strong subset
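The pretrained DINO backbones are distributed through torch.hub; a minimal sketch of loading a ViT-S/16 and extracting L2-normalized features for k-NN or copy-detection-style retrieval (entry-point names follow the public facebookresearch/dino repository and assume network access).

```python
import torch

# Download a ViT-S/16 backbone pretrained with DINO (no labels used).
model = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
model.eval()

with torch.no_grad():
    images = torch.randn(4, 3, 224, 224)        # stand-in for a preprocessed batch
    feats = model(images)                        # (4, 384) CLS-token features
    feats = torch.nn.functional.normalize(feats, dim=-1)

# Cosine similarities between the images, usable for copy detection or k-NN classification.
print(feats @ feats.T)
```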
2 code implementations • EMNLP 2021 • Chuan Guo, Alexandre Sablayrolles, Hervé Jégou, Douwe Kiela
We propose the first general-purpose gradient-based attack against transformer models.
11 code implementations • ICCV 2021 • Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze
We design a family of image classification architectures that optimize the trade-off between accuracy and efficiency in a high-speed regime.
Ranked #15 on Image Classification on Stanford Cars
19 code implementations • ICCV 2021 • Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou
In particular, we investigate the interplay of architecture and optimization of such dedicated transformers.
Ranked #5 on Image Classification on Stanford Cars
1 code implementation • 10 Feb 2021 • Alaaeldin El-Nouby, Natalia Neverova, Ivan Laptev, Hervé Jégou
Transformers have shown outstanding results for natural language understanding and, more recently, for image classification.
35 code implementations • 23 Dec 2020 • Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou
In this work, we produce a competitive convolution-free transformer by training on ImageNet only.
Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-S)
no code implementations • ICCV 2021 • Hugo Touvron, Alexandre Sablayrolles, Matthijs Douze, Matthieu Cord, Hervé Jégou
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
Ranked #2 on Learning with coarse labels on cifar100
no code implementations • 13 Aug 2020 • Hugo Touvron, Matthijs Douze, Matthieu Cord, Hervé Jégou
We propose a simple architecture to address unpaired image-to-image translation tasks: style or class transfer, denoising, deblurring, deblocking, etc.
Ranked #1 on Image-to-Image Translation on vangogh2photo (Frechet Inception Distance metric)
1 code implementation • 18 Mar 2020 • Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou
An EfficientNet-L2 pre-trained with weak supervision on 300M unlabeled images and further optimized with FixRes achieves 88.5% top-1 accuracy (top-5: 98.7%), which establishes the new state of the art for ImageNet with a single crop; a sketch of the resolution fine-tuning step follows below.
Ranked #9 on Image Classification on ImageNet ReaL (using extra training data)
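A minimal sketch of the resolution fine-tuning step, using a torchvision ResNet-50 as a stand-in: only the classifier and batch-norm parameters are updated at the larger test resolution. The exact set of layers to unfreeze is a simplification of the paper's procedure.

```python
import torch
from torchvision import models, transforms

model = models.resnet50(weights="IMAGENET1K_V1")   # trained at 224x224

# Fine-tune at the (larger) test resolution: only batch-norm parameters and the
# classifier are updated, the rest of the backbone stays frozen.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc") or ".bn" in name or name.startswith("bn1")

test_tf = transforms.Compose([
    transforms.Resize(384),          # larger test-time resolution
    transforms.CenterCrop(320),
    transforms.ToTensor(),
])

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3, momentum=0.9
)
```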
2 code implementations • ICML 2020 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
The mark is robust to strong variations such as different architectures or optimization methods.
no code implementations • 29 Aug 2019 • Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou
Membership inference determines, given a sample and trained parameters of a machine learning model, whether the sample was part of the training set.
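A minimal sketch of the simplest attack in this setting, thresholding the per-sample loss (the paper relates such loss-based scores to the Bayes-optimal strategy); the model, data, and threshold below are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def membership_scores(model, x, y):
    """Lower loss on a sample is (weak) evidence the sample was seen during training."""
    model.eval()
    with torch.no_grad():
        return -F.cross_entropy(model(x), y, reduction="none")  # higher score => predicted member

def infer_membership(model, x, y, threshold=0.0):
    # The threshold would be calibrated on samples with known membership status.
    return membership_scores(model, x, y) > threshold

# Toy illustration with an untrained linear classifier and random data.
model = nn.Linear(20, 5)
x, y = torch.randn(8, 20), torch.randint(0, 5, (8,))
print(infer_membership(model, x, y))
```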
3 code implementations • ICLR 2020 • Pierre Stock, Armand Joulin, Rémi Gribonval, Benjamin Graham, Hervé Jégou
In this paper, we address the problem of reducing the memory footprint of convolutional network architectures.
7 code implementations • NeurIPS 2019 • Guillaume Lample, Alexandre Sablayrolles, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
In our experiments we consider a dataset with up to 30 billion words, and we plug our memory layer in a state-of-the-art transformer-based architecture.
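A simplified sketch of the product-key lookup at the core of these memory layers: the query is split in two halves, each half is scored against a small set of sub-keys, and the Cartesian product of the two top-k lists addresses a large value table. This omits details such as multi-head queries and query normalization; sizes are illustrative.

```python
import torch
import torch.nn as nn

class ProductKeyMemory(nn.Module):
    def __init__(self, dim=512, n_sub_keys=128, topk=8, value_dim=512):
        super().__init__()
        half = dim // 2
        self.sub_keys1 = nn.Parameter(torch.randn(n_sub_keys, half))
        self.sub_keys2 = nn.Parameter(torch.randn(n_sub_keys, half))
        self.values = nn.Embedding(n_sub_keys * n_sub_keys, value_dim)  # n_sub_keys^2 slots
        self.topk = topk
        self.n = n_sub_keys

    def forward(self, q):                                          # q: (batch, dim)
        q1, q2 = q.chunk(2, dim=-1)
        s1, i1 = (q1 @ self.sub_keys1.T).topk(self.topk, dim=-1)   # scores over first half
        s2, i2 = (q2 @ self.sub_keys2.T).topk(self.topk, dim=-1)   # scores over second half
        # Cartesian product of the two top-k lists: topk*topk candidate slots.
        scores = s1[:, :, None] + s2[:, None, :]                   # (batch, topk, topk)
        ids = i1[:, :, None] * self.n + i2[:, None, :]
        scores, ids = scores.flatten(1), ids.flatten(1)
        best, idx = scores.topk(self.topk, dim=-1)                 # keep the overall top-k slots
        weights = best.softmax(dim=-1)
        chosen = ids.gather(1, idx)                                # (batch, topk)
        return (weights[..., None] * self.values(chosen)).sum(dim=1)

mem = ProductKeyMemory()
print(mem(torch.randn(4, 512)).shape)   # torch.Size([4, 512])
```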
3 code implementations • NeurIPS 2019 • Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou
Conversely, when training a ResNeXt-101 32x48d pre-trained in weakly-supervised fashion on 940 million public images at resolution 224x224 and further optimizing for test resolution 320x320, we obtain a test top-1 accuracy of 86.4% (top-5: 98.0%) (single-crop).
Ranked #2 on Fine-Grained Image Classification on Birdsnap (using extra training data)
4 code implementations • 2 May 2019 • I. Zeki Yalniz, Hervé Jégou, Kan Chen, Manohar Paluri, Dhruv Mahajan
This paper presents a study of semi-supervised learning with large convolutional networks.
Ranked #6 on Image Classification on OmniBenchmark (using extra training data)
1 code implementation • ICLR 2019 • Pierre Stock, Benjamin Graham, Rémi Gribonval, Hervé Jégou
Modern neural networks are over-parametrized.
3 code implementations • 14 Feb 2019 • Maxim Berman, Hervé Jégou, Andrea Vedaldi, Iasonas Kokkinos, Matthijs Douze
When fed to a linear classifier, the learned embeddings provide state-of-the-art classification accuracy.
Ranked #1 on Image Retrieval on INRIA Holidays
3 code implementations • 27 Nov 2018 • Arun Mukundan, Giorgos Tolias, Andrei Bursuc, Hervé Jégou, Ondřej Chum
We propose a multiple-kernel local-patch descriptor based on efficient match kernels from pixel gradients.
no code implementations • ICLR 2019 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and drop-out are employed to mitigate overfitting.
2 code implementations • ICLR 2019 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
Discretizing multi-dimensional data distributions is a fundamental step of modern indexing methods.
8 code implementations • CVPR 2018 • Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou
Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside the memory requirements.
19 code implementations • ICLR 2018 • Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
We finally describe experiments on the English-Esperanto low-resource language pair, for which only a limited amount of parallel data exists, to show the potential impact of our method in fully unsupervised machine translation.
Ranked #2 on Word Alignment on en-es
no code implementations • 9 Aug 2017 • Matthijs Douze, Hervé Jégou, Jeff Johnson
While k-means is usually considered the gold standard for this task, we evaluate and demonstrate the value of diffusion methods that have been neglected by the state of the art, such as the Markov Clustering algorithm.
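A minimal sketch of the k-means baseline on descriptor vectors, using faiss's built-in implementation (the diffusion/Markov Clustering side of the comparison is not reproduced here); data and cluster counts are illustrative.

```python
import numpy as np
import faiss

d, n, k = 256, 50_000, 1_000
features = np.random.default_rng(0).standard_normal((n, d)).astype("float32")

kmeans = faiss.Kmeans(d, k, niter=20, verbose=True)
kmeans.train(features)

# Assign each descriptor to its nearest centroid.
_, assignments = kmeans.index.search(features, 1)
print(assignments[:10].ravel())
```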
1 code implementation • CVPR 2018 • Matthijs Douze, Arthur Szlam, Bharath Hariharan, Hervé Jégou
This paper considers the problem of inferring image labels from images when only a few annotated examples are available at training time.
14 code implementations • 28 Feb 2017 • Jeff Johnson, Matthijs Douze, Hervé Jégou
Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures.
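A minimal sketch of the GPU setting, assuming a CUDA-capable machine and the GPU build of faiss: an IVF index with product-quantized codes is trained on the CPU and moved to a GPU for search. Sizes and parameters are illustrative.

```python
import numpy as np
import faiss  # assumes the GPU build (faiss-gpu) and a visible CUDA device

d, nlist, m = 128, 1024, 16             # dimensionality, #inverted lists, #PQ subquantizers
rng = np.random.default_rng(0)
xb = rng.standard_normal((200_000, d)).astype("float32")
xq = rng.standard_normal((100, d)).astype("float32")

quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, 8)   # 8 bits per sub-code
index.train(xb[:100_000])                              # train coarse quantizer and PQ codebooks
index.add(xb)
index.nprobe = 32                                      # lists visited per query (copied on GPU transfer)

res = faiss.StandardGpuResources()
gpu_index = faiss.index_cpu_to_gpu(res, 0, index)      # move the index to GPU 0
D, I = gpu_index.search(xq, 10)                        # batched GPU search
```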
no code implementations • 24 Nov 2016 • Naila Murray, Hervé Jégou, Florent Perronnin, Andrew Zisserman
The second one involves equalising the match of a single descriptor to the aggregated vector.
1 code implementation • 21 Sep 2016 • Alexandre Sablayrolles, Matthijs Douze, Hervé Jégou, Nicolas Usunier
Hashing produces compact representations for documents, to perform tasks like classification or retrieval based on these short codes.
12 code implementations • ICML 2017 • Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou
We propose an approximate strategy to efficiently train neural network based language models over very large vocabularies.
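The adaptive softmax introduced in this paper is available in PyTorch as `nn.AdaptiveLogSoftmaxWithLoss`; a minimal sketch of using it as the output layer over a large vocabulary (cutoff values below are illustrative).

```python
import torch
import torch.nn as nn

vocab_size, hidden = 200_000, 512
# Frequent words go in the head cluster; rarer words live in smaller-capacity tail clusters.
adaptive_softmax = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden,
    n_classes=vocab_size,
    cutoffs=[2_000, 20_000, 100_000],   # illustrative frequency-based cutoffs
    div_value=4.0,                      # each tail cluster uses a 4x smaller projection
)

hidden_states = torch.randn(32, hidden)              # e.g. final RNN/transformer states
targets = torch.randint(0, vocab_size, (32,))
out = adaptive_softmax(hidden_states, targets)
print(out.loss)                                       # negative log-likelihood of the targets
```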
11 code implementations • 7 Sep 2016 • Matthijs Douze, Hervé Jégou, Florent Perronnin
This paper considers the problem of approximate nearest neighbor search in the compressed domain.
no code implementations • 10 Aug 2016 • Himalaya Jain, Patrick Pérez, Rémi Gribonval, Joaquin Zepeda, Hervé Jégou
This paper tackles the task of storing a large collection of vectors, such as visual descriptors, and of searching in it.
no code implementations • 7 Jul 2016 • Mihir Jain, Jan van Gemert, Hervé Jégou, Patrick Bouthemy, Cees G. M. Snoek
First, inspired by selective search for object proposals, we introduce an approach to generate action proposals from spatiotemporal super-voxels in an unsupervised manner; we call them Tubelets.
6 code implementations • 18 Nov 2015 • Giorgos Tolias, Ronan Sicre, Hervé Jégou
Recently, image representations built upon Convolutional Neural Networks (CNN) have been shown to provide effective descriptors for image search, outperforming pre-CNN features as short-vector representations; a sketch of max-pooled CNN descriptors follows below.
Ranked #4 on Image Retrieval on Par6k
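A minimal sketch of the underlying idea, max-pooling convolutional activations into a compact global descriptor (the simple MAC variant rather than the full regional R-MAC construction), with a torchvision backbone as a stand-in.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Convolutional trunk of a pretrained network (classifier head removed).
backbone = models.resnet50(weights="IMAGENET1K_V1")
trunk = torch.nn.Sequential(*list(backbone.children())[:-2])
trunk.eval()

@torch.no_grad()
def mac_descriptor(images):
    fmap = trunk(images)                       # (batch, 2048, H', W') activation map
    desc = fmap.amax(dim=(-2, -1))             # max-pool each channel over all locations
    return F.normalize(desc, dim=-1)           # L2-normalize for dot-product search

imgs = torch.randn(2, 3, 224, 224)             # stand-in for preprocessed images
print(mac_descriptor(imgs).shape)              # torch.Size([2, 2048])
```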
1 code implementation • 8 Jun 2015 • Matthijs Douze, Jérôme Revaud, Jakob Verbeek, Hervé Jégou, Cordelia Schmid
We address the problem of specific video event retrieval.
no code implementations • 10 Dec 2014 • Ahmet Iscen, Teddy Furon, Vincent Gripon, Michael Rabbat, Hervé Jégou
We study an indexing architecture to store and search in a database of high-dimensional vectors from the perspective of statistical signal processing and decision theory.
no code implementations • 29 Oct 2014 • Ahmet Iscen, Giorgos Tolias, Philippe-Henri Gosselin, Hervé Jégou
Our results show that the regular dense detector is outperformed by other methods in most situations, leading us to improve the state of the art in comparable setups on standard retrieval and fine-grained benchmarks.
no code implementations • 8 Jul 2014 • Giorgos Tolias, Teddy Furon, Hervé Jégou
Our geometric-aware aggregation strategy is effective for image search, as shown by experiments performed on standard benchmarks for image and particular object retrieval, namely Holidays and Oxford buildings.