no code implementations • 24 Feb 2025 • Matthijs Douze
Machine learning and vector search are two research topics that developed in parallel in nearby communities.
no code implementations • 12 Feb 2025 • Pierre-Emmanuel Mazaré, Gergely Szilvasy, Maria Lomeli, Francisco Massa, Naila Murray, Hervé Jégou, Matthijs Douze
Self-attention in transformer models is an incremental associative memory that maps key vectors to value vectors.
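The associative-memory view in this entry can be illustrated with a toy sketch. It is not the paper's method: the class name `AttentionMemory` and its `write`/`read` methods are hypothetical, and it assumes a single head with scaled dot-product scores and a softmax read-out.

```python
import math


class AttentionMemory:
    """Toy key-value memory read with softmax attention (hypothetical names)."""

    def __init__(self):
        self.keys = []    # stored key vectors
        self.values = []  # stored value vectors

    def write(self, key, value):
        # Incremental update: each new token appends one (key, value) pair.
        self.keys.append(list(key))
        self.values.append(list(value))

    def read(self, query):
        # Scaled dot-product score between the query and every stored key.
        scale = 1.0 / math.sqrt(len(query))
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in self.keys]
        # Numerically stable softmax over the stored entries.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # The output is the attention-weighted average of the stored values.
        dim = len(self.values[0])
        return [sum(w * v[i] for w, v in zip(weights, self.values))
                for i in range(dim)]


memory = AttentionMemory()
memory.write([10.0, 0.0], [1.0, 0.0])
memory.write([0.0, 10.0], [0.0, 1.0])
out = memory.read([10.0, 0.0])  # query aligned with the first key
```

Because the query matches the first key far better than the second, the read-out is dominated by the first stored value.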
1 code implementation • 16 Jan 2025 • Daniel Severo, Giuseppe Ottaviano, Matthew Muckley, Karen Ullrich, Matthijs Douze
Approximate nearest neighbor search for vectors relies on indexes that are most often accessed from RAM.
1 code implementation • 6 Jan 2025 • Théophane Vallaeys, Matthew Muckley, Jakob Verbeek, Matthijs Douze
QINCo recently addressed this inefficiency by using a neural network to determine the quantization codebook in RQ based on the vector reconstruction from previous steps.
1 code implementation • 11 Nov 2024 • Tom Sander, Pierre Fernandez, Alain Durmus, Teddy Furon, Matthijs Douze
Image watermarking methods are not tailored to handle small watermarked areas.
1 code implementation • 21 Oct 2024 • Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye, Yang Zhou, Jianyu Zhang, Niklas Nolte, Yuandong Tian, Matthijs Douze, Leon Bottou, Zhihao Jia, Beidi Chen
Large language models (LLMs) with long context windows have gained significant attention.
1 code implementation • 25 Sep 2024 • Harsha Vardhan Simhadri, Martin Aumüller, Amir Ingber, Matthijs Douze, George Williams, Magdalen Dobson Manohar, Dmitry Baranchuk, Edo Liberty, Frank Liu, Ben Landrum, Mazin Karjikar, Laxman Dhulipala, Meng Chen, Yue Chen, Rui Ma, Kai Zhang, Yuzheng Cai, Jiayang Shi, Yizhuo Chen, Weiguo Zheng, Zihao Wan, Jie Yin, Ben Huang
The 2023 Big ANN Challenge, held at NeurIPS 2023, focused on advancing the state-of-the-art in indexing data structures and search algorithms for practical variants of Approximate Nearest Neighbor (ANN) search that reflect the growing complexity and diversity of workloads.
no code implementations • 16 Mar 2024 • Gergely Szilvasy, Pierre-Emmanuel Mazaré, Matthijs Douze
Although convenient to compute, this metric is distantly related to the end-to-end accuracy of a full system that integrates vector search.
1 code implementation • 22 Feb 2024 • Tom Sander, Pierre Fernandez, Alain Durmus, Matthijs Douze, Teddy Furon
We discover that, on the contrary, it is possible to reliably determine if a language model was trained on synthetic data if that data is output by a watermarked LLM.
1 code implementation • 26 Jan 2024 • Iris A. M. Huijben, Matthijs Douze, Matthew Muckley, Ruud J. G. van Sloun, Jakob Verbeek
For example, QINCo achieves better nearest-neighbor search accuracy using 12-byte codes than the state-of-the-art UNQ using 16 bytes on the BigANN1M and Deep1M datasets.
1 code implementation • 16 Jan 2024 • Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, Hervé Jégou
Vector databases typically manage large collections of embedding vectors.
no code implementations • 17 Oct 2023 • Pierre Fernandez, Guillaume Couairon, Teddy Furon, Matthijs Douze
The rapid growth of transformer-based models raises concerns about their integrity and ownership assurance.
no code implementations • ICCV 2023 • Dmitry Baranchuk, Matthijs Douze, Yash Upadhyay, I. Zeki Yalniz
We investigate the impact of this "content drift" for large-scale similarity search tools, based on nearest neighbor search in embedding space.
1 code implementation • 15 Jun 2023 • Ed Pizzi, Giorgos Kordopatis-Zilos, Hiral Patel, Gheorghe Postelnicu, Sugosh Nagavara Ravindra, Akshay Gupta, Symeon Papadopoulos, Giorgos Tolias, Matthijs Douze
The problem comprises two distinct but related tasks: determining whether a query video shares content with a reference video ("detection"), and additionally temporally localizing the shared content within each video ("localization").
3 code implementations • ICCV 2023 • Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, Teddy Furon
For instance, it detects the origin of an image generated from a text prompt, then cropped to keep $10\%$ of the content, with $90$+$\%$ accuracy at a false positive rate below 10$^{-6}$.
1 code implementation • 5 Oct 2022 • Pierre Fernandez, Matthijs Douze, Hervé Jégou, Teddy Furon
First, a neural network maps an image to a vector representation that is relatively robust to various transformations of the image.
no code implementations • 8 May 2022 • Harsha Vardhan Simhadri, George Williams, Martin Aumüller, Matthijs Douze, Artem Babenko, Dmitry Baranchuk, Qi Chen, Lucas Hosseini, Ravishankar Krishnaswamy, Gopal Srinivasa, Suhas Jayaram Subramanya, Jingdong Wang
The outcome of the competition was ranked leaderboards of algorithms in each track based on recall at a query throughput threshold.
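As a rough illustration of ranking by recall at a query throughput threshold, the sketch below assumes each algorithm is measured at several operating points and is scored by its best recall among the points that meet the throughput threshold. The function names, the sample measurements, and the threshold value are hypothetical and are not taken from the competition rules.

```python
def recall_at_k(found_ids, true_ids, k=10):
    """Average fraction of the true top-k neighbors found in the returned top-k."""
    hits = 0
    for found, truth in zip(found_ids, true_ids):
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (k * len(true_ids))


def score_at_throughput(operating_points, min_qps):
    """Best recall among runs that reach at least min_qps queries per second."""
    eligible = [recall for qps, recall in operating_points if qps >= min_qps]
    return max(eligible) if eligible else 0.0


# Hypothetical (QPS, recall) measurements for one algorithm at different settings.
points = [(25000.0, 0.82), (12000.0, 0.91), (4000.0, 0.97)]
best = score_at_throughput(points, min_qps=10000.0)

# Recall for a single query with k=3: two of the three true neighbors are found.
example_recall = recall_at_k([[1, 2, 3]], [[1, 2, 4]], k=3)
```

The slowest setting has the highest recall but is excluded because it falls below the assumed throughput threshold.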
2 code implementations • CVPR 2022 • Ed Pizzi, Sreya Dutta Roy, Sugosh Nagavara Ravindra, Priya Goyal, Matthijs Douze
We adapt this method to the copy detection task by changing the architecture and training objective, including a pooling operator from the instance matching literature, and adapting contrastive learning to augmentations that combine images.
no code implementations • 8 Feb 2022 • Zoë Papakipos, Giorgos Tolias, Tomas Jenicek, Ed Pizzi, Shuhei Yokoo, Wenhao Wang, Yifan Sun, Weipu Zhang, Yi Yang, Sanjay Addicam, Sergio Manuel Papadakis, Cristian Canton Ferrer, Ondrej Chum, Matthijs Douze
The 2021 Image Similarity Challenge introduced a dataset to serve as a new benchmark to evaluate recent image copy detection methods.
1 code implementation • 17 Dec 2021 • Pierre Fernandez, Alexandre Sablayrolles, Teddy Furon, Hervé Jégou, Matthijs Douze
We revisit watermarking techniques based on pre-trained deep networks, in the light of self-supervised approaches.
no code implementations • 17 Dec 2021 • Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou
Modern methods for fast retrieval of similar vectors on billion-scale datasets rely on compressed-domain approaches such as binary sketches or product quantization.
no code implementations • 6 Dec 2021 • Guillaume Couairon, Matthieu Cord, Matthijs Douze, Holger Schwenk
We introduce the SIMAT dataset to evaluate the task of Image Retrieval with Multimodal queries.
1 code implementation • 17 Jun 2021 • Matthijs Douze, Giorgos Tolias, Ed Pizzi, Zoë Papakipos, Lowik Chanussot, Filip Radenovic, Tomas Jenicek, Maxim Maximov, Laura Leal-Taixé, Ismail Elezi, Ondřej Chum, Cristian Canton Ferrer
This benchmark is used for the Image Similarity Challenge at NeurIPS'21 (ISC2021).
Ranked #1 on Image Similarity Detection on DISC21 dev
12 code implementations • NeurIPS 2021 • Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.
Ranked #58 on Instance Segmentation on COCO minival
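The "transposed" attention described in this entry can be sketched as follows. This is a minimal single-head illustration, not the authors' implementation: the function names, the temperature parameter `tau`, the per-channel L2 normalization, and the axis over which the softmax is taken are assumptions of this sketch.

```python
import math


def l2_normalize_columns(x):
    """Scale each column (feature channel) of an N x d matrix to unit L2 norm."""
    n, d = len(x), len(x[0])
    norms = [math.sqrt(sum(x[t][c] ** 2 for t in range(n))) or 1.0
             for c in range(d)]
    return [[x[t][c] / norms[c] for c in range(d)] for t in range(n)]


def cross_covariance_attention(q, k, v, tau=1.0):
    """Attention over feature channels: the attention map is d x d, not N x N."""
    n, d = len(q), len(q[0])
    qn, kn = l2_normalize_columns(q), l2_normalize_columns(k)
    # Cross-covariance scores between key channel i and query channel j.
    scores = [[sum(kn[t][i] * qn[t][j] for t in range(n)) / tau
               for j in range(d)] for i in range(d)]
    # Softmax over key channels i, separately for each query channel j.
    attn = [[0.0] * d for _ in range(d)]
    for j in range(d):
        column = [scores[i][j] for i in range(d)]
        top = max(column)
        exps = [math.exp(s - top) for s in column]
        total = sum(exps)
        for i in range(d):
            attn[i][j] = exps[i] / total
    # Mix the channels of every token's value vector: out = V @ attn (N x d).
    return [[sum(v[t][i] * attn[i][j] for i in range(d)) for j in range(d)]
            for t in range(n)]


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 tokens, 2 channels
values = [[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]]
out = cross_covariance_attention(tokens, tokens, values)
```

The attention map has size d x d regardless of the number of tokens, so in this sketch the cost of building it grows linearly with sequence length rather than quadratically.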
12 code implementations • ICCV 2021 • Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze
We design a family of image classification architectures that optimize the trade-off between accuracy and efficiency in a high-speed regime.
Ranked #15 on Image Classification on Stanford Cars
37 code implementations • 23 Dec 2020 • Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou
In this work, we produce a competitive convolution-free transformer by training on Imagenet only.
Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-S)
no code implementations • ICCV 2021 • Hugo Touvron, Alexandre Sablayrolles, Matthijs Douze, Matthieu Cord, Hervé Jégou
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
Ranked #2 on Learning with coarse labels on cifar100
Fine-Grained Image Classification
Learning with coarse labels
no code implementations • 13 Aug 2020 • Hugo Touvron, Matthijs Douze, Matthieu Cord, Hervé Jégou
We propose a simple architecture to address unpaired image-to-image translation tasks: style or class transfer, denoising, deblurring, deblocking, etc.
Ranked #1 on Image-to-Image Translation on vangogh2photo (Frechet Inception Distance metric)
1 code implementation • 9 Jul 2020 • Yongqin Xian, Bruno Korbar, Matthijs Douze, Lorenzo Torresani, Bernt Schiele, Zeynep Akata
Few-shot learning aims to recognize novel classes from a few examples.
1 code implementation • 2 Jul 2020 • Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux
Contrastive Predictive Coding (CPC), which predicts future segments of speech from past segments, is emerging as a powerful algorithm for representation learning of speech signals.
1 code implementation • 18 Mar 2020 • Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou
An EfficientNet-L2 pre-trained with weak supervision on 300M unlabeled images and further optimized with FixRes achieves 88.5% top-1 accuracy (top-5: 98.7%), which establishes the new state of the art for ImageNet with a single crop.
Ranked #9 on Image Classification on ImageNet ReaL (using extra training data)
2 code implementations • ICML 2020 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
The mark is robust to strong variations such as different architectures or optimization methods.
no code implementations • 16 Oct 2019 • Adrien Dufraux, Emmanuel Vincent, Awni Hannun, Armelle Brun, Matthijs Douze
The transcriptions used to train an Automatic Speech Recognition (ASR) system may contain errors.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
no code implementations • 29 Aug 2019 • Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou
Membership inference determines, given a sample and trained parameters of a machine learning model, whether the sample was part of the training set.
3 code implementations • NeurIPS 2019 • Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou
Conversely, when training a ResNeXt-101 32x48d pre-trained in weakly-supervised fashion on 940 million public images at resolution 224x224 and further optimizing for test resolution 320x320, we obtain a test top-1 accuracy of 86.4% (top-5: 98.0%) (single-crop).
Ranked #2 on Fine-Grained Image Classification on Birdsnap (using extra training data)
3 code implementations • 14 Feb 2019 • Maxim Berman, Hervé Jégou, Andrea Vedaldi, Iasonas Kokkinos, Matthijs Douze
When fed to a linear classifier, the learned embeddings provide state-of-the-art classification accuracy.
Ranked #1 on Image Retrieval on INRIA Holidays
no code implementations • ICLR 2019 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and drop-out are employed to mitigate overfitting.
9 code implementations • ECCV 2018 • Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze
In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features.
Ranked #5 on Unsupervised Semantic Segmentation on ImageNet-S-50
2 code implementations • ICLR 2019 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
Discretizing multi-dimensional data distributions is a fundamental step of modern indexing methods.
1 code implementation • CVPR 2018 • Lorenzo Baraldi, Matthijs Douze, Rita Cucchiara, Hervé Jégou
This paper considers a learnable approach for comparing and aligning videos.
8 code implementations • CVPR 2018 • Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou
Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside the memory requirements.
no code implementations • 9 Aug 2017 • Matthijs Douze, Hervé Jégou, Jeff Johnson
While k-means is usually considered the gold standard for this task, we evaluate and demonstrate the value of diffusion methods that the state of the art has neglected, such as the Markov Clustering algorithm.
1 code implementation • CVPR 2018 • Matthijs Douze, Arthur Szlam, Bharath Hariharan, Hervé Jégou
This paper considers the problem of inferring image labels from images when only a few annotated examples are available at training time.
1 code implementation • WS 2017 • Holger Schwenk, Matthijs Douze
In this paper, we use the framework of neural machine translation to learn joint sentence representations across six very different languages.
Joint Multilingual Sentence Representations
Machine Translation
14 code implementations • 28 Feb 2017 • Jeff Johnson, Matthijs Douze, Hervé Jégou
Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures.
44 code implementations • 12 Dec 2016 • Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou, Tomas Mikolov
We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory.
1 code implementation • 21 Sep 2016 • Alexandre Sablayrolles, Matthijs Douze, Hervé Jégou, Nicolas Usunier
Hashing produces compact representations for documents, to perform tasks like classification or retrieval based on these short codes.
11 code implementations • 7 Sep 2016 • Matthijs Douze, Hervé Jégou, Florent Perronnin
This paper considers the problem of approximate nearest neighbor search in the compressed domain.
no code implementations • 1 Mar 2016 • Mattis Paulin, Julien Mairal, Matthijs Douze, Zaid Harchaoui, Florent Perronnin, Cordelia Schmid
Convolutional neural networks (CNNs) have recently received a lot of attention due to their ability to model local stationary structures in natural images in a multi-scale fashion, when learning all model parameters with supervision.
no code implementations • ICCV 2015 • Mattis Paulin, Matthijs Douze, Zaid Harchaoui, Julien Mairal, Florent Perronnin, Cordelia Schmid
Patch-level descriptors underlie several important computer vision tasks, such as stereo-matching or content-based image retrieval.
no code implementations • 15 Aug 2015 • Danila Potapov, Matthijs Douze, Jerome Revaud, Zaid Harchaoui, Cordelia Schmid
While important advances were recently made towards temporally localizing and recognizing specific human actions or activities in videos, efficient detection and classification of long video chunks belonging to semantically defined categories such as "pursuit" or "romance" remains challenging. We introduce a new dataset, Action Movie Franchises, consisting of a collection of Hollywood action movie franchises.
1 code implementation • 8 Jun 2015 • Matthijs Douze, Jérôme Revaud, Jakob Verbeek, Hervé Jégou, Cordelia Schmid
We address the problem of specific video event retrieval.
no code implementations • CVPR 2013 • Jerome Revaud, Matthijs Douze, Cordelia Schmid, Herve Jegou
Furthermore, we extend product quantization to complex vectors in order to compress our descriptors, and to compare them in the compressed domain.