1 code implementation • 27 Feb 2024 • Tyler L. Hayes, César R. de Souza, Namil Kim, Jiwon Kim, Riccardo Volpi, Diane Larlus
In this work, we look at ways to extend a detector trained for a set of base classes so it can i) spot the presence of novel classes, and ii) automatically enrich its repertoire to be able to detect those newly discovered classes together with the base ones.
no code implementations • 17 Feb 2024 • Juliette Marrie, Michael Arbel, Julien Mairal, Diane Larlus
Large pretrained visual models exhibit remarkable generalization across diverse recognition tasks.
no code implementations • 14 Feb 2024 • Yannis Kalantidis, Mert Bülent Sarıyıldız, Rafael S. Rezende, Philippe Weinzaepfel, Diane Larlus, Gabriela Csurka
After expanding the training set, we propose a training approach that leverages the specificities and the underlying geometry of this mix of real and synthetic images.
no code implementations • CVPR 2023 • Juliette Marrie, Michael Arbel, Diane Larlus, Julien Mairal
Data augmentation is known to improve the generalization capabilities of neural networks, provided that the set of transformations is chosen with care, a selection often performed manually.
1 code implementation • NeurIPS 2023 • Vadim Tschernezki, Ahmad Darkhalil, Zhifan Zhu, David Fouhey, Iro Laina, Diane Larlus, Dima Damen, Andrea Vedaldi
Compared to other neural rendering datasets, EPIC Fields is better tailored to video understanding because it is paired with labelled action segments and the recent VISOR segment annotations.
no code implementations • 31 May 2023 • Subhankar Roy, Riccardo Volpi, Gabriela Csurka, Diane Larlus
Class-incremental semantic image segmentation assumes multiple model updates, each enriching the model to segment new categories.
no code implementations • CVPR 2023 • Mert Bulent Sariyildiz, Karteek Alahari, Diane Larlus, Yannis Kalantidis
We show that with minimal and class-agnostic prompt engineering, ImageNet clones are able to close a large part of the gap between models produced by synthetic images and models trained with real images, for the several standard classification benchmarks that we consider in this study.
no code implementations • 5 Oct 2022 • Jon Almazán, Byungsoo Ko, Geonmo Gu, Diane Larlus, Yannis Kalantidis
We address it with the proposed Grappa, an approach that starts from a strong pretrained model, and adapts it to tackle multiple retrieval tasks concurrently, using only unlabeled images from the different task domains.
no code implementations • 7 Sep 2022 • Vadim Tschernezki, Iro Laina, Diane Larlus, Andrea Vedaldi
We present Neural Feature Fusion Fields (N3F), a method that improves dense 2D image feature extractors when the latter are applied to the analysis of multiple images reconstructible as a 3D scene.
1 code implementation • 30 Jun 2022 • Mert Bulent Sariyildiz, Yannis Kalantidis, Karteek Alahari, Diane Larlus
We consider the problem of training a deep neural network on a given classification task, e.g., ImageNet-1K (IN1K), so that it excels at both the training task and other (future) transfer tasks.
1 code implementation • CVPR 2022 • Riccardo Volpi, Pau de Jorge, Diane Larlus, Gabriela Csurka
We propose a new problem formulation and a corresponding evaluation framework to advance research on unsupervised domain adaptation for semantic image segmentation.
1 code implementation • ICLR 2022 • Ginger Delmas, Rafael Sampaio de Rezende, Gabriela Csurka, Diane Larlus
While the first provides rich and implicit context for the search, the latter explicitly calls for new traits, or specifies how some elements of the example image should be changed to retrieve the desired target image.
Ranked #11 on Image Retrieval on CIRR
1 code implementation • ICLR 2022 • Philippe Weinzaepfel, Thomas Lucas, Diane Larlus, Yannis Kalantidis
Second, they are typically trained with a global loss that only acts on top of an aggregation of local features; by contrast, testing is based on local feature matching, which creates a discrepancy between training and testing.
Ranked #3 on Image Retrieval on ROxford (Medium)
no code implementations • 25 Oct 2021 • Jonathan Munro, Michael Wray, Diane Larlus, Gabriela Csurka, Dima Damen
Given a gallery of uncaptioned video sequences, this paper considers the task of retrieving videos based on their relevance to an unseen text query.
no code implementations • 19 Oct 2021 • Vadim Tschernezki, Diane Larlus, Andrea Vedaldi
Given a raw video sequence taken from a freely-moving camera, we study the problem of decomposing the observed 3D scene into a static background and a dynamic foreground containing the objects that move in the video sequence.
1 code implementation • 18 Oct 2021 • Yannis Kalantidis, Carlos Lassance, Jon Almazan, Diane Larlus
Dimensionality reduction methods are unsupervised approaches which learn low-dimensional spaces where some properties of the initial space, typically the notion of "neighborhood", are preserved.
4 code implementations • CVPR 2021 • Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus
Instead, we propose to use Probabilistic Cross-Modal Embedding (PCME), where samples from the different modalities are represented as probabilistic distributions in the common embedding space.
1 code implementation • ICCV 2021 • Mert Bulent Sariyildiz, Yannis Kalantidis, Diane Larlus, Karteek Alahari
In this paper, we argue that the semantic relationships between seen and unseen concepts affect generalization performance and propose ImageNet-CoG, a novel benchmark on the ImageNet-21K (IN-21K) dataset that enables measuring concept generalization in a principled way.
1 code implementation • 8 Dec 2020 • Andrés Mafla, Rafael Sampaio de Rezende, Lluís Gómez, Diane Larlus, Dimosthenis Karatzas
Then, armed with this dataset, we describe several approaches which leverage scene text, including a better scene-text aware cross-modal retrieval method which uses specialized representations for text from the captions and text from the visual scene, and reconciles them in a common embedding space.
no code implementations • CVPR 2021 • Riccardo Volpi, Diane Larlus, Grégory Rogez
In this context, we show that one way to learn models that are inherently more robust against forgetting is domain randomization: for vision tasks, randomizing the current domain's distribution with heavy image manipulations.
1 code implementation • NeurIPS 2020 • Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, Diane Larlus
Based on these observations, and motivated by the success of data mixing, we propose hard negative mixing strategies at the feature level that can be computed on-the-fly with minimal computational overhead.
no code implementations • ECCV 2020 • Mert Bulent Sariyildiz, Julien Perez, Diane Larlus
Starting from the observation that captioned images are easily crawlable, we argue that this overlooked source of information can be exploited to supervise the training of visual representations.
no code implementations • ICCV 2019 • Michael Wray, Diane Larlus, Gabriela Csurka, Dima Damen
We report the first retrieval results on fine-grained actions for the large-scale EPIC dataset, in a generalised zero-shot setting.
no code implementations • ECCV 2018 • David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi
Object detection and instance segmentation are dominated by region-based methods such as Mask RCNN.
no code implementations • CVPR 2018 • David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi
Self-supervision can dramatically cut back the amount of manually-labelled data required to train deep neural networks.
no code implementations • 16 Jan 2018 • Jon Almazan, Bojana Gajic, Naila Murray, Diane Larlus
In this paper we adopt a different approach and carefully design each component of a simple deep architecture and, critically, the strategy for training it effectively for person re-identification.
no code implementations • CVPR 2017 • Albert Gordo, Diane Larlus
Following this observation, we learn a visual embedding of the images where the similarity in the visual space is correlated with their semantic similarity surrogate.
no code implementations • ICCV 2017 • David Novotny, Diane Larlus, Andrea Vedaldi
Traditional approaches for learning 3D object categories use either synthetic data or manual supervision.
no code implementations • CVPR 2017 • David Novotny, Diane Larlus, Andrea Vedaldi
Despite significant progress of deep learning in recent years, state-of-the-art semantic matching methods still rely on legacy features such as SIFT or HoG.
4 code implementations • 25 Oct 2016 • Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus
Second, we build on the recent R-MAC descriptor, show that it can be interpreted as a deep and differentiable architecture, and present improvements to enhance it.
Ranked #13 on Image Retrieval on ROxford (Medium)
no code implementations • 5 Jul 2016 • David Novotny, Diane Larlus, Andrea Vedaldi
While recent research in image understanding has often focused on recognizing more types of objects, understanding more about the objects is just as important.
3 code implementations • 5 Apr 2016 • Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus
We propose a novel approach for instance-level image retrieval.
Ranked #3 on Image Retrieval on Oxf105k
no code implementations • 3 Mar 2016 • Gabriela Csurka, Diane Larlus, Albert Gordo, Jon Almazan
In this article we study the problem of document image representation based on visual features.
no code implementations • Conference 2015 • Florent Perronnin, Diane Larlus
Fisher Vectors (FV) and Convolutional Neural Networks (CNN) are two image classification pipelines with different strengths.
no code implementations • 18 Apr 2015 • David Novotný, Diane Larlus, Florent Perronnin, Andrea Vedaldi
Fisher Vectors and related orderless visual statistics have demonstrated excellent performance in object detection, sometimes superior to established approaches such as the Deformable Part Models.
no code implementations • 19 Aug 2014 • Yangmuzi Zhang, Diane Larlus, Florent Perronnin
A natural approach to teaching a visual concept, e.g., a bird species, is to show relevant images.
no code implementations • 24 Jun 2014 • Neda Salamati, Diane Larlus, Gabriela Csurka, Sabine Süsstrunk
Based on a state-of-the-art segmentation framework and a novel manually segmented image database (both indoor and outdoor scenes) that contains 4-channel images (RGB+NIR), we study how to best incorporate the specific characteristics of the NIR response.