Search Results for author: Matthieu Cord

Found 122 papers, 72 papers with code

Mind-to-Image: Projecting Visual Mental Imagination of the Brain from fMRI

no code implementations • 8 Apr 2024 • Hugo Caselles-Dupré, Charles Mellerio, Paul Hérent, Alizée Lopez-Persem, Benoit Béranger, Mathieu Soularue, Pierre Fautrel, Gauthier Vernier, Matthieu Cord

The reconstruction of images observed by subjects from fMRI data collected during visual stimuli has made significant strides in the past decade, thanks to the availability of extensive fMRI datasets and advancements in generative models for image generation.

Image Generation

FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models

no code implementations • 29 Mar 2024 • Barbara Toniella Corradini, Mustafa Shukor, Paul Couairon, Guillaume Couairon, Franco Scarselli, Matthieu Cord

The pipeline is as follows: the image is passed to both a captioner model (i.e., BLIP) and a diffusion model (i.e., Stable Diffusion Model) to generate a text description and visual representation, respectively.

Image Generation Image Segmentation +3

UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction

1 code implementation • 22 Mar 2024 • Lan Feng, Mohammadhossein Bahari, Kaouther Messaoud Ben Amor, Éloi Zablocki, Matthieu Cord, Alexandre Alahi

Vehicle trajectory prediction has increasingly relied on data-driven solutions, but their ability to scale to different data domains and the impact of larger dataset sizes on their generalization remain under-explored.

Ranked #1 on Trajectory Prediction on nuScenes (using extra training data)

Trajectory Prediction

Improved Baselines for Data-efficient Perceptual Augmentation of LLMs

no code implementations • 20 Mar 2024 • Théophane Vallaeys, Mustafa Shukor, Matthieu Cord, Jakob Verbeek

The abilities of large language models (LLMs) have recently progressed to unprecedented levels, paving the way to novel applications in a wide variety of areas.

Audio captioning Image Captioning +2

Manipulating Trajectory Prediction with Backdoors

no code implementations • 21 Dec 2023 • Kaouther Messaoud, Kathrin Grosse, Mickael Chen, Matthieu Cord, Patrick Pérez, Alexandre Alahi

In this paper, we focus on backdoors - a security threat acknowledged in other fields but so far overlooked for trajectory prediction.

Autonomous Vehicles Trajectory Prediction

Reliability in Semantic Segmentation: Can We Use Synthetic Data?

no code implementations • 14 Dec 2023 • Thibaut Loiseau, Tuan-Hung Vu, Mickael Chen, Patrick Pérez, Matthieu Cord

Assessing the reliability of perception models to covariate shifts and out-of-distribution (OOD) detection is crucial for safety-critical applications such as autonomous vehicles.

Autonomous Vehicles Out of Distribution (OOD) Detection +1

ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation

no code implementations • 11 Dec 2023 • Cédric Rommel, Victor Letzelter, Nermin Samet, Renaud Marlet, Matthieu Cord, Patrick Pérez, Eduardo Valle

Monocular 3D human pose estimation (3D-HPE) is an inherently ambiguous task, as a 2D pose in an image might originate from different possible 3D poses.

Monocular 3D Human Pose Estimation regression

PointBeV: A Sparse Approach to BeV Predictions

1 code implementation • 1 Dec 2023 • Loick Chambon, Eloi Zablocki, Mickael Chen, Florent Bartoccioni, Patrick Perez, Matthieu Cord

To address this, we propose PointBeV, a novel sparse BeV segmentation model operating on sparse BeV cells instead of dense grids.

Ranked #1 on Bird's-Eye View Semantic Segmentation on Lyft Level 5

Bird's-Eye View Semantic Segmentation

ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model

no code implementations • 24 Nov 2023 • Eslam Mohamed BAKR, Liangbing Zhao, Vincent Tao Hu, Matthieu Cord, Patrick Perez, Mohamed Elhoseiny

Diffusion-based generative models excel in perceptually impressive synthesis but face challenges in interpretability.

Denoising Image Generation

Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning

1 code implementation • 1 Oct 2023 • Mustafa Shukor, Alexandre Rame, Corentin Dancette, Matthieu Cord

Based on our ICL study, (3) we push ICL further and propose new multimodal ICL variants such as Multitask-ICL, Chain-of-Hindsight-ICL, and Self-Correcting-ICL.

In-Context Learning Instruction Following +1

Gradpaint: Gradient-Guided Inpainting with Diffusion Models

no code implementations • 18 Sep 2023 • Asya Grechka, Guillaume Couairon, Matthieu Cord

For the specific task of image inpainting, the current guiding mechanism relies on copying-and-pasting the known regions from the input image at each denoising step.

Denoising Image Inpainting +1

DiffHPE: Robust, Coherent 3D Human Pose Lifting with Diffusion

no code implementations • 4 Sep 2023 • Cédric Rommel, Eduardo Valle, Mickaël Chen, Souhaiel Khalfaoui, Renaud Marlet, Matthieu Cord, Patrick Pérez

We present an innovative approach to 3D Human Pose Estimation (3D-HPE) by integrating cutting-edge diffusion models, which have revolutionized diverse fields, but are relatively unexplored in 3D-HPE.

3D Human Pose Estimation

UnIVAL: Unified Model for Image, Video, Audio and Language Tasks

1 code implementation • 30 Jul 2023 • Mustafa Shukor, Corentin Dancette, Alexandre Rame, Matthieu Cord

Our model is efficiently pretrained on many tasks, based on task balancing and multimodal curriculum learning.

Out-of-Distribution Generalization

MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments

no code implementations • 18 Jul 2023 • Spyros Gidaris, Andrei Bursuc, Oriane Simeoni, Antonin Vobecky, Nikos Komodakis, Matthieu Cord, Patrick Pérez

Self-supervised learning can be used for mitigating the greedy needs of Vision Transformer networks for very large fully-annotated datasets.

Representation Learning Self-Supervised Learning

Zero-shot spatial layout conditioning for text-to-image diffusion models

no code implementations • ICCV 2023 • Guillaume Couairon, Marlène Careil, Matthieu Cord, Stéphane Lathuilière, Jakob Verbeek

Large-scale text-to-image diffusion models have significantly improved the state of the art in generative image modelling and allow for an intuitive and powerful user interface to drive the image generation process.

Image Generation Segmentation +1

OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents

1 code implementation • NeurIPS 2023 • Hugo Laurençon, Lucile Saulnier, Léo Tronchon, Stas Bekman, Amanpreet Singh, Anton Lozhkov, Thomas Wang, Siddharth Karamcheti, Alexander M. Rush, Douwe Kiela, Matthieu Cord, Victor Sanh

Large multimodal models trained on natural documents, which interleave images and text, outperform models trained on image-text pairs on various multimodal benchmarks.

Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive?

1 code implementation • 15 Jun 2023 • Yihong Xu, Loïck Chambon, Éloi Zablocki, Mickaël Chen, Alexandre Alahi, Matthieu Cord, Patrick Pérez

In fact, conventional forecasting methods are usually neither trained nor tested in real-world pipelines (e.g., with upstream detection, tracking, and mapping modules).

Benchmarking Motion Forecasting

Improving Selective Visual Question Answering by Learning from Your Peers

1 code implementation • CVPR 2023 • Corentin Dancette, Spencer Whitehead, Rishabh Maheshwary, Ramakrishna Vedantam, Stefan Scherer, Xinlei Chen, Matthieu Cord, Marcus Rohrbach

In this work, we explore Selective VQA in both in-distribution (ID) and OOD scenarios, where models are presented with mixtures of ID and OOD data.

Question Answering Visual Question Answering

eP-ALM: Efficient Perceptual Augmentation of Language Models

1 code implementation • ICCV 2023 • Mustafa Shukor, Corentin Dancette, Matthieu Cord

In this work, we propose to rather direct effort to efficient adaptations of existing models, and propose to augment Language Models with perception.

In-Context Learning Visual Question Answering (VQA)

PowerQuant: Automorphism Search for Non-Uniform Quantization

no code implementations • 24 Jan 2023 • Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kevin Bailly

In this paper, we identify the uniformity of the quantization operator as a limitation of existing approaches, and propose a data-free non-uniform method.

Quantization

Co-Training $2^L$ Submodels for Visual Recognition

1 code implementation • CVPR 2023 • Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou

Given a neural network to be trained, for each sample we implicitly instantiate two altered networks, "submodels", with stochastic depth, i.e., activating only a subset of the layers and skipping the others.

Image Classification Semantic Segmentation

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization

1 code implementation • 20 Dec 2022 • Alexandre Ramé, Kartik Ahuja, Jianyu Zhang, Matthieu Cord, Léon Bottou, David Lopez-Paz

In this paper, we thus propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks.

Ranked #14 on Domain Generalization on PACS

Domain Generalization Out-of-Distribution Generalization

Co-training $2^L$ Submodels for Visual Recognition

1 code implementation • 9 Dec 2022 • Hugo Touvron, Matthieu Cord, Maxime Oquab, Piotr Bojanowski, Jakob Verbeek, Hervé Jégou

We introduce submodel co-training, a regularization method related to co-training, self-distillation and stochastic depth.

Ranked #67 on Image Classification on ImageNet

Image Classification Semantic Segmentation

Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval

1 code implementation • 8 Dec 2022 • Mustafa Shukor, Nicolas Thome, Matthieu Cord

Finally, we validate the generalization of the approach to other tasks (i.e., Food Recognition) and domains with structured text such as the Medical domain on the ROCO dataset.

Ranked #1 on Cross-Modal Retrieval on Recipe1M+

Cross-Modal Retrieval Food Recognition +1

CoMFormer: Continual Learning in Semantic and Panoptic Segmentation

no code implementations • CVPR 2023 • Fabio Cermelli, Matthieu Cord, Arthur Douillard

In this paper, we present the first continual learning model capable of operating on both semantic and panoptic segmentation.

Ranked #2 on Continual Semantic Segmentation on ADE20K

Continual Learning Continual Semantic Segmentation +2

OCTET: Object-aware Counterfactual Explanations

1 code implementation • CVPR 2023 • Mehdi Zemni, Mickaël Chen, Éloi Zablocki, Hédi Ben-Younes, Patrick Pérez, Matthieu Cord

We conduct a set of experiments on counterfactual explanation benchmarks for driving scenes, and we show that our method can be adapted beyond classification, e.g., to explain semantic segmentation models.

Autonomous Driving counterfactual +4

DiffEdit: Diffusion-based semantic image editing with mask guidance

4 code implementations • 20 Oct 2022 • Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord

Semantic image editing is an extension of image generation, with the additional constraint that the generated image should be as similar as possible to a given input image.

Image Generation

Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment

1 code implementation • 29 Aug 2022 • Mustafa Shukor, Guillaume Couairon, Matthieu Cord

Vision and Language Pretraining has become the prevalent approach for tackling multimodal downstream tasks.

Retrieval Text Retrieval +4

SInGE: Sparsity via Integrated Gradients Estimation of Neuron Relevance

no code implementations • 8 Jul 2022 • Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kevin Bailly

The leap in performance in state-of-the-art computer vision methods is attributed to the development of deep neural networks.

LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation

1 code implementation • 27 Jun 2022 • Florent Bartoccioni, Éloi Zablocki, Andrei Bursuc, Patrick Pérez, Matthieu Cord, Karteek Alahari

Recent works in autonomous driving have widely adopted the bird's-eye-view (BEV) semantic map as an intermediate representation of the world.

Ranked #6 on Bird's-Eye View Semantic Segmentation on nuScenes

Autonomous Driving Bird's-Eye View Semantic Segmentation +1

Dynamic Query Selection for Fast Visual Perceiver

no code implementations • 22 May 2022 • Corentin Dancette, Matthieu Cord

Transformers have been matching deep convolutional networks for vision architectures in recent works.

Swapping Semantic Contents for Mixing Images

no code implementations • 20 May 2022 • Rémy Sun, Clément Masson, Gilles Hénaff, Nicolas Thome, Matthieu Cord

Deep architectures have proven capable of solving many tasks provided a sufficient amount of labeled data.

Data Augmentation

Towards efficient feature sharing in MIMO architectures

no code implementations • 20 May 2022 • Rémy Sun, Alexandre Ramé, Clément Masson, Nicolas Thome, Matthieu Cord

To solve this issue, we propose a novel unmixing step in MIMO architectures that allows subnetworks to properly share features.

Diverse Weight Averaging for Out-of-Distribution Generalization

2 code implementations • 19 May 2022 • Alexandre Ramé, Matthieu Kirchmeyer, Thibaud Rahier, Alain Rakotomamonjy, Patrick Gallinari, Matthieu Cord

Standard neural networks struggle to generalize under distribution shifts in computer vision.

Out-of-Distribution Generalization

Multi-Head Distillation for Continual Unsupervised Domain Adaptation in Semantic Segmentation

1 code implementation • 25 Apr 2022 • Antoine Saporta, Arthur Douillard, Tuan-Hung Vu, Patrick Pérez, Matthieu Cord

Unsupervised Domain Adaptation (UDA) is a transfer learning task which aims at training on an unlabeled target domain by leveraging a labeled source domain.

Continual Learning Semantic Segmentation +2

Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval

1 code implementation • 20 Apr 2022 • Mustafa Shukor, Guillaume Couairon, Asya Grechka, Matthieu Cord

We propose a new retrieval framework, T-Food (Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval) that exploits the interaction between modalities in a novel regularization scheme, while using only unimodal encoders at test time for efficient retrieval.

Ranked #3 on Cross-Modal Retrieval on Recipe1M

Cross-Modal Retrieval Retrieval

DeiT III: Revenge of the ViT

9 code implementations • 14 Apr 2022 • Hugo Touvron, Matthieu Cord, Hervé Jégou

Our evaluations on Image classification (ImageNet-1k with and without pre-training on ImageNet-21k), transfer learning and semantic segmentation show that our procedure outperforms by a large margin previous fully supervised training recipes for ViT.

Ranked #1 on Image Classification on ImageNet ReaL (Number of params metric)

Data Augmentation Image Classification +3

SPIQ: Data-Free Per-Channel Static Input Quantization

no code implementations • 28 Mar 2022 • Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kevin Bailly

Computationally expensive neural networks are ubiquitous in computer vision, and solutions for efficient inference have drawn growing attention in the machine learning community.

Data Free Quantization object-detection +2

Three things everyone should know about Vision Transformers

5 code implementations • 18 Mar 2022 • Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek, Hervé Jégou

(2) Fine-tuning the weights of the attention layers is sufficient to adapt vision transformers to a higher resolution and to other classification tasks.

Ranked #8 on Image Classification on CIFAR-10 (using extra training data)

Fine-Grained Image Classification

FlexIT: Towards Flexible Semantic Image Translation

1 code implementation • CVPR 2022 • Guillaume Couairon, Asya Grechka, Jakob Verbeek, Holger Schwenk, Matthieu Cord

Via the latent space of an auto-encoder, we iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms.

Image Generation Translation

Augmenting Convolutional networks with attention-based aggregation

5 code implementations • 27 Dec 2021 • Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Piotr Bojanowski, Armand Joulin, Gabriel Synnaeve, Hervé Jégou

We show how to augment any convolutional network with an attention-based global map to achieve non-local reasoning.

Ranked #38 on Semantic Segmentation on ADE20K val

Classification Image Classification +3

CSG0: Continual Urban Scene Generation with Zero Forgetting

no code implementations • 6 Dec 2021 • Himalaya Jain, Tuan-Hung Vu, Patrick Pérez, Matthieu Cord

With the rapid advances in generative adversarial networks (GANs), the visual quality of synthesised scenes keeps improving, including for complex urban scenes with applications to automated driving.

Continual Learning Scene Generation +1

Embedding Arithmetic of Multimodal Queries for Image Retrieval

no code implementations • 6 Dec 2021 • Guillaume Couairon, Matthieu Cord, Matthijs Douze, Holger Schwenk

We introduce the SIMAT dataset to evaluate the task of Image Retrieval with Multimodal queries.

Image Retrieval Image-text matching +3

RED : Looking for Redundancies for Data-Free Structured Compression of Deep Neural Networks

no code implementations • NeurIPS 2021 • Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kevin Bailly

Deep Neural Networks (DNNs) are ubiquitous in today's computer vision landscape, despite involving considerable computational costs.

DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion

1 code implementation • CVPR 2022 • Arthur Douillard, Alexandre Ramé, Guillaume Couairon, Matthieu Cord

Our strategy scales to a large number of tasks while having negligible memory and time overheads due to strict control of the parameters expansion.

Ranked #2 on Incremental Learning on ImageNet - 10 steps

Class Incremental Learning Incremental Learning

STEEX: Steering Counterfactual Explanations with Semantics

1 code implementation • 17 Nov 2021 • Paul Jacob, Éloi Zablocki, Hédi Ben-Younes, Mickaël Chen, Patrick Pérez, Matthieu Cord

In this work, we address the problem of producing counterfactual explanations for high-quality images and complex scenes.

counterfactual Counterfactual Explanation

Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis

1 code implementation • NeurIPS 2021 • Thomas Fel, Remi Cadene, Mathieu Chalvidal, Matthieu Cord, David Vigouroux, Thomas Serre

We describe a novel attribution method which is grounded in Sensitivity Analysis and uses Sobol indices.

RED++ : Data-Free Pruning of Deep Neural Networks via Input Splitting and Output Merging

no code implementations • 30 Sep 2021 • Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kevin Bailly

Pruning Deep Neural Networks (DNNs) is a prominent field of study in the goal of inference runtime acceleration.

Effective Uncertainty Estimation with Evidential Models for Open-World Recognition

no code implementations • 29 Sep 2021 • Charles Corbière, Marc Lafon, Nicolas Thome, Matthieu Cord, Patrick Perez

A crucial property of KLoS is to be a class-wise divergence measure built from in-distribution samples and to not require OOD training data, in contrast to current second-order uncertainty measures.

Raising context awareness in motion forecasting

1 code implementation • 16 Sep 2021 • Hédi Ben-Younes, Éloi Zablocki, Mickaël Chen, Patrick Pérez, Matthieu Cord

Learning-based trajectory prediction models have encountered great success, with the promise of leveraging contextual information in addition to motion history.

Motion Forecasting Trajectory Prediction

LiDARTouch: Monocular metric depth estimation with a few-beam LiDAR

1 code implementation • 8 Sep 2021 • Florent Bartoccioni, Éloi Zablocki, Patrick Pérez, Matthieu Cord, Karteek Alahari

In such a monocular setup, dense depth is obtained with either additional input from one or several expensive LiDARs, e.g., with 64 beams, or camera-only methods, which suffer from scale-ambiguity and infinite-depth problems.

Depth Completion Depth Estimation

Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization

2 code implementations • 7 Sep 2021 • Alexandre Rame, Corentin Dancette, Matthieu Cord

In this paper, we introduce a new regularization - named Fishr - that enforces domain invariance in the space of the gradients of the loss: specifically, the domain-level variances of gradients are matched across training domains.

Ranked #22 on Domain Generalization on TerraIncognita

Domain Generalization Out-of-Distribution Generalization

Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation

1 code implementation • ICCV 2021 • Antoine Saporta, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

In this work, we address the task of unsupervised domain adaptation (UDA) for semantic segmentation in presence of multiple target domains: The objective is to train a single model that can handle all these domains at test time.

Segmentation Semantic Segmentation +2

Tackling Catastrophic Forgetting and Background Shift in Continual Semantic Segmentation

1 code implementation • 29 Jun 2021 • Arthur Douillard, Yifu Chen, Arnaud Dapogny, Matthieu Cord

classes predicted by the old model to deal with background shift and avoid catastrophic forgetting of the old classes.

Ranked #6 on Overlapped 15-1 on PASCAL VOC 2012

Class Incremental Learning Continual Semantic Segmentation +5

Semantic Palette: Guiding Scene Generation with Class Proportions

1 code implementation • CVPR 2021 • Guillaume Le Moing, Tuan-Hung Vu, Himalaya Jain, Patrick Pérez, Matthieu Cord

Despite the recent progress of generative adversarial networks (GANs) at synthesizing photo-realistic images, producing complex urban scenes remains a challenging problem.

Data Augmentation Image Generation +1

RED : Looking for Redundancies for Data-Free Structured Compression of Deep Neural Networks

no code implementations • 31 May 2021 • Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kevin Bailly

Deep Neural Networks (DNNs) are ubiquitous in today's computer vision landscape, despite involving considerable computational costs.

ResMLP: Feedforward networks for image classification with data-efficient training

15 code implementations • NeurIPS 2021 • Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification.

Ranked #1 on Image Classification on Certificate Verification

Data Augmentation Fine-Grained Image Classification +4

Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering

1 code implementation • ICCV 2021 • Corentin Dancette, Remi Cadene, Damien Teney, Matthieu Cord

We use this new evaluation in a large-scale study of existing approaches for VQA.

Ranked #1 on Visual Question Answering (VQA) on VQA-CE

Question Answering Visual Question Answering

Going deeper with Image Transformers

19 code implementations • ICCV 2021 • Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou

In particular, we investigate the interplay of architecture and optimization of such dedicated transformers.

Ranked #5 on Image Classification on CIFAR-10 (using extra training data)

Image Classification Transfer Learning

MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks

1 code implementation • ICCV 2021 • Alexandre Rame, Remy Sun, Matthieu Cord

Recent strategies achieved ensembling "for free" by fitting concurrently diverse subnetworks inside a single base network.

Ranked #15 on Image Classification on Tiny ImageNet Classification

Image Classification

DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation

no code implementations • ICLR 2021 • Alexandre Rame, Matthieu Cord

Deep ensembles perform better than a single network thanks to the diversity among their members.

Out-of-Distribution Detection

Explainability of deep vision-based autonomous driving systems: Review and challenges

no code implementations • 13 Jan 2021 • Éloi Zablocki, Hédi Ben-Younes, Patrick Pérez, Matthieu Cord

The concept of explainability has several facets and the need for explainability is strong in driving, a safety-critical application.

Autonomous Driving Explainable artificial intelligence

Training data-efficient image transformers & distillation through attention

33 code implementations • 23 Dec 2020 • Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou

In this work, we produce a competitive convolution-free transformer by training on Imagenet only.

Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-S)

Document Image Classification Document Layout Analysis +2

OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning

2 code implementations • CVPR 2021 • Spyros Gidaris, Andrei Bursuc, Gilles Puy, Nikos Komodakis, Matthieu Cord, Patrick Pérez

With this in mind, we propose a teacher-student scheme to learn representations by training a convolutional net to reconstruct a bag-of-visual-words (BoW) representation of an image, given as input a perturbed version of that same image.

Ranked #18 on Semi-Supervised Image Classification on ImageNet - 1% labeled data (Top 5 Accuracy metric)

object-detection Object Detection +5

Confidence Estimation via Auxiliary Models

no code implementations • 11 Dec 2020 • Charles Corbière, Nicolas Thome, Antoine Saporta, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

In this paper, we introduce a novel target criterion for model confidence, namely the true class probability (TCP).

Domain Adaptation Image Classification +1

Driving Behavior Explanation with Multi-level Fusion

1 code implementation • 9 Dec 2020 • Hédi Ben-Younes, Éloi Zablocki, Patrick Pérez, Matthieu Cord

In this era of active development of autonomous vehicles, it becomes crucial to provide driving systems with the capacity to explain their decisions.

Explainable artificial intelligence Trajectory Prediction

Detecting 32 Pedestrian Attributes for Autonomous Vehicles

1 code implementation • 4 Dec 2020 • Taylor Mordan, Matthieu Cord, Patrick Pérez, Alexandre Alahi

By increasing the number of attributes jointly learned, we highlight an issue related to the scales of gradients, which arises in MTL with numerous tasks.

Attribute Autonomous Driving +1

Grafit: Learning fine-grained image representations with coarse labels

no code implementations • ICCV 2021 • Hugo Touvron, Alexandre Sablayrolles, Matthijs Douze, Matthieu Cord, Hervé Jégou

By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.

Ranked #2 on Learning with coarse labels on cifar100

Fine-Grained Image Classification Learning with coarse labels +3

PLOP: Learning without Forgetting for Continual Semantic Segmentation

1 code implementation • CVPR 2021 • Arthur Douillard, Yifu Chen, Arnaud Dapogny, Matthieu Cord

classes predicted by the old model to deal with background shift and avoid catastrophic forgetting of the old classes.

Ranked #1 on Domain 11-5 on Cityscapes val

Class Incremental Learning Continual Semantic Segmentation +16

Powers of layers for image-to-image translation

no code implementations • 13 Aug 2020 • Hugo Touvron, Matthijs Douze, Matthieu Cord, Hervé Jégou

We propose a simple architecture to address unpaired image-to-image translation tasks: style or class transfer, denoising, deblurring, deblocking, etc.

Ranked #1 on Image-to-Image Translation on horse2zebra (Frechet Inception Distance metric)

Deblurring Denoising +2

Insights from the Future for Continual Learning

1 code implementation • 24 Jun 2020 • Arthur Douillard, Eduardo Valle, Charles Ollion, Thomas Robert, Matthieu Cord

Continual learning aims to learn tasks sequentially, with (often severe) constraints on the storage of old learning samples, without suffering from catastrophic forgetting.

Class Incremental Learning Representation Learning +1

Overcoming Statistical Shortcuts for Open-ended Visual Counting

1 code implementation • 17 Jun 2020 • Corentin Dancette, Remi Cadene, Xinlei Chen, Matthieu Cord

First, we propose the Modifying Count Distribution (MCD) protocol, which penalizes models that over-rely on statistical shortcuts.

ESL: Entropy-guided Self-supervised Learning for Domain Adaptation in Semantic Segmentation

1 code implementation • 15 Jun 2020 • Antoine Saporta, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

While fully-supervised deep learning yields good models for urban scene semantic segmentation, these models struggle to generalize to new environments with different lighting or weather conditions for instance.

Self-Supervised Learning Semantic Segmentation +1

PODNet: Pooled Outputs Distillation for Small-Tasks Incremental Learning

2 code implementations • ECCV 2020 • Arthur Douillard, Matthieu Cord, Charles Ollion, Thomas Robert, Eduardo Valle

Lifelong learning has attracted much attention, but existing works still struggle to fight catastrophic forgetting and accumulate knowledge over long stretches of incremental learning.

Ranked #1 on Incremental Learning on CIFAR-100 - 50 classes + 50 steps of 1 class

Class Incremental Learning Incremental Learning +1

Deep Entwined Learning Head Pose and Face Alignment Inside an Attentional Cascade with Doubly-Conditional fusion

no code implementations • 14 Apr 2020 • Arnaud Dapogny, Kévin Bailly, Matthieu Cord

Head pose estimation and face alignment constitute a backbone preprocessing for many applications relying on face analysis.

Face Alignment Head Pose Estimation

Handling new target classes in semantic segmentation with domain adaptation

no code implementations • 2 Apr 2020 • Maxime Bucher, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

In this work, we define and address a novel domain adaptation (DA) problem in semantic scene segmentation, where the target domain not only exhibits a data distribution shift w.r.t.

Scene Segmentation Universal Domain Adaptation +2

Learning Representations by Predicting Bags of Visual Words

1 code implementation • CVPR 2020 • Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord

Inspired by the success of NLP methods in this area, in this work we propose a self-supervised approach based on spatially dense image descriptions that encode discrete visual concepts, here called visual words.

Representation Learning

Paper
Code

QUEST: Quantized embedding space for transferring knowledge

1 code implementation • ECCV 2020 • Himalaya Jain, Spyros Gidaris, Nikos Komodakis, Patrick Pérez, Matthieu Cord

Knowledge distillation refers to the process of training a compact student network to achieve better accuracy by learning from a high capacity teacher network.

Knowledge Distillation

Paper
Code

Addressing Failure Detection by Learning Model Confidence

1 code implementation • NeurIPS 2019 • Charles Corbière, Nicolas Thome, Avner Bar-Hen, Matthieu Cord, Patrick Pérez

In this paper, we propose a new target criterion for model confidence, corresponding to the True Class Probability (TCP).

Image Classification Semantic Segmentation

159

Paper
Code

RUBi: Reducing Unimodal Biases for Visual Question Answering

1 code implementation • NeurIPS 2019 • Remi Cadene, Corentin Dancette, Hedi Ben Younes, Matthieu Cord, Devi Parikh

We propose RUBi, a new learning strategy to reduce biases in any VQA model.

Question Answering Visual Question Answering

Paper
Code

This dataset does not exist: training models from generated images

no code implementations • 7 Nov 2019 • Victor Besnier, Himalaya Jain, Andrei Bursuc, Matthieu Cord, Patrick Pérez

This naturally brings the question: Can we train a classifier only on the generated data?

Paper
Add Code

REVE: Regularizing Deep Learning with Variational Entropy Bound

no code implementations • 15 Oct 2019 • Antoine Saporta, Yifu Chen, Michael Blot, Matthieu Cord

Studies on generalization performance of machine learning algorithms under the scope of information theory suggest that compressed representations can guarantee good generalization, inspiring many compression-based regularization methods.

Paper
Add Code

Addressing Failure Prediction by Learning Model Confidence

1 code implementation • NeurIPS 2019 • Charles Corbière, Nicolas Thome, Avner Bar-Hen, Matthieu Cord, Patrick Pérez

In this paper, we propose a new target criterion for model confidence, corresponding to the True Class Probability (TCP).

Image Classification Semantic Segmentation

159

Paper
Code

Riemannian batch normalization for SPD neural networks

no code implementations • NeurIPS 2019 • Daniel Brooks, Olivier Schwander, Frederic Barbaresco, Jean-Yves Schneider, Matthieu Cord

Covariance matrices have attracted attention for machine learning applications due to their capacity to capture interesting structure in the data.

Action Recognition

Paper
Add Code

RUBi: Reducing Unimodal Biases in Visual Question Answering

1 code implementation • 24 Jun 2019 • Remi Cadene, Corentin Dancette, Hedi Ben-Younes, Matthieu Cord, Devi Parikh

We propose RUBi, a new learning strategy to reduce biases in any VQA model.

Ranked #7 on Visual Question Answering (VQA) on VQA-CP

Question Answering Visual Question Answering

Paper
Code

Boosting Few-Shot Visual Learning with Self-Supervision

1 code implementation • ICCV 2019 • Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord

Few-shot learning and self-supervised learning address different facets of the same problem: how to train a model with little or no labeled data.

Few-Shot Learning Self-Supervised Learning

138

Paper
Code

Zero-Shot Semantic Segmentation

2 code implementations • NeurIPS 2019 • Maxime Bucher, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

Semantic segmentation models are limited in their ability to scale to large numbers of object classes.

Ranked #1 on Zero-Shot Learning on PASCAL Context

General Classification Segmentation +4

179

Paper
Code

DualDis: Dual-Branch Disentangling with Adversarial Learning

1 code implementation • 3 Jun 2019 • Thomas Robert, Nicolas Thome, Matthieu Cord

To effectively separate the information, we propose to use a combination of regular and adversarial classifiers to guide the two branches in specializing for class and attribute information respectively.

Attribute Data Augmentation +2

Paper
Code

The Missing Data Encoder: Cross-Channel Image Completion with Hide-And-Seek Adversarial Network

no code implementations • 6 May 2019 • Arnaud Dapogny, Matthieu Cord, Patrick Pérez

Image completion is the problem of generating whole images from fragments only.

Colorization Occlusion Handling +1

Paper
Add Code

SEMEDA: Enhancing Segmentation Precision with Semantic Edge Aware Loss

no code implementations • 6 May 2019 • Yifu Chen, Arnaud Dapogny, Matthieu Cord

As a result, the predictions outputted by such networks usually struggle to accurately capture the object boundaries and exhibit holes inside the objects.

Edge Detection Segmentation +1

Paper
Add Code

SoDeep: a Sorting Deep net to learn ranking loss surrogates

1 code implementation • CVPR 2019 • Martin Engilberge, Louis Chevallier, Patrick Pérez, Matthieu Cord

Our approach is based on a deep architecture that approximates the sorting of arbitrary sets of scores.

Image Retrieval Multi-Label Image Classification +1

Paper
Code

DeCaFA: Deep Convolutional Cascade for Face Alignment In The Wild

no code implementations • ICCV 2019 • Arnaud Dapogny, Kévin Bailly, Matthieu Cord

Face alignment is an active computer vision domain that consists of localizing a number of facial landmarks that vary across datasets.

Ranked #22 on Face Alignment on WFLW

Face Alignment

Paper
Add Code

DADA: Depth-aware Domain Adaptation in Semantic Segmentation

2 code implementations • ICCV 2019 • Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, Patrick Pérez

As a result, the performance of the trained semantic segmentation model on the target domain is boosted.

Ranked #17 on Image-to-Image Translation on SYNTHIA-to-Cityscapes

Image-to-Image Translation Segmentation +2

375

Paper
Code

MUREL: Multimodal Relational Reasoning for Visual Question Answering

1 code implementation • CVPR 2019 • Remi Cadene, Hedi Ben-Younes, Matthieu Cord, Nicolas Thome

In this paper, we propose MuRel, a multimodal relational network which is learned end-to-end to reason over real images.

Ranked #1 on Visual Question Answering (VQA) on TDIUC

Relational Reasoning Visual Question Answering

193

Paper
Code

BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection

1 code implementation • 31 Jan 2019 • Hedi Ben-Younes, Rémi Cadene, Nicolas Thome, Matthieu Cord

We demonstrate the practical interest of our fusion model by using BLOCK for two challenging tasks: Visual Question Answering (VQA) and Visual Relationship Detection (VRD), where we design end-to-end learnable architectures for representing relevant interactions between modalities.

Ranked #2 on Visual Relationship Detection on VRD Phrase Detection

Question Answering Relationship Detection +4

333

Paper
Code

Revisiting Multi-Task Learning with ROCK: a Deep Residual Auxiliary Block for Visual Detection

1 code implementation • NeurIPS 2018 • Taylor Mordan, Nicolas Thome, Gilles Henaff, Matthieu Cord

Multi-Task Learning (MTL) is appealing for deep learning regularization.

Depth Estimation Depth Prediction +5

Paper
Code

ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation

4 code implementations • CVPR 2019 • Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, Patrick Pérez

Semantic segmentation is a key problem for many computer vision tasks.

Ranked #4 on Domain Adaptation on Panoptic SYNTHIA-to-Mapillary

Segmentation Semantic Segmentation +2

3,124

Paper
Code

HybridNet: Classification and Reconstruction Cooperation for Semi-Supervised Learning

no code implementations • ECCV 2018 • Thomas Robert, Nicolas Thome, Matthieu Cord

In this paper, we introduce a new model for leveraging unlabeled data to improve generalization performances of image classifiers: a two-branch encoder-decoder architecture called HybridNet.

Ranked #53 on Image Classification on STL-10

Classification General Classification +1

Paper
Add Code

Manifold Learning in Quotient Spaces

no code implementations • CVPR 2018 • Éloi Mehr, André Lieutier, Fernando Sanchez Bermudez, Vincent Guitteny, Nicolas Thome, Matthieu Cord

Typically, we propose to quotient the space of 3D models by the action of rotations.

Paper
Add Code

SHADE: Information-Based Regularization for Deep Learning

1 code implementation • 14 May 2018 • Michael Blot, Thomas Robert, Nicolas Thome, Matthieu Cord

Regularization is a big issue for training deep neural networks.

General Classification

Paper
Code

Images & Recipes: Retrieval in the cooking context

1 code implementation • 2 May 2018 • Micael Carvalho, Rémi Cadène, David Picard, Laure Soulier, Matthieu Cord

Recent advances in the machine learning community allowed different use cases to emerge, as its association to domains like cooking which created the computational cuisine.

BIG-bench Machine Learning Retrieval

Paper
Code

Cross-Modal Retrieval in the Cooking Context: Learning Semantic Text-Image Embeddings

1 code implementation • 30 Apr 2018 • Micael Carvalho, Rémi Cadène, David Picard, Laure Soulier, Nicolas Thome, Matthieu Cord

Designing powerful tools that support cooking activities has rapidly gained popularity due to the massive amounts of available data, as well as recent advances in machine learning that are capable of analyzing them.

Ranked #9 on Cross-Modal Retrieval on Recipe1M

BIG-bench Machine Learning Cross-Modal Retrieval +1

Paper
Code

SHADE: Information Based Regularization for Deep Learning

no code implementations • 29 Apr 2018 • Michael Blot, Thomas Robert, Nicolas Thome, Matthieu Cord

Regularization is a big issue for training deep neural networks.

General Classification

Paper
Add Code

Finding beans in burgers: Deep semantic-visual embedding with localization

1 code implementation • CVPR 2018 • Martin Engilberge, Louis Chevallier, Patrick Pérez, Matthieu Cord

Several works have proposed to learn a two-path neural network that maps images and texts, respectively, to a same shared Euclidean space where geometry captures useful semantic relationships.

Cross-Modal Retrieval Image Captioning +2

Paper
Code

GoSGD: Distributed Optimization for Deep Learning with Gossip Exchange

no code implementations • 4 Apr 2018 • Michael Blot, David Picard, Matthieu Cord

We address the issue of speeding up the training of convolutional neural networks by studying a distributed method adapted to stochastic gradient descent.

Distributed Optimization

Paper
Add Code

SHADE: SHAnnon DEcay Information-Based Regularization for Deep Learning

no code implementations • ICLR 2018 • Michael Blot, Thomas Robert, Nicolas Thome, Matthieu Cord

Regularization is a big issue for training deep neural networks.

General Classification

Paper
Add Code

Deformable Part-based Fully Convolutional Network for Object Detection

no code implementations • 19 Jul 2017 • Taylor Mordan, Nicolas Thome, Matthieu Cord, Gilles Henaff

Existing region-based object detectors are limited to regions with fixed box geometry to represent objects, even if those are highly non-rectangular.

Object object-detection +1

Paper
Add Code

WILDCAT: Weakly Supervised Learning of Deep ConvNets for Image Classification, Pointwise Localization and Segmentation

2 code implementations • CVPR 2017 • Thibaut Durand, Taylor Mordan, Nicolas Thome, Matthieu Cord

This paper introduces WILDCAT, a deep learning method which jointly aims at aligning image regions for gaining spatial invariance and learning strongly localized features.

Ranked #3 on Weakly Supervised Object Detection on MS COCO

General Classification Image Classification +4

264

Paper
Code

MUTAN: Multimodal Tucker Fusion for Visual Question Answering

6 code implementations • ICCV 2017 • Hedi Ben-Younes, Rémi Cadene, Matthieu Cord, Nicolas Thome

Bilinear models provide an appealing framework for mixing and merging information in Visual Question Answering (VQA) tasks.

Ranked #35 on Visual Question Answering (VQA) on VQA v2 test-std

Visual Question Answering

698

Paper
Code

Gossip training for deep learning

1 code implementation • 29 Nov 2016 • Michael Blot, David Picard, Matthieu Cord, Nicolas Thome

We address the issue of speeding up the training of convolutional networks.

Paper
Code

Maxmin convolutional neural networks for image classification

no code implementations • 25 Oct 2016 • Michael Blot, Matthieu Cord, Nicolas Thome

Convolutional neural networks (CNN) are widely used in computer vision, especially in image classification.

Classification General Classification +2

Paper
Add Code

Master's Thesis : Deep Learning for Visual Recognition

1 code implementation • 18 Oct 2016 • Rémi Cadène, Nicolas Thome, Matthieu Cord

Our last contribution is a framework, built on top of Torch7, for training and testing deep models on any visual recognition task and on datasets of any scale.

Weakly-supervised Learning

Paper
Code

M2CAI Workflow Challenge: Convolutional Neural Networks with Time Smoothing and Hidden Markov Model for Video Frames Classification

1 code implementation • 18 Oct 2016 • Rémi Cadène, Thomas Robert, Nicolas Thome, Matthieu Cord

Our approach is among the three best to tackle the M2CAI Workflow challenge.

General Classification

Paper
Code

Closed-Form Training of Mahalanobis Distance for Supervised Clustering

no code implementations • CVPR 2016 • Marc T. Law, Yao-Liang Yu, Matthieu Cord, Eric P. Xing

Clustering is the task of grouping a set of objects so that objects in the same cluster are more similar to each other than to those in other clusters.

Clustering Metric Learning +1

Paper
Add Code

WELDON: Weakly Supervised Learning of Deep Convolutional Neural Networks

1 code implementation • CVPR 2016 • Thibaut Durand, Nicolas Thome, Matthieu Cord

In this paper, we introduce a novel framework for WEakly supervised Learning of Deep cOnvolutional neural Networks (WELDON).

Multiple Instance Learning Weakly-supervised Learning

Paper
Code

Deep Neural Networks Under Stress

1 code implementation • 11 May 2016 • Micael Carvalho, Matthieu Cord, Sandra Avila, Nicolas Thome, Eduardo Valle

In recent years, deep architectures have been used for transfer learning with state-of-the-art performance in many datasets.

Transfer Learning

Paper
Code

MANTRA: Minimum Maximum Latent Structural SVM for Image Classification and Ranking

no code implementations • ICCV 2015 • Thibaut Durand, Nicolas Thome, Matthieu Cord

For ranking, we propose efficient solutions to exactly solve the inference and the loss-augmented problems.

General Classification Image Classification +2

Paper
Add Code

Recipe recognition with large multimodal food dataset

1 code implementation • IEEE 2015 • Xin Wang, Devinder Kumar, Nicolas Thome, Matthieu Cord, Frederic Precioso

We present deep experiments of recipe recognition on our dataset using visual, textual information and fusion.

Image Classification Text Classification +1

Paper
Code

Fantope Regularization in Metric Learning

no code implementations • CVPR 2014 • Marc T. Law, Nicolas Thome, Matthieu Cord

This paper introduces a regularization method to explicitly control the rank of a learned symmetric positive semidefinite distance matrix in distance metric learning.

Face Verification General Classification +2

Paper
Add Code

Sequentially Generated Instance-Dependent Image Representations for Classification

no code implementations • 20 Dec 2013 • Gabriel Dulac-Arnold, Ludovic Denoyer, Nicolas Thome, Matthieu Cord, Patrick Gallinari

In this paper, we investigate a new framework for image classification that adaptively generates spatial representations.

Classification General Classification +1

Paper
Add Code

Top-Down Regularization of Deep Belief Networks

no code implementations • NeurIPS 2013 • Hanlin Goh, Nicolas Thome, Matthieu Cord, Joo-Hwee Lim

We suggest a deep learning strategy that bridges the gap between the two phases, resulting in a three-phase learning procedure.

Object Recognition

Paper
Add Code

Dynamic Scene Classification: Learning Motion Descriptors with Slow Features Analysis

no code implementations • CVPR 2013 • Christian Theriault, Nicolas Thome, Matthieu Cord

In this paper, we address the challenging problem of categorizing video sequences composed of dynamic natural scenes.

Classification General Classification +1

Paper
Add Code
