Search Results for author: Patrick Pérez

Found 82 papers, 46 papers with code

Video Inpainting of Complex Scenes

no code implementations 18 Mar 2015 Alasdair Newson, Andrés Almansa, Matthieu Fradet, Yann Gousseau, Patrick Pérez

Our algorithm is able to deal with a variety of challenging situations which naturally arise in video inpainting, such as the correct reconstruction of dynamic textures, multiple moving objects and moving background.

Image Inpainting Video Editing +1

Sketching for Large-Scale Learning of Mixture Models

no code implementations 9 Jun 2016 Nicolas Keriven, Anthony Bourrier, Rémi Gribonval, Patrick Pérez

We propose a "compressive learning" framework where we estimate model parameters from a sketch of the training data.

Compressive Sensing Speaker Verification

Approximate search with quantized sparse representations

no code implementations 10 Aug 2016 Himalaya Jain, Patrick Pérez, Rémi Gribonval, Joaquin Zepeda, Hervé Jégou

This paper tackles the task of storing a large collection of vectors, such as visual descriptors, and of searching in it.

Quantization

ROAM: a Rich Object Appearance Model with Application to Rotoscoping

no code implementations CVPR 2017 Ondrej Miksik, Juan-Manuel Pérez-Rúa, Philip H. S. Torr, Patrick Pérez

Rotoscoping, the detailed delineation of scene elements through a video shot, is a painstaking task of tremendous importance in professional post-production pipelines.

Object

Unifying local and non-local signal processing with graph CNNs

no code implementations 24 Feb 2017 Gilles Puy, Srdan Kitic, Patrick Pérez

This paper deals with the unification of local and non-local signal processing on graphs within a single convolutional neural network (CNN) framework.

Style Transfer

MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

no code implementations ICCV 2017 Ayush Tewari, Michael Zollhöfer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Pérez, Christian Theobalt

In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image.

Face Reconstruction Monocular Reconstruction

Audio style transfer

1 code implementation 31 Oct 2017 Eric Grinstein, Ngoc Duong, Alexey Ozerov, Patrick Pérez

"Style transfer" among images has recently emerged as a very active research topic, fuelled by the power of convolutional neural networks (CNNs), and has fast become a very popular technology in social media.

Sound Audio and Speech Processing Classical Physics

Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250 Hz

no code implementations CVPR 2018 Ayush Tewari, Michael Zollhöfer, Pablo Garrido, Florian Bernard, Hyeongwoo Kim, Patrick Pérez, Christian Theobalt

To alleviate this problem, we present the first approach that jointly learns 1) a regressor for face shape, expression, reflectance and illumination on the basis of 2) a concurrently learned parametric face model.

Face Model Monocular Reconstruction

Learning a Complete Image Indexing Pipeline

no code implementations CVPR 2018 Himalaya Jain, Joaquin Zepeda, Patrick Pérez, Rémi Gribonval

To work at scale, a complete image indexing system comprises two components: an inverted file index to restrict the actual search to only a subset that should contain most of the items relevant to the query; and an approximate distance computation mechanism to rapidly scan these lists.

Clustering

Finding beans in burgers: Deep semantic-visual embedding with localization

1 code implementation CVPR 2018 Martin Engilberge, Louis Chevallier, Patrick Pérez, Matthieu Cord

Several works have proposed to learn a two-path neural network that maps images and texts, respectively, to the same shared Euclidean space where geometry captures useful semantic relationships.

Cross-Modal Retrieval Image Captioning +2

Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events

no code implementations 19 Apr 2018 Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez, Gaël Richard

Audio-visual representation learning is an important task from the perspective of designing machines with the ability to understand complex events.

Multiple Instance Learning Representation Learning

Deep Video Portraits

no code implementations 29 May 2018 Hyeongwoo Kim, Pablo Garrido, Ayush Tewari, Weipeng Xu, Justus Thies, Matthias Nießner, Patrick Pérez, Christian Richardt, Michael Zollhöfer, Christian Theobalt

In order to enable source-to-target video re-animation, we render a synthetic target video with the reconstructed head animation parameters from a source video, and feed it into the trained network -- thus taking full control of the target.

Face Model

A Flexible Convolutional Solver with Application to Photorealistic Style Transfer

no code implementations 13 Jun 2018 Gilles Puy, Patrick Pérez

In contrast to existing convnets that address the same task, our architecture derives directly from the structure of the gradient descent originally used to solve the style transfer problem [Gatys et al., 2016].

Rolling Shutter Correction Style Transfer

FML: Face Model Learning from Videos

no code implementations CVPR 2019 Ayush Tewari, Florian Bernard, Pablo Garrido, Gaurav Bharaj, Mohamed Elgharib, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt

In contrast, we propose multi-frame video-based self-supervised training of a deep network that (i) learns a face identity model both in shape and appearance while (ii) jointly learning to reconstruct 3D faces.

3D Reconstruction Face Model

Boosting Few-Shot Visual Learning with Self-Supervision

1 code implementation ICCV 2019 Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord

Few-shot learning and self-supervised learning address different facets of the same problem: how to train a model with little or no labeled data.

Few-Shot Learning Self-Supervised Learning

This dataset does not exist: training models from generated images

no code implementations 7 Nov 2019 Victor Besnier, Himalaya Jain, Andrei Bursuc, Matthieu Cord, Patrick Pérez

This naturally brings the question: Can we train a classifier only on the generated data?

xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation

1 code implementation CVPR 2020 Maximilian Jaritz, Tuan-Hung Vu, Raoul de Charette, Émilie Wirbel, Patrick Pérez

In this work, we explore how to learn from multi-modality and propose cross-modal UDA (xMUDA) where we assume the presence of 2D images and 3D point clouds for 3D semantic segmentation.

3D Semantic Segmentation Autonomous Driving +2

QUEST: Quantized embedding space for transferring knowledge

1 code implementation ECCV 2020 Himalaya Jain, Spyros Gidaris, Nikos Komodakis, Patrick Pérez, Matthieu Cord

Knowledge distillation refers to the process of training a compact student network to achieve better accuracy by learning from a high capacity teacher network.

Knowledge Distillation

Scattering Features for Multimodal Gait Recognition

no code implementations 23 Jan 2020 Srđan Kitić, Gilles Puy, Patrick Pérez, Philippe Gilberton

We consider the problem of identifying people on the basis of their walk (gait) pattern.

Gait Recognition

Deep Reinforcement Learning for Autonomous Driving: A Survey

no code implementations 2 Feb 2020 B Ravi Kiran, Ibrahim Sobh, Victor Talpaert, Patrick Mannion, Ahmad A. Al Sallab, Senthil Yogamani, Patrick Pérez

With the development of deep representation learning, the domain of reinforcement learning (RL) has become a powerful learning framework now capable of learning complex policies in high dimensional environments.

Autonomous Driving Imitation Learning +3

Learning Representations by Predicting Bags of Visual Words

1 code implementation CVPR 2020 Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord

Inspired by the success of NLP methods in this area, in this work we propose a self-supervised approach based on spatially dense image descriptions that encode discrete visual concepts, here called visual words.

Representation Learning

StyleRig: Rigging StyleGAN for 3D Control over Portrait Images

no code implementations CVPR 2020 Ayush Tewari, Mohamed Elgharib, Gaurav Bharaj, Florian Bernard, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt

StyleGAN generates photorealistic portrait images of faces with eyes, teeth, hair and context (neck, shoulders, background), but lacks a rig-like control over semantic face parameters that are interpretable in 3D, such as face pose, expressions, and scene illumination.

Handling new target classes in semantic segmentation with domain adaptation

no code implementations 2 Apr 2020 Maxime Bucher, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

In this work, we define and address a novel domain adaptation (DA) problem in semantic scene segmentation, where the target domain not only exhibits a data distribution shift w.r.t.

Scene Segmentation Universal Domain Adaptation +2

Photo style transfer with consistency losses

no code implementations 9 May 2020 Xu Yao, Gilles Puy, Patrick Pérez

We address the problem of style transfer between two photos and propose a new way to preserve photorealism.

Style Transfer

ESL: Entropy-guided Self-supervised Learning for Domain Adaptation in Semantic Segmentation

1 code implementation 15 Jun 2020 Antoine Saporta, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

While fully-supervised deep learning yields good models for urban scene semantic segmentation, these models struggle to generalize to new environments with different lighting or weather conditions for instance.

Self-Supervised Learning Semantic Segmentation +1

VRUNet: Multi-Task Learning Model for Intent Prediction of Vulnerable Road Users

no code implementations 10 Jul 2020 Adithya Ranga, Filippo Giruzzi, Jagdish Bhanushali, Emilie Wirbel, Patrick Pérez, Tuan-Hung Vu, Xavier Perrotton

In this paper we propose a multi-task learning model to predict pedestrian actions, crossing intent and forecast their future path from video sequences.

Autonomous Vehicles Motion Planning +1

PIE: Portrait Image Embedding for Semantic Control

no code implementations 20 Sep 2020 Ayush Tewari, Mohamed Elgharib, Mallikarjun B R., Florian Bernard, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt

We present the first approach for embedding real portrait images in the latent space of StyleGAN, which allows for intuitive editing of the head pose, facial expression, and scene illumination in the image.

Face Model

Detecting 32 Pedestrian Attributes for Autonomous Vehicles

1 code implementation 4 Dec 2020 Taylor Mordan, Matthieu Cord, Patrick Pérez, Alexandre Alahi

By increasing the number of attributes jointly learned, we highlight an issue related to the scales of gradients, which arises in MTL with numerous tasks.

Attribute Autonomous Driving +1

Driving Behavior Explanation with Multi-level Fusion

1 code implementation 9 Dec 2020 Hédi Ben-Younes, Éloi Zablocki, Patrick Pérez, Matthieu Cord

In this era of active development of autonomous vehicles, it becomes crucial to provide driving systems with the capacity to explain their decisions.

Explainable artificial intelligence Trajectory Prediction

Confidence Estimation via Auxiliary Models

no code implementations 11 Dec 2020 Charles Corbière, Nicolas Thome, Antoine Saporta, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

In this paper, we introduce a novel target criterion for model confidence, namely the true class probability (TCP).

Domain Adaptation Image Classification +1

Artificial Dummies for Urban Dataset Augmentation

1 code implementation 15 Dec 2020 Antonín Vobecký, David Hurych, Michal Uřičář, Patrick Pérez, Josef Šivic

This is achieved with a data generator (called DummyNet) with disentangled control of the pose, the appearance, and the target background scene.

Autonomous Driving

OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning

2 code implementations CVPR 2021 Spyros Gidaris, Andrei Bursuc, Gilles Puy, Nikos Komodakis, Matthieu Cord, Patrick Pérez

With this in mind, we propose a teacher-student scheme to learn representations by training a convolutional net to reconstruct a bag-of-visual-words (BoW) representation of an image, given as input a perturbed version of that same image.

object-detection Object Detection +5

Explainability of deep vision-based autonomous driving systems: Review and challenges

no code implementations 13 Jan 2021 Éloi Zablocki, Hédi Ben-Younes, Patrick Pérez, Matthieu Cord

The concept of explainability has several facets and the need for explainability is strong in driving, a safety-critical application.

Autonomous Driving Explainable artificial intelligence

StyleLess layer: Improving robustness for real-world driving

no code implementations 25 Mar 2021 Julien Rebut, Andrei Bursuc, Patrick Pérez

Robustness to various image corruptions, caused by changing weather conditions or sensor degradation and aging, is crucial for safety when such vehicles are deployed in the real world.

Autonomous Driving Semantic Segmentation

Neural Monocular 3D Human Motion Capture with Physical Awareness

no code implementations 3 May 2021 Soshi Shimada, Vladislav Golyanik, Weipeng Xu, Patrick Pérez, Christian Theobalt

We present a new trainable system for physically plausible markerless 3D human motion capture, which achieves state-of-the-art results in a broad range of challenging scenarios.

3D Pose Estimation

Semantic Palette: Guiding Scene Generation with Class Proportions

1 code implementation CVPR 2021 Guillaume Le Moing, Tuan-Hung Vu, Himalaya Jain, Patrick Pérez, Matthieu Cord

Despite the recent progress of generative adversarial networks (GANs) at synthesizing photo-realistic images, producing complex urban scenes remains a challenging problem.

Data Augmentation Image Generation +1

Large-Scale Unsupervised Object Discovery

1 code implementation NeurIPS 2021 Huy V. Vo, Elena Sizikova, Cordelia Schmid, Patrick Pérez, Jean Ponce

Extensive experiments on COCO and OpenImages show that, in the single-object discovery setting where a single prominent object is sought in each image, the proposed LOD (Large-scale Object Discovery) approach is on par with, or better than the state of the art for medium-scale datasets (up to 120K images), and over 37% better than the only other algorithms capable of scaling up to 1.7M images.

Multi-object discovery Object +2

Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation

1 code implementation ICCV 2021 Antoine Saporta, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

In this work, we address the task of unsupervised domain adaptation (UDA) for semantic segmentation in the presence of multiple target domains: the objective is to train a single model that can handle all these domains at test time.

Segmentation Semantic Segmentation +2

LiDARTouch: Monocular metric depth estimation with a few-beam LiDAR

1 code implementation 8 Sep 2021 Florent Bartoccioni, Éloi Zablocki, Patrick Pérez, Matthieu Cord, Karteek Alahari

In such a monocular setup, dense depth is obtained with either additional input from one or several expensive LiDARs, e.g., with 64 beams, or camera-only methods, which suffer from scale-ambiguity and infinite-depth problems.

Depth Completion Depth Estimation

Raising context awareness in motion forecasting

1 code implementation 16 Sep 2021 Hédi Ben-Younes, Éloi Zablocki, Mickaël Chen, Patrick Pérez, Matthieu Cord

Learning-based trajectory prediction models have encountered great success, with the promise of leveraging contextual information in addition to motion history.

Motion Forecasting Trajectory Prediction

Localizing Objects with Self-Supervised Transformers and no Labels

2 code implementations 29 Sep 2021 Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet, Jean Ponce

We also show that training a class-agnostic detector on the discovered objects boosts results by another 7 points.

Ranked #4 on Weakly-Supervised Object Localization on CUB-200-2011 (Top-1 Localization Accuracy metric)

Object Object Discovery +2

STEEX: Steering Counterfactual Explanations with Semantics

1 code implementation 17 Nov 2021 Paul Jacob, Éloi Zablocki, Hédi Ben-Younes, Mickaël Chen, Patrick Pérez, Matthieu Cord

In this work, we address the problem of producing counterfactual explanations for high-quality images and complex scenes.

counterfactual Counterfactual Explanation

CSG0: Continual Urban Scene Generation with Zero Forgetting

no code implementations 6 Dec 2021 Himalaya Jain, Tuan-Hung Vu, Patrick Pérez, Matthieu Cord

With the rapid advances in generative adversarial networks (GANs), the visual quality of synthesised scenes keeps improving, including for complex urban scenes with applications to automated driving.

Continual Learning Scene Generation +1

Raw High-Definition Radar for Multi-Task Learning

1 code implementation CVPR 2022 Julien Rebut, Arthur Ouaknine, Waqas Malik, Patrick Pérez

With their robustness to adverse weather conditions and ability to measure speeds, radar sensors have been part of the automotive landscape for more than two decades.

Multi-Task Learning Vocal Bursts Intensity Prediction

Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation

1 code implementation 21 Mar 2022 Antonin Vobecky, David Hurych, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic

This work investigates learning pixel-wise semantic image segmentation in urban scenes without any manual annotation, just from the raw non-curated data collected by cars which, equipped with cameras and LiDAR sensors, drive around a city.

Image Segmentation Segmentation +1

Multi-Head Distillation for Continual Unsupervised Domain Adaptation in Semantic Segmentation

1 code implementation 25 Apr 2022 Antoine Saporta, Arthur Douillard, Tuan-Hung Vu, Patrick Pérez, Matthieu Cord

Unsupervised Domain Adaptation (UDA) is a transfer learning task which aims at training on an unlabeled target domain by leveraging a labeled source domain.

Continual Learning Semantic Segmentation +2

HULC: 3D Human Motion Capture with Pose Manifold Sampling and Dense Contact Guidance

no code implementations 11 May 2022 Soshi Shimada, Vladislav Golyanik, Zhi Li, Patrick Pérez, Weipeng Xu, Christian Theobalt

Marker-less monocular 3D human motion capture (MoCap) with scene interactions is a challenging research topic relevant for extended reality, robotics and virtual avatar generation.

Active Learning Strategies for Weakly-supervised Object Detection

1 code implementation 25 Jul 2022 Huy V. Vo, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Jean Ponce

On COCO, using on average 10 fully-annotated images per class, or equivalently 1% of the training set, BiB also reduces the performance gap (in AP) between the weakly-supervised detector and the fully-supervised Fast RCNN by over 70%, showing a good trade-off between performance and data efficiency.

Active Learning Object +1

Self-supervised learning with rotation-invariant kernels

1 code implementation 28 Jul 2022 Léon Zheng, Gilles Puy, Elisa Riccietti, Patrick Pérez, Rémi Gribonval

We introduce a regularization loss based on kernel mean embeddings with rotation-invariant kernels on the hypersphere (also known as dot-product kernels) for self-supervised learning of image representations.

Self-Supervised Learning

Take One Gram of Neural Features, Get Enhanced Group Robustness

no code implementations 26 Aug 2022 Simon Roburin, Charles Corbière, Gilles Puy, Nicolas Thome, Matthieu Aubry, Renaud Marlet, Patrick Pérez

Predictive performance of machine learning models trained with empirical risk minimization (ERM) can degrade considerably under distribution shifts.

OCTET: Object-aware Counterfactual Explanations

1 code implementation CVPR 2023 Mehdi Zemni, Mickaël Chen, Éloi Zablocki, Hédi Ben-Younes, Patrick Pérez, Matthieu Cord

We conduct a set of experiments on counterfactual explanation benchmarks for driving scenes, and we show that our method can be adapted beyond classification, e.g., to explain semantic segmentation models.

Autonomous Driving counterfactual +4

PØDA: Prompt-driven Zero-shot Domain Adaptation

1 code implementation 6 Dec 2022 Mohammad Fahes, Tuan-Hung Vu, Andrei Bursuc, Patrick Pérez, Raoul de Charette

In this paper, we propose the task of 'Prompt-driven Zero-shot Domain Adaptation', where we adapt a model trained on a source domain using only a general description in natural language of the target domain, i.e., a prompt.

Image Classification object-detection +5

PODA: Prompt-driven Zero-shot Domain Adaptation

1 code implementation ICCV 2023 Mohammad Fahes, Tuan-Hung Vu, Andrei Bursuc, Patrick Pérez, Raoul de Charette

In this paper, we propose the task of 'Prompt-driven Zero-shot Domain Adaptation', where we adapt a model trained on a source domain using only a general description in natural language of the target domain, i.e., a prompt.

Image Classification Language Modelling +7

Diverse Probabilistic Trajectory Forecasting with Admissibility Constraints

1 code implementation 7 Feb 2023 Laura Calem, Hedi Ben-Younes, Patrick Pérez, Nicolas Thome

Predicting multiple trajectories for road users is important for automated driving systems: ego-vehicle motion planning indeed requires a clear view of the possible motions of the surrounding agents.

Motion Planning Structured Prediction +1

Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive?

1 code implementation 15 Jun 2023 Yihong Xu, Loïck Chambon, Éloi Zablocki, Mickaël Chen, Alexandre Alahi, Matthieu Cord, Patrick Pérez

In fact, conventional forecasting methods are usually neither trained nor tested in real-world pipelines (e.g., with upstream detection, tracking, and mapping modules).

Benchmarking Motion Forecasting

DiffHPE: Robust, Coherent 3D Human Pose Lifting with Diffusion

no code implementations 4 Sep 2023 Cédric Rommel, Eduardo Valle, Mickaël Chen, Souhaiel Khalfaoui, Renaud Marlet, Matthieu Cord, Patrick Pérez

We present an innovative approach to 3D Human Pose Estimation (3D-HPE) by integrating cutting-edge diffusion models, which have revolutionized diverse fields, but are relatively unexplored in 3D-HPE.

3D Human Pose Estimation

T-UDA: Temporal Unsupervised Domain Adaptation in Sequential Point Clouds

1 code implementation 15 Sep 2023 Awet Haileslassie Gebrehiwot, David Hurych, Karel Zimmermann, Patrick Pérez, Tomáš Svoboda

Deep perception models have to reliably cope with an open-world setting of domain shifts induced by different geographic regions, sensor properties, mounting positions, and several other reasons.

3D Semantic Segmentation Unsupervised Domain Adaptation

Decaf: Monocular Deformation Capture for Face and Hand Interactions

no code implementations 28 Sep 2023 Soshi Shimada, Vladislav Golyanik, Patrick Pérez, Christian Theobalt

At the core of our neural approach are a variational auto-encoder supplying the hand-face depth prior and modules that guide the 3D tracking by estimating the contacts and the deformations.

Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey

1 code implementation 19 Oct 2023 Oriane Siméoni, Éloi Zablocki, Spyros Gidaris, Gilles Puy, Patrick Pérez

We propose here a survey of unsupervised object localization methods that discover objects in images without requiring any manual annotation in the era of self-supervised ViTs.

Object Unsupervised Object Localization

Three Pillars improving Vision Foundation Model Distillation for Lidar

1 code implementation 26 Oct 2023 Gilles Puy, Spyros Gidaris, Alexandre Boulch, Oriane Siméoni, Corentin Sautier, Patrick Pérez, Andrei Bursuc, Renaud Marlet

In particular, thanks to our scalable distillation method named ScaLR, we show that scaling the 2D and 3D backbones and pretraining on diverse datasets leads to a substantial improvement of the feature quality.

Autonomous Driving Object Discovery +2

A Simple Recipe for Language-guided Domain Generalized Segmentation

1 code implementation 29 Nov 2023 Mohammad Fahes, Tuan-Hung Vu, Andrei Bursuc, Patrick Pérez, Raoul de Charette

Generalization to new domains not seen during training is one of the long-standing challenges in deploying neural networks in real-world applications.

Data Augmentation Semantic Segmentation

ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation

no code implementations 11 Dec 2023 Cédric Rommel, Victor Letzelter, Nermin Samet, Renaud Marlet, Matthieu Cord, Patrick Pérez, Eduardo Valle

Monocular 3D human pose estimation (3D-HPE) is an inherently ambiguous task, as a 2D pose in an image might originate from different possible 3D poses.

Monocular 3D Human Pose Estimation regression

Reliability in Semantic Segmentation: Can We Use Synthetic Data?

no code implementations 14 Dec 2023 Thibaut Loiseau, Tuan-Hung Vu, Mickael Chen, Patrick Pérez, Matthieu Cord

Assessing the reliability of perception models to covariate shifts and out-of-distribution (OOD) detection is crucial for safety-critical applications such as autonomous vehicles.

Autonomous Vehicles Out of Distribution (OOD) Detection +1

CLIP-DINOiser: Teaching CLIP a few DINO tricks for open-vocabulary semantic segmentation

1 code implementation 19 Dec 2023 Monika Wysoczańska, Oriane Siméoni, Michaël Ramamonjisoa, Andrei Bursuc, Tomasz Trzciński, Patrick Pérez

We propose to locally improve dense MaskCLIP features, which are computed with a simple modification of CLIP's last pooling layer, by integrating localization priors extracted from self-supervised features.

Open Vocabulary Semantic Segmentation Semantic Segmentation

Manipulating Trajectory Prediction with Backdoors

no code implementations 21 Dec 2023 Kaouther Messaoud, Kathrin Grosse, Mickael Chen, Matthieu Cord, Patrick Pérez, Alexandre Alahi

In this paper, we focus on backdoors - a security threat acknowledged in other fields but so far overlooked for trajectory prediction.

Autonomous Vehicles Trajectory Prediction

POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images

no code implementations NeurIPS 2023 Antonin Vobecky, Oriane Siméoni, David Hurych, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic

We describe an approach to predict an open-vocabulary 3D semantic voxel occupancy map from input 2D images with the objective of enabling 3D grounding, segmentation and retrieval of free-form language queries.

3D Semantic Occupancy Prediction 3D Semantic Segmentation +3
