Search Results for author: Philippe Weinzaepfel

Found 31 papers, 14 papers with code

Improved Cross-view Completion Pre-training for Stereo Matching

no code implementations • 18 Nov 2022 • Philippe Weinzaepfel, Vaibhav Arora, Yohann Cabon, Thomas Lucas, Romain Brégier, Vincent Leroy, Gabriela Csurka, Leonid Antsfeld, Boris Chidlovskii, Jérôme Revaud

However, the applicability of this concept has so far been limited in at least two ways: (a) by the difficulty of collecting real-world image pairs - in practice only synthetic data had been used - and (b) by the lack of generalization of vanilla transformers to dense downstream tasks for which relative position is more meaningful than absolute position.

Self-Supervised Learning Stereo Matching

PoseScript: 3D Human Poses from Natural Language

1 code implementation • 21 Oct 2022 • Ginger Delmas, Philippe Weinzaepfel, Thomas Lucas, Francesc Moreno-Noguer, Grégory Rogez

This process extracts low-level pose information -- the posecodes -- using a set of simple but generic rules on the 3D keypoints.

Cross-Modal Retrieval Image Captioning +3

PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting

1 code implementation • 19 Oct 2022 • Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Grégory Rogez

The discrete and compressed nature of the latent space allows the GPT-like model to focus on long-range signal, as it removes low-level redundancy in the input signal.

Human-Object Interaction Detection Quantization

CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion

no code implementations • 19 Oct 2022 • Philippe Weinzaepfel, Vincent Leroy, Thomas Lucas, Romain Brégier, Yohann Cabon, Vaibhav Arora, Leonid Antsfeld, Boris Chidlovskii, Gabriela Csurka, Jérôme Revaud

More precisely, we propose the pretext task of cross-view completion where the first input image is partially masked, and this masked content has to be reconstructed from the visible content and the second image.

Depth Estimation Depth Prediction +6

PoseBERT: A Generic Transformer Module for Temporal 3D Human Modeling

1 code implementation • 22 Aug 2022 • Fabien Baradel, Romain Brégier, Thibault Groueix, Philippe Weinzaepfel, Yannis Kalantidis, Grégory Rogez

It is simple, generic and versatile, as it can be plugged on top of any image-based model to transform it into a video-based model leveraging temporal information.

Pose Estimation Pose Prediction

Investigating the Role of Image Retrieval for Visual Localization -- An exhaustive benchmark

1 code implementation • 31 May 2022 • Martin Humenberger, Yohann Cabon, Noé Pion, Philippe Weinzaepfel, Donghwan Lee, Nicolas Guérin, Torsten Sattler, Gabriela Csurka

In order to investigate the consequences for visual localization, this paper focuses on understanding the role of image retrieval for multiple visual localization paradigms.

Autonomous Driving Image Retrieval +3

Learning Super-Features for Image Retrieval

1 code implementation • ICLR 2022 • Philippe Weinzaepfel, Thomas Lucas, Diane Larlus, Yannis Kalantidis

Second, they are typically trained with a global loss that only acts on top of an aggregation of local features; by contrast, testing is based on local feature matching, which creates a discrepancy between training and testing.

Image Retrieval Retrieval

PUMP: Pyramidal and Uniqueness Matching Priors for Unsupervised Learning of Local Descriptors

no code implementations • CVPR 2022 • Jérome Revaud, Vincent Leroy, Philippe Weinzaepfel, Boris Chidlovskii

In this paper, we propose to explicitly integrate two matching priors in a single loss in order to learn local descriptors without supervision.

Visual Localization

Barely-Supervised Learning: Semi-Supervised Learning with very few labeled images

no code implementations • 22 Dec 2021 • Thomas Lucas, Philippe Weinzaepfel, Gregory Rogez

We propose a method to leverage self-supervised methods that provides training signal in the absence of confident pseudo-labels.

Pseudo Label

Leveraging MoCap Data for Human Mesh Recovery

1 code implementation • 18 Oct 2021 • Fabien Baradel, Thibault Groueix, Philippe Weinzaepfel, Romain Brégier, Yannis Kalantidis, Grégory Rogez

In fact, we show that simply fine-tuning the batch normalization layers of the model is enough to achieve large gains.

Ranked #3 on 3D Human Pose Estimation on MPI-INF-3DHP (Acceleration Error metric)

3D Human Pose Estimation 3D Human Reconstruction +2

Multi-FinGAN: Generative Coarse-To-Fine Sampling of Multi-Finger Grasps

1 code implementation • 17 Dec 2020 • Jens Lundell, Enric Corona, Tran Nguyen Le, Francesco Verdoja, Philippe Weinzaepfel, Gregory Rogez, Francesc Moreno-Noguer, Ville Kyrki

While there exist many methods for manipulating rigid objects with parallel-jaw grippers, grasping with multi-finger robotic hands remains a largely unexplored research topic.

SuperLoss: A Generic Loss for Robust Curriculum Learning

2 code implementations • NeurIPS 2020 • Thibault Castells, Philippe Weinzaepfel, Jerome Revaud

The key idea is to somehow estimate the importance (or weight) of each sample directly during training based on the observation that easy and hard samples behave differently and can therefore be separated.

Image Classification Image Retrieval +4

Hard Negative Mixing for Contrastive Learning

1 code implementation • NeurIPS 2020 • Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, Diane Larlus

Based on these observations, and motivated by the success of data mixing, we propose hard negative mixing strategies at the feature level, that can be computed on-the-fly with a minimal computational overhead.

Contrastive Learning Data Augmentation +5

DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild

1 code implementation • ECCV 2020 • Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Vincent Leroy, Grégory Rogez

We introduce DOPE, the first method to detect and estimate whole-body 3D human poses, including bodies, hands and faces, in the wild.

3D Pose Estimation

Mimetics: Towards Understanding Human Actions Out of Context

no code implementations • 16 Dec 2019 • Philippe Weinzaepfel, Grégory Rogez

Our experiments show that (a) state-of-the-art 3D convolutional neural networks obtain disappointing results on such videos, highlighting the lack of true understanding of the human actions and (b) models leveraging body language via human pose are less prone to context biases.

3D Action Recognition Pose Estimation

R2D2: Repeatable and Reliable Detector and Descriptor

1 code implementation • 14 Jun 2019 • Jerome Revaud, Philippe Weinzaepfel, César De Souza, Noe Pion, Gabriela Csurka, Yohann Cabon, Martin Humenberger

In this work, we argue that salient regions are not necessarily discriminative, and therefore can harm the performance of the description.

Interest Point Detection Keypoint Detection +1

Visual Localization by Learning Objects-Of-Interest Dense Match Regression

no code implementations • CVPR 2019 • Philippe Weinzaepfel, Gabriela Csurka, Yohann Cabon, Martin Humenberger

We introduce a novel CNN-based approach for visual localization from a single RGB image that relies on densely matching a set of Objects-of-Interest (OOIs).

Regression Visual Localization

Action Tubelet Detector for Spatio-Temporal Action Localization

2 code implementations • ICCV 2017 • Vicky Kalogeiton, Philippe Weinzaepfel, Vittorio Ferrari, Cordelia Schmid

We propose the ACtion Tubelet detector (ACT-detector) that takes as input a sequence of frames and outputs tubelets, i.e., sequences of bounding boxes with associated scores.

Spatio-Temporal Action Localization Temporal Action Localization

Human Action Localization with Sparse Spatial Supervision

no code implementations • 17 May 2016 • Philippe Weinzaepfel, Xavier Martin, Cordelia Schmid

We introduce an approach for spatio-temporal human action localization using sparse spatial supervision.

Action Localization

Learning to track for spatio-temporal action localization

no code implementations • ICCV 2015 • Philippe Weinzaepfel, Zaid Harchaoui, Cordelia Schmid

We present experimental results for spatio-temporal localization on the UCF-Sports, J-HMDB and UCF-101 action localization datasets, where our approach outperforms the state of the art with a margin of 15%, 7% and 12% respectively in mAP.

Spatio-Temporal Action Localization Temporal Action Localization +1

Learning to Detect Motion Boundaries

no code implementations • CVPR 2015 • Philippe Weinzaepfel, Jerome Revaud, Zaid Harchaoui, Cordelia Schmid

We compare the results obtained with several state-of-the-art optical flow approaches and study the impact of the different cues used in the random forest. Furthermore, we introduce a new dataset, the YouTube Motion Boundaries dataset (YMB), that comprises 60 sequences taken from real-world videos with manually annotated motion boundaries.

Boundary Detection Optical Flow Estimation
