no code implementations • 19 Sep 2023 • Anilkumar Swamy, Vincent Leroy, Philippe Weinzaepfel, Fabien Baradel, Salma Galaaoui, Romain Bregier, Matthieu Armando, Jean-Sebastien Franco, Gregory Rogez
Recent hand-object interaction datasets show limited real object variability and rely on fitting the MANO parametric model to obtain groundtruth hand shapes.
no code implementations • ICCV 2023 • Ginger Delmas, Philippe Weinzaepfel, Francesc Moreno-Noguer, Grégory Rogez
Automatically producing instructions to modify one's posture could open the door to endless applications, such as personalized coaching and in-home physical therapy.
no code implementations • 21 Jul 2023 • Jerome Revaud, Yohann Cabon, Romain Brégier, Jongmin Lee, Philippe Weinzaepfel
In this paper, we propose a new paradigm where a single generic SCR model is trained once to be then deployed to new test scenes, regardless of their scale and without further finetuning.
1 code implementation • ICCV 2023 • Philippe Weinzaepfel, Thomas Lucas, Vincent Leroy, Yohann Cabon, Vaibhav Arora, Romain Brégier, Gabriela Csurka, Leonid Antsfeld, Boris Chidlovskii, Jérôme Revaud
Despite impressive performance for high-level downstream tasks, self-supervised pre-training methods have not yet fully delivered on dense geometric vision tasks such as stereo matching or optical flow.
Ranked #1 on Optical Flow Estimation on KITTI 2012
1 code implementation • 21 Oct 2022 • Ginger Delmas, Philippe Weinzaepfel, Thomas Lucas, Francesc Moreno-Noguer, Grégory Rogez
This process extracts low-level pose information -- the posecodes -- using a set of simple but generic rules on the 3D keypoints.
1 code implementation • 19 Oct 2022 • Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Grégory Rogez
The discrete and compressed nature of the latent space allows the GPT-like model to focus on long-range signal, as it removes low-level redundancy in the input signal.
1 code implementation • 19 Oct 2022 • Philippe Weinzaepfel, Vincent Leroy, Thomas Lucas, Romain Brégier, Yohann Cabon, Vaibhav Arora, Leonid Antsfeld, Boris Chidlovskii, Gabriela Csurka, Jérôme Revaud
More precisely, we propose the pretext task of cross-view completion where the first input image is partially masked, and this masked content has to be reconstructed from the visible content and the second image.
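As a rough illustration of the masking step described above, the sketch below hides a random subset of patches from the first view and keeps the second view intact. It is not the authors' implementation: the function names, the list-of-patches representation, and the 0.9 mask ratio are all assumptions chosen for the example.

```python
import random


def split_visible_masked(num_patches, mask_ratio, seed=0):
    """Randomly pick which patch indices of the first view are hidden."""
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(round(num_patches * mask_ratio))
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


def build_completion_inputs(view1_patches, view2_patches, mask_ratio=0.9, seed=0):
    """Assemble inputs and targets for a cross-view completion step.

    The model would see the visible patches of view 1 plus every patch of
    view 2, and be asked to reconstruct the masked patches of view 1.
    (Hypothetical helper; mask_ratio=0.9 is an assumed default.)
    """
    visible, masked = split_visible_masked(len(view1_patches), mask_ratio, seed)
    return {
        "view1_visible": [(i, view1_patches[i]) for i in visible],
        "view2_all": list(enumerate(view2_patches)),
        "targets": [(i, view1_patches[i]) for i in masked],
    }
```

Keeping the patch index next to each patch lets a downstream model know where each visible or reconstructed patch belongs in the image grid.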
1 code implementation • 22 Aug 2022 • Fabien Baradel, Romain Brégier, Thibault Groueix, Philippe Weinzaepfel, Yannis Kalantidis, Grégory Rogez
It is simple, generic and versatile, as it can be plugged on top of any image-based model to transform it into a video-based model leveraging temporal information.
1 code implementation • 31 May 2022 • Martin Humenberger, Yohann Cabon, Noé Pion, Philippe Weinzaepfel, Donghwan Lee, Nicolas Guérin, Torsten Sattler, Gabriela Csurka
In order to investigate the consequences for visual localization, this paper focuses on understanding the role of image retrieval for multiple visual localization paradigms.
1 code implementation • ICLR 2022 • Philippe Weinzaepfel, Thomas Lucas, Diane Larlus, Yannis Kalantidis
Second, they are typically trained with a global loss that only acts on top of an aggregation of local features; by contrast, testing is based on local feature matching, which creates a discrepancy between training and testing.
Ranked #3 on Image Retrieval on ROxford (Medium)
1 code implementation • CVPR 2022 • Jérôme Revaud, Vincent Leroy, Philippe Weinzaepfel, Boris Chidlovskii
In this paper, we propose to explicitly integrate two matching priors in a single loss in order to learn local descriptors without supervision.
no code implementations • 22 Dec 2021 • Thomas Lucas, Philippe Weinzaepfel, Gregory Rogez
We propose a method to leverage self-supervised methods that provides training signal in the absence of confident pseudo-labels.
1 code implementation • 18 Oct 2021 • Fabien Baradel, Thibault Groueix, Philippe Weinzaepfel, Romain Brégier, Yannis Kalantidis, Grégory Rogez
In fact, we show that simply fine-tuning the batch normalization layers of the model is enough to achieve large gains.
Ranked #6 on 3D Human Pose Estimation on MPI-INF-3DHP (Acceleration Error metric)
no code implementations • CVPR 2021 • Donghwan Lee, Soohyun Ryu, Suyong Yeon, Yonghan Lee, Deokhwa Kim, Cheolho Han, Yohann Cabon, Philippe Weinzaepfel, Nicolas Guérin, Gabriela Csurka, Martin Humenberger
In this paper, we introduce 5 new indoor datasets for visual localization in challenging real-world environments.
1 code implementation • 17 Dec 2020 • Jens Lundell, Enric Corona, Tran Nguyen Le, Francesco Verdoja, Philippe Weinzaepfel, Gregory Rogez, Francesc Moreno-Noguer, Ville Kyrki
While there exist many methods for manipulating rigid objects with parallel-jaw grippers, grasping with multi-finger robotic hands remains a quite unexplored research topic.
no code implementations • 4 Dec 2020 • Vincent Leroy, Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Grégory Rogez
Predicting 3D human pose from images has seen great recent improvements.
2 code implementations • NeurIPS 2020 • Thibault Castells, Philippe Weinzaepfel, Jerome Revaud
The key idea is to somehow estimate the importance (or weight) of each sample directly during training based on the observation that easy and hard samples behave differently and can therefore be separated.
1 code implementation • NeurIPS 2020 • Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, Diane Larlus
Based on these observations, and motivated by the success of data mixing, we propose hard negative mixing strategies at the feature level, that can be computed on-the-fly with a minimal computational overhead.
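One way to picture mixing hard negatives at the feature level is sketched below: for a query embedding, take the negatives most similar to it and synthesize extra negatives as normalized convex combinations of pairs of them. This is a minimal sketch under stated assumptions, not the paper's exact procedure; the function names, `top_k`, `num_synthetic`, and the uniform mixing coefficient are all hypothetical.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def mix_hard_negatives(query, negatives, top_k=4, num_synthetic=2, seed=0):
    """Create synthetic negatives by mixing the hardest existing ones.

    "Hardest" here means highest dot-product similarity to the query.
    Each synthetic negative is a convex combination of two distinct hard
    negatives, re-normalized to unit length. Assumes top_k >= 2 and that
    a mixed vector is never exactly zero.
    """
    rng = random.Random(seed)
    hardest = sorted(negatives, key=lambda n: dot(query, n), reverse=True)[:top_k]
    synthetic = []
    for _ in range(num_synthetic):
        a, b = rng.sample(hardest, 2)   # two distinct hard negatives
        alpha = rng.random()            # mixing coefficient in [0, 1)
        mixed = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
        synthetic.append(l2_normalize(mixed))
    return synthetic
```

Because only a sort over similarities and a few vector sums are involved, such mixing can be done per batch during training at low cost.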
1 code implementation • ECCV 2020 • Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Vincent Leroy, Grégory Rogez
We introduce DOPE, the first method to detect and estimate whole-body 3D human poses, including bodies, hands and faces, in the wild.
no code implementations • ECCV 2020 • Anil Armagan, Guillermo Garcia-Hernando, Seungryul Baek, Shreyas Hampali, Mahdi Rad, Zhaohui Zhang, Shipeng Xie, Mingxiu Chen, Boshen Zhang, Fu Xiong, Yang Xiao, Zhiguo Cao, Junsong Yuan, Pengfei Ren, Weiting Huang, Haifeng Sun, Marek Hrúz, Jakub Kanis, Zdeněk Krňoul, Qingfu Wan, Shile Li, Linlin Yang, Dongheui Lee, Angela Yao, Weiguo Zhou, Sijia Mei, Yun-hui Liu, Adrian Spurr, Umar Iqbal, Pavlo Molchanov, Philippe Weinzaepfel, Romain Brégier, Grégory Rogez, Vincent Lepetit, Tae-Kyun Kim
To address these issues, we designed a public challenge (HANDS'19) to evaluate the abilities of current 3D hand pose estimators (HPEs) to interpolate and extrapolate the poses of a training set.
no code implementations • 16 Dec 2019 • Philippe Weinzaepfel, Grégory Rogez
Our experiments show that (a) state-of-the-art 3D convolutional neural networks obtain disappointing results on such videos, highlighting the lack of true understanding of the human actions and (b) models leveraging body language via human pose are less prone to context biases.
2 code implementations • NeurIPS 2019 • Jerome Revaud, Cesar De Souza, Martin Humenberger, Philippe Weinzaepfel
We thus propose to jointly learn keypoint detection and description together with a predictor of the local descriptor discriminativeness.
Ranked #2 on Camera Localization on Aachen Day-Night benchmark
1 code implementation • 14 Jun 2019 • Jerome Revaud, Philippe Weinzaepfel, César De Souza, Noe Pion, Gabriela Csurka, Yohann Cabon, Martin Humenberger
In this work, we argue that salient regions are not necessarily discriminative, and therefore can harm the performance of the description.
no code implementations • CVPR 2019 • Philippe Weinzaepfel, Gabriela Csurka, Yohann Cabon, Martin Humenberger
We introduce a novel CNN-based approach for visual localization from a single RGB image that relies on densely matching a set of Objects-of-Interest (OOIs).
no code implementations • CVPR 2018 • Vasileios Choutas, Philippe Weinzaepfel, Jérôme Revaud, Cordelia Schmid
We use the human joints as these keypoints and term our Pose moTion representation PoTion.
Ranked #1 on Skeleton Based Action Recognition on J-HMDB
no code implementations • 1 Mar 2018 • Gregory Rogez, Philippe Weinzaepfel, Cordelia Schmid
We propose an end-to-end architecture for joint 2D and 3D human pose estimation in natural images.
3D Human Pose Estimation
3D Multi-Person Pose Estimation (absolute)
no code implementations • ICCV 2017 • Vicky Kalogeiton, Philippe Weinzaepfel, Vittorio Ferrari, Cordelia Schmid
dog and cat jumping, which makes it possible to detect actions of an object without training on these object-action pairs.
no code implementations • CVPR 2017 • Gregory Rogez, Philippe Weinzaepfel, Cordelia Schmid
We propose an end-to-end architecture for joint 2D and 3D human pose estimation in natural images.
Ranked #4 on 3D Multi-Person Pose Estimation (root-relative) on MuPoTS-3D (MPJPE metric)
3D Human Pose Estimation
3D Multi-Person Pose Estimation (absolute)
2 code implementations • ICCV 2017 • Vicky Kalogeiton, Philippe Weinzaepfel, Vittorio Ferrari, Cordelia Schmid
We propose the ACtion Tubelet detector (ACT-detector) that takes as input a sequence of frames and outputs tubelets, i.e., sequences of bounding boxes with associated scores.
Spatio-Temporal Action Localization
Temporal Action Localization
no code implementations • 17 May 2016 • Philippe Weinzaepfel, Xavier Martin, Cordelia Schmid
We introduce an approach for spatio-temporal human action localization using sparse spatial supervision.
1 code implementation • 25 Jun 2015 • Jerome Revaud, Philippe Weinzaepfel, Zaid Harchaoui, Cordelia Schmid
We introduce a novel matching algorithm, called DeepMatching, to compute dense correspondences between images.
Ranked #4 on Dense Pixel Correspondence Estimation on HPatches
Dense Pixel Correspondence Estimation
Optical Flow Estimation
no code implementations • ICCV 2015 • Philippe Weinzaepfel, Zaid Harchaoui, Cordelia Schmid
We present experimental results for spatio-temporal localization on the UCF-Sports, J-HMDB and UCF-101 action localization datasets, where our approach outperforms the state of the art with a margin of 15%, 7% and 12% respectively in mAP.
Spatio-Temporal Action Localization
Temporal Action Localization
no code implementations • CVPR 2015 • Philippe Weinzaepfel, Jerome Revaud, Zaid Harchaoui, Cordelia Schmid
We compare the results obtained with several state-of-the-art optical flow approaches and study the impact of the different cues used in the random forest. Furthermore, we introduce a new dataset, the YouTube Motion Boundaries dataset (YMB), that comprises 60 sequences taken from real-world videos with manually annotated motion boundaries.
no code implementations • CVPR 2015 • Jerome Revaud, Philippe Weinzaepfel, Zaid Harchaoui, Cordelia Schmid
We propose a novel approach for optical flow estimation, targeted at large displacements with significant occlusions.