no code implementations • 27 Mar 2025 • Yan Xia, Xiaowei Zhou, Etienne Vouga, QiXing Huang, Georgios Pavlakos
In this paper, we introduce a method for reconstructing 3D humans from a single image using a biomechanically accurate skeleton model.
no code implementations • 18 Dec 2024 • Hanwen Jiang, Zexiang Xu, Desai Xie, Ziwen Chen, Haian Jin, Fujun Luan, Zhixin Shu, Kai Zhang, Sai Bi, Xin Sun, Jiuxiang Gu, QiXing Huang, Georgios Pavlakos, Hao Tan
We propose scaling up 3D scene reconstruction by training with synthesized data.
no code implementations • 1 Dec 2024 • Kumar Ashutosh, Georgios Pavlakos, Kristen Grauman
Anticipating how a person will interact with objects in an environment is essential for activity understanding, but existing methods are limited to the 2D space of video frames, capturing physically ungrounded predictions of 'what' and ignoring the 'where' and 'how'.
no code implementations • 15 Oct 2024 • Jinhan Li, Yifeng Zhu, Yuqi Xie, Zhenyu Jiang, Mingyo Seo, Georgios Pavlakos, Yuke Zhu
We study the problem of teaching humanoid robots manipulation skills by imitating from single video demonstrations.
no code implementations • 4 Oct 2024 • Brent Yi, Vickie Ye, Maya Zheng, Yunqi Li, Lea Müller, Georgios Pavlakos, Yi Ma, Jitendra Malik, Angjoo Kanazawa
We present EgoAllo, a system for human motion estimation from a head-mounted device.
no code implementations • 6 Sep 2024 • Vongani Maluleke, Lea Müller, Jathushan Rajasegaran, Georgios Pavlakos, Shiry Ginosar, Angjoo Kanazawa, Jitendra Malik
Our contributions are a demonstration of the advantages of socially conditioned future motion prediction and an in-the-wild, couple dance video dataset to enable future research in this direction.
no code implementations • 23 Aug 2024 • Haitao Yang, Yuan Dong, Hanwen Jiang, Dejia Xu, Georgios Pavlakos, QiXing Huang
Using the latent diffusion model has proven effective in developing novel 3D generation techniques.
no code implementations • 1 Aug 2024 • Kumar Ashutosh, Tushar Nagarajan, Georgios Pavlakos, Kris Kitani, Kristen Grauman
Our method takes a video demonstration and its accompanying 3D body pose and generates (1) free-form expert commentary describing what the person is doing well and what they could improve, and (2) a visual expert demonstration that incorporates the required corrections.
no code implementations • 3 Jul 2024 • Hezhen Hu, Zhiwen Fan, Tianhao Wu, Yihan Xi, Seoyoung Lee, Georgios Pavlakos, Zhangyang Wang
Nuanced expressiveness, particularly through fine-grained hand and facial expressions, is pivotal for enhancing the realism and vitality of digital human representations.
no code implementations • 12 Jun 2024 • Hanwen Jiang, QiXing Huang, Georgios Pavlakos
Real3D introduces a novel self-training framework that can benefit from both the existing synthetic data and diverse single-view real images.
no code implementations • 5 Jun 2024 • Hanwen Jiang, Haitao Yang, Georgios Pavlakos, QiXing Huang
With the same number of parameters as prior works, CoFie reduces the shape error by 48% and 56% on novel instances of training and unseen shape categories, respectively.
no code implementations • 29 May 2024 • Simranjit Singh, Georgios Pavlakos, Dimitrios Stamoulis
As interest in "reformulating" the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets.
no code implementations • CVPR 2024 • Nicolas Ugrinovic, Boxiao Pan, Georgios Pavlakos, Despoina Paschalidou, Bokui Shen, Jordi Sanchez-Riera, Francesc Moreno-Noguer, Leonidas Guibas
We introduce MultiPhys, a method designed for recovering multi-person motion from monocular videos.
no code implementations • 9 Apr 2024 • Jane Wu, Georgios Pavlakos, Georgia Gkioxari, Jitendra Malik
In order to obtain the best performing single frame model, we first present MCC-Hand-Object (MCC-HO), which jointly reconstructs hand and object geometry given a single RGB image and inferred 3D hand as inputs.
1 code implementation • 29 Mar 2024 • Zhiwen Fan, Kairun Wen, Wenyan Cong, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang
InstantSplat adopts a self-supervised framework that bridges the gap between 2D images and 3D representations using Gaussian Bundle Adjustment (GauBA) and can be optimized in an end-to-end manner.
1 code implementation • CVPR 2024 • Georgios Pavlakos, Dandan Shan, Ilija Radosavovic, Angjoo Kanazawa, David Fouhey, Jitendra Malik
The key to HaMeR's success lies in scaling up both the data used for training and the capacity of the deep network for hand reconstruction.
no code implementations • CVPR 2024 • Jiahui Lei, Yufu Wang, Georgios Pavlakos, Lingjie Liu, Kostas Daniilidis
We introduce the Gaussian Articulated Template Model (GART), an explicit, efficient, and expressive representation for non-rigid articulated subject capturing and rendering from monocular videos.
1 code implementation • CVPR 2024 • Lea Müller, Vickie Ye, Georgios Pavlakos, Michael Black, Angjoo Kanazawa
To address this, we present a novel approach that learns a prior over the 3D proxemics of two people in close social interaction and demonstrate its use for single-view 3D reconstruction.
1 code implementation • ICCV 2023 • Shubham Goel, Georgios Pavlakos, Jathushan Rajasegaran, Angjoo Kanazawa, Jitendra Malik
To analyze video, we use 3D reconstructions from HMR 2.0 as input to a tracking system that operates in 3D.
Ranked #3 on Pose Tracking on PoseTrack2018
no code implementations • CVPR 2023 • Anastasis Stathopoulos, Georgios Pavlakos, Ligong Han, Dimitris Metaxas
It is based on two key insights: (1) 2D keypoint estimation networks trained on as few as 50-150 images of a given object category generalize well and generate reliable pseudo-labels; (2) a data selection mechanism can automatically create a "curated" subset of the unlabeled web images that can be used for training -- we evaluate four data selection methods.
1 code implementation • CVPR 2023 • Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Christoph Feichtenhofer, Jitendra Malik
Subsequently, we propose a Lagrangian Action Recognition model by fusing 3D pose and contextualized appearance over tracklets.
Ranked #1 on Action Recognition on AVA v2.2 (using extra training data)
1 code implementation • CVPR 2023 • Vickie Ye, Georgios Pavlakos, Jitendra Malik, Angjoo Kanazawa
Our method robustly recovers the global 3D trajectories of people in challenging in-the-wild videos, such as PoseTrack.
no code implementations • 28 Jul 2022 • Georgios Pavlakos, Ethan Weber, Matthew Tancik, Angjoo Kanazawa
TV shows depict a wide variety of human behaviors and have been studied extensively for their potential to be a rich source of data for many applications.
1 code implementation • 12 Apr 2022 • Karl Schmeckpeper, Philip R. Osteen, Yufu Wang, Georgios Pavlakos, Kenneth Chaney, Wyatt Jordan, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis
Empirically, we show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios even against a cluttered background.
no code implementations • CVPR 2022 • Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik
For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner.
no code implementations • 8 Dec 2021 • Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik
For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner.
1 code implementation • NeurIPS 2021 • Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik
We find that 3D representations are more effective than 2D representations for tracking in these settings, and we obtain state-of-the-art performance.
1 code implementation • ICCV 2021 • Nikos Kolotouros, Georgios Pavlakos, Dinesh Jayaraman, Kostas Daniilidis
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
Ranked #2 on Multi-Hypotheses 3D Human Pose Estimation on AH36M
1 code implementation • CVPR 2022 • Georgios Pavlakos, Jitendra Malik, Angjoo Kanazawa
The tools we develop open the door to processing and analyzing, in 3D, content from a large library of edited media, which could be helpful for many downstream applications.
no code implementations • 24 Nov 2020 • Agelos Kratimenos, Georgios Pavlakos, Petros Maragos
Independent Sign Language Recognition is a complex visual recognition problem that combines several challenging tasks of Computer Vision due to the necessity to exploit and fuse information from hand gestures, body features and facial expressions.
1 code implementation • ECCV 2020 • Vasileios Choutas, Georgios Pavlakos, Timo Bolkart, Dimitrios Tzionas, Michael J. Black
To understand how people look, interact, or perform tasks, we need to quickly and accurately capture their 3D body, face, and hands together from an RGB image.
1 code implementation • CVPR 2020 • Wen Jiang, Nikos Kolotouros, Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis
Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.
Ranked #2 on 3D Human Reconstruction on AGORA
2 code implementations • 20 Feb 2020 • Vasileios Vasilopoulos, Georgios Pavlakos, Karl Schmeckpeper, Kostas Daniilidis, Daniel E. Koditschek
This paper solves the planar navigation problem by recourse to an online reactive scheme that exploits recent advances in SLAM and visual object recognition to recast prior geometric knowledge in terms of an offline catalogue of familiar objects.
Robotics
1 code implementation • ICCV 2019 • Georgios Pavlakos, Nikos Kolotouros, Kostas Daniilidis
Assuming that the texture of the person does not change dramatically between frames, we can apply a novel texture consistency loss, which enforces that each point in the texture map has the same texture value across all frames.
Ranked #27 on Weakly-supervised 3D Human Pose Estimation on Human3.6M
1 code implementation • ICCV 2019 • Nikos Kolotouros, Georgios Pavlakos, Michael J. Black, Kostas Daniilidis
Our approach is self-improving by nature, since better network estimates can lead the optimization to better solutions, while more accurate optimization fits provide better supervision for the network.
2 code implementations • CVPR 2019 • Nikos Kolotouros, Georgios Pavlakos, Kostas Daniilidis
Image-based features are attached to the mesh vertices and the Graph-CNN is responsible for processing them on the mesh structure, while the regression target for each vertex is its 3D location.
Ranked #35 on Monocular 3D Human Pose Estimation on Human3.6M
1 code implementation • CVPR 2019 • Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, Michael J. Black
We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild.
Ranked #1 on 3D Human Reconstruction on Expressive hands and faces dataset (EHF) (TR V2V (mm), left hand metric)
no code implementations • CVPR 2018 • Georgios Pavlakos, Luyang Zhu, Xiaowei Zhou, Kostas Daniilidis
The proposed approach outperforms previous baselines on this task and offers an attractive solution for direct prediction of 3D shape from a single color image.
Ranked #127 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)
1 code implementation • CVPR 2018 • Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis
This information can be acquired by human annotators for a wide range of images and poses.
Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (Use Video Sequence metric)
1 code implementation • 17 Apr 2018 • Xiaowei Zhou, Sikang Liu, Georgios Pavlakos, Vijay Kumar, Kostas Daniilidis
Current motion capture (MoCap) systems generally require markers and multiple calibrated cameras, which can be used only in constrained environments.
no code implementations • CVPR 2017 • Georgios Pavlakos, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis
In this paper, we present a geometry-driven approach to automatically collect annotations for human pose prediction tasks.
Ranked #28 on Weakly-supervised 3D Human Pose Estimation on Human3.6M
1 code implementation • 14 Mar 2017 • Georgios Pavlakos, Xiaowei Zhou, Aaron Chan, Konstantinos G. Derpanis, Kostas Daniilidis
This paper presents a novel approach to estimating the continuous six degree of freedom (6-DoF) pose (3D translation and rotation) of an object from a single RGB image.
Ranked #1 on Keypoint Detection on Pascal3D+
1 code implementation • 9 Jan 2017 • Xiaowei Zhou, Menglong Zhu, Georgios Pavlakos, Spyridon Leonardos, Konstantinos G. Derpanis, Kostas Daniilidis
Recovering 3D full-body human pose is a challenging problem with many applications.
4 code implementations • CVPR 2017 • Georgios Pavlakos, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis
This paper addresses the challenge of 3D human pose estimation from a single color image.
Ranked #18 on 3D Human Pose Estimation on HumanEva-I