Search Results for author: Georgios Pavlakos

Found 33 papers, 20 papers with code

Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks

no code implementations • 29 May 2024 • Simranjit Singh, Georgios Pavlakos, Dimitrios Stamoulis

As interest in "reformulating" the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets.

MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

no code implementations • 18 Apr 2024 • Nicolas Ugrinovic, Boxiao Pan, Georgios Pavlakos, Despoina Paschalidou, Bokui Shen, Jordi Sanchez-Riera, Francesc Moreno-Noguer, Leonidas Guibas

We introduce MultiPhys, a method designed for recovering multi-person motion from monocular videos.

Motion Estimation

Reconstructing Hand-Held Objects in 3D

no code implementations • 9 Apr 2024 • Jane Wu, Georgios Pavlakos, Georgia Gkioxari, Jitendra Malik

At the same time, two strong anchors emerge in this setting: (1) estimated 3D hands help disambiguate the location and scale of the object, and (2) the set of manipulanda is small relative to all possible objects.

Object Object Reconstruction

InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds

no code implementations • 29 Mar 2024 • Zhiwen Fan, Wenyan Cong, Kairun Wen, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang

This pre-processing is usually conducted via a Structure-from-Motion (SfM) pipeline, a procedure that can be slow and unreliable, particularly in sparse-view scenarios with insufficient matched features for accurate reconstruction.

Novel View Synthesis SSIM

Reconstructing Hands in 3D with Transformers

no code implementations • 8 Dec 2023 • Georgios Pavlakos, Dandan Shan, Ilija Radosavovic, Angjoo Kanazawa, David Fouhey, Jitendra Malik

The key to HaMeR's success lies in scaling up both the data used for training and the capacity of the deep network for hand reconstruction.

GART: Gaussian Articulated Template Models

no code implementations • 27 Nov 2023 • Jiahui Lei, Yufu Wang, Georgios Pavlakos, Lingjie Liu, Kostas Daniilidis

We introduce the Gaussian Articulated Template Model (GART), an explicit, efficient, and expressive representation for capturing and rendering non-rigid articulated subjects from monocular videos.

Generative Proxemics: A Prior for 3D Social Interaction from Images

1 code implementation • 15 Jun 2023 • Lea Müller, Vickie Ye, Georgios Pavlakos, Michael Black, Angjoo Kanazawa

To address this, we present a novel approach that learns a prior over the 3D proxemics of two people in close social interaction and demonstrate its use for single-view 3D reconstruction.

3D Reconstruction Denoising +1

127

Humans in 4D: Reconstructing and Tracking Humans with Transformers

1 code implementation • ICCV 2023 • Shubham Goel, Georgios Pavlakos, Jathushan Rajasegaran, Angjoo Kanazawa, Jitendra Malik

To analyze video, we use 3D reconstructions from HMR 2.0 as input to a tracking system that operates in 3D.

Ranked #3 on Pose Tracking on PoseTrack2018

3D Human Pose Estimation Action Recognition +2

1,070

Learning Articulated Shape with Keypoint Pseudo-labels from Web Images

no code implementations • CVPR 2023 • Anastasis Stathopoulos, Georgios Pavlakos, Ligong Han, Dimitris Metaxas

It is based on two key insights: (1) 2D keypoint estimation networks trained on as few as 50-150 images of a given object category generalize well and generate reliable pseudo-labels; (2) a data selection mechanism can automatically create a "curated" subset of the unlabeled web images that can be used for training -- we evaluate four data selection methods.

3D Reconstruction Keypoint Estimation +1

On the Benefits of 3D Pose and Tracking for Human Action Recognition

1 code implementation • CVPR 2023 • Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Christoph Feichtenhofer, Jitendra Malik

Subsequently, we propose a Lagrangian Action Recognition model by fusing 3D pose and contextualized appearance over tracklets.

Ranked #1 on Action Recognition on AVA v2.2 (using extra training data)

Action Recognition Temporal Action Localization

231

Decoupling Human and Camera Motion from Videos in the Wild

1 code implementation • CVPR 2023 • Vickie Ye, Georgios Pavlakos, Jitendra Malik, Angjoo Kanazawa

Our method robustly recovers the global 3D trajectories of people in challenging in-the-wild videos, such as PoseTrack.

434

The One Where They Reconstructed 3D Humans and Environments in TV Shows

no code implementations • 28 Jul 2022 • Georgios Pavlakos, Ethan Weber, Matthew Tancik, Angjoo Kanazawa

TV shows depict a wide variety of human behaviors and have been studied extensively for their potential to be a rich source of data for many applications.

3D Reconstruction Gaze Estimation

Semantic keypoint-based pose estimation from single RGB frames

1 code implementation • 12 Apr 2022 • Karl Schmeckpeper, Philip R. Osteen, Yufu Wang, Georgios Pavlakos, Kenneth Chaney, Wyatt Jordan, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis

Empirically, we show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios even against a cluttered background.

Object Pose Estimation

Tracking People by Predicting 3D Appearance, Location and Pose

no code implementations • CVPR 2022 • Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik

For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner.

Tracking People by Predicting 3D Appearance, Location & Pose

no code implementations • 8 Dec 2021 • Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik

For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner.

Tracking People with 3D Representations

1 code implementation • NeurIPS 2021 • Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik

We find that 3D representations are more effective than 2D representations for tracking in these settings, and we obtain state-of-the-art performance.

Probabilistic Modeling for Human Mesh Recovery

1 code implementation • ICCV 2021 • Nikos Kolotouros, Georgios Pavlakos, Dinesh Jayaraman, Kostas Daniilidis

This paper focuses on the problem of 3D human reconstruction from 2D evidence.

Ranked #70 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation 3D Human Reconstruction +1

254

Human Mesh Recovery from Multiple Shots

1 code implementation • CVPR 2022 • Georgios Pavlakos, Jitendra Malik, Angjoo Kanazawa

The tools we develop open the door to processing and analyzing, in 3D, content from a large library of edited media, which could be helpful for many downstream applications.

3D Reconstruction Human Mesh Recovery

Independent Sign Language Recognition with 3D Body, Hands, and Face Reconstruction

no code implementations • 24 Nov 2020 • Agelos Kratimenos, Georgios Pavlakos, Petros Maragos

Independent Sign Language Recognition is a complex visual recognition problem that combines several challenging tasks of Computer Vision due to the necessity to exploit and fuse information from hand gestures, body features and facial expressions.

3D Action Recognition 3D Reconstruction +3

Monocular Expressive Body Regression through Body-Driven Attention

1 code implementation • ECCV 2020 • Vasileios Choutas, Georgios Pavlakos, Timo Bolkart, Dimitrios Tzionas, Michael J. Black

To understand how people look, interact, or perform tasks, we need to quickly and accurately capture their 3D body, face, and hands together from an RGB image.

Ranked #1 on 3D Human Reconstruction on Expressive hands and faces dataset (EHF).

3D Face Reconstruction 3D Hand Pose Estimation +4

596

Coherent Reconstruction of Multiple Humans from a Single Image

1 code implementation • CVPR 2020 • Wen Jiang, Nikos Kolotouros, Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis

Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.

Ranked #2 on 3D Human Reconstruction on AGORA

3D Depth Estimation 3D Human Reconstruction +4

364

Reactive Navigation in Partially Familiar Planar Environments Using Semantic Perceptual Feedback

2 code implementations • 20 Feb 2020 • Vasileios Vasilopoulos, Georgios Pavlakos, Karl Schmeckpeper, Kostas Daniilidis, Daniel E. Koditschek

This paper solves the planar navigation problem by recourse to an online reactive scheme that exploits recent advances in SLAM and visual object recognition to recast prior geometric knowledge in terms of an offline catalogue of familiar objects.

Robotics

TexturePose: Supervising Human Mesh Estimation with Texture Consistency

1 code implementation • ICCV 2019 • Georgios Pavlakos, Nikos Kolotouros, Kostas Daniilidis

Assuming that the texture of the person does not change dramatically between frames, we can apply a novel texture consistency loss, which enforces that each point in the texture map has the same texture value across all frames.

Ranked #27 on Weakly-supervised 3D Human Pose Estimation on Human3.6M

Pose Estimation Weakly-supervised 3D Human Pose Estimation

134

Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop

1 code implementation • ICCV 2019 • Nikos Kolotouros, Georgios Pavlakos, Michael J. Black, Kostas Daniilidis

Our approach is self-improving by nature, since better network estimates can lead the optimization to better solutions, while more accurate optimization fits provide better supervision for the network.

Ranked #3 on 3D Human Pose Estimation on 3D Poses in the Wild Challenge

3D Human Shape Estimation 3D Multi-Person Pose Estimation

788

Convolutional Mesh Regression for Single-Image Human Shape Reconstruction

2 code implementations • CVPR 2019 • Nikos Kolotouros, Georgios Pavlakos, Kostas Daniilidis

Image-based features are attached to the mesh vertices and the Graph-CNN is responsible for processing them on the mesh structure, while the regression target for each vertex is its 3D location.

Ranked #34 on Monocular 3D Human Pose Estimation on Human3.6M

3D Hand Pose Estimation 3D human pose and shape estimation +2

421

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

1 code implementation • CVPR 2019 • Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, Michael J. Black

We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild.

Ranked #1 on 3D Human Reconstruction on Expressive hands and faces dataset (EHF) (TR V2V (mm), left hand metric)

3D Human Pose Estimation 3D Human Reconstruction +2

1,649

Ordinal Depth Supervision for 3D Human Pose Estimation

1 code implementation • CVPR 2018 • Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis

This information can be acquired by human annotators for a wide range of images and poses.

Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (Use Video Sequence metric)

Monocular 3D Human Pose Estimation

111

Learning to Estimate 3D Human Pose and Shape from a Single Color Image

no code implementations • CVPR 2018 • Georgios Pavlakos, Luyang Zhu, Xiaowei Zhou, Kostas Daniilidis

The proposed approach outperforms previous baselines on this task and offers an attractive solution for direct prediction of 3D shape from a single color image.

Ranked #120 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation

Human Motion Capture Using a Drone

1 code implementation • 17 Apr 2018 • Xiaowei Zhou, Sikang Liu, Georgios Pavlakos, Vijay Kumar, Kostas Daniilidis

Current motion capture (MoCap) systems generally require markers and multiple calibrated cameras, which can be used only in constrained environments.

Harvesting Multiple Views for Marker-less 3D Human Pose Annotations

no code implementations • CVPR 2017 • Georgios Pavlakos, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis

In this paper, we present a geometry-driven approach to automatically collect annotations for human pose prediction tasks.

Ranked #28 on Weakly-supervised 3D Human Pose Estimation on Human3.6M

Pose Prediction Weakly-supervised 3D Human Pose Estimation

6-DoF Object Pose from Semantic Keypoints

1 code implementation • 14 Mar 2017 • Georgios Pavlakos, Xiaowei Zhou, Aaron Chan, Konstantinos G. Derpanis, Kostas Daniilidis

This paper presents a novel approach to estimating the continuous six degree of freedom (6-DoF) pose (3D translation and rotation) of an object from a single RGB image.

Ranked #1 on Keypoint Detection on Pascal3D+