Search Results for author: Georgios Pavlakos

Found 44 papers, 22 papers with code

Reconstructing Humans with a Biomechanically Accurate Skeleton

no code implementations 27 Mar 2025 Yan Xia, Xiaowei Zhou, Etienne Vouga, QiXing Huang, Georgios Pavlakos

In this paper, we introduce a method for reconstructing 3D humans from a single image using a biomechanically accurate skeleton model.

Human Mesh Recovery Pose Estimation

FIction: 4D Future Interaction Prediction from Video

no code implementations 1 Dec 2024 Kumar Ashutosh, Georgios Pavlakos, Kristen Grauman

Anticipating how a person will interact with objects in an environment is essential for activity understanding, but existing methods are limited to the 2D space of video frames, capturing physically ungrounded predictions of 'what' and ignoring the 'where' and 'how'.

Prediction

OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation

no code implementations 15 Oct 2024 Jinhan Li, Yifeng Zhu, Yuqi Xie, Zhenyu Jiang, Mingyo Seo, Georgios Pavlakos, Yuke Zhu

We study the problem of teaching humanoid robots manipulation skills by imitating from single video demonstrations.

Synergy and Synchrony in Couple Dances

no code implementations 6 Sep 2024 Vongani Maluleke, Lea Müller, Jathushan Rajasegaran, Georgios Pavlakos, Shiry Ginosar, Angjoo Kanazawa, Jitendra Malik

Our contributions are a demonstration of the advantages of socially conditioned future motion prediction and an in-the-wild, couple dance video dataset to enable future research in this direction.

motion prediction Prediction

Atlas Gaussians Diffusion for 3D Generation

no code implementations 23 Aug 2024 Haitao Yang, Yuan Dong, Hanwen Jiang, Dejia Xu, Georgios Pavlakos, QiXing Huang

Using the latent diffusion model has proven effective in developing novel 3D generation techniques.

3D Generation

ExpertAF: Expert Actionable Feedback from Video

no code implementations 1 Aug 2024 Kumar Ashutosh, Tushar Nagarajan, Georgios Pavlakos, Kris Kitani, Kristen Grauman

Our method takes a video demonstration and its accompanying 3D body pose and generates (1) free-form expert commentary describing what the person is doing well and what they could improve, and (2) a visual expert demonstration that incorporates the required corrections.

Language Modeling Language Modelling +1

Expressive Gaussian Human Avatars from Monocular RGB Video

no code implementations 3 Jul 2024 Hezhen Hu, Zhiwen Fan, Tianhao Wu, Yihan Xi, Seoyoung Lee, Georgios Pavlakos, Zhangyang Wang

Nuanced expressiveness, particularly through fine-grained hand and facial expressions, is pivotal for enhancing the realism and vitality of digital human representations.

Real3D: Scaling Up Large Reconstruction Models with Real-World Images

no code implementations 12 Jun 2024 Hanwen Jiang, QiXing Huang, Georgios Pavlakos

Real3D introduces a novel self-training framework that can benefit from both the existing synthetic data and diverse single-view real images.

CoFie: Learning Compact Neural Surface Representations with Coordinate Fields

no code implementations 5 Jun 2024 Hanwen Jiang, Haitao Yang, Georgios Pavlakos, QiXing Huang

When using the same number of parameters as prior works, CoFie reduces the shape error by 48% and 56% on novel instances of training and unseen shape categories, respectively.

Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks

no code implementations 29 May 2024 Simranjit Singh, Georgios Pavlakos, Dimitrios Stamoulis

As interest in "reformulating" the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets.

Question Answering Visual Question Answering

Reconstructing Hand-Held Objects in 3D from Images and Videos

no code implementations 9 Apr 2024 Jane Wu, Georgios Pavlakos, Georgia Gkioxari, Jitendra Malik

In order to obtain the best performing single frame model, we first present MCC-Hand-Object (MCC-HO), which jointly reconstructs hand and object geometry given a single RGB image and inferred 3D hand as inputs.

Object Object Reconstruction +1

InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds

1 code implementation 29 Mar 2024 Zhiwen Fan, Kairun Wen, Wenyan Cong, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang

InstantSplat adopts a self-supervised framework that bridges the gap between 2D images and 3D representations using Gaussian Bundle Adjustment (GauBA) and can be optimized in an end-to-end manner.

3D Reconstruction Novel View Synthesis +1

GART: Gaussian Articulated Template Models

no code implementations CVPR 2024 Jiahui Lei, Yufu Wang, Georgios Pavlakos, Lingjie Liu, Kostas Daniilidis

We introduce the Gaussian Articulated Template Model (GART), an explicit, efficient, and expressive representation for capturing and rendering non-rigid articulated subjects from monocular videos.

Generative Proxemics: A Prior for 3D Social Interaction from Images

1 code implementation CVPR 2024 Lea Müller, Vickie Ye, Georgios Pavlakos, Michael Black, Angjoo Kanazawa

To address this, we present a novel approach that learns a prior over the 3D proxemics of two people in close social interaction and demonstrate its use for single-view 3D reconstruction.

3D Reconstruction Denoising +1

Learning Articulated Shape with Keypoint Pseudo-labels from Web Images

no code implementations CVPR 2023 Anastasis Stathopoulos, Georgios Pavlakos, Ligong Han, Dimitris Metaxas

It is based on two key insights: (1) 2D keypoint estimation networks trained on as few as 50-150 images of a given object category generalize well and generate reliable pseudo-labels; (2) a data selection mechanism can automatically create a "curated" subset of the unlabeled web images that can be used for training -- we evaluate four data selection methods.

3D Reconstruction Keypoint Estimation +1

Decoupling Human and Camera Motion from Videos in the Wild

1 code implementation CVPR 2023 Vickie Ye, Georgios Pavlakos, Jitendra Malik, Angjoo Kanazawa

Our method robustly recovers the global 3D trajectories of people in challenging in-the-wild videos, such as PoseTrack.

The One Where They Reconstructed 3D Humans and Environments in TV Shows

no code implementations 28 Jul 2022 Georgios Pavlakos, Ethan Weber, Matthew Tancik, Angjoo Kanazawa

TV shows depict a wide variety of human behaviors and have been studied extensively for their potential to be a rich source of data for many applications.

3D Reconstruction Gaze Estimation

Semantic keypoint-based pose estimation from single RGB frames

1 code implementation 12 Apr 2022 Karl Schmeckpeper, Philip R. Osteen, Yufu Wang, Georgios Pavlakos, Kenneth Chaney, Wyatt Jordan, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis

Empirically, we show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios even against a cluttered background.

Object Pose Estimation

Tracking People by Predicting 3D Appearance, Location and Pose

no code implementations CVPR 2022 Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik

For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner.

Tracking People by Predicting 3D Appearance, Location & Pose

no code implementations 8 Dec 2021 Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik

For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner.

Tracking People with 3D Representations

1 code implementation NeurIPS 2021 Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik

We find that 3D representations are more effective than 2D representations for tracking in these settings, and we obtain state-of-the-art performance.

3D geometry

Human Mesh Recovery from Multiple Shots

1 code implementation CVPR 2022 Georgios Pavlakos, Jitendra Malik, Angjoo Kanazawa

The tools we develop open the door to processing and analyzing, in 3D, content from a large library of edited media, which could be helpful for many downstream applications.

3D Reconstruction Human Mesh Recovery

Independent Sign Language Recognition with 3D Body, Hands, and Face Reconstruction

no code implementations 24 Nov 2020 Agelos Kratimenos, Georgios Pavlakos, Petros Maragos

Independent Sign Language Recognition is a complex visual recognition problem that combines several challenging tasks of Computer Vision due to the necessity to exploit and fuse information from hand gestures, body features and facial expressions.

3D Action Recognition 3D Reconstruction +3

Coherent Reconstruction of Multiple Humans from a Single Image

1 code implementation CVPR 2020 Wen Jiang, Nikos Kolotouros, Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis

Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.

3D Depth Estimation 3D Human Reconstruction +4

Reactive Navigation in Partially Familiar Planar Environments Using Semantic Perceptual Feedback

2 code implementations 20 Feb 2020 Vasileios Vasilopoulos, Georgios Pavlakos, Karl Schmeckpeper, Kostas Daniilidis, Daniel E. Koditschek

This paper solves the planar navigation problem by recourse to an online reactive scheme that exploits recent advances in SLAM and visual object recognition to recast prior geometric knowledge in terms of an offline catalogue of familiar objects.

Robotics

TexturePose: Supervising Human Mesh Estimation with Texture Consistency

1 code implementation ICCV 2019 Georgios Pavlakos, Nikos Kolotouros, Kostas Daniilidis

Assuming that the texture of the person does not change dramatically between frames, we can apply a novel texture consistency loss, which enforces that each point in the texture map has the same texture value across all frames.

Pose Estimation Weakly-supervised 3D Human Pose Estimation

Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop

1 code implementation ICCV 2019 Nikos Kolotouros, Georgios Pavlakos, Michael J. Black, Kostas Daniilidis

Our approach is self-improving by nature, since better network estimates can lead the optimization to better solutions, while more accurate optimization fits provide better supervision for the network.

3D Human Shape Estimation 3D Multi-Person Pose Estimation

Convolutional Mesh Regression for Single-Image Human Shape Reconstruction

2 code implementations CVPR 2019 Nikos Kolotouros, Georgios Pavlakos, Kostas Daniilidis

Image-based features are attached to the mesh vertices and the Graph-CNN is responsible for processing them on the mesh structure, while the regression target for each vertex is its 3D location.

3D geometry 3D Hand Pose Estimation +3

Learning to Estimate 3D Human Pose and Shape from a Single Color Image

no code implementations CVPR 2018 Georgios Pavlakos, Luyang Zhu, Xiaowei Zhou, Kostas Daniilidis

The proposed approach outperforms previous baselines on this task and offers an attractive solution for direct prediction of 3D shape from a single color image.

Ranked #127 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation

Human Motion Capture Using a Drone

1 code implementation 17 Apr 2018 Xiaowei Zhou, Sikang Liu, Georgios Pavlakos, Vijay Kumar, Kostas Daniilidis

Current motion capture (MoCap) systems generally require markers and multiple calibrated cameras, which can be used only in constrained environments.

6-DoF Object Pose from Semantic Keypoints

1 code implementation 14 Mar 2017 Georgios Pavlakos, Xiaowei Zhou, Aaron Chan, Konstantinos G. Derpanis, Kostas Daniilidis

This paper presents a novel approach to estimating the continuous six-degree-of-freedom (6-DoF) pose (3D translation and rotation) of an object from a single RGB image.

Keypoint Detection Object +1
