Search Results for author: Michael J. Black

Found 171 papers, 87 papers with code

HUMOS: Human Motion Model Conditioned on Body Shape

no code implementations 5 Sep 2024 Shashank Tripathi, Omid Taheri, Christoph Lassner, Michael J. Black, Daniel Holden, Carsten Stoll

Generating realistic human motion is essential for many computer vision and graphics applications.

Diversity

Can Large Language Models Understand Symbolic Graphics Programs?

no code implementations 15 Aug 2024 Zeju Qiu, Weiyang Liu, Haiwen Feng, Zhen Liu, Tim Z. Xiao, Katherine M. Collins, Joshua B. Tenenbaum, Adrian Weller, Michael J. Black, Bernhard Schölkopf

While LLMs exhibit impressive skills in general program synthesis and analysis, symbolic graphics programs offer a new layer of evaluation: they allow us to test an LLM's ability to answer different-grained semantic-level questions of the images or 3D geometries without a vision encoder.

Instruction Following Program Synthesis

MotionFix: Text-Driven 3D Human Motion Editing

no code implementations 1 Aug 2024 Nikos Athanasiou, Alpár Cseke, Markos Diomataris, Michael J. Black, Gül Varol

Access to this data allows us to train a conditional diffusion model, TMED, that takes both the source motion and the edit text as input.

Motion Generation

RILe: Reinforced Imitation Learning

no code implementations 12 Jun 2024 Mert Albaba, Sammy Christen, Thomas Langarek, Christoph Gebhardt, Otmar Hilliges, Michael J. Black

The trainer optimizes for long-term cumulative rewards from the discriminator, enabling it to provide nuanced feedback that accounts for the complexity of the task and the student's current capabilities.

Computational Efficiency Imitation Learning +2

PuzzleAvatar: Assembling 3D Avatars from Personal Albums

1 code implementation 23 May 2024 Yuliang Xiu, Yufei Ye, Zhen Liu, Dimitrios Tzionas, Michael J. Black

We address this novel "Album2Human" task by developing PuzzleAvatar, a novel model that generates a faithful 3D avatar (in a canonical pose) from a personal OOTD album, while bypassing the challenging estimation of body and camera pose.

Language Modelling Text to 3D

ContourCraft: Learning to Resolve Intersections in Neural Multi-Garment Simulations

no code implementations 15 May 2024 Artur Grigorev, Giorgio Becherini, Michael J. Black, Otmar Hilliges, Bernhard Thomaszewski

In this work, we present ContourCraft, a learning-based solution for handling intersections in neural cloth simulations.

ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning

no code implementations 7 May 2024 Jing Lin, Yao Feng, Weiyang Liu, Michael J. Black

The novel features of ChatHuman include leveraging academic publications to guide the application of 3D human-related tools, employing a retrieval-augmented generation model to generate in-context-learning examples for handling new tools, and discriminating and integrating tool results to enhance 3D human understanding.

Human-Object Interaction Detection In-Context Learning +3

Re-Thinking Inverse Graphics With Large Language Models

no code implementations 23 Apr 2024 Peter Kulits, Haiwen Feng, Weiyang Liu, Victoria Abrevaya, Michael J. Black

Inverse graphics -- the task of inverting an image into physical variables that, when rendered, enable reproduction of the observed scene -- is a fundamental challenge in computer vision and graphics.

Language Modelling Large Language Model +1

WANDR: Intention-guided Human Motion Generation

no code implementations CVPR 2024 Markos Diomataris, Nikos Athanasiou, Omid Taheri, Xi Wang, Otmar Hilliges, Michael J. Black

To address this, we introduce WANDR, a data-driven model that takes an avatar's initial pose and a goal's 3D position and generates natural human motions that place the end effector (wrist) on the goal location.

Motion Generation

Generating Human Interaction Motions in Scenes with Text Control

no code implementations 16 Apr 2024 Hongwei Yi, Justus Thies, Michael J. Black, Xue Bin Peng, Davis Rempe

Our approach begins with pre-training a scene-agnostic text-to-motion diffusion model, emphasizing goal-reaching constraints on large-scale motion-capture datasets.

Denoising Human-Object Interaction Detection +2

AWOL: Analysis WithOut synthesis using Language

no code implementations 3 Apr 2024 Silvia Zuffi, Michael J. Black

This involves learning a mapping between the latent space of a vision-language model and the parameter space of the 3D model, which we do using a small set of shape and text pairs.

Language Modelling

Explorative Inbetweening of Time and Space

no code implementations 21 Mar 2024 Haiwen Feng, Zheng Ding, Zhihao Xia, Simon Niklaus, Victoria Abrevaya, Michael J. Black, Xuaner Zhang

We introduce bounded generation as a generalized task to control video generation to synthesize arbitrary camera and subject motion based only on a given start and end frame.

Denoising Video Generation

HMP: Hand Motion Priors for Pose and Shape Estimation from Video

no code implementations 27 Dec 2023 Enes Duran, Muhammed Kocabas, Vasileios Choutas, Zicong Fan, Michael J. Black

Therefore, we develop a generative motion prior specific to hands, trained on the AMASS dataset, which features diverse and high-quality hand motions.

3D Hand Pose Estimation Motion Estimation

Synthesizing Environment-Specific People in Photographs

no code implementations 22 Dec 2023 Mirela Ostrek, Carol O'Sullivan, Michael J. Black, Justus Thies

We present ESP, a novel method for context-aware full-body generation, that enables photo-realistic synthesis and inpainting of people wearing clothing that is semantically appropriate for the scene depicted in an input photograph.

Human Parsing Image Generation

WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion

1 code implementation CVPR 2024 Soyong Shin, Juyong Kim, Eni Halilaj, Michael J. Black

We address these limitations with WHAM (World-grounded Humans with Accurate Motion), which accurately and efficiently reconstructs 3D human motion in a global coordinate system from video.

3D Human Pose Estimation

Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion

1 code implementation CVPR 2024 Kiran Chhatre, Radek Daněček, Nikos Athanasiou, Giorgio Becherini, Christopher Peters, Michael J. Black, Timo Bolkart

Once trained, AMUSE synthesizes 3D human gestures directly from speech with control over the expressed emotions and style by combining the content from the driving speech with the emotion and style of another speech sequence.

ChatPose: Chatting about 3D Human Pose

no code implementations CVPR 2024 Yao Feng, Jing Lin, Sai Kumar Dwivedi, Yu Sun, Priyanka Patel, Michael J. Black

Additionally, ChatPose empowers LLMs to apply their extensive world knowledge in reasoning about human poses, leading to two advanced tasks: speculative pose generation and reasoning about pose estimation.

Pose Estimation Pose Prediction +1

HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video

1 code implementation CVPR 2024 Zicong Fan, Maria Parelli, Maria Eleni Kadoglou, Muhammed Kocabas, Xu Chen, Michael J. Black, Otmar Hilliges

Since humans interact with diverse objects every day, the holistic 3D capture of these interactions is important to understand and model human behaviour.

3D Reconstruction Object +1

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

1 code implementation 10 Nov 2023 Weiyang Liu, Zeju Qiu, Yao Feng, Yuliang Xiu, Yuxuan Xue, Longhui Yu, Haiwen Feng, Zhen Liu, Juyeon Heo, Songyou Peng, Yandong Wen, Michael J. Black, Adrian Weller, Bernhard Schölkopf

We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT).

FLARE: Fast Learning of Animatable and Relightable Mesh Avatars

1 code implementation 26 Oct 2023 Shrisha Bharadwaj, Yufeng Zheng, Otmar Hilliges, Michael J. Black, Victoria Fernandez-Abrevaya

Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems.

Ghost on the Shell: An Expressive Representation of General 3D Shapes

no code implementations 23 Oct 2023 Zhen Liu, Yao Feng, Yuliang Xiu, Weiyang Liu, Liam Paull, Michael J. Black, Bernhard Schölkopf

Recent work has focused on the former, and methods for reconstructing open surfaces do not support fast reconstruction with material and lighting or unconditional generative modelling.

PACE: Human and Camera Motion Estimation from in-the-wild Videos

no code implementations 20 Oct 2023 Muhammed Kocabas, Ye Yuan, Pavlo Molchanov, Yunrong Guo, Michael J. Black, Otmar Hilliges, Jan Kautz, Umar Iqbal

This design combines the strengths of SLAM and motion priors, which leads to significant improvements in human and camera motion estimation.

Motion Estimation

Text-Guided Generation and Editing of Compositional 3D Avatars

no code implementations 13 Sep 2023 Hao Zhang, Yao Feng, Peter Kulits, Yandong Wen, Justus Thies, Michael J. Black

We argue that existing methods are limited because they employ a monolithic modeling approach, using a single representation for the head, face, hair, and accessories.

text-guided-generation Virtual Try-on

Learning Disentangled Avatars with Hybrid 3D Representations

no code implementations 12 Sep 2023 Yao Feng, Weiyang Liu, Timo Bolkart, Jinlong Yang, Marc Pollefeys, Michael J. Black

Towards this end, both explicit and implicit 3D representations are heavily studied for holistic modeling and capture of the whole human (e.g., body, clothing, face and hair), but neither representation is an optimal choice in terms of representation efficacy since different parts of the human avatar have different modeling desiderata.

Disentanglement

POCO: 3D Pose and Shape Estimation with Confidence

1 code implementation 24 Aug 2023 Sai Kumar Dwivedi, Cordelia Schmid, Hongwei Yi, Michael J. Black, Dimitrios Tzionas

To address this, we develop POCO, a novel framework for training HPS regressors to estimate not only a 3D human body, but also their confidence, in a single feed-forward pass.

Action Recognition Pose Estimation +1

GRIP: Generating Interaction Poses Using Spatial Cues and Latent Consistency

no code implementations 22 Aug 2023 Omid Taheri, Yi Zhou, Dimitrios Tzionas, Yang Zhou, Duygu Ceylan, Soren Pirk, Michael J. Black

In contrast, we introduce GRIP, a learning-based method that takes, as input, the 3D motion of the body and the object, and synthesizes realistic motion for both hands before, during, and after object interaction.

Mixed Reality Object

TADA! Text to Animatable Digital Avatars

no code implementations 21 Aug 2023 Tingting Liao, Hongwei Yi, Yuliang Xiu, Jiaxiang Tang, Yangyi Huang, Justus Thies, Michael J. Black

We introduce TADA, a simple-yet-effective approach that takes textual descriptions and produces expressive 3D avatars with high-quality geometry and lifelike textures, that can be animated and rendered with traditional graphics pipelines.

Adversarial Likelihood Estimation With One-Way Flows

no code implementations 19 Jul 2023 Omri Ben-Dov, Pravir Singh Gupta, Victoria Abrevaya, Michael J. Black, Partha Ghosh

Generative Adversarial Networks (GANs) can produce high-quality samples, but do not provide an estimate of the probability density around the samples.

BEDLAM: A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion

2 code implementations CVPR 2023 Michael J. Black, Priyanka Patel, Joachim Tesch, Jinlong Yang

BEDLAM is useful for a variety of tasks and all images, ground truth bodies, 3D clothing, support code, and more are available for research purposes.

Synthetic Data Generation

Emotional Speech-Driven Animation with Content-Emotion Disentanglement

no code implementations 15 Jun 2023 Radek Daněček, Kiran Chhatre, Shashank Tripathi, Yandong Wen, Michael J. Black, Timo Bolkart

While the best recent methods generate 3D animations that are synchronized with the input audio, they largely ignore the impact of emotions on facial expressions.

Disentanglement Lip Reading

Instant Multi-View Head Capture through Learnable Registration

1 code implementation CVPR 2023 Timo Bolkart, Tianye Li, Michael J. Black

We use raw MVS scans as supervision during training, but, once trained, TEMPEH directly predicts 3D heads in dense correspondence without requiring scans.

3D Face Alignment 3D Face Reconstruction +3

TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments

3 code implementations CVPR 2023 Yu Sun, Qian Bao, Wu Liu, Tao Mei, Michael J. Black

Although the estimation of 3D human pose and shape (HPS) is rapidly progressing, current methods still cannot reliably estimate moving humans in global coordinates, which is critical for many applications.

3D Human Pose Estimation regression

AG3D: Learning to Generate 3D Avatars from 2D Image Collections

no code implementations ICCV 2023 Zijian Dong, Xu Chen, Jinlong Yang, Michael J. Black, Otmar Hilliges, Andreas Geiger

The key to progress is hence to learn generative models of 3D avatars from abundant unstructured 2D image collections.

TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis

1 code implementation ICCV 2023 Mathis Petrovich, Michael J. Black, Gül Varol

We show that maintaining the motion generation loss, along with the contrastive training, is crucial to obtain good performance.

Moment Retrieval Motion Generation +4

Generalizing Neural Human Fitting to Unseen Poses With Articulated SE(3) Equivariance

no code implementations ICCV 2023 Haiwen Feng, Peter Kulits, Shichen Liu, Michael J. Black, Victoria Abrevaya

Learning-based methods address this but do not generalize well when the input pose is far from those seen during training.

SINC: Spatial Composition of 3D Human Motions for Simultaneous Action Generation

no code implementations ICCV 2023 Nikos Athanasiou, Mathis Petrovich, Michael J. Black, Gül Varol

Motivated by the observation that the correspondence between actions and body parts is encoded in powerful language models, we extract this knowledge by prompting GPT-3 with text such as "what are the body parts involved in the action <action name>?"

Action Generation Motion Generation

Reconstructing Signing Avatars From Video Using Linguistic Priors

no code implementations CVPR 2023 Maria-Paola Forte, Peter Kulits, Chun-Hao Huang, Vasileios Choutas, Dimitrios Tzionas, Katherine J. Kuchenbecker, Michael J. Black

A perceptual study shows that SGNify's 3D reconstructions are significantly more comprehensible and natural than those of previous methods and are on par with the source videos.

3D Human Pose Estimation via Intuitive Physics

no code implementations CVPR 2023 Shashank Tripathi, Lea Müller, Chun-Hao P. Huang, Omid Taheri, Michael J. Black, Dimitrios Tzionas

Inspired by biomechanics, we infer the pressure heatmap on the body, the Center of Pressure (CoP) from the heatmap, and the SMPL body's Center of Mass (CoM).

3D Human Pose Estimation

MeshDiffusion: Score-based Generative 3D Mesh Modeling

1 code implementation 14 Mar 2023 Zhen Liu, Yao Feng, Michael J. Black, Derek Nowrouzezahrai, Liam Paull, Weiyang Liu

We consider the task of generating realistic 3D shapes, which is useful for a variety of applications such as automatic scene generation and physical simulation.

Scene Generation

Detecting Human-Object Contact in Images

1 code implementation CVPR 2023 Yixin Chen, Sai Kumar Dwivedi, Michael J. Black, Dimitrios Tzionas

To build HOT, we use two data sources: (1) We use the PROX dataset of 3D human meshes moving in 3D scenes, and automatically annotate 2D image areas for contact via 3D mesh proximity and projection.

Object

PointAvatar: Deformable Point-based Head Avatars from Videos

1 code implementation CVPR 2023 Yufeng Zheng, Wang Yifan, Gordon Wetzstein, Michael J. Black, Otmar Hilliges

The ability to create realistic, animatable and relightable head avatars from casual video sequences would open up wide ranging applications in communication and entertainment.

HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics

2 code implementations CVPR 2023 Artur Grigorev, Bernhard Thomaszewski, Michael J. Black, Otmar Hilliges

We propose a method that leverages graph neural networks, multi-level message passing, and unsupervised training to enable real-time prediction of realistic clothing dynamics.

Physical Simulations

ECON: Explicit Clothed humans Optimized via Normal integration

1 code implementation CVPR 2023 Yuliang Xiu, Jinlong Yang, Xu Cao, Dimitrios Tzionas, Michael J. Black

To increase robustness for these cases, existing work uses an explicit parametric body model to constrain surface reconstruction, but this limits the recovery of free-form surfaces such as loose clothing that deviates from the body.

3D Human Reconstruction Surface Reconstruction

MIME: Human-Aware 3D Scene Generation

no code implementations CVPR 2023 Hongwei Yi, Chun-Hao P. Huang, Shashank Tripathi, Lea Hering, Justus Thies, Michael J. Black

We propose MIME (Mining Interaction and Movement to infer 3D Environments), which is a generative model of indoor scenes that produces furniture layouts that are consistent with the human movement.

2D Semantic Segmentation task 1 (8 classes) 3D Semantic Scene Completion +2

Fast-SNARF: A Fast Deformer for Articulated Neural Fields

1 code implementation 28 Nov 2022 Xu Chen, Tianjian Jiang, Jie Song, Max Rietmann, Andreas Geiger, Michael J. Black, Otmar Hilliges

A key challenge in making such methods applicable to articulated objects, such as the human body, is to model the deformation of 3D locations between the rest pose (a canonical space) and the deformed space.

3D Reconstruction Computational Efficiency +1

SUPR: A Sparse Unified Part-Based Human Representation

1 code implementation 25 Oct 2022 Ahmed A. A. Osman, Timo Bolkart, Dimitrios Tzionas, Michael J. Black

Using novel 4D scans of feet, we train a model with an extended kinematic tree that captures the range of motion of the toes.

Capturing and Animation of Body and Clothing from Monocular Video

1 code implementation 4 Oct 2022 Yao Feng, Jinlong Yang, Marc Pollefeys, Michael J. Black, Timo Bolkart

Building on this insight, we propose SCARF (Segmented Clothed Avatar Radiance Field), a hybrid model combining a mesh-based body with a neural radiance field.

Virtual Try-on

SmartMocap: Joint Estimation of Human and Camera Motion using Uncalibrated RGB Cameras

1 code implementation 28 Sep 2022 Nitin Saini, Chun-Hao P. Huang, Michael J. Black, Aamir Ahmad

Second, we learn a probability distribution of short human motion sequences (~1 sec) relative to the ground plane and leverage it to disambiguate between the camera and human motion.

InterCap: Joint Markerless 3D Tracking of Humans and Objects in Interaction

no code implementations 26 Sep 2022 Yinghao Huang, Omid Taheri, Michael J. Black, Dimitrios Tzionas

With this method we capture the InterCap dataset, which contains 10 subjects (5 males and 5 females) interacting with 10 objects of various sizes and affordances, including contact with the hands or feet.

Object Pose Estimation

Neural Point-based Shape Modeling of Humans in Challenging Clothing

no code implementations 14 Sep 2022 Qianli Ma, Jinlong Yang, Michael J. Black, Siyu Tang

Specifically, we extend point-based methods with a coarse stage, that replaces canonicalization with a learned pose-independent "coarse shape" that can capture the rough surface geometry of clothing like skirts.

TEACH: Temporal Action Composition for 3D Humans

1 code implementation 9 Sep 2022 Nikos Athanasiou, Mathis Petrovich, Michael J. Black, Gül Varol

In particular, our goal is to enable the synthesis of a series of actions, which we refer to as temporal action composition.

Motion Synthesis Sentence

LED: Latent Variable-based Estimation of Density

no code implementations 23 Jun 2022 Omri Ben-Dov, Pravir Singh Gupta, Victoria Fernandez Abrevaya, Michael J. Black, Partha Ghosh

Modern generative models are roughly divided into two main categories: (1) models that can produce high-quality random samples, but cannot estimate the exact density of new data points and (2) those that provide exact density estimation, at the expense of sample quality and compactness of the latent space.

Density Estimation Diversity

Accurate 3D Body Shape Regression using Metric and Semantic Attributes

1 code implementation CVPR 2022 Vasileios Choutas, Lea Müller, Chun-Hao P. Huang, Siyu Tang, Dimitrios Tzionas, Michael J. Black

Since paired data with images and 3D body shape are rare, we exploit two sources of information: (1) we collect internet images of diverse "fashion" models together with a small set of anthropometric measurements; (2) we collect linguistic shape attributes for a wide range of 3D body meshes and the model images.

3D Human Reconstruction 3D Human Shape Estimation

Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation

no code implementations 8 May 2022 Haiwen Feng, Timo Bolkart, Joachim Tesch, Michael J. Black, Victoria Abrevaya

Our experimental results show significant improvement compared to state-of-the-art methods on albedo estimation, both in terms of accuracy and fairness.

Fairness

ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation

1 code implementation CVPR 2023 Zicong Fan, Omid Taheri, Dimitrios Tzionas, Muhammed Kocabas, Manuel Kaufmann, Michael J. Black, Otmar Hilliges

In part this is because there exist no datasets with ground-truth 3D annotations for the study of physically consistent and synchronised motion of hands and articulated objects.

3D Reconstruction Object

TEMOS: Generating diverse human motions from textual descriptions

1 code implementation 25 Apr 2022 Mathis Petrovich, Michael J. Black, Gül Varol

In contrast to most previous work which focuses on generating a single, deterministic, motion from a textual description, we design a variational approach that can produce multiple diverse human motions.

Motion Synthesis

EMOCA: Emotion Driven Monocular Face Capture and Animation

1 code implementation CVPR 2022 Radek Danecek, Michael J. Black, Timo Bolkart

While EMOCA achieves 3D reconstruction errors that are on par with the current best methods, it significantly outperforms them in terms of the quality of the reconstructed expression and the perceived emotional content.

3D Face Reconstruction 3D geometry +3

OSSO: Obtaining Skeletal Shape from Outside

1 code implementation CVPR 2022 Marilyn Keller, Silvia Zuffi, Michael J. Black, Sergi Pujades

We address the problem of inferring the anatomic skeleton of a person, in an arbitrary pose, from the 3D surface of the body; i.e., we predict the inside (bones) from the outside (skin).

BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information

no code implementations CVPR 2022 Nadine Rueegg, Silvia Zuffi, Konrad Schindler, Michael J. Black

But, even with a better shape model, the problem of regressing dog shape from an image is still challenging because we lack paired images with 3D ground truth.

LocATe: End-to-end Localization of Actions in 3D with Transformers

no code implementations 21 Mar 2022 Jiankai Sun, Bolei Zhou, Michael J. Black, Arjun Chandrasekaran

An important component of this problem is 3D Temporal Action Localization (3D-TAL), which involves recognizing what actions a person is performing, and when.

Action Recognition object-detection +2

Human-Aware Object Placement for Visual Environment Reconstruction

1 code implementation CVPR 2022 Hongwei Yi, Chun-Hao P. Huang, Dimitrios Tzionas, Muhammed Kocabas, Mohamed Hassan, Siyu Tang, Justus Thies, Michael J. Black

In fact, we demonstrate that these human-scene interactions (HSIs) can be leveraged to improve the 3D reconstruction of a scene from a monocular RGB video.

3D Reconstruction Object

AirPose: Multi-View Fusion Network for Aerial 3D Human Pose and Shape Estimation

1 code implementation 20 Jan 2022 Nitin Saini, Elia Bonetto, Eric Price, Aamir Ahmad, Michael J. Black

In this letter, we present a novel markerless 3D human motion capture (MoCap) system for unstructured, outdoor environments that uses a team of autonomous unmanned aerial vehicles (UAVs) with on-board RGB cameras and computation.

3D human pose and shape estimation

gDNA: Towards Generative Detailed Neural Avatars

no code implementations CVPR 2022 Xu Chen, Tianjian Jiang, Jie Song, Jinlong Yang, Michael J. Black, Andreas Geiger, Otmar Hilliges

Furthermore, we show that our method can be used on the task of fitting human models to raw scans, outperforming the previous state-of-the-art.

Diversity

Embodied Hands: Modeling and Capturing Hands and Bodies Together

no code implementations 7 Jan 2022 Javier Romero, Dimitrios Tzionas, Michael J. Black

We attach MANO to a standard parameterized 3D body shape model (SMPL), resulting in a fully articulated body and hand model (SMPL+H).

GOAL: Generating 4D Whole-Body Motion for Hand-Object Grasping

1 code implementation CVPR 2022 Omid Taheri, Vasileios Choutas, Michael J. Black, Dimitrios Tzionas

This is challenging, as it requires the avatar to walk towards the object with foot-ground contact, orient the head towards it, reach out, and grasp it with a realistic hand pose and hand-object contact.

Object

I M Avatar: Implicit Morphable Head Avatars from Videos

1 code implementation CVPR 2022 Yufeng Zheng, Victoria Fernández Abrevaya, Marcel C. Bühler, Xu Chen, Michael J. Black, Otmar Hilliges

Traditional 3D morphable face models (3DMMs) provide fine-grained control over expression but cannot easily capture geometric and appearance details.

MORPH

InvGAN: Invertible GANs

no code implementations 8 Dec 2021 Partha Ghosh, Dominik Zietlow, Michael J. Black, Larry S. Davis, Xiaochen Hu

Our InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.

Data Augmentation Image Inpainting +1

SOMA: Solving Optical Marker-Based MoCap Automatically

2 code implementations ICCV 2021 Nima Ghorbani, Michael J. Black

Commercial auto-labeling tools require a specific calibration procedure at capture time, which is not possible for archival data.

Learning to Regress Bodies from Images using Differentiable Semantic Rendering

1 code implementation ICCV 2021 Sai Kumar Dwivedi, Nikos Athanasiou, Muhammed Kocabas, Michael J. Black

For Minimally-Clothed regions, we define the DSR-MC loss, which encourages a tight match between a rendered SMPL body and the minimally-clothed regions of the image.

Ranked #58 on 3D Human Pose Estimation on 3DPW (using extra training data)

3D human pose and shape estimation

The Power of Points for Modeling Humans in Clothing

no code implementations ICCV 2021 Qianli Ma, Jinlong Yang, Siyu Tang, Michael J. Black

The geometry feature can be optimized to fit a previously unseen scan of a person in clothing, enabling the scan to be reposed realistically.

PARE: Part Attention Regressor for 3D Human Body Estimation

1 code implementation ICCV 2021 Muhammed Kocabas, Chun-Hao P. Huang, Otmar Hilliges, Michael J. Black

Despite significant progress, we show that state-of-the-art 3D human pose and shape estimation methods remain sensitive to partial occlusion and can produce dramatically wrong predictions even though much of the body is observable.

3D human pose and shape estimation 3D Multi-Person Pose Estimation

LEAP: Learning Articulated Occupancy of People

1 code implementation CVPR 2021 Marko Mihajlovic, Yan Zhang, Michael J. Black, Siyu Tang

Substantial progress has been made on modeling rigid 3D objects using deep implicit representations.

Action-Conditioned 3D Human Motion Synthesis with Transformer VAE

2 code implementations ICCV 2021 Mathis Petrovich, Michael J. Black, Gül Varol

By sampling from this latent space and querying a certain duration through a series of positional encodings, we synthesize variable-length motion sequences conditioned on a categorical action.

Action Recognition Denoising +2

On Self-Contact and Human Pose

1 code implementation CVPR 2021 Lea Müller, Ahmed A. A. Osman, Siyu Tang, Chun-Hao P. Huang, Michael J. Black

Third, we develop a novel HPS optimization method, SMPLify-XMC, that includes contact constraints and uses the known 3DCP body pose during fitting to create near ground-truth poses for MTP images.

Ranked #79 on 3D Human Pose Estimation on 3DPW (MPJPE metric)

3D Human Pose Estimation

Populating 3D Scenes by Learning Human-Scene Interaction

1 code implementation CVPR 2021 Mohamed Hassan, Partha Ghosh, Joachim Tesch, Dimitrios Tzionas, Michael J. Black

Second, we show that POSA's learned representation of body-scene interaction supports monocular human pose estimation that is consistent with a 3D scene, improving on the state of the art.

Contact Detection Pose Estimation

We are More than Our Joints: Predicting how 3D Bodies Move

no code implementations CVPR 2021 Yan Zhang, Michael J. Black, Siyu Tang

We note that motion prediction methods accumulate errors over time, resulting in joints or markers that diverge from true human bodies.

Human motion prediction motion prediction +2

Monocular, One-stage, Regression of Multiple 3D People

2 code implementations ICCV 2021 Yu Sun, Qian Bao, Wu Liu, Yili Fu, Michael J. Black, Tao Mei

Through a body-center-guided sampling process, the body mesh parameters of all people in the image are easily extracted from the Mesh Parameter map.

Ranked #1 on 3D Multi-Person Mesh Recovery on Relative Human (using extra training data)

3D Depth Estimation 3D Multi-Person Mesh Recovery +2

GRAB: A Dataset of Whole-Body Human Grasping of Objects

2 code implementations ECCV 2020 Omid Taheri, Nima Ghorbani, Michael J. Black, Dimitrios Tzionas

Training computers to understand, model, and synthesize human grasping requires a rich dataset containing complex 3D object shapes, detailed contact information, hand pose and shape, and the 3D body motion over time.

Grasp Contact Prediction Grasp Generation +2

STAR: Sparse Trained Articulated Human Body Regressor

1 code implementation ECCV 2020 Ahmed A. A. Osman, Timo Bolkart, Michael J. Black

The SMPL body model is widely used for the estimation, synthesis, and analysis of 3D human pose and shape.

SMPLpix: Neural Avatars from 3D Human Models

1 code implementation 16 Aug 2020 Sergey Prokudin, Michael J. Black, Javier Romero

Recent advances in deep generative models have led to an unprecedented level of realism for synthetically generated images of humans.

3D geometry

PLACE: Proximity Learning of Articulation and Contact in 3D Environments

1 code implementation 12 Aug 2020 Siwei Zhang, Yan Zhang, Qianli Ma, Michael J. Black, Siyu Tang

To synthesize realistic human-scene interactions, it is essential to effectively represent the physical contact and proximity between the body and the world.

Perpetual Motion: Generating Unbounded Human Motion

no code implementations 27 Jul 2020 Yan Zhang, Michael J. Black, Siyu Tang

To address this problem, we propose a model to generate non-deterministic, ever-changing, perpetual human motion, in which the global trajectory and the body pose are cross-conditioned.

Motion Estimation Time Series Analysis

AirCapRL: Autonomous Aerial Human Motion Capture using Deep Reinforcement Learning

no code implementations 13 Jul 2020 Rahul Tallamraju, Nitin Saini, Elia Bonetto, Michael Pabst, Yu Tang Liu, Michael J. Black, Aamir Ahmad

We focus on vision-based MoCap, where the objective is to estimate the trajectory of body pose and shape of a single moving person using multiple micro aerial vehicles.

Decision Making reinforcement-learning +2

Generating 3D People in Scenes without People

3 code implementations CVPR 2020 Yan Zhang, Mohamed Hassan, Heiko Neumann, Michael J. Black, Siyu Tang

However, this is a challenging task for a computer, as solving it requires that (1) the generated human bodies be semantically plausible within the 3D environment (e.g. people sitting on the sofa or cooking near the stove), and (2) the generated human-scene interaction be physically feasible, such that the human body and scene do not interpenetrate while, at the same time, body-scene contact supports physical interactions.

Pose Estimation

Learning Multi-Human Optical Flow

2 code implementations 24 Oct 2019 Anurag Ranjan, David T. Hoffmann, Dimitrios Tzionas, Siyu Tang, Javier Romero, Michael J. Black

Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset.

Optical Flow Estimation

Attacking Optical Flow

1 code implementation ICCV 2019 Anurag Ranjan, Joel Janai, Andreas Geiger, Michael J. Black

In this paper, we extend adversarial patch attacks to optical flow networks and show that such attacks can compromise their performance.

Decoder Optical Flow Estimation +1

Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop

1 code implementation ICCV 2019 Nikos Kolotouros, Georgios Pavlakos, Michael J. Black, Kostas Daniilidis

Our approach is self-improving by nature, since better network estimates can lead the optimization to better solutions, while more accurate optimization fits provide better supervision for the network.

3D Human Shape Estimation 3D Multi-Person Pose Estimation

Resolving 3D Human Pose Ambiguities with 3D Scene Constraints

1 code implementation ICCV 2019 Mohamed Hassan, Vasileios Choutas, Dimitrios Tzionas, Michael J. Black

To motivate this, we show that current 3D human pose estimation methods produce results that are not consistent with the 3D scene.

3D Human Pose Estimation

Three-D Safari: Learning to Estimate Zebra Pose, Shape, and Texture from Images "In the Wild"

1 code implementation ICCV 2019 Silvia Zuffi, Angjoo Kanazawa, Tanya Berger-Wolf, Michael J. Black

In contrast to research on human pose, shape and texture estimation, training data for endangered species is limited, the animals are in complex natural scenes with occlusion, they are naturally camouflaged, travel in herds, and look similar to each other.

Pose Estimation Texture Synthesis

Capture, Learning, and Synthesis of 3D Speaking Styles

1 code implementation CVPR 2019 Daniel Cudeiro, Timo Bolkart, Cassidy Laidlaw, Anurag Ranjan, Michael J. Black

To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers.

3D Face Animation Talking Face Generation

AMASS: Archive of Motion Capture as Surface Shapes

4 code implementations ICCV 2019 Naureen Mahmood, Nima Ghorbani, Nikolaus F. Troje, Gerard Pons-Moll, Michael J. Black

We achieve this using a new method, MoSh++, that converts mocap data into realistic 3D human meshes represented by a rigged body model; here we use SMPL [doi:10.1145/2816795.2818013], which is widely used and provides a standard skeletal representation as well as a fully rigged surface mesh.

Learning and Tracking the 3D Body Shape of Freely Moving Infants from RGB-D sequences

no code implementations 17 Oct 2018 Nikolas Hesse, Sergi Pujades, Michael J. Black, Michael Arens, Ulrich G. Hofmann, A. Sebastian Schroeder

To demonstrate the applicability of SMIL, we fit the model to RGB-D sequences of freely moving infants and show, with a case study, that our method captures enough motion detail for General Movements Assessment (GMA), a method used in clinical practice for early detection of neurodevelopmental disorders in infants.

Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera

no code implementations ECCV 2018 Timo von Marcard, Roberto Henschel, Michael J. Black, Bodo Rosenhahn, Gerard Pons-Moll

In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached at the body limbs to estimate accurate 3D poses in the wild.

3D Pose Estimation

Generating 3D faces using Convolutional Mesh Autoencoders

2 code implementations ECCV 2018 Anurag Ranjan, Timo Bolkart, Soubhik Sanyal, Michael J. Black

To address this, we introduce a versatile model that learns a non-linear representation of a face using spectral convolutions on a mesh surface.

3D Face Modelling Face Alignment +1

Learning Human Optical Flow

1 code implementation 14 Jun 2018 Anurag Ranjan, Javier Romero, Michael J. Black

Given this, we devise an optical flow algorithm specifically for human motion and show that it is superior to generic flow methods.

Optical Flow Estimation

Lions and Tigers and Bears: Capturing Non-Rigid, 3D, Articulated Shape From Images

no code implementations CVPR 2018 Silvia Zuffi, Angjoo Kanazawa, Michael J. Black

Animals are widespread in nature and the analysis of their shape and motion is important in many fields and industries.

Resisting Adversarial Attacks using Gaussian Mixture Variational Autoencoders

no code implementations 31 May 2018 Partha Ghosh, Arpan Losalka, Michael J. Black

Our model has the form of a variational autoencoder, with a Gaussian mixture prior on the latent vector.

Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation

1 code implementation CVPR 2019 Anurag Ranjan, Varun Jampani, Lukas Balles, Kihwan Kim, Deqing Sun, Jonas Wulff, Michael J. Black

We address the unsupervised learning of several interconnected problems in low-level vision: single view depth prediction, camera motion estimation, optical flow, and segmentation of a video into the static scene and moving regions.

Depth Prediction Monocular Depth Estimation +3

On the Integration of Optical Flow and Action Recognition

no code implementations 22 Dec 2017 Laura Sevilla-Lara, Yiyi Liao, Fatma Guney, Varun Jampani, Andreas Geiger, Michael J. Black

Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better.

Action Recognition Optical Flow Estimation +1

End-to-end Recovery of Human Shape and Pose

9 code implementations CVPR 2018 Angjoo Kanazawa, Michael J. Black, David W. Jacobs, Jitendra Malik

The main objective is to minimize the reprojection loss of keypoints, which allows our model to be trained using images in-the-wild that only have ground truth 2D annotations.

3D Hand Pose Estimation 3D Human Shape Estimation +5

Towards Accurate Markerless Human Shape and Pose Estimation over Time

no code implementations 24 Jul 2017 Yinghao Huang, Federica Bogo, Christoph Lassner, Angjoo Kanazawa, Peter V. Gehler, Ijaz Akhter, Michael J. Black

Existing marker-less motion capture methods often assume known backgrounds, static cameras, and sequence-specific motion priors, which narrows their application scenarios.

Pose Estimation

Dynamic FAUST: Registering Human Bodies in Motion

no code implementations CVPR 2017 Federica Bogo, Javier Romero, Gerard Pons-Moll, Michael J. Black

We propose a new mesh registration method that uses both 3D geometry and texture information to register all scans in a sequence to a common reference topology.

3D geometry

Semantic Multi-View Stereo: Jointly Estimating Objects and Voxels

no code implementations CVPR 2017 Ali Osman Ulusoy, Michael J. Black, Andreas Geiger

Due to its probabilistic nature, the approach is able to cope with the approximate geometry of the 3D models as well as input shapes that are not present in the scene.

3D Reconstruction

On human motion prediction using recurrent neural networks

8 code implementations CVPR 2017 Julieta Martinez, Michael J. Black, Javier Romero

Human motion modelling is a classical problem at the intersection of graphics and computer vision, with applications spanning human-computer interaction, motion synthesis, and motion prediction for virtual and augmented reality.

Human motion prediction Human Pose Forecasting +3

Optical Flow in Mostly Rigid Scenes

no code implementations CVPR 2017 Jonas Wulff, Laura Sevilla-Lara, Michael J. Black

Existing algorithms typically focus on either recovering motion and structure under the assumption of a purely static world or optical flow for general unconstrained scenes.

Motion Estimation Optical Flow Estimation

Sparse Inertial Poser: Automatic 3D Human Pose Estimation from Sparse IMUs

no code implementations 23 Mar 2017 Timo von Marcard, Bodo Rosenhahn, Michael J. Black, Gerard Pons-Moll

We address the problem of making human motion capture in the wild more practical by using a small set of inertial sensors attached to the body.

3D Human Pose Estimation

Unite the People: Closing the Loop Between 3D and 2D Human Representations

2 code implementations CVPR 2017 Christoph Lassner, Javier Romero, Martin Kiefel, Federica Bogo, Michael J. Black, Peter V. Gehler

With a comprehensive set of experiments, we show how this data can be used to train discriminative models that produce results with an unprecedented level of detail: our models predict 31 segments and 91 landmark locations on the body.

Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (Use Video Sequence metric)

3D human pose and shape estimation Monocular 3D Human Pose Estimation

Learning from Synthetic Humans

2 code implementations CVPR 2017 Gül Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael J. Black, Ivan Laptev, Cordelia Schmid

In this work we present SURREAL (Synthetic hUmans foR REAL tasks): a new large-scale dataset with synthetically-generated but realistic images of people rendered from 3D sequences of human motion capture data.

2D Human Pose Estimation 3D Human Pose Estimation +2

3D Menagerie: Modeling the 3D shape and pose of animals

no code implementations CVPR 2017 Silvia Zuffi, Angjoo Kanazawa, David Jacobs, Michael J. Black

The best human body models are learned from thousands of 3D scans of people in specific poses, which is infeasible with live animals.

Video Segmentation via Object Flow

no code implementations CVPR 2016 Yi-Hsuan Tsai, Ming-Hsuan Yang, Michael J. Black

Video object segmentation is challenging due to fast moving objects, deforming shapes, and cluttered backgrounds.

Ranked #74 on Semi-Supervised Video Object Segmentation on DAVIS 2016 (using extra training data)

Object Optical Flow Estimation +5

Detailed Full-Body Reconstructions of Moving People From Monocular RGB-D Sequences

no code implementations ICCV 2015 Federica Bogo, Michael J. Black, Matthew Loper, Javier Romero

The method then uses geometry and image texture over time to obtain accurate shape, pose, and appearance information despite unconstrained motion, partial views, varying resolution, occlusion, and soft tissue deformation.

3D geometry

Intrinsic Depth: Improving Depth Transfer With Intrinsic Images

no code implementations ICCV 2015 Naejin Kong, Michael J. Black

In contrast to raw RGB values, albedo and shading provide a richer, more physical, foundation for depth transfer.

Depth Estimation Optical Flow Estimation

Pose-Conditioned Joint Angle Limits for 3D Human Pose Reconstruction

no code implementations CVPR 2015 Ijaz Akhter, Michael J. Black

Second, we define a general parametrization of body pose and a new, multi-stage, method to estimate 3D pose from 2D joint locations using an over-complete dictionary of poses.

Ranked #136 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation 3D Pose Estimation

Efficient Sparse-to-Dense Optical Flow Estimation Using a Learned Basis and Layers

no code implementations CVPR 2015 Jonas Wulff, Michael J. Black

Given a set of sparse matches, we regress to dense optical flow using a learned set of full-frame basis flow fields.

Optical Flow Estimation

Model Transport: Towards Scalable Transfer Learning on Manifolds

no code implementations CVPR 2014 Oren Freifeld, Soren Hauberg, Michael J. Black

We demonstrate the approach by transferring PCA and logistic-regression models of real-world data involving 3D shapes and image descriptors.

regression Transfer Learning

FAUST: Dataset and Evaluation for 3D Mesh Registration

no code implementations CVPR 2014 Federica Bogo, Javier Romero, Matthew Loper, Michael J. Black

We address this with a novel mesh registration technique that combines 3D shape and appearance information to produce high-quality alignments.

Retrieval

Grassmann Averages for Scalable Robust PCA

no code implementations CVPR 2014 Soren Hauberg, Aasa Feragen, Michael J. Black

We exploit that averages can be made robust to formulate the Robust Grassmann Average (RGA) as a form of robust PCA.