We propose a new representation of human body motion that encodes a full motion as a sequence of latent motion primitives.
We propose a framework to learn a structured latent space to represent 4D human body motion, where each latent vector encodes a full motion of the whole 3D human shape.
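The idea of chopping a motion into primitives and encoding each one as a latent vector can be sketched as follows. This is a minimal illustrative encoder, not the paper's actual model: the function names, the fixed primitive length, and the use of a random linear projection in place of a learned network are all assumptions.

```python
import numpy as np

def encode_motion(motion, primitive_len=16, latent_dim=8, rng=None):
    """Split a motion into fixed-length primitives and map each one
    to a latent vector with a (here: random, stand-in) linear encoder.

    motion: (T, D) array of per-frame pose parameters.
    Returns: (T // primitive_len, latent_dim) latent primitive sequence.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    T, D = motion.shape
    n = T // primitive_len
    # group frames into non-overlapping primitives and flatten each one
    chunks = motion[: n * primitive_len].reshape(n, primitive_len * D)
    # stand-in for a learned encoder: a fixed random projection
    W = rng.standard_normal((primitive_len * D, latent_dim)) / np.sqrt(primitive_len * D)
    return chunks @ W

# e.g. 64 frames of 24 joints x 3 pose parameters each
motion = np.zeros((64, 72))
latents = encode_motion(motion)     # four latent motion primitives of dim 8
```

A learned model would replace the random projection with a trained encoder/decoder pair so that the latent sequence can be decoded back to the full-body motion.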
In this paper, we tackle the problem of 3D human shape estimation from a single RGB image.
Our results demonstrate this ability, showing that a CNN, trained on a standard static dataset, can help recover surface details on dynamic scenes that are not perceived by traditional 2D feature-based methods.
Recent capture technologies and methods make it possible not only to retrieve 3D model sequences of moving people in clothing, but also to separate and extract the underlying body geometry, the motion component, and the clothing as a distinct geometric layer.
We consider 4D shape reconstructions in multi-view environments and investigate how to exploit temporal redundancy for precision refinement.
Instead of the dominant surface-based geometric representation of the capture, which is less suitable for volumetric effects, our pipeline exploits Centroidal Voronoi tessellation decompositions as a unified volumetric representation of the captured actor, which we show can serve seamlessly as a building block for all processing stages, from capture and tracking to virtual physics simulation.
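A Centroidal Voronoi tessellation of a volume can be approximated with Lloyd's algorithm: assign volume samples to their nearest site, then move each site to the centroid of its cell, and iterate. The sketch below is illustrative only; the uniform cube samples standing in for the actor's volume, the site count, and the iteration budget are assumptions, not the pipeline's actual settings.

```python
import numpy as np

def lloyd_cvt(samples, n_sites=8, iters=50, rng=None):
    """Approximate a Centroidal Voronoi Tessellation of a point-sampled
    volume via Lloyd relaxation.

    samples: (N, 3) points densely sampling the volume.
    Returns: (n_sites, 3) site positions and the (N,) cell label per sample.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # initialise sites on a random subset of the samples
    sites = samples[rng.choice(len(samples), n_sites, replace=False)]
    for _ in range(iters):
        # Voronoi assignment: each sample joins its nearest site's cell
        d = np.linalg.norm(samples[:, None] - sites[None], axis=2)
        labels = d.argmin(axis=1)
        # centroid update: move each site to the mean of its cell
        for k in range(n_sites):
            cell = samples[labels == k]
            if len(cell):
                sites[k] = cell.mean(axis=0)
    return sites, labels

# dense samples in a unit cube stand in for the captured actor's volume
samples = np.random.default_rng(1).random((2000, 3))
sites, labels = lloyd_cvt(samples)
```

The resulting cells partition the volume into roughly uniform convex regions, which is what makes the decomposition usable as a common primitive for tracking and physics simulation alike.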
While numerically plausible, this paradigm ignores the fact that the observed surfaces often delimit volumetric shapes, for which deformations are constrained by the volume inside the shape.
To this end, we use 2D warps for all viewpoints and all temporal frames, and a linear image formation model from texture to image space.
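With a linear image formation model, each observed image is a known sampling matrix applied to the unknown texture, so stacking the warps from all views and frames yields one joint linear system that can be solved for the texture. The 1D toy below illustrates this; the linear-interpolation warp, the shift values, and the plain least-squares solve are assumptions made for the sketch, not the paper's actual formulation.

```python
import numpy as np

def warp_matrix(tex_len, img_len, shift):
    """Toy linear sampling matrix from texture space to image space:
    each image pixel linearly interpolates two neighbouring texels at a
    sub-texel shift (a stand-in for a per-view, per-frame 2D warp)."""
    A = np.zeros((img_len, tex_len))
    scale = tex_len / img_len
    for i in range(img_len):
        x = i * scale + shift
        j = int(np.floor(x)) % tex_len
        f = x - np.floor(x)
        A[i, j] += 1 - f
        A[i, (j + 1) % tex_len] += f
    return A

rng = np.random.default_rng(0)
texture = rng.random(32)                  # unknown high-resolution texture
warps = [warp_matrix(32, 16, s) for s in (0.0, 0.25, 0.5, 0.75)]
images = [A @ texture for A in warps]     # low-res observations per frame/view

# stack all observations into one linear system and solve for the texture
A_all = np.vstack(warps)
b_all = np.concatenate(images)
est, *_ = np.linalg.lstsq(A_all, b_all, rcond=None)
```

Because the sub-texel shifts differ across observations, the stacked system has full column rank here, and the least-squares solve recovers the texture at a resolution higher than any single observation.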