We explore 3D human pose estimation from a single RGB image. While many
approaches try to directly predict 3D pose from image measurements, we consider
a simple architecture that reasons through intermediate 2D pose predictions.
Our approach is based on two key observations: (1) Deep neural nets have
revolutionized 2D pose estimation, producing accurate 2D predictions even for
poses with self-occlusions. (2) Large datasets of 3D mocap data are now readily
available, making it tempting to lift predicted 2D poses to 3D through simple
memorization (e.g., nearest neighbors). The resulting architecture is trivial
to implement with off-the-shelf 2D pose estimation systems and 3D mocap
libraries. Importantly, we demonstrate that such methods outperform almost all
state-of-the-art 3D pose estimation systems, most of which try to regress 3D
pose directly from 2D measurements.
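
The lifting-by-memorization idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an orthographic camera looking along the z axis, a simple center-and-scale normalization, and a plain squared Euclidean distance between 2D poses. All function names and the data layout (a pose as a list of joint tuples) are hypothetical choices made for illustration.

```python
import math


def project_orthographic(pose_3d):
    """Project a 3D pose to 2D by dropping depth (assumed orthographic camera)."""
    return [(x, y) for x, y, _ in pose_3d]


def normalize_2d(pose_2d):
    """Center a 2D pose at its mean joint and scale it to unit norm.

    This makes the comparison invariant to image translation and scale
    (an assumption of this sketch, not a detail taken from the text).
    """
    n = len(pose_2d)
    cx = sum(x for x, _ in pose_2d) / n
    cy = sum(y for _, y in pose_2d) / n
    centered = [(x - cx, y - cy) for x, y in pose_2d]
    # Fall back to 1.0 if all joints coincide, to avoid dividing by zero.
    scale = math.sqrt(sum(x * x + y * y for x, y in centered)) or 1.0
    return [(x / scale, y / scale) for x, y in centered]


def pose_distance(a, b):
    """Sum of squared joint-wise distances between two 2D poses."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b))


def lift_by_nearest_neighbor(pose_2d, mocap_library):
    """Return (index, 3D pose) of the library entry whose 2D projection
    is closest to the predicted 2D pose."""
    query = normalize_2d(pose_2d)

    def cost(i):
        exemplar = normalize_2d(project_orthographic(mocap_library[i]))
        return pose_distance(query, exemplar)

    best = min(range(len(mocap_library)), key=cost)
    return best, mocap_library[best]
```

In use, `pose_2d` would come from an off-the-shelf 2D pose estimator and `mocap_library` from a 3D mocap collection with the same joint ordering. A linear scan is shown for clarity; a large library would likely call for an indexed nearest-neighbor search instead.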