Second, we propose to learn a metric that combines the Mahalanobis and feature distances when comparing a track and a new detection in data association.
Modeling and prediction of human motion dynamics has long been a challenging problem in computer vision, and most existing methods rely on the end-to-end supervised training of various architectures of recurrent neural networks.
Ranked #3 on Human Pose Forecasting on Human3.6M
While prior work attempts to predict future video pixels, anticipate activities or forecast future scene semantic segments from segmentation of the preceding frames, methods that predict future semantic segmentation solely from the previous frame RGB data in a single end-to-end trainable model do not exist.
In this paper, we propose a new action-agnostic method for short- and long-term human pose forecasting.
Ranked #4 on Human Pose Forecasting on Human3.6M