At the core of the proposed method lies our Volumetric Heatmap Autoencoder, a fully-convolutional network tasked with the compression of ground-truth heatmaps into a dense intermediate representation.
Ranked #6 on 3D Human Pose Estimation on Panoptic
Almost all neural architecture search methods are evaluated in terms of the performance (i.e., test accuracy) of the model structures that they find.
When you see a person in a crowd, occluded by other persons, you miss visual information that can be used to recognize, re-identify or simply classify him or her.
We address unsupervised optical flow estimation for ego-centric motion.
Despite the advent of autonomous cars, it is likely, at least in the near future, that human attention will retain a central role as a guarantee of legal responsibility during the driving task.
Multi-object tracking has recently become an important area of computer vision, especially for Advanced Driver Assistance Systems (ADAS).
To effectively register an egocentric video sequence under these conditions, we propose to tackle the source of the problem: the matching process.