Learning strong representations for multi-modal retrieval is an important problem for many applications, such as recommendation and search.
Equivariance to random image transformations is an effective method to learn landmarks of object categories, such as the eyes and the nose in faces, without manual supervision.
Ranked #1 on Unsupervised Facial Landmark Detection on 300W
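The equivariance idea in the entry above can be illustrated with a toy sketch: a landmark detector Φ is equivariant to a transformation g if detecting on the warped image gives the same point as warping the detection, i.e. Φ(g(x)) = g(Φ(x)). The sketch below is not the paper's method. It assumes a hand-written brightest-pixel detector in place of a learned network and a fixed horizontal flip in place of random transformations, and every function name in it is hypothetical.

```python
# Toy illustration of the equivariance constraint Phi(g(x)) = g(Phi(x)).
# Assumptions: the detectors and the flip below are hypothetical stand-ins,
# not the learned detector or the random warps used in the paper.

def detect_brightest(image):
    """Toy detector Phi: return (row, col) of the brightest pixel."""
    coords = ((r, c) for r in range(len(image)) for c in range(len(image[0])))
    return max(coords, key=lambda rc: image[rc[0]][rc[1]])

def detect_constant(image):
    """A deliberately bad detector that ignores the image content."""
    return (0, 0)

def flip_image(image):
    """Transformation g acting on the image: horizontal flip."""
    return [row[::-1] for row in image]

def flip_point(point, width):
    """The same transformation g acting on a landmark coordinate."""
    row, col = point
    return (row, width - 1 - col)

def equivariance_loss(detector, image):
    """Squared distance between Phi(g(x)) and g(Phi(x))."""
    width = len(image[0])
    detected_after_warp = detector(flip_image(image))
    warped_detection = flip_point(detector(image), width)
    return sum((a - b) ** 2 for a, b in zip(detected_after_warp, warped_detection))

image = [
    [0, 0, 0, 0],
    [0, 0, 9, 0],
    [0, 1, 0, 0],
]

landmark = detect_brightest(image)
good_loss = equivariance_loss(detect_brightest, image)
bad_loss = equivariance_loss(detect_constant, image)
```

The brightest-pixel detector follows the image content, so its loss is zero under the flip, while the constant detector is penalized. In a learning setting, a loss of this kind would be minimized over the detector's parameters rather than evaluated on fixed functions.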
DensePose supersedes traditional landmark detectors by densely mapping image pixels to body surface coordinates.
We propose a new approach to model and learn, without manual supervision, the symmetries of natural objects, such as faces or flowers, given only images as input.
We propose a novel method for learning convolutional neural image representations without manual supervision.
One of the key challenges of visual perception is to extract abstract models of 3D objects and object categories from visual measurements, which are affected by complex nuisance factors such as viewpoint, occlusion, motion, and deformations.
Ranked #3 on Unsupervised Facial Landmark Detection on AFLW-MTFL
Automatically learning the structure of object categories remains an important open problem in computer vision.
Ranked #2 on Unsupervised Facial Landmark Detection on AFLW-MTFL