Via generative adversarial training to learn a target distribution, these layer-wise subspaces automatically discover a set of "eigen-dimensions" at each layer corresponding to a set of semantic attributes or interpretable variations.
Facial attribute editing aims to manipulate attributes on the human face, e. g., adding a mustache or changing the hair color.
Our method is based on the observation that equivariance is an implicit constraint in fully supervised semantic segmentation, whose pixel-level labels take the same spatial transformation as the input images during data augmentation.
This regularized CAM can be embedded in most recent advanced weakly supervised semantic segmentation framework.
Weakly supervised object detection aims at learning precise object detectors, given image category labels.
Benefitted from its great success on many tasks, deep learning is increasingly used on low-computational-cost devices, e. g. smartphone, embedded devices, etc.
Specifically, we introduce a kernel generator as meta-learner to learn to construct feature embedding for query images.
The generator contains an attribute manipulation network (AMN) to edit the face image, and a spatial attention network (SAN) to localize the attribute-specific region which restricts the alternation of AMN within this region.
In current face recognition approaches with convolutional neural network (CNN), a pair of faces to compare are independently fed into the CNN for feature extraction.
Following the similar idea of GAN, this work proposes a novel GAN architecture with duplex adversarial discriminators (referred to as DupGAN), which can achieve domain-invariant representation and domain transformation.
Rotation-invariant face detection, i. e. detecting faces with arbitrary rotation-in-plane (RIP) angles, is widely required in unconstrained applications but still remains as a challenging task, due to the large variations of face appearances.
Based on the encoder-decoder architecture, facial attribute editing is achieved by decoding the latent representation of the given face conditioned on the desired attributes.
The designed ReST has an intrinsic recursive structure and is capable of progressively aligning faces to a canonical one, even those with large variations.
On the other hand, by using a unified MLP cascade to examine proposals of all views in a centralized style, it provides a favorable solution for multi-view face detection with high accuracy and low time-cost.
Robust face representation is imperative to highly accurate face recognition.
Ranked #24 on Face Verification on Labeled Faces in the Wild
Face alignment or facial landmark detection plays an important role in many computer vision applications, e. g., face recognition, facial expression recognition, face animation, etc.
As a result, the representation from the topmost layers of the MvDN network is robust to view discrepancy, and also discriminative.
To alleviate the discrepancy between source and target domains, we propose a domain adaptation method, named as Bi-shifting Auto-Encoder network (BAE).
Facial landmark detection, as a vital topic in computer vision, has been studied for many decades and lots of datasets have been collected for evaluation.
Identifying subjects with variations caused by poses is one of the most challenging tasks in face recognition, since the difference in appearances caused by poses may be even larger than the difference due to identity.