We present a conditional estimation (CEST) framework to learn 3D facial parameters from 2D single-view images by self-supervised training from videos.
As one of the earliest works in hyperspherical face recognition, SphereFace explicitly proposed to learn face embeddings with large inter-class angular margin.
In this paper, we start by identifying the discrepancy between training and evaluation in the existing multi-class classification framework and then discuss the potential limitations caused by the "competitive" nature of softmax normalization.
To improve upon existing models, we propose a generic audio-driven facial animation approach that achieves highly realistic motion synthesis results for the entire face.
The network learns to generate faces from voices by matching the identities of generated faces to those of the speakers on a training set.
In many retrieval problems, where we must retrieve one or more entries from a gallery in response to a probe, it is common practice to learn to do so by directly comparing the probe and gallery entries to one another.
We propose a novel framework, called Disjoint Mapping Network (DIMNet), for cross-modal biometric matching, in particular of voices and faces.
Unlike these works, this paper investigates how long-tailed data impact the training of face CNNs and develops a novel loss function, called range loss, to effectively utilize the tailed data in the training process.
This paper addresses the deep face recognition (FR) problem under the open-set protocol, where ideal face features are expected to have a smaller maximal intra-class distance than minimal inter-class distance under a suitably chosen metric space.
Ranked #1 on Face Verification on CK+
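The open-set criterion stated above (maximal intra-class distance smaller than minimal inter-class distance) can be checked directly on a small set of features. The following is a minimal sketch, not the paper's method: it assumes Euclidean distance as the metric, and the function name `satisfies_open_set_criterion` is hypothetical.

```python
import math
from itertools import combinations


def satisfies_open_set_criterion(features, labels):
    """Return True if the largest same-class distance is smaller than
    the smallest different-class distance.

    Assumption: Euclidean distance stands in for the "suitably chosen
    metric space"; other metrics may be more appropriate.
    """
    max_intra = 0.0        # largest distance between two same-class features
    min_inter = math.inf   # smallest distance between two different-class features
    for (f1, y1), (f2, y2) in combinations(zip(features, labels), 2):
        d = math.dist(f1, f2)
        if y1 == y2:
            max_intra = max(max_intra, d)
        else:
            min_inter = min(min_inter, d)
    return max_intra < min_inter


# Two tight, well-separated identities satisfy the criterion.
well_separated = satisfies_open_set_criterion(
    [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)], [0, 0, 1, 1]
)

# A spread-out class that lies close to another identity does not.
overlapping = satisfies_open_set_criterion(
    [(0.0, 0.0), (0.0, 4.0), (1.0, 0.0)], [0, 0, 1]
)
```

When the criterion holds, a single distance threshold between the two values separates all same-identity pairs from all different-identity pairs in the set being checked.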
Cross-entropy loss together with softmax is arguably one of the most commonly used supervision components in convolutional neural networks (CNNs).
Convolutional neural networks have achieved great improvement on face recognition in recent years because of their extraordinary ability in learning discriminative features of people with different identities.
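For reference, the softmax cross-entropy supervision mentioned above can be sketched in a few lines. This is a generic textbook formulation using only the standard library, not code from the paper; the function names are illustrative.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax: subtract the max before exponentiating."""
    m = max(logits)
    log_sum = math.log(sum(math.exp(z - m) for z in logits))
    return [z - m - log_sum for z in logits]


def softmax_cross_entropy(logits, target):
    """Cross-entropy loss for one sample: negative log-probability of the target class."""
    return -log_softmax(logits)[target]


# With two equal logits, each class has probability 0.5, so the loss is log(2).
loss_uniform = softmax_cross_entropy([0.0, 0.0], 0)

# Raising the target logit lowers the loss.
loss_confident = softmax_cross_entropy([3.0, 0.0], 0)
```

Because the class probabilities sum to one, raising the score of one class necessarily lowers the probabilities of all the others.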
In most of the available CNNs, the softmax loss function is used as the supervision signal to train the deep model.
In order to address this problem, we propose a novel deep face recognition framework to learn the age-invariant deep face features through a carefully designed CNN model.
Ranked #7 on Age-Invariant Face Recognition on CACDVS
Sparse coding with dictionary learning (DL) has shown excellent classification performance.
Practical face recognition has been studied in the past decades, but still remains an open challenge.
We propose structured occlusion coding (SOC) to address occlusion problems.
The LCD similarity measure can be kernelized under KCRC, which theoretically links CRC and LCD under the kernel method.