NCL consists of two core components, Nested Individual Learning (NIL) and Nested Balanced Online Distillation (NBOD), which focus on the supervised learning of each individual expert and on knowledge transfer among multiple experts, respectively.
Ranked #1 on Long-tail Learning on CIFAR-10-LT (ρ=100)
In this study, we developed a single-stage rotating object detector via two points with a solar corona heatmap (ROTP) to detect oriented objects.
In this paper, a novel approach via embedded tensor manifold regularization for 2D+3D facial expression recognition (FERETMR) is proposed.
We show that by integrating the BALI fields and SCPA model into a novel self-calibrated pose attention network, more facial prior knowledge can be learned, and the detection accuracy and robustness of our method improve for faces with large poses and heavy occlusions.
Decoupling spatiotemporal representation refers to decomposing the spatial and temporal features into dimension-independent factors.
To improve the performance of facial age estimation, we first formulate a simple standard baseline and then build a much stronger one by collecting tricks in pre-training, data augmentation, model architecture, and so on.
The threat of 3D masks to face recognition systems is increasingly serious and has attracted wide concern among researchers.
Dealing with incomplete information is a well-studied problem in the context of machine learning and computational intelligence.
no code implementations • 13 Apr 2021 • Ajian Liu, Chenxu Zhao, Zitong Yu, Jun Wan, Anyang Su, Xing Liu, Zichang Tan, Sergio Escalera, Junliang Xing, Yanyan Liang, Guodong Guo, Zhen Lei, Stan Z. Li, Du Zhang
To bridge the gap to real-world applications, we introduce a large-scale High-Fidelity Mask dataset, namely CASIA-SURF HiFiMask (briefly HiFiMask).
Meanwhile, a more adaptive, architecture-searched network structure can also perform better than block-fixed ones such as ResNet, since it better increases the diversity of features in different stages of the network.
Attribute reduction is one of the most important research topics in the theory of rough sets, and many rough sets-based attribute reduction methods have thus been presented.
Although significant progress has been made in synthesizing high-quality and visually realistic face images with unconditional Generative Adversarial Networks (GANs), there is still a lack of control over the generation process needed to achieve semantic face editing.
Recently, heatmap regression has been widely explored in facial landmark detection and has obtained remarkable performance.
Moreover, a novel cross-order cross-semantic (COCS) regularizer is designed to drive the network to learn cross-order cross-semantic features from different activations for facial landmark detection.
Face anti-spoofing (FAS) plays a vital role in securing face recognition systems.
Heatmap regression (HR) has become one of the mainstream approaches for face alignment and has obtained promising results under constrained environments.
Ranked #4 on Face Alignment on AFLW-19
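The heatmap regression idea mentioned above can be illustrated with a minimal sketch. It assumes the common formulation in which each landmark is encoded as a 2D Gaussian target map and decoded by taking the location of the maximum response; the function names, the grid size, and the `sigma` value are illustrative assumptions and are not taken from any of the listed papers.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=1.5):
    """Build a height x width target map with a Gaussian peak at (cx, cy).

    sigma is a hypothetical spread parameter chosen for illustration.
    """
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def decode_landmark(heatmap):
    """Return the (x, y) position of the strongest response in the map."""
    best_value, best_xy = float("-inf"), (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


def mse_loss(predicted, target):
    """Mean squared error between two maps of the same shape."""
    total = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(predicted, target)
        for p, t in zip(pred_row, target_row)
    )
    return total / (len(target) * len(target[0]))


# Encode a landmark at x=5, y=2 on an 8x8 grid, then decode it again.
target = gaussian_heatmap(8, 8, cx=5, cy=2)
recovered = decode_landmark(target)
```

Here `recovered` is `(5, 2)`, since the Gaussian attains its maximum exactly at the encoded location. In training, a network would predict one such map per landmark and be penalized with a loss such as `mse_loss` against the target map.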
Although current face alignment algorithms perform fairly well at predicting the locations of facial landmarks, major challenges remain for faces with severe occlusion, large pose variations, and similar conditions.
Gesture recognition has attracted considerable attention owing to its great potential in applications.
no code implementations • 23 Apr 2020 • Ajian Liu, Xuan Li, Jun Wan, Sergio Escalera, Hugo Jair Escalante, Meysam Madadi, Yi Jin, Zhuoyuan Wu, Xiaogang Yu, Zichang Tan, Qi Yuan, Ruikun Yang, Benjia Zhou, Guodong Guo, Stan Z. Li
Although ethnic bias has been verified to severely affect the performance of face recognition systems, it still remains an open research problem in face anti-spoofing.
Ethnic bias has proven to negatively affect the performance of face recognition systems, and it remains an open research problem in face anti-spoofing.
Whether deep learning or handcrafted methods are used, the dynamic information from videos and the effect of cross-ethnicity are rarely considered in face anti-spoofing.
To narrow this gap and facilitate future pedestrian detection research, we introduce a large and diverse dataset named WiderPerson for dense pedestrian detection in the wild.
Ranked #3 on Object Detection on WiderPerson (mMR metric)
To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities.
The ChaLearn large-scale gesture recognition challenge has been run twice, in two workshops held in conjunction with the International Conference on Pattern Recognition (ICPR) 2016 and the International Conference on Computer Vision (ICCV) 2017, attracting more than 200 teams from around the world.
To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and visual modalities.
RAML aims to give the meta-learner the ability to leverage previously learned knowledge to reduce the dimension of the original input data by expressing it in high-level representations, which helps the meta-learner perform well.
Face anti-spoofing is significant to the security of face recognition systems.
Unlike the conventional ConvNet, which learns deep separable features for homogeneous modality-based classification with only one softmax loss function, the c-ConvNet enhances the discriminative power of the deeply learned features and weakens the undesired modality discrepancy by jointly optimizing a ranking loss and a softmax loss for both homogeneous and heterogeneous modalities.
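Jointly optimizing a ranking loss and a softmax loss can be sketched as a weighted sum of the two terms. The sketch below is only an illustration under stated assumptions: the ranking term is taken to be a margin-based triplet loss on squared Euclidean distances, and the names `joint_loss`, `weight`, and `margin` are hypothetical. The excerpt does not specify the exact form used by c-ConvNet.

```python
import math


def softmax_cross_entropy(logits, label):
    """Negative log-probability of the true class under a softmax."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_ranking_loss(anchor, positive, negative, margin=1.0):
    """Assumed ranking term: pull the positive closer than the negative."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, margin + gap)


def joint_loss(logits, label, anchor, positive, negative,
               weight=0.5, margin=1.0):
    """Softmax loss plus a weighted ranking loss (weight is hypothetical)."""
    classification = softmax_cross_entropy(logits, label)
    ranking = triplet_ranking_loss(anchor, positive, negative, margin)
    return classification + weight * ranking
```

In a heterogeneous setting, one could for example draw the anchor from one modality (such as RGB) and the positive from another (such as depth) of the same class, so that the ranking term penalizes modality discrepancy while the softmax term keeps the classes separable. This pairing scheme is an assumption made for illustration.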
Specifically, deep learning methods based on the CNN and RNN architectures have been adopted for motion recognition using RGB-D data.
Recently, the popularity of depth sensors such as Kinect has made depth videos easily available, yet their advantages have not been fully exploited.
In the considered scenario, a single training video is available for each gesture to be recognized, which limits the application of traditional techniques (e.g., HMMs).