NASTAR uses a feedback mechanism to simulate adaptive training data via a noise extractor and a retrieval model.
In practical visual search, the gallery set may grow incrementally as new items are added to the database.
The STR mechanism treats the spatial transformation as a message passing process, and the relation between the view poses and the routing weights is modeled by an end-to-end trainable neural network.
This paper introduces an approach for multi-human 3D pose estimation and tracking based on calibrated multi-view input.
Ranked #3 on 3D Multi-Person Pose Estimation on Campus
We take advantage of a recent self-supervised framework for jointly learning depth and camera ego-motion from raw videos.
Besides, we discover errors not only in the identity labels of tracklets but also in the evaluation protocol for the MARS test data.
Although CondConv effectively improves the performance of a deep model, it is currently applied only to individual tasks.
Ranked #1 on Continual Learning on Flowers (Fine-grained 6 Tasks)
First, it can avoid forgetting (i.e., learn new tasks while remembering all previous tasks).
Simultaneously running multiple modules is a key requirement for a smart multimedia system for facial applications including face recognition, facial expression understanding, and gender identification.
Ranked #1 on Gender Prediction on FotW Gender (using extra training data)
This paper aims at recognizing partially observed human actions in videos.
Many face recognition systems boost performance using deep learning models, but few studies examine the mechanisms for handling online registration.
Ranked #1 on Face Recognition on Adience (Online Open Set) (using extra training data)
Automatic age and gender classification based on unconstrained images has become an essential technique on mobile devices.
Ranked #6 on Age And Gender Classification on Adience Gender
In this paper, we present an object detection method that tackles the stingray detection problem based on aerial images.
Although aesthetic quality assessment has generated a great deal of interest in the last decade, most studies focus on providing a quality rating of good or bad for an image.
In this paper, we propose a new unsupervised deep learning approach called DeepBit to learn compact binary descriptors for efficient visual object matching.
SSDH is simple and can be realized by a slight enhancement of an existing deep architecture for classification; yet it is effective and outperforms other hashing approaches on several benchmarks and large datasets.
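As a rough illustration of why binary descriptors and hash codes make matching efficient, the sketch below thresholds real-valued activations into a bit code and ranks gallery items by Hamming distance. It is not the DeepBit or SSDH method; the function names, the 0.5 threshold, and the toy values are assumptions made for this example only.

```python
from typing import List, Sequence


def binarize(activations: Sequence[float], threshold: float = 0.5) -> int:
    """Pack thresholded activations into an integer bit code.

    The first activation becomes the most significant bit.
    The 0.5 threshold is an assumption for this sketch.
    """
    code = 0
    for value in activations:
        code = (code << 1) | (1 if value >= threshold else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, gallery: List[int]) -> List[int]:
    """Return gallery indices ordered from nearest to farthest code."""
    return sorted(range(len(gallery)), key=lambda i: hamming(query, gallery[i]))


# Toy usage with hypothetical 4-bit codes.
query_code = binarize([0.9, 0.1, 0.7, 0.4])
ranking = rank_by_hamming(query_code, [0b0101, 0b1010, 0b1000])
```

Because comparing two codes needs only an XOR and a bit count, ranking a large gallery is far cheaper than comparing real-valued feature vectors.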
Augmented reality (AR) displays have recently become increasingly popular because they are highly intuitive for humans and because high-quality head-mounted displays have developed rapidly.
We propose a Bayesian framework based on Gaussian processes to extend Fisher's discriminant to the classification of functional data such as spectra and images.