After investigating existing strategies, we observe that there is a lack of study on how to prevent inter-phase confusion.
The category gap between training and evaluation has been characterised as one of the main obstacles to the success of Few-Shot Learning (FSL).
By simply pulling the different augmented views of each image together, or through other novel mechanisms, these methods can learn rich unsupervised representations and significantly improve the transfer performance of pre-trained models.
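The "pulling augmented views together" idea above is commonly instantiated as an InfoNCE-style contrastive loss. The sketch below is an illustrative assumption, not the method of any specific paper cited here: embeddings of two views of the same image form positive pairs (the diagonal of a similarity matrix), all other pairs are negatives, and the names `info_nce_loss`, the batch size, and the temperature value are all hypothetical choices.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE-style contrastive loss: pull the two augmented views of each
    image together, push views of different images apart."""
    # L2-normalise both sets of embeddings so similarity is cosine similarity
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature          # (N, N) pairwise similarities
    # Positives sit on the diagonal; compute row-wise cross-entropy
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(z1))
    return float(-log_probs[idx, idx].mean())

# Two slightly perturbed "views" of the same batch of 8 embeddings
rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 32))
loss = info_nce_loss(z1, z1 + 0.01 * rng.normal(size=(8, 32)))
```

Because the two views here are nearly identical, the loss is close to zero; in training, minimising it over many batches shapes the representation space.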
The past year has witnessed the rapid development of applying the Transformer module to vision problems.
Recently, the Transformer module has been transplanted from natural language processing to computer vision.
The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains.
Contrastive learning has achieved great success in self-supervised visual representation learning, but existing approaches mostly ignored spatial information which is often crucial for visual representation.
Recently, contrastive learning has largely advanced the progress of unsupervised visual representation learning.
Neural architecture search (NAS) has attracted increasing attention in both academia and industry.
There is a large body of literature on neural architecture search, but most existing work relies on heuristic rules that largely constrain the search flexibility.
We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks so that the impact of random errors becomes minimal.
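A graph convolutional network (GCN) used as a performance predictor, as described above, takes a sub-network's architecture graph (adjacency matrix plus per-node operation features) and regresses a scalar performance score. The following is a minimal sketch of that idea with untrained random weights; the function name, layer sizes, and mean-pool readout are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

def gcn_predict(adj, feats, w1, w2, w_out):
    """Two-layer GCN that maps an architecture graph to a scalar score.
    adj: (n, n) adjacency matrix, feats: (n, f) node operation features."""
    # Symmetric normalisation with self-loops: A_hat = D^-1/2 (A + I) D^-1/2
    a = adj + np.eye(adj.shape[0])
    d = 1.0 / np.sqrt(a.sum(axis=1))
    a_hat = a * d[:, None] * d[None, :]
    h = np.maximum(a_hat @ feats @ w1, 0)     # GCN layer 1 + ReLU
    h = np.maximum(a_hat @ h @ w2, 0)         # GCN layer 2 + ReLU
    g = h.mean(axis=0)                        # mean-pool over nodes (readout)
    return float(g @ w_out)                   # predicted performance score

# Hypothetical sub-network graph: 6 nodes (candidate ops), 4-dim features
rng = np.random.default_rng(0)
n, f = 6, 4
adj = (rng.random((n, n)) > 0.5).astype(float)
adj = np.triu(adj, 1)
adj = adj + adj.T                             # undirected skeleton for the GCN
score = gcn_predict(adj, rng.normal(size=(n, f)),
                    rng.normal(size=(f, 8)), rng.normal(size=(8, 8)),
                    rng.normal(size=8))
```

In practice the weights would be fitted by regressing the measured accuracies of sampled sub-networks, so that the smooth predictor averages out the random evaluation noise the text refers to.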
Automatically designing computationally efficient neural networks has received much attention in recent years.
AutoAugment has been a powerful algorithm that improves the accuracy of many vision tasks, yet it is sensitive to the operator space as well as hyper-parameters, and an improper setting may degrade network optimization.
The fundamental difficulty in person re-identification (ReID) lies in learning the correspondence among individual cameras.
Our approach bridges the gap in two aspects, namely, amending the estimation of the architectural gradients, and unifying the hyper-parameter settings between the search and re-training stages.
In contrast, this paper investigates ReID in a previously unexplored single-camera-training (SCT) setting, where each person in the training set appears in only one camera.
Although the performance of person Re-Identification (ReID) has been significantly boosted, many challenging issues in real scenarios have not been fully investigated, e.g., the complex scenes and lighting variations, viewpoint and pose changes, and the large number of identities in a camera network.
To address these problems, this work proposes a Global-Local-Alignment Descriptor (GLAD) and an efficient indexing and retrieval framework, respectively.