Sim2Real domain adaptation (DA) research focuses on the constrained setting of adapting from a labeled synthetic source domain to an unlabeled or sparsely labeled real target domain.
Modern deep learning systems require large datasets to achieve strong performance, yet there is little guidance on how much or what kind of data to collect.
Given a small training data set and a learning algorithm, how much more data is necessary to reach a target validation or test performance?
Oversampling instances of the tail classes is a common attempt to address class imbalance in long-tailed datasets.
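A minimal sketch of this idea, assuming a NumPy label array: each minority ("tail") class is resampled with replacement until it matches the majority class count. The function name and resampling policy are illustrative, not taken from any particular paper.

```python
import numpy as np

def oversample_tail(X, y, seed=0):
    """Resample with replacement so every class matches the majority count."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for c in classes:
        members = np.flatnonzero(y == c)
        # tail classes are drawn with replacement up to the majority count
        idx.append(rng.choice(members, size=target, replace=True))
    idx = np.concatenate(idx)
    return X[idx], y[idx]

X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.array([0] * 7 + [1] * 3)   # class 1 is the tail class
Xb, yb = oversample_tail(X, y)
```

After resampling, both classes appear 7 times; note that naive replication like this can encourage overfitting to the few tail examples, which is why more elaborate long-tail methods exist.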
Unsupervised domain adaptation is used in many machine learning applications where, during training, a model has access to unlabeled data in the target domain and a related labeled dataset from a source domain.
Active learning is the process of training a model with limited labeled data by iteratively selecting the most informative subset of an unlabeled data pool to label.
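The pool-based loop can be sketched as follows. This is a toy illustration, not any specific paper's method: the "model" is a nearest-centroid classifier, and uncertainty is the margin between the two closest class centroids (smaller margin = more uncertain).

```python
import numpy as np

def margin_scores(X, centroids):
    """Distance gap between the two nearest centroids; small = uncertain."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    d.sort(axis=1)
    return d[:, 1] - d[:, 0]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

labeled = [0, 1, 50, 51]                      # tiny seed set, both classes
pool = [i for i in range(len(X)) if i not in labeled]

for _ in range(5):                            # 5 acquisition rounds
    centroids = np.vstack([X[[i for i in labeled if y[i] == c]].mean(0)
                           for c in (0, 1)])
    scores = margin_scores(X[pool], centroids)
    pick = pool.pop(int(np.argmin(scores)))   # query the most uncertain point
    labeled.append(pick)                      # oracle provides its label
```

In practice the scoring function (entropy, margin, ensemble disagreement, coreset coverage) is the main design choice, and the model is retrained between acquisition rounds.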
Synthetic data is emerging as a promising solution to the scalability issue of supervised deep learning, especially when real data are difficult to acquire or annotate.
Most deep learning models rely on expressive high-dimensional representations to achieve good performance on tasks such as classification.
We derive a closed-form expression for the gradient that is efficient to compute: its cost is linear in the size of the training mini-batch and quadratic in the representation dimensionality.
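The paper's exact loss is not reproduced here, but the stated complexity can be illustrated with a representative example: for a loss of the form (1/B) Σᵢ zᵢᵀ M zᵢ over a mini-batch of B representations of dimension d, the gradient with respect to the d×d matrix M is (1/B) Σᵢ zᵢ zᵢᵀ, a single contraction costing O(B·d²): linear in batch size, quadratic in dimension.

```python
import numpy as np

def grad_wrt_metric(Z):
    """d/dM of (1/B) * sum_i z_i^T M z_i  =  (1/B) * sum_i z_i z_i^T."""
    B = len(Z)
    return Z.T @ Z / B            # one O(B * d^2) matrix contraction

rng = np.random.default_rng(0)
Z = rng.normal(size=(32, 8))      # B=32 representations of dimension d=8
G = grad_wrt_metric(Z)
```

The resulting gradient is a symmetric 8×8 matrix; any loss whose per-example gradient is an outer product of representations shares this O(B·d²) scaling.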
Classic approaches alternate the optimization over the learned metric and the assignment of similar instances.
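A hedged sketch of that alternation, assuming a diagonal metric (per-feature weights) purely for illustration: fix the metric and reassign points to their nearest center, then fix the assignments and update the metric to down-weight features with large within-cluster scatter. None of the names or update rules below come from a specific paper.

```python
import numpy as np

def assign(X, centers, w):
    """Assignment step: nearest center under the weighted squared distance."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2 * w).sum(axis=2)
    return d.argmin(axis=1)

def update_metric(X, centers, labels, eps=1e-6):
    """Metric step: inverse within-cluster scatter, normalized to sum to 1."""
    scatter = ((X - centers[labels]) ** 2).mean(axis=0) + eps
    w = 1.0 / scatter
    return w / w.sum()

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 1, (30, 2)), rng.normal(3, 1, (30, 2))])
centers = X[[0, 30]]              # one seed point from each blob
w = np.ones(2) / 2                # start from a uniform diagonal metric

for _ in range(10):               # alternate assignment and metric updates
    labels = assign(X, centers, w)
    centers = np.vstack([X[labels == k].mean(0) for k in (0, 1)])
    w = update_metric(X, centers, labels)
```

Each half-step holds one block of variables fixed while optimizing the other, which is what makes these classic formulations tractable but also prone to local optima.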
Clustering is the task of grouping a set of objects so that objects in the same cluster are more similar to each other than to those in other clusters.
This paper introduces a regularization method to explicitly control the rank of a learned symmetric positive semidefinite distance matrix in distance metric learning.
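The paper's exact regularizer is not shown here; as background, a common convex surrogate for the rank of a PSD matrix is its trace (equal to the nuclear norm on the PSD cone). The sketch below applies the proximal step for a trace penalty, soft-thresholding the eigenvalues and clipping at zero, which both preserves positive semidefiniteness and drives small eigenvalues to exactly zero, lowering the rank.

```python
import numpy as np

def shrink_psd(M, tau):
    """Soft-threshold the eigenvalues of a symmetric matrix by tau >= 0."""
    vals, vecs = np.linalg.eigh(M)
    vals = np.maximum(vals - tau, 0.0)   # proximal operator of tau * trace(M)
    return (vecs * vals) @ vecs.T

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
M = A @ A.T                              # a random full-rank PSD matrix

# threshold at the 3rd-smallest eigenvalue: the three smallest are zeroed out
tau = np.sort(np.linalg.eigvalsh(M))[2]
M_low = shrink_psd(M, tau)
rank = int(np.sum(np.linalg.eigvalsh(M_low) > 1e-9))
```

Controlling the rank this way keeps the learned Mahalanobis distance effectively low-dimensional, which is the usual motivation for rank regularization in distance metric learning.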