In our experiments, we demonstrate that the proposed framework can train deep learning models with millions of classes and achieves more than a 10× speedup over existing approaches.
Existing algorithms for MAML are based on the "episode" idea: at each iteration, they sample a number of tasks and a number of data points for each sampled task, and use these to update the meta-model.
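To make the episode structure concrete, below is a minimal PyTorch sketch of one MAML meta-update along these lines; the sinusoid task sampler, the network, and the hyperparameters (inner_lr, num_tasks, shots) are illustrative assumptions, not details from any particular paper.

```python
import torch
import torch.nn as nn
from torch.func import functional_call

def sample_task(shots=10):
    # Hypothetical task: regress y = a * sin(x + b) with random a, b.
    a, b = 4 * torch.rand(1) + 1, 3 * torch.rand(1)
    xs = 10 * torch.rand(2 * shots, 1) - 5
    ys = a * torch.sin(xs + b)
    # Support set for the inner update, query set for the meta update.
    return (xs[:shots], ys[:shots]), (xs[shots:], ys[shots:])

model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
inner_lr, num_tasks = 0.01, 8

for step in range(1000):
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for _ in range(num_tasks):                    # sample a batch of tasks
        (xs, ys), (xq, yq) = sample_task()        # sample data per task
        support_loss = loss_fn(functional_call(model, params, (xs,)), ys)
        # One inner-loop adaptation step; keep the graph so the
        # meta-gradient can flow through the adaptation.
        grads = torch.autograd.grad(support_loss, tuple(params.values()),
                                    create_graph=True)
        fast = {n: p - inner_lr * g
                for (n, p), g in zip(params.items(), grads)}
        # Evaluate the adapted ("fast") weights on the query set.
        meta_loss = meta_loss + loss_fn(functional_call(model, fast, (xq,)), yq)
    meta_opt.zero_grad()
    (meta_loss / num_tasks).backward()
    meta_opt.step()                               # update the meta-model
```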
Deep AUC (area under the ROC curve) Maximization (DAM) has attracted much attention recently due to its great potential for imbalanced data classification.
Our studies demonstrate that on these medical image classification tasks the proposed DAM method outperforms optimizing the cross-entropy loss by a large margin, and also achieves better performance than optimizing the existing AUC square loss, ranking #1 in multi-label classification on the CheXpert leaderboard.
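For readers unfamiliar with how AUC maximization becomes a trainable objective, here is a minimal PyTorch sketch of the min-max reformulation of the AUC square loss from the stochastic AUC maximization literature, optimized by descent on the model and the auxiliary variables a, b and ascent on the dual variable alpha. The class prior p, the synthetic data, and the step sizes are illustrative assumptions, and the DAM surrogate discussed above is a related but different loss.

```python
import torch

def auc_square_minmax(scores, labels, a, b, alpha, p):
    # Per-batch estimate of the min-max form of the AUC square loss;
    # labels in {0, 1}, p = positive-class prior (assumed known here).
    pos = (labels == 1).float()
    neg = 1.0 - pos
    return ((1 - p) * ((scores - a) ** 2) * pos
            + p * ((scores - b) ** 2) * neg
            + 2 * (1 + alpha) * (p * scores * neg - (1 - p) * scores * pos)
            - p * (1 - p) * alpha ** 2).mean()

model = torch.nn.Linear(10, 1)                 # stand-in for a deep network
a = torch.zeros((), requires_grad=True)
b = torch.zeros((), requires_grad=True)
alpha = torch.zeros((), requires_grad=True)
opt = torch.optim.SGD(list(model.parameters()) + [a, b], lr=0.01)
p, eta_alpha = 0.1, 0.01                       # assumed prior and dual step

for step in range(100):
    x = torch.randn(64, 10)                    # synthetic imbalanced batch
    y = (torch.rand(64) < p).float()
    loss = auc_square_minmax(model(x).squeeze(-1), y, a, b, alpha, p)
    opt.zero_grad()
    alpha.grad = None
    loss.backward()
    opt.step()                                 # descent on w, a, b
    with torch.no_grad():
        alpha += eta_alpha * alpha.grad        # ascent on alpha
```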
Here we develop a comprehensive suite of machine learning methods and tools spanning different computational models, molecular representations, and loss functions for molecular property prediction and drug discovery.
This paper focuses on stochastic methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to their potential applications in deep learning (e.g., deep AUC maximization).
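As a baseline point of reference, the following is a minimal sketch of stochastic gradient descent-ascent (SGDA), the primal-dual update that methods for this problem class build on and refine; the toy objective (smooth and strongly concave in the max variable), the noise model, and the step sizes are illustrative assumptions.

```python
import torch

x = torch.randn(2, requires_grad=True)        # primal (min) variable
y = torch.randn(2, requires_grad=True)        # dual (max) variable
eta_x, eta_y = 0.02, 0.05                     # assumed step sizes

for t in range(1000):
    # Toy smooth objective, strongly concave in y; the noise term
    # plays the role of a stochastic gradient estimate.
    noise = 0.01 * torch.randn(2)
    f = (0.5 * (x ** 2).sum() + 4.0 * (x * y).sum()
         - 2.0 * (y ** 2).sum() + (noise * x).sum())
    gx, gy = torch.autograd.grad(f, (x, y))
    with torch.no_grad():
        x -= eta_x * gx                       # descent step on the min player
        y += eta_y * gy                       # ascent step on the max player
```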
In this paper, we study distributed algorithms for large-scale AUC maximization with a deep neural network as a predictive model.
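The sketch below simulates, in a single process, one common communication-efficient pattern for distributed min-max training: each worker takes independent local descent-ascent steps, and the primal and dual variables are periodically averaged across workers. This is an illustrative assumption about the general pattern, not the specific algorithm studied in the paper; the worker count, averaging period, and toy objective are likewise assumed.

```python
import torch

K, period, eta = 4, 8, 0.02                   # workers, averaging period, step
workers = [{'x': torch.randn(2), 'y': torch.randn(2)} for _ in range(K)]

def local_step(w):
    # One local stochastic descent-ascent step on a toy min-max objective.
    x = w['x'].clone().requires_grad_()
    y = w['y'].clone().requires_grad_()
    noise = 0.01 * torch.randn(2)
    f = (0.5 * (x ** 2).sum() + 4.0 * (x * y).sum()
         - 2.0 * (y ** 2).sum() + (noise * x).sum())
    gx, gy = torch.autograd.grad(f, (x, y))
    w['x'] = (x - eta * gx).detach()
    w['y'] = (y + eta * gy).detach()

for t in range(400):
    for w in workers:
        local_step(w)                          # independent local updates
    if (t + 1) % period == 0:                  # communication round
        for key in ('x', 'y'):
            avg = torch.stack([w[key] for w in workers]).mean(0)
            for w in workers:
                w[key] = avg.clone()           # synchronize via averaging
```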
For convex loss functions and two classes of "well-behaved" non-convex objectives that are close to a convex function, we establish that stagewise training converges faster than vanilla SGD under the PL (Polyak-Łojasiewicz) condition, in terms of both training error and testing error.
For example, there is still a lack of convergence theory for SGD and its variants that use a stagewise step size and return an averaged solution, as is done in practice.
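To pin down what "stagewise step size with an averaged solution" means, here is a minimal sketch: SGD runs with a constant step size within each stage, the stage's iterates are averaged, the next stage restarts from that average with a geometrically decayed step size, and the final average is returned. The toy objective, the decay factor, and the fixed stage length are illustrative assumptions (some analyses also grow the stage length as the step size shrinks).

```python
import torch

def stagewise_sgd(grad_fn, w0, eta0=0.1, stages=5, steps_per_stage=200):
    # Stagewise SGD: constant step size within a stage, geometric decay
    # between stages; each stage restarts from the previous stage's average.
    w, eta = w0.clone(), eta0
    for s in range(stages):
        avg = torch.zeros_like(w)
        for t in range(steps_per_stage):
            w = w - eta * grad_fn(w)     # SGD step with the stage's step size
            avg += w
        w = avg / steps_per_stage        # averaged solution of this stage
        eta *= 0.5                       # decay the step size between stages
    return w                             # the averaged solution is returned

# Toy usage: minimize E[(w - 1)^2] given noisy gradient estimates.
noisy_grad = lambda w: 2.0 * (w - 1.0) + 0.1 * torch.randn_like(w)
w_out = stagewise_sgd(noisy_grad, torch.zeros(3))
```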