A key challenge in this setting is to learn effectively across clients even though each client has unique data that is often limited in size.
In this approach, a central hypernetwork model is trained to generate a set of models, one model for each client.
As a result, our method scales well with both the number of classes and data size.
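For intuition, here is a minimal sketch of that setup, assuming a small linear target model and made-up sizes (number of clients, embedding and layer dimensions): a shared hypernetwork maps a learnable client embedding to that client's personal weights, so gradients from every client flow back into the one central model. This is an illustration of the idea, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 10 clients, 32-dim client embeddings,
# target classifier maps 784 features to 10 classes.
N_CLIENTS, EMB_DIM, IN_DIM, OUT_DIM = 10, 32, 784, 10

class HyperNet(nn.Module):
    """Central hypernetwork: client embedding -> weights of that client's model."""
    def __init__(self):
        super().__init__()
        self.client_emb = nn.Embedding(N_CLIENTS, EMB_DIM)
        self.body = nn.Sequential(nn.Linear(EMB_DIM, 128), nn.ReLU())
        # Separate heads emit the flat weight matrix and bias of the target layer.
        self.w_head = nn.Linear(128, OUT_DIM * IN_DIM)
        self.b_head = nn.Linear(128, OUT_DIM)

    def forward(self, client_id):
        h = self.body(self.client_emb(client_id))
        w = self.w_head(h).view(OUT_DIM, IN_DIM)
        b = self.b_head(h)
        return w, b

hnet = HyperNet()
x = torch.randn(5, IN_DIM)               # a batch of one client's data
w, b = hnet(torch.tensor(2))             # generate client #2's personal model
logits = torch.nn.functional.linear(x, w, b)
# Gradients w.r.t. (w, b) flow back into the shared hypernetwork,
# so training on any client updates the central model.
```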
In this paper, we identify an important type of data where generalization from small to large graphs is challenging: graph distributions for which the local structure depends on the graph size.
Here, we tackle the problem of learning the entire Pareto front, with the capability of selecting a desired operating point on the front after training.
Two main challenges arise in this multi-task learning setting: (i) designing useful auxiliary tasks; and (ii) combining auxiliary tasks into a single coherent loss.
We first characterize the space of linear layers that are equivariant both to element reordering and to the inherent symmetries of elements, like translation in the case of images.
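As a rough sketch of what one such layer can look like for a set of images (my own illustrative construction, not the paper's characterization): a convolution with shared weights applied to each element, plus a second convolution applied to the sum over the set. This commutes with permuting the elements and, up to boundary effects from padding, with translating each image.

```python
import torch
import torch.nn as nn

class SetConvLayer(nn.Module):
    """Linear layer equivariant to (i) reordering the set elements and
    (ii) translating each element (handled by weight-shared convolutions)."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.elem = nn.Conv2d(c_in, c_out, 3, padding=1, bias=False)  # per-element part
        self.agg = nn.Conv2d(c_in, c_out, 3, padding=1, bias=False)   # aggregated part

    def forward(self, x):
        # x: (batch, set_size, channels, H, W)
        b, n, c, h, w = x.shape
        per_elem = self.elem(x.reshape(b * n, c, h, w)).reshape(b, n, -1, h, w)
        pooled = self.agg(x.sum(dim=1))               # permutation-invariant sum over the set
        return per_elem + pooled.unsqueeze(1)         # broadcast back to every element

layer = SetConvLayer(3, 8)
x = torch.randn(2, 5, 3, 16, 16)                      # 2 sets of 5 RGB images
y = layer(x)                                          # (2, 5, 8, 16, 16)
# Permuting the 5 elements of x permutes the output rows the same way.
```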
Class-conditional generative models hold promise to overcome the shortcomings of their discriminative counterparts.
Predicting not only the target but also an accurate measure of uncertainty is important for many machine learning applications and in particular safety-critical ones.
This paper addresses this problem, incremental few-shot learning, where a regular classification network has already been trained to recognize a set of base classes, and several extra novel classes are being considered, each with only a few labeled examples.
Synthesizing programs from input/output examples is a classic problem in artificial intelligence.
Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops.
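As a toy illustration of the message-passing pattern being described, the sketch below runs synchronous (loopy) sum-product belief propagation on a tiny pairwise model with binary variables; the triangle graph, the potentials, and the iteration count are all made up for the example, and on such a loopy graph the resulting beliefs are only approximate marginals.

```python
import numpy as np

# Toy pairwise model over binary variables 0-2 on a triangle (a loopy graph).
edges = [(0, 1), (1, 2), (0, 2)]
unary = {i: np.array([0.6, 0.4]) for i in range(3)}           # node potentials
pair = np.array([[0.7, 0.3],
                 [0.3, 0.7]])                                 # shared edge potential

# msgs[(i, j)] = message from node i to node j, initialized uniformly.
msgs = {(i, j): np.ones(2) for i, j in edges + [(j, i) for i, j in edges]}

for _ in range(20):                                           # synchronous BP sweeps
    new = {}
    for i, j in msgs:
        # Product of node i's potential and all incoming messages except the one from j.
        incoming = unary[i].copy()
        for k, l in msgs:
            if l == i and k != j:
                incoming *= msgs[(k, l)]
        m = pair.T @ incoming                                 # marginalize out x_i
        new[(i, j)] = m / m.sum()                             # normalize for stability
    msgs = new

# Approximate marginal beliefs: node potential times all incoming messages.
for i in range(3):
    b = unary[i].copy()
    for k, l in msgs:
        if l == i:
            b *= msgs[(k, l)]
    print(f"node {i} belief:", b / b.sum())
```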
We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks, and hyperparameter optimization for fully connected networks.
We show how a simple modification to the local reparameterization trick, previously used to train Gaussian distributed weights, enables the training of discrete weights.
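To make the trick concrete, here is a minimal, hedged sketch of local reparameterization applied to stochastic binary (±1) weights: instead of sampling the weights themselves, the layer samples its pre-activations from a Gaussian whose mean and variance follow from the weight distribution (a central-limit approximation). The layer sizes are arbitrary, and this illustrates the trick rather than any paper's full training recipe.

```python
import torch
import torch.nn as nn

class BinaryLRLinear(nn.Module):
    """Linear layer with stochastic binary weights w_ij in {-1, +1}, trained via
    the local reparameterization trick: sample pre-activations, not weights."""
    def __init__(self, in_features, out_features):
        super().__init__()
        # Unconstrained logits; sigmoid gives P(w_ij = +1).
        self.logits = nn.Parameter(torch.zeros(out_features, in_features))

    def forward(self, x):
        p = torch.sigmoid(self.logits)
        mean_w = 2.0 * p - 1.0                   # E[w]   for w in {-1, +1}
        var_w = 1.0 - mean_w ** 2                # Var[w]
        mu = x @ mean_w.t()                      # mean of the pre-activation
        sigma2 = (x ** 2) @ var_w.t()            # variance of the pre-activation
        eps = torch.randn_like(mu)
        return mu + torch.sqrt(sigma2 + 1e-8) * eps   # reparameterized sample

layer = BinaryLRLinear(784, 10)
x = torch.randn(32, 784)
out = layer(x)   # (32, 10), differentiable w.r.t. layer.logits
# At test time one can replace the sample with the deterministic mean,
# or draw hard binary weights from the learned probabilities.
```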
In this paper we consider the problem of human pose estimation from a single still image.
In unsupervised ensemble learning, one obtains predictions from multiple sources or classifiers, yet without knowing the reliability and expertise of each source, and with no labeled data to assess it.
We consider the problem of learning from a similarity matrix (as in spectral clustering and low-dimensional embedding) when computing pairwise similarities is costly and only a limited number of entries can be observed.