Graph Convolutional Networks (GCNs) are powerful for processing graph-structured data and have achieved state-of-the-art performance in several tasks such as node classification, link prediction, and graph classification.
Federated learning (FL) is a machine learning field in which researchers aim to facilitate model learning among multiple parties without violating privacy-protection regulations.
Inspired by the paradigm of multiple kernel learning, our solution to this issue is to approximate the optimal kernel with a combination of multiple kernels rather than a single kernel, which may limit performance and flexibility.
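As a minimal sketch of this idea (not the paper's algorithm; the RBF base kernels, the bandwidths `gammas`, and the `weights` here are illustrative), a convex combination of kernels at different scales can stand in for a single fixed kernel:

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def combined_kernel(x, y, gammas, weights):
    """Convex combination of base RBF kernels with different bandwidths.

    A learned weighting over base kernels can approximate the optimal
    kernel more flexibly than any single base kernel.
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should form a convex combination"
    return sum(w * rbf_kernel(x, y, g) for w, g in zip(weights, gammas))

# Example: three base kernels at different scales, equal (illustrative) weights.
k = combined_kernel([1.0, 2.0], [1.5, 1.0],
                    gammas=[0.1, 1.0, 10.0],
                    weights=[1 / 3, 1 / 3, 1 / 3])
```

In practice the weights would be learned jointly with the downstream task rather than fixed by hand.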
To address this issue, we propose Bayesian Pseudocoresets Exemplar VAE (ByPE-VAE), a new variant of VAE with a prior based on Bayesian pseudocoreset.
The goal of few-shot classification is to classify new categories with few labeled examples within each class.
We then force the pretraining model to focus on the found foreground objects through a fusion sampling strategy; at the evaluation stage, among the images in each training class of any few-shot task, we seek shared content and filter out the background.
We further design discrete latent space for the variational attention and mathematically show that our model is free from posterior collapse.
Convolutional Neural Networks (CNNs) have achieved tremendous success in a number of learning tasks including image classification.
The first mechanism is a selective domain adaptation (SDA) method, which transfers knowledge from the closest source domain.
To tackle these problems, we use pairwise similarity to weight the reconstruction loss so as to capture local structure information, where the similarity is learned by a self-expression layer.
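One plausible reading of this weighting can be sketched as follows; the aggregate-similarity (degree) weighting and all names here are hypothetical, and the paper's exact objective may differ:

```python
def similarity_weighted_loss(X, X_hat, S):
    """Reconstruction loss where each sample's squared error is weighted
    by its total pairwise similarity (its degree in the similarity graph),
    so samples central to a local neighborhood dominate the objective.

    Hypothetical weighting scheme -- illustrative only.
    X     : list of original samples (lists of floats)
    X_hat : list of reconstructed samples
    S     : pairwise similarity matrix, S[i][j] >= 0
    """
    n = len(X)
    loss = 0.0
    for i in range(n):
        degree = sum(S[i])  # aggregate similarity of sample i
        err = sum((a - b) ** 2 for a, b in zip(X[i], X_hat[i]))
        loss += degree * err
    return loss / n
```

In the actual method the similarity matrix would come from the self-expression layer and be updated jointly with the reconstruction.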
In this work, we generalize the reaction-diffusion equation in statistical physics, the Schrödinger equation in quantum mechanics, and the Helmholtz equation in paraxial optics into neural partial differential equations (NPDEs), which can be considered fundamental equations in the field of artificial intelligence research.
Disentanglement is defined as the problem of learning a representation that can separate the distinct, informative factors of variation of data.
An important class of SSL methods naturally represents data as graphs so that the label information of unlabelled samples can be inferred from the graphs; these are graph-based semi-supervised learning (GSSL) methods.
Deep Convolutional Neural Networks (DCNNs) and their variants have recently been widely used in large-scale face recognition (FR).
The ResNet and its variants have achieved remarkable successes in various computer vision tasks.
Few-shot learning aims to recognize new classes with few annotated instances within each category.
Convolutional Neural Networks (CNNs) have achieved tremendous success in a number of learning tasks, e.g., image classification.
Word segmentation is a fundamental and indispensable prerequisite for many languages.
Deep neural networks (DNNs) have achieved outstanding performance in a wide range of applications, e.g., image classification and natural language processing.
Interestingly, we discover that some of the rank elements are sensitive and usually aggregate in a narrow region, namely an interest region.
Furthermore, most existing graph-based methods conduct clustering and semi-supervised classification on a graph learned from the original data matrix, which has no explicit cluster structure, so they may not achieve optimal performance.
In this paper, we propose a novel Contextualized code representation learning strategy for commit message Generation (CoreGen).
In this work, we propose a new representation learning method that explicitly models and leverages sample relations, which in turn are used as supervision to guide the representation learning.
Existing MS-UDA algorithms either exploit only the shared features, i.e., the domain-invariant information, or rely on weak assumptions in NLP, e.g., the smoothness assumption.
Many complex network structures have been proposed recently, and many of them concentrate on multi-branch features to achieve high performance.
To this end, we propose the Mutual Information Gradient Estimator (MIGE) for representation learning based on the score estimation of implicit distributions.
Variational autoencoders have been widely applied to natural language generation; however, two long-standing problems remain: information under-representation and posterior collapse.
Deep discriminative models (e.g., deep regression forests and deep neural decision forests) have recently achieved remarkable success on problems such as facial age estimation and head pose estimation.
Leveraging the underlying low-dimensional structure of data, low-rank and sparse modeling approaches have achieved great success in a wide range of applications.
Multi-view clustering is an important approach to analyze multi-view data in an unsupervised way.
Our intrinsic evaluation results demonstrate the high quality of the Sindhi word embeddings generated with SG, CBoW, and GloVe compared to SdfastText word representations.
A plethora of multi-view subspace clustering (MVSC) methods have been proposed over the past few years.
In a worst-case scenario, MPM tries to minimize an upper bound on the misclassification probabilities, considering the global information (i.e., the mean and covariance of each class).
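For background, the worst-case bound referred to here is the classical minimax probability machine formulation (a standard result, sketched with class means $\mu_k$ and covariances $\Sigma_k$, not taken from this paper):

```latex
% For a class with mean \mu and covariance \Sigma, the multivariate
% Chebyshev bound gives the worst-case classification probability:
\inf_{x \sim (\mu,\Sigma)} \Pr\{ w^\top x \ge b \} = \frac{d^2}{1 + d^2},
\qquad
d^2 = \frac{(w^\top \mu - b)^2}{w^\top \Sigma\, w}
\quad (w^\top \mu \ge b).

% Maximizing the worst-case accuracy over both classes then reduces
% to the convex problem
\min_{w}\; \sqrt{w^\top \Sigma_1 w} + \sqrt{w^\top \Sigma_2 w}
\quad \text{s.t.} \quad w^\top(\mu_1 - \mu_2) = 1.
```

Only the means and covariances of the two classes enter the problem, which is what makes MPM distribution-free in the stated sense.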
We propose to exploit an energy function to describe stability and prove that reducing this energy guarantees robustness against adversarial examples.
Most existing methods pay little attention to the quality of the graphs and perform graph learning and spectral clustering separately.
Recurrent neural networks (RNNs) have recently achieved remarkable successes in a number of applications.
Authentication is the task of confirming the correspondence between data instances and personal identities.
By formulating graph construction and kernel learning in a unified framework, the graph and consensus kernel can be iteratively enhanced by each other.
In this paper, we propose a new unsupervised domain adaptation method named Domain-Adversarial Residual-Transfer (DART) learning of Deep Neural Networks to tackle cross-domain image classification tasks.
The proposed model is able to significantly boost the performance of data clustering, semi-supervised classification, and data recovery, primarily due to two key factors: 1) enhanced low-rank recovery by exploiting the graph smoothness assumption, and 2) improved graph construction by exploiting the clean data recovered by robust PCA.
We study the problem of multimodal generative modelling of images based on generative adversarial networks (GANs).
Recently, deep clustering, which is able to perform feature learning that favors clustering tasks via deep neural networks, has achieved remarkable performance in image clustering applications.
Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Unit (GRU) networks, have achieved promising performance in sequential data modeling.
There are two possible reasons for the failure: (i) most existing MKL methods assume that the optimal kernel is a linear combination of base kernels, which may not hold true; and (ii) some kernel weights are inappropriately assigned due to noise and carelessly designed algorithms.
In this paper, we introduce a novel regularization method called Adversarial Noise Layer (ANL) and its efficient version called Class Adversarial Noise Layer (CANL), which are able to significantly improve CNN's generalization ability by adding carefully crafted noise into the intermediate layer activations.
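The idea of injecting crafted, loss-increasing noise into intermediate activations can be sketched as below; this toy uses finite-difference gradients on a stand-in loss, whereas ANL itself works with backpropagated gradients during training, so every name here is illustrative:

```python
import math

def toy_loss(activation):
    """Stand-in for a downstream loss over a small activation vector."""
    return sum(a ** 2 for a in activation)

def adversarial_noise(activation, loss_fn, eps=0.1, h=1e-5):
    """Perturb activations in the loss-increasing (sign-of-gradient)
    direction, with the gradient estimated by finite differences.

    Illustrative sketch only: ANL derives its noise from backpropagated
    gradients inside the network, not from finite differences.
    """
    noisy = []
    for i, a in enumerate(activation):
        bumped = list(activation)
        bumped[i] = a + h
        grad_i = (loss_fn(bumped) - loss_fn(activation)) / h
        noisy.append(a + eps * math.copysign(1.0, grad_i))
    return noisy
```

Training against such perturbed activations is what gives the regularization effect: the network must stay accurate under small, worst-case shifts of its internal features.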
Given the limited GPU DRAM, SuperNeurons not only provisions the necessary memory for training but also dynamically allocates memory for convolution workspaces to achieve high performance.
Recently, deep neural networks (DNNs) have been regarded as the state-of-the-art classification methods in a wide range of applications, especially in image classification.
On three challenging tasks, including Action Recognition in Videos, Image Captioning and Image Generation, BT-RNN outperforms TT-RNN and the standard RNN in terms of both prediction accuracy and convergence rate.
Generating high fidelity identity-preserving faces with different facial attributes has a wide range of applications.
Second, the discrete solution may deviate from the spectral solution, since the k-means method is well known to be sensitive to the initialization of cluster centers.
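The standard mitigation for this sensitivity is careful seeding; below is a minimal sketch of k-means++ initialization (background material, not part of the method described here):

```python
import random

def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: spread the initial centers out by sampling each
    new center with probability proportional to its squared distance to
    the nearest center chosen so far. This reduces (but does not
    eliminate) k-means' sensitivity to initialization.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance of every point to its nearest existing center.
        d2 = [min(sum((p - c) ** 2 for p, c in zip(pt, ct)) for ct in centers)
              for pt in points]
        # Sample the next center proportionally to d2.
        r = rng.random() * sum(d2)
        acc = 0.0
        for pt, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(pt)
                break
    return centers
```

Because already-chosen centers have zero distance to themselves, they can never be selected again, and far-apart clusters are very likely to each receive a seed.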
Inheriting these advantages of stochastic neural sequential models, we propose a structured and stochastic sequential neural network, which models both the long-term dependencies via recurrent neural networks and the uncertainty in the segmentation and labels via discrete random variables.
Probabilistic Temporal Tensor Factorization (PTTF) is an effective algorithm for modeling temporal tensor data.
To unify these two tasks, we present a new sparse Bayesian approach for joint association study and disease diagnosis.
Most existing distance metric learning methods assume perfect side information that is usually given in pairwise or triplet constraints.
In this framework, SVM and TSVM can be regarded as a learning machine without regularization and one with full regularization from the unlabeled data, respectively.
Based on this finding, we present a parameterized subset of similarity functions for choosing the best tail-heaviness for HSSNE; (2) we present a fixed-point optimization algorithm that can be applied to all heavy-tailed functions and does not require the user to set any parameters; and (3) we present two empirical studies: one for unsupervised visualization, showing that our optimization algorithm runs as fast as, and performs as well as, the best known t-SNE implementation; and one for semi-supervised visualization, showing quantitative superiority under the homogeneity measure as well as a qualitative advantage in cluster separation over t-SNE.
We consider the problem of Support Vector Machine transduction, which involves a combinatorial problem with exponential computational complexity in the number of unlabeled examples.