Our numerical experiments on synthetic and real-world data verify that the proposed model can effectively manage data with multi-class nearby but disjoint manifolds of different classes, overlapping manifolds, and manifolds with non-trivial topology.
In this paper we study supervised learning tasks on the space of probability measures.
Applications are demonstrated for clustering of synthetic and real-life time series and image data, and the performance of kdiff is compared to competing distance measures for clustering.
With a novel sub-sampling scheme, StreaMRAK reduces memory and computational complexities by creating a sketch of the original data, where the sub-sampling density is adapted to the bandwidth of the kernel and the local dimensionality of the data.
We propose the use of low bit-depth Sigma-Delta and distributed noise-shaping methods for quantizing the Random Fourier features (RFFs) associated with shift-invariant kernels.
We present Low Distortion Local Eigenmaps (LDLE), a manifold learning technique which constructs a set of low distortion local views of a dataset in lower dimension and registers them to obtain a global embedding.
We introduce a set of novel multiscale basis transforms for signals on graphs that utilize their "dual" domains by incorporating the "natural" distances between graph Laplacian eigenvectors, rather than simply using the eigenvalue ordering.
The transform is defined by computing the optimal transport of each distribution to a fixed reference distribution, and has a number of benefits when it comes to speed of computation and to determining classification boundaries.
We study the approximation of two-layer compositions $f(x) = g(\phi(x))$ via deep networks with ReLU activation, where $\phi$ is a geometrically intuitive, dimensionality reducing feature map.
Stochastic-sampling-based Generative Neural Networks, such as Restricted Boltzmann Machines and Generative Adversarial Networks, are now used for applications such as denoising, image occlusion removal, pattern completion, and motion synthesis.
The recent success of generative adversarial networks and variational learning suggests training a classifier network may work well in addressing the classical two-sample problem.
In a number of situations, collecting a function value for every data point may be prohibitively expensive, and random sampling ignores any structure in the underlying data.
Variational autoencoders (VAEs) and generative adversarial networks (GANs) enjoy an intuitive connection to manifold learning: in training the decoder/generator is optimized to approximate a homeomorphism between the data distribution and the sampling space.
In this paper, we bound the error induced by using a weighted skeletonization of two data sets for computing a two sample test with kernel maximum mean discrepancy.
The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely-many multivariate samples.
We address the problem of defining a network graph on a large collection of classes.
We consider the problem of constructing diffusion operators high dimensional data $X$ to address counterfactual functions $F$, such as individualized treatment effectiveness.
Spectral embedding uses eigenfunctions of the discrete Laplacian on a weighted graph to obtain coordinates for an embedding of an abstract data set into Euclidean space.
We introduce DeepSurv, a Cox proportional hazards deep neural network and state-of-the-art survival method for modeling interactions between a patient's covariates and treatment effectiveness in order to provide personalized treatment recommendations.
In this paper, we build an organization of high-dimensional datasets that cannot be cleanly embedded into a low-dimensional representation due to missing entries and a subset of the features being irrelevant to modeling functions of interest.
In this paper, we propose a manifold learning algorithm based on deep learning to create an encoder, which maps a high-dimensional dataset and its low-dimensional embedding, and a decoder, which takes the embedded data back to the high-dimensional space.