Subsequently, this local information is aligned and propagated to the preserved nodes to alleviate information loss in graph coarsening.
This framework is general to endow arbitrary DNNs for solving linear inverse problems with convergence guarantees.
Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton.
Graph convolutional networks have been a powerful tool in representation learning of networked data.
In this paper, we theoretically derive a bias-free and state/environment-dependent optimal baseline for DR, and analytically show its ability to achieve further variance reduction over the standard constant and state-dependent baselines for DR. We further propose a variance reduced domain randomization (VRDR) approach for policy gradient methods, to strike a tradeoff between the variance reduction and computational complexity in practice.
To address this issue, we introduce the information bottleneck principle and propose the Self-supervised Variational Information Bottleneck (SVIB) learning framework.
To ensure that the learned graph representations are invariant to node permutations, a layer is employed at the input of the networks to reorder the nodes according to their local topology information.
Here bag of instances indicates a set of similar samples constructed by the teacher and are grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to instances in a bag.
Furthermore, each filter in the spectral domain corresponds to a message passing scheme, and diverse schemes are implemented via the filter bank.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets regardless of their taxonomy labels, and followed by fine-tuning the pretrained model over specific dataset as usual.
Existing differentiable neural architecture search approaches simply assume the architectural distribution on each edge is independent of each other, which conflicts with the intrinsic properties of architecture.
In the variational E-step, graph topology is optimized by approximating the posterior probability distribution of the latent adjacency matrix with a neural network learned from node embeddings.
To address this, in this paper, we propose a Bayesian linear regression with informative prior (IP-BLR) operator to leverage the data-dependent prior in the learning process of randomized value function, which can leverage the statistics of training results from previous iterations.
To mitigate the model discrepancy between training and target (testing) environments, domain randomization (DR) can generate plenty of environments with a sufficient diversity by randomly sampling environment parameters in simulator.
However, most existing works naively sum or average all the neighboring features to update node representations, which suffers from the following limitations: (1) lack of interpretability to identify crucial node features for GNN's prediction; (2) over-smoothing issue where repeated averaging aggregates excessive noise, making features of nodes in different classes over-mixed and thus indistinguishable.
We leverage the neural tangent kernel (NTK) theory to prove that our weight mean operation whitens activations and transits network into the chaotic regime like BN layer, and consequently, leads to an enhanced convergence.
In this paper, we propose a novel graph pooling strategy that leverages node proximity to improve the hierarchical representation learning of graph data with their multi-hop topology.
In this paper, we propose a novel paradigm of Spatial-Temporal Transformer Networks (STTNs) that leverages dynamical directed spatial dependencies and long-range temporal dependencies to improve the accuracy of long-term traffic forecasting.
To accelerate DNNs inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations.