TL;DR: CT-Layer is a GNN layer that rewires a graph in an inductive and parameter-free way according to the commute times distance (or effective resistance). We achieve this by learning to compute the CT embedding of the graph in a differentiable way.
CT-Layer learns the commute times distance between nodes (i.e. the effective resistance distance) in a differentiable way, instead of through the usual spectral computation, and in a parameter-free manner, which is not the case for the heat kernel. This lets us pose the computation as an optimization problem inside a GNN, yielding a new layer that learns how to rewire a given graph in an optimal and inductive way.
In addition, CT-Layer learns the commute times embedding itself, and can then compute it for any graph inductively. The commute times embedding is closely related to the eigenvalues and eigenvectors of the graph Laplacian: it is simply the eigenvectors, scaled. CT-Layer can therefore also be seen as learning to compute the spectrum of the Laplacian in a differentiable way, which is why the embedding must satisfy orthogonality and normality constraints.
Finally, recent work has found connections between the commute times distance and curvature (which is non-differentiable), establishing equivalences between them. CT-Layer can therefore also be seen as a differentiable version of curvature-based rewiring.
What follows is a quick overview of the layer; we suggest going to the paper for a detailed explanation.
In the literature (prior to this method), the CT embedding $\mathbf{Z}$ is computed spectrally, or approximated using the heat kernel (which depends heavily on the hyperparameter $t$). Neither approach allows differentiable methods built on this measure: $$ \mathbf{Z}=\sqrt{vol(G)}\mathbf{\Lambda}^{-\frac{1}{2}}\mathbf{F}^T \textrm{ given } \mathbf{L}=\mathbf{F}\mathbf{\Lambda}\mathbf{F}^T $$
The CT distance is then the squared Euclidean distance between the embeddings, $CT_{uv} = \|\mathbf{z}_u-\mathbf{z}_v\|^2$. The spectral form is:
$$ \frac{CT_{uv}}{vol(G)} = \sum_{i=2}^n \frac{1}{\lambda_i} (\mathbf{f}_i(u)-\mathbf{f}_i(v))^2 $$ where $\mathbf{f}_i$ are the eigenvectors of the graph Laplacian.
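The spectral computation above can be sketched numerically. A minimal example on a 4-cycle (our own toy setup, not code from the paper), computing the CT embedding from the Laplacian eigendecomposition and checking that embedding distances match the spectral sum:

```python
import numpy as np

# Toy graph: a 4-cycle. A is the adjacency matrix, D the degree matrix.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))
L = D - A
vol = A.sum()  # vol(G) = sum of degrees

# Spectral decomposition L = F Lambda F^T (eigh returns ascending eigenvalues).
lam, F = np.linalg.eigh(L)

# CT embedding Z = sqrt(vol(G)) Lambda^{-1/2} F^T, dropping the null eigenvalue.
Z = np.sqrt(vol) * np.diag(1.0 / np.sqrt(lam[1:])) @ F[:, 1:].T  # (n-1, n)

def ct_dist(u, v):
    # CT distance = squared Euclidean distance between embedding columns.
    return np.sum((Z[:, u] - Z[:, v]) ** 2)

def ct_spectral(u, v):
    # Spectral form: CT_uv = vol(G) * sum_i (f_i(u) - f_i(v))^2 / lambda_i.
    return vol * np.sum((F[u, 1:] - F[v, 1:]) ** 2 / lam[1:])

assert np.isclose(ct_dist(0, 2), ct_spectral(0, 2))
```

On the 4-cycle, adjacent nodes get $CT = 6$ (effective resistance $3/4$ times $vol(G)=8$) and opposite nodes get $CT = 8$, matching the series/parallel resistance calculation.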
This embedding and its distances capture desirable properties of the graph, such as an understanding of its structure, and a spectrum-based embedding that minimizes Dirichlet energies. However, the spectral computation is not differentiable.
Given that $\mathbf{Z}$ minimizes Dirichlet energies subject to being orthogonal and normalized, we can formulate the problem as constraining neighboring nodes to have similar embeddings subject to $\mathbf{Z}^T\mathbf{Z}=\mathbf{I}$.
$$ \mathbf{Z} = \arg\min_{\mathbf{Z}^T\mathbf{Z}=\mathbf{I}} \frac{\frac{1}{2}\sum_{u,v} \|\mathbf{z}_u-\mathbf{z}_v\|^2\mathbf{A}_{uv}}{\sum_{u} \|\mathbf{z}_u\|^2 d_u}=\frac{Tr[\mathbf{Z}^T\mathbf{L}\mathbf{Z}]}{Tr[\mathbf{Z}^T\mathbf{D}\mathbf{Z}]} $$
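The equality between the pairwise sums and the trace forms can be checked numerically. A small sketch with a random graph and a random embedding (names and sizes are our own, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random symmetric adjacency (no self-loops) and a random node embedding.
n, k = 6, 3
A = rng.integers(0, 2, size=(n, n))
A = np.triu(A, 1)
A = A + A.T
d = A.sum(axis=1)
D = np.diag(d)
L = D - A
Z = rng.standard_normal((n, k))  # rows are node embeddings z_u

# (1/2) * sum_{u,v} ||z_u - z_v||^2 A_uv  ==  Tr[Z^T L Z]
dirichlet = 0.5 * sum(A[u, v] * np.sum((Z[u] - Z[v]) ** 2)
                      for u in range(n) for v in range(n))
assert np.isclose(dirichlet, np.trace(Z.T @ L @ Z))

# sum_u ||z_u||^2 d_u  ==  Tr[Z^T D Z]
norm_term = sum(d[u] * np.sum(Z[u] ** 2) for u in range(n))
assert np.isclose(norm_term, np.trace(Z.T @ D @ Z))
```

The factor $\frac{1}{2}$ appears because the double sum over ordered pairs counts every edge twice.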
With the above elements we can define CT-Layer, our rewiring layer. Given the matrix $\mathbf{X}_{n\times F}$ encoding the node features after any message passing (MP) layer, $\mathbf{Z}_{n\times O(n)}=\tanh(\textrm{MLP}(\mathbf{X}))$ learns the association $\mathbf{X}\rightarrow \mathbf{Z}$ while $\mathbf{Z}$ is optimized according to the loss $$ L_{CT} = \frac{Tr[\mathbf{Z}^T\mathbf{L}\mathbf{Z}]}{Tr[\mathbf{Z}^T\mathbf{D}\mathbf{Z}]} + \left\|\frac{\mathbf{Z}^T\mathbf{Z}}{\|\mathbf{Z}^T\mathbf{Z}\|_F} - \mathbf{I}_n\right\|_F $$ This results in the resistance diffusion $\mathbf{T}^{CT} = \mathbf{R}(\mathbf{Z})\odot \mathbf{A}$ (the Hadamard product between the resistance distances derived from $\mathbf{Z}$ and the adjacency), which provides the subsequent MP layer with a learnt convolution matrix.
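A minimal PyTorch sketch of this loss and the resulting rewiring. This is our own simplified reading, not the official DiffWire implementation; all function and variable names are ours, and the identity size follows the number of embedding columns:

```python
import torch

def ct_layer_loss(Z: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
    # Z: (n, k) learned embedding; A: (n, n) dense adjacency.
    d = A.sum(dim=1)
    D = torch.diag(d)
    L = D - A  # combinatorial graph Laplacian
    # Rayleigh-quotient (Dirichlet) term: Tr[Z^T L Z] / Tr[Z^T D Z]
    rayleigh = torch.trace(Z.T @ L @ Z) / torch.trace(Z.T @ D @ Z)
    # Orthogonality term: push Z^T Z (Frobenius-normalized) toward the identity.
    ZtZ = Z.T @ Z
    ortho = torch.norm(ZtZ / torch.norm(ZtZ) - torch.eye(ZtZ.shape[0]))
    return rayleigh + ortho

def rewire(Z: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
    # Resistance diffusion T^{CT} = R(Z) ⊙ A: pairwise squared distances
    # between the learned embeddings, masked by the original adjacency.
    R = torch.cdist(Z, Z) ** 2
    return R * A

# Usage: Z comes from tanh(MLP(X)) as in the layer definition.
torch.manual_seed(0)
A = torch.tensor([[0., 1., 1.], [1., 0., 1.], [1., 1., 0.]])
X = torch.randn(3, 8)
Z = torch.tanh(torch.nn.Linear(8, 3)(X))
loss = ct_layer_loss(Z, A)   # differentiable, so it can train the MLP
T_ct = rewire(Z, A)          # learnt convolution matrix for the next MP layer
```

Because both terms are plain tensor operations, the loss backpropagates through the MLP, which is the point of the differentiable formulation.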
As explained above, $\mathbf{Z}$ is the commute times embedding matrix, and the pairwise Euclidean distances between the learned embeddings are the commute times distances (resistance distances). Once trained, the layer can therefore compute the commute times embedding of a new, unseen graph and rewire it in a principled way based on the commute times distance.
Does this rewiring preserve the original structure? Let $G' = \textrm{Sparsify}(G, q)$ be a sampling algorithm on the graph $G = (V, E)$, where edges $e \in E$ are sampled with probability $q\propto R_e$ (proportional to the effective resistance, i.e. commute times). Then, for $n = |V|$ sufficiently large and $1/\sqrt{n}< \epsilon\le 1$, $O(n\log n/\epsilon^2)$ samples suffice to satisfy:
$$ \forall \mathbf{x}\in\mathbb{R}^n:\; (1-\epsilon)\mathbf{x}^T\mathbf{L}_G\mathbf{x}\le\mathbf{x}^T\mathbf{L}_{G'}\mathbf{x}\le (1+\epsilon)\mathbf{x}^T\mathbf{L}_G\mathbf{x} $$
The intuition is that Dirichlet energies in $G'$ are within a $(1\pm \epsilon)$ factor of the Dirichlet energies of the original graph $G$.
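The resistance-proportional sampling idea can be illustrated with a small sketch. This is a deliberately simplified version of spectral sparsification (helper names are ours): sample edges with probability proportional to their effective resistance and reweight each sample by $1/(m \cdot p_e)$ so the sparsified Laplacian is unbiased:

```python
import numpy as np

rng = np.random.default_rng(0)

def effective_resistances(A):
    # R_uv = (e_u - e_v)^T L^+ (e_u - e_v), via the Laplacian pseudoinverse.
    n = len(A)
    L = np.diag(A.sum(axis=1)) - A
    Lp = np.linalg.pinv(L)
    edges = [(u, v) for u in range(n) for v in range(u + 1, n) if A[u, v]]
    R = np.array([Lp[u, u] + Lp[v, v] - 2 * Lp[u, v] for u, v in edges])
    return edges, R

def sparsify(A, n_samples):
    # Sample edges with replacement, p_e proportional to R_e, and reweight
    # by 1/(n_samples * p_e) so the expected adjacency matches the original.
    edges, R = effective_resistances(A)
    p = R / R.sum()
    A_new = np.zeros_like(A, dtype=float)
    for idx in rng.choice(len(edges), size=n_samples, p=p):
        u, v = edges[idx]
        w = A[u, v] / (n_samples * p[idx])
        A_new[u, v] += w
        A_new[v, u] += w
    return A_new
```

With enough samples the quadratic forms $\mathbf{x}^T\mathbf{L}_{G'}\mathbf{x}$ concentrate around $\mathbf{x}^T\mathbf{L}_G\mathbf{x}$, which is the guarantee stated above.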
Source: DiffWire: Inductive Graph Rewiring via the Lovász Bound