Commute Times Layer

Introduced by Arnaiz-Rodriguez et al. in DiffWire: Inductive Graph Rewiring via the Lovász Bound

TL;DR: CT-Layer is a GNN layer that rewires a graph in an inductive and parameter-free way according to the commute times distance (or effective resistance). We achieve this by learning to compute the CT embedding of the graph in a differentiable way.

Summary

CT-Layer learns the commute times distance between nodes (i.e., the effective resistance distance) in a differentiable way, instead of the usual spectral computation, and in a parameter-free manner, which is not the case for the heat kernel. This allows us to pose the computation as an optimization problem inside a GNN, yielding a new layer that learns how to rewire a given graph in an optimal and inductive way.

In addition, CT-Layer learns commute times embeddings, and can then compute them for any graph inductively. The commute times embedding is closely related to the eigenvalues and eigenvectors of the graph Laplacian, since the CT embedding is simply the eigenvectors rescaled by the eigenvalues. CT-Layer therefore also learns to compute the spectrum of the Laplacian in a differentiable way, which is why the embedding must satisfy orthogonality and normality.

Finally, recent work has found connections between the commute times distance and curvature (which is non-differentiable), establishing equivalences between them. CT-Layer can therefore also be seen as a differentiable version of curvature-based rewiring.

We give a quick overview of the layer below, but we suggest reading the paper for a detailed explanation.

Downsides of the Spectral CT-Embedding

In the literature (prior to this method), the CT embedding $\mathbf{Z}$ is computed spectrally, or approximated using the heat kernel (which is very dependent on the hyperparameter $t$). Neither approach admits a differentiable method based on this measure: $$ \mathbf{Z}=\sqrt{vol(G)}\,\mathbf{\Lambda}^{-\frac{1}{2}}\mathbf{F}^T \textrm{ given } \mathbf{L}=\mathbf{F}\mathbf{\Lambda}\mathbf{F}^T $$

The CT distance is then given by the squared Euclidean distance between the embeddings, $CT_{uv} = ||\mathbf{z}_u-\mathbf{z}_v||^2$. The spectral form is:

$$ \frac{CT_{uv}}{vol(G)} = \sum_{i=2}^n \frac{1}{\lambda_i} (\mathbf{f}_i(u)-\mathbf{f}_i(v))^2 $$ where the $\mathbf{f}_i$ are the eigenvectors of the graph Laplacian.

This embedding and its distances capture desirable properties of the graph, such as an understanding of its structure, and give an embedding based on the spectrum that minimizes Dirichlet energies. However, the spectral computation is not differentiable.
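For concreteness, here is a minimal NumPy sketch of this spectral (non-differentiable) computation, assuming a connected, unweighted graph given by its adjacency matrix; the function name `spectral_ct_embedding` is our own illustrative choice:

```python
import numpy as np

def spectral_ct_embedding(A: np.ndarray) -> np.ndarray:
    """Rows z_u of the returned Z satisfy CT_uv = ||z_u - z_v||^2."""
    d = A.sum(axis=1)
    vol = d.sum()                            # vol(G): sum of degrees
    L = np.diag(d) - A                       # unnormalized graph Laplacian
    lam, F = np.linalg.eigh(L)               # L = F diag(lam) F^T, lam ascending
    lam, F = lam[1:], F[:, 1:]               # drop the trivial eigenpair (lambda_1 = 0)
    return np.sqrt(vol) * F / np.sqrt(lam)   # scale each eigenvector by 1/sqrt(lambda_i)
```

The full eigendecomposition is exactly the step that prevents end-to-end training and inductive use, which motivates the optimization view below.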

CT-Layer as an optimization problem: a differentiable, learnable and inductive layer

Given that $\mathbf{Z}$ minimizes Dirichlet energies subject to being orthogonal and normalized, we can formulate the problem as constraining neighboring nodes to have similar embeddings subject to $\mathbf{Z}^T\mathbf{Z}=\mathbf{I}$:

$$ \mathbf{Z} = \arg\min_{\mathbf{Z}^T\mathbf{Z}=\mathbf{I}} \frac{\sum_{u,v} ||\mathbf{z}_u-\mathbf{z}_v||^2\mathbf{A}_{uv}}{\sum_{u} ||\mathbf{z}_u||^2 d_u}=\frac{Tr[\mathbf{Z}^T\mathbf{L}\mathbf{Z}]}{Tr[\mathbf{Z}^T\mathbf{D}\mathbf{Z}]} $$

With the above elements we can define CT-Layer, our rewiring layer. Given the matrix $\mathbf{X}_{n\times F}$ encoding the node features after any message passing (MP) layer, $\mathbf{Z}_{n\times O(n)}=\tanh(\textrm{MLP}(\mathbf{X}))$ learns the mapping $\mathbf{X}\rightarrow \mathbf{Z}$ while $\mathbf{Z}$ is optimized according to the loss $$ L_{CT} = \frac{Tr[\mathbf{Z}^T\mathbf{L}\mathbf{Z}]}{Tr[\mathbf{Z}^T\mathbf{D}\mathbf{Z}]} + \left\|\frac{\mathbf{Z}^T\mathbf{Z}}{\|\mathbf{Z}^T\mathbf{Z}\|_F} - \mathbf{I}_n\right\|_F $$ This results in the resistance diffusion $\mathbf{T}^{CT} = \mathbf{R}(\mathbf{Z})\odot \mathbf{A}$ (the Hadamard product between the resistance distances derived from $\mathbf{Z}$ and the adjacency matrix), which provides the subsequent MP layer with a learnt convolution matrix.
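A minimal PyTorch sketch of such a layer, assuming a dense adjacency matrix; the class name `CTLayer`, the MLP architecture and the `hidden_dim` default are our own illustrative choices, not the authors' implementation:

```python
import torch
import torch.nn as nn

class CTLayer(nn.Module):
    def __init__(self, in_dim: int, emb_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, emb_dim),
        )

    def forward(self, X: torch.Tensor, A: torch.Tensor):
        Z = torch.tanh(self.mlp(X))                  # learned CT embedding, n x emb_dim
        d = A.sum(dim=1)
        L = torch.diag(d) - A                        # unnormalized graph Laplacian
        # L_CT: Dirichlet trace ratio + orthogonality penalty, as in the loss above
        dirichlet = torch.trace(Z.T @ L @ Z) / torch.trace(Z.T @ torch.diag(d) @ Z)
        ZtZ = Z.T @ Z
        ortho = torch.norm(ZtZ / torch.norm(ZtZ) - torch.eye(Z.shape[1], device=Z.device))
        loss_ct = dirichlet + ortho
        # Effective resistances R_uv = ||z_u - z_v||^2 / vol(G), masked by the adjacency
        R = torch.cdist(Z, Z) ** 2 / d.sum()
        T_ct = R * A                                 # Hadamard product: resistance diffusion
        return T_ct, loss_ct
```

Everything here is standard differentiable tensor algebra, so $L_{CT}$ can simply be added to the downstream task loss and trained end to end.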

As explained above, $\mathbf{Z}$ is the commute times embedding matrix, and the pairwise squared Euclidean distances between the learned embeddings are the commute times (or resistance) distances. Therefore, once this layer is trained, it can compute the commute times embedding of a new graph and rewire that new, unseen graph in a principled way based on the commute times distance, as in the sketch below.
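Hypothetical usage, continuing the sketch above: after training with the task loss plus $L_{CT}$, the same frozen layer rewires a graph it has never seen (all sizes and tensors below are made up for illustration):

```python
import torch

ct = CTLayer(in_dim=16, emb_dim=32)
# ... train with task loss + loss_ct on the training graphs ...
X_new = torch.randn(10, 16)                           # features of an unseen graph
A_up = torch.triu((torch.rand(10, 10) > 0.5).float(), 1)
A_new = A_up + A_up.T                                 # symmetric adjacency
with torch.no_grad():
    T_new, _ = ct(X_new, A_new)                       # rewired convolution matrix
```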

Preservation of Structure

Does this rewiring preserve the original structure? Let $G' = \textrm{Sparsify}(G, q)$ be a sampling algorithm over graph $G = (V, E)$, where edges $e \in E$ are sampled with probability $q\propto R_e$ (proportional to the effective resistance, i.e. commute times). Then, for $n = |V|$ sufficiently large and $1/\sqrt{n}< \epsilon\le 1$, we need $O(n\log n/\epsilon^2)$ samples to satisfy:

$$ \forall \mathbf{x}\in\mathbb{R}^n:\; (1-\epsilon)\mathbf{x}^T\mathbf{L}_G\mathbf{x}\le\mathbf{x}^T\mathbf{L}_{G'}\mathbf{x}\le (1+\epsilon)\mathbf{x}^T\mathbf{L}_G\mathbf{x} $$

The intuition is that the Dirichlet energies in $G'$ are bounded within $(1\pm \epsilon)$ times the Dirichlet energies of the original graph $G$.
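An illustrative sketch of this resistance-proportional sampling (in the spirit of spectral sparsification), reusing `spectral_ct_embedding` from the earlier sketch; the reweighting scheme shown is one standard choice that keeps the sparsified Laplacian unbiased, not necessarily the paper's exact procedure:

```python
import numpy as np

def sparsify(A: np.ndarray, num_samples: int, rng=None) -> np.ndarray:
    rng = rng or np.random.default_rng()
    edges = np.argwhere(np.triu(A) > 0)              # undirected edge list (u < v)
    Z = spectral_ct_embedding(A)                     # defined in the earlier sketch
    # Effective resistance R_e = CT_uv / vol(G) for each edge e = (u, v)
    R = np.array([np.sum((Z[u] - Z[v]) ** 2) for u, v in edges]) / A.sum()
    p = R / R.sum()                                  # q proportional to R_e
    idx = rng.choice(len(edges), size=num_samples, p=p)
    A_sparse = np.zeros_like(A, dtype=float)
    for i in idx:                                    # reweight each draw by 1/(q_e * m)
        u, v = edges[i]                              # so that E[L_{G'}] = L_G
        w = 1.0 / (p[i] * num_samples)
        A_sparse[u, v] += w
        A_sparse[v, u] += w
    return A_sparse
```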

Source: DiffWire: Inductive Graph Rewiring via the Lovász Bound

Tasks

| Task | Papers | Share |
| --- | --- | --- |
| Link Prediction | 1 | 33.33% |
| Graph Classification | 1 | 33.33% |
| Node Classification | 1 | 33.33% |
