How to Turn Your Knowledge Graph Embeddings into Generative Models

Some of the most successful knowledge graph embedding (KGE) models for link prediction -- CP, RESCAL, TuckER, ComplEx -- can be interpreted as energy-based models. Under this perspective, they are not amenable to exact maximum-likelihood estimation (MLE) or sampling, and they struggle to integrate logical constraints. This work re-interprets the score functions of these KGEs as circuits -- constrained computational graphs allowing efficient marginalisation. Then, we design two recipes to obtain efficient generative circuit models by either restricting their activations to be non-negative or squaring their outputs. Our interpretation comes with little or no loss of performance for link prediction, while the circuits framework unlocks exact learning by MLE, efficient sampling of new triples, and a guarantee that logical constraints are satisfied by design. Furthermore, our models scale more gracefully than the original KGEs on graphs with millions of entities.
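
The two recipes can be illustrated on a toy CP-style score function. The sketch below is not the authors' implementation: the array names, the tiny sizes, and the use of exp to obtain non-negative parameters are all assumptions made for illustration. It shows why both recipes allow the normalisation constant over all triples to be computed without enumerating them, and checks each closed form against brute-force enumeration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, rank = 5, 3, 4  # toy sizes (assumed)

# Hypothetical CP-style parameters: subject, relation and object embeddings.
S = rng.normal(size=(n_entities, rank))
R = rng.normal(size=(n_relations, rank))
O = rng.normal(size=(n_entities, rank))

def cp_scores(S, R, O):
    """Scores phi(s, r, o) = sum_k S[s, k] * R[r, k] * O[o, k] for every triple."""
    return np.einsum("sk,rk,ok->sro", S, R, O)

# Recipe 1 (non-negative restriction): with non-negative parameters the score is
# already an unnormalised probability, and the sum over all triples factorises
# into per-dimension column sums.
S_pos, R_pos, O_pos = np.exp(S), np.exp(R), np.exp(O)  # exp is one assumed choice
z_nonneg = np.sum(S_pos.sum(axis=0) * R_pos.sum(axis=0) * O_pos.sum(axis=0))
z_nonneg_brute = cp_scores(S_pos, R_pos, O_pos).sum()

# Recipe 2 (squaring): parameters stay real-valued; the squared score sums to an
# elementwise product of rank-by-rank Gram matrices.
z_squared = np.sum((S.T @ S) * (R.T @ R) * (O.T @ O))
z_squared_brute = (cp_scores(S, R, O) ** 2).sum()

# Normalised distribution over all triples under the squared recipe.
p_squared = cp_scores(S, R, O) ** 2 / z_squared
```

In this sketch the closed forms cost time linear in the number of entities and relations (and at most quadratic in the rank), whereas brute-force enumeration grows with the number of possible triples.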

NeurIPS 2023

Results from the Paper


Task: Link Property Prediction
Dataset: ogbl-biokg
Model: ComplEx^2

Metric Name         Metric Value       Global Rank
Test MRR            0.8583 ± 0.0005    # 3
Validation MRR      0.8592 ± 0.0004    # 3
Number of params    187648000          # 6
Ext. data           No                 # 1

Methods