Self-Adversarial Negative Sampling is a negative sampling technique used for methods like word embeddings and knowledge graph embeddings. The traditional negative sampling loss from word2vec, adapted to optimizing distance-based models, can be written as:
$$ L = -\log\sigma\left(\gamma - d_{r}\left(\mathbf{h}, \mathbf{t}\right)\right) - \sum^{n}_{i=1}\frac{1}{n}\log\sigma\left(d_{r}\left(\mathbf{h}^{'}_{i}, \mathbf{t}^{'}_{i}\right) - \gamma\right) $$
where $\gamma$ is a fixed margin, $\sigma$ is the sigmoid function, $d_r$ is the model's distance function for relation $r$, and $\left(\mathbf{h}^{'}_{i}, r, \mathbf{t}^{'}_{i}\right)$ is the $i$-th negative triplet.
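As a concrete illustration, the uniform loss above can be sketched in NumPy. The TransE-style distance $d_r(\mathbf{h}, \mathbf{t}) = \lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert$, the margin value, and the array shapes below are illustrative assumptions, not part of the original formulation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def distance(h, r, t):
    # TransE-style distance d_r(h, t) = ||h + r - t||; an illustrative choice
    return np.linalg.norm(h + r - t, axis=-1)

def uniform_negative_sampling_loss(h, r, t, neg_h, neg_t, gamma=12.0):
    """Margin-based negative sampling loss with uniform 1/n weights.

    h, r, t: embeddings of the positive triplet, shape (d,)
    neg_h, neg_t: embeddings of n negative head/tail entities, shape (n, d)
    """
    pos_term = -np.log(sigmoid(gamma - distance(h, r, t)))
    neg_dist = distance(neg_h, r, neg_t)                    # shape (n,)
    neg_term = -np.mean(np.log(sigmoid(neg_dist - gamma)))  # uniform 1/n weight
    return pos_term + neg_term
```

Each of the $n$ negatives contributes equally here (the `np.mean` implements the uniform $1/n$ weighting), which is exactly the property the self-adversarial scheme below replaces.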
This loss samples negative triplets uniformly. Uniform negative sampling becomes inefficient as training progresses: many sampled triplets are obviously false and therefore provide no meaningful training signal. The authors therefore propose self-adversarial negative sampling, which samples negative triplets according to the current embedding model. Specifically, negative triplets are sampled from the following distribution:
$$ p\left(h^{'}_{j}, r, t^{'}_{j} \mid \left\{\left(h_{i}, r_{i}, t_{i}\right)\right\}\right) = \frac{\exp \alpha f_{r}\left(\mathbf{h}^{'}_{j}, \mathbf{t}^{'}_{j}\right)}{\sum_{i} \exp \alpha f_{r}\left(\mathbf{h}^{'}_{i}, \mathbf{t}^{'}_{i}\right)} $$
where $\alpha$ is the temperature of sampling and $f_r$ is the model's scoring function. Moreover, since this sampling procedure may be costly, the authors instead treat the above probability as the weight of each negative sample. The final negative sampling loss with self-adversarial training therefore takes the following form:
$$ L = -\log\sigma\left(\gamma - d_{r}\left(\mathbf{h}, \mathbf{t}\right)\right) - \sum^{n}_{i=1}p\left(h^{'}_{i}, r, t^{'}_{i}\right)\log\sigma\left(d_{r}\left(\mathbf{h}^{'}_{i}, \mathbf{t}^{'}_{i}\right) - \gamma\right) $$
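A minimal sketch of the weighted loss, assuming the scoring function is $f_r = -d_r$ (so that harder negatives, i.e. those with smaller distance, receive larger weight) and that no gradient flows through the weights:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_adversarial_loss(pos_dist, neg_dist, gamma=12.0, alpha=1.0):
    """Self-adversarial negative sampling loss.

    pos_dist: scalar distance d_r(h, t) of the positive triplet
    neg_dist: shape (n,) distances of the n negative triplets
    alpha: sampling temperature
    """
    # p(h'_i, r, t'_i): softmax over alpha * f_r, with f_r = -d_r (assumption)
    scores = alpha * (-neg_dist)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # The weights are treated as constants: in an autodiff framework,
    # detach / stop_gradient them before computing the loss.
    pos_term = -np.log(sigmoid(gamma - pos_dist))
    neg_term = -np.sum(weights * np.log(sigmoid(neg_dist - gamma)))
    return pos_term + neg_term
```

Raising `alpha` concentrates the weight on the hardest negatives, while `alpha = 0` makes the weights uniform and recovers the $1/n$ weighting of the original loss.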
Source: RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space
Task | Papers | Share |
---|---|---|
Graph Embedding | 19 | 17.43% |
Knowledge Graph Embedding | 18 | 16.51% |
Link Prediction | 13 | 11.93% |
Knowledge Graphs | 12 | 11.01% |
Knowledge Graph Completion | 11 | 10.09% |
Entity Embeddings | 4 | 3.67% |
Knowledge Graph Embeddings | 3 | 2.75% |
Translation | 3 | 2.75% |
Reinforcement Learning (RL) | 2 | 1.83% |