[Re] Improving Multi-hop Question Answering over Knowledge Graphs using Knowledge Base Embeddings

RC 2020  ·  Jishnu Jaykumar P, Ashish Sardana

Scope of Reproducibility
Our work consists of four parts:
1. Reproducing the results from [1].
2. Exploring the effect of various knowledge graph embedding models in the Knowledge Graph Embedding module.
3. Exploring the effect of various transformer models in the Question Embedding module.
4. Verifying the importance of the Relation Matching (RM) module.
Based on the code shared by the authors, we have reproduced the results for EmbedKGQA[1]. We deliberately omitted relation matching in order to validate point 4.
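At the core of EmbedKGQA[1] is an answer-selection step that combines the knowledge graph embedding of the question's topic entity with the question embedding and scores every candidate entity. A minimal sketch of that step, assuming ComplEx-style complex-valued embeddings as in the original paper (the function and variable names here are our own, not from the codebase):

```python
import numpy as np

def complex_score(e_head: np.ndarray, e_question: np.ndarray,
                  e_answer: np.ndarray) -> float:
    """ComplEx-style triple score Re(<e_h, e_q, conj(e_a)>) for one candidate."""
    return float(np.real(np.sum(e_head * e_question * np.conj(e_answer))))

def rank_answers(e_head: np.ndarray, e_question: np.ndarray,
                 candidate_embeddings: list) -> np.ndarray:
    """Score every candidate entity and return indices sorted best-first."""
    scores = np.array([complex_score(e_head, e_question, e)
                       for e in candidate_embeddings])
    return np.argsort(-scores)
```

The highest-scoring entity is returned as the answer; the Relation Matching module (point 4 above) would further re-rank these candidates.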

Methodology
We used the code provided by [1], with some customization for reproducibility. In addition to making the codebase more modular and easier to navigate, we made changes to incorporate different transformers in the question embedding module. The question-answering models were trained from scratch, as no pre-trained models were available for our particular dataset. The code for this work is available on GitHub (see page footer for the link).
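To swap transformers in the question embedding module, the module only needs to expose a fixed-size vector per question, so the encoder can sit behind a small interface. A sketch of such an interface, with a toy stand-in encoder (this interface is our illustration, not the actual codebase; in practice the slot would hold RoBERTa, SBERT, etc.):

```python
from abc import ABC, abstractmethod
from typing import List

class QuestionEncoder(ABC):
    """Anything that maps a natural-language question to a fixed-size vector."""
    @abstractmethod
    def encode(self, question: str) -> List[float]: ...

class ToyMeanPoolEncoder(QuestionEncoder):
    """Stand-in: hashes tokens to toy vectors and mean-pools them.
    A real implementation would wrap a transformer model instead."""
    def __init__(self, dim: int = 4):
        self.dim = dim

    def encode(self, question: str) -> List[float]:
        tokens = question.lower().split()
        vecs = [[((hash(t) >> s) % 100) / 100.0 for s in range(self.dim)]
                for t in tokens]
        return [sum(col) / len(vecs) for col in zip(*vecs)]
```

With this shape, trying a new transformer means adding one `QuestionEncoder` subclass rather than touching the QA pipeline itself.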

Results
We were able to reproduce Hits@1 to within ±2.4% of the reported values in most cases. Anomalies were observed in two cases:
1. the MetaQA-KG-Full (3-hop) dataset, and
2. the WebQSP-KG-Full dataset.
From our experiments on the QA model, we found that a recent transformer architecture, SBERT[2], produced better accuracy than the model used in the original paper: replacing RoBERTa[3] with SBERT increased absolute accuracy by ≈3.4% in the half-KG case and ≈0.6% in the full-KG case. (KG: knowledge graph; "≈": approximately)
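The Hits@1 metric used above is simply the fraction of test questions for which the model's top-ranked prediction is a correct answer. A minimal sketch (function name is ours):

```python
def hits_at_k(ranked_predictions, gold_answers, k=1):
    """Fraction of questions whose top-k ranked predictions contain a gold answer.

    ranked_predictions: per question, a list of entities sorted best-first.
    gold_answers: per question, the set of correct entities.
    """
    hits = sum(
        1 for ranked, gold in zip(ranked_predictions, gold_answers)
        if any(answer in gold for answer in ranked[:k])
    )
    return hits / len(gold_answers)
```

With k=1 this is the Hits@1 figure reported throughout; larger k gives the usual Hits@k family.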

What was easy
As the code was open-sourced, we did not have to implement the paper from scratch, which gave us the liberty to customize the codebase, focus on validating the authors' claims, perform extended experiments, and explore both the shared models and new ones. In addition, pre-trained KG embedding models were shared, which helped in the reproduction experiment.

What was difficult
The lack of comprehensive documentation, along with missing comments for functions, classes, attributes, etc., made it laborious to review and modify the code. In addition to the long training times for the question-answering models, the knowledge graph embeddings also required a significant amount of computing resources.

Communication with original authors
We had a couple of virtual meetings with Apoorv Saxena, the primary author of EmbedKGQA[1].
