Cross-Lingual GenQA: Open-Domain Question Answering with Answer Sentence Generation

14 Oct 2021  ·  Benjamin Muller, Luca Soldaini, Rik Koncel-Kedziorski, Eric Lind, Alessandro Moschitti

Recent approaches for question answering systems have achieved impressive performance on English by combining document-level retrieval with answer generation. These approaches, which we refer to as GenQA, are able to generate full sentences, effectively answering both factoid and non-factoid questions. In this paper, we extend GenQA beyond English and present the first Cross-Lingual answer sentence generation system (CrossGenQA). Our system produces natural, full-sentence answers to questions in several languages by exploiting passages written in multiple other languages. To foster further development on this topic, we introduce GenTyDiQA, an extension of the TyDiQA dataset with well-formed and complete answers for Arabic, Bengali, English, Japanese, and Russian questions. Using GenTyDiQA, we show that multi-language models outperform monolingual GenQA in the four non-English languages; for three of them, our CrossGenQA system achieves the best results.
