Exploring the Practicality of Generative Retrieval on Dynamic Corpora

27 May 2023  ·  Soyoung Yoon, Chaeeun Kim, Hyunji Lee, Joel Jang, Sohee Yang, Minjoon Seo

Benchmarking of information retrieval (IR) methods is mostly conducted with a fixed set of documents (static corpora); in realistic scenarios, this is rarely the case, and the documents to be retrieved are constantly updated and added. In this paper, we focus on conducting a comprehensive comparison between two categories of contemporary retrieval systems, Dual Encoders (DE) and Generative Retrieval (GR), in a dynamic scenario where the corpus to be retrieved is updated. We also conduct an extensive evaluation of computational and memory efficiency, crucial factors for real-world deployment of IR systems. Our results demonstrate that GR is more adaptable to evolving knowledge (+13-18% on the StreamingQA benchmark), robust in handling data with temporal information (10x), and efficient in terms of memory (4x), indexing time (6x), and inference FLOPs (10x). Our paper highlights GR's potential for future use in practical IR systems.
