no code implementations • 8 Nov 2023 • Lukas Gienapp, Harrisen Scells, Niklas Deckers, Janek Bevendorff, Shuai Wang, Johannes Kiesel, Shahbaz Syed, Maik Fröbe, Guido Zuccon, Benno Stein, Matthias Hagen, Martin Potthast
Recent advances in large language models have enabled the development of viable generative information retrieval systems.
2 code implementations • 2 Apr 2023 • Jan Heinrich Reimer, Sebastian Schmidt, Maik Fröbe, Lukas Gienapp, Harrisen Scells, Benno Stein, Matthias Hagen, Martin Potthast
The Archive Query Log (AQL) is a previously unused, comprehensive query log collected at the Internet Archive over the last 25 years.
no code implementations • 4 Nov 2022 • Janek Bevendorff, Philipp Sauer, Lukas Gienapp, Wolfgang Kircheis, Erik Körner, Benno Stein, Martin Potthast
The rapidly growing volume of scientific publications offers an interesting challenge for research on methods for analyzing the authorship of documents with one or more authors.
1 code implementation • 10 Jul 2022 • Lukas Gienapp, Maik Fröbe, Matthias Hagen, Martin Potthast
Pairwise re-ranking models predict which of two documents is more relevant to a query and then aggregate a final ranking from such preferences.
1 code implementation • 4 Feb 2022 • Christopher Akiki, Lukas Gienapp, Martin Potthast
This technical report documents our efforts in addressing the tasks set forth by the 2021 AMoC (Advanced Modelling of Cyber Criminal Careers) Hackathon.
1 code implementation • 22 Dec 2021 • Lukas Gienapp, Wolfgang Kircheis, Bjarne Sievers, Benno Stein, Martin Potthast
We present the Webis-STEREO-21 dataset, a massive collection of Scientific Text Reuse in Open-access publications.
no code implementations • 21 Nov 2021 • Maik Fröbe, Matthias Hagen, Janek Bevendorff, Michael Völske, Benno Stein, Christopher Schröder, Robby Wagner, Lukas Gienapp, Martin Potthast
Commercial web search engines employ near-duplicate detection to ensure that users see each relevant result only once, even though the underlying web crawls typically include (near-)duplicates of many web pages.
no code implementations • ACL 2020 • Lukas Gienapp, Benno Stein, Matthias Hagen, Martin Potthast
We present an efficient annotation framework for argument quality, a feature that previous work has found difficult to measure reliably.