no code implementations • INLG (ACL) 2020 • Shahbaz Syed, Wei-Fan Chen, Matthias Hagen, Benno Stein, Henning Wachsmuth, Martin Potthast
We propose a shared task on abstractive snippet generation for web pages, a novel task of generating query-biased abstractive summaries for documents that are to be shown on a search results page.
no code implementations • EMNLP (ArgMining) 2021 • Johannes Kiesel, Nico Reichenbach, Benno Stein, Martin Potthast
Many forms of argumentation employ images as persuasive means, but research in argument mining has been focused on verbal argumentation so far.
1 code implementation • EMNLP 2021 • Erik Körner, Gregor Wiedemann, Ahmad Dawar Hakimi, Gerhard Heyer, Martin Potthast
To ease the difficulty of argument stance classification, the task of same side stance classification (S3C) has been proposed.
1 code implementation • In2Writing (ACL) 2022 • Matti Wiegmann, Michael Völske, Benno Stein, Martin Potthast
Context-sensitive word search engines are writing assistants that support word choice, phrasing, and idiomatic language use by indexing large-scale n-gram collections and implementing a wildcard search.
1 code implementation • COLING 2022 • Alexander Bondarenko, Magdalena Wolska, Stefan Heindorf, Lukas Blübaum, Axel-Cyrille Ngonga Ngomo, Benno Stein, Pavel Braslavski, Matthias Hagen, Martin Potthast
At least 5% of questions submitted to search engines ask about cause-effect relationships in some way.
1 code implementation • COLING 2022 • Ferdinand Schlatt, Dieter Bettin, Matthias Hagen, Benno Stein, Martin Potthast
An efficient assessment of the health relatedness of text passages is important to mine the web at scale to conduct health sociological analyses or to develop a health search engine.
1 code implementation • Findings (EMNLP) 2021 • Erik Körner, Ahmad Dawar Hakimi, Gerhard Heyer, Martin Potthast
We introduce and study a problem variant of sentiment analysis, namely the “same sentiment classification problem”, where, given a pair of texts, the task is to determine if they have the same sentiment, disregarding the actual sentiment polarity.
1 code implementation • 5 Apr 2025 • Xinyu Mao, Teerapong Leelanupab, Martin Potthast, Harrisen Scells, Guido Zuccon
However, no tool currently allows end users to directly leverage LLMs for screening or facilitates systematic and transparent usage of LLM-assisted screening methods.
1 code implementation • 19 Aug 2024 • Sebastian Heineking, Jonas Probst, Daniel Steinbach, Martin Potthast, Harrisen Scells
In a user study, our method correlates with the preferences of a human expert (Kendall's $\tau=0.64$).
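As a reminder of what a Kendall's $\tau$ value expresses, the following is a minimal sketch of the tau-a variant, which counts concordant and discordant pairs between two score lists. Which variant the paper uses is not stated in this snippet, so the choice of tau-a, the function name `kendall_tau_a`, and the toy scores are assumptions for illustration only.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Hypothetical helper for illustration; ties count as neither
    concordant nor discordant.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells whether both lists order the pair alike.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Toy example: method scores vs. expert scores for four items (made-up data).
method_scores = [1, 2, 3, 4]
expert_scores = [1, 2, 4, 3]
tau = kendall_tau_a(method_scores, expert_scores)
```

A value of 1 means both lists order every pair the same way, and -1 means they disagree on every pair.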
no code implementations • 31 Jul 2024 • Lukas Gienapp, Niklas Deckers, Martin Potthast, Harrisen Scells
Representation-based retrieval models, so-called biencoders, estimate the relevance of a document to a query by calculating the similarity of their respective embeddings.
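The scoring step described above can be sketched as follows. This is a minimal illustration, assuming that embeddings are already available as plain lists of floats and that cosine similarity is the similarity function; the names `cosine_similarity` and `rank_documents` and the toy vectors are hypothetical and not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_documents(query_embedding, doc_embeddings):
    """Return document ids sorted by descending similarity to the query."""
    scores = {
        doc_id: cosine_similarity(query_embedding, emb)
        for doc_id, emb in doc_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy 2-dimensional embeddings (made-up values).
query = [1.0, 0.0]
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_documents(query, docs)
```

Because query and documents are embedded independently, document embeddings can be computed once and reused across queries.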
1 code implementation • 10 Jul 2024 • Nandan Thakur, Luiz Bonifacio, Maik Fröbe, Alexander Bondarenko, Ehsan Kamalloo, Martin Potthast, Matthias Hagen, Jimmy Lin
Our black-box evaluation reveals an inherent bias of neural models towards retrieving short passages from the Touché 2020 data, and we also find that quite a few of the neural models' results are unjudged in the Touché 2020 data.
2 code implementations • 13 May 2024 • Ferdinand Schlatt, Maik Fröbe, Harrisen Scells, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Benno Stein, Martin Potthast, Matthias Hagen
Cross-encoders distilled from large language models (LLMs) are often more effective re-rankers than cross-encoders fine-tuned on manually labeled data.
1 code implementation • 15 Apr 2024 • Matti Wiegmann, Jennifer Rakete, Magdalena Wolska, Benno Stein, Martin Potthast
Trigger warnings are labels that preface documents with sensitive content if this content could be perceived as harmful by certain groups of readers.
2 code implementations • 10 Apr 2024 • Ferdinand Schlatt, Maik Fröbe, Harrisen Scells, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Benno Stein, Martin Potthast, Matthias Hagen
Existing cross-encoder models can be categorized as pointwise, pairwise, or listwise.
1 code implementation • 26 Mar 2024 • Marcel Gohsen, Matthias Hagen, Martin Potthast, Benno Stein
Since paraphrasing is an ill-defined task, the term "paraphrasing" covers text transformation tasks with different characteristics.
1 code implementation • 12 Mar 2024 • Andrew Parry, Maik Fröbe, Sean MacAvaney, Martin Potthast, Matthias Hagen
Modern sequence-to-sequence relevance models like monoT5 can effectively capture complex textual interactions between queries and documents through cross-encoding.
1 code implementation • 10 Feb 2024 • Shahbaz Syed, Khalid Al-Khatib, Martin Potthast
This paper presents TL;DR Progress, a new tool for exploring the literature on neural text summarization.
2 code implementations • 7 Feb 2024 • Sebastian Schmidt, Ines Zelch, Janek Bevendorff, Benno Stein, Matthias Hagen, Martin Potthast
In this paper, we thus take a first step to investigate whether LLMs can also be used as a countermeasure, i.e., to block generated native ads.
no code implementations • 12 Jan 2024 • Shuai Wang, Harrisen Scells, Shengyao Zhuang, Martin Potthast, Bevan Koopman, Guido Zuccon
Systematic reviews are crucial for evidence-based medicine as they comprehensively analyse published research findings on specific questions.
no code implementations • 8 Nov 2023 • Lukas Gienapp, Harrisen Scells, Niklas Deckers, Janek Bevendorff, Shuai Wang, Johannes Kiesel, Shahbaz Syed, Maik Fröbe, Guido Zuccon, Benno Stein, Matthias Hagen, Martin Potthast
To lay a foundation for developing new evaluation methods for generative retrieval systems, we survey the relevant literature from the fields of information retrieval and natural language processing, identify search tasks and system architectures in generative retrieval, develop a new user model, and study its operationalization.
1 code implementation • 4 Nov 2023 • Shahbaz Syed, Ahmad Dawar Hakimi, Khalid Al-Khatib, Martin Potthast
We propose a new contextualized summarization approach that can generate an informative summary conditioned on a given sentence containing the citation of a reference (a so-called "citance").
1 code implementation • 3 Nov 2023 • Shahbaz Syed, Dominik Schwabe, Khalid Al-Khatib, Martin Potthast
Online forums encourage the exchange and discussion of different stances on many topics.
no code implementations • 7 Oct 2023 • Ines Zelch, Matthias Hagen, Martin Potthast
How will generative AI pay for itself?
1 code implementation • 11 Sep 2023 • Shuai Wang, Harrisen Scells, Martin Potthast, Bevan Koopman, Guido Zuccon
Our best approach is not only viable based on the information available at the time of screening, but also has similar effectiveness to the final title.
2 code implementations • 23 Aug 2023 • Niklas Deckers, Julia Peters, Martin Potthast
Prompt engineering is still the primary way for users of generative text-to-image models to manipulate generated images in a targeted way.
1 code implementation • 8 Aug 2023 • Vahid Sadiri Javadi, Martin Potthast, Lucie Flek
This is also true in sales conversations, where a customer and a sales assistant exchange facts and opinions about products.
1 code implementation • 2 Jun 2023 • Aleksandra Piktus, Odunayo Ogundepo, Christopher Akiki, Akintunde Oladipo, Xinyu Zhang, Hailey Schoelkopf, Stella Biderman, Martin Potthast, Jimmy Lin
We discuss how Pyserini, a widely used toolkit for reproducible IR research, can be integrated with the Hugging Face ecosystem of open-source AI libraries and artifacts.
1 code implementation • 30 May 2023 • Maik Fröbe, Jan Heinrich Reimer, Sean MacAvaney, Niklas Deckers, Simon Reich, Janek Bevendorff, Benno Stein, Matthias Hagen, Martin Potthast
Standardization is achieved when a retrieval approach implements PyTerrier's interfaces and the input and output of an experiment are compatible with ir_datasets and ir_measures.
1 code implementation • 24 May 2023 • Timon Ziegenbein, Shahbaz Syed, Felix Lange, Martin Potthast, Henning Wachsmuth
Online discussion moderators must make ad-hoc decisions about whether the contributions of discussion participants are appropriate or should be removed to maintain civility.
no code implementations • 3 May 2023 • Fabian Ziegner, Janos Borst, Andreas Niekler, Martin Potthast
This paper evaluates the viability of using fixed language models for training text classification networks on low-end hardware.
1 code implementation • 13 Apr 2023 • Guglielmo Faggioli, Laura Dietz, Charles Clarke, Gianluca Demartini, Matthias Hagen, Claudia Hauff, Noriko Kando, Evangelos Kanoulas, Martin Potthast, Benno Stein, Henning Wachsmuth
When asked, large language models (LLMs) like ChatGPT claim that they can assist with relevance judgments but it is not clear whether automated judgments can reliably be used in evaluations of retrieval systems.
2 code implementations • 2 Apr 2023 • Jan Heinrich Reimer, Sebastian Schmidt, Maik Fröbe, Lukas Gienapp, Harrisen Scells, Benno Stein, Matthias Hagen, Martin Potthast
The Archive Query Log (AQL) is a previously unused, comprehensive query log collected at the Internet Archive over the last 25 years.
1 code implementation • 28 Feb 2023 • Christopher Akiki, Odunayo Ogundepo, Aleksandra Piktus, Xinyu Zhang, Akintunde Oladipo, Jimmy Lin, Martin Potthast
We present Spacerini, a tool that integrates the Pyserini toolkit for reproducible information retrieval research with Hugging Face to enable the seamless construction and deployment of interactive search engines.
1 code implementation • 26 Jan 2023 • Marcel Gohsen, Matthias Hagen, Martin Potthast, Benno Stein
We propose to use image captions from the Web as a previously underutilized resource for paraphrases (i.e., texts with the same "message") and to create and analyze a corresponding dataset.
no code implementations • 23 Jan 2023 • Yamen Ajjour, Johannes Kiesel, Benno Stein, Martin Potthast
Many computational argumentation tasks, like stance classification, are topic-dependent: the effectiveness of approaches to these tasks significantly depends on whether the approaches were trained on arguments from the same topics as those they are tested on.
no code implementations • 14 Dec 2022 • Niklas Deckers, Maik Fröbe, Johannes Kiesel, Gianluca Pandolfo, Christopher Schröder, Benno Stein, Martin Potthast
Conditional generative models such as DALL-E and Stable Diffusion generate images based on a user-defined text, the prompt.
no code implementations • 4 Nov 2022 • Janek Bevendorff, Philipp Sauer, Lukas Gienapp, Wolfgang Kircheis, Erik Körner, Benno Stein, Martin Potthast
The rapidly growing volume of scientific publications offers an interesting challenge for research on methods for analyzing the authorship of documents with one or more authors.
1 code implementation • 18 Oct 2022 • Shahbaz Syed, Dominik Schwabe, Martin Potthast
This paper presents Summary Workbench, a new tool for developing and evaluating text summarization models.
no code implementations • 13 Oct 2022 • Alonso Palomino, Martin Potthast, Khalid Al-Khatib, Benno Stein
We see the problem not in the complexity of interpreting language phenomena but in the diversity of sociocultural backgrounds of the readers, which cannot be handled uniformly: To decide whether a text has crossed the proverbial line between non-biased and biased is subjective.
1 code implementation • 25 Sep 2022 • Niklas Deckers, Martin Potthast
Web archives have grown to petabytes.
no code implementations • 9 Sep 2022 • Magdalena Wolska, Christopher Schröder, Ole Borchardt, Benno Stein, Martin Potthast
We present the first dataset and evaluation results on a newly defined computational task of trigger warning assignment.
1 code implementation • 10 Jul 2022 • Lukas Gienapp, Maik Fröbe, Matthias Hagen, Martin Potthast
Pairwise re-ranking models predict which of two documents is more relevant to a query and then aggregate a final ranking from such preferences.
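The two stages named above, pairwise preference prediction followed by aggregation, can be sketched as follows. This is a minimal illustration under stated assumptions: the aggregation shown is a simple win count over all pairs, which is one common choice and not necessarily the one studied in the paper, and `prefer` stands in for a hypothetical pairwise model.

```python
from itertools import combinations


def aggregate_pairwise(doc_ids, prefer):
    """Rank documents by the number of pairwise comparisons they win.

    `prefer(a, b)` is a hypothetical stand-in for a pairwise model and
    returns True if document `a` is judged more relevant than `b`.
    """
    wins = {doc_id: 0 for doc_id in doc_ids}
    for a, b in combinations(doc_ids, 2):
        winner = a if prefer(a, b) else b
        wins[winner] += 1
    # sorted() is stable, so documents with equal wins keep their input order.
    return sorted(doc_ids, key=lambda d: wins[d], reverse=True)


# Toy stand-in for a model: compare made-up relevance scores.
toy_relevance = {"d1": 0.2, "d2": 0.9, "d3": 0.5}
final_ranking = aggregate_pairwise(
    ["d1", "d2", "d3"],
    lambda a, b: toy_relevance[a] > toy_relevance[b],
)
```

Comparing all pairs needs a number of model calls that grows quadratically with the number of documents, which is why such models are typically applied only to a short candidate list.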
1 code implementation • 29 Jun 2022 • Maik Fröbe, Christopher Akiki, Martin Potthast, Matthias Hagen
We investigate the impact of this unintended train-test leakage by training neural retrieval models on combinations of a fixed number of MS MARCO / ORCAS queries that are highly similar to the actual test queries and an increasing number of other queries.
5 code implementations • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. 
Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. 
Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. 
Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu
BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.
1 code implementation • ACL 2022 • Matthias Hagen, Maik Fröbe, Artur Jurk, Martin Potthast
We introduce and study the task of clickbait spoiling: generating a short text that satisfies the curiosity induced by a clickbait post.
1 code implementation • 4 Feb 2022 • Christopher Akiki, Lukas Gienapp, Martin Potthast
This technical report documents our efforts in addressing the tasks set forth by the 2021 AMoC (Advanced Modelling of Cyber Criminal Careers) Hackathon.
1 code implementation • 22 Dec 2021 • Lukas Gienapp, Wolfgang Kircheis, Bjarne Sievers, Benno Stein, Martin Potthast
We present the Webis-STEREO-21 dataset, a massive collection of Scientific Text Reuse in Open-access publications.
1 code implementation • 22 Nov 2021 • Janek Bevendorff, Martin Potthast, Benno Stein
Web search and other large-scale web data analytics rely on processing archives of web pages stored in a standardized and efficient format.
no code implementations • 21 Nov 2021 • Maik Fröbe, Matthias Hagen, Janek Bevendorff, Michael Völske, Benno Stein, Christopher Schröder, Robby Wagner, Lukas Gienapp, Martin Potthast
Commercial web search engines employ near-duplicate detection to ensure that users see each relevant result only once, even though the underlying web crawls typically include (near-)duplicates of many web pages.
1 code implementation • 28 Oct 2021 • Christopher Akiki, Martin Potthast
Masked language models have recently been interpreted as energy-based sequence models that can be generated from using a Metropolis-Hastings sampler.
no code implementations • 15 Oct 2021 • Kim Breitwieser, Allison Lahnala, Charles Welch, Lucie Flek, Martin Potthast
We introduce the problem of proficiency modeling: Given a user's posts on a social media platform, the task is to identify the subset of posts or topics for which the user has some level of proficiency.
1 code implementation • EMNLP (ArgMining) 2021 • Milad Alshomary, Timon Gurcke, Shahbaz Syed, Philipp Heinrich, Maximilian Spliethöver, Philipp Cimiano, Martin Potthast, Henning Wachsmuth
Key point analysis is the task of extracting a set of concise and high-level statements from a given collection of arguments, representing the gist of these arguments.
1 code implementation • EMNLP (ACL) 2021 • Shahbaz Syed, Tariq Yousef, Khalid Al-Khatib, Stefan Jänicke, Martin Potthast
This paper introduces Summary Explorer, a new tool to support the manual inspection of text summarization systems by compiling the outputs of 55 state-of-the-art single document summarization approaches on three benchmark datasets, and visually exploring them during a qualitative assessment.
1 code implementation • European Chapter of the Association for Computational Linguistics 2023 • Christopher Schröder, Lydia Müller, Andreas Niekler, Martin Potthast
We introduce small-text, an easy-to-use active learning library, which offers pool-based active learning for single- and multi-label text classification in Python.
2 code implementations • Findings (ACL) 2022 • Christopher Schröder, Andreas Niekler, Martin Potthast
Active learning is the iterative construction of a classification model through targeted labeling, enabling significant labeling cost savings.
1 code implementation • Findings (ACL) 2021 • Shahbaz Syed, Khalid Al-Khatib, Milad Alshomary, Henning Wachsmuth, Martin Potthast
Third, insights are provided into the suitability of our corpus for the task, the differences between the two generation paradigms, the trade-off between informativeness and conciseness, and the impact of encoding argumentative knowledge.
1 code implementation • 25 May 2021 • Milad Alshomary, Shahbaz Syed, Arkajit Dhar, Martin Potthast, Henning Wachsmuth
We hypothesize that identifying the argument's weak premises is key to effective countering.
no code implementations • COLING 2020 • Shahbaz Syed, Roxanne El Baff, Johannes Kiesel, Khalid Al Khatib, Benno Stein, Martin Potthast
With Webis-EditorialSum-2020, we present a corpus of 1330 carefully curated summaries for 266 news editorials.
no code implementations • ACL 2020 • Janek Bevendorff, Khalid Al Khatib, Martin Potthast, Benno Stein
This paper introduces the Webis Gmane Email Corpus 2019, the largest publicly available and fully preprocessed email corpus to date.
no code implementations • ACL 2020 • Milad Alshomary, Shahbaz Syed, Martin Potthast, Henning Wachsmuth
In particular, we argue here that a decisive step is to infer a conclusion's target, and we hypothesize that this target is related to the premises' targets.
no code implementations • ACL 2020 • Lukas Gienapp, Benno Stein, Matthias Hagen, Martin Potthast
We present an efficient annotation framework for argument quality, a feature difficult to be measured reliably as per previous work.
no code implementations • 29 May 2020 • Sebastian Bischoff, Niklas Deckers, Marcel Schliebs, Ben Thies, Matthias Hagen, Efstathios Stamatatos, Benno Stein, Martin Potthast
The prerequisite of many approaches to authorship analysis is a representation of writing style.
1 code implementation • 25 Feb 2020 • Wei-Fan Chen, Shahbaz Syed, Benno Stein, Matthias Hagen, Martin Potthast
An abstractive snippet is an originally created piece of text to summarize a web page on a search engine results page.
Ranked #1 on Text Summarization on Webis-Snippet-20 Corpus
no code implementations • 19 Jan 2020 • Krisztian Balog, Lucie Flekova, Matthias Hagen, Rosie Jones, Martin Potthast, Filip Radlinski, Mark Sanderson, Svitlana Vakulenko, Hamed Zamani
This paper discusses the potential for creating academic resources (tools, data, and evaluation approaches) to support research in conversational search, by focusing on realistic information needs and conversational interactions.
no code implementations • WS 2019 • Shahbaz Syed, Michael Völske, Nedim Lipka, Benno Stein, Hinrich Schütze, Martin Potthast
In this paper, we report on the results of the TL;DR challenge, discussing an extensive manual evaluation of the expected properties of a good summary based on analyzing the comments provided by human annotators.
1 code implementation • ACL 2019 • Matti Wiegmann, Benno Stein, Martin Potthast
Celebrities are among the most prolific users of social media, promoting their personas and rallying followers.
1 code implementation • ACL 2019 • Janek Bevendorff, Matthias Hagen, Benno Stein, Martin Potthast
The PAN series of shared tasks is well known for its continuous and high quality research in the field of digital text forensics.
1 code implementation • ACL 2019 • Janek Bevendorff, Martin Potthast, Matthias Hagen, Benno Stein
Authorship verification is the task of determining whether two texts were written by the same author.
1 code implementation • NAACL 2019 • Janek Bevendorff, Benno Stein, Matthias Hagen, Martin Potthast
Authorship verification is the problem of inferring whether two texts were written by the same author.
no code implementations • SEMEVAL 2019 • Johannes Kiesel, Maria Mestre, Rishabh Shukla, Emmanuel Vincent, Payam Adineh, David Corney, Benno Stein, Martin Potthast
Hyperpartisan news is news that takes an extreme left-wing or right-wing standpoint.
no code implementations • 27 Dec 2018 • Martin Potthast, Tim Gollub, Matthias Hagen, Benno Stein
Clickbait has grown to become a nuisance to social media users and social media operators alike.
no code implementations • WS 2018 • Shahbaz Syed, Michael Völske, Martin Potthast, Nedim Lipka, Benno Stein, Hinrich Schütze
The TL;DR challenge fosters research in abstractive summarization of informal text, the largest and fastest-growing source of textual data on the web, which has been overlooked by summarization research so far.
no code implementations • CONLL 2018 • Daniel Zeman, Jan Hajič, Martin Popel, Martin Potthast, Milan Straka, Filip Ginter, Joakim Nivre, Slav Petrov
Every year, the Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets.
no code implementations • COLING 2018 • Martin Potthast, Tim Gollub, Kristof Komlossy, Sebastian Schuster, Matti Wiegmann, Erika Patricia Garces Fernandez, Matthias Hagen, Benno Stein
To address the urgent task of clickbait detection, we constructed a new corpus of 38,517 annotated Twitter tweets, the Webis Clickbait Corpus 2017.
no code implementations • 4 Feb 2018 • Matti Wiegmann, Michael Völske, Benno Stein, Matthias Hagen, Martin Potthast
We study feature selection as a means to optimize the baseline clickbait detector employed at the Clickbait Challenge 2017.
no code implementations • WS 2017 • Henning Wachsmuth, Martin Potthast, Khalid Al-Khatib, Yamen Ajjour, Jana Puschmann, Jiani Qu, Jonas Dorsch, Viorel Morari, Janek Bevendorff, Benno Stein
Computational argumentation is expected to play a critical role in the future of web search.
no code implementations • WS 2017 • Michael Völske, Martin Potthast, Shahbaz Syed, Benno Stein
Recent advances in automatic text summarization have used deep neural networks to generate high-quality abstractive summaries, but the performance of these models strongly depends on large amounts of suitable training data.
no code implementations • CONLL 2017 • Daniel Zeman, Martin Popel, Milan Straka, Jan Hajič, Joakim Nivre, Filip Ginter, Juhani Luotolahti, Sampo Pyysalo, Slav Petrov, Martin Potthast, Francis Tyers, Elena Badmaeva, Memduh Gokirmak, Anna Nedoluzhko, Silvie Cinková, Jan Hajič jr., Jaroslava Hlaváčová, Václava Kettnerová, Zdeňka Urešová, Jenna Kanerva, Stina Ojala, Anna Missilä, Christopher D. Manning, Sebastian Schuster, Siva Reddy, Dima Taji, Nizar Habash, Herman Leung, Marie-Catherine de Marneffe, Manuela Sanguinetti, Maria Simi, Hiroshi Kanayama, Valeria de Paiva, Kira Droganova, Héctor Martínez Alonso, Çağrı Çöltekin, Umut Sulubacak, Hans Uszkoreit, Vivien Macketanz, Aljoscha Burchardt, Kim Harris, Katrin Marheinecke, Georg Rehm, Tolga Kayadelen, Mohammed Attia, Ali Elkahky, Zhuoran Yu, Emily Pitler, Saran Lertpradit, Michael Mandl, Jesse Kirchner, Hector Fernandez Alcalde, Jana Strnadová, Esha Banerjee, Ruli Manurung, Antonio Stella, Atsuko Shimada, Sookyoung Kwak, Gustavo Mendonça, Tatiana Lando, Rattima Nitisaroj, Josie Li
The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets.
1 code implementation • ACL 2018 • Martin Potthast, Johannes Kiesel, Kevin Reinartz, Janek Bevendorff, Benno Stein
The articles originated from 9 well-known political publishers, 3 each from the mainstream, the hyperpartisan left-wing, and the hyperpartisan right-wing.