no code implementations • 17 Aug 2024 • Sotiris Anagnostidis, Jannis Bulian
Our findings reveal that models are strongly influenced and, when explanations are provided, are swayed irrespective of the quality of the explanation.
no code implementations • 5 Jul 2024 • Zachary Kenton, Noah Y. Siegel, János Kramár, Jonah Brown-Cohen, Samuel Albanie, Jannis Bulian, Rishabh Agarwal, David Lindner, Yunhao Tang, Noah D. Goodman, Rohin Shah
We find that debate outperforms consultancy across all tasks when the consultant is randomly assigned to argue for the correct or incorrect answer.
no code implementations • 4 Oct 2023 • Jannis Bulian, Mike S. Schäfer, Afra Amini, Heidi Lam, Massimiliano Ciaramita, Ben Gaiarin, Michelle Chen Hübscher, Christian Buck, Niels G. Mede, Markus Leippold, Nadine Strauß
As Large Language Models (LLMs) rise in popularity, it is necessary to assess their capability in critically relevant domains.
3 code implementations • 31 Mar 2022 • Adam Roberts, Hyung Won Chung, Anselm Levskaya, Gaurav Mishra, James Bradbury, Daniel Andor, Sharan Narang, Brian Lester, Colin Gaffney, Afroz Mohiuddin, Curtis Hawthorne, Aitor Lewkowycz, Alex Salcianu, Marc van Zee, Jacob Austin, Sebastian Goodman, Livio Baldini Soares, Haitang Hu, Sasha Tsvyashchenko, Aakanksha Chowdhery, Jasmijn Bastings, Jannis Bulian, Xavier Garcia, Jianmo Ni, Andrew Chen, Kathleen Kenealy, Jonathan H. Clark, Stephan Lee, Dan Garrette, James Lee-Thorp, Colin Raffel, Noam Shazeer, Marvin Ritter, Maarten Bosma, Alexandre Passos, Jeremy Maitin-Shepard, Noah Fiedel, Mark Omernick, Brennan Saeta, Ryan Sepassi, Alexander Spiridonov, Joshua Newlan, Andrea Gesmundo
Recent neural network-based language models have benefited greatly from scaling up the size of training datasets and the number of parameters in the models themselves.
1 code implementation • 15 Feb 2022 • Jannis Bulian, Christian Buck, Wojciech Gajewski, Benjamin Boerschinger, Tal Schuster
The predictions of question answering (QA) systems are typically evaluated against manually annotated finite sets of one or more answers.
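For context, a minimal sketch of the conventional evaluation this work questions: scoring a predicted answer against a finite set of gold answers with exact match and token-level F1. The function names and normalization details are illustrative, not taken from the paper's code.

```python
import re
from collections import Counter

def normalize(text: str) -> str:
    # Lowercase, strip punctuation, collapse whitespace (SQuAD-style normalization).
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())

def token_f1(prediction: str, gold: str) -> float:
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

def score_against_gold_set(prediction: str, gold_answers: list[str]) -> dict:
    # Conventional practice: take the best score over the finite annotated answer set.
    return {
        "exact_match": max(float(normalize(prediction) == normalize(g)) for g in gold_answers),
        "f1": max(token_f1(prediction, g) for g in gold_answers),
    }

print(score_against_gold_set("President Obama", ["Barack Obama", "Obama"]))
```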
no code implementations • ACL 2021 • Cesar Ilharco, Afsaneh Shirazi, Arjun Gopalan, Arsha Nagrani, Blaz Bratanic, Chris Bregler, Christina Funk, Felipe Ferreira, Gabriel Barcik, Gabriel Ilharco, Georg Osang, Jannis Bulian, Jared Frank, Lucas Smaira, Qin Cao, Ricardo Marino, Roma Patel, Thomas Leung, Vaiva Imbrasaite
How information is created, shared and consumed has changed rapidly in recent decades, in part thanks to new social platforms and technologies on the web.
1 code implementation • NAACL 2021 • Julian Martin Eisenschlos, Bhuwan Dhingra, Jannis Bulian, Benjamin Börschinger, Jordan Boyd-Graber
We release FoolMeTwice (FM2 for short), a large dataset of challenging entailment pairs collected through a fun multi-player game.
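As a rough illustration of what such an entailment pair looks like (the field names and the example below are my own, not necessarily the dataset's schema), each item pairs a player-written claim with evidence passages and a verdict label:

```python
from dataclasses import dataclass

@dataclass
class EntailmentPair:
    claim: str            # player-written claim, crafted to be hard to verify
    evidence: list[str]   # passages from the source article that support or refute it
    label: str            # "SUPPORTS" or "REFUTES"

# Hypothetical example for illustration only.
example = EntailmentPair(
    claim="The Eiffel Tower was completed in 1899.",
    evidence=["The Eiffel Tower ... was completed in 1889."],
    label="REFUTES",
)
print(example.label, "-", example.claim)
```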
no code implementations • 1 Dec 2020 • Thomas Diggelmann, Jordan Boyd-Graber, Jannis Bulian, Massimiliano Ciaramita, Markus Leippold
We introduce CLIMATE-FEVER, a new publicly available dataset for verification of climate change-related claims.
no code implementations • 11 Nov 2019 • Benjamin Borschinger, Jordan Boyd-Graber, Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Michelle Chen Huebscher, Wojciech Gajewski, Yannic Kilcher, Rodrigo Nogueira, Lierni Sestorain Saralegu
We investigate a framework for machine reading, inspired by real world information-seeking problems, where a meta question answering system interacts with a black box environment.
no code implementations • ICLR Workshop drlStructPred 2019 • Rodrigo Nogueira, Jannis Bulian, Massimiliano Ciaramita
We investigate methods to efficiently learn diverse strategies in reinforcement learning for a generative structured prediction problem: query reformulation.
no code implementations • ICLR 2019 • Rodrigo Nogueira, Jannis Bulian, Massimiliano Ciaramita
We propose a method to efficiently learn diverse strategies in reinforcement learning for query reformulation in the tasks of document retrieval and question answering.
no code implementations • 23 Jan 2018 • Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang
We analyze the language learned by an agent trained with reinforcement learning as a component of the ActiveQA system [Buck et al., 2017].
2 code implementations • ICLR 2018 • Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang
The agent probes the system with potentially many natural language reformulations of an initial question and aggregates the returned evidence to yield the best answer.
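A minimal sketch of that probe-and-aggregate loop, with `reformulate`, `qa_system`, and the scoring step as hypothetical stand-ins rather than the paper's actual components:

```python
def reformulate(question: str, n: int) -> list[str]:
    # Stand-in for a learned sequence-to-sequence reformulator;
    # here we just return trivial paraphrase placeholders.
    return [question] + [f"{question} (paraphrase {i})" for i in range(1, n)]

def qa_system(query: str) -> tuple[str, float]:
    # Stand-in for the black-box QA environment: returns (answer, confidence).
    return "some answer", 0.5

def active_qa(question: str, n_rewrites: int = 8) -> str:
    # Probe the environment with many reformulations and aggregate the returned evidence.
    candidates: dict[str, float] = {}
    for query in reformulate(question, n_rewrites):
        answer, confidence = qa_system(query)
        candidates[answer] = candidates.get(answer, 0.0) + confidence
    # Select the answer with the highest aggregated score.
    return max(candidates, key=candidates.get)

print(active_qa("who wrote the origin of species"))
```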