no code implementations • FEVER (ACL) 2022 • James Ferguson, Hannaneh Hajishirzi, Pradeep Dasigi, Tushar Khot
Training retrieval models to fetch contexts for Question Answering (QA) over large corpora requires labeling relevant passages in those corpora.
no code implementations • 16 Mar 2023 • Bhargavi Paranjape, Scott Lundberg, Sameer Singh, Hannaneh Hajishirzi, Luke Zettlemoyer, Marco Tulio Ribeiro
We introduce Automatic Reasoning and Tool-use (ART), a framework that uses frozen LLMs to automatically generate intermediate reasoning steps as a program.
no code implementations • 28 Jan 2023 • Kolby Nottingham, Prithviraj Ammanabrolu, Alane Suhr, Yejin Choi, Hannaneh Hajishirzi, Sameer Singh, Roy Fox
Reinforcement learning (RL) agents typically learn tabula rasa, without prior knowledge of the world, which makes learning complex tasks with sparse rewards difficult.
2 code implementations • 20 Dec 2022 • Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
Applying our method to vanilla GPT3, we demonstrate a 33% absolute improvement over the original model on Super-NaturalInstructions, on par with the performance of InstructGPT_001, which is trained with private user data and human annotations.
no code implementations • 20 Dec 2022 • Hamish Ivison, Akshita Bhagia, Yizhong Wang, Hannaneh Hajishirzi, Matthew Peters
Recent NLP models have shown a remarkable ability to generalise 'zero-shot' to new tasks using only an instruction as guidance.
1 code implementation • 20 Dec 2022 • Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Hannaneh Hajishirzi, Daniel Khashabi
Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the limitations of relying solely on their parameters to encode a wealth of world knowledge.
1 code implementation • 19 Dec 2022 • Xinxi Lyu, Sewon Min, Iz Beltagy, Luke Zettlemoyer, Hannaneh Hajishirzi
Although large language models can be prompted for both zero- and few-shot learning, performance drops significantly when no demonstrations are available.
1 code implementation • 8 Dec 2022 • Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi
Changing how pre-trained models behave -- e.g., improving their performance on a downstream task or mitigating biases learned during pre-training -- is a common practice when developing machine learning systems.
no code implementations • 2 Dec 2022 • Bhargavi Paranjape, Pradeep Dasigi, Vivek Srikumar, Luke Zettlemoyer, Hannaneh Hajishirzi
We propose AGRO -- Adversarial Group discovery for Distributionally Robust Optimization -- an end-to-end approach that jointly identifies error-prone groups and improves accuracy on them.
1 code implementation • 2 Dec 2022 • Sewon Min, Weijia Shi, Mike Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, Luke Zettlemoyer
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases.
1 code implementation • 1 Dec 2022 • Hamish Ivison, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi
We assume access to a small number (250--1000) of unlabeled target task instances, select their nearest neighbors from a pool of multitask data, and use the retrieved data to train target task-specific models.
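A minimal sketch of this retrieve-then-train recipe, assuming instance embeddings have already been computed with some shared encoder; the function and argument names are illustrative, not the paper's API:

```python
import numpy as np

def select_training_data(target_embs, pool_embs, pool_examples, k=500):
    """For each unlabeled target-task instance, retrieve its k nearest
    neighbors (by cosine similarity) from a large multitask pool; the
    union of retrieved examples becomes the training set for a
    task-specific model."""
    # Normalize rows so dot products equal cosine similarities.
    target = target_embs / np.linalg.norm(target_embs, axis=1, keepdims=True)
    pool = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    sims = target @ pool.T                      # (n_target, n_pool)
    nearest = np.argsort(-sims, axis=1)[:, :k]  # top-k pool indices per instance
    selected = {int(i) for row in nearest for i in row}
    return [pool_examples[i] for i in sorted(selected)]
```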
1 code implementation • 30 Nov 2022 • Xinyan Velocity Yu, Sewon Min, Luke Zettlemoyer, Hannaneh Hajishirzi
We find that 25% of questions contain false presuppositions, and provide annotations for these presuppositions and their corrections.
1 code implementation • 16 Nov 2022 • Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, Sebastian Riedel, Hannaneh Hajishirzi, Wen-tau Yih
We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries.
1 code implementation • 25 Oct 2022 • David Wadden, Kyle Lo, Bailey Kuehl, Arman Cohan, Iz Beltagy, Lucy Lu Wang, Hannaneh Hajishirzi
While research on scientific claim verification has led to the development of powerful systems that appear to approach human performance, these approaches have yet to be tested in a realistic setting against large corpora of scientific literature.
no code implementations • 22 Oct 2022 • Anas Awadalla, Mitchell Wortsman, Gabriel Ilharco, Sewon Min, Ian Magnusson, Hannaneh Hajishirzi, Ludwig Schmidt
We conduct a large empirical evaluation to investigate the landscape of distributional robustness in question answering.
1 code implementation • 22 Oct 2022 • Vidhisha Balachandran, Hannaneh Hajishirzi, William W. Cohen, Yulia Tsvetkov
Abstractive summarization models often generate inconsistent summaries containing factual errors or hallucinated content.
1 code implementation • 10 Oct 2022 • Tanay Dixit, Bhargavi Paranjape, Hannaneh Hajishirzi, Luke Zettlemoyer
We present COunterfactual Generation via Retrieval and Editing (CORE), a retrieval-augmented generation framework for creating diverse counterfactual perturbations for CDA.
1 code implementation • 6 Oct 2022 • Jiacheng Liu, Skyler Hallinan, Ximing Lu, Pengfei He, Sean Welleck, Hannaneh Hajishirzi, Yejin Choi
Our work is the first to report that knowledge generated by models that are orders of magnitude smaller than GPT-3, even without direct supervision on the knowledge itself, can exceed the quality of commonsense knowledge elicited from GPT-3.
1 code implementation • 3 Oct 2022 • Rajkumar Ramamurthy, Prithviraj Ammanabrolu, Kianté Brantley, Jack Hessel, Rafet Sifa, Christian Bauckhage, Hannaneh Hajishirzi, Yejin Choi
To help answer this, we first introduce an open-source modular library, RL4LMs (Reinforcement Learning for Language Models), for optimizing language generators with RL.
1 code implementation • 10 Aug 2022 • Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, Ludwig Schmidt
We study model patching, where the goal is to improve accuracy on specific tasks without degrading accuracy on tasks where performance is already adequate.
1 code implementation • 2 Jul 2022 • Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi
In an information-seeking conversation, a user converses with an agent to ask a series of questions that can often be under- or over-specified.
1 code implementation • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. 
Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramón Risco Delgado, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. 
Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Timothy Telleen-Lawton, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu
BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.
1 code implementation • 25 May 2022 • Sean Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi
Theorem proving in natural mathematical language -- the mixture of symbolic and natural language used by humans -- plays a central role in mathematical advances and education, and tests aspects of reasoning that are core to intelligence.
1 code implementation • 24 May 2022 • Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
Our method, called ATTEMPT (ATTEntional Mixtures of Prompt Tuning), obtains source prompts as encodings of large-scale source tasks into a small number of parameters and trains an attention module to interpolate the source prompts and a newly initialized target prompt for every instance in the target task.
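A rough sketch of the attentional interpolation described here, assuming a pooled instance representation and a small trainable scoring module; `proj` is a hypothetical stand-in for the paper's attention sub-network:

```python
import torch
import torch.nn.functional as F

def mix_prompts(h, source_prompts, target_prompt, proj):
    """Attentional mixture of soft prompts (sketch).

    h: (d,) pooled representation of the current instance
    source_prompts: (n_src, prompt_len, d) pretrained source-task prompts
    target_prompt: (prompt_len, d) newly initialized target prompt
    proj: trainable module mapping h to attention logits of shape (n_src,)
    """
    logits = proj(h)                                    # score each source prompt
    weights = F.softmax(logits, dim=-1)                 # attention over sources
    mixed = torch.einsum("s,spd->pd", weights, source_prompts)
    # The instance-specific prompt is what gets prepended to the input.
    return mixed + target_prompt
```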
no code implementations • NAACL 2022 • Prithviraj Ammanabrolu, Liwei Jiang, Maarten Sap, Hannaneh Hajishirzi, Yejin Choi
We focus on creating agents that act in alignment with socially beneficial norms and values in interactive narratives or text-based games -- environments wherein an agent perceives and interacts with a world through natural language.
2 code implementations • 16 Apr 2022 • Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Shailaja Keyur Sampat, Savan Doshi, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit, Xudong Shen, Chitta Baral, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi, Daniel Khashabi
This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions -- training models to follow instructions on a subset of tasks and evaluating them on the remaining unseen ones.
1 code implementation • 25 Feb 2022 • Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
Large language models (LMs) are able to in-context learn -- perform a new task via inference alone by conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs.
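Because in-context learning works purely through conditioning, the whole mechanism reduces to prompt construction; a minimal illustration (the format strings are arbitrary choices, not the paper's):

```python
def build_icl_prompt(demonstrations, test_input):
    """Format k input-label pairs followed by the test input; the LM's
    continuation is read off as its prediction, with no weight updates."""
    lines = [f"Input: {x}\nLabel: {y}" for x, y in demonstrations]
    lines.append(f"Input: {test_input}\nLabel:")
    return "\n\n".join(lines)

demos = [("The movie was wonderful.", "positive"),
         ("A tedious, joyless slog.", "negative")]
print(build_icl_prompt(demos, "Surprisingly moving."))
```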
1 code implementation • 23 Feb 2022 • Daniel Khashabi, Yeganeh Kordi, Hannaneh Hajishirzi
We present UnifiedQA-v2, a QA model built with the same process as UnifiedQA, except that it utilizes more supervision -- roughly 3x the number of datasets used for UnifiedQA.
1 code implementation • 22 Feb 2022 • Rajarshi Das, Ameya Godbole, Ankita Naik, Elliot Tower, Robin Jia, Manzil Zaheer, Hannaneh Hajishirzi, Andrew McCallum
Question answering (QA) over knowledge bases (KBs) is challenging because of the diverse, essentially unbounded, types of reasoning patterns needed.
no code implementations • 16 Dec 2021 • Zeqiu Wu, Yi Luan, Hannah Rashkin, David Reitter, Hannaneh Hajishirzi, Mari Ostendorf, Gaurav Singh Tomar
Compared to standard retrieval tasks, passage retrieval for conversational question answering (CQA) poses new challenges in understanding the current user question, as each question needs to be interpreted within the dialogue context.
1 code implementation • NAACL 2022 • Akari Asai, Matt Gardner, Hannaneh Hajishirzi
We introduce a multi-task learning framework to jointly generate the final output and predict the evidentiality of each passage, leveraging a new task-agnostic method to obtain silver evidentiality labels for supervision.
1 code implementation • NAACL 2022 • Daniel Khashabi, Shane Lyu, Sewon Min, Lianhui Qin, Kyle Richardson, Sean Welleck, Hannaneh Hajishirzi, Tushar Khot, Ashish Sabharwal, Sameer Singh, Yejin Choi
Fine-tuning continuous prompts for target tasks has recently emerged as a compact alternative to full model fine-tuning.
2 code implementations • Findings (NAACL) 2022 • David Wadden, Kyle Lo, Lucy Lu Wang, Arman Cohan, Iz Beltagy, Hannaneh Hajishirzi
Our approach outperforms two competitive baselines on three scientific claim verification datasets, with particularly strong performance in zero / few-shot domain adaptation experiments.
1 code implementation • EMNLP 2021 • Christopher Clark, Jordi Salvador, Dustin Schwenk, Derrick Bonafilia, Mark Yatskar, Eric Kolve, Alvaro Herrasti, Jonghyun Choi, Sachin Mehta, Sam Skjonsberg, Carissa Schoenick, Aaron Sarnat, Hannaneh Hajishirzi, Aniruddha Kembhavi, Oren Etzioni, Ali Farhadi
We investigate these challenges in the context of Iconary, a collaborative game of drawing and guessing based on Pictionary, which poses a novel challenge for the research community.
1 code implementation • NAACL 2022 • Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.
1 code implementation • ACL 2022 • Jiacheng Liu, Alisa Liu, Ximing Lu, Sean Welleck, Peter West, Ronan Le Bras, Yejin Choi, Hannaneh Hajishirzi
It remains an open question whether incorporating external knowledge benefits commonsense reasoning while maintaining the flexibility of pretrained sequence models.
no code implementations • 16 Sep 2021 • Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
Our experiments compare the zero-shot and few-shot performance of LMs prompted with reframed instructions on 12 NLP tasks across 6 categories.
1 code implementation • EMNLP 2021 • Zeqiu Wu, Bo-Ru Lu, Hannaneh Hajishirzi, Mari Ostendorf
Identifying relevant knowledge to be used in conversational systems that are grounded in long documents is critical to effective response generation.
3 code implementations • CVPR 2022 • Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo-Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, Ludwig Schmidt
Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements under distribution shift, while preserving high accuracy on the target distribution.
Ranked #8 on Image Classification on ObjectNet (using extra training data)
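WiSE-FT's weight-space ensembling amounts to a linear interpolation of the two models' parameters; a minimal sketch over PyTorch-style state dicts (model loading and the alpha sweep are omitted):

```python
def wise_ft(zeroshot_state, finetuned_state, alpha=0.5):
    """Interpolate zero-shot and fine-tuned weights:
    theta = (1 - alpha) * theta_zs + alpha * theta_ft.
    alpha = 0 recovers the zero-shot model, alpha = 1 the fine-tuned one."""
    assert zeroshot_state.keys() == finetuned_state.keys()
    return {k: (1 - alpha) * zeroshot_state[k] + alpha * finetuned_state[k]
            for k in zeroshot_state}
```

Sweeping alpha traces out the trade-off between robustness under distribution shift and accuracy on the target distribution.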
1 code implementation • ACL 2022 • Sewon Min, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer
We introduce a noisy channel approach for language model prompting in few-shot text classification.
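A sketch of the channel scoring idea: rather than asking the LM for P(label | input), score how well each verbalized label generates the input. `lm_logprob` and `verbalize` are hypothetical helpers, not an actual library API:

```python
def channel_classify(lm_logprob, x, labels, verbalize):
    """Noisy channel scoring: pick the label whose verbalization best
    'generates' the input, i.e. argmax_y log P(x | v(y)), instead of
    the direct model's argmax_y log P(v(y) | x).

    lm_logprob(prompt, continuation) -> log P(continuation | prompt),
    a wrapper around a left-to-right language model.
    """
    scores = {y: lm_logprob(verbalize(y), x) for y in labels}
    return max(scores, key=scores.get)
```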
1 code implementation • NeurIPS 2021 • Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi
We present Cross-lingual Open-Retrieval Answer Generation (CORA), the first unified many-to-many question answering (QA) model that can answer questions across many languages, even for ones without language-specific annotated data or knowledge sources.
1 code implementation • ACL 2022 • Jungsoo Park, Sewon Min, Jaewoo Kang, Luke Zettlemoyer, Hannaneh Hajishirzi
Claims in FAVIQ are verified to be natural, contain little lexical bias, and require a complete understanding of the evidence for verification.
1 code implementation • AKBC 2021 • Rahul Nadkarni, David Wadden, Iz Beltagy, Noah A. Smith, Hannaneh Hajishirzi, Tom Hope
Biomedical knowledge graphs (KGs) hold rich information on entities such as diseases, drugs, and genes.
no code implementations • Findings (ACL) 2021 • Bhargavi Paranjape, Julian Michael, Marjan Ghazvininejad, Luke Zettlemoyer, Hannaneh Hajishirzi
Many commonsense reasoning NLP tasks involve choosing among possible answers to a question or prompt, based on knowledge that is often implicit.
1 code implementation • ACL 2021 • Ikuya Yamada, Akari Asai, Hannaneh Hajishirzi
Most state-of-the-art open-domain question answering systems use a neural retrieval model to encode passages into continuous vectors and extract them from a knowledge source.
Ranked #2 on Open-Domain Question Answering on TQA
1 code implementation • NAACL 2021 • Iz Beltagy, Arman Cohan, Hannaneh Hajishirzi, Sewon Min, Matthew E. Peters
In this tutorial, we aim at bringing interested NLP researchers up to speed about the recent and ongoing techniques for document-level representation learning.
1 code implementation • Findings (EMNLP) 2021 • Daniel Khashabi, Amos Ng, Tushar Khot, Ashish Sabharwal, Hannaneh Hajishirzi, Chris Callison-Burch
GooAQ answers are mined from Google's responses to our collected questions, specifically from the answer boxes in the search results.
3 code implementations • ACL 2022 • Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi
Using this meta-dataset, we measure cross-task generalization by training models on seen tasks and measuring generalization to the remaining unseen ones.
no code implementations • EMNLP 2021 • Sewon Min, Kenton Lee, Ming-Wei Chang, Kristina Toutanova, Hannaneh Hajishirzi
We study multi-answer retrieval, an under-explored problem that requires retrieving passages to cover multiple distinct answers for a given question.
1 code implementation • Findings (EMNLP) 2021 • Leo Z. Liu, Yizhong Wang, Jungo Kasai, Hannaneh Hajishirzi, Noah A. Smith
Models of language trained on very large corpora have been shown to be useful for NLP.
no code implementations • ICLR 2021 • Alon Talmor, Ori Yoran, Amnon Catav, Dan Lahav, Yizhong Wang, Akari Asai, Gabriel Ilharco, Hannaneh Hajishirzi, Jonathan Berant
When answering complex questions, people can seamlessly combine information from visual, textual and tabular sources.
1 code implementation • 24 Mar 2021 • Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho
Understanding and creating mathematics using natural mathematical language -- the mixture of symbolic and natural language used by humans -- is a challenging and important problem for driving progress in machine learning.
no code implementations • 1 Jan 2021 • Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih
We review the EfficientQA competition from NeurIPS 2020.
no code implementations • EMNLP 2020 • James Ferguson, Matt Gardner, Hannaneh Hajishirzi, Tushar Khot, Pradeep Dasigi
However, most existing reading comprehension (RC) tasks only focus on questions for which the contexts provide all the information required to answer them, thus not evaluating a system's performance at identifying a potential lack of sufficient information and locating sources for that information.
3 code implementations • NAACL 2021 • Akari Asai, Jungo Kasai, Jonathan H. Clark, Kenton Lee, Eunsol Choi, Hannaneh Hajishirzi
Multilingual question answering tasks typically assume answers exist in the same language as the question.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Sanjay Subramanian, Lucy Lu Wang, Sachin Mehta, Ben Bogin, Madeleine van Zuylen, Sravanthi Parasa, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi
To address challenges in figure retrieval and figure-to-text alignment, we introduce MedICaT, a dataset of medical images in context.
3 code implementations • NAACL 2021 • Tom Hope, Aida Amini, David Wadden, Madeleine van Zuylen, Sravanthi Parasa, Eric Horvitz, Daniel Weld, Roy Schwartz, Hannaneh Hajishirzi
The COVID-19 pandemic has spawned a diverse body of scientific literature that is challenging to navigate, stimulating interest in automated tools to help find useful knowledge.
1 code implementation • EMNLP 2020 • Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
X-LXMERT's image generation capabilities rival state-of-the-art generative models, while its question answering and captioning abilities remain comparable to LXMERT.
6 code implementations • EMNLP 2020 • Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, Yejin Choi
Experiments across four datasets show that these model-dependent measures reveal three distinct regions in the data map, each with pronounced characteristics.
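The two model-dependent measures behind these data maps are simple statistics of the model's confidence in the gold label across training epochs; a minimal sketch (the paper additionally tracks correctness as a third coordinate):

```python
import numpy as np

def data_map_coordinates(gold_probs):
    """gold_probs: (n_epochs, n_examples) array of the probability the
    model assigns to each example's gold label at the end of each epoch.
    Returns the two data-map axes: confidence (mean over epochs) and
    variability (std over epochs). Low-confidence examples tend to be
    hard-to-learn; high-variability examples tend to be ambiguous."""
    confidence = gold_probs.mean(axis=0)
    variability = gold_probs.std(axis=0)
    return confidence, variability
```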
1 code implementation • 19 Sep 2020 • Zeqiu Wu, Rik Koncel-Kedziorski, Mari Ostendorf, Hannaneh Hajishirzi
Knowledge graphs capture entities and relations from long documents and can facilitate reasoning in many downstream applications.
2 code implementations • ICLR 2021 • Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
We introduce a deep and light-weight transformer, DeLighT, that delivers similar or better performance than standard transformer-based models with significantly fewer parameters.
Ranked #1 on Machine Translation on WMT2016 English-Romanian
1 code implementation • 25 Jul 2020 • Sachin Mehta, Ximing Lu, Donald Weaver, Joann G. Elmore, Hannaneh Hajishirzi, Linda Shapiro
HATNet extends the bag-of-words approach and uses self-attention to encode global information, allowing it to learn representations from clinically relevant tissue structures without any explicit supervision.
no code implementations • ACL 2020 • Xin Luna Dong, Hannaneh Hajishirzi, Colin Lockard, Prashant Shiralkar
In this tutorial we take a holistic view toward information extraction, exploring the commonalities in the challenges and solutions developed to address these different forms of text.
no code implementations • 14 May 2020 • Colin Lockard, Prashant Shiralkar, Xin Luna Dong, Hannaneh Hajishirzi
In this work, we propose a solution for "zero-shot" open-domain relation extraction from webpages with a previously unseen template, including from websites with little overlap with existing sources of knowledge for distant supervision and websites in entirely new subject verticals.
2 code implementations • Findings of the Association for Computational Linguistics 2020 • Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, Hannaneh Hajishirzi
As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UnifiedQA, that performs surprisingly well across 17 QA datasets spanning 4 diverse formats.
Ranked #1 on Question Answering on CommonsenseQA
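The key ingredient is casting every QA format into one text-to-text encoding so a single seq2seq model can train across all of them; a sketch of such a conversion (the exact delimiters and casing of the released UnifiedQA format may differ):

```python
def to_unified_format(question, context=None, choices=None):
    """Cast heterogeneous QA formats (extractive, multiple-choice,
    yes/no, abstractive) into one flat text input."""
    parts = [question]
    if choices:  # multiple-choice: inline the labeled options
        parts.append(" ".join(f"({chr(97 + i)}) {c}"
                              for i, c in enumerate(choices)))
    if context:  # extractive/abstractive: append the passage
        parts.append(context)
    # Fields are joined with a literal two-character "\n" separator token.
    return " \\n ".join(parts)
```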
1 code implementation • ACL 2020 • Sarthak Jain, Madeleine van Zuylen, Hannaneh Hajishirzi, Iz Beltagy
It is challenging to create a large-scale information extraction (IE) dataset at the document level since it requires an understanding of the whole document to annotate entities and their document-level relationships that usually span beyond sentences or even sections.
1 code implementation • 1 May 2020 • Zeqiu Wu, Michel Galley, Chris Brockett, Yizhe Zhang, Xiang Gao, Chris Quirk, Rik Koncel-Kedziorski, Jianfeng Gao, Hannaneh Hajishirzi, Mari Ostendorf, Bill Dolan
Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process, often resulting in uninteresting responses.
no code implementations • NAACL 2021 • Gabriel Ilharco, Rowan Zellers, Ali Farhadi, Hannaneh Hajishirzi
The success of large-scale contextual language models has attracted great interest in probing what is encoded in their representations.
2 code implementations • EMNLP 2020 • Bhargavi Paranjape, Mandar Joshi, John Thickstun, Hannaneh Hajishirzi, Luke Zettlemoyer
Decisions of complex language understanding models can be rationalized by limiting their inputs to a relevant subsequence of the original text.
2 code implementations • EMNLP 2020 • David Wadden, Shanchuan Lin, Kyle Lo, Lucy Lu Wang, Madeleine van Zuylen, Arman Cohan, Hannaneh Hajishirzi
We introduce scientific claim verification, a new task to select abstracts from the research literature containing evidence that SUPPORTS or REFUTES a given scientific claim, and to identify rationales justifying each decision.
2 code implementations • EMNLP 2020 • Sewon Min, Julian Michael, Hannaneh Hajishirzi, Luke Zettlemoyer
Ambiguity is inherent to open-domain question answering; especially when exploring new topics, it can be difficult to ask questions that have a single, unambiguous answer.
1 code implementation • ACL 2020 • Akari Asai, Hannaneh Hajishirzi
Many natural language questions require qualitative, quantitative or logical comparisons between two entities or events.
no code implementations • AKBC 2020 • Aida Amini, Antoine Bosselut, Bhavana Dalvi Mishra, Yejin Choi, Hannaneh Hajishirzi
Procedural texts often describe processes (e.g., photosynthesis and cooking) that happen over entities (e.g., light, food).
3 code implementations • 15 Feb 2020 • Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, Noah Smith
We publicly release all of our experimental data, including training and validation scores for 2,100 trials, to encourage further analysis of training dynamics during fine-tuning.
1 code implementation • ICLR 2020 • Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi
For sequence models with large vocabularies, a majority of network parameters lie in the input and output layers.
2 code implementations • ICLR 2020 • Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong
Answering questions that require multi-hop reasoning at web-scale necessitates retrieving multiple evidence documents, one of which often has little lexical or semantic relationship to the question.
Ranked #28 on Question Answering on HotpotQA
7 code implementations • 10 Nov 2019 • Sewon Min, Danqi Chen, Luke Zettlemoyer, Hannaneh Hajishirzi
We introduce an approach for open-domain question answering (QA) that retrieves and reads a passage graph, where vertices are passages of text and edges represent relationships that are derived from an external knowledge base or co-occurrence in the same article.
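A sketch of how such a passage graph might be assembled, assuming precomputed KB links between passages and an `article_of` mapping; both inputs are illustrative stand-ins:

```python
from collections import defaultdict

def build_passage_graph(passages, kb_links, article_of):
    """Vertices are passages; an edge joins two passages when a
    knowledge-base relation links them, or when they come from the same
    article. A retriever can then expand from seed passages along these
    edges instead of scoring every passage independently."""
    graph = defaultdict(set)
    for p1, p2 in kb_links:            # pairs related through the KB
        graph[p1].add(p2)
        graph[p2].add(p1)
    by_article = defaultdict(list)
    for p in passages:                 # pairs from the same article
        by_article[article_of[p]].append(p)
    for group in by_article.values():
        for p1 in group:
            for p2 in group:
                if p1 != p2:
                    graph[p1].add(p2)
    return graph
```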
3 code implementations • ACL 2020 • Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
Open-domain question answering can be formulated as a phrase retrieval problem, in which we can expect huge scalability and speed benefit but often suffer from low accuracy due to the limitation of existing phrase representation models.
no code implementations • WS 2019 • Matt Gardner, Jonathan Berant, Hannaneh Hajishirzi, Alon Talmor, Sewon Min
In this work, we justify a question answering approach to reading comprehension and describe the various kinds of questions one might use to more fully test a system's comprehension of a passage, moving beyond questions that only probe local predicate-argument structures.
no code implementations • 25 Sep 2019 • Matt Gardner, Jonathan Berant, Hannaneh Hajishirzi, Alon Talmor, Sewon Min
In this opinion piece, we argue that question answering should be considered a format which is sometimes useful for studying particular phenomena, not a phenomenon or task in itself.
1 code implementation • IJCNLP 2019 • Sewon Min, Danqi Chen, Hannaneh Hajishirzi, Luke Zettlemoyer
Many question answering (QA) tasks only provide weak supervision for how the answer should be computed.
Ranked #2 on Question Answering on NarrativeQA
2 code implementations • IJCNLP 2019 • David Wadden, Ulme Wennberg, Yi Luan, Hannaneh Hajishirzi
We examine the capabilities of a unified, multi-task framework for three information extraction tasks: named entity recognition, relation extraction, and event extraction.
Ranked #5 on Joint Entity and Relation Extraction on SciERC
1 code implementation • IJCNLP 2019 • Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
The diversification stage uses a mixture of experts to sample different binary masks on the source sequence for diverse content selection.
Ranked #10 on Question Generation on SQuAD1.1
no code implementations • 20 Jul 2019 • Baicen Xiao, Bhaskar Ramasubramanian, Andrew Clark, Hannaneh Hajishirzi, Linda Bushnell, Radha Poovendran
This paper augments the reward received by a reinforcement learning agent with potential functions in order to help the agent learn (possibly stochastic) optimal policies.
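The classical deterministic form of potential-based shaping (Ng et al., 1999), which this line of work builds on, adds F(s, s') = gamma * Phi(s') - Phi(s) to the environment reward; a one-function sketch:

```python
def shaped_reward(reward, potential, state, next_state, gamma=0.99):
    """Potential-based reward shaping: the shaping term telescopes along
    any trajectory, so it densifies the learning signal while leaving
    the set of optimal policies unchanged."""
    return reward + gamma * potential(next_state) - potential(state)
```

The paper's contribution concerns (possibly stochastic) potential functions; this sketch shows only the standard deterministic case.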
1 code implementation • ACL 2019 • Minjoon Seo, Jinhyuk Lee, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
Existing open-domain question answering (QA) models are not suitable for real-time usage because they need to process several long documents on-demand for every input query.
2 code implementations • 8 Jun 2019 • Sachin Mehta, Hannaneh Hajishirzi, Mohammad Rastegari
When DiCE units are stacked to build the DiCENet model, we observe significant improvements over state-of-the-art models across various computer vision tasks including image classification, object detection, and semantic segmentation.
Ranked #20 on Semantic Segmentation on PASCAL VOC 2012 val
2 code implementations • ACL 2019 • Sewon Min, Victor Zhong, Luke Zettlemoyer, Hannaneh Hajishirzi
Multi-hop Reading Comprehension (RC) requires reasoning and aggregation across several paragraphs.
Ranked #68 on Question Answering on HotpotQA
1 code implementation • ACL 2019 • Sewon Min, Eric Wallace, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi, Luke Zettlemoyer
Multi-hop reading comprehension (RC) questions are challenging because they require reading and reasoning over multiple paragraphs.
1 code implementation • SEMEVAL 2019 • Mark Hopkins, Ronan Le Bras, Cristian Petrescu-Prahova, Gabriel Stanovsky, Hannaneh Hajishirzi, Rik Koncel-Kedziorski
Systems were evaluated based on the percentage of correctly answered questions.
no code implementations • NAACL 2019 • Aida Amini, Saadia Gabriel, Peter Lin, Rik Koncel-Kedziorski, Yejin Choi, Hannaneh Hajishirzi
We introduce a new representation language to model precise operation programs corresponding to each math problem that aim to improve both the performance and the interpretability of the learned models.
3 code implementations • NAACL 2019 • Yi Luan, Dave Wadden, Luheng He, Amy Shah, Mari Ostendorf, Hannaneh Hajishirzi
We introduce a general framework for several information extraction tasks that share span representations using dynamically constructed span graphs.
Ranked #1 on Relation Extraction on ACE 2004 (Cross Sentence metric)
3 code implementations • NAACL 2019 • Rik Koncel-Kedziorski, Dhanush Bekal, Yi Luan, Mirella Lapata, Hannaneh Hajishirzi
Generating texts which express complex ideas spanning multiple sentences requires a structured representation of their content (document plan), but these representations are prohibitively expensive to manually produce.
Ranked #6 on KG-to-Text Generation on AGENDA
8 code implementations • CVPR 2019 • Sachin Mehta, Mohammad Rastegari, Linda Shapiro, Hannaneh Hajishirzi
Compared to YOLOv2 on MS-COCO object detection, ESPNetv2 delivers 4.4% higher accuracy with 6x fewer FLOPs.
Ranked #41 on Semantic Segmentation on PASCAL VOC 2012 test
3 code implementations • EMNLP 2018 • Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles.
Ranked #7 on Joint Entity and Relation Extraction on SciERC
2 code implementations • EMNLP 2018 • Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi
We introduce the Pyramidal Recurrent Unit (PRU), which enables learning representations in high dimensional space with more generalization power and fewer parameters.
no code implementations • NAACL 2018 • James Ferguson, Colin Lockard, Daniel S. Weld, Hannaneh Hajishirzi
Supervised event extraction systems are limited in their accuracy due to the lack of available training data.
no code implementations • SEMEVAL 2018 • Yi Luan, Mari Ostendorf, Hannaneh Hajishirzi
This paper describes our submission for the SemEval 2018 Task 7 shared task on semantic relation extraction and classification in scientific papers.
no code implementations • 28 Apr 2018 • Benjamin Robaidek, Rik Koncel-Kedziorski, Hannaneh Hajishirzi
We explore contemporary, data-driven techniques for solving math word problems over recent large-scale datasets.
1 code implementation • EMNLP 2018 • Minjoon Seo, Tom Kwiatkowski, Ankur P. Parikh, Ali Farhadi, Hannaneh Hajishirzi
We formalize a new modular variant of current question answering tasks by enforcing complete independence of the document encoder from the question encoder.
7 code implementations • ECCV 2018 • Sachin Mehta, Mohammad Rastegari, Anat Caspi, Linda Shapiro, Hannaneh Hajishirzi
We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints.
Ranked #48 on Semantic Segmentation on PASCAL VOC 2012 test
no code implementations • 21 Nov 2017 • Sachin Mehta, Hannaneh Hajishirzi, Linda Shapiro
We present an approach for identifying the most walkable direction for navigation using a hand-held camera.
1 code implementation • ICLR 2018 • Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
Inspired by the principles of speed reading, we introduce Skim-RNN, a recurrent neural network (RNN) that dynamically decides to update only a small fraction of the hidden state for relatively unimportant input tokens.
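A toy sketch of the skim mechanism: a decision network routes each token either to a full-size recurrent cell or to a much smaller cell that touches only part of the hidden state. The hard argmax here stands in for the differentiable training trick (Gumbel-softmax) used in the paper:

```python
import torch
import torch.nn as nn

class SkimRNNCell(nn.Module):
    """One skim-RNN step: update all d dims with the big cell, or only
    the first d_small dims with the small cell, per token."""
    def __init__(self, d_in, d, d_small):
        super().__init__()
        self.big = nn.GRUCell(d_in, d)
        self.small = nn.GRUCell(d_in, d_small)
        self.decide = nn.Linear(d_in + d, 2)
        self.d_small = d_small

    def forward(self, x, h):
        # Hard read/skim decision per example in the batch.
        skim = self.decide(torch.cat([x, h], dim=-1)).argmax(-1, keepdim=True)
        h_big = self.big(x, h)
        h_small = torch.cat(
            [self.small(x, h[:, :self.d_small]), h[:, self.d_small:]], dim=-1)
        return torch.where(skim.bool(), h_small, h_big)
```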
no code implementations • EMNLP 2017 • Yi Luan, Mari Ostendorf, Hannaneh Hajishirzi
This paper addresses the problem of extracting keyphrases from scientific articles and categorizing them as corresponding to a task, process, or material.
no code implementations • CVPR 2017 • Aniruddha Kembhavi, Minjoon Seo, Dustin Schwenk, Jonghyun Choi, Ali Farhadi, Hannaneh Hajishirzi
Our analysis shows that a significant portion of questions require complex parsing of the text and the diagrams, as well as reasoning, indicating that our dataset is more complex than previous machine comprehension and visual question answering datasets.
1 code implementation • ACL 2017 • Sewon Min, Minjoon Seo, Hannaneh Hajishirzi
We show that the task of question answering (QA) can significantly benefit from the transfer learning of models trained on a different large, fine-grained QA dataset.
24 code implementations • 5 Nov 2016 • Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi
Machine comprehension (MC), answering a query about a given context paragraph, requires modeling complex interactions between the context and the query.
Ranked #4 on Question Answering on CNN / Daily Mail
no code implementations • EMNLP 2016 • Rik Koncel-Kedziorski, Ioannis Konstas, Luke Zettlemoyer, Hannaneh Hajishirzi
Texts present coherent stories that have a particular theme or overall setting, for example science fiction or western.
2 code implementations • 14 Jun 2016 • Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
In this paper, we study the problem of question answering when reasoning over multiple facts is required.
Ranked #2 on Question Answering on bAbI
no code implementations • CVPR 2016 • Roozbeh Mottaghi, Hannaneh Hajishirzi, Ali Farhadi
With the recent progress in visual recognition, we have already started to see a surge of vision related real-world applications.
no code implementations • 12 Apr 2016 • Vicky Zayats, Mari Ostendorf, Hannaneh Hajishirzi
We introduce a new approach for disfluency detection using a Bidirectional Long Short-Term Memory (BLSTM) neural network.
1 code implementation • 24 Mar 2016 • Aniruddha Kembhavi, Mike Salvato, Eric Kolve, Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi
We define syntactic parsing of diagrams as learning to infer DPGs for diagrams and study semantic interpretation and reasoning of diagrams in the context of diagram question answering.
no code implementations • 2 Feb 2016 • Hessam Bagherinezhad, Hannaneh Hajishirzi, Yejin Choi, Ali Farhadi
In this paper, we introduce a method to automatically infer object sizes, leveraging visual and textual information from the web.
no code implementations • EMNLP 2015 • Aaron Jaech, Victoria Zayats, Hao Fang, Mari Ostendorf, Hannaneh Hajishirzi
This paper addresses the question of how language use affects community reaction to comments in online discussion forums, and the relative importance of the message vs. the messenger.
no code implementations • CVPR 2015 • Mohammad Rastegari, Hannaneh Hajishirzi, Ali Farhadi
In this paper we present a bottom-up method to instance-level Multiple Instance Learning (MIL) that learns to discover positive instances with globally constrained reasoning about local pairwise similarities.
no code implementations • TACL 2015 • Rik Koncel-Kedziorski, Hannaneh Hajishirzi, Ashish Sabharwal, Oren Etzioni, Siena Dumas Ang
This paper formalizes the problem of solving multi-sentence algebraic word problems as that of generating and scoring equation trees.
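The generate-and-score formulation can be illustrated by exhaustively enumerating binary equation trees over a problem's quantities; a toy sketch (a real system would prune the space and rank candidates with a learned scorer against the problem text):

```python
from itertools import permutations

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "/": lambda a, b: a / b if b else None}

def enumerate_trees(nums):
    """Yield (expression, value) for every binary equation tree whose
    leaves are the given quantities, in order."""
    if len(nums) == 1:
        yield str(nums[0]), nums[0]
        return
    for i in range(1, len(nums)):
        for ls, lv in enumerate_trees(nums[:i]):
            for rs, rv in enumerate_trees(nums[i:]):
                for op, fn in OPS.items():
                    v = fn(lv, rv)
                    if v is not None:  # skip division by zero
                        yield f"({ls} {op} {rs})", v

# e.g. quantities from "Anna has 3 bags with 4 apples each, eats 2";
# the candidate (3 * 4) - 2 = 10 appears among the enumerated trees.
candidates = [t for perm in set(permutations((3, 4, 2)))
              for t in enumerate_trees(list(perm))]
```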