1 code implementation • 11 Sep 2024 • Ben Bogin, Kejuan Yang, Shashank Gupta, Kyle Richardson, Erin Bransom, Peter Clark, Ashish Sabharwal, Tushar Khot
To advance towards this goal, we introduce SUPER, the first benchmark designed to evaluate the capability of LLMs in setting up and executing tasks from research repositories.
no code implementations • 22 Jul 2024 • Ori Yoran, Samuel Joseph Amouyal, Chaitanya Malaviya, Ben Bogin, Ofir Press, Jonathan Berant
Language agents, built on top of language models (LMs), are systems that can interact with complex environments, such as the open web.
no code implementations • 30 Mar 2024 • Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo
Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility.
1 code implementation • 31 Jan 2024 • Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo
As a result, it is challenging to conduct and advance scientific research on language modeling, such as understanding how training data impacts model capabilities and limitations.
1 code implementation • 16 Nov 2023 • Ben Bogin, Shivanshu Gupta, Peter Clark, Ashish Sabharwal
In-context learning (ICL) is an appealing approach for semantic parsing due to its few-shot nature and improved generalization.
1 code implementation • 25 Apr 2023 • Ori Yoran, Tomer Wolfson, Ben Bogin, Uri Katz, Daniel Deutch, Jonathan Berant
Modern systems for multi-hop question answering (QA) typically break questions into a sequence of reasoning steps, termed chain-of-thought (CoT), before arriving at a final answer.
Ranked #2 on Question Answering on Bamboogle
1 code implementation • 13 Dec 2022 • Itay Levy, Ben Bogin, Jonathan Berant
In-context learning has shown great success in i.i.d. semantic parsing splits, where the training and test sets are drawn from the same distribution.
1 code implementation • 1 Nov 2022 • Elad Segal, Ben Bogin, Jonathan Berant
We experiment with a high-performing vision-language model, and analyze the effect of bimodal supervision on three vision-language tasks.
1 code implementation • 15 Jan 2022 • Ben Bogin, Shivanshu Gupta, Jonathan Berant
While recent work has convincingly shown that sequence-to-sequence models struggle to generalize to new compositions (termed compositional generalization), little is known about what makes compositional generalization hard on a particular test instance.
1 code implementation • EMNLP 2021 • Ben Bogin, Shivanshu Gupta, Matt Gardner, Jonathan Berant
Due to the automatic generation process, COVR facilitates the creation of compositional splits, where models at test time need to generalize to new concepts and compositions in a zero- or few-shot setting.
1 code implementation • ACL (NLP4Prog) 2021 • Moshe Hazoom, Vibhor Malik, Ben Bogin
Most available semantic parsing datasets, comprising pairs of natural utterances and logical forms, were collected solely for the purpose of training and evaluation of natural language understanding systems.
Ranked #1 on Text-To-SQL on SEDE
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Sanjay Subramanian, Lucy Lu Wang, Sachin Mehta, Ben Bogin, Madeleine van Zuylen, Sravanthi Parasa, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi
To address challenges in figure retrieval and figure-to-text alignment, we introduce MedICaT, a dataset of medical images in context.
no code implementations • 1 Oct 2020 • Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, A. Zhang, Ben Zhou
Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities.
1 code implementation • 1 Jul 2020 • Ben Bogin, Sanjay Subramanian, Matt Gardner, Jonathan Berant
However, state-of-the-art models in grounded question answering often do not explicitly perform decomposition, leading to difficulties in generalization to out-of-distribution examples.
1 code implementation • ACL 2020 • Sanjay Subramanian, Ben Bogin, Nitish Gupta, Tomer Wolfson, Sameer Singh, Jonathan Berant, Matt Gardner
Neural module networks (NMNs) are a popular approach for modeling compositionality: they achieve high accuracy when applied to problems in language and vision, while reflecting the compositional structure of the problem in the network architecture.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, Ben Zhou
Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities.
1 code implementation • IJCNLP 2019 • Ben Bogin, Matt Gardner, Jonathan Berant
State-of-the-art semantic parsers rely on auto-regressive decoding, emitting one symbol at a time.
no code implementations • 30 May 2019 • Kevin Lin, Ben Bogin, Mark Neumann, Jonathan Berant, Matt Gardner
The sequence-to-sequence paradigm employed by neural text-to-SQL models typically performs token-level decoding and does not consider generating SQL hierarchically from a grammar.
1 code implementation • ACL 2019 • Ben Bogin, Matt Gardner, Jonathan Berant
Research on parsing language to SQL has largely ignored the structure of the database (DB) schema, either because the DB was very simple, or because it was observed at both training and test time.
1 code implementation • 3 Sep 2018 • Ben Bogin, Mor Geva, Jonathan Berant
Training agents to communicate with one another given only task-based supervision has attracted considerable attention recently, due to the growing interest in developing models for human-agent interaction.
no code implementations • COLING 2018 • Ran Levy, Ben Bogin, Shai Gretz, Ranit Aharonov, Noam Slonim
Our results clearly indicate that the system is able to successfully generalize from the weak signal, outperforming previously reported results in terms of both precision and coverage.
3 code implementations • 5 Jun 2017 • Ofir Press, Amir Bar, Ben Bogin, Jonathan Berant, Lior Wolf
Generative Adversarial Networks (GANs) have shown great promise recently in image generation.