1 code implementation • EMNLP 2021 • Aman Madaan, Niket Tandon, Dheeraj Rajagopal, Peter Clark, Yiming Yang, Eduard Hovy
Defeasible reasoning is the mode of reasoning where conclusions can be overturned by taking into account new evidence.
no code implementations • Findings (EMNLP) 2021 • Keisuke Sakaguchi, Chandra Bhagavatula, Ronan Le Bras, Niket Tandon, Peter Clark, Yejin Choi
Scripts – prototypical event sequences describing everyday activities – have been shown to help understand narratives by providing expectations, resolving ambiguity, and filling in unstated information.
no code implementations • 29 Jul 2024 • Tao Feng, Lizhen Qu, Niket Tandon, Zhuang Li, Xiaoxi Kang, Gholamreza Haffari
Recent advances in artificial intelligence have seen Large Language Models (LLMs) demonstrate notable proficiency in causal discovery tasks.
1 code implementation • 30 May 2024 • Li Zhang, Peter Jansen, Tianyi Zhang, Peter Clark, Chris Callison-Burch, Niket Tandon
A recent, promising line of work uses LLMs to generate a formal representation of the environment that can be solved by a symbolic planner.
1 code implementation • 25 Apr 2024 • Wenlong Zhao, Debanjan Mondal, Niket Tandon, Danica Dillion, Kurt Gray, Yuling Gu
The awareness of multi-cultural human values is critical to the ability of language models (LMs) to generate safe and personalized responses.
no code implementations • 29 Feb 2024 • Tianyi Zhang, Li Zhang, Zhaoyi Hou, Ziyu Wang, Yuling Gu, Peter Clark, Chris Callison-Burch, Niket Tandon
Planning in a text-based environment continues to be a major challenge for AI systems.
no code implementations • 21 Feb 2024 • Qing Lyu, Kumar Shridhar, Chaitanya Malaviya, Li Zhang, Yanai Elazar, Niket Tandon, Marianna Apidianaki, Mrinmaya Sachan, Chris Callison-Burch
Accurately gauging the confidence level of Large Language Models' (LLMs) predictions is pivotal for their reliable application.
1 code implementation • 8 Feb 2024 • Tianjun Zhang, Aman Madaan, Luyu Gao, Steven Zheng, Swaroop Mishra, Yiming Yang, Niket Tandon, Uri Alon
We evaluate LEAP on a wide range of benchmarks, including multi-hop question answering (Hotpot QA), textual QA (DROP), Big-Bench Hard reasoning, and math problems (GSM8K and MATH); in all these benchmarks, LEAP improves the strongest available LLMs such as GPT-3.5-turbo, GPT-4, GPT-4 turbo and Claude-2.1.
no code implementations • 16 Nov 2023 • Yash Kumar Lal, Li Zhang, Faeze Brahman, Bodhisattwa Prasad Majumder, Peter Clark, Niket Tandon
Our approach is to test several simple multi-LLM-agent architectures for customization, as well as an end-to-end LLM, using a new evaluation set, called CustomPlans, of over 200 WikiHow procedures each with a customization need.
no code implementations • 14 Nov 2023 • Kushal Jain, Moritz Miller, Niket Tandon, Kumar Shridhar
Language models can solve complex reasoning tasks better by learning to generate rationales for their predictions.
no code implementations • 24 Oct 2023 • Kavel Rao, Liwei Jiang, Valentina Pyatkin, Yuling Gu, Niket Tandon, Nouha Dziri, Faeze Brahman, Yejin Choi
From this model we distill a high-quality dataset, δ-Rules-of-Thumb, of 1.2M entries of contextualizations and rationales for 115K defeasible moral actions rated highly by human annotators 85.9% to 99.8% of the time.
no code implementations • 16 Oct 2023 • Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Peter Jansen, Oyvind Tafjord, Niket Tandon, Li Zhang, Chris Callison-Burch, Peter Clark
Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning.
no code implementations • 1 Jul 2023 • Beatriz Borges, Niket Tandon, Tanja Käser, Antoine Bosselut
Natural Language Feedback (NLF) is an increasingly popular mechanism for aligning Large Language Models (LLMs) to human preferences.
1 code implementation • 24 May 2023 • Li Zhang, Hainiu Xu, Abhinav Kommula, Chris Callison-Burch, Niket Tandon
An earlier dataset, OpenPI, provided crowdsourced annotations of entity state changes in text.
1 code implementation • 24 May 2023 • Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, Sarah Wiegreffe, Niket Tandon
However, these editing methods have only been evaluated on statements about encyclopedic knowledge with a single correct answer.
no code implementations • 24 May 2023 • EunJeong Hwang, Bodhisattwa Prasad Majumder, Niket Tandon
An important aspect of developing LLMs that interact with humans is to align models' behavior to their users.
1 code implementation • 15 May 2023 • Afra Feyza Akyürek, Ekin Akyürek, Aman Madaan, Ashwin Kalyan, Peter Clark, Derry Wijaya, Niket Tandon
Despite their unprecedented success, even the largest language models make mistakes.
3 code implementations • NeurIPS 2023 • Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark
Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement.
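The generate, feedback, refine loop described in this entry can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `self_refine`, its callable parameters, the stopping check, and the iteration cap are all hypothetical names chosen here, and the toy callables stand in for what would be LLM calls.

```python
from typing import Callable


def self_refine(
    prompt: str,
    generate: Callable[[str], str],
    feedback: Callable[[str, str], str],
    refine: Callable[[str, str, str], str],
    is_done: Callable[[str], bool],
    max_iters: int = 4,  # assumed cap; not taken from the source
) -> str:
    """Produce an initial output, then alternate feedback and refinement."""
    output = generate(prompt)
    for _ in range(max_iters):
        fb = feedback(prompt, output)
        if is_done(fb):  # stop once the feedback reports nothing left to fix
            break
        output = refine(prompt, output, fb)
    return output


# Toy stand-ins for the three model calls (hypothetical, for illustration only).
def toy_generate(prompt: str) -> str:
    return prompt.strip()


def toy_feedback(prompt: str, output: str) -> str:
    return "OK" if output.endswith(".") else "Add a final period."


def toy_refine(prompt: str, output: str, fb: str) -> str:
    return output + "."


result = self_refine(
    "Water boils at 100 C",
    toy_generate,
    toy_feedback,
    toy_refine,
    is_done=lambda fb: fb == "OK",
)
print(result)
```

Passing the three steps as callables keeps the loop independent of any particular model or prompt format.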
no code implementations • 25 May 2022 • Aman Madaan, Dheeraj Rajagopal, Niket Tandon, Yiming Yang, Antoine Bosselut
Conditional set generation learns a mapping from an input sequence of tokens to a set.
1 code implementation • 16 Jan 2022 • Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
Large LMs such as GPT-3 are powerful, but can commit mistakes that are obvious to humans.
1 code implementation • Findings (NAACL) 2022 • Niket Tandon, Aman Madaan, Peter Clark, Yiming Yang
Our goal is for an LM to continue to improve after deployment, without retraining, using feedback from the user.
1 code implementation • 15 Dec 2021 • Niket Tandon, Aman Madaan, Peter Clark, Keisuke Sakaguchi, Yiming Yang
We present a new dataset, Interscript, containing user feedback on a deployed model that generates complex everyday tasks.
1 code implementation • 24 Oct 2021 • Aman Madaan, Niket Tandon, Dheeraj Rajagopal, Peter Clark, Yiming Yang, Eduard Hovy
Defeasible reasoning is the mode of reasoning where conclusions can be overturned by taking into account new evidence.
1 code implementation • AKBC Workshop CSKB 2021 • Aman Madaan, Dheeraj Rajagopal, Niket Tandon, Yiming Yang, Eduard Hovy
Defeasible reasoning is the mode of reasoning where conclusions can be overturned by taking into account new evidence.
no code implementations • 18 Apr 2021 • Aman Madaan, Niket Tandon, Dheeraj Rajagopal, Yiming Yang, Peter Clark, Keisuke Sakaguchi, Ed Hovy
A class of explainable NLP models for reasoning tasks support their decisions by generating free-form or structured explanations, but what happens when these supporting structures contain errors?
no code implementations • 16 Apr 2021 • Keisuke Sakaguchi, Chandra Bhagavatula, Ronan Le Bras, Niket Tandon, Peter Clark, Yejin Choi
Scripts - standardized event sequences describing typical everyday activities - have been shown to help understand narratives by providing expectations, resolving ambiguity, and filling in unstated information.
1 code implementation • CSRR (ACL) 2022 • Dheeraj Rajagopal, Aman Madaan, Niket Tandon, Yiming Yang, Shrimai Prabhumoye, Abhilasha Ravichander, Peter Clark, Eduard Hovy
Recently, models have been shown to predict the effects of unexpected situations, e.g., would cloudy skies help or hinder plant growth?
no code implementations • 4 Mar 2021 • Simon Razniewski, Niket Tandon, Aparna S. Varde
Commonsense knowledge is a foundational cornerstone of artificial intelligence applications.
no code implementations • EMNLP 2020 • Niket Tandon, Keisuke Sakaguchi, Bhavana Dalvi Mishra, Dheeraj Rajagopal, Peter Clark, Michal Guerquin, Kyle Richardson, Eduard Hovy
Our solution is a new task formulation where given just a procedural text as input, the task is to generate a set of state change tuples (entity, attribute, before-state, after-state) for each step, where the entity, attribute, and state values must be predicted from an open vocabulary.
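The four-field state change tuple named in this entry could be represented as a small record type. The sketch below is only an illustration under assumptions: the class `StateChange`, the helper `format_change`, the sentence template, and the example step are all invented here and are not taken from the dataset or its released code.

```python
from typing import List, NamedTuple


class StateChange(NamedTuple):
    """One open-vocabulary state change for a single step of a procedure."""
    entity: str
    attribute: str
    before_state: str
    after_state: str


def format_change(change: StateChange) -> str:
    """Render a tuple as a readable sentence (template is an assumption)."""
    return (
        f"{change.attribute} of {change.entity} was {change.before_state} "
        f"before and {change.after_state} afterwards"
    )


# Hypothetical step: "Heat the pan on the stove."
step_changes: List[StateChange] = [
    StateChange("pan", "temperature", "cool", "hot"),
    StateChange("stove", "power", "off", "on"),
]

for change in step_changes:
    print(format_change(change))
```

All four fields are free-form strings because the values are predicted from an open vocabulary rather than chosen from a fixed label set.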
no code implementations • 12 Jun 2020 • Sumithra Bhakthavatsalam, Kyle Richardson, Niket Tandon, Peter Clark
We present a new knowledge-base of hasPart relationships, extracted from a large corpus of generic statements.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Dheeraj Rajagopal, Niket Tandon, Bhavana Dalvi, Peter Clark, Eduard Hovy
We address the task of explaining the effects of perturbations in procedural text, an important test of process comprehension.
1 code implementation • 10 Sep 2019 • Niket Tandon, Bhavana Dalvi Mishra, Keisuke Sakaguchi, Antoine Bosselut, Peter Clark
We introduce WIQA, the first large-scale dataset of "What if..." questions over procedural text.
no code implementations • IJCNLP 2019 • Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark
Our goal is to better comprehend procedural text, e.g., a paragraph about photosynthesis, by not only predicting what happens, but why some actions need to happen before others.
no code implementations • 4 Sep 2019 • Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Niket Tandon, Sumithra Bhakthavatsalam, Dirk Groeneveld, Michal Guerquin, Michael Schmitz
This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90% on the exam's non-diagram, multiple-choice (NDMC) questions.
no code implementations • 2 Sep 2019 • Sreyasi Nag Chowdhury, Niket Tandon, Hakan Ferhatosmanoglu, Gerhard Weikum
CBIR now gains semantic expressiveness through advances in deep-learning-based detection of visual labels.
no code implementations • WS 2016 • Sreyasi Nag Chowdhury, Niket Tandon, Gerhard Weikum
With the rise in popularity of social media, images accompanied by contextual text form a huge section of the web.
1 code implementation • NAACL 2019 • Xinya Du, Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark, Claire Cardie
Our goal is procedural text comprehension, namely tracking how the properties of entities (e.g., their location) change with time given a procedural text (e.g., a paragraph about photosynthesis, a recipe).
1 code implementation • EMNLP 2018 • Niket Tandon, Bhavana Dalvi Mishra, Joel Grus, Wen-tau Yih, Antoine Bosselut, Peter Clark
Comprehending procedural text, e.g., a paragraph describing photosynthesis, requires modeling actions and the state changes they produce, so that questions about entities at different timepoints can be answered.
no code implementations • NAACL 2018 • Bhavana Dalvi Mishra, Lifu Huang, Niket Tandon, Wen-tau Yih, Peter Clark
The new dataset, ProPara, is the first to contain natural (rather than machine-generated) text about a changing world along with a full annotation of entity states (location and existence) during those changes (81k datapoints).
Ranked #4 on Procedural Text Understanding on ProPara
no code implementations • 15 Apr 2018 • Peter Clark, Bhavana Dalvi, Niket Tandon
To supply this knowledge, we leverage VerbNet to build a rulebase (called the Semantic Lexicon) of the preconditions and effects of actions, and use it along with commonsense knowledge of persistence to answer questions about change.
no code implementations • 26 Sep 2016 • Atousa Torabi, Niket Tandon, Leonid Sigal
We evaluate our models on the large-scale LSMDC16 movie dataset for two tasks: (1) standard ranking for video annotation and retrieval, and (2) our proposed movie multiple-choice test.
Ranked #40 on Video Retrieval on MSR-VTT
no code implementations • 12 May 2016 • Anna Rohrbach, Atousa Torabi, Marcus Rohrbach, Niket Tandon, Christopher Pal, Hugo Larochelle, Aaron Courville, Bernt Schiele
In addition, we collected and aligned movie scripts used in prior work and compare the two sources of descriptions.
no code implementations • CVPR 2015 • Anna Rohrbach, Marcus Rohrbach, Niket Tandon, Bernt Schiele
In this work, we propose a novel dataset that contains transcribed DVS temporally aligned to full-length HD movies.