Search Results for author: Peter Clark

Found 112 papers, 47 papers with code

proScript: Partially Ordered Scripts Generation

no code implementations Findings (EMNLP) 2021 Keisuke Sakaguchi, Chandra Bhagavatula, Ronan Le Bras, Niket Tandon, Peter Clark, Yejin Choi

Scripts – prototypical event sequences describing everyday activities – have been shown to help understand narratives by providing expectations, resolving ambiguity, and filling in unstated information.

Script Generation Text Generation +1

Think about it! Improving defeasible reasoning by first modeling the question scenario.

1 code implementation EMNLP 2021 Aman Madaan, Niket Tandon, Dheeraj Rajagopal, Peter Clark, Yiming Yang, Eduard Hovy

Defeasible reasoning is the mode of reasoning where conclusions can be overturned by taking into account new evidence.

SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories

1 code implementation 11 Sep 2024 Ben Bogin, Kejuan Yang, Shashank Gupta, Kyle Richardson, Erin Bransom, Peter Clark, Ashish Sabharwal, Tushar Khot

To advance towards this goal, we introduce SUPER, the first benchmark designed to evaluate the capability of LLMs in setting up and executing tasks from research repositories.

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

1 code implementation 1 Jul 2024 Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Bhavana Dalvi Mishra, Abhijeetsingh Meena, Aryan Prakhar, Tirth Vora, Tushar Khot, Ashish Sabharwal, Peter Clark

Can the rapid advances in code generation, function calling, and data analysis using large language models (LLMs) help automate the search and verification of hypotheses purely from a set of provided datasets?

Code Generation Sociology

DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents

1 code implementation 10 Jun 2024 Peter Jansen, Marc-Alexandre Côté, Tushar Khot, Erin Bransom, Bhavana Dalvi Mishra, Bodhisattwa Prasad Majumder, Oyvind Tafjord, Peter Clark

However, developing and evaluating an AI agent's capacity for end-to-end scientific reasoning is challenging as running real-world experiments is often prohibitively expensive or infeasible.

Benchmarking scientific discovery

Can Language Models Serve as Text-Based World Simulators?

no code implementations 10 Jun 2024 Ruoyao Wang, Graham Todd, Ziang Xiao, Xingdi Yuan, Marc-Alexandre Côté, Peter Clark, Peter Jansen

Can current language models themselves serve as world simulators, correctly predicting how actions change different world states, thus bypassing the need for extensive manual coding?

Benchmarking Decision Making

PDDLEGO: Iterative Planning in Textual Environments

1 code implementation 30 May 2024 Li Zhang, Peter Jansen, Tianyi Zhang, Peter Clark, Chris Callison-Burch, Niket Tandon

A recent, promising line of work uses LLMs to generate a formal representation of the environment that can be solved by a symbolic planner.

Learning to Reason via Program Generation, Emulation, and Search

1 code implementation 25 May 2024 Nathaniel Weir, Muhammad Khalifa, Linlu Qiu, Orion Weller, Peter Clark

CoGEX works by (1) training LMs to generate their own pseudo-programs, (2) teaching them to emulate their generated program's execution, including those leaf functions, allowing the LM's knowledge to fill in the execution gaps; and (3) using them to search over many programs to find an optimal one.

Code Generation In-Context Learning +1

Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic

no code implementations 22 Feb 2024 Nathaniel Weir, Kate Sanders, Orion Weller, Shreya Sharma, Dongwei Jiang, Zhengping Jiang, Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Jansen, Peter Clark, Benjamin Van Durme

Recent language models enable new opportunities for structured reasoning with text, such as the construction of intuitive, proof-like textual entailment trees without relying on brittle formal logic.

Formal Logic Knowledge Distillation +2

Data-driven Discovery with Large Generative Models

no code implementations 21 Feb 2024 Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Sanchaita Hazra, Ashish Sabharwal, Peter Clark

With the accumulation of data at an unprecedented rate, its potential to fuel scientific discovery is growing exponentially.

scientific discovery

Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills

1 code implementation 5 Feb 2024 Kolby Nottingham, Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Sameer Singh, Peter Clark, Roy Fox

We evaluate our method in the classic videogame NetHack and the text environment ScienceWorld to demonstrate SSO's ability to optimize a set of skills and perform in-context policy improvement.

Decision Making Language Modelling +1

The Unreasonable Effectiveness of Easy Training Data for Hard Tasks

1 code implementation 12 Jan 2024 Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe

In this paper, we present the surprising conclusion that current pretrained language models often generalize relatively well from easy to hard data, even performing as well as oracle models finetuned on hard data.

General Knowledge In-Context Learning +1

BaRDa: A Belief and Reasoning Dataset that Separates Factual Accuracy and Reasoning Ability

no code implementations 12 Dec 2023 Peter Clark, Bhavana Dalvi Mishra, Oyvind Tafjord

This shows the clear progression of models towards improved factual accuracy and entailment reasoning, and the dataset provides a new benchmark that more cleanly separates and quantifies these two notions.

counterfactual valid

Tailoring with Targeted Precision: Edit-Based Agents for Open-Domain Procedure Customization

no code implementations 16 Nov 2023 Yash Kumar Lal, Li Zhang, Faeze Brahman, Bodhisattwa Prasad Majumder, Peter Clark, Niket Tandon

Our approach is to test several simple multi-LLM-agent architectures for customization, as well as an end-to-end LLM, using a new evaluation set, called CustomPlans, of over 200 WikiHow procedures each with a customization need.

Leveraging Code to Improve In-context Learning for Semantic Parsing

1 code implementation 16 Nov 2023 Ben Bogin, Shivanshu Gupta, Peter Clark, Ashish Sabharwal

In-context learning (ICL) is an appealing approach for semantic parsing due to its few-shot nature and improved generalization.

In-Context Learning Semantic Parsing

Digital Socrates: Evaluating LLMs through Explanation Critiques

no code implementations 16 Nov 2023 Yuling Gu, Oyvind Tafjord, Peter Clark

While LLMs can provide reasoned explanations along with their answers, the nature and quality of those explanations are still poorly understood.

ADaPT: As-Needed Decomposition and Planning with Language Models

1 code implementation 8 Nov 2023 Archiki Prasad, Alexander Koller, Mareike Hartmann, Peter Clark, Ashish Sabharwal, Mohit Bansal, Tushar Khot

Large Language Models (LLMs) are increasingly being used for interactive decision-making tasks requiring planning and adapting to the environment.

Decision Making

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

1 code implementation 8 Nov 2023 Shashank Gupta, Vaishnavi Shrivastava, Ameet Deshpande, Ashwin Kalyan, Peter Clark, Ashish Sabharwal, Tushar Khot

Our experiments with ChatGPT-3.5 show that this bias is ubiquitous - 80% of our personas demonstrate bias; it is significant - some datasets show performance drops of 70%+; and can be especially harmful for certain groups - some personas suffer statistically significant drops on 80%+ of the datasets.

Fairness Math

QualEval: Qualitative Evaluation for Model Improvement

no code implementations 6 Nov 2023 Vishvak Murahari, Ameet Deshpande, Peter Clark, Tanmay Rajpurohit, Ashish Sabharwal, Karthik Narasimhan, Ashwin Kalyan

In this work, we address the shortcomings of quantitative metrics by proposing QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.

CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization

no code implementations 16 Oct 2023 Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Peter Jansen, Oyvind Tafjord, Niket Tandon, Li Zhang, Chris Callison-Burch, Peter Clark

Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning.

Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy

1 code implementation 24 May 2023 Sarah Wiegreffe, Matthew Finlayson, Oyvind Tafjord, Peter Clark, Ashish Sabharwal

For example, both normalization and prompting methods for reducing SFC can be ineffective or even detrimental to task performance for some LMs.

In-Context Learning Multiple-choice +1

Language Models with Rationality

no code implementations 23 May 2023 Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schuetze, Peter Clark

To address this, our goals are to make model beliefs and their inferential relationships explicit, and to resolve inconsistencies that may exist, so that answers are supported by interpretable chains of reasoning drawn from a consistent network of beliefs.

Question Answering

IfQA: A Dataset for Open-domain Question Answering under Counterfactual Presuppositions

no code implementations 23 May 2023 Wenhao Yu, Meng Jiang, Peter Clark, Ashish Sabharwal

Although counterfactual reasoning is a fundamental aspect of intelligence, the lack of large-scale counterfactual open-domain question-answering (QA) benchmarks makes it difficult to evaluate and improve models on this ability.

counterfactual Counterfactual Reasoning +2

Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation

no code implementations 22 May 2023 Zhenwen Liang, Wenhao Yu, Tanmay Rajpurohit, Peter Clark, Xiangliang Zhang, Ashwin Kalyan

In this paper, we present a novel approach for distilling math word problem solving capabilities from large language models (LLMs) into smaller, more efficient student models.

Knowledge Tracing Math +1

Do language models have coherent mental models of everyday things?

1 code implementation 20 Dec 2022 Yuling Gu, Bhavana Dalvi Mishra, Peter Clark

Using these questions as probes, we observe that state-of-the-art pre-trained language models (LMs) like GPT-3 and Macaw have fragments of knowledge about these everyday things, but do not have fully coherent "parts mental models" (54-59% accurate, 19-43% conditional constraint violation).

Just-DREAM-about-it: Figurative Language Understanding with DREAM-FLUTE

1 code implementation 28 Oct 2022 Yuling Gu, Yao Fu, Valentina Pyatkin, Ian Magnusson, Bhavana Dalvi Mishra, Peter Clark

We hypothesize that to perform this task well, the reader needs to mentally elaborate the scene being described to identify a sensible meaning of the language.

Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning

no code implementations 21 Oct 2022 Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark

Our goal is a question-answering (QA) system that can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning.

Question Answering

Decomposed Prompting: A Modular Approach for Solving Complex Tasks

1 code implementation 5 Oct 2022 Tushar Khot, Harsh Trivedi, Matthew Finlayson, Yao Fu, Kyle Richardson, Peter Clark, Ashish Sabharwal

On symbolic reasoning tasks, we can further decompose sub-tasks that are hard for LLMs into even simpler solvable sub-tasks.

Information Retrieval Retrieval

Complexity-Based Prompting for Multi-Step Reasoning

no code implementations 3 Oct 2022 Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot

In this work, we propose complexity-based prompting, a simple and effective example selection scheme for multi-step reasoning.

Date Understanding GSM8K +2

Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning

2 code implementations 29 Sep 2022 Pan Lu, Liang Qiu, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Tanmay Rajpurohit, Peter Clark, Ashwin Kalyan

However, it is unknown if the models can handle more complex problems that involve math reasoning over heterogeneous information, such as tabular data.

Logical Reasoning Math +1

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

1 code implementation 20 Sep 2022 Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, Ashwin Kalyan

We further design language models to learn to generate lectures and explanations as the chain of thought (CoT) to mimic the multi-hop reasoning process when answering ScienceQA questions.

Multimodal Deep Learning Multimodal Reasoning +5

NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning

no code implementations 16 Sep 2022 Nathaniel Weir, Peter Clark, Benjamin Van Durme

Our goal is a modern approach to answering questions via systematic reasoning where answers are supported by human interpretable proof trees grounded in an NL corpus of authoritative facts.

Hallucination Language Modelling +1

Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement

no code implementations 27 Apr 2022 Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Clark

Our goal is a teachable reasoning system for question-answering (QA), where a user can interact with faithful answer explanations, and correct its errors so that the system improves over time.

Question Answering

What Makes Instruction Learning Hard? An Investigation and a New Challenge in a Synthetic Environment

1 code implementation 19 Apr 2022 Matthew Finlayson, Kyle Richardson, Ashish Sabharwal, Peter Clark

We propose Hard RegSet as a challenging instruction learning task, and a controlled environment for studying instruction learning.

Out-of-Distribution Generalization

Memory-assisted prompt editing to improve GPT-3 after deployment

1 code implementation 16 Jan 2022 Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang

Large LMs such as GPT-3 are powerful, but can commit mistakes that are obvious to humans.

DREAM: Improving Situational QA by First Elaborating the Situation

1 code implementation NAACL 2022 Yuling Gu, Bhavana Dalvi Mishra, Peter Clark

To test this conjecture, we train a new model, DREAM, to answer questions that elaborate the scenes that situated questions are about, and then provide those elaborations as additional context to a question-answering (QA) model.

Question Answering

Interscript: A dataset for interactive learning of scripts through error feedback

1 code implementation 15 Dec 2021 Niket Tandon, Aman Madaan, Peter Clark, Keisuke Sakaguchi, Yiming Yang

We present a new dataset, Interscript, containing user feedback on a deployed model that generates complex everyday tasks.

Structured Prediction

Think about it! Improving defeasible reasoning by first modeling the question scenario

1 code implementation 24 Oct 2021 Aman Madaan, Niket Tandon, Dheeraj Rajagopal, Peter Clark, Yiming Yang, Eduard Hovy

Defeasible reasoning is the mode of reasoning where conclusions can be overturned by taking into account new evidence.

BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief

no code implementations EMNLP 2021 Nora Kassner, Oyvind Tafjord, Hinrich Schütze, Peter Clark

We show that, in a controlled experimental setting, these two mechanisms result in more consistent beliefs in the overall system, improving both the accuracy and consistency of its answers over time.

Language Modelling World Knowledge

General-Purpose Question-Answering with Macaw

2 code implementations 6 Sep 2021 Oyvind Tafjord, Peter Clark

Despite the successes of pretrained language models, there are still few high-quality, general-purpose QA systems that are freely available.

Generative Question Answering Multiple-choice

Improving Neural Model Performance through Natural Language Feedback on Their Explanations

no code implementations 18 Apr 2021 Aman Madaan, Niket Tandon, Dheeraj Rajagopal, Yiming Yang, Peter Clark, Keisuke Sakaguchi, Ed Hovy

A class of explainable NLP models for reasoning tasks support their decisions by generating free-form or structured explanations, but what happens when these supporting structures contain errors?

Explaining Answers with Entailment Trees

1 code implementation EMNLP 2021 Bhavana Dalvi, Peter Jansen, Oyvind Tafjord, Zhengnan Xie, Hannah Smith, Leighanna Pipatanangkura, Peter Clark

Our approach is to generate explanations in the form of entailment trees, namely a tree of multipremise entailment steps from facts that are known, through intermediate conclusions, to the hypothesis of interest (namely the question + answer).

Language Modelling Question Answering +1

proScript: Partially Ordered Scripts Generation via Pre-trained Language Models

no code implementations 16 Apr 2021 Keisuke Sakaguchi, Chandra Bhagavatula, Ronan Le Bras, Niket Tandon, Peter Clark, Yejin Choi

Scripts - standardized event sequences describing typical everyday activities - have been shown to help understand narratives by providing expectations, resolving ambiguity, and filling in unstated information.

Script Generation Text Generation +1

ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language

no code implementations Findings (ACL) 2021 Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark

In this work we show that a generative model, called ProofWriter, can reliably generate both implications of a theory and the natural language proof(s) that support them.

A Dataset for Tracking Entities in Open Domain Procedural Text

no code implementations EMNLP 2020 Niket Tandon, Keisuke Sakaguchi, Bhavana Dalvi Mishra, Dheeraj Rajagopal, Peter Clark, Michal Guerquin, Kyle Richardson, Eduard Hovy

Our solution is a new task formulation where given just a procedural text as input, the task is to generate a set of state change tuples (entity, attribute, before-state, after-state) for each step, where the entity, attribute, and state values must be predicted from an open vocabulary.

Attribute

Text Modular Networks: Learning to Decompose Tasks in the Language of Existing Models

1 code implementation NAACL 2021 Tushar Khot, Daniel Khashabi, Kyle Richardson, Peter Clark, Ashish Sabharwal

We propose a general framework called Text Modular Networks (TMNs) for building interpretable systems that learn to solve complex tasks by decomposing them into simpler ones solvable by existing models.

Question Answering

Do Dogs have Whiskers? A New Knowledge Base of hasPart Relations

no code implementations 12 Jun 2020 Sumithra Bhakthavatsalam, Kyle Richardson, Niket Tandon, Peter Clark

We present a new knowledge-base of hasPart relationships, extracted from a large corpus of generic statements.

Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

1 code implementation NeurIPS 2020 Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant

In this work, we provide a first demonstration that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.

World Knowledge

Knowledge Patterns

no code implementations 8 May 2020 Peter Clark, John Thompson, Bruce Porter

From a modeling perspective, knowledge patterns provide an important insight into the structure of a formal ontology: rather than viewing a formal ontology simply as a list of terms and axioms, knowledge patterns views it as a collection of abstract, modular theories (the "knowledge patterns") plus a collection of modeling decisions stating how different aspects of the world can be modeled using those theories.

GenericsKB: A Knowledge Base of Generic Statements

no code implementations 2 May 2020 Sumithra Bhakthavatsalam, Chloe Anastasiades, Peter Clark

We present a new resource for the NLP community, namely a large (3.5M+ sentence) knowledge base of *generic statements*, e.g., "Trees remove carbon dioxide from the atmosphere", collected from multiple corpora.

Sentence

Transformers as Soft Reasoners over Language

2 code implementations 14 Feb 2020 Peter Clark, Oyvind Tafjord, Kyle Richardson

However, expressing the knowledge in a formal (logical or probabilistic) representation has been a major obstacle to this research.

counterfactual Counterfactual Reasoning +2

Everything Happens for a Reason: Discovering the Purpose of Actions in Procedural Text

no code implementations IJCNLP 2019 Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark

Our goal is to better comprehend procedural text, e.g., a paragraph about photosynthesis, by not only predicting what happens, but why some actions need to happen before others.

Reading Comprehension

WIQA: A dataset for "What if..." reasoning over procedural text

1 code implementation 10 Sep 2019 Niket Tandon, Bhavana Dalvi Mishra, Keisuke Sakaguchi, Antoine Bosselut, Peter Clark

We introduce WIQA, the first large-scale dataset of "What if..." questions over procedural text.

Multiple-choice

QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions

no code implementations IJCNLP 2019 Oyvind Tafjord, Matt Gardner, Kevin Lin, Peter Clark

QuaRTz contains general qualitative statements, e.g., "A sunscreen with a higher SPF protects the skin longer."

General Knowledge

From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project

no code implementations 4 Sep 2019 Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Niket Tandon, Sumithra Bhakthavatsalam, Dirk Groeneveld, Michal Guerquin, Michael Schmitz

This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90% on the exam's non-diagram, multiple choice (NDMC) questions.

Multiple-choice Question Answering

Reasoning Over Paragraph Effects in Situations

no code implementations WS 2019 Kevin Lin, Oyvind Tafjord, Peter Clark, Matt Gardner

A system is presented a background passage containing at least one of these relations, a novel situation that uses this background, and questions that require reasoning about effects of the relationships in the background passage in the context of the situation.

Reading Comprehension

Be Consistent! Improving Procedural Text Comprehension using Label Consistency

1 code implementation NAACL 2019 Xinya Du, Bhavana Dalvi Mishra, Niket Tandon, Antoine Bosselut, Wen-tau Yih, Peter Clark, Claire Cardie

Our goal is procedural text comprehension, namely tracking how the properties of entities (e.g., their location) change with time given a procedural text (e.g., a paragraph about photosynthesis, a recipe).

Reading Comprehension

Declarative Question Answering over Knowledge Bases containing Natural Language Text with Answer Set Programming

1 code implementation 1 May 2019 Arindam Mitra, Peter Clark, Oyvind Tafjord, Chitta Baral

While in recent years machine learning (ML) based approaches have been the popular approach in developing end-to-end question answering systems, such systems often struggle when additional knowledge is needed to correctly answer the questions.

Logical Reasoning Natural Language Inference +1

QuaRel: A Dataset and Models for Answering Questions about Qualitative Relationships

no code implementations 20 Nov 2018 Oyvind Tafjord, Peter Clark, Matt Gardner, Wen-tau Yih, Ashish Sabharwal

Many natural language questions require recognizing and reasoning with qualitative relationships (e.g., in science, economics, and medicine), but are challenging to answer with corpus-based methods.

Friction Semantic Parsing

Exploiting Explicit Paths for Multi-hop Reading Comprehension

1 code implementation ACL 2019 Souvik Kundu, Tushar Khot, Ashish Sabharwal, Peter Clark

To capture additional context, PathNet also composes the passage representations along each path to compute a passage-based representation.

Implicit Relations Knowledge Graphs +1

Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering

1 code implementation EMNLP 2018 Todor Mihaylov, Peter Clark, Tushar Khot, Ashish Sabharwal

Our oracle experiments designed to circumvent the knowledge retrieval bottleneck demonstrate the value of both the open book and additional facts.

Question Answering Retrieval

Reasoning about Actions and State Changes by Injecting Commonsense Knowledge

1 code implementation EMNLP 2018 Niket Tandon, Bhavana Dalvi Mishra, Joel Grus, Wen-tau Yih, Antoine Bosselut, Peter Clark

Comprehending procedural text, e.g., a paragraph describing photosynthesis, requires modeling actions and the state changes they produce, so that questions about entities at different timepoints can be answered.

Reading Comprehension Structured Prediction

Bridging Knowledge Gaps in Neural Entailment via Symbolic Models

no code implementations EMNLP 2018 Dongyeop Kang, Tushar Khot, Ashish Sabharwal, Peter Clark

We focus on filling these knowledge gaps in the Science Entailment task, by leveraging an external structured knowledge base (KB) of science facts.

Natural Language Inference

What Knowledge is Needed to Solve the RTE5 Textual Entailment Challenge?

no code implementations 10 Jun 2018 Peter Clark

The analysis ignores shallow statistical matching techniques between T and H, and rather asks: What would it take to reasonably infer that T implies H?

Natural Language Inference RTE +1

Tracking State Changes in Procedural Text: A Challenge Dataset and Models for Process Paragraph Comprehension

no code implementations NAACL 2018 Bhavana Dalvi Mishra, Lifu Huang, Niket Tandon, Wen-tau Yih, Peter Clark

The new dataset, ProPara, is the first to contain natural (rather than machine-generated) text about a changing world along with a full annotation of entity states (location and existence) during those changes (81k datapoints).

Procedural Text Understanding

What Happened? Leveraging VerbNet to Predict the Effects of Actions in Procedural Text

no code implementations 15 Apr 2018 Peter Clark, Bhavana Dalvi, Niket Tandon

To supply this knowledge, we leverage VerbNet to build a rulebase (called the Semantic Lexicon) of the preconditions and effects of actions, and use it along with commonsense knowledge of persistence to answer questions about change.

Reading Comprehension

Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

1 code implementation 14 Mar 2018 Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord

We present a new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering.

ARC Question Answering +1

Story Generation and Aviation Incident Representation

no code implementations 13 Feb 2018 Peter Clark

This working note discusses the topic of story generation, with a view to identifying the knowledge required to understand aviation incident narratives (which have structural similarities to stories), following the premise that to understand aviation incidents, one should at least be able to generate examples of them.

Story Generation

Framing QA as Building and Ranking Intersentence Answer Justifications

no code implementations CL 2017 Peter Jansen, Rebecca Sharp, Mihai Surdeanu, Peter Clark

Our best configuration answers 44% of the questions correctly, where the top justifications for 57% of these correct answers contain a compelling human-readable justification that explains the inference required to arrive at the correct answer.

Multiple-choice Question Answering

Answering Complex Questions Using Open Information Extraction

1 code implementation ACL 2017 Tushar Khot, Ashish Sabharwal, Peter Clark

While there has been substantial progress in factoid question-answering (QA), answering complex questions remains challenging, typically requiring both a large body of knowledge and inference techniques.

Open Information Extraction Question Answering +1

Domain-Targeted, High Precision Knowledge Extraction

no code implementations TACL 2017 Bhavana Dalvi Mishra, Niket Tandon, Peter Clark

Our goal is to construct a domain-targeted, high precision knowledge base (KB), containing general (subject, predicate, object) statements about the world, in support of a downstream question-answering (QA) application.

Open Information Extraction Question Answering +1

Creating Causal Embeddings for Question Answering with Minimal Supervision

no code implementations EMNLP 2016 Rebecca Sharp, Mihai Surdeanu, Peter Jansen, Peter Clark, Michael Hammond

We argue that a better approach is to look for answers that are related to the question in a relevant way, according to the information need of the question, which may be determined through task-specific embeddings.

Question Answering Word Embeddings

Question Answering via Integer Programming over Semi-Structured Knowledge

no code implementations 20 Apr 2016 Daniel Khashabi, Tushar Khot, Ashish Sabharwal, Peter Clark, Oren Etzioni, Dan Roth

We propose a structured inference system for this task, formulated as an Integer Linear Program (ILP), that answers natural language questions using a semi-structured knowledge base derived from text, including questions requiring multi-step inference and a combination of multiple facts.

Information Retrieval Question Answering +1

Moving Beyond the Turing Test with the Allen AI Science Challenge

3 code implementations 14 Apr 2016 Carissa Schoenick, Peter Clark, Oyvind Tafjord, Peter Turney, Oren Etzioni

Given recent successes in AI (e.g., AlphaGo's victory against Lee Sedol in the game of Go), it's become increasingly important to assess: how close are AI systems to human-level intelligence?

Question Answering

Higher-order Lexical Semantic Models for Non-factoid Answer Reranking

no code implementations TACL 2015 Daniel Fried, Peter Jansen, Gustave Hahn-Powell, Mihai Surdeanu, Peter Clark

We introduce a higher-order formalism that allows all these lexical semantic models to chain direct evidence to construct indirect associations between question and answer texts, by casting the task as the traversal of graphs that encode direct term associations.

Open-Domain Question Answering Semantic Similarity +1
