Search Results for author: Oyvind Tafjord

Found 38 papers, 18 papers with code

“Let Your Characters Tell Their Story”: A Dataset for Character-Centric Narrative Understanding

no code implementations Findings (EMNLP) 2021 Faeze Brahman, Meng Huang, Oyvind Tafjord, Chao Zhao, Mrinmaya Sachan, Snigdha Chaturvedi

When reading a literary piece, readers often make inferences about various characters’ roles, personalities, relationships, intents, actions, etc.

"You are grounded!": Latent Name Artifacts in Pre-trained Language Models

no code implementations EMNLP 2020 Vered Shwartz, Rachel Rudinger, Oyvind Tafjord

Pre-trained language models (LMs) may perpetuate biases originating in their training corpus to downstream models.

Reading Comprehension

Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic

no code implementations 22 Feb 2024 Nathaniel Weir, Kate Sanders, Orion Weller, Shreya Sharma, Dongwei Jiang, Zhengping Jiang, Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Jansen, Peter Clark, Benjamin Van Durme

Contemporary language models enable new opportunities for structured reasoning with text, such as the construction and evaluation of intuitive, proof-like textual entailment trees without relying on brittle formal logic.

Formal Logic Knowledge Distillation +2

Paloma: A Benchmark for Evaluating Language Model Fit

no code implementations 16 Dec 2023 Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah A. Smith, Kyle Richardson, Jesse Dodge

We invite submissions to our benchmark and organize results by comparability based on compliance with guidelines such as removal of benchmark contamination from pretraining.

Language Modelling

BaRDa: A Belief and Reasoning Dataset that Separates Factual Accuracy and Reasoning Ability

no code implementations 12 Dec 2023 Peter Clark, Bhavana Dalvi Mishra, Oyvind Tafjord

This shows the clear progression of models towards improved factual accuracy and entailment reasoning, and the dataset provides a new benchmark that more cleanly separates and quantifies these two notions.

counterfactual valid

Digital Socrates: Evaluating LLMs through Explanation Critiques

no code implementations 16 Nov 2023 Yuling Gu, Oyvind Tafjord, Peter Clark

While LLMs can provide reasoned explanations along with their answers, the nature and quality of those explanations are still poorly understood.

CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization

no code implementations 16 Oct 2023 Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Peter Jansen, Oyvind Tafjord, Niket Tandon, Li Zhang, Chris Callison-Burch, Peter Clark

Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning.

Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy

1 code implementation 24 May 2023 Sarah Wiegreffe, Matthew Finlayson, Oyvind Tafjord, Peter Clark, Ashish Sabharwal

For example, both normalization and prompting methods for reducing SFC can be ineffective or even detrimental to task performance for some LMs.

In-Context Learning Multiple-choice +1

Language Models with Rationality

no code implementations 23 May 2023 Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schütze, Peter Clark

To address this, our goals are to make model beliefs and their inferential relationships explicit, and to resolve inconsistencies that may exist, so that answers are supported by interpretable chains of reasoning drawn from a consistent network of beliefs.

Question Answering

Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning

no code implementations 21 Oct 2022 Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark

Our goal is a question-answering (QA) system that can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning.

Question Answering

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

1 code implementation 20 Sep 2022 Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, Ashwin Kalyan

We further design language models to learn to generate lectures and explanations as the chain of thought (CoT) to mimic the multi-hop reasoning process when answering ScienceQA questions.

Multimodal Deep Learning Multimodal Reasoning +5

Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement

no code implementations 27 Apr 2022 Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Clark

Our goal is a teachable reasoning system for question-answering (QA), where a user can interact with faithful answer explanations, and correct its errors so that the system improves over time.

Question Answering

BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief

no code implementations EMNLP 2021 Nora Kassner, Oyvind Tafjord, Hinrich Schütze, Peter Clark

We show that, in a controlled experimental setting, these two mechanisms result in more consistent beliefs in the overall system, improving both the accuracy and consistency of its answers over time.

Language Modelling World Knowledge

"Let Your Characters Tell Their Story": A Dataset for Character-Centric Narrative Understanding

no code implementations 12 Sep 2021 Faeze Brahman, Meng Huang, Oyvind Tafjord, Chao Zhao, Mrinmaya Sachan, Snigdha Chaturvedi

When reading a literary piece, readers often make inferences about various characters' roles, personalities, relationships, intents, actions, etc.

General-Purpose Question-Answering with Macaw

2 code implementations 6 Sep 2021 Oyvind Tafjord, Peter Clark

Despite the successes of pretrained language models, there are still few high-quality, general-purpose QA systems that are freely available.

Generative Question Answering Multiple-choice

Explaining Answers with Entailment Trees

1 code implementation EMNLP 2021 Bhavana Dalvi, Peter Jansen, Oyvind Tafjord, Zhengnan Xie, Hannah Smith, Leighanna Pipatanangkura, Peter Clark

Our approach is to generate explanations in the form of entailment trees, namely a tree of multi-premise entailment steps from facts that are known, through intermediate conclusions, to the hypothesis of interest (namely the question + answer).

Language Modelling Question Answering +1
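The tree structure described above lends itself to a simple recursive sketch. The following is a hypothetical illustration, not the dataset's actual release format (the type name, helper function, and example sentences are mine): leaves hold known facts, internal nodes hold intermediate conclusions, and the root holds the hypothesis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EntailmentNode:
    """One node of an entailment tree: `premises` jointly entail `statement`."""
    statement: str
    premises: List["EntailmentNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # Leaves carry the known facts; internal nodes carry conclusions.
        return not self.premises


def supporting_facts(node: EntailmentNode) -> List[str]:
    """Collect the known facts (leaf statements) that support the root."""
    if node.is_leaf():
        return [node.statement]
    return [s for p in node.premises for s in supporting_facts(p)]


# Toy example: two facts entail an intermediate conclusion, which,
# together with a third fact, entails the hypothesis at the root.
fact1 = EntailmentNode("Leaves absorb sunlight.")
fact2 = EntailmentNode("Sunlight provides energy for photosynthesis.")
intermediate = EntailmentNode(
    "Leaves capture energy for photosynthesis.", premises=[fact1, fact2])
fact3 = EntailmentNode("Photosynthesis produces food for the plant.")
hypothesis = EntailmentNode(
    "Leaves help a plant make food.", premises=[intermediate, fact3])
```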

ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language

no code implementations Findings (ACL) 2021 Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark

In this work we show that a generative model, called ProofWriter, can reliably generate both implications of a theory and the natural language proof(s) that support them.
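As a rough illustration of the task ProofWriter learns end-to-end, a symbolic baseline would forward-chain over a small theory to a fixpoint, recording which premises produced each derived fact. The sketch below is hypothetical (rule encoding and example sentences are mine) and mimics only the symbolic side of the task, not the generative model itself.

```python
def forward_chain(facts, rules):
    """Derive all implications of a small theory by forward chaining.

    Rules are (premises, conclusion) pairs over plain statements. We
    apply rules until nothing new is derivable, and for each derived
    fact record the premises that produced it (a flat one-step proof).
    Given facts map to None to mark them as axioms.
    """
    known = set(facts)
    proofs = {f: None for f in facts}
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                proofs[conclusion] = list(premises)
                changed = True
    return proofs


# A toy theory in the spirit of natural-language rulebases.
theory = {"Harry is young.", "Harry is kind."}
rules = [
    (("Harry is young.", "Harry is kind."), "Harry is nice."),
    (("Harry is nice.",), "People like Harry."),
]
derived = forward_chain(theory, rules)
```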

Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

1 code implementation NeurIPS 2020 Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant

In this work, we provide a first demonstration that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.

World Knowledge

UnifiedQA: Crossing Format Boundaries With a Single QA System

2 code implementations Findings of the Association for Computational Linguistics 2020 Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, Hannaneh Hajishirzi

As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UnifiedQA, that performs surprisingly well across 17 QA datasets spanning 4 diverse formats.

Common Sense Reasoning Language Modelling +3
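The "single format" idea amounts to linearizing every QA instance, whatever its original shape, into one text-to-text input. The formatter below is a hypothetical approximation of such an encoding (the separator and choice-letter style are illustrative; the released UnifiedQA code defines the exact format):

```python
def to_unified_format(question, choices=None, context=None):
    """Linearize a QA instance into a single text-to-text input string.

    The question, optional lettered answer choices, and optional context
    are joined into one sequence, so a single seq2seq model can be
    trained across extractive, abstractive, multiple-choice, and yes/no
    formats alike.
    """
    parts = [question]
    if choices:
        letters = "abcdefghijklmnopqrstuvwxyz"
        parts.append(" ".join(
            f"({letters[i]}) {c}" for i, c in enumerate(choices)))
    if context:
        parts.append(context)
    # A literal "\n" token acts as the field separator in the flat string.
    return " \\n ".join(parts)
```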

"You are grounded!": Latent Name Artifacts in Pre-trained Language Models

1 code implementation 6 Apr 2020 Vered Shwartz, Rachel Rudinger, Oyvind Tafjord

Pre-trained language models (LMs) may perpetuate biases originating in their training corpus to downstream models.

Reading Comprehension

Transformers as Soft Reasoners over Language

2 code implementations 14 Feb 2020 Peter Clark, Oyvind Tafjord, Kyle Richardson

However, expressing the knowledge in a formal (logical or probabilistic) representation has been a major obstacle to this research.

counterfactual Counterfactual Reasoning +2

SUPP.AI: Finding Evidence for Supplement-Drug Interactions

1 code implementation ACL 2020 Lucy Lu Wang, Oyvind Tafjord, Arman Cohan, Sarthak Jain, Sam Skjonsberg, Carissa Schoenick, Nick Botner, Waleed Ammar

We fine-tune the contextualized word representations of the RoBERTa language model using labeled DDI data, and apply the fine-tuned model to identify supplement interactions.

General Classification Language Modelling

QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions

no code implementations IJCNLP 2019 Oyvind Tafjord, Matt Gardner, Kevin Lin, Peter Clark

QuaRTz contains general qualitative statements, e.g., "A sunscreen with a higher SPF protects the skin longer."

General Knowledge
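A qualitative statement like the one quoted above can be read as a signed influence between two quantities. The toy below is a hypothetical encoding of that single statement (rule table and names are mine, not the QuaRTz models' approach):

```python
# Encode "A sunscreen with a higher SPF protects the skin longer."
# as signed influences: a change in one quantity entails a change
# in another.
QUALITATIVE_RULES = {
    ("SPF", "higher"): ("protection time", "longer"),
    ("SPF", "lower"): ("protection time", "shorter"),
}


def qualitative_effect(quantity, direction):
    """Return the (affected quantity, direction) entailed by the rules,
    or None when the theory says nothing about this change."""
    return QUALITATIVE_RULES.get((quantity, direction))
```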

From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project

no code implementations 4 Sep 2019 Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Niket Tandon, Sumithra Bhakthavatsalam, Dirk Groeneveld, Michal Guerquin, Michael Schmitz

This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90% on the exam's non-diagram, multiple choice (NDMC) questions.

Multiple-choice Question Answering

Reasoning Over Paragraph Effects in Situations

no code implementations WS 2019 Kevin Lin, Oyvind Tafjord, Peter Clark, Matt Gardner

A system is presented a background passage containing at least one of these relations, a novel situation that uses this background, and questions that require reasoning about effects of the relationships in the background passage in the context of the situation.

Reading Comprehension

Declarative Question Answering over Knowledge Bases containing Natural Language Text with Answer Set Programming

1 code implementation 1 May 2019 Arindam Mitra, Peter Clark, Oyvind Tafjord, Chitta Baral

While in recent years machine learning (ML) based approaches have been the popular approach in developing end-to-end question answering systems, such systems often struggle when additional knowledge is needed to correctly answer the questions.

Logical Reasoning Natural Language Inference +1

QuaRel: A Dataset and Models for Answering Questions about Qualitative Relationships

no code implementations 20 Nov 2018 Oyvind Tafjord, Peter Clark, Matt Gardner, Wen-tau Yih, Ashish Sabharwal

Many natural language questions require recognizing and reasoning with qualitative relationships (e.g., in science, economics, and medicine), but are challenging to answer with corpus-based methods.

Friction Semantic Parsing

Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

1 code implementation 14 Mar 2018 Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord

We present a new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering.

Question Answering Retrieval

Moving Beyond the Turing Test with the Allen AI Science Challenge

3 code implementations 14 Apr 2016 Carissa Schoenick, Peter Clark, Oyvind Tafjord, Peter Turney, Oren Etzioni

Given recent successes in AI (e.g., AlphaGo's victory against Lee Sedol in the game of Go), it's become increasingly important to assess: how close are AI systems to human-level intelligence?

Question Answering
