Search Results for author: Benjamin Van Durme

Found 205 papers, 72 papers with code

Craft an Iron Sword: Dynamically Generating Interactive Game Characters by Prompting Large Language Models Tuned on Code

1 code implementation • NAACL (Wordplay) 2022 • Ryan Volum, Sudha Rao, Michael Xu, Gabriel DesGarennes, Chris Brockett, Benjamin Van Durme, Olivia Deng, Akanksha Malhotra, Bill Dolan

In this work, we demonstrate that use of a few example conversational prompts can power a conversational agent to generate both natural language and novel code.

Paper
Code

Improved Induction of Narrative Chains via Cross-Document Relations

1 code implementation • *SEM (NAACL) 2022 • Andrew Blair-Stanek, Benjamin Van Durme

The standard approach for inducing narrative chains considers statistics gathered per individual document.

Paper
Code

Guided K-best Selection for Semantic Parsing Annotation

no code implementations • ACL 2022 • Anton Belyy, Chieh-Yang Huang, Jacob Andreas, Emmanouil Antonios Platanios, Sam Thomson, Richard Shin, Subhro Roy, Aleksandr Nisnevich, Charles Chen, Benjamin Van Durme

Collecting data for conversational semantic parsing is a time-consuming and demanding process.

Semantic Parsing valid

Paper
Add Code

Human-Model Divergence in the Handling of Vagueness

no code implementations • ACL (unimplicit) 2021 • Elias Stengel-Eskin, Jimena Guallar-Blasco, Benjamin Van Durme

Paper
Add Code

Online Neural Coreference Resolution with Rollback

no code implementations • COLING (CRAC) 2022 • Patrick Xia, Benjamin Van Durme

Humans process natural language online, whether reading a document or participating in multiparty dialogue.

coreference-resolution Dialogue Understanding

Paper
Add Code

AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees

no code implementations • 12 Apr 2024 • William Fleshman, Aleem Khan, Marc Marone, Benjamin Van Durme

Large language models (LLMs) are increasingly capable of completing knowledge intensive tasks by recalling information from a static pretraining corpus.

Continual Learning

Paper
Add Code

Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data

no code implementations • 5 Apr 2024 • Jingyu Zhang, Marc Marone, Tianjian Li, Benjamin Van Durme, Daniel Khashabi

To address these limitations, we tackle the verifiability goal with a different philosophy: we trivialize the verification process by developing models that quote verbatim statements from trusted sources in pre-training data.

Philosophy

Paper
Add Code

SELF-[IN]CORRECT: LLMs Struggle with Refining Self-Generated Responses

no code implementations • 4 Apr 2024 • Dongwei Jiang, Jingyu Zhang, Orion Weller, Nathaniel Weir, Benjamin Van Durme, Daniel Khashabi

Can LLMs continually improve their previous outputs for better results?

Paper
Add Code

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions

1 code implementation • 22 Mar 2024 • Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn Lawrie, Luca Soldaini

First, we introduce our dataset FollowIR, which contains a rigorous instruction evaluation benchmark as well as a training set for helping IR models learn to better follow real-world instructions.

Information Retrieval Retrieval +1

Paper
Code

Dated Data: Tracing Knowledge Cutoffs in Large Language Models

no code implementations • 19 Mar 2024 • Jeffrey Cheng, Marc Marone, Orion Weller, Dawn Lawrie, Daniel Khashabi, Benjamin Van Durme

Using this analysis, we find that effective cutoffs often differ from reported cutoffs.

Paper
Add Code

A Closer Look at Claim Decomposition

no code implementations • 18 Mar 2024 • Miriam Wanner, Seth Ebner, Zhengping Jiang, Mark Dredze, Benjamin Van Durme

We investigate how various methods of claim decomposition -- especially LLM-based methods -- affect the result of an evaluation approach such as the recently proposed FActScore, finding that it is sensitive to the decomposition method used.

Attribute

Paper
Add Code

Tur[k]ingBench: A Challenge Benchmark for Web Agents

no code implementations • 18 Mar 2024 • Kevin Xu, Yeganeh Kordi, Kate Sanders, Yizhong Wang, Adam Byerly, Jack Zhang, Benjamin Van Durme, Daniel Khashabi

We evaluate the performance of state-of-the-art models, including language-only, vision-only, and layout-only models, and their combinations, on this benchmark.

Paper
Add Code

LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error

1 code implementation • 7 Mar 2024 • Boshi Wang, Hao Fang, Jason Eisner, Benjamin Van Durme, Yu Su

We find that existing LLMs, including GPT-4 and open-source LLMs specifically fine-tuned for tool use, only reach a correctness rate in the range of 30% to 60%, far from reliable use in practice.

Continual Learning In-Context Learning

Paper
Code

TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning

no code implementations • 29 Feb 2024 • Kate Sanders, Nathaniel Weir, Benjamin Van Durme

It is challenging to perform question-answering over complex, multimodal content such as television clips.

Question Answering Video Understanding

Paper
Add Code

RORA: Robust Free-Text Rationale Evaluation

no code implementations • 28 Feb 2024 • Zhengping Jiang, Yining Lu, Hanjie Chen, Daniel Khashabi, Benjamin Van Durme, Anqi Liu

This is achieved by assessing the conditional V-information \citep{hewitt-etal-2021-conditional} with a predictive family robust against leaky features that can be exploited by a small model.

Decision Making

Paper
Add Code

Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic

no code implementations • 22 Feb 2024 • Nathaniel Weir, Kate Sanders, Orion Weller, Shreya Sharma, Dongwei Jiang, Zhengping Jiang, Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Jansen, Peter Clark, Benjamin Van Durme

Contemporary language models enable new opportunities for structured reasoning with text, such as the construction and evaluation of intuitive, proof-like textual entailment trees without relying on brittle formal logic.

Formal Logic Knowledge Distillation +2

Paper
Add Code

Streaming Sequence Transduction through Dynamic Compression

1 code implementation • 2 Feb 2024 • Weiting Tan, Yunmo Chen, Tongfei Chen, Guanghui Qin, Haoran Xu, Heidi C. Zhang, Benjamin Van Durme, Philipp Koehn

We introduce STAR (Stream Transduction with Anchor Representations), a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Code

MultiMUC: Multilingual Template Filling on MUC-4

1 code implementation • 29 Jan 2024 • William Gantt, Shabnam Behzad, Hannah Youngeun An, Yunmo Chen, Aaron Steven White, Benjamin Van Durme, Mahsa Yarmohammadi

We introduce MultiMUC, the first multilingual parallel corpus for template filling, comprising translations of the classic MUC-4 template filling benchmark into five languages: Arabic, Chinese, Farsi, Korean, and Russian.

Machine Translation Translation

Paper
Code

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

1 code implementation • 16 Jan 2024 • Haoran Xu, Amr Sharaf, Yunmo Chen, Weiting Tan, Lingfeng Shen, Benjamin Van Durme, Kenton Murray, Young Jin Kim

However, even the top-performing 13B LLM-based translation models, like ALMA, does not match the performance of state-of-the-art conventional encoder-decoder translation models or larger-scale LLMs such as GPT-4.

Decoder Machine Translation +1

304

Paper
Code

Reframing Tax Law Entailment as Analogical Reasoning

no code implementations • 12 Jan 2024 • Xinrui Zou, Ming Zhang, Nathaniel Weir, Benjamin Van Durme, Nils Holzenberger

We re-frame statutory reasoning as an analogy task, where each instance of the analogy task involves a combination of two instances of statutory reasoning.

Retrieval

Paper
Add Code

Do Androids Know They're Only Dreaming of Electric Sheep?

no code implementations • 28 Dec 2023 • Sky CH-Wang, Benjamin Van Durme, Jason Eisner, Chris Kedzie

We design probes trained on the internal representations of a transformer language model that are predictive of its hallucinatory behavior on in-context generation tasks.

Hallucination Hallucination Evaluation +1

Paper
Add Code

Interpreting User Requests in the Context of Natural Language Standing Instructions

1 code implementation • 16 Nov 2023 • Nikita Moghe, Patrick Xia, Jacob Andreas, Jason Eisner, Benjamin Van Durme, Harsh Jhamtani

Users of natural language interfaces, generally powered by Large Language Models (LLMs), often must repeat their preferences each time they make a similar request.

Paper
Code

BLT: Can Large Language Models Handle Basic Legal Text?

1 code implementation • 16 Nov 2023 • Andrew Blair-Stanek, Nils Holzenberger, Benjamin Van Durme

We find that the best publicly available LLMs like GPT-4, Claude, and {PaLM 2} currently perform poorly at basic legal text handling.

Paper
Code

Toucan: Token-Aware Character Level Language Modeling

no code implementations • 15 Nov 2023 • William Fleshman, Benjamin Van Durme

Character-level language models obviate the need for separately trained tokenizers, but efficiency suffers from longer sequence lengths.

Language Modelling

Paper
Add Code

FAMuS: Frames Across Multiple Sources

2 code implementations • 9 Nov 2023 • Siddharth Vashishtha, Alexander Martin, William Gantt, Benjamin Van Durme, Aaron Steven White

Understanding event descriptions is a central aspect of language processing, but current approaches focus overwhelmingly on single sentences or documents.

Sentence valid

Paper
Code

Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles

no code implementations • 4 Nov 2023 • Weiting Tan, Haoran Xu, Lingfeng Shen, Shuyue Stella Li, Kenton Murray, Philipp Koehn, Benjamin Van Durme, Yunmo Chen

Large language models trained primarily in a monolingual setting have demonstrated their ability to generalize to machine translation using zero- and few-shot examples with in-context learning.

In-Context Learning Machine Translation +1

Paper
Add Code

InstructExcel: A Benchmark for Natural Language Instruction in Excel

no code implementations • 23 Oct 2023 • Justin Payan, Swaroop Mishra, Mukul Singh, Carina Negreanu, Christian Poelitz, Chitta Baral, Subhro Roy, Rasika Chakravarthy, Benjamin Van Durme, Elnaz Nouri

With the evolution of Large Language Models (LLMs) we can solve increasingly more complex NLP tasks across various domains, including spreadsheets.

Paper
Add Code

A Unified View of Evaluation Metrics for Structured Prediction

1 code implementation • 20 Oct 2023 • Yunmo Chen, William Gantt, Tongfei Chen, Aaron Steven White, Benjamin Van Durme

We present a conceptual framework that unifies a variety of evaluation metrics for different structured prediction tasks (e. g. event and relation extraction, syntactic and semantic parsing).

Relation Extraction Semantic Parsing +1

Paper
Code

SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation

1 code implementation • 6 Oct 2023 • Abe Bohan Hou, Jingyu Zhang, Tianxing He, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov

Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design.

Sentence Text Generation

Paper
Code

Nugget 2D: Dynamic Contextual Compression for Scaling Decoder-only Language Models

no code implementations • 3 Oct 2023 • Guanghui Qin, Corby Rosset, Ethan C. Chau, Nikhil Rao, Benjamin Van Durme

Standard Transformer-based language models (LMs) scale poorly to long contexts.

Decoder Language Modelling +1

Paper
Add Code

Nugget: Neural Agglomerative Embeddings of Text

no code implementations • 3 Oct 2023 • Guanghui Qin, Benjamin Van Durme

This is problematic, as the amount of information contained in text often varies with the length of the input.

Language Modelling Machine Translation +1

Paper
Add Code

SCREWS: A Modular Framework for Reasoning with Revisions

1 code implementation • 20 Sep 2023 • Kumar Shridhar, Harsh Jhamtani, Hao Fang, Benjamin Van Durme, Jason Eisner, Patrick Xia

To enable exploration in this space, we present SCREWS, a modular framework for reasoning with revisions.

Multi-hop Question Answering Question Answering

Paper
Code

OpenAI Cribbed Our Tax Example, But Can GPT-4 Really Do Tax?

no code implementations • 15 Sep 2023 • Andrew Blair-Stanek, Nils Holzenberger, Benjamin Van Durme

The authors explain where OpenAI got the tax law example in its livestream demonstration of GPT-4, why GPT-4 got the wrong answer, and how it fails to reliably calculate taxes.

Paper
Add Code

When do Generative Query and Document Expansions Fail? A Comprehensive Study Across Methods, Retrievers, and Datasets

no code implementations • 15 Sep 2023 • Orion Weller, Kyle Lo, David Wadden, Dawn Lawrie, Benjamin Van Durme, Arman Cohan, Luca Soldaini

Using large language models (LMs) for query or document expansion can improve generalization in information retrieval.

Information Retrieval Retrieval

Paper
Add Code

MegaWika: Millions of reports and their sources across 50 diverse languages

no code implementations • 13 Jul 2023 • Samuel Barham, Orion Weller, Michelle Yuan, Kenton Murray, Mahsa Yarmohammadi, Zhengping Jiang, Siddharth Vashishtha, Alexander Martin, Anqi Liu, Aaron Steven White, Jordan Boyd-Graber, Benjamin Van Durme

To foster the development of new models for collaborative AI-assisted report generation, we introduce MegaWika, consisting of 13 million Wikipedia articles in 50 diverse languages, along with their 71 million referenced source materials.

Cross-Lingual Question Answering Retrieval +1

Paper
Add Code

MultiVENT: Multilingual Videos of Events with Aligned Natural Text

no code implementations • 6 Jul 2023 • Kate Sanders, David Etter, Reno Kriz, Benjamin Van Durme

Everyday news coverage has shifted from traditional broadcasts towards a wide range of presentation formats such as first-hand, unedited video footage.

Information Retrieval Retrieval +1

Paper
Add Code

Evaluating Paraphrastic Robustness in Textual Entailment Models

no code implementations • 29 Jun 2023 • Dhruv Verma, Yash Kumar Lal, Shreyashee Sinha, Benjamin Van Durme, Adam Poliak

We present PaRTE, a collection of 1, 126 pairs of Recognizing Textual Entailment (RTE) examples to evaluate whether models are robust to paraphrasing.

Natural Language Inference RTE

Paper
Add Code

Zero and Few-shot Semantic Parsing with Ambiguous Inputs

1 code implementation • 1 Jun 2023 • Elias Stengel-Eskin, Kyle Rawlins, Benjamin Van Durme

We attempt to address this shortcoming by introducing AmP, a framework, dataset, and challenge for translating ambiguous natural language to formal representations like logic and code.

Semantic Parsing

Paper
Code

InteractiveIE: Towards Assessing the Strength of Human-AI Collaboration in Improving the Performance of Information Extraction

no code implementations • 24 May 2023 • Ishani Mondal, Michelle Yuan, Anandhavelu N, Aparna Garimella, Francis Ferraro, Andrew Blair-Stanek, Benjamin Van Durme, Jordan Boyd-Graber

Learning template based information extraction from documents is a crucial yet difficult task.

Question Answering Question Generation +1

Paper
Add Code

Condensing Multilingual Knowledge with Lightweight Language-Specific Modules

1 code implementation • 23 May 2023 • Haoran Xu, Weiting Tan, Shuyue Stella Li, Yunmo Chen, Benjamin Van Durme, Philipp Koehn, Kenton Murray

Incorporating language-specific (LS) modules is a proven method to boost performance in multilingual machine translation.

Machine Translation Translation

Paper
Code

"According to ...": Prompting Language Models Improves Quoting from Pre-Training Data

no code implementations • 22 May 2023 • Orion Weller, Marc Marone, Nathaniel Weir, Dawn Lawrie, Daniel Khashabi, Benjamin Van Durme

Large Language Models (LLMs) may hallucinate and generate fake information, despite pre-training on factual data.

Paper
Add Code

NevIR: Negation in Neural Information Retrieval

1 code implementation • 12 May 2023 • Orion Weller, Dawn Lawrie, Benjamin Van Durme

Although the Information Retrieval (IR) community has adopted LMs as the backbone of modern IR architectures, there has been little to no research in understanding how negation impacts neural IR.

Information Retrieval Negation +1

Paper
Code

Did You Mean...? Confidence-based Trade-offs in Semantic Parsing

no code implementations • 29 Mar 2023 • Elias Stengel-Eskin, Benjamin Van Durme

We then examine how confidence scores can help optimize the trade-off between usability and safety.

Semantic Parsing

Paper
Add Code

Data Portraits: Recording Foundation Model Training Data

no code implementations • NeurIPS 2023 • Marc Marone, Benjamin Van Durme

Foundation models are trained on increasingly immense and opaque datasets.

Language Modelling

Paper
Add Code

Can GPT-3 Perform Statutory Reasoning?

1 code implementation • 13 Feb 2023 • Andrew Blair-Stanek, Nils Holzenberger, Benjamin Van Durme

Statutory reasoning is the task of reasoning with facts and statutes, which are rules written in natural language by a legislature.

Paper
Code

Defending Against Disinformation Attacks in Open-Domain Question Answering

no code implementations • 20 Dec 2022 • Orion Weller, Aleem Khan, Nathaniel Weir, Dawn Lawrie, Benjamin Van Durme

Recent work in open-domain question answering (ODQA) has shown that adversarial poisoning of the search collection can cause large drops in accuracy for production systems.

Data Poisoning Misinformation +1

Paper
Add Code

When Do Decompositions Help for Machine Reading?

no code implementations • 20 Dec 2022 • Kangda Wei, Dawn Lawrie, Benjamin Van Durme, Yunmo Chen, Orion Weller

Answering complex questions often requires multi-step reasoning in order to obtain the final answer.

Reading Comprehension Retrieval

Paper
Add Code

Ontologically Faithful Generation of Non-Player Character Dialogues

no code implementations • 20 Dec 2022 • Nathaniel Weir, Ryan Thomas, Randolph D'Amore, Kellie Hill, Benjamin Van Durme, Harsh Jhamtani

We introduce a language generation task grounded in a popular video game environment.

Dialogue Generation In-Context Learning

Paper
Add Code

Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning

2 code implementations • CVPR 2023 • Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, Alan Yuille

Visual Question Answering (VQA) models often perform poorly on out-of-distribution data and struggle on domain generalization.

Domain Generalization Question Answering +2

Paper
Code

Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models

no code implementations • 1 Dec 2022 • Zhuowan Li, Cihang Xie, Benjamin Van Durme, Alan Yuille

Despite the impressive advancements achieved through vision-and-language pretraining, it remains unclear whether this joint learning paradigm can help understand each individual modality.

Attribute Representation Learning

Paper
Add Code

Calibrated Interpretation: Confidence Estimation in Semantic Parsing

2 code implementations • 14 Nov 2022 • Elias Stengel-Eskin, Benjamin Van Durme

Sequence generation models are increasingly being used to translate natural language into programs, i. e. to perform executable semantic parsing.

Semantic Parsing

Paper
Code

Why Did the Chicken Cross the Road? Rephrasing and Analyzing Ambiguous Questions in VQA

1 code implementation • 14 Nov 2022 • Elias Stengel-Eskin, Jimena Guallar-Blasco, Yi Zhou, Benjamin Van Durme

Natural language is ambiguous.

Question Generation Question-Generation +1

Paper
Code

Automatic Document Selection for Efficient Encoder Pretraining

no code implementations • 20 Oct 2022 • Yukun Feng, Patrick Xia, Benjamin Van Durme, João Sedoc

Building pretrained language models is considered expensive and data-intensive, but must we increase dataset size to achieve better performance?

Sentence

Paper
Add Code

An Empirical Study on Finding Spans

no code implementations • 13 Oct 2022 • Weiwei Gu, Boyuan Zheng, Yunmo Chen, Tongfei Chen, Benjamin Van Durme

We present an empirical study on methods for span finding, the selection of consecutive tokens in text for some downstream tasks.

Paper
Add Code

Iterative Document-level Information Extraction via Imitation Learning

2 code implementations • 12 Oct 2022 • Yunmo Chen, William Gantt, Weiwei Gu, Tongfei Chen, Aaron Steven White, Benjamin Van Durme

We present a novel iterative extraction model, IterX, for extracting complex relations, or templates (i. e., N-tuples representing a mapping from named slots to spans of text) within a document.

4-ary Relation Extraction Imitation Learning

Paper
Code

Ambiguous Images With Human Judgments for Robust Visual Event Classification

no code implementations • 6 Oct 2022 • Kate Sanders, Reno Kriz, Anqi Liu, Benjamin Van Durme

However, humans are frequently presented with visual data that they cannot classify with 100% certainty, and models trained on standard vision benchmarks achieve low performance when evaluated on this data.

Paper
Add Code

NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning

no code implementations • 16 Sep 2022 • Nathaniel Weir, Peter Clark, Benjamin Van Durme

Our goal is a modern approach to answering questions via systematic reasoning where answers are supported by human interpretable proof trees grounded in an NL corpus of authoritative facts.

Hallucination Language Modelling +1

Paper
Add Code

Multilingual Coreference Resolution in Multiparty Dialogue

1 code implementation • 2 Aug 2022 • Boyuan Zheng, Patrick Xia, Mahsa Yarmohammadi, Benjamin Van Durme

Existing multiparty dialogue datasets for entity coreference resolution are nascent, and many challenges are still unaddressed.

coreference-resolution Data Augmentation

Paper
Code

Zero-shot Cross-lingual Transfer is Under-specified Optimization

1 code implementation • RepL4NLP (ACL) 2022 • Shijie Wu, Benjamin Van Durme, Mark Dredze

Pretrained multilingual encoders enable zero-shot cross-lingual transfer, but often produce unreliable models that exhibit high performance variance on the target language.

Zero-Shot Cross-Lingual Transfer

Paper
Code

BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing

1 code implementation • NeurIPS 2023 • Subhro Roy, Sam Thomson, Tongfei Chen, Richard Shin, Adam Pauls, Jason Eisner, Benjamin Van Durme

We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing, that includes context-free grammars for seven semantic parsing datasets and two syntactic parsing datasets with varied output representations, as well as a constrained decoding interface to generate only valid outputs covered by these grammars.

Decoder Language Modelling +3

Paper
Code

Pretrained Models for Multilingual Federated Learning

1 code implementation • NAACL 2022 • Orion Weller, Marc Marone, Vladimir Braverman, Dawn Lawrie, Benjamin Van Durme

Since the advent of Federated Learning (FL), research has applied these methods to natural language processing (NLP) tasks.

Federated Learning Language Modelling +3

Paper
Code

Asking the Right Questions in Low Resource Template Extraction

no code implementations • 25 May 2022 • Nils Holzenberger, Yunmo Chen, Benjamin Van Durme

Information Extraction (IE) researchers are mapping tasks to Question Answering (QA) in order to leverage existing large QA resources, and thereby improve data efficiency.

Question Answering

Paper
Add Code

The Curious Case of Control

1 code implementation • 24 May 2022 • Elias Stengel-Eskin, Benjamin Van Durme

Given the advanced fluency of large generative language models, we ask whether model outputs are consistent with these heuristics, and to what degree different models are consistent with each other.

Paper
Code

When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems

1 code implementation • 24 May 2022 • Elias Stengel-Eskin, Emmanouil Antonios Platanios, Adam Pauls, Sam Thomson, Hao Fang, Benjamin Van Durme, Jason Eisner, Yu Su

Rejecting class imbalance as the sole culprit, we reveal that the trend is closely associated with an effect we call source signal dilution, where strong lexical cues for the new symbol become diluted as the training dataset grows.

Intent Recognition Natural Language Understanding +1

Paper
Code

Addressing Resource and Privacy Constraints in Semantic Parsing Through Data Augmentation

no code implementations • Findings (ACL) 2022 • Kevin Yang, Olivia Deng, Charles Chen, Richard Shin, Subhro Roy, Benjamin Van Durme

We introduce a novel setup for low-resource task-oriented semantic parsing which incorporates several constraints that may arise in real-world scenarios: (1) lack of similar datasets/models from a related domain, (2) inability to sample useful logical forms directly from a grammar, and (3) privacy requirements for unlabeled natural utterances.

Data Augmentation Semantic Parsing

Paper
Add Code

Visual Commonsense in Pretrained Unimodal and Multimodal Models

1 code implementation • NAACL 2022 • Chenyu Zhang, Benjamin Van Durme, Zhuowan Li, Elias Stengel-Eskin

Our commonsense knowledge about objects includes their typical visual attributes; we know that bananas are typically yellow or green, and not purple.

Ranked #1 on Visual Commonsense Tests on ViComTe-color

Attribute Visual Commonsense Tests +1

Paper
Code

One-Shot Learning from a Demonstration with Hierarchical Latent Language

no code implementations • 9 Mar 2022 • Nathaniel Weir, Xingdi Yuan, Marc-Alexandre Côté, Matthew Hausknecht, Romain Laroche, Ida Momennejad, Harm van Seijen, Benjamin Van Durme

Humans have the capability, aided by the expressive compositionality of their language, to learn quickly by demonstration.

One-Shot Learning

Paper
Add Code

The NLP Task Effectiveness of Long-Range Transformers

no code implementations • 16 Feb 2022 • Guanghui Qin, Yukun Feng, Benjamin Van Durme

Transformer models cannot easily scale to long sequences due to their O(N^2) time and space complexity.

Paper
Add Code

Few-Shot Semantic Parsing with Language Models Trained On Code

no code implementations • NAACL 2022 • Richard Shin, Benjamin Van Durme

Intuitively, such models can more easily output canonical utterances as they are closer to the natural language used for pre-training.

Semantic Parsing

Paper
Add Code

Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images

1 code implementation • ICCV 2021 • Zhuowan Li, Elias Stengel-Eskin, Yixiao Zhang, Cihang Xie, Quan Tran, Benjamin Van Durme, Alan Yuille

Our experiments show CCO substantially boosts the performance of neural symbolic methods on real images.

Question Answering Visual Question Answering

Paper
Code

Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

2 code implementations • EMNLP 2021 • Mahsa Yarmohammadi, Shijie Wu, Marc Marone, Haoran Xu, Seth Ebner, Guanghui Qin, Yunmo Chen, Jialiang Guo, Craig Harman, Kenton Murray, Aaron Steven White, Mark Dredze, Benjamin Van Durme

Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English.

Dependency Parsing Event Extraction +4

Paper
Code

BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation

2 code implementations • EMNLP 2021 • Haoran Xu, Benjamin Van Durme, Kenton Murray

The success of bidirectional encoders using masked language models, such as BERT, on numerous natural language processing tasks has prompted researchers to attempt to incorporate these pre-trained models into neural machine translation (NMT) systems.

Ranked #2 on Machine Translation on IWSLT2014 German-English

Language Modelling Machine Translation +2

Paper
Code

Guided Generation of Cause and Effect

no code implementations • 21 Jul 2021 • Zhongyang Li, Xiao Ding, Ting Liu, J. Edward Hu, Benjamin Van Durme

We present a conditional text generation framework that posits sentential expressions of possible causes and effects.

Conditional Text Generation Knowledge Graphs

Paper
Add Code

Factoring Statutory Reasoning as Language Understanding Challenges

1 code implementation • ACL 2021 • Nils Holzenberger, Benjamin Van Durme

Statutory reasoning is the task of determining whether a legal statute, stated in natural language, applies to the text description of a case.

Natural Language Inference

Paper
Code

Human Schema Curation via Causal Association Rule Mining

1 code implementation • LREC (LAW) 2022 • Noah Weber, Anton Belyy, Nils Holzenberger, Rachel Rudinger, Benjamin Van Durme

Event schemas are structured knowledge sources defining typical real-world scenarios (e. g., going to an airport).

Paper
Code

Constrained Language Models Yield Few-Shot Semantic Parsers

1 code implementation • EMNLP 2021 • Richard Shin, Christopher H. Lin, Sam Thomson, Charles Chen, Subhro Roy, Emmanouil Antonios Platanios, Adam Pauls, Dan Klein, Jason Eisner, Benjamin Van Durme

We explore the use of large pretrained language models as few-shot semantic parsers.

Semantic Parsing

Paper
Code

Moving on from OntoNotes: Coreference Resolution Model Transfer

2 code implementations • EMNLP 2021 • Patrick Xia, Benjamin Van Durme

Academic neural models for coreference resolution (coref) are typically trained on a single dataset, OntoNotes, and model improvements are benchmarked on that same dataset.

coreference-resolution

Paper
Code

Adapting Coreference Resolution Models through Active Learning

1 code implementation • ACL 2022 • Michelle Yuan, Patrick Xia, Chandler May, Benjamin Van Durme, Jordan Boyd-Graber

Active learning mitigates this problem by sampling a small subset of data for annotators to label.

Active Learning Clustering +1

Paper
Code

Joint Universal Syntactic and Semantic Parsing

1 code implementation • 12 Apr 2021 • Elias Stengel-Eskin, Kenton Murray, Sheng Zhang, Aaron Steven White, Benjamin Van Durme

While numerous attempts have been made to jointly parse syntax and semantics, high performance in one domain typically comes at the price of performance in the other.

Semantic Parsing

Paper
Code

InFillmore: Frame-Guided Language Generation with Bidirectional Context

no code implementations • Joint Conference on Lexical and Computational Semantics 2021 • Jiefu Ou, Nathaniel Weir, Anton Belyy, Felix Yu, Benjamin Van Durme

We propose a structured extension to bidirectional-context conditional language generation, or "infilling," inspired by Frame Semantic theory (Fillmore, 1976).

Text Infilling

Paper
Add Code

Gradual Fine-Tuning for Low-Resource Domain Adaptation

2 code implementations • EACL (AdaptNLP) 2021 • Haoran Xu, Seth Ebner, Mahsa Yarmohammadi, Aaron Steven White, Benjamin Van Durme, Kenton Murray

Fine-tuning is known to improve NLP models by adapting an initial model trained on more plentiful but less domain-salient examples to data in a target domain.

Domain Adaptation

Paper
Code

LOME: Large Ontology Multilingual Extraction

no code implementations • EACL 2021 • Patrick Xia, Guanghui Qin, Siddharth Vashishtha, Yunmo Chen, Tongfei Chen, Chandler May, Craig Harman, Kyle Rawlins, Aaron Steven White, Benjamin Van Durme

We present LOME, a system for performing multilingual information extraction.

coreference-resolution Entity Typing +3

Paper
Add Code

Joint Modeling of Arguments for Event Understanding

1 code implementation • 20 Nov 2020 • Yunmo Chen, Tongfei Chen, Benjamin Van Durme

We recognize the task of event argument linking in documents as similar to that of intent slot resolution in dialogue, providing a Transformer-based model that extends from a recently proposed solution to resolve references to slots.

Sentence

Paper
Code

Temporal Reasoning in Natural Language Inference

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Siddharth Vashishtha, Adam Poliak, Yash Kumar Lal, Benjamin Van Durme, Aaron Steven White

We introduce five new natural language inference (NLI) datasets focused on temporal reasoning.

Natural Language Inference

Paper
Code

CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence Models

no code implementations • EMNLP (spnlp) 2020 • Abhinav Singh, Patrick Xia, Guanghui Qin, Mahsa Yarmohammadi, Benjamin Van Durme

Copy mechanisms are employed in sequence to sequence models (seq2seq) to generate reproductions of words from the input to the output.

named-entity-recognition Named Entity Recognition +1

Paper
Add Code

COD3S: Diverse Generation with Discrete Semantic Signatures

1 code implementation • EMNLP 2020 • Nathaniel Weir, João Sedoc, Benjamin Van Durme

We present COD3S, a novel method for generating semantically diverse sentences using neural sequence-to-sequence (seq2seq) models.

Semantic Textual Similarity Sentence

Paper
Code

Which *BERT? A Survey Organizing Contextualized Encoders

no code implementations • EMNLP 2020 • Patrick Xia, Shijie Wu, Benjamin Van Durme

Pretrained contextualized text encoders are now a staple of the NLP community.

Representation Learning

Paper
Add Code

Iterative Paraphrastic Augmentation with Discriminative Span Alignment

no code implementations • 1 Jul 2020 • Ryan Culkin, J. Edward Hu, Elias Stengel-Eskin, Guanghui Qin, Benjamin Van Durme

We introduce a novel paraphrastic augmentation strategy based on sentence-level lexically constrained paraphrasing and discriminative span alignment.

Sentence

Paper
Add Code

Script Induction as Association Rule Mining

no code implementations • WS 2020 • Anton Belyy, Benjamin Van Durme

We show that the count-based Script Induction models of Chambers and Jurafsky (2008) and Jans et al. (2012) can be unified in a general framework of narrative chain likelihood maximization.

Cloze Test

Paper
Add Code

A Dataset for Statutory Reasoning in Tax Law Entailment and Question Answering

1 code implementation • 11 May 2020 • Nils Holzenberger, Andrew Blair-Stanek, Benjamin Van Durme

Legislation can be viewed as a body of prescriptive rules expressed in natural language.

Natural Language Understanding Question Answering +1

Paper
Code

Incremental Neural Coreference Resolution in Constant Memory

1 code implementation • EMNLP 2020 • Patrick Xia, João Sedoc, Benjamin Van Durme

We investigate modeling coreference resolution under a fixed memory constraint by extending an incremental clustering algorithm to utilize contextualized encoders and neural components.

Clustering coreference-resolution +1

Paper
Code

Complementing Lexical Retrieval with Semantic Residual Embedding

no code implementations • 29 Apr 2020 • Luyu Gao, Zhuyun Dai, Tongfei Chen, Zhen Fan, Benjamin Van Durme, Jamie Callan

This paper presents CLEAR, a retrieval model that seeks to complement classical lexical exact-match models such as BM25 with semantic matching signals from a neural embedding matching model.

Information Retrieval Retrieval

Paper
Add Code

Probing Neural Language Models for Human Tacit Assumptions

no code implementations • 10 Apr 2020 • Nathaniel Weir, Adam Poliak, Benjamin Van Durme

Our prompts are based on human responses in a psychological study of conceptual associations.

Paper
Add Code

Hierarchical Entity Typing via Multi-level Learning to Rank

1 code implementation • ACL 2020 • Tongfei Chen, Yunmo Chen, Benjamin Van Durme

We propose a novel method for hierarchical entity classification that embraces ontological structure at both training and during prediction.

Decoder Entity Typing +1

Paper
Code

Causal Inference of Script Knowledge

no code implementations • EMNLP 2020 • Noah Weber, Rachel Rudinger, Benjamin Van Durme

When does a sequence of events define an everyday scenario and how can this knowledge be induced from text?

Causal Inference

Paper
Add Code

Reading the Manual: Event Extraction as Definition Comprehension

no code implementations • EMNLP (spnlp) 2020 • Yunmo Chen, Tongfei Chen, Seth Ebner, Aaron Steven White, Benjamin Van Durme

We ask whether text understanding has progressed to where we may extract event information through incremental refinement of bleached statements derived from annotation manuals.

Event Extraction

Paper
Add Code

Multi-Sentence Argument Linking

no code implementations • ACL 2020 • Seth Ebner, Patrick Xia, Ryan Culkin, Kyle Rawlins, Benjamin Van Durme

We present a novel document-level model for finding argument spans that fill an event's roles, connecting related ideas in sentence-level semantic role labeling and coreference resolution.

coreference-resolution Semantic Role Labeling +2

Paper
Add Code

Interactive Refinement of Cross-Lingual Word Embeddings

1 code implementation • EMNLP 2020 • Michelle Yuan, Mozhi Zhang, Benjamin Van Durme, Leah Findlater, Jordan Boyd-Graber

Cross-lingual word embeddings transfer knowledge between languages: models trained on high-resource languages can predict in low-resource languages.

Active Learning Cross-Lingual Word Embeddings +3

Paper
Code

Large-Scale, Diverse, Paraphrastic Bitexts via Sampling and Clustering

no code implementations • CONLL 2019 • J. Edward Hu, Abhinav Singh, Nils Holzenberger, Matt Post, Benjamin Van Durme

Producing diverse paraphrases of a sentence is a challenging task.

Clustering Sentence +1

Paper
Add Code

Bag-of-Words Transfer: Non-Contextual Techniques for Multi-Task Learning

no code implementations • WS 2019 • Seth Ebner, Felicity Wang, Benjamin Van Durme

Many architectures for multi-task learning (MTL) have been proposed to take advantage of transfer among tasks, often involving complex models and training procedures.

Multi-Task Learning Sentence +1

Paper
Add Code

Universal Decompositional Semantic Parsing

no code implementations • ACL 2020 • Elias Stengel-Eskin, Aaron Steven White, Sheng Zhang, Benjamin Van Durme

We introduce a transductive model for parsing into Universal Decompositional Semantics (UDS) representations, which jointly learns to map natural language utterances into UDS graph structures and annotate the graph with decompositional semantic attribute scores.

Attribute Semantic Parsing

Paper
Add Code

Exact and/or Fast Nearest Neighbors

1 code implementation • 6 Oct 2019 • Matthew Francis-Landau, Benjamin Van Durme

Prior methods for retrieval of nearest neighbors in high dimensions are fast and approximate--providing probabilistic guarantees of returning the correct answer--or slow and exact performing an exhaustive search.

Data Structures and Algorithms

Paper
Code

The Universal Decompositional Semantics Dataset and Decomp Toolkit

1 code implementation • LREC 2020 • Aaron Steven White, Elias Stengel-Eskin, Siddharth Vashishtha, Venkata Govindarajan, Dee Ann Reisinger, Tim Vieira, Keisuke Sakaguchi, Sheng Zhang, Francis Ferraro, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme

We present the Universal Decompositional Semantics (UDS) dataset (v1. 0), which is bundled with the Decomp toolkit (v0. 1).

Paper
Code

Uncertain Natural Language Inference

no code implementations • ACL 2020 • Tongfei Chen, Zhengping Jiang, Adam Poliak, Keisuke Sakaguchi, Benjamin Van Durme

We introduce Uncertain Natural Language Inference (UNLI), a refinement of Natural Language Inference (NLI) that shifts away from categorical labels, targeting instead the direct prediction of subjective probability assessments.

Learning-To-Rank Natural Language Inference +1

Paper
Add Code

Broad-Coverage Semantic Parsing as Transduction

no code implementations • IJCNLP 2019 • Sheng Zhang, Xutai Ma, Kevin Duh, Benjamin Van Durme

We unify different broad-coverage semantic parsing tasks under a transduction paradigm, and propose an attention-based neural framework that incrementally builds a meaning representation via a sequence of semantic relations.

Ranked #2 on UCCA Parsing on SemEval 2019 Task 1

AMR Parsing UCCA Parsing

Paper
Add Code

A Discriminative Neural Model for Cross-Lingual Word Alignment

no code implementations • IJCNLP 2019 • Elias Stengel-Eskin, Tzu-Ray Su, Matt Post, Benjamin Van Durme

We introduce a novel discriminative word alignment model, which we integrate into a Transformer-based machine translation model.

Machine Translation NER +2

Paper
Add Code

Don't Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference

1 code implementation • ACL 2019 • Yonatan Belinkov, Adam Poliak, Stuart M. Shieber, Benjamin Van Durme, Alexander M. Rush

In contrast to standard approaches to NLI, our methods predict the probability of a premise given a hypothesis and NLI label, discouraging models from ignoring the premise.

Natural Language Inference

Paper
Code

On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference

1 code implementation • SEMEVAL 2019 • Yonatan Belinkov, Adam Poliak, Stuart M. Shieber, Benjamin Van Durme, Alexander M. Rush

Popular Natural Language Inference (NLI) datasets have been shown to be tainted by hypothesis-only biases.

Natural Language Inference

Paper
Code

Learning to Rank for Plausible Plausibility

no code implementations • ACL 2019 • Zhongyang Li, Tongfei Chen, Benjamin Van Durme

Researchers illustrate improvements in contextual encoding strategies via resultant performance on a battery of shared Natural Language Understanding (NLU) tasks.

Learning-To-Rank Natural Language Understanding

Paper
Add Code

Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting

1 code implementation • NAACL 2019 • J. Edward Hu, Huda Khayrallah, Ryan Culkin, Patrick Xia, Tongfei Chen, Matt Post, Benjamin Van Durme

Lexically-constrained sequence decoding allows for explicit positive or negative phrase-based constraints to be placed on target output strings in generation tasks such as machine translation or monolingual text rewriting.

Data Augmentation Machine Translation +3

1,207

Paper
Code

AMR Parsing as Sequence-to-Graph Transduction

1 code implementation • ACL 2019 • Sheng Zhang, Xutai Ma, Kevin Duh, Benjamin Van Durme

Our experimental results outperform all previously reported SMATCH scores, on both AMR 2. 0 (76. 3% F1 on LDC2017T10) and AMR 1. 0 (70. 2% F1 on LDC2014T12).

Ranked #1 on AMR Parsing on LDC2014T12:

AMR Parsing

154

Paper
Code

What do you learn from context? Probing for sentence structure in contextualized word representations

2 code implementations • ICLR 2019 • Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R. Thomas McCoy, Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick

The jiant toolkit for general-purpose text understanding models

Language Modelling Sentence

Paper
Code

Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling

no code implementations • ICLR 2019 • Samuel R. Bowman, Ellie Pavlick, Edouard Grave, Benjamin Van Durme, Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen

Work on the problem of contextualized word representation—the development of reusable neural network components for sentence understanding—has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo (Peters et al., 2018).

Language Modelling Sentence

Paper
Add Code

Probing What Different NLP Tasks Teach Machines about Function Word Comprehension

no code implementations • SEMEVAL 2019 • Najoung Kim, Roma Patel, Adam Poliak, Alex Wang, Patrick Xia, R. Thomas McCoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Samuel R. Bowman, Ellie Pavlick

Our results show that pretraining on language modeling performs the best on average across our probing tasks, supporting its widespread use for pretraining state-of-the-art NLP models, and CCG supertagging and NLI pretraining perform comparably.

CCG Supertagging Language Modelling +3

Paper
Add Code

Fine-Grained Temporal Relation Extraction

no code implementations • ACL 2019 • Siddharth Vashishtha, Benjamin Van Durme, Aaron Steven White

We present a novel semantic framework for modeling temporal relations and event durations that maps pairs of events to real-valued scales.

Relation Temporal Relation Extraction +1

Paper
Add Code

Decomposing Generalization: Models of Generic, Habitual, and Episodic Statements

no code implementations • TACL 2019 • Venkata Subrahmanyan Govindarajan, Benjamin Van Durme, Aaron Steven White

We present a novel semantic framework for modeling linguistic expressions of generalization---generic, habitual, and episodic statements---as combinations of simple, real-valued referential properties of predicates and their arguments.

Word Embeddings

Paper
Add Code

ParaBank: Monolingual Bitext Generation and Sentential Paraphrasing via Lexically-constrained Neural Machine Translation

no code implementations • 11 Jan 2019 • J. Edward Hu, Rachel Rudinger, Matt Post, Benjamin Van Durme

We present ParaBank, a large-scale English paraphrase dataset that surpasses prior work in both quantity and quality.

Machine Translation NMT +5

Paper
Add Code

Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling

no code implementations • ACL 2019 • Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen, Benjamin Van Durme, Edouard Grave, Ellie Pavlick, Samuel R. Bowman

Natural language understanding has recently seen a surge of progress with the use of sentence encoders like ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2019) which are pretrained on variants of language modeling.

Language Modelling Natural Language Understanding +2

Paper
Add Code

Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages

1 code implementation • NeurIPS 2018 • Michelle Yuan, Benjamin Van Durme, Jordan L. Ying

Multilingual topic models can reveal patterns in cross-lingual document collections.

General Classification Topic Models

Paper
Code

ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension

no code implementations • 30 Oct 2018 • Sheng Zhang, Xiaodong Liu, Jingjing Liu, Jianfeng Gao, Kevin Duh, Benjamin Van Durme

We present a large-scale dataset, ReCoRD, for machine reading comprehension requiring commonsense reasoning.

Ranked #34 on Common Sense Reasoning on ReCoRD

Common Sense Reasoning Machine Reading Comprehension

Paper
Add Code

Cross-lingual Decompositional Semantic Parsing

no code implementations • EMNLP 2018 • Sheng Zhang, Xutai Ma, Rachel Rudinger, Kevin Duh, Benjamin Van Durme

We introduce the task of cross-lingual decompositional semantic parsing: mapping content provided in a source language into a decompositional semantic analysis based on a target language.

Semantic Parsing

Paper
Add Code

Predicting the Argumenthood of English Prepositional Phrases

no code implementations • 20 Sep 2018 • Najoung Kim, Kyle Rawlins, Benjamin Van Durme, Paul Smolensky

Distinguishing between arguments and adjuncts of a verb is a longstanding, nontrivial problem.

Semantic Role Labeling Sentence +1

Paper
Add Code

Lexicosyntactic Inference in Neural Models

no code implementations • EMNLP 2018 • Aaron Steven White, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme

We use this dataset, which we make publicly available, to probe the behavior of current state-of-the-art neural systems, showing that these systems make certain systematic errors that are clearly visible through the lens of factuality prediction.

Paper
Add Code

Efficient Online Scalar Annotation with Bounded Support

no code implementations • ACL 2018 • Keisuke Sakaguchi, Benjamin Van Durme

We describe a novel method for efficiently eliciting scalar annotations for dataset construction and system quality estimation by human judgments.

Paper
Add Code

Halo: Learning Semantics-Aware Representations for Cross-Lingual Information Extraction

no code implementations • SEMEVAL 2018 • Hongyuan Mei, Sheng Zhang, Kevin Duh, Benjamin Van Durme

Cross-lingual information extraction (CLIE) is an important and challenging task, especially in low resource scenarios.

TAG

Paper
Add Code

Hypothesis Only Baselines in Natural Language Inference

1 code implementation • SEMEVAL 2018 • Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, Benjamin Van Durme

We propose a hypothesis only baseline for diagnosing Natural Language Inference (NLI).

Natural Language Inference

Paper
Code

On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference

1 code implementation • NAACL 2018 • Adam Poliak, Yonatan Belinkov, James Glass, Benjamin Van Durme

We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena.

Machine Translation Natural Language Inference +4

Paper
Code

Gender Bias in Coreference Resolution

3 code implementations • NAACL 2018 • Rachel Rudinger, Jason Naradowsky, Brian Leonard, Benjamin Van Durme

We present an empirical study of gender bias in coreference resolution systems.

coreference-resolution

212

Paper
Code

Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation

no code implementations • EMNLP (ACL) 2018 • Adam Poliak, Aparajita Haldar, Rachel Rudinger, J. Edward Hu, Ellie Pavlick, Aaron Steven White, Benjamin Van Durme

We present a large-scale collection of diverse natural language inference (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning.

Natural Language Inference Sentence

Paper
Add Code

Cross-lingual Semantic Parsing

no code implementations • 21 Apr 2018 • Sheng Zhang, Kevin Duh, Benjamin Van Durme

We introduce the task of cross-lingual semantic parsing: mapping content provided in a source language into a meaning representation based on a target language.

Semantic Parsing

Paper
Add Code

Neural-Davidsonian Semantic Proto-role Labeling

1 code implementation • EMNLP 2018 • Rachel Rudinger, Adam Teichert, Ryan Culkin, Sheng Zhang, Benjamin Van Durme

We present a model for semantic proto-role labeling (SPRL) using an adapted bidirectional LSTM encoding strategy that we call "Neural-Davidsonian": predicate-argument structure is represented as pairs of hidden states corresponding to predicate and argument head tokens of the input sequence.

Attribute

Paper
Code

Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds

1 code implementation • SEMEVAL 2018 • Sheng Zhang, Kevin Duh, Benjamin Van Durme

Fine-grained entity typing is the task of assigning fine-grained semantic types to entity mentions.

Entity Typing General Classification +1

Paper
Code

Neural models of factuality

1 code implementation • NAACL 2018 • Rachel Rudinger, Aaron Steven White, Benjamin Van Durme

We present two neural models for event factuality prediction, which yield significant performance gains over previous models on three event factuality datasets: FactBank, UW, and MEANTIME.

Paper
Code

Selective Decoding for Cross-lingual Open Information Extraction

no code implementations • IJCNLP 2017 • Sheng Zhang, Kevin Duh, Benjamin Van Durme

Cross-lingual open information extraction is the task of distilling facts from the source language into representations in the target language.

Decoder Machine Translation +1

Paper
Add Code

Inference is Everything: Recasting Semantic Resources into a Unified Evaluation Framework

no code implementations • IJCNLP 2017 • Aaron Steven White, Pushpendre Rastogi, Kevin Duh, Benjamin Van Durme

We propose to unify a variety of existing semantic classification tasks, such as semantic role labeling, anaphora resolution, and paraphrase detection, under the heading of Recognizing Textual Entailment (RTE).

General Classification Image Captioning +4

Paper
Add Code

CADET: Computer Assisted Discovery Extraction and Translation

no code implementations • IJCNLP 2017 • Benjamin Van Durme, Tom Lippincott, Kevin Duh, Deana Burchfield, Adam Poliak, Cash Costello, Tim Finin, Scott Miller, James Mayfield, Philipp Koehn, Craig Harman, Dawn Lawrie, Ch May, ler, Max Thomas, Annabelle Carrell, Julianne Chaloux, Tongfei Chen, Alex Comerford, Mark Dredze, Benjamin Glass, Shudong Hao, Patrick Martin, Pushpendre Rastogi, Rashmi Sankepally, Travis Wolfe, Ying-Ying Tran, Ted Zhang

It combines a multitude of analytics together with a flexible environment for customizing the workflow for different users.

Active Learning Machine Translation +1

Paper
Add Code

Grammatical Error Correction with Neural Reinforcement Learning

no code implementations • IJCNLP 2017 • Keisuke Sakaguchi, Matt Post, Benjamin Van Durme

We propose a neural encoder-decoder model with reinforcement learning (NRL) for grammatical error correction (GEC).

Decoder Grammatical Error Correction +3

Paper
Add Code

Pocket Knowledge Base Population

no code implementations • ACL 2017 • Travis Wolfe, Mark Dredze, Benjamin Van Durme

Existing Knowledge Base Population methods extract relations from a closed relational schema with limited coverage leading to sparse KBs.

Knowledge Base Population Open Information Extraction +1

Paper
Add Code

Error-repair Dependency Parsing for Ungrammatical Texts

1 code implementation • ACL 2017 • Keisuke Sakaguchi, Matt Post, Benjamin Van Durme

We propose a new dependency parsing scheme which jointly parses a sentence and repairs grammatical errors by extending the non-directional transition-based formalism of Goldberg and Elhadad (2010) with three additional actions: SUBSTITUTE, DELETE, INSERT.

Dependency Parsing Sentence

Paper
Code

Bayesian Modeling of Lexical Resources for Low-Resource Settings

no code implementations • ACL 2017 • Nicholas Andrews, Mark Dredze, Benjamin Van Durme, Jason Eisner

Practically, this means that we may treat the lexical resources as observations under the proposed generative model.

Low Resource Named Entity Recognition named-entity-recognition +2

Paper
Add Code

Frame-Based Continuous Lexical Semantics through Exponential Family Tensor Factorization and Semantic Proto-Roles

1 code implementation • SEMEVAL 2017 • Francis Ferraro, Adam Poliak, Ryan Cotterell, Benjamin Van Durme

We study how different frame annotations complement one another when learning continuous lexical semantics.

Paper
Code

Streaming Word Embeddings with the Space-Saving Algorithm

2 code implementations • 24 Apr 2017 • Chandler May, Kevin Duh, Benjamin Van Durme, Ashwin Lall

We develop a streaming (one-pass, bounded-memory) word embedding algorithm based on the canonical skip-gram with negative sampling algorithm implemented in word2vec.

Word Embeddings

Paper
Code

The Semantic Proto-Role Linking Model

no code implementations • EACL 2017 • Aaron Steven White, Kyle Rawlins, Benjamin Van Durme

We propose the semantic proto-role linking model, which jointly induces both predicate-specific semantic roles and predicate-general semantic proto-roles based on semantic proto-role property likelihood judgments.

Semantic Role Labeling

Paper
Add Code

Efficient, Compositional, Order-sensitive n-gram Embeddings

1 code implementation • EACL 2017 • Adam Poliak, Pushpendre Rastogi, M. Patrick Martin, Benjamin Van Durme

We propose ECO: a new way to generate embeddings for phrases that is Efficient, Compositional, and Order-sensitive.

Word Embeddings

Paper
Code

MT/IE: Cross-lingual Open Information Extraction with Neural Sequence-to-Sequence Models

no code implementations • EACL 2017 • Sheng Zhang, Kevin Duh, Benjamin Van Durme

Conventional pipeline solutions decompose the task as machine translation followed by information extraction (or vice versa).

Machine Translation Open Information Extraction +3

Paper
Add Code

Social Bias in Elicited Natural Language Inferences

1 code implementation • WS 2017 • Rachel Rudinger, Ch May, ler, Benjamin Van Durme

We analyze the Stanford Natural Language Inference (SNLI) corpus in an investigation of bias and stereotyping in NLP data.

Language Modelling Natural Language Inference +1

Paper
Code

Explaining and Generalizing Skip-Gram through Exponential Family Principal Component Analysis

no code implementations • EACL 2017 • Ryan Cotterell, Adam Poliak, Benjamin Van Durme, Jason Eisner

The popular skip-gram model induces word embeddings by exploiting the signal from word-context coocurrence.

Word Embeddings

Paper
Add Code

Discriminative Information Retrieval for Question Answering Sentence Selection

1 code implementation • EACL 2017 • Tongfei Chen, Benjamin Van Durme

We propose a framework for discriminative IR atop linguistic features, trained to improve the recall of answer candidate passage retrieval, the initial step in text-based question answering.

Passage Retrieval Question Answering +2

Paper
Code

Feature Generation for Robust Semantic Role Labeling

no code implementations • 22 Feb 2017 • Travis Wolfe, Mark Dredze, Benjamin Van Durme

Hand-engineered feature sets are a well understood method for creating robust NLP models, but they require a lot of expertise and effort to create.

Semantic Role Labeling

Paper
Add Code

Skip-Prop: Representing Sentences with One Vector Per Proposition

no code implementations • WS 2017 • Rachel Rudinger, Kevin Duh, Benjamin Van Durme

Machine Translation Question Answering +2

Paper
Add Code

An Evaluation of PredPatt and Open IE via Stage 1 Semantic Role Labeling

1 code implementation • WS 2017 • Sheng Zhang, Rachel Rudinger, Benjamin Van Durme

Common Sense Reasoning Open Information Extraction +1

110

Paper
Code

Ordinal Common-sense Inference

no code implementations • TACL 2017 • Sheng Zhang, Rachel Rudinger, Kevin Duh, Benjamin Van Durme

Humans have the capacity to draw common-sense inferences from natural language: various things that are likely but not certain to hold based on established discourse, and are rarely stated explicitly.

Common Sense Reasoning Natural Language Inference

Paper
Add Code

Universal Decompositional Semantics on Universal Dependencies

1 code implementation • EMNLP 2016 • Aaron Steven White, Drew Reisinger, Keisuke Sakaguchi, Tim Vieira, Sheng Zhang, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme

110

Paper
Code

Fluency detection on communication networks

no code implementations • EMNLP 2016 • Tom Lippincott, Benjamin Van Durme

Language Identification Part-Of-Speech Tagging

Paper
Add Code

A Study of Imitation Learning Methods for Semantic Role Labeling

no code implementations • WS 2016 • Travis Wolfe, Mark Dredze, Benjamin Van Durme

Imitation Learning Semantic Role Labeling +1

Paper
Add Code

Computational linking theory

no code implementations • 8 Oct 2016 • Aaron Steven White, Drew Reisinger, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme

A linking theory explains how verbs' semantic arguments are mapped to their syntactic arguments---the inverse of the Semantic Role Labeling task from the shallow semantic parsing literature.

Semantic Parsing Semantic Role Labeling

Paper
Add Code

An Analysis of Lemmatization on Topic Models of Morphologically Rich Language

no code implementations • 13 Aug 2016 • Chandler May, Ryan Cotterell, Benjamin Van Durme

Topic models are typically represented by top-$m$ word lists for human interpretation.

Lemmatization Topic Models

Paper
Add Code

Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network

1 code implementation • 7 Aug 2016 • Keisuke Sakaguchi, Kevin Duh, Matt Post, Benjamin Van Durme

Inspired by the findings from the Cmabrigde Uinervtisy effect, we propose a word recognition model based on a semi-character level recurrent neural network (scRNN).

Spelling Correction

Paper
Code

A Critical Examination of RESCAL for Completion of Knowledge Bases with Transitive Relations

no code implementations • 16 May 2016 • Pushpendre Rastogi, Benjamin Van Durme

Link prediction in large knowledge graphs has received a lot of attention recently because of its importance for inferring missing relations and for completing and improving noisily extracted knowledge graphs.

Knowledge Graphs Link Prediction

Paper
Add Code

Script Induction as Language Modeling

no code implementations • EMNLP 2015 • Rachel Rudinger, Pushpendre Rastogi, Francis Ferraro, Benjamin Van Durme

Language Modelling

Paper
Add Code

Topic Identification and Discovery on Text and Speech

no code implementations • EMNLP 2015 • Ch May, ler, Francis Ferraro, Alan McCree, Jonathan Wintrode, Daniel Garcia-Romero, Benjamin Van Durme

Dimensionality Reduction Speech Recognition +1

Paper
Add Code

Sublinear Partition Estimation

2 code implementations • 7 Aug 2015 • Pushpendre Rastogi, Benjamin Van Durme

The output scores of a neural network classifier are converted to probabilities via normalizing over the scores of all competing categories.

Language Modelling Object Recognition

Paper
Code

PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification

no code implementations • IJCNLP 2015 • Ellie Pavlick, Pushpendre Rastogi, Juri Ganitkevitch, Benjamin Van Durme, Chris Callison-Burch

General Classification Natural Language Inference +3

Paper
Add Code

Adding Semantics to Data-Driven Paraphrasing

no code implementations • IJCNLP 2015 • Ellie Pavlick, Johan Bos, Malvina Nissim, Charley Beller, Benjamin Van Durme, Chris Callison-Burch

Natural Language Inference

Paper
Add Code

Domain-Specific Paraphrase Extraction

no code implementations • IJCNLP 2015 • Ellie Pavlick, Juri Ganitkevitch, Tsz Ping Chan, Xuchen Yao, Benjamin Van Durme, Chris Callison-Burch

Machine Translation

Paper
Add Code

FrameNet+: Fast Paraphrastic Tripling of FrameNet

no code implementations • IJCNLP 2015 • Ellie Pavlick, Travis Wolfe, Pushpendre Rastogi, Chris Callison-Burch, Mark Dredze, Benjamin Van Durme

Knowledge Base Population Natural Language Inference +1

Paper
Add Code

Learning to predict script events from domain-specific text

no code implementations • SEMEVAL 2015 • Rachel Rudinger, Vera Demberg, Ashutosh Modi, Benjamin Van Durme, Manfred Pinkal

Information Retrieval

Paper
Add Code

A Concrete Chinese NLP Pipeline

no code implementations • NAACL 2015 • Nanyun Peng, Francis Ferraro, Mo Yu, Nicholas Andrews, Jay DeYoung, Max Thomas, Matthew R. Gormley, Travis Wolfe, Craig Harman, Benjamin Van Durme, Mark Dredze

Coreference Resolution Entity Linking +6

Paper
Add Code

Interactive Knowledge Base Population

no code implementations • 31 May 2015 • Travis Wolfe, Mark Dredze, James Mayfield, Paul McNamee, Craig Harman, Tim Finin, Benjamin Van Durme

Most work on building knowledge bases has focused on collecting entities and facts from as large a collection of documents as possible.

Knowledge Base Population

Paper
Add Code

Multiview LSA: Representation Learning via Generalized CCA

no code implementations • HLT 2015 • Benjamin Van Durme, Pushpendre Rastogi, Raman Arora

Information Retrieval Representation Learning +1

Paper
Add Code

Social Media Predictive Analytics

no code implementations • HLT 2015 • Svitlana Volkova, Benjamin Van Durme, David Yarowsky, Yoram Bachrach

Active Learning Attribute +7

Paper
Add Code

Predicate Argument Alignment using a Global Coherence Model

1 code implementation • HLT 2015 • Mark Dredze, Benjamin Van Durme, Travis Wolfe

coreference-resolution Cross Document Coreference Resolution +3

Paper
Code

Statistical modality tagging from rule-based annotations and crowdsourcing

no code implementations • WS 2012 • Vinodkumar Prabhakaran, Michael Bloodgood, Mona Diab, Bonnie Dorr, Lori Levin, Christine D. Piatko, Owen Rambow, Benjamin Van Durme

We explore training an automatic modality tagger.

Paper
Add Code

Semantic Proto-Roles

no code implementations • TACL 2015 • Drew Reisinger, Rachel Rudinger, Francis Ferraro, Craig Harman, Kyle Rawlins, Benjamin Van Durme

We present the first large-scale, corpus based verification of Dowty{'}s seminal theory of proto-roles.

Semantic Role Labeling

Paper
Add Code

Efficient Elicitation of Annotations for Human Evaluation of Machine Translation

1 code implementation • WS 2014 • Keisuke Sakaguchi, Matt Post, Benjamin Van Durme

Machine Translation Translation

Paper
Code

Freebase QA: Information Extraction or Semantic Parsing?

no code implementations • WS 2014 • Xuchen Yao, Jonathan Berant, Benjamin Van Durme

Open Information Extraction Question Answering +1

Paper
Add Code

Predicting Fine-grained Social Roles with Selectional Preferences

no code implementations • WS 2014 • Charley Beller, Craig Harman, Benjamin Van Durme

Relation Extraction Semantic Role Labeling +1

Paper
Add Code

A Comparison of the Events and Relations Across ACE, ERE, TAC-KBP, and FrameNet Annotation Standards

no code implementations • WS 2014 • Jacqueline Aguilar, Charley Beller, Paul McNamee, Benjamin Van Durme, Stephanie Strassel, Zhiyi Song, Joe Ellis

Relation Extraction Semantic Parsing +1

Paper
Add Code

Exponential Reservoir Sampling for Streaming Language Models

no code implementations • ACL 2014 • Miles Osborne, Ashwin Lall, Benjamin Van Durme

Language Modelling Machine Translation

Paper
Add Code

Biases in Predicting the Human Language Model

no code implementations • ACL 2014 • Alex B. Fine, Austin F. Frank, T. Florian Jaeger, Benjamin Van Durme

Language Modelling

Paper
Add Code

I'm a Belieber: Social Roles via Self-identification and Conceptual Attributes

no code implementations • ACL 2014 • Charley Beller, Rebecca Knowles, Craig Harman, Shane Bergsma, Margaret Mitchell, Benjamin Van Durme

Paper
Add Code

Inferring User Political Preferences from Streaming Communications

no code implementations • ACL 2014 • Svitlana Volkova, Glen Coppersmith, Benjamin Van Durme

Paper
Add Code

Low-Resource Semantic Role Labeling

no code implementations • ACL 2014 • Matthew R. Gormley, Margaret Mitchell, Benjamin Van Durme, Mark Dredze

Information Retrieval Machine Translation +2

Paper
Add Code

Is the Stanford Dependency Representation Semantic?

no code implementations • WS 2014 • Rachel Rudinger, Benjamin Van Durme

Natural Language Inference

Paper
Add Code

Augmenting FrameNet Via PPDB

no code implementations • WS 2014 • Pushpendre Rastogi, Benjamin Van Durme

Lexical Analysis Semantic Parsing

Paper
Add Code

Particle Filter Rejuvenation and Latent Dirichlet Allocation

no code implementations • ACL 2014 • Ch May, ler, Alex Clemmer, Benjamin Van Durme

Paper
Add Code

Information Extraction over Structured Data: Question Answering with Freebase

no code implementations • ACL 2014 • Xuchen Yao, Benjamin Van Durme

Information Retrieval Question Answering +1

Paper
Add Code

A Wikipedia-based Corpus for Contextualized Machine Translation

no code implementations • LREC 2014 • Jennifer Drexler, Pushpendre Rastogi, Jacqueline Aguilar, Benjamin Van Durme, Matt Post

We describe a corpus for target-contextualized machine translation (MT), where the task is to improve the translation of source documents using language models built over presumably related documents in the target language.

Domain Adaptation Language Modelling +2