Search Results for author: Chris Callison-Burch

Found 96 papers, 21 papers with code

Goal-Oriented Script Construction

no code implementations 28 Jul 2021 Qing Lyu, Li Zhang, Chris Callison-Burch

The knowledge of scripts, common chains of events in stereotypical scenarios, is a valuable asset for task-oriented natural language understanding systems.

Language Modelling Natural Language Understanding

Deduplicating Training Data Makes Language Models Better

1 code implementation 14 Jul 2021 Katherine Lee, Daphne Ippolito, Andrew Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch, Nicholas Carlini

As a result, over 1% of the unprompted output of language models trained on these datasets is copied verbatim from the training data.

Language Modelling

Cultural and Geographical Influences on Image Translatability of Words across Languages

1 code implementation NAACL 2021 Nikzad Khani, Isidora Tourni, Mohammad Sadegh Rasooli, Chris Callison-Burch, Derry Tanti Wijaya

We find that images of words are not always invariant across languages, and that language pairs with shared culture, meaning having either a common language family, ethnicity or religion, have improved image translatability (i.e., have more similar images for similar words) compared to its converse, regardless of their geographic proximity.

Machine Translation

GooAQ: Open Question Answering with Diverse Answer Types

1 code implementation 18 Apr 2021 Daniel Khashabi, Amos Ng, Tushar Khot, Ashish Sabharwal, Hannaneh Hajishirzi, Chris Callison-Burch

GooAQ answers are mined from Google's responses to our collected questions, specifically from the answer boxes in the search results.

Question Answering

"Wikily" Neural Machine Translation Tailored to Cross-Lingual Tasks

1 code implementation 16 Apr 2021 Mohammad Sadegh Rasooli, Chris Callison-Burch, Derry Tanti Wijaya

In image captioning, we train a multi-tasking machine translation and image captioning pipeline for Arabic and English from which the Arabic training data is a wikily translation of the English captioning data.

Cross-Lingual Transfer Dependency Parsing +3

Visual Goal-Step Inference using wikiHow

no code implementations 12 Apr 2021 Yue Yang, Artemis Panagopoulou, Qing Lyu, Li Zhang, Mark Yatskar, Chris Callison-Burch

We introduce a novel dataset harvested from wikiHow that consists of 772,294 images representing human actions.

Simple-QE: Better Automatic Quality Estimation for Text Simplification

no code implementations 22 Dec 2020 Reno Kriz, Marianna Apidianaki, Chris Callison-Burch

Text simplification systems generate versions of texts that are easier to understand for a broader audience.

Text Simplification

Automatic Standardization of Colloquial Persian

1 code implementation 10 Dec 2020 Mohammad Sadegh Rasooli, Farzane Bakhtyari, Fatemeh Shafiei, Mahsa Ravanbakhsh, Chris Callison-Burch

We also show that our model improves English-to-Persian machine translation in scenarios for which the training data is from colloquial Persian with 1.4 absolute BLEU score difference in the development data, and 0.8 in the test data.

Machine Translation

RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text

2 code implementations EMNLP 2020 Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Chris Callison-Burch

In recent years, large neural networks for natural language generation (NLG) have made leaps and bounds in their ability to generate fluent text.

Human Detection Text Generation

Reasoning about Goals, Steps, and Temporal Ordering with WikiHow

1 code implementation EMNLP 2020 Li Zhang, Qing Lyu, Chris Callison-Burch

We propose a suite of reasoning tasks on two types of relations between procedural events: goal-step relations ("learn poses" is a step in the larger goal of "doing yoga") and step-step temporal relations ("buy a yoga mat" typically precedes "learn poses").

Toward Better Storylines with Sentence-Level Language Models

1 code implementation ACL 2020 Daphne Ippolito, David Grangier, Douglas Eck, Chris Callison-Burch

We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.

Language Modelling Sentence Embeddings +1

Bilingual is At Least Monolingual (BALM): A Novel Translation Algorithm that Encodes Monolingual Priors

1 code implementation 30 Aug 2019 Jeffrey Cheng, Chris Callison-Burch

State-of-the-art machine translation (MT) models do not use knowledge of any single language's structure; this is the equivalent of asking someone to translate from English to German while knowing neither language.

Machine Translation

Winter is here: Summarizing Twitter Streams related to Pre-Scheduled Events

no code implementations WS 2019 Anietie Andy, Derry Tanti Wijaya, Chris Callison-Burch

Pre-scheduled events, such as TV shows and sports games, usually garner considerable attention from the public.

Comparison of Diverse Decoding Methods from Conditional Language Models

1 code implementation ACL 2019 Daphne Ippolito, Reno Kriz, Maria Kustikova, João Sedoc, Chris Callison-Burch

While conditional language models have greatly improved in their ability to output high-quality natural language, many NLP applications benefit from being able to generate a diverse set of candidate sequences.

PerspectroScope: A Window to the World of Diverse Perspectives

1 code implementation ACL 2019 Sihao Chen, Daniel Khashabi, Chris Callison-Burch, Dan Roth

This work presents PerspectroScope, a web-based system which lets users query a discussion-worthy natural language claim, and extract and visualize various perspectives in support of or against the claim, along with evidence supporting each perspective.

Natural Language Inference Natural Language Understanding +1

ChatEval: A Tool for Chatbot Evaluation

no code implementations NAACL 2019 João Sedoc, Daphne Ippolito, Arun Kirubarajan, Jai Thirani, Lyle Ungar, Chris Callison-Burch

We introduce a unified framework for human evaluation of chatbots that augments existing tools and provides a web-based hub for researchers to share and compare their dialog systems.

Chatbot Open-Domain Dialog

A Comparison of Context-sensitive Models for Lexical Substitution

no code implementations WS 2019 Aina Garí Soler, Anne Cocos, Marianna Apidianaki, Chris Callison-Burch

Word embedding representations provide good estimates of word meaning and give state-of-the-art performance in semantic tasks.

Word Embeddings

Paraphrase-Sense-Tagged Sentences

no code implementations TACL 2019 Anne Cocos, Chris Callison-Burch

Many natural language processing tasks require discriminating the particular meaning of a word in context, but building corpora for developing sense-aware models can be a challenge.

Magnitude: A Fast, Efficient Universal Vector Embedding Utility Package

1 code implementation EMNLP 2018 Ajay Patel, Alexander Sands, Chris Callison-Burch, Marianna Apidianaki

Vector space embedding models like word2vec, GloVe, fastText, and ELMo are extremely popular representations in natural language processing (NLP) applications.

Word Embeddings

Automated Paraphrase Lattice Creation for HyTER Machine Translation Evaluation

no code implementations NAACL 2018 Marianna Apidianaki, Guillaume Wisniewski, Anne Cocos, Chris Callison-Burch

We propose a variant of a well-known machine translation (MT) evaluation metric, HyTER (Dreyer and Marcu, 2012), which exploits reference translations enriched with meaning equivalent expressions.

Machine Translation

Simplification Using Paraphrases and Context-Based Lexical Substitution

no code implementations NAACL 2018 Reno Kriz, Eleni Miltsakaki, Marianna Apidianaki, Chris Callison-Burch

Lexical simplification involves identifying complex words or phrases that need to be simplified, and recommending simpler meaning-preserving substitutes that can be more easily understood.

Complex Word Identification Lexical Simplification +1

Comparing Constraints for Taxonomic Organization

no code implementations NAACL 2018 Anne Cocos, Marianna Apidianaki, Chris Callison-Burch

In this paper, we present a head-to-head comparison of six taxonomic organization algorithms that vary with respect to their structural and transitivity constraints, and treatment of synonymy.

Entity Extraction using GAN

Constructing an Alias List for Named Entities during an Event

no code implementations WS 2017 Anietie Andy, Mark Dredze, Mugizi Rwebangira, Chris Callison-Burch

EntitySpike uses a temporal heuristic to identify named entities with similar context that occur in the same time period (within minutes) during an event.

Community Question Answering

Mapping the Paraphrase Database to WordNet

no code implementations SEMEVAL 2017 Anne Cocos, Marianna Apidianaki, Chris Callison-Burch

WordNet has facilitated important research in natural language processing but its usefulness is somewhat limited by its relatively small lexical coverage.

Word Sense Filtering Improves Embedding-Based Lexical Substitution

no code implementations WS 2017 Anne Cocos, Marianna Apidianaki, Chris Callison-Burch

The role of word sense disambiguation in lexical substitution has been questioned due to the high performance of vector space models which propose good substitutes without explicitly accounting for sense.

Entity Extraction using GAN Part-Of-Speech Tagging +4

Optimizing Statistical Machine Translation for Text Simplification

1 code implementation TACL 2016 Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch

Most recent sentence simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus.

Machine Translation Text Simplification

Use of Modality and Negation in Semantically-Informed Syntactic MT

no code implementations 5 Feb 2015 Kathryn Baker, Michael Bloodgood, Bonnie J. Dorr, Chris Callison-Burch, Nathaniel W. Filardo, Christine Piatko, Lori Levin, Scott Miller

We apply our MN annotation scheme to statistical machine translation using a syntactic framework that supports the inclusion of semantic annotations.

Machine Translation

Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation

no code implementations 21 Oct 2014 Michael Bloodgood, Chris Callison-Burch

We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources.

Active Learning Machine Translation

Semantically-Informed Syntactic Machine Translation: A Tree-Grafting Approach

no code implementations 24 Sep 2014 Kathryn Baker, Michael Bloodgood, Chris Callison-Burch, Bonnie J. Dorr, Nathaniel W. Filardo, Lori Levin, Scott Miller, Christine Piatko

We describe a unified and coherent syntactic framework for supporting a semantically-informed syntactic approach to statistical machine translation.

Machine Translation

The Multilingual Paraphrase Database

no code implementations LREC 2014 Juri Ganitkevitch, Chris Callison-Burch

We release a massive expansion of the paraphrase database (PPDB) that now includes a collection of paraphrases in 23 different languages.

Document Summarization Information Retrieval +6

A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic

no code implementations LREC 2014 Ryan Cotterell, Chris Callison-Burch

To the best of the authors’ knowledge, this work is the most diverse corpus of dialectal Arabic in both the source of the content and the number of dialects.

Dialect Identification

The American Local News Corpus

no code implementations LREC 2014 Ann Irvine, Joshua Langfus, Chris Callison-Burch

We present the American Local News Corpus (ALNC), containing over 4 billion words of text from 2,652 online newspapers in the United States.

Extracting Lexically Divergent Paraphrases from Twitter

1 code implementation TACL 2014 Wei Xu, Alan Ritter, Chris Callison-Burch, William B. Dolan, Yangfeng Ji

We present MultiP (Multi-instance Learning Paraphrase Model), a new model suited to identify paraphrases within the short messages on Twitter.
