Search Results for author: Chris Callison-Burch

Found 154 papers, 56 papers with code

The NIEUW Project: Developing Language Resources through Novel Incentives

no code implementations NIDCP (LREC) 2022 James Fiumara, Christopher Cieri, Mark Liberman, Chris Callison-Burch, Jonathan Wright, Robert Parker

NIEUW leverages the power of novel incentives to elicit linguistic data and annotations from a wide variety of contributors including citizen scientists, game players, and language students and professionals.

Is “My Favorite New Movie” My Favorite Movie? Probing the Understanding of Recursive Noun Phrases

no code implementations NAACL 2022 Qing Lyu, Zheng Hua, Daoxin Li, Li Zhang, Marianna Apidianaki, Chris Callison-Burch

We introduce the Recursive Noun Phrase Challenge (RNPC), a dataset of three textual inference tasks involving textual entailment and event plausibility comparison, precisely targeting the understanding of recursive NPs.

Common Sense Reasoning Natural Language Inference

Resolving Pronouns in Twitter Streams: Context can Help!

no code implementations COLING (CRAC) 2020 Anietie Andy, Chris Callison-Burch, Derry Tanti Wijaya

Many people live-tweet televised events like Presidential debates and popular TV-shows and discuss people or characters in the event.

The Case for a Single Model that can Both Generate Continuations and Fill-in-the-Blank

no code implementations Findings (NAACL) 2022 Daphne Ippolito, Liam Dugan, Emily Reif, Ann Yuan, Andy Coenen, Chris Callison-Burch

While previous work has tackled this problem with models trained specifically to do fill in the blank, a more useful model is one that can effectively perform _both_ FitB and continuation tasks.

Position Text Generation

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

no code implementations 20 Mar 2024 Yiming Huang, Weilin Wan, Yue Yang, Chris Callison-Burch, Mark Yatskar, Lingjie Liu

Text-to-motion models excel at efficient human motion generation, but existing approaches lack fine-grained controllability over the generation process.

FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models

1 code implementation 21 Feb 2024 Andrew Zhu, Alyssa Hwang, Liam Dugan, Chris Callison-Burch

One type of question that is commonly found in day-to-day scenarios is "fan-out" questions, complex multi-hop, multi-document reasoning questions that require finding information about a large number of entities.

Question Answering

Calibrating Large Language Models with Sample Consistency

no code implementations 21 Feb 2024 Qing Lyu, Kumar Shridhar, Chaitanya Malaviya, Li Zhang, Yanai Elazar, Niket Tandon, Marianna Apidianaki, Mrinmaya Sachan, Chris Callison-Burch

Accurately gauging the confidence level of Large Language Models' (LLMs) predictions is pivotal for their reliable application.

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

1 code implementation 16 Feb 2024 Ajay Patel, Colin Raffel, Chris Callison-Burch

The rapid rise to prominence of these models and these unique challenges has had immediate adverse impacts on open science and on the reproducibility of work that uses them.

Synthetic Data Generation

Grounded Intuition of GPT-Vision's Abilities with Scientific Images

1 code implementation 3 Nov 2023 Alyssa Hwang, Andrew Head, Chris Callison-Burch

GPT-Vision has impressed us on a range of vision-language tasks, but it comes with the familiar new challenge: we have little idea of its capabilities and limitations.

Benchmarking counterfactual +1

Interpretable-by-Design Text Understanding with Iteratively Generated Concept Bottleneck

1 code implementation 30 Oct 2023 Josh Magnus Ludan, Qing Lyu, Yue Yang, Liam Dugan, Mark Yatskar, Chris Callison-Burch

Black-box deep neural networks excel in text classification, yet their application in high-stakes domains is hindered by their lack of interpretability.

Language Modelling Large Language Model +2

CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization

no code implementations 16 Oct 2023 Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Peter Jansen, Oyvind Tafjord, Niket Tandon, Li Zhang, Chris Callison-Burch, Peter Clark

Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning.

Choice-75: A Dataset on Decision Branching in Script Learning

no code implementations 21 Sep 2023 Zhaoyi Joey Hou, Li Zhang, Chris Callison-Burch

Script learning studies how stereotypical events unfold, enabling machines to reason about narratives with implicit information.


Kani: A Lightweight and Highly Hackable Framework for Building Language Model Applications

1 code implementation 11 Sep 2023 Andrew Zhu, Liam Dugan, Alyssa Hwang, Chris Callison-Burch

Language model applications are becoming increasingly popular and complex, often including features like tool usage and retrieval augmentation.

Language Modelling Management +1

ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer

1 code implementation 29 Aug 2023 Zachary Horvitz, Ajay Patel, Chris Callison-Burch, Zhou Yu, Kathleen McKeown

Our parameter-efficient approach, ParaGuide, leverages paraphrase-conditioned diffusion models alongside gradient-based guidance from both off-the-shelf classifiers and strong existing style embedders to transform the style of text while preserving semantic information.

Style Transfer

CALYPSO: LLMs as Dungeon Masters' Assistants

no code implementations 15 Aug 2023 Andrew Zhu, Lara J. Martin, Andrew Head, Chris Callison-Burch

The role of a Dungeon Master, or DM, in the game Dungeons & Dragons is to perform multiple tasks simultaneously.

Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models

1 code implementation 1 Jun 2023 Liam Dugan, Anshul Wadhawan, Kyle Spence, Chris Callison-Burch, Morgan McGuire, Victor Zordan

Recent work in speech-to-speech translation (S2ST) has focused primarily on offline settings, where the full input utterance is available before any output is given.

Speech-to-Speech Translation Translation

Representation Of Lexical Stylistic Features In Language Models' Embedding Space

no code implementations 29 May 2023 Qing Lyu, Marianna Apidianaki, Chris Callison-Burch

The representation space of pretrained Language Models (LMs) encodes rich information about words and their relationships (e.g., similarity, hypernymy, polysemy) as well as abstract semantic notions (e.g., intensity).

This Land is {Your, My} Land: Evaluating Geopolitical Biases in Language Models

1 code implementation 24 May 2023 Bryan Li, Samar Haider, Chris Callison-Burch

We then evaluate various multilingual LLMs on our dataset and metrics to probe their internal knowledge and use the proposed metrics to discover numerous inconsistencies in how these models respond in different languages.

Language Modelling Large Language Model +1

OpenPI2.0: An Improved Dataset for Entity Tracking in Texts

1 code implementation 24 May 2023 Li Zhang, Hainiu Xu, Abhinav Kommula, Chris Callison-Burch, Niket Tandon

An earlier dataset, OpenPI, provided crowdsourced annotations of entity state changes in text.

Question Answering

Explanation-based Finetuning Makes Models More Robust to Spurious Cues

1 code implementation 8 May 2023 Josh Magnus Ludan, Yixuan Meng, Tai Nguyen, Saurabh Shah, Qing Lyu, Marianna Apidianaki, Chris Callison-Burch

Large Language Models (LLMs) are so powerful that they sometimes learn correlations between labels and features that are irrelevant to the task, leading to poor generalization on out-of-distribution data.

FIREBALL: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information

1 code implementation 2 May 2023 Andrew Zhu, Karmanya Aggarwal, Alexander Feng, Lara J. Martin, Chris Callison-Burch

Dungeons & Dragons (D&D) is a tabletop roleplaying game with complex natural language interactions between players and hidden state information.

Text Generation

Exploring the Curious Case of Code Prompts

1 code implementation 26 Apr 2023 Li Zhang, Liam Dugan, Hainiu Xu, Chris Callison-Burch

Furthermore, we show that the style of code prompt has a large effect on performance for some but not all tasks and that fine-tuning on text instructions leads to better relative performance of code prompts.

PAXQA: Generating Cross-lingual Question Answering Examples at Training Scale

1 code implementation 24 Apr 2023 Bryan Li, Chris Callison-Burch

This work proposes a synthetic data generation method for cross-lingual QA which leverages indirect supervision from existing parallel corpora.

Cross-Lingual Question Answering Machine Translation +3

Faithful Chain-of-Thought Reasoning

1 code implementation 31 Jan 2023 Qing Lyu, Shreya Havaldar, Adam Stein, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch

While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (aka.

Math Multi-hop Question Answering +1

Causal Reasoning of Entities and Events in Procedural Texts

1 code implementation 26 Jan 2023 Li Zhang, Hainiu Xu, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, Chris Callison-Burch

By injecting the causal relations between entities and events as intermediate reasoning steps in our representation, we further boost the performance to .67 F1.

Language Models are Drummers: Drum Composition with Natural Language Pre-Training

1 code implementation 3 Jan 2023 Li Zhang, Chris Callison-Burch

Automatic music generation with artificial intelligence typically requires a large amount of data which is hard to obtain for many less common genres and musical instruments.

Music Generation Transfer Learning

Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text

1 code implementation 24 Dec 2022 Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-Burch

As text generated by large language models proliferates, it becomes vital to understand how humans engage with such text, and whether or not they are able to detect when the text they are reading did not originate with a human writer.

Human Detection Sentence

CoRRPUS: Code-based Structured Prompting for Neurosymbolic Story Understanding

1 code implementation 21 Dec 2022 Yijiang River Dong, Lara J. Martin, Chris Callison-Burch

In this work, we capitalize on state-of-the-art Code-LLMs, such as Codex, to bootstrap the use of symbolic methods for tracking the state of stories and aiding in story understanding.

Story Generation Task 2

Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?

no code implementations 18 Dec 2022 Ajay Patel, Nicholas Andrews, Chris Callison-Burch

Existing unsupervised approaches like STRAP have largely focused on style transfer to target authors with many examples of their writing style in books, speeches, or other published works.

In-Context Learning Style Transfer

Visualizing the Obvious: A Concreteness-based Ensemble Model for Noun Property Prediction

1 code implementation 24 Oct 2022 Yue Yang, Artemis Panagopoulou, Marianna Apidianaki, Mark Yatskar, Chris Callison-Burch

We propose to extract these properties from images and use them in an ensemble model, in order to complement the information that is extracted from language models.

Property Prediction

Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence

no code implementations 13 Oct 2022 Chris Callison-Burch, Gaurav Singh Tomar, Lara J. Martin, Daphne Ippolito, Suma Bailis, David Reitter

In this paper, we frame D&D specifically as a dialogue system challenge, where the tasks are to both generate the next conversational turn in the game and predict the state of the game given the dialogue history.

Language Modelling Large Language Model

Bidirectional Language Models Are Also Few-shot Learners

no code implementations 29 Sep 2022 Ajay Patel, Bryan Li, Mohammad Sadegh Rasooli, Noah Constant, Colin Raffel, Chris Callison-Burch

An arbitrary task can be reformulated as a natural language prompt, and a language model can be asked to generate the completion, indirectly performing the task in a paradigm known as prompt-based learning.

Denoising Language Modelling +4

Towards Faithful Model Explanation in NLP: A Survey

no code implementations 22 Sep 2022 Qing Lyu, Marianna Apidianaki, Chris Callison-Burch

In this survey, we review over 110 model explanation methods in NLP through the lens of faithfulness.


The Case for a Single Model that can Both Generate Continuations and Fill in the Blank

no code implementations 9 Jun 2022 Daphne Ippolito, Liam Dugan, Emily Reif, Ann Yuan, Andy Coenen, Chris Callison-Burch

The task of inserting text into a specified position in a passage, known as fill in the blank (FitB), is useful for a variety of applications where writers interact with a natural language generation (NLG) system to craft text.

Position Text Generation

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

3 code implementations 9 Jun 2022 Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

Empathic Conversations: A Multi-level Dataset of Contextualized Conversations

no code implementations 25 May 2022 Damilola Omitaomu, Shabnam Tafreshi, Tingting Liu, Sven Buechel, Chris Callison-Burch, Johannes Eichstaedt, Lyle Ungar, João Sedoc

Hence, we collected detailed characterization of the participants' traits, their self-reported empathetic response to news articles, their conversational partner other-report, and turn-by-turn third-party assessments of the level of self-disclosure, emotion, and empathy expressed.

Creating Multimedia Summaries Using Tweets and Videos

no code implementations 16 Mar 2022 Anietie Andy, Siyi Liu, Daphne Ippolito, Reno Kriz, Chris Callison-Burch, Derry Wijaya

While popular televised events such as presidential debates or TV shows are airing, people provide commentary on them in real-time.

Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data

1 code implementation ACL 2022 Shuyan Zhou, Li Zhang, Yue Yang, Qing Lyu, Pengcheng Yin, Chris Callison-Burch, Graham Neubig

To this end, we develop a simple and efficient method that links steps (e.g., "purchase a camera") in an article to other articles with similar goals (e.g., "how to choose a camera"), recursively constructing the KB.

Retrieval Video Retrieval

CIS²: A Simplified Commonsense Inference Evaluation for Story Prose

2 code implementations 16 Feb 2022 Bryan Li, Lara J. Martin, Chris Callison-Burch

Transformers have been showing near-human performance on a variety of tasks, but they are not without their limitations.

Sentence Text Generation

Is "My Favorite New Movie" My Favorite Movie? Probing the Understanding of Recursive Noun Phrases

1 code implementation 15 Dec 2021 Qing Lyu, Hua Zheng, Daoxin Li, Li Zhang, Marianna Apidianaki, Chris Callison-Burch

We introduce the Recursive Noun Phrase Challenge (RNPC), a dataset of three textual inference tasks involving textual entailment and event plausibility comparison, precisely targeting the understanding of recursive NPs.

Common Sense Reasoning Natural Language Inference

Induce, Edit, Retrieve: Language Grounded Multimodal Schema for Instructional Video Retrieval

no code implementations 17 Nov 2021 Yue Yang, Joongwon Kim, Artemis Panagopoulou, Mark Yatskar, Chris Callison-Burch

Schemata are structured representations of complex tasks that can aid artificial intelligence by allowing models to break down complex tasks into intermediate steps.

Retrieval Video Retrieval

SynthBio: A Case Study in Human-AI Collaborative Curation of Text Datasets

no code implementations 11 Nov 2021 Ann Yuan, Daphne Ippolito, Vitaly Nikolaev, Chris Callison-Burch, Andy Coenen, Sebastian Gehrmann

We use our method to curate SynthBio - a new evaluation set for WikiBio - composed of structured attribute lists describing fictional individuals, mapped to natural language biographies.

Attribute Language Modelling +2

BiSECT: Learning to Split and Rephrase Sentences with Bitexts

1 code implementation EMNLP 2021 Joongwon Kim, Mounica Maddela, Reno Kriz, Wei Xu, Chris Callison-Burch

We categorize examples in our corpus, and use these categories in a novel model that allows us to target specific regions of the input sentence to be split and edited.

Machine Translation Sentence +2

Goal-Oriented Script Construction

1 code implementation INLG (ACL) 2021 Qing Lyu, Li Zhang, Chris Callison-Burch

The knowledge of scripts, common chains of events in stereotypical scenarios, is a valuable asset for task-oriented natural language understanding systems.

Language Modelling Natural Language Understanding +1

Deduplicating Training Data Makes Language Models Better

1 code implementation ACL 2022 Katherine Lee, Daphne Ippolito, Andrew Nystrom, Chiyuan Zhang, Douglas Eck, Chris Callison-Burch, Nicholas Carlini

As a result, over 1% of the unprompted output of language models trained on these datasets is copied verbatim from the training data.

Language Modelling Sentence

Cultural and Geographical Influences on Image Translatability of Words across Languages

1 code implementation NAACL 2021 Nikzad Khani, Isidora Tourni, Mohammad Sadegh Rasooli, Chris Callison-Burch, Derry Tanti Wijaya

We find that images of words are not always invariant across languages, and that language pairs with shared culture, meaning having either a common language family, ethnicity or religion, have improved image translatability (i.e., have more similar images for similar words) compared to its converse, regardless of their geographic proximity.

Cultural Vocal Bursts Intensity Prediction Multilingual NLP +3

Visual Goal-Step Inference using wikiHow

1 code implementation EMNLP 2021 Yue Yang, Artemis Panagopoulou, Qing Lyu, Li Zhang, Mark Yatskar, Chris Callison-Burch

Understanding what sequence of steps are needed to complete a goal can help artificial intelligence systems reason about human activities.

Multimodal Reasoning VGSI

Simple-QE: Better Automatic Quality Estimation for Text Simplification

no code implementations 22 Dec 2020 Reno Kriz, Marianna Apidianaki, Chris Callison-Burch

Text simplification systems generate versions of texts that are easier to understand for a broader audience.

Text Simplification

Automatic Standardization of Colloquial Persian

1 code implementation 10 Dec 2020 Mohammad Sadegh Rasooli, Farzane Bakhtyari, Fatemeh Shafiei, Mahsa Ravanbakhsh, Chris Callison-Burch

We also show that our model improves English-to-Persian machine translation in scenarios for which the training data is from colloquial Persian with 1.4 absolute BLEU score difference in the development data, and 0.8 in the test data.

Machine Translation Translation

RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text

2 code implementations EMNLP 2020 Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Chris Callison-Burch

In recent years, large neural networks for natural language generation (NLG) have made leaps and bounds in their ability to generate fluent text.

Human Detection Text Generation

Reasoning about Goals, Steps, and Temporal Ordering with WikiHow

1 code implementation EMNLP 2020 Li Zhang, Qing Lyu, Chris Callison-Burch

We propose a suite of reasoning tasks on two types of relations between procedural events: goal-step relations ("learn poses" is a step in the larger goal of "doing yoga") and step-step temporal relations ("buy a yoga mat" typically precedes "learn poses").

Cloze Test

Toward Better Storylines with Sentence-Level Language Models

1 code implementation ACL 2020 Daphne Ippolito, David Grangier, Douglas Eck, Chris Callison-Burch

We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.

Language Modelling Sentence +2

Bilingual is At Least Monolingual (BALM): A Novel Translation Algorithm that Encodes Monolingual Priors

1 code implementation 30 Aug 2019 Jeffrey Cheng, Chris Callison-Burch

State-of-the-art machine translation (MT) models do not use knowledge of any single language's structure; this is the equivalent of asking someone to translate from English to German while knowing neither language.

Machine Translation Translation

Winter is here: Summarizing Twitter Streams related to Pre-Scheduled Events

no code implementations WS 2019 Anietie Andy, Derry Tanti Wijaya, Chris Callison-Burch

Pre-scheduled events, such as TV shows and sports games, usually garner considerable attention from the public.

Comparison of Diverse Decoding Methods from Conditional Language Models

1 code implementation ACL 2019 Daphne Ippolito, Reno Kriz, Maria Kustikova, João Sedoc, Chris Callison-Burch

While conditional language models have greatly improved in their ability to output high-quality natural language, many NLP applications benefit from being able to generate a diverse set of candidate sequences.

PerspectroScope: A Window to the World of Diverse Perspectives

1 code implementation ACL 2019 Sihao Chen, Daniel Khashabi, Chris Callison-Burch, Dan Roth

This work presents PerspectroScope, a web-based system which lets users query a discussion-worthy natural language claim, and extract and visualize various perspectives in support or against the claim, along with evidence supporting each perspective.

Natural Language Inference Natural Language Understanding +1

ChatEval: A Tool for Chatbot Evaluation

no code implementations NAACL 2019 João Sedoc, Daphne Ippolito, Arun Kirubarajan, Jai Thirani, Lyle Ungar, Chris Callison-Burch

We introduce a unified framework for human evaluation of chatbots that augments existing tools and provides a web-based hub for researchers to share and compare their dialog systems.

Chatbot Open-Domain Dialog

A Comparison of Context-sensitive Models for Lexical Substitution

no code implementations WS 2019 Aina Garí Soler, Anne Cocos, Marianna Apidianaki, Chris Callison-Burch

Word embedding representations provide good estimates of word meaning and give state-of-the-art performance in semantic tasks.

Word Embeddings

Paraphrase-Sense-Tagged Sentences

no code implementations TACL 2019 Anne Cocos, Chris Callison-Burch

Many natural language processing tasks require discriminating the particular meaning of a word in context, but building corpora for developing sense-aware models can be a challenge.

Magnitude: A Fast, Efficient Universal Vector Embedding Utility Package

1 code implementation EMNLP 2018 Ajay Patel, Alexander Sands, Chris Callison-Burch, Marianna Apidianaki

Vector space embedding models like word2vec, GloVe, fastText, and ELMo are extremely popular representations in natural language processing (NLP) applications.

Word Embeddings

Comparing Constraints for Taxonomic Organization

no code implementations NAACL 2018 Anne Cocos, Marianna Apidianaki, Chris Callison-Burch

In this paper, we present a head-to-head comparison of six taxonomic organization algorithms that vary with respect to their structural and transitivity constraints, and treatment of synonymy.

Entity Extraction using GAN

Automated Paraphrase Lattice Creation for HyTER Machine Translation Evaluation

no code implementations NAACL 2018 Marianna Apidianaki, Guillaume Wisniewski, Anne Cocos, Chris Callison-Burch

We propose a variant of a well-known machine translation (MT) evaluation metric, HyTER (Dreyer and Marcu, 2012), which exploits reference translations enriched with meaning equivalent expressions.

Machine Translation Translation

Simplification Using Paraphrases and Context-Based Lexical Substitution

no code implementations NAACL 2018 Reno Kriz, Eleni Miltsakaki, Marianna Apidianaki, Chris Callison-Burch

Lexical simplification involves identifying complex words or phrases that need to be simplified, and recommending simpler meaning-preserving substitutes that can be more easily understood.

Complex Word Identification Lexical Simplification +1

Constructing an Alias List for Named Entities during an Event

no code implementations WS 2017 Anietie Andy, Mark Dredze, Mugizi Rwebangira, Chris Callison-Burch

EntitySpike uses a temporal heuristic to identify named entities with similar context that occur in the same time period (within minutes) during an event.

Community Question Answering

Mapping the Paraphrase Database to WordNet

no code implementations SEMEVAL 2017 Anne Cocos, Marianna Apidianaki, Chris Callison-Burch

WordNet has facilitated important research in natural language processing but its usefulness is somewhat limited by its relatively small lexical coverage.


Word Sense Filtering Improves Embedding-Based Lexical Substitution

no code implementations WS 2017 Anne Cocos, Marianna Apidianaki, Chris Callison-Burch

The role of word sense disambiguation in lexical substitution has been questioned due to the high performance of vector space models which propose good substitutes without explicitly accounting for sense.

Clustering Entity Extraction using GAN +5

Optimizing Statistical Machine Translation for Text Simplification

1 code implementation TACL 2016 Wei Xu, Courtney Napoles, Ellie Pavlick, Quanze Chen, Chris Callison-Burch

Most recent sentence simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus.

Machine Translation Sentence +2

Use of Modality and Negation in Semantically-Informed Syntactic MT

no code implementations 5 Feb 2015 Kathryn Baker, Michael Bloodgood, Bonnie J. Dorr, Chris Callison-Burch, Nathaniel W. Filardo, Christine Piatko, Lori Levin, Scott Miller

We apply our MN annotation scheme to statistical machine translation using a syntactic framework that supports the inclusion of semantic annotations.

Machine Translation Negation +1

Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation

no code implementations 21 Oct 2014 Michael Bloodgood, Chris Callison-Burch

We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources.

Active Learning Machine Translation +1

Semantically-Informed Syntactic Machine Translation: A Tree-Grafting Approach

no code implementations 24 Sep 2014 Kathryn Baker, Michael Bloodgood, Chris Callison-Burch, Bonnie J. Dorr, Nathaniel W. Filardo, Lori Levin, Scott Miller, Christine Piatko

We describe a unified and coherent syntactic framework for supporting a semantically-informed syntactic approach to statistical machine translation.

Machine Translation Translation

The Multilingual Paraphrase Database

no code implementations LREC 2014 Juri Ganitkevitch, Chris Callison-Burch

We release a massive expansion of the paraphrase database (PPDB) that now includes a collection of paraphrases in 23 different languages.

Document Summarization Information Retrieval +6

A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic

no code implementations LREC 2014 Ryan Cotterell, Chris Callison-Burch

To the best of the authors’ knowledge, this work is the most diverse corpus of dialectal Arabic in both the source of the content and the number of dialects.

Dialect Identification

The American Local News Corpus

no code implementations LREC 2014 Ann Irvine, Joshua Langfus, Chris Callison-Burch

We present the American Local News Corpus (ALNC), containing over 4 billion words of text from 2,652 online newspapers in the United States.

Extracting Lexically Divergent Paraphrases from Twitter

1 code implementation TACL 2014 Wei Xu, Alan Ritter, Chris Callison-Burch, William B. Dolan, Yangfeng Ji

We present MultiP (Multi-instance Learning Paraphrase Model), a new model suited to identify paraphrases within the short messages on Twitter.