Search Results for author: Kevin Gimpel

Found 95 papers, 40 papers with code

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

48 code implementations • ICLR 2020 • Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut

Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.

Ranked #1 on Natural Language Inference on QNLI

Common Sense Reasoning Linguistic Acceptability +7

124,984

Paper
Code

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

3 code implementations • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

2,650

Paper
Code

Gaussian Error Linear Units (GELUs)

8 code implementations • 27 Jun 2016 • Dan Hendrycks, Kevin Gimpel

We propose the Gaussian Error Linear Unit (GELU), a high-performing neural network activation function.

577

Paper
Code

A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks

14 code implementations • 7 Oct 2016 • Dan Hendrycks, Kevin Gimpel

We consider the two related problems of detecting if an example is misclassified or out-of-distribution.

Ranked #12 on Out-of-Distribution Detection on CIFAR-10 vs CIFAR-100

Anomaly Detection Automatic Speech Recognition +2

218

Paper
Code

Adversarial Example Generation with Syntactically Controlled Paraphrase Networks

2 code implementations • NAACL 2018 • Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer

We propose syntactically controlled paraphrase networks (SCPNs) and use them to generate adversarial examples.

Sentence

167

Paper
Code

Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise

1 code implementation • NeurIPS 2018 • Dan Hendrycks, Mantas Mazeika, Duncan Wilson, Kevin Gimpel

We utilize trusted data by proposing a loss correction technique that utilizes trusted examples in a data-efficient manner to mitigate the effects of label noise on deep neural network classifiers.

Data Poisoning

Paper
Code

Simple and Effective Paraphrastic Similarity from Parallel Translations

4 code implementations • ACL 2019 • John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick

We present a model and methodology for learning paraphrastic sentence embeddings directly from bitext, removing the time-consuming intermediate step of creating paraphrase corpora.

Sentence Sentence Embeddings

Paper
Code

Paraphrastic Representations at Scale

1 code implementation • 30 Apr 2021 • John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick

We train these models on large amounts of data, achieving significantly improved performance from the original papers proposing the methods on a suite of monolingual semantic similarity, cross-lingual semantic similarity, and bitext mining tasks.

Paper
Code

A Multi-Task Approach for Disentangling Syntax and Semantics in Sentence Representations

1 code implementation • NAACL 2019 • Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel

We propose a generative model for a sentence that uses two latent variables, with one intended to represent the syntax of the sentence and the other to represent its semantics.

Disentanglement Semantic Similarity +2

Paper
Code

Beyond BLEU: Training Neural Machine Translation with Semantic Similarity

1 code implementation • 14 Sep 2019 • John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig

While most neural machine translation (NMT) systems are still trained using maximum likelihood estimation, recent work has demonstrated that optimizing systems to directly improve evaluation metrics such as BLEU can substantially improve final translation accuracy.

Machine Translation NMT +3

Paper
Code

Evaluation Benchmarks and Learning Criteria for Discourse-Aware Sentence Representations

2 code implementations • IJCNLP 2019 • Mingda Chen, Zewei Chu, Kevin Gimpel

Prior work on pretrained sentence embeddings and benchmarks focus on the capabilities of stand-alone sentences.

Sentence Sentence Embeddings

Paper
Code

Variational Sequential Labelers for Semi-Supervised Learning

1 code implementation • EMNLP 2018 • Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel

Our model family consists of a latent-variable generative model and a discriminative labeler.

Ranked #72 on Named Entity Recognition (NER) on CoNLL 2003 (English)

Learning Word Embeddings

Paper
Code

Chess as a Testbed for Language Model State Tracking

2 code implementations • 26 Feb 2021 • Shubham Toshniwal, Sam Wiseman, Karen Livescu, Kevin Gimpel

Motivated by this issue, we consider the task of language modeling for the game of chess.

Game of Chess Language Modelling +1

Paper
Code

Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks

3 code implementations • EMNLP 2020 • Shubham Toshniwal, Sam Wiseman, Allyson Ettinger, Karen Livescu, Kevin Gimpel

Long document coreference resolution remains a challenging task due to the large memory and runtime requirements of current models.

Ranked #9 on Coreference Resolution on CoNLL 2012

Management

Paper
Code

SummScreen: A Dataset for Abstractive Screenplay Summarization

1 code implementation • ACL 2022 • Mingda Chen, Zewei Chu, Sam Wiseman, Kevin Gimpel

Since characters are fundamental to TV series, we also propose two entity-centric evaluation metrics.

Abstractive Text Summarization

Paper
Code

On Generalization in Coreference Resolution

2 code implementations • CRAC (ACL) 2021 • Shubham Toshniwal, Patrick Xia, Sam Wiseman, Karen Livescu, Kevin Gimpel

While coreference resolution is defined independently of dataset domain, most models for performing coreference resolution do not transfer well to unseen domains.

Ranked #1 on Coreference Resolution on WikiCoref

coreference-resolution Data Augmentation

Paper
Code

Smaller Text Classifiers with Discriminative Cluster Embeddings

1 code implementation • NAACL 2018 • Mingda Chen, Kevin Gimpel

Word embedding parameters often dominate overall model sizes in neural methods for natural language processing.

Clustering

Paper
Code

Learning Approximate Inference Networks for Structured Prediction

3 code implementations • ICLR 2018 • Lifu Tu, Kevin Gimpel

Prior work used gradient descent for inference, relaxing the structured output to a set of continuous variables and then optimizing the energy with respect to them.

Language Modelling Multi-Label Classification +2

Paper
Code

ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation

1 code implementation • ACL 2020 • Lifu Tu, Richard Yuanzhe Pang, Sam Wiseman, Kevin Gimpel

We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model.

Machine Translation Translation

Paper
Code

EntEval: A Holistic Evaluation Benchmark for Entity Representations

2 code implementations • IJCNLP 2019 • Mingda Chen, Zewei Chu, Yang Chen, Karl Stratos, Kevin Gimpel

Rich entity representations are useful for a wide class of problems involving entities.

Entity Disambiguation Entity Typing

Paper
Code

How to Ask Better Questions? A Large-Scale Multi-Domain Dataset for Rewriting Ill-Formed Questions

1 code implementation • 21 Nov 2019 • Zewei Chu, Mingda Chen, Jing Chen, Miaosen Wang, Kevin Gimpel, Manaal Faruqui, Xiance Si

We present a large-scale dataset for the task of rewriting an ill-formed natural language question to a well-formed one.

Question Rewriting

Paper
Code

A Cross-Task Analysis of Text Span Representations

1 code implementation • WS 2020 • Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu, Kevin Gimpel

Many natural language processing (NLP) tasks involve reasoning with textual spans, including question answering, entity recognition, and coreference resolution.

coreference-resolution Question Answering

Paper
Code

TVStoryGen: A Dataset for Generating Stories with Character Descriptions

1 code implementation • 18 Sep 2021 • Mingda Chen, Kevin Gimpel

We introduce TVStoryGen, a story generation dataset that requires generating detailed TV show episode recaps from a brief summary and a set of documents describing the characters involved.

Ranked #1 on Story Generation on Fandom dev

Abstractive Text Summarization Story Generation

Paper
Code

WikiTableT: A Large-Scale Data-to-Text Dataset for Generating Wikipedia Article Sections

1 code implementation • Findings (ACL) 2021 • Mingda Chen, Sam Wiseman, Kevin Gimpel

Datasets for data-to-text generation typically focus either on multi-domain, single-sentence generation or on single-domain, long-form generation.

Data-to-Text Generation Sentence

Paper
Code

Mining Knowledge for Natural Language Inference from Wikipedia Categories

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Mingda Chen, Zewei Chu, Karl Stratos, Kevin Gimpel

Accurate lexical entailment (LE) and natural language inference (NLI) often require large quantities of costly annotations.

Lexical Entailment Natural Language Inference

Paper
Code

Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information

1 code implementation • NAACL 2018 • Trang Tran, Shubham Toshniwal, Mohit Bansal, Kevin Gimpel, Karen Livescu, Mari Ostendorf

In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses.

Sentence

Paper
Code

Early Methods for Detecting Adversarial Images

1 code implementation • 1 Aug 2016 • Dan Hendrycks, Kevin Gimpel

Many machine learning classifiers are vulnerable to adversarial perturbations.

BIG-bench Machine Learning

Paper
Code

NatCat: Weakly Supervised Text Classification with Naturally Annotated Resources

1 code implementation • AKBC 2021 • Zewei Chu, Karl Stratos, Kevin Gimpel

We describe NatCat, a large-scale resource for text classification constructed from three data sources: Wikipedia, Stack Exchange, and Reddit.

General Classification Text Categorization +1

Paper
Code

Unsupervised Label Refinement Improves Dataless Text Classification

1 code implementation • Findings (ACL) 2021 • Zewei Chu, Karl Stratos, Kevin Gimpel

This reliance causes dataless classifiers to be highly sensitive to the choice of label descriptions and hinders the broader application of dataless classification in practice.

Ranked #3 on Zero-Shot Text Classification on AG News

Clustering General Classification +2

Paper
Code

Exemplar-Controllable Paraphrasing and Translation using Bitext

1 code implementation • 12 Oct 2020 • Mingda Chen, Sam Wiseman, Kevin Gimpel

Our experimental results show that our models achieve competitive results on controlled paraphrase generation and strong performance on controlled machine translation.

Machine Translation Paraphrase Generation +1

Paper
Code

Adjusting for Dropout Variance in Batch Normalization and Weight Initialization

1 code implementation • 8 Jul 2016 • Dan Hendrycks, Kevin Gimpel

We show how to adjust for the variance introduced by dropout with corrections to weight initialization and Batch Normalization, yielding higher accuracy.

Data Augmentation

Paper
Code

PeTra: A Sparsely Supervised Memory Model for People Tracking

1 code implementation • ACL 2020 • Shubham Toshniwal, Allyson Ettinger, Kevin Gimpel, Karen Livescu

We propose PeTra, a memory-augmented neural network designed to track entities in its memory slots.

Ranked #1 on Coreference Resolution on GAP (F1 metric)

Coreference Resolution

Paper
Code

Benchmarking Approximate Inference Methods for Neural Structured Prediction

1 code implementation • NAACL 2019 • Lifu Tu, Kevin Gimpel

One approach is to perform gradient descent with respect to the output structure directly (Belanger and McCallum, 2016).

Benchmarking Structured Prediction

Paper
Code

The Benefits of Label-Description Training for Zero-Shot Text Classification

1 code implementation • 3 May 2023 • Lingyu Gao, Debanjan Ghosh, Kevin Gimpel

Pretrained language models have improved zero-shot text classification by allowing the transfer of semantic knowledge from the training data in order to classify among specific label sets in downstream tasks.

domain classification text-classification +3

Paper
Code

An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks

1 code implementation • EMNLP 2020 • Lifu Tu, Tianyu Liu, Kevin Gimpel

Many tasks in natural language processing involve predicting structured outputs, e. g., sequence labeling, semantic role labeling, parsing, and machine translation.

Machine Translation Representation Learning +2

Paper
Code

Structured Tree Alignment for Evaluation of (Speech) Constituency Parsing

1 code implementation • 21 Feb 2024 • Freda Shi, Kevin Gimpel, Karen Livescu

We present the structured average intersection-over-union ratio (STRUCT-IOU), a similarity metric between constituency parse trees motivated by the problem of evaluating speech parsers.

Constituency Parsing

Paper
Code

ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations

no code implementations • ACL 2018 • John Wieting, Kevin Gimpel

We describe PARANMT-50M, a dataset of more than 50 million English-English sentential paraphrase pairs.

Machine Translation Natural Language Understanding +5

Paper
Add Code

A Study of All-Convolutional Encoders for Connectionist Temporal Classification

no code implementations • 28 Oct 2017 • Kalpesh Krishna, Liang Lu, Kevin Gimpel, Karen Livescu

We explore whether deep convolutional neural networks (CNNs) can be used effectively instead of RNNs as the "encoder" in CTC.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

End-to-End Neural Segmental Models for Speech Recognition

no code implementations • 1 Aug 2017 • Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals

Segmental models are an alternative to frame-based models for sequence prediction, where hypothesized path weights are based on entire segment scores rather than a single frame at a time.

speech-recognition Speech Recognition

Paper
Add Code

Learning to Embed Words in Context for Syntactic Tasks

no code implementations • WS 2017 • Lifu Tu, Kevin Gimpel, Karen Livescu

We present models for embedding words in the context of surrounding words.

Paper
Add Code

Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext

no code implementations • EMNLP 2017 • John Wieting, Jonathan Mallinson, Kevin Gimpel

We consider the problem of learning general-purpose, paraphrastic sentence embeddings in the setting of Wieting et al. (2016b).

Machine Translation Sentence +2

Paper
Add Code

Emergent Predication Structure in Hidden State Vectors of Neural Readers

no code implementations • WS 2017 • Hai Wang, Takeshi Onishi, Kevin Gimpel, David Mcallester

A significant number of neural architectures for reading comprehension have recently been developed and evaluated on large cloze-style datasets.

Reading Comprehension

Paper
Add Code

Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings

no code implementations • ACL 2017 • John Wieting, Kevin Gimpel

We consider the problem of learning general-purpose, paraphrastic sentence embeddings, revisiting the setting of Wieting et al. (2016b).

Sentence Sentence Embeddings +1

Paper
Add Code

Broad Context Language Modeling as Reading Comprehension

no code implementations • EACL 2017 • Zewei Chu, Hai Wang, Kevin Gimpel, David Mcallester

Progress in text understanding has been driven by large datasets that test particular capabilities, like recent datasets for reading comprehension (Hermann et al., 2015).

Ranked #32 on Language Modelling on LAMBADA

coreference-resolution LAMBADA +2

Paper
Add Code

End-to-End Training Approaches for Discriminative Segmental Models

no code implementations • 21 Oct 2016 • Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu

Similarly to hybrid HMM-neural network models, segmental models of this class can be trained in two stages (frame classifier training followed by linear segmental model weight training), end to end (joint training of both frame classifier and linear weights), or with end-to-end fine-tuning after two-stage training.

speech-recognition Speech Recognition

Paper
Add Code

Who did What: A Large-Scale Person-Centered Cloze Dataset

no code implementations • EMNLP 2016 • Takeshi Onishi, Hai Wang, Mohit Bansal, Kevin Gimpel, David Mcallester

We have constructed a new "Who-did-What" dataset of over 200, 000 fill-in-the-gap (cloze) multiple choice reading comprehension problems constructed from the LDC English Gigaword newswire corpus.

Multiple-choice Reading Comprehension

Paper
Add Code

Discriminative Segmental Cascades for Feature-Rich Phone Recognition

no code implementations • 22 Jul 2015 • Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu

A typical solution is to use approximate decoding, either by beam pruning in a single pass or by beam pruning to generate a lattice followed by a second pass.

Language Modelling speech-recognition +2

Paper
Add Code

Efficient Segmental Cascades for Speech Recognition

no code implementations • 2 Aug 2016 • Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu

Discriminative segmental models offer a way to incorporate flexible feature functions into speech recognition.

speech-recognition Speech Recognition

Paper
Add Code

Charagram: Embedding Words and Sentences via Character n-grams

no code implementations • EMNLP 2016 • John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu

We present Charagram embeddings, a simple approach for learning character-based compositional models to embed textual sequences.

Part-Of-Speech Tagging Sentence +2

Paper
Add Code

Mapping Unseen Words to Task-Trained Embedding Spaces

no code implementations • WS 2016 • Pranava Swaroop Madhyastha, Mohit Bansal, Kevin Gimpel, Karen Livescu

We consider the supervised training setting in which we learn task-specific word embeddings.

Dependency Parsing Sentiment Analysis +1

Paper
Add Code

Towards Universal Paraphrastic Sentence Embeddings

no code implementations • 25 Nov 2015 • John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu

We again find that the word averaging models perform well for sentence similarity and entailment, outperforming LSTMs.

General Classification Sentence +4

Paper
Add Code

From Paraphrase Database to Compositional Paraphrase Model and Back

1 code implementation • TACL 2015 • John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu, Dan Roth

The Paraphrase Database (PPDB; Ganitkevitch et al., 2013) is an extensive semantic resource, consisting of a list of phrase pairs with (heuristic) confidence estimates.

Word Embeddings

Paper
Code

Predicting the NFL using Twitter

no code implementations • 25 Oct 2013 • Shiladitya Sinha, Chris Dyer, Kevin Gimpel, Noah A. Smith

We study the relationship between social media output and National Football League (NFL) games, using a dataset containing messages from Twitter and NFL game statistics.

Paper
Add Code

Unsupervised Evaluation Metrics and Learning Criteria for Non-Parallel Textual Transfer

no code implementations • WS 2019 • Richard Yuanzhe Pang, Kevin Gimpel

We show that the metric of post-transfer classification accuracy is insufficient on its own, and propose additional metrics based on semantic preservation and fluency as well as a way to combine them into a single overall score.

Sentence

Paper
Add Code

Pay Attention to the Ending:Strong Neural Baselines for the ROC Story Cloze Task

no code implementations • ACL 2017 • Zheng Cai, Lifu Tu, Kevin Gimpel

We consider the ROC story cloze task (Mostafazadeh et al., 2016) and present several findings.

Negation Outlier Detection +1

Paper
Add Code

Commonsense Knowledge Base Completion

no code implementations • ACL 2016 • Xiang Li, Aynaz Taheri, Lifu Tu, Kevin Gimpel

Knowledge Base Completion

Paper
Add Code

Quality Signals in Generated Stories

no code implementations • SEMEVAL 2018 • Manasvi Sagarkar, John Wieting, Lifu Tu, Kevin Gimpel

We study the problem of measuring the quality of automatically-generated stories.

reinforcement-learning Reinforcement Learning (RL) +3

Paper
Add Code

UMD-TTIC-UW at SemEval-2016 Task 1: Attention-Based Multi-Perspective Convolutional Neural Networks for Textual Similarity Measurement

no code implementations • SEMEVAL 2016 • Hua He, John Wieting, Kevin Gimpel, Jinfeng Rao, Jimmy Lin

Feature Engineering Question Answering +2

Paper
Add Code

Constraints Based Convex Belief Propagation

no code implementations • NeurIPS 2016 • YAniv Tenzer, Alex Schwing, Kevin Gimpel, Tamir Hazan

Inference in Markov random fields subject to consistency structure is a fundamental problem that arises in many real-life applications.

Paper
Add Code

Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features

no code implementations • CL 2014 • Kevin Gimpel, Noah A. Smith

Dependency Parsing Machine Translation +1

Paper
Add Code

A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment

no code implementations • TACL 2015 • Jing Wang, Mohit Bansal, Kevin Gimpel, Brian D. Ziebart, Clement T. Yu

Word sense induction (WSI) seeks to automatically discover the senses of a word in a corpus via unsupervised methods.

Topic Models Word Embeddings +1

Paper
Add Code

Machine Comprehension with Syntax, Frames, and Semantics

no code implementations • IJCNLP 2015 • Hai Wang, Mohit Bansal, Kevin Gimpel, David Mcallester

Question Answering Reading Comprehension +1

Paper
Add Code

Tailoring Continuous Word Representations for Dependency Parsing

no code implementations • ACL 2014 • Mohit Bansal, Kevin Gimpel, Karen Livescu

Chunking Dependency Parsing +1

Paper
Add Code

Deep Multilingual Correlation for Improved Word Embeddings

no code implementations • HLT 2015 • Mohit Bansal, Kevin Gimpel, Karen Livescu, Ang Lu, Weiran Wang

Clustering named-entity-recognition +6

Paper
Add Code

Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters

no code implementations • NAACL 2013 • Olutobi Owoputi, Brendan O{'}Connor, Chris Dyer, Kevin Gimpel, Nathan Schneider, Noah A. Smith

Ranked #2 on Part-Of-Speech Tagging on Social media

Named Entity Recognition (NER) Part-Of-Speech Tagging

Paper
Add Code

Structured Ramp Loss Minimization for Machine Translation

no code implementations • NAACL 2012 • Kevin Gimpel, Noah A. Smith

Machine Translation Translation

Paper
Add Code

Concavity and Initialization for Unsupervised Dependency Parsing

no code implementations • NAACL 2012 • Kevin Gimpel, Noah A. Smith

Dependency Grammar Induction Part-Of-Speech Tagging +2

Paper
Add Code

Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks

no code implementations • EMNLP 2015 • Hua He, Kevin Gimpel, Jimmy Lin

Feature Engineering Machine Translation +5

Paper
Add Code

Weakly-Supervised Learning with Cost-Augmented Contrastive Estimation

no code implementations • EMNLP 2014 • Kevin Gimpel, Mohit Bansal

Dependency Parsing Weakly-supervised Learning +1

Paper
Add Code

A Systematic Exploration of Diversity in Machine Translation

no code implementations • EMNLP 2013 • Kevin Gimpel, Dhruv Batra, Chris Dyer, Gregory Shakhnarovich

Machine Translation Translation

Paper
Add Code

Word Salad: Relating Food Prices and Descriptions

no code implementations • EMNLP 2012 • Victor Chahuneau, Kevin Gimpel, Bryan R. Routledge, Lily Scherlis, Noah A. Smith

Paper
Add Code

PoMo: Generating Entity-Specific Post-Modifiers in Context

no code implementations • NAACL 2019 • Jun Seok Kang, Robert L. Logan IV, Zewei Chu, Yang Chen, Dheeru Dua, Kevin Gimpel, Sameer Singh, Niranjan Balasubramanian

Given a sentence about a target entity, the task is to automatically generate a post-modifier phrase that provides contextually relevant information about the entity.

Sentence

Paper
Add Code

Controllable Paraphrase Generation with a Syntactic Exemplar

no code implementations • ACL 2019 • Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel

Prior work on controllable text generation usually assumes that the controlled attribute can take on one of a small set of values known a priori.

Attribute Paraphrase Generation +2

Paper
Add Code

Visually Grounded Neural Syntax Acquisition

no code implementations • ACL 2019 • Haoyue Shi, Jiayuan Mao, Kevin Gimpel, Karen Livescu

We define concreteness of constituents by their matching scores with images, and use it to guide the parsing of text.

Visual Grounding

Paper
Add Code

Beyond BLEU:Training Neural Machine Translation with Semantic Similarity

no code implementations • ACL 2019 • John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig

While most neural machine translation (NMT)systems are still trained using maximum likelihood estimation, recent work has demonstrated that optimizing systems to directly improve evaluation metrics such as BLEU can significantly improve final translation accuracy.

Machine Translation NMT +3

Paper
Add Code

Generating Diverse Story Continuations with Controllable Semantics

no code implementations • WS 2019 • Lifu Tu, Xiaoan Ding, Dong Yu, Kevin Gimpel

We propose a simple and effective modeling framework for controlled generation of multiple, diverse outputs.

Sentence

Paper
Add Code

Latent-Variable Generative Models for Data-Efficient Text Classification

no code implementations • IJCNLP 2019 • Xiaoan Ding, Kevin Gimpel

This model consistently outperforms both the generative and discriminative classifiers in small-data settings.

General Classification Language Modelling +4

Paper
Add Code

Improving Joint Training of Inference Networks and Structured Prediction Energy Networks

no code implementations • EMNLP (spnlp) 2020 • Lifu Tu, Richard Yuanzhe Pang, Kevin Gimpel

Deep energy-based models are powerful, but pose challenges for learning and inference (Belanger and McCallum, 2016).

Structured Prediction

Paper
Add Code

Learning Probabilistic Sentence Representations from Paraphrases

no code implementations • WS 2020 • Mingda Chen, Kevin Gimpel

Probabilistic word embeddings have shown effectiveness in capturing notions of generality and entailment, but there is very little work on doing the analogous type of investigation for sentences.

Sentence Specificity +1

Paper
Add Code

Distractor Analysis and Selection for Multiple-Choice Cloze Questions for Second-Language Learners

no code implementations • WS 2020 • Lingyu Gao, Kevin Gimpel, Arnar Jensson

Simple features of the distractor and correct answer correlate with the annotations, though we find substantial benefit to additionally using large-scale pretrained models to measure the fit of the distractor in the context.

Multiple-choice

Paper
Add Code

Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size

no code implementations • 16 Aug 2020 • Davis Yoshida, Allyson Ettinger, Kevin Gimpel

Fine-tuning a pretrained transformer for a downstream task has become a standard method in NLP in the last few years.

Language Modelling

Paper
Add Code

Learning Chess Blindfolded

no code implementations • 1 Jan 2021 • Shubham Toshniwal, Sam Wiseman, Karen Livescu, Kevin Gimpel

Motivated by this issue, we consider the task of language modeling for the game of chess.

Domain Probing Game of Chess +2

Paper
Add Code

Adding Recurrence to Pretrained Transformers

no code implementations • 1 Jan 2021 • Davis Yoshida, Allyson Ettinger, Kevin Gimpel

Fine-tuning a pretrained transformer for a downstream task has become a standard method in NLP in the last few years.

Language Modelling

Paper
Add Code

On the Role of Supervision in Unsupervised Constituency Parsing

no code implementations • EMNLP 2020 • Haoyue Shi, Karen Livescu, Kevin Gimpel

We analyze several recent unsupervised constituency parsing models, which are tuned with respect to the parsing $F_1$ score on the Wall Street Journal (WSJ) development set (1, 700 sentences).

Constituency Parsing Data Augmentation +1

Paper
Add Code

Discriminatively-Tuned Generative Classifiers for Robust Natural Language Inference

1 code implementation • EMNLP 2020 • Xiaoan Ding, Tianyu Liu, Baobao Chang, Zhifang Sui, Kevin Gimpel

We explore training objectives for discriminative fine-tuning of our generative classifiers, showing improvements over log loss fine-tuning from prior work .

Natural Language Inference

Paper
Code

Deep Clustering of Text Representations for Supervision-free Probing of Syntax

no code implementations • 24 Oct 2020 • Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan

We explore deep clustering of text representations for unsupervised model interpretation and induction of syntax.

Clustering Deep Clustering +1

Paper
Add Code

Substructure Substitution: Structured Data Augmentation for NLP

no code implementations • Findings (ACL) 2021 • Haoyue Shi, Karen Livescu, Kevin Gimpel

We study a family of data augmentation methods, substructure substitution (SUB2), for natural language processing (NLP) tasks.

Data Augmentation Part-Of-Speech Tagging +2

Paper
Add Code

FlowPrior: Learning Expressive Priors for Latent Variable Sentence Models

no code implementations • NAACL 2021 • Xiaoan Ding, Kevin Gimpel

Variational autoencoders (VAEs) are widely used for latent variable modeling of text.

Language Modelling Sentence

Paper
Add Code

Substructure Distribution Projection for Zero-Shot Cross-Lingual Dependency Parsing

no code implementations • ACL 2022 • Haoyue Shi, Kevin Gimpel, Karen Livescu

We present substructure distribution projection (SubDP), a technique that projects a distribution over structures in one domain to another, by projecting substructure distributions separately.

Dependency Parsing

Paper
Add Code

Reconsidering the Past: Optimizing Hidden States in Language Models

no code implementations • Findings (EMNLP) 2021 • Davis Yoshida, Kevin Gimpel

We present Hidden-State Optimization (HSO), a gradient-based method for improving the performance of transformer language models at inference time.

Language Modelling

Paper
Add Code

"What makes a question inquisitive?" A Study on Type-Controlled Inquisitive Question Generation

no code implementations • 17 May 2022 • Lingyu Gao, Debanjan Ghosh, Kevin Gimpel

We propose a type-controlled framework for inquisitive question generation.

Question Generation Question-Generation +2

Paper
Add Code

“What makes a question inquisitive?” A Study on Type-Controlled Inquisitive Question Generation

1 code implementation • *SEM (NAACL) 2022 • Lingyu Gao, Debanjan Ghosh, Kevin Gimpel

We propose a type-controlled framework for inquisitive question generation.

Question Generation Question-Generation +2

Paper
Code

Audio-Visual Neural Syntax Acquisition

no code implementations • 11 Oct 2023 • Cheng-I Jeff Lai, Freda Shi, Puyuan Peng, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David Cox, David Harwath, Yang Zhang, Karen Livescu, James Glass

We study phrase structure induction from visually-grounded speech.

Language Acquisition

Paper
Add Code

MAP's not dead yet: Uncovering true language model modes by conditioning away degeneracy

no code implementations • 15 Nov 2023 • Davis Yoshida, Kartik Goyal, Kevin Gimpel

It has been widely observed that exact or approximate MAP (mode-seeking) decoding from natural language generation (NLG) models consistently leads to degenerate outputs (Stahlberg and Byrne, 2019, Holtzman et al., 2019).

Instruction Following Language Modelling +2

Paper
Add Code

GEE! Grammar Error Explanation with Large Language Models

1 code implementation • 16 Nov 2023 • Yixiao Song, Kalpesh Krishna, Rajesh Bhatt, Kevin Gimpel, Mohit Iyyer

To address this gap, we propose the task of grammar error explanation, where a system needs to provide one-sentence explanations for each grammatical error in a pair of erroneous and corrected sentences.

Grammatical Error Correction Sentence

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.