Search Results for author: Mike Lewis

Found 51 papers, 24 papers with code

Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models

no code implementations 16 Oct 2021 Qinyuan Ye, Madian Khabsa, Mike Lewis, Sinong Wang, Xiang Ren, Aaron Jaech

In this paper, we aim to further push the limit of inference speed by exploring a new area in the design space of the student model.

Classification Domain Generalization +1

Tricks for Training Sparse Translation Models

no code implementations 15 Oct 2021 Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, Mike Lewis, Angela Fan

Multi-task learning with an unbalanced data distribution skews model learning towards high resource tasks, especially when model capacity is fixed and fully shared across all tasks.

Machine Translation Multi-Task Learning +1

8-bit Optimizers via Block-wise Quantization

1 code implementation 6 Oct 2021 Tim Dettmers, Mike Lewis, Sam Shleifer, Luke Zettlemoyer

To maintain stability and performance, we combine block-wise quantization with two additional changes: (1) dynamic quantization, a form of non-linear quantization that is precise for both large and small magnitude values, and (2) a stable embedding layer to reduce gradient variance that comes from the highly non-uniform distribution of input tokens in language models.

Language Modelling Machine Translation +1
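
To make the block-wise idea concrete, here is a minimal NumPy sketch (not the paper's released implementation): every block of the optimizer state carries its own absmax scale, so an outlier in one block cannot crush the resolution of the others. Plain linear quantization stands in for the paper's non-linear dynamic quantization.

```python
import numpy as np

def blockwise_quantize(x, block_size=2048):
    """Split a 1-D float tensor into blocks; quantize each to int8 with its own scale."""
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True) + 1e-12  # per-block absmax
    q = np.round(blocks / scales * 127).astype(np.int8)
    return q, scales, len(x)

def blockwise_dequantize(q, scales, n):
    """Invert the mapping; quantization error is bounded per block, not globally."""
    return (q.astype(np.float32) / 127 * scales).reshape(-1)[:n]

state = np.random.randn(10_000).astype(np.float32)  # stand-in optimizer state
q, scales, n = blockwise_quantize(state)
recovered = blockwise_dequantize(q, scales, n)
```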

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation

1 code implementation 27 Aug 2021 Ofir Press, Noah A. Smith, Mike Lewis

We introduce a simple and efficient method, Attention with Linear Biases (ALiBi), that allows for extrapolation.

Word Embeddings
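
The mechanism fits in a few lines. A sketch of one causal attention head with the ALiBi penalty, assuming NumPy; in the paper each head uses a different slope drawn from a geometric sequence (e.g. 1/2, 1/4, ..., 1/256 for 8 heads).

```python
import numpy as np

def alibi_attention_scores(q, k, slope):
    """Pre-softmax attention scores with a linear distance penalty
    in place of positional embeddings."""
    n, d = q.shape
    i, j = np.arange(n)[:, None], np.arange(n)[None, :]
    bias = -slope * (i - j)                  # penalty grows with key distance
    mask = np.where(j <= i, 0.0, -np.inf)    # causal LM: attend to the past only
    return q @ k.T / np.sqrt(d) + bias + mask

q, k = np.random.randn(16, 64), np.random.randn(16, 64)
scores = alibi_attention_scores(q, k, slope=0.5)
```

Because the bias is a fixed function of distance rather than a learned embedding, the same computation applies unchanged to sequences longer than any seen during training, which is what enables extrapolation.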

DEMix Layers: Disentangling Domains for Modular Language Modeling

2 code implementations 11 Aug 2021 Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer

We introduce a new domain expert mixture (DEMix) layer that enables conditioning a language model (LM) on the domain of the input text.

Language Modelling
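
A sketch of what that conditioning can look like, assuming PyTorch and illustrative names: the transformer's feed-forward sublayer is replaced with one expert per training domain, selected by a known domain label rather than a learned router (the paper additionally mixes experts at inference time when the domain is unknown).

```python
import torch
import torch.nn as nn

class DEMixStyleFFN(nn.Module):
    """Feed-forward sublayer with one expert per training domain (hypothetical names)."""
    def __init__(self, d_model=512, d_ff=2048, n_domains=8):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_domains))

    def forward(self, x, domain_id):
        # Routing is a parameter-free lookup on metadata, so experts can be
        # added or removed without retraining the rest of the network.
        return self.experts[domain_id](x)

layer = DEMixStyleFFN()
out = layer(torch.randn(4, 128, 512), domain_id=3)
```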

Question Answering Infused Pre-training of General-Purpose Contextualized Representations

1 code implementation 15 Jun 2021 Robin Jia, Mike Lewis, Luke Zettlemoyer

This paper proposes a pre-training objective based on question answering (QA) for learning general-purpose contextual representations, motivated by the intuition that the representation of a phrase in a passage should encode all questions that the phrase can answer in context.

Named Entity Recognition Question Answering +1

Multitasking Inhibits Semantic Drift

no code implementations NAACL 2021 Athul Paul Jacob, Mike Lewis, Jacob Andreas

When intelligent agents communicate to accomplish shared goals, how do these goals shape the agents' language?

BASE Layers: Simplifying Training of Large, Sparse Models

1 code implementation 30 Mar 2021 Mike Lewis, Shruti Bhosale, Tim Dettmers, Naman Goyal, Luke Zettlemoyer

Sparse layers can dramatically improve the efficiency of training and inference by routing each token to specialized expert modules that contain only a small fraction of the model parameters.
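
What distinguishes BASE layers is that routing is computed as a balanced assignment, so every expert receives the same number of tokens: no expert is overloaded and none sits idle. The paper solves this with a linear-assignment algorithm; the greedy, capacity-capped sketch below only illustrates the balance constraint.

```python
import torch

def balanced_route(scores):
    """Greedy balanced assignment of tokens to experts (a sketch, not the
    paper's optimal solver). scores: (n_tokens, n_experts) affinities."""
    n_tokens, n_experts = scores.shape
    capacity = n_tokens // n_experts          # equal share per expert
    load = [0] * n_experts
    assign = torch.full((n_tokens,), -1, dtype=torch.long)
    for idx in torch.argsort(scores.flatten(), descending=True):
        t, e = divmod(idx.item(), n_experts)  # visit pairs by affinity
        if assign[t] == -1 and load[e] < capacity:
            assign[t] = e
            load[e] += 1
    return assign

experts = balanced_route(torch.randn(64, 8))  # each expert gets 8 tokens
```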

Nearest Neighbor Machine Translation

1 code implementation ICLR 2021 Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

We introduce $k$-nearest-neighbor machine translation ($k$NN-MT), which predicts tokens with a nearest neighbor classifier over a large datastore of cached examples, using representations from a neural translation model for similarity search.

Machine Translation Translation
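
A rough sketch of the retrieval step, assuming NumPy with brute-force search where the paper uses an approximate nearest-neighbor index: the datastore maps cached decoder states to the target tokens they produced, and the distances to the k nearest states become a distribution over next tokens that is then interpolated with the base translation model's prediction.

```python
import numpy as np

def knn_token_probs(query, keys, values, vocab_size, k=8, temp=10.0):
    """keys: (N, d) cached decoder states; values: (N,) their target tokens."""
    d2 = ((keys - query) ** 2).sum(axis=1)     # squared L2 to every cached state
    nn = np.argsort(d2)[:k]                    # brute force; ANN index in practice
    w = np.exp(-d2[nn] / temp)                 # nearer neighbors weigh more
    probs = np.zeros(vocab_size)
    np.add.at(probs, values[nn], w / w.sum())  # aggregate weight per token id
    return probs

keys = np.random.randn(1000, 64)
values = np.random.randint(0, 100, size=1000)
p_knn = knn_token_probs(np.random.randn(64), keys, values, vocab_size=100)
```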

Conversational Semantic Parsing

no code implementations EMNLP 2020 Armen Aghajanyan, Jean Maillard, Akshat Shrivastava, Keith Diedrick, Mike Haeger, Haoran Li, Yashar Mehdad, Ves Stoyanov, Anuj Kumar, Mike Lewis, Sonal Gupta

In this paper, we propose a semantic representation for such task-oriented conversational systems that can represent concepts such as co-reference and context carryover, enabling comprehensive understanding of queries in a session.

Semantic Parsing

Grounded Adaptation for Zero-shot Executable Semantic Parsing

no code implementations EMNLP 2020 Victor Zhong, Mike Lewis, Sida I. Wang, Luke Zettlemoyer

We propose Grounded Adaptation for Zero-shot Executable Semantic Parsing (GAZP) to adapt an existing semantic parser to new environments (e.g., new database schemas).

Data Augmentation Semantic Parsing

Pre-training via Paraphrasing

1 code implementation NeurIPS 2020 Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer

The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks.

Document Summarization Document Translation +4

Asking and Answering Questions to Evaluate the Factual Consistency of Summaries

1 code implementation ACL 2020 Alex Wang, Kyunghyun Cho, Mike Lewis

QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source.

Abstractive Text Summarization
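
The scoring step reduces to answer agreement. A sketch with hypothetical stand-ins: `answer_fn` for a trained QA model, and a pre-generated question list where QAGS uses a question-generation model conditioned on the summary.

```python
def token_f1(a, b):
    """SQuAD-style token-overlap F1 between two answer strings."""
    a_toks, b_toks = a.lower().split(), b.lower().split()
    common = sum(min(a_toks.count(t), b_toks.count(t)) for t in set(a_toks))
    if not common:
        return 0.0
    p, r = common / len(a_toks), common / len(b_toks)
    return 2 * p * r / (p + r)

def consistency_score(questions, answer_fn, summary, source):
    """Average agreement between answers grounded in the summary vs. the
    source; low agreement flags content the source does not support."""
    return sum(token_f1(answer_fn(q, summary), answer_fn(q, source))
               for q in questions) / len(questions)
```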

Multilingual Denoising Pre-training for Neural Machine Translation

4 code implementations 22 Jan 2020 Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.

Denoising Document-level +2

Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models

no code implementations 9 Nov 2019 Siddharth Dalmia, Abdel-rahman Mohamed, Mike Lewis, Florian Metze, Luke Zettlemoyer

Inspired by modular software design principles of independence, interchangeability, and clarity of interface, we introduce a method for enforcing encoder-decoder modularity in seq2seq models without sacrificing the overall model quality or its full differentiability.

Generalization through Memorization: Nearest Neighbor Language Models

2 code implementations ICLR 2020 Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

Applying this augmentation to a strong Wikitext-103 LM, with neighbors drawn from the original training set, our $k$NN-LM achieves a new state-of-the-art perplexity of 15.79, a 2.9 point improvement with no additional training.

Domain Adaptation Language Modelling
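
The combination itself is a one-line interpolation (the retrieval side mirrors the kNN-MT sketch shown under the companion translation paper above). A minimal sketch:

```python
import numpy as np

def knn_lm(p_lm, p_knn, lam=0.25):
    """Next-token distribution: p(w|x) = lam * p_knn(w|x) + (1 - lam) * p_lm(w|x).
    The mixing weight lam is tuned on validation data."""
    return lam * p_knn + (1 - lam) * p_lm

p_lm = np.array([0.7, 0.2, 0.1])   # base model's next-token distribution
p_knn = np.array([0.0, 0.9, 0.1])  # distribution over retrieved neighbors' tokens
mixed = knn_lm(p_lm, p_knn)
```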

Span-based Hierarchical Semantic Parsing for Task-Oriented Dialog

no code implementations IJCNLP 2019 Panupong Pasupat, Sonal Gupta, Karishma Mandyam, Rushin Shah, Mike Lewis, Luke Zettlemoyer

We propose a semantic parser for parsing compositional utterances into Task Oriented Parse (TOP), a tree representation that has intents and slots as labels of nesting tree nodes.

Semantic Parsing

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

24 code implementations ACL 2020 Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.

Abstractive Text Summarization Denoising +5
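
A sketch of the two noising operations named in the snippet, assuming NumPy for the Poisson draw; the masking coin flip is illustrative, while span lengths ~ Poisson(3) and masking roughly 30% of tokens follow the paper.

```python
import random
import numpy as np

def bart_noise(sentences, mask_token="<mask>", mask_ratio=0.3, lam=3.0):
    """Sentence permutation followed by text infilling: each chosen span,
    whatever its length, is replaced by a single mask token."""
    sents = list(sentences)
    random.shuffle(sents)                    # permute sentence order
    tokens = " ".join(sents).split()
    budget = int(len(tokens) * mask_ratio)   # cap on total masked tokens
    out, i = [], 0
    while i < len(tokens):
        span = int(np.random.poisson(lam))   # span length ~ Poisson(3)
        if 0 < span <= budget and random.random() < 0.5:
            out.append(mask_token)           # whole span -> one mask token
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)

noisy = bart_noise(["The cat sat down.", "It was warm.", "Dogs barked."])
```

Replacing a whole span with a single mask token forces the model to predict how many tokens are missing, not just which ones.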

RoBERTa: A Robustly Optimized BERT Pretraining Approach

40 code implementations 26 Jul 2019 Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

Common Sense Reasoning Language Modelling +6

MelNet: A Generative Model for Audio in the Frequency Domain

4 code implementations 4 Jun 2019 Sean Vasquez, Mike Lewis

Capturing high-level structure in audio waveforms is challenging because a single second of audio spans tens of thousands of timesteps.

Audio Generation Music Generation +2

Hierarchical Decision Making by Generating and Following Natural Language Instructions

1 code implementation NeurIPS 2019 Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis

We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making.

Decision Making

Generative Question Answering: Learning to Answer the Whole Question

no code implementations ICLR 2019 Mike Lewis, Angela Fan

Discriminative question answering models can overfit to superficial biases in datasets, because their loss function saturates when any clue makes the answer likely.

Generative Question Answering Language Modelling

Improving Semantic Parsing for Task Oriented Dialog

no code implementations 15 Feb 2019 Arash Einolghozati, Panupong Pasupat, Sonal Gupta, Rushin Shah, Mrinal Mohit, Mike Lewis, Luke Zettlemoyer

Semantic parsing using hierarchical representations has recently been proposed for task oriented dialog with promising results [Gupta et al., 2018].

Language Modelling Re-Ranking +1

Strategies for Structuring Story Generation

no code implementations ACL 2019 Angela Fan, Mike Lewis, Yann Dauphin

Writers generally rely on plans or sketches to write long stories, but most current language models generate word by word from left to right.

Story Generation

Cross-Lingual Transfer Learning for Multilingual Task Oriented Dialog

no code implementations NAACL 2019 Sebastian Schuster, Sonal Gupta, Rushin Shah, Mike Lewis

We use this data set to evaluate three different cross-lingual transfer methods: (1) translating the training data, (2) using cross-lingual pre-trained embeddings, and (3) a novel method of using a multilingual machine translation encoder as contextual word representations.

Cross-Lingual Transfer Machine Translation +1

A Dataset for Telling the Stories of Social Media Videos

no code implementations EMNLP 2018 Spandana Gella, Mike Lewis, Marcus Rohrbach

Video content on social media platforms constitutes a major part of the communication between people, as it allows everyone to share their stories.

Video Captioning Video Description

Community Regularization of Visually-Grounded Dialog

1 code implementation 10 Aug 2018 Akshat Agarwal, Swaminathan Gurumurthy, Vasu Sharma, Mike Lewis, Katia Sycara

The task of conducting visually grounded dialog involves learning goal-oriented cooperative dialog between autonomous agents who exchange information about a scene through several rounds of questions and answers in natural language.

Hierarchical Neural Story Generation

6 code implementations ACL 2018 Angela Fan, Mike Lewis, Yann Dauphin

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic.

Story Generation

Hierarchical Text Generation and Planning for Strategic Dialogue

1 code implementation ICML 2018 Denis Yarats, Mike Lewis

End-to-end models for goal-orientated dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors.

Decision Making Text Generation

Deal or No Deal? End-to-End Learning of Negotiation Dialogues

no code implementations EMNLP 2017 Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, Dhruv Batra

Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions.

End-to-end Neural Coreference Resolution

4 code implementations EMNLP 2017 Kenton Lee, Luheng He, Mike Lewis, Luke Zettlemoyer

We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector.

Coreference Resolution

A Corpus of Natural Language for Visual Reasoning

no code implementations ACL 2017 Alane Suhr, Mike Lewis, James Yeh, Yoav Artzi

We present a new visual reasoning language dataset, containing 92,244 pairs of examples of natural statements grounded in synthetic images with 3,962 unique sentences.

Question Answering Visual Question Answering +1

Deep Semantic Role Labeling: What Works and What's Next

1 code implementation ACL 2017 Luheng He, Kenton Lee, Mike Lewis, Luke Zettlemoyer

We introduce a new deep learning model for semantic role labeling (SRL) that significantly improves the state of the art, along with detailed analyses to reveal its strengths and limitations.

Predicate Detection

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

1 code implementation 16 Jun 2017 Mike Lewis, Denis Yarats, Yann N. Dauphin, Devi Parikh, Dhruv Batra

Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions.

Global Neural CCG Parsing with Optimality Guarantees

1 code implementation EMNLP 2016 Kenton Lee, Mike Lewis, Luke Zettlemoyer

We introduce the first global recursive neural parsing model with optimality guarantees during decoding.

Improved CCG Parsing with Semi-supervised Supertagging

no code implementations TACL 2014 Mike Lewis, Mark Steedman

Current supervised parsers are limited by the size of their labelled training data, making improving them with unlabelled data an important goal.

CCG Supertagging Dependency Parsing +5

Combined Distributional and Logical Semantics

no code implementations TACL 2013 Mike Lewis, Mark Steedman

We introduce a new approach to semantics which combines the benefits of distributional and formal logical semantics.

Question Answering Relation Extraction
