Search Results for author: Mike Lewis

Found 85 papers, 48 papers with code

Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training

no code implementations6 May 2024 Zexuan Zhong, Mengzhou Xia, Danqi Chen, Mike Lewis

Mixture-of-experts (MoE) models facilitate efficient scaling; however, training the router network introduces the challenge of optimizing a non-differentiable, discrete objective.

Language Modelling

In-Context Pretraining: Language Modeling Beyond Document Boundaries

no code implementations16 Oct 2023 Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Gergely Szilvasy, Rich James, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis

Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion.

In-Context Learning Language Modelling +1

RA-DIT: Retrieval-Augmented Dual Instruction Tuning

no code implementations2 Oct 2023 Xi Victoria Lin, Xilun Chen, Mingda Chen, Weijia Shi, Maria Lomeli, Rich James, Pedro Rodriguez, Jacob Kahn, Gergely Szilvasy, Mike Lewis, Luke Zettlemoyer, Scott Yih

Retrieval-augmented language models (RALMs) improve performance by accessing long-tail and up-to-date knowledge from external data stores, but are challenging to build.

Few-Shot Learning Open-Domain Question Answering +1

Efficient Streaming Language Models with Attention Sinks

5 code implementations29 Sep 2023 Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis

In this paper, we first demonstrate that the emergence of attention sink is due to the strong attention scores towards initial tokens as a "sink" even if they are not semantically important.

Language Modelling

Effective Long-Context Scaling of Foundation Models

1 code implementation27 Sep 2023 Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths -- our ablation experiments suggest that having abundant long texts in the pretrain dataset is not the key to achieving strong performance, and we empirically verify that long context continual pretraining is more efficient and similarly effective compared to pretraining from scratch with long sequences.

Continual Pretraining Language Modelling

Contrastive Decoding Improves Reasoning in Large Language Models

no code implementations17 Sep 2023 Sean O'Brien, Mike Lewis

We demonstrate that Contrastive Decoding -- a simple, computationally light, and training-free text generation method proposed by Li et al 2022 -- achieves large out-of-the-box improvements over greedy decoding on a variety of reasoning tasks.

GSM8K Math +1

Self-Alignment with Instruction Backtranslation

2 code implementations11 Aug 2023 Xian Li, Ping Yu, Chunting Zhou, Timo Schick, Omer Levy, Luke Zettlemoyer, Jason Weston, Mike Lewis

We present a scalable method to build a high quality instruction following language model by automatically labelling human-written text with corresponding instructions.

Instruction Following Language Modelling

Trusting Your Evidence: Hallucinate Less with Context-aware Decoding

no code implementations24 May 2023 Weijia Shi, Xiaochuang Han, Mike Lewis, Yulia Tsvetkov, Luke Zettlemoyer, Scott Wen-tau Yih

Language models (LMs) often struggle to pay enough attention to the input context, and generate texts that are unfaithful or contain hallucinations.

FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

5 code implementations23 May 2023 Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, Hannaneh Hajishirzi

Evaluating the factuality of long-form text generated by large language models (LMs) is non-trivial because (1) generations often contain a mixture of supported and unsupported pieces of information, making binary judgments of quality inadequate, and (2) human evaluation is time-consuming and costly.

Language Modelling Retrieval +1

LIMA: Less Is More for Alignment

5 code implementations NeurIPS 2023 Chunting Zhou, PengFei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy

Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences.

Language Modelling reinforcement-learning

MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers

no code implementations NeurIPS 2023 Lili Yu, Dániel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer, Mike Lewis

Autoregressive transformers are spectacular models for short sequences but scale poorly to long sequences such as high-resolution images, podcasts, code, or books.

Decoder Density Estimation +1

Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization

1 code implementation6 May 2023 Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Jimmy Ba, Amjad Almahairi

In this work, we introduce Residual Prompt Tuning - a simple and efficient method that significantly improves the performance and stability of prompt tuning.

Scaling Expert Language Models with Unsupervised Domain Discovery

1 code implementation24 Mar 2023 Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, Luke Zettlemoyer

Large language models are typically trained densely: all parameters are updated with respect to all inputs.

Language Modelling

REPLUG: Retrieval-Augmented Black-Box Language Models

1 code implementation30 Jan 2023 Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih

We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tuneable retrieval model.

Language Modelling Multi-task Language Understanding +2

Progressive Prompts: Continual Learning for Language Models

2 code implementations29 Jan 2023 Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Amjad Almahairi

We introduce Progressive Prompts - a simple and efficient approach for continual learning in language models.

Continual Learning

Improving Chess Commentaries by Combining Language Models with Symbolic Reasoning Engines

no code implementations15 Dec 2022 Andrew Lee, David Wu, Emily Dinan, Mike Lewis

Despite many recent advancements in language modeling, state-of-the-art language models lack grounding in the real world and struggle with tasks involving complex reasoning.

Language Modelling

In-context Examples Selection for Machine Translation

1 code implementation5 Dec 2022 Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad

Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning, where a few examples are used to describe a task to the model.

In-Context Learning Language Modelling +2

Nonparametric Masked Language Modeling

1 code implementation2 Dec 2022 Sewon Min, Weijia Shi, Mike Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, Luke Zettlemoyer

Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases.

Language Modelling Masked Language Modeling +2

Coder Reviewer Reranking for Code Generation

1 code implementation29 Nov 2022 Tianyi Zhang, Tao Yu, Tatsunori B. Hashimoto, Mike Lewis, Wen-tau Yih, Daniel Fried, Sida I. Wang

Sampling diverse programs from a code language model and reranking with model likelihood is a popular method for code generation but it is prone to preferring degenerate solutions.

Code Generation Language Modelling

Retrieval-Augmented Multimodal Language Modeling

no code implementations22 Nov 2022 Michihiro Yasunaga, Armen Aghajanyan, Weijia Shi, Rich James, Jure Leskovec, Percy Liang, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih

To integrate knowledge in a more scalable and modular way, we propose a retrieval-augmented multimodal model, which enables a base multimodal model (generator) to refer to relevant text and images fetched by a retriever from external memory (e. g., documents on the web).

Caption Generation Image Captioning +5

Contrastive Decoding: Open-ended Text Generation as Optimization

2 code implementations27 Oct 2022 Xiang Lisa Li, Ari Holtzman, Daniel Fried, Percy Liang, Jason Eisner, Tatsunori Hashimoto, Luke Zettlemoyer, Mike Lewis

We propose contrastive decoding (CD), a reliable decoding approach that optimizes a contrastive objective subject to a plausibility constraint.

Language Modelling Text Generation

Measuring and Narrowing the Compositionality Gap in Language Models

1 code implementation7 Oct 2022 Ofir Press, Muru Zhang, Sewon Min, Ludwig Schmidt, Noah A. Smith, Mike Lewis

We investigate the ability of language models to perform compositional reasoning tasks where the overall solution depends on correctly composing the answers to sub-problems.

Question Answering

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

3 code implementations15 Aug 2022 Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer

We develop a procedure for Int8 matrix multiplication for feed-forward and attention projection layers in transformers, which cut the memory needed for inference by half while retaining full precision performance.

Language Modelling Linguistic Acceptability +4

Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models

2 code implementations5 Aug 2022 Margaret Li, Suchin Gururangan, Tim Dettmers, Mike Lewis, Tim Althoff, Noah A. Smith, Luke Zettlemoyer

New ELMs are learned by branching from (mixtures of) ELMs in the current set, further training the parameters on data for the new domain, and then merging the resulting model back into the set for future use.

Questions Are All You Need to Train a Dense Passage Retriever

1 code implementation21 Jun 2022 Devendra Singh Sachan, Mike Lewis, Dani Yogatama, Luke Zettlemoyer, Joelle Pineau, Manzil Zaheer

We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.

Denoising Language Modelling +1

LegoNN: Building Modular Encoder-Decoder Models

no code implementations7 Jun 2022 Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed

We describe LegoNN, a procedure for building encoder-decoder architectures in a way so that its parts can be applied to other tasks without the need for any fine-tuning.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

InCoder: A Generative Model for Code Infilling and Synthesis

3 code implementations12 Apr 2022 Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer, Mike Lewis

Our model is the first generative model that is able to directly perform zero-shot code infilling, which we evaluate on challenging tasks such as type inference, comment generation, and variable re-naming.

Code Generation Comment Generation +1

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

1 code implementation25 Feb 2022 Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer

Large language models (LMs) are able to in-context learn -- perform a new task via inference alone by conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs.

In-Context Learning

CM3: A Causal Masked Multimodal Model of the Internet

no code implementations19 Jan 2022 Armen Aghajanyan, Bernie Huang, Candace Ross, Vladimir Karpukhin, Hu Xu, Naman Goyal, Dmytro Okhonko, Mandar Joshi, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer

We introduce CM3, a family of causally masked generative models trained over a large corpus of structured multi-modal documents that can contain both text and image tokens.

Entity Disambiguation Entity Linking

MetaICL: Learning to Learn In Context

2 code implementations NAACL 2022 Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi

We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.

Few-Shot Learning In-Context Learning +4

Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models

1 code implementation NAACL 2022 Qinyuan Ye, Madian Khabsa, Mike Lewis, Sinong Wang, Xiang Ren, Aaron Jaech

Distilling state-of-the-art transformer models into lightweight student models is an effective way to reduce computation cost at inference time.

Domain Generalization Privacy Preserving +4

Tricks for Training Sparse Translation Models

no code implementations NAACL 2022 Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, Mike Lewis, Angela Fan

Multi-task learning with an unbalanced data distribution skews model learning towards high resource tasks, especially when model capacity is fixed and fully shared across all tasks.

Machine Translation Multi-Task Learning +1

8-bit Optimizers via Block-wise Quantization

2 code implementations ICLR 2022 Tim Dettmers, Mike Lewis, Sam Shleifer, Luke Zettlemoyer

To maintain stability and performance, we combine block-wise quantization with two additional changes: (1) dynamic quantization, a form of non-linear optimization that is precise for both large and small magnitude values, and (2) a stable embedding layer to reduce gradient variance that comes from the highly non-uniform distribution of input tokens in language models.

Language Modelling Machine Translation +1

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation

7 code implementations ICLR 2022 Ofir Press, Noah A. Smith, Mike Lewis

Since the introduction of the transformer model by Vaswani et al. (2017), a fundamental question has yet to be answered: how does a model achieve extrapolation at inference time for sequences that are longer than it saw during training?

Inductive Bias Playing the Game of 2048 +2

DEMix Layers: Disentangling Domains for Modular Language Modeling

2 code implementations NAACL 2022 Suchin Gururangan, Mike Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer

We introduce a new domain expert mixture (DEMix) layer that enables conditioning a language model (LM) on the domain of the input text.

Language Modelling

Question Answering Infused Pre-training of General-Purpose Contextualized Representations

1 code implementation Findings (ACL) 2022 Robin Jia, Mike Lewis, Luke Zettlemoyer

We propose a pre-training objective based on question answering (QA) for learning general-purpose contextual representations, motivated by the intuition that the representation of a phrase in a passage should encode all questions that the phrase can answer in context.

named-entity-recognition Named Entity Recognition +3

Multitasking Inhibits Semantic Drift

no code implementations NAACL 2021 Athul Paul Jacob, Mike Lewis, Jacob Andreas

When intelligent agents communicate to accomplish shared goals, how do these goals shape the agents' language?

BASE Layers: Simplifying Training of Large, Sparse Models

1 code implementation30 Mar 2021 Mike Lewis, Shruti Bhosale, Tim Dettmers, Naman Goyal, Luke Zettlemoyer

Sparse layers can dramatically improve the efficiency of training and inference by routing each token to specialized expert modules that contain only a small fraction of the model parameters.

Nearest Neighbor Machine Translation

5 code implementations ICLR 2021 Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

We introduce $k$-nearest-neighbor machine translation ($k$NN-MT), which predicts tokens with a nearest neighbor classifier over a large datastore of cached examples, using representations from a neural translation model for similarity search.

Decoder Machine Translation +1

Conversational Semantic Parsing

no code implementations EMNLP 2020 Armen Aghajanyan, Jean Maillard, Akshat Shrivastava, Keith Diedrick, Mike Haeger, Haoran Li, Yashar Mehdad, Ves Stoyanov, Anuj Kumar, Mike Lewis, Sonal Gupta

In this paper, we propose a semantic representation for such task-oriented conversational systems that can represent concepts such as co-reference and context carryover, enabling comprehensive understanding of queries in a session.

dialog state tracking Semantic Parsing

Grounded Adaptation for Zero-shot Executable Semantic Parsing

1 code implementation EMNLP 2020 Victor Zhong, Mike Lewis, Sida I. Wang, Luke Zettlemoyer

We propose Grounded Adaptation for Zero-shot Executable Semantic Parsing (GAZP) to adapt an existing semantic parser to new environments (e. g. new database schemas).

Data Augmentation Dialogue State Tracking +2

Pre-training via Paraphrasing

2 code implementations NeurIPS 2020 Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer

The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks.

Document Summarization Document Translation +6

Asking and Answering Questions to Evaluate the Factual Consistency of Summaries

2 code implementations ACL 2020 Alex Wang, Kyunghyun Cho, Mike Lewis

QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source.

Abstractive Text Summarization

Multilingual Denoising Pre-training for Neural Machine Translation

5 code implementations22 Jan 2020 Yinhan Liu, Jiatao Gu, Naman Goyal, Xi-An Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.

Decoder Denoising +3

Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models

no code implementations9 Nov 2019 Siddharth Dalmia, Abdel-rahman Mohamed, Mike Lewis, Florian Metze, Luke Zettlemoyer

Inspired by modular software design principles of independence, interchangeability, and clarity of interface, we introduce a method for enforcing encoder-decoder modularity in seq2seq models without sacrificing the overall model quality or its full differentiability.


Span-based Hierarchical Semantic Parsing for Task-Oriented Dialog

no code implementations IJCNLP 2019 Panupong Pasupat, Sonal Gupta, M, Karishma yam, Rushin Shah, Mike Lewis, Luke Zettlemoyer

We propose a semantic parser for parsing compositional utterances into Task Oriented Parse (TOP), a tree representation that has intents and slots as labels of nesting tree nodes.

Decoder Semantic Parsing +1

Generalization through Memorization: Nearest Neighbor Language Models

5 code implementations ICLR 2020 Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

Applying this augmentation to a strong Wikitext-103 LM, with neighbors drawn from the original training set, our $k$NN-LM achieves a new state-of-the-art perplexity of 15. 79 - a 2. 9 point improvement with no additional training.

Domain Adaptation Language Modelling +1

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

43 code implementations ACL 2020 Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.

Abstractive Text Summarization Decoder +6

RoBERTa: A Robustly Optimized BERT Pretraining Approach

59 code implementations26 Jul 2019 Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

 Ranked #1 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (Wasserstein Distance (WD) metric, using extra training data)

Document Image Classification Language Modelling +13

MelNet: A Generative Model for Audio in the Frequency Domain

5 code implementations4 Jun 2019 Sean Vasquez, Mike Lewis

Capturing high-level structure in audio waveforms is challenging because a single second of audio spans tens of thousands of timesteps.

Audio Generation Music Generation +2

Hierarchical Decision Making by Generating and Following Natural Language Instructions

1 code implementation NeurIPS 2019 Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis

We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making.

Decision Making

Generative Question Answering: Learning to Answer the Whole Question

no code implementations ICLR 2019 Mike Lewis, Angela Fan

Discriminative question answering models can overfit to superficial biases in datasets, because their loss function saturates when any clue makes the answer likely.

Generative Question Answering Language Modelling

Improving Semantic Parsing for Task Oriented Dialog

no code implementations15 Feb 2019 Arash Einolghozati, Panupong Pasupat, Sonal Gupta, Rushin Shah, Mrinal Mohit, Mike Lewis, Luke Zettlemoyer

Semantic parsing using hierarchical representations has recently been proposed for task oriented dialog with promising results [Gupta et al 2018].

Language Modelling Re-Ranking +1

Strategies for Structuring Story Generation

no code implementations ACL 2019 Angela Fan, Mike Lewis, Yann Dauphin

Writers generally rely on plans or sketches to write long stories, but most current language models generate word by word from left to right.

Story Generation

Cross-Lingual Transfer Learning for Multilingual Task Oriented Dialog

no code implementations NAACL 2019 Sebastian Schuster, Sonal Gupta, Rushin Shah, Mike Lewis

We use this data set to evaluate three different cross-lingual transfer methods: (1) translating the training data, (2) using cross-lingual pre-trained embeddings, and (3) a novel method of using a multilingual machine translation encoder as contextual word representations.

Cross-Lingual Transfer Machine Translation +1

A Dataset for Telling the Stories of Social Media Videos

no code implementations EMNLP 2018 Sp Gella, ana, Mike Lewis, Marcus Rohrbach

Video content on social media platforms constitutes a major part of the communication between people, as it allows everyone to share their stories.

Sentence Video Captioning +1

Community Regularization of Visually-Grounded Dialog

1 code implementation10 Aug 2018 Akshat Agarwal, Swaminathan Gurumurthy, Vasu Sharma, Mike Lewis, Katia Sycara

The task of conducting visually grounded dialog involves learning goal-oriented cooperative dialog between autonomous agents who exchange information about a scene through several rounds of questions and answers in natural language.

Hierarchical Neural Story Generation

7 code implementations ACL 2018 Angela Fan, Mike Lewis, Yann Dauphin

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic.

Story Generation

Hierarchical Text Generation and Planning for Strategic Dialogue

1 code implementation ICML 2018 Denis Yarats, Mike Lewis

End-to-end models for goal-orientated dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors.

Decision Making reinforcement-learning +3

Deal or No Deal? End-to-End Learning of Negotiation Dialogues

no code implementations EMNLP 2017 Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, Dhruv Batra

Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions.

End-to-end Neural Coreference Resolution

4 code implementations EMNLP 2017 Kenton Lee, Luheng He, Mike Lewis, Luke Zettlemoyer

We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector.

A Corpus of Natural Language for Visual Reasoning

no code implementations ACL 2017 Alane Suhr, Mike Lewis, James Yeh, Yoav Artzi

We present a new visual reasoning language dataset, containing 92, 244 pairs of examples of natural statements grounded in synthetic images with 3, 962 unique sentences.

Question Answering Visual Question Answering (VQA) +1

Deep Semantic Role Labeling: What Works and What's Next

1 code implementation ACL 2017 Luheng He, Kenton Lee, Mike Lewis, Luke Zettlemoyer

We introduce a new deep learning model for semantic role labeling (SRL) that significantly improves the state of the art, along with detailed analyses to reveal its strengths and limitations.

Predicate Detection

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

2 code implementations16 Jun 2017 Mike Lewis, Denis Yarats, Yann N. Dauphin, Devi Parikh, Dhruv Batra

Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions.

Global Neural CCG Parsing with Optimality Guarantees

1 code implementation EMNLP 2016 Kenton Lee, Mike Lewis, Luke Zettlemoyer

We introduce the first global recursive neural parsing model with optimality guarantees during decoding.


Improved CCG Parsing with Semi-supervised Supertagging

no code implementations TACL 2014 Mike Lewis, Mark Steedman

Current supervised parsers are limited by the size of their labelled training data, making improving them with unlabelled data an important goal.

CCG Supertagging Dependency Parsing +5

Combined Distributional and Logical Semantics

no code implementations TACL 2013 Mike Lewis, Mark Steedman

We introduce a new approach to semantics which combines the benefits of distributional and formal logical semantics.

Clustering Question Answering +2

Cannot find the paper you are looking for? You can Submit a new open access paper.