Search Results for author: Brian Lester

Found 17 papers, 13 papers with code

Finetuned Language Models Are Zero-Shot Learners

5 code implementations ICLR 2022 Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le

We show that instruction tuning -- finetuning language models on a collection of tasks described via instructions -- substantially improves zero-shot performance on unseen tasks.

Common Sense Reasoning, Coreference Resolution, +8
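
As a rough illustration of the instruction-tuning setup described above, the sketch below verbalizes a labeled example into an instruction-formatted input/target pair that a sequence-to-sequence model could be finetuned on. The template wording and the `to_instruction_example` helper are illustrative assumptions, not the paper's exact templates.

```python
# Illustrative sketch of instruction-style formatting (not FLAN's exact templates).

def to_instruction_example(premise: str, hypothesis: str, label: str) -> dict:
    """Turn an NLI example into an (input, target) pair described via an instruction."""
    instruction = (
        "Read the premise and decide whether the hypothesis is "
        "entailment, neutral, or contradiction.\n"
        f"Premise: {premise}\n"
        f"Hypothesis: {hypothesis}"
    )
    return {"input": instruction, "target": label}

example = to_instruction_example(
    premise="A dog is running through the park.",
    hypothesis="An animal is outside.",
    label="entailment",
)
print(example["input"])
print(example["target"])
# Many tasks are verbalized this way, mixed together, and used to finetune the model;
# zero-shot evaluation then presents instructions for tasks never seen during tuning.
```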

The Power of Scale for Parameter-Efficient Prompt Tuning

10 code implementations EMNLP 2021 Brian Lester, Rami Al-Rfou, Noah Constant

More remarkably, through ablations on model size using T5, we show that prompt tuning becomes more competitive with scale: as models exceed billions of parameters, our method "closes the gap" and matches the strong performance of model tuning (where all model weights are tuned).

Few-Shot Learning
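
A minimal PyTorch-style sketch of the soft prompt idea: a small matrix of prompt embeddings is the only trainable parameter, and it is prepended to the frozen model's input embeddings. This is an illustrative reimplementation, not the paper's T5 code; the prompt length, embedding size, and learning rate are arbitrary.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Learnable prompt embeddings prepended to frozen input embeddings (illustrative)."""

    def __init__(self, prompt_length: int = 20, embed_dim: int = 768):
        super().__init__()
        # The only trainable parameters: one vector per prompt position.
        self.prompt = nn.Parameter(torch.randn(prompt_length, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) from the frozen model's embedding table.
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Usage sketch: freeze every model weight and optimize only the prompt.
# for p in frozen_model.parameters():
#     p.requires_grad_(False)
# soft_prompt = SoftPrompt()
# optimizer = torch.optim.Adam(soft_prompt.parameters(), lr=0.3)
```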

Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

1 code implementation 25 May 2022 Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, Noah Constant

In this paper, we explore the challenging problem of performing a generative task in a target language when labeled data is only available in English, using summarization as a case study.

Cross-Lingual Transfer, Machine Translation, +1

Computationally Efficient NER Taggers with Combined Embeddings and Constrained Decoding

1 code implementation 5 Jan 2020 Brian Lester, Daniel Pressel, Amy Hemmeter, Sagnik Ray Choudhury

The CRF layer is used to facilitate global coherence between labels, and the contextual embeddings provide a better representation of words in context.

named-entity-recognition, Named Entity Recognition, +2
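
The "constrained decoding" part of the tagger can be pictured as masking label transitions that are invalid under a span-encoding scheme such as IOBES (for example, I-PER cannot follow B-LOC). The sketch below builds such a mask; it is a simplified illustration, not the authors' CRF implementation.

```python
# Illustrative IOBES transition mask for constrained decoding (not the paper's exact code).

def allowed_transition(prev: str, curr: str) -> bool:
    """Return True if tag `curr` may follow tag `prev` under IOBES."""
    prev_prefix, _, prev_type = prev.partition("-")
    curr_prefix, _, curr_type = curr.partition("-")
    if curr_prefix in ("I", "E"):
        # I-/E- must continue an entity of the same type opened by B- or continued by I-.
        return prev_prefix in ("B", "I") and prev_type == curr_type
    # O, B-, and S- may only follow tags that are not in the middle of an entity.
    return prev_prefix in ("O", "E", "S")

tags = ["O", "B-PER", "I-PER", "E-PER", "S-PER", "B-LOC", "I-LOC", "E-LOC", "S-LOC"]
mask = [[allowed_transition(p, c) for c in tags] for p in tags]
# During Viterbi decoding, disallowed transitions get a score of -inf,
# so the predicted tag sequence is always a well-formed span labeling.
```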

iobes: A Library for Span-Level Processing

1 code implementation 9 Oct 2020 Brian Lester

After a model assigns labels to each token, these prefixes are used to group the tokens into spans.

named-entity-recognition, Named Entity Recognition, +3
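
A minimal sketch of the prefix-based grouping the library performs, shown here for simple BIO tags (the library itself handles IOB, BIO, IOBES, and related schemes); the function name below is illustrative, not iobes's actual API.

```python
# Illustrative BIO-to-span grouping (not the iobes library's actual API).

def bio_to_spans(tags):
    """Group per-token BIO tags into (type, start, end) spans, end exclusive."""
    spans, start, ent_type = [], None, None
    for i, tag in enumerate(tags + ["O"]):        # sentinel flushes a trailing span
        prefix, _, label = tag.partition("-")
        if prefix == "B" or prefix == "O" or (prefix == "I" and label != ent_type):
            if ent_type is not None:
                spans.append((ent_type, start, i))
            start, ent_type = (i, label) if prefix in ("B", "I") else (None, None)
    return spans

print(bio_to_spans(["B-PER", "I-PER", "O", "B-LOC"]))
# [('PER', 0, 2), ('LOC', 3, 4)]
```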

Leader: Prefixing a Length for Faster Word Vector Serialization

1 code implementation 29 Sep 2020 Brian Lester

Two competing file formats have become the de facto standards for distributing pre-trained word embeddings.

Word Embeddings
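
A sketch of the length-prefix idea in the title: writing each vocabulary entry as (word length, word bytes, vector bytes) lets a reader stream or skip entries without scanning for delimiters. The header fields and byte layout here are illustrative assumptions, not the Leader format specification.

```python
import struct
import numpy as np

# Illustrative length-prefixed embedding serialization (not the exact Leader spec).

def write_embeddings(path, vocab, vectors):
    dim = vectors.shape[1]
    with open(path, "wb") as f:
        f.write(struct.pack("<II", len(vocab), dim))          # header: vocab size, dim
        for word, vec in zip(vocab, vectors):
            encoded = word.encode("utf-8")
            f.write(struct.pack("<I", len(encoded)))          # length prefix for the word
            f.write(encoded)
            f.write(vec.astype("<f4").tobytes())              # fixed-size vector payload

def read_embeddings(path):
    with open(path, "rb") as f:
        vocab_size, dim = struct.unpack("<II", f.read(8))
        vocab, vectors = [], np.empty((vocab_size, dim), dtype=np.float32)
        for i in range(vocab_size):
            (word_len,) = struct.unpack("<I", f.read(4))
            vocab.append(f.read(word_len).decode("utf-8"))
            vectors[i] = np.frombuffer(f.read(4 * dim), dtype="<f4")
        return vocab, vectors

write_embeddings("vectors.bin", ["the", "cat"], np.random.rand(2, 3).astype(np.float32))
print(read_embeddings("vectors.bin")[0])  # ['the', 'cat']
```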

Multiple Word Embeddings for Increased Diversity of Representation

1 code implementation 30 Sep 2020 Brian Lester, Daniel Pressel, Amy Hemmeter, Sagnik Ray Choudhury, Srinivas Bangalore

Most state-of-the-art models in natural language processing (NLP) are neural models built on top of large, pre-trained, contextual language models that generate representations of words in context and are fine-tuned for the task at hand.

Word Embeddings
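
One simple way to read the "multiple word embeddings" idea is to look a token up in several pre-trained tables and combine the results. Concatenation is shown below purely as an illustration; it is not necessarily the exact combination scheme evaluated in the paper, and the tables are toy stand-ins.

```python
import numpy as np

# Illustrative combination of several pre-trained embedding tables (assumed setup).

def embed(token, tables, dims, unk="<unk>"):
    """Look `token` up in every table and concatenate the resulting vectors."""
    pieces = []
    for table, dim in zip(tables, dims):
        vec = table.get(token, table.get(unk, np.zeros(dim, dtype=np.float32)))
        pieces.append(vec)
    return np.concatenate(pieces)

glove = {"cat": np.ones(3, dtype=np.float32)}      # stand-in for a GloVe table
w2v = {"cat": np.full(2, 2.0, dtype=np.float32)}   # stand-in for a word2vec table
print(embed("cat", [glove, w2v], dims=[3, 2]))     # -> [1. 1. 1. 2. 2.]
print(embed("dog", [glove, w2v], dims=[3, 2]))     # unknown word -> zeros fallback
```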

Intent Features for Rich Natural Language Understanding

1 code implementation NAACL 2021 Brian Lester, Sagnik Ray Choudhury, Rashmi Prasad, Srinivas Bangalore

Complex natural language understanding modules in dialog systems have a richer understanding of user utterances, and thus are critical in providing a better user experience.

Natural Language Understanding

SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer

no code implementations ACL 2022 Tu Vu, Brian Lester, Noah Constant, Rami Al-Rfou, Daniel Cer

Finally, we propose an efficient retrieval approach that interprets task prompts as task embeddings to identify similar tasks and predict the most transferable source tasks for a novel target task.

Language Modelling, Retrieval, +1
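
The retrieval step described above can be sketched as similarity search over task embeddings derived from trained prompts: pool each task's prompt vectors and rank source tasks by cosine similarity to the target task. The mean pooling and the toy prompts below are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

# Illustrative task-similarity retrieval over prompt-derived task embeddings.

def task_embedding(prompt: np.ndarray) -> np.ndarray:
    """Collapse a (prompt_length, embed_dim) soft prompt into a single task vector."""
    return prompt.mean(axis=0)

def rank_source_tasks(target_prompt, source_prompts):
    """Rank candidate source tasks by cosine similarity to the target task."""
    t = task_embedding(target_prompt)
    scores = {}
    for name, prompt in source_prompts.items():
        s = task_embedding(prompt)
        scores[name] = float(t @ s / (np.linalg.norm(t) * np.linalg.norm(s)))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

rng = np.random.default_rng(0)
sources = {"mnli": rng.normal(size=(20, 8)), "squad": rng.normal(size=(20, 8))}
target = rng.normal(size=(20, 8))
print(rank_source_tasks(target, sources))  # most transferable source tasks first
```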

Reducing Retraining by Recycling Parameter-Efficient Prompts

no code implementations 10 Aug 2022 Brian Lester, Joshua Yurtsever, Siamak Shakeri, Noah Constant

Parameter-efficient methods are able to use a single frozen pre-trained large language model (LLM) to perform many tasks by learning task-specific soft prompts that modulate model behavior when concatenated to the input text.

Language Modelling, Large Language Model
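
The setup the excerpt describes, one frozen LLM serving many tasks by swapping task-specific soft prompts at inference time, can be sketched as below. This illustrates only that background setup, not the paper's prompt-recycling method; the task names, shapes, and helper are assumptions.

```python
import torch

# Illustrative multi-task serving with one frozen model and per-task soft prompts.

EMBED_DIM, PROMPT_LEN = 768, 20
task_prompts = {
    "sentiment": torch.randn(PROMPT_LEN, EMBED_DIM),
    "summarization": torch.randn(PROMPT_LEN, EMBED_DIM),
}

def build_inputs(task: str, input_embeds: torch.Tensor) -> torch.Tensor:
    """Concatenate the task's learned prompt to (batch, seq, dim) input embeddings."""
    prompt = task_prompts[task].unsqueeze(0).expand(input_embeds.size(0), -1, -1)
    return torch.cat([prompt, input_embeds], dim=1)

tokens = torch.randn(4, 32, EMBED_DIM)          # stand-in for frozen-model input embeddings
print(build_inputs("sentiment", tokens).shape)  # torch.Size([4, 52, 768])
```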

Training LLMs over Neurally Compressed Text

no code implementations 4 Apr 2024 Brian Lester, Jaehoon Lee, Alex Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein, Noah Constant

In this paper, we explore the idea of training large language models (LLMs) over highly compressed text.
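
The sketch below only illustrates the general pipeline of treating compressor output as the training stream: it uses zlib rather than a neural compressor, and the byte-level tokenization and fixed-length chunking are arbitrary assumptions, not the scheme studied in the paper.

```python
import zlib

# Illustrative only: compress text, then expose the compressed bytes as token ids.
# The paper studies neural compressors; zlib is just a stand-in to show the pipeline.

def compress_to_token_ids(text: str) -> list[int]:
    compressed = zlib.compress(text.encode("utf-8"))
    return list(compressed)                     # each byte (0..255) becomes a token id

def chunk(ids, seq_len=16):
    """Split the compressed stream into fixed-length training sequences."""
    return [ids[i:i + seq_len] for i in range(0, len(ids), seq_len)]

ids = compress_to_token_ids("the quick brown fox jumps over the lazy dog " * 8)
print(len(ids), "compressed-byte tokens")
print(chunk(ids)[:2])   # sequences a language model would be trained to predict
```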
