Search Results for author: Brian Lester

Found 17 papers, 13 papers with code

Finetuned Language Models Are Zero-Shot Learners

5 code implementations ICLR 2022 Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le

We show that instruction tuning -- finetuning language models on a collection of tasks described via instructions -- substantially improves zero-shot performance on unseen tasks.

Common Sense Reasoning, Coreference Resolution, +8
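
As a rough illustration of the instruction-tuning setup described above, the sketch below verbalizes a labeled example into an instruction-formatted input/target pair that a sequence-to-sequence model could be finetuned on. The template wording and the `to_instruction_example` helper are illustrative assumptions, not the paper's exact templates.

```python
# Illustrative sketch of instruction-style formatting (not FLAN's exact templates).

def to_instruction_example(premise: str, hypothesis: str, label: str) -> dict:
    """Turn an NLI example into an (input, target) pair described via an instruction."""
    instruction = (
        "Read the premise and decide whether the hypothesis is "
        "entailment, neutral, or contradiction.\n"
        f"Premise: {premise}\n"
        f"Hypothesis: {hypothesis}"
    )
    return {"input": instruction, "target": label}

example = to_instruction_example(
    premise="A dog is running through the park.",
    hypothesis="An animal is outside.",
    label="entailment",
)
print(example["input"])
print(example["target"])
# Many tasks are verbalized this way, mixed together, and used to finetune the model;
# zero-shot evaluation then presents instructions for tasks never seen during tuning.
```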

The Power of Scale for Parameter-Efficient Prompt Tuning

10 code implementations EMNLP 2021 Brian Lester, Rami Al-Rfou, Noah Constant

More remarkably, through ablations on model size using T5, we show that prompt tuning becomes more competitive with scale: as models exceed billions of parameters, our method "closes the gap" and matches the strong performance of model tuning (where all model weights are tuned).

Few-Shot Learning
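
A minimal PyTorch-style sketch of the soft prompt idea: a small matrix of prompt embeddings is the only trainable parameter, and it is prepended to the frozen model's input embeddings. This is an illustrative reimplementation, not the paper's T5 code; the prompt length, embedding size, and learning rate are arbitrary.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Learnable prompt embeddings prepended to frozen input embeddings (illustrative)."""

    def __init__(self, prompt_length: int = 20, embed_dim: int = 768):
        super().__init__()
        # The only trainable parameters: one vector per prompt position.
        self.prompt = nn.Parameter(torch.randn(prompt_length, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) from the frozen model's embedding table.
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Usage sketch: freeze every model weight and optimize only the prompt.
# for p in frozen_model.parameters():
#     p.requires_grad_(False)
# soft_prompt = SoftPrompt()
# optimizer = torch.optim.Adam(soft_prompt.parameters(), lr=0.3)
```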

Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

1 code implementation 25 May 2022 Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, Noah Constant

In this paper, we explore the challenging problem of performing a generative task in a target language when labeled data is only available in English, using summarization as a case study.

Cross-Lingual Transfer, Machine Translation, +1

Computationally Efficient NER Taggers with Combined Embeddings and Constrained Decoding

1 code implementation 5 Jan 2020 Brian Lester, Daniel Pressel, Amy Hemmeter, Sagnik Ray Choudhury

The CRF layer is used to facilitate global coherence between labels, and the contextual embeddings provide a better representation of words in context.

named-entity-recognition, Named Entity Recognition, +2
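
The "constrained decoding" part of the tagger can be pictured as masking label transitions that are invalid under a span-encoding scheme such as IOBES (for example, I-PER cannot follow B-LOC). The sketch below builds such a mask; it is a simplified illustration, not the authors' CRF implementation.

```python
# Illustrative IOBES transition mask for constrained decoding (not the paper's exact code).

def allowed_transition(prev: str, curr: str) -> bool:
    """Return True if tag `curr` may follow tag `prev` under IOBES."""
    prev_prefix, _, prev_type = prev.partition("-")
    curr_prefix, _, curr_type = curr.partition("-")
    if curr_prefix in ("I", "E"):
        # I-/E- must continue an entity of the same type opened by B- or continued by I-.
        return prev_prefix in ("B", "I") and prev_type == curr_type
    # O, B-, and S- may only follow tags that are not in the middle of an entity.
    return prev_prefix in ("O", "E", "S")

tags = ["O", "B-PER", "I-PER", "E-PER", "S-PER", "B-LOC", "I-LOC", "E-LOC", "S-LOC"]
mask = [[allowed_transition(p, c) for c in tags] for p in tags]
# During Viterbi decoding, disallowed transitions get a score of -inf,
# so the predicted tag sequence is always a well-formed span labeling.
```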

iobes: A Library for Span-Level Processing

1 code implementation 9 Oct 2020 Brian Lester

After a model assigns labels to each token, these prefixes are used to group the tokens into spans.

named-entity-recognition, Named Entity Recognition, +3
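
A minimal sketch of the prefix-based grouping the library performs, shown here for simple BIO tags (the library itself handles IOB, BIO, IOBES, and related schemes); the function name below is illustrative, not iobes's actual API.

```python
# Illustrative BIO-to-span grouping (not the iobes library's actual API).

def bio_to_spans(tags):
    """Group per-token BIO tags into (type, start, end) spans, end exclusive."""
    spans, start, ent_type = [], None, None
    for i, tag in enumerate(tags + ["O"]):        # sentinel flushes a trailing span
        prefix, _, label = tag.partition("-")
        if prefix == "B" or prefix == "O" or (prefix == "I" and label != ent_type):
            if ent_type is not None:
                spans.append((ent_type, start, i))
            start, ent_type = (i, label) if prefix in ("B", "I") else (None, None)
    return spans

print(bio_to_spans(["B-PER", "I-PER", "O", "B-LOC"]))
# [('PER', 0, 2), ('LOC', 3, 4)]
```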

Leader: Prefixing a Length for Faster Word Vector Serialization

1 code implementation 29 Sep 2020 Brian Lester

Two competing file formats have become the de facto standards for distributing pre-trained word embeddings.

Word Embeddings
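
A sketch of the length-prefix idea in the title: writing each vocabulary entry as (word length, word bytes, vector bytes) lets a reader stream or skip entries without scanning for delimiters. The header fields and byte layout here are illustrative assumptions, not the Leader format specification.

```python
import struct
import numpy as np

# Illustrative length-prefixed embedding serialization (not the exact Leader spec).

def write_embeddings(path, vocab, vectors):
    dim = vectors.shape[1]
    with open(path, "wb") as f:
        f.write(struct.pack("<II", len(vocab), dim))          # header: vocab size, dim
        for word, vec in zip(vocab, vectors):
            encoded = word.encode("utf-8")
            f.write(struct.pack("<I", len(encoded)))          # length prefix for the word
            f.write(encoded)
            f.write(vec.astype("<f4").tobytes())              # fixed-size vector payload

def read_embeddings(path):
    with open(path, "rb") as f:
        vocab_size, dim = struct.unpack("<II", f.read(8))
        vocab, vectors = [], np.empty((vocab_size, dim), dtype=np.float32)
        for i in range(vocab_size):
            (word_len,) = struct.unpack("<I", f.read(4))
            vocab.append(f.read(word_len).decode("utf-8"))
            vectors[i] = np.frombuffer(f.read(4 * dim), dtype="<f4")
        return vocab, vectors

write_embeddings("vectors.bin", ["the", "cat"], np.random.rand(2, 3).astype(np.float32))
print(read_embeddings("vectors.bin")[0])  # ['the', 'cat']
```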

Multiple Word Embeddings for Increased Diversity of Representation

1 code implementation 30 Sep 2020 Brian Lester, Daniel Pressel, Amy Hemmeter, Sagnik Ray Choudhury, Srinivas Bangalore

Most state-of-the-art models in natural language processing (NLP) are neural models built on top of large, pre-trained, contextual language models that generate representations of words in context and are fine-tuned for the task at hand.

Word Embeddings
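
One simple way to read the "multiple word embeddings" idea is to look a token up in several pre-trained tables and combine the results. Concatenation is shown below purely as an illustration; it is not necessarily the exact combination scheme evaluated in the paper, and the tables are toy stand-ins.

```python
import numpy as np

# Illustrative combination of several pre-trained embedding tables (assumed setup).

def embed(token, tables, dims, unk="<unk>"):
    """Look `token` up in every table and concatenate the resulting vectors."""
    pieces = []
    for table, dim in zip(tables, dims):
        vec = table.get(token, table.get(unk, np.zeros(dim, dtype=np.float32)))
        pieces.append(vec)
    return np.concatenate(pieces)

glove = {"cat": np.ones(3, dtype=np.float32)}      # stand-in for a GloVe table
w2v = {"cat": np.full(2, 2.0, dtype=np.float32)}   # stand-in for a word2vec table
print(embed("cat", [glove, w2v], dims=[3, 2]))     # -> [1. 1. 1. 2. 2.]
print(embed("dog", [glove, w2v], dims=[3, 2]))     # unknown word -> zeros fallback
```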

Intent Features for Rich Natural Language Understanding

1 code implementation NAACL 2021 Brian Lester, Sagnik Ray Choudhury, Rashmi Prasad, Srinivas Bangalore

Complex natural language understanding modules in dialog systems have a richer understanding of user utterances, and thus are critical in providing a better user experience.

Natural Language Understanding

SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer

no code implementations ACL 2022 Tu Vu, Brian Lester, Noah Constant, Rami Al-Rfou, Daniel Cer

Finally, we propose an efficient retrieval approach that interprets task prompts as task embeddings to identify similar tasks and predict the most transferable source tasks for a novel target task.

Language Modelling, Retrieval, +1
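
The retrieval step described above can be sketched as similarity search over task embeddings derived from trained prompts: pool each task's prompt vectors and rank source tasks by cosine similarity to the target task. The mean pooling and the toy prompts below are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

# Illustrative task-similarity retrieval over prompt-derived task embeddings.

def task_embedding(prompt: np.ndarray) -> np.ndarray:
    """Collapse a (prompt_length, embed_dim) soft prompt into a single task vector."""
    return prompt.mean(axis=0)

def rank_source_tasks(target_prompt, source_prompts):
    """Rank candidate source tasks by cosine similarity to the target task."""
    t = task_embedding(target_prompt)
    scores = {}
    for name, prompt in source_prompts.items():
        s = task_embedding(prompt)
        scores[name] = float(t @ s / (np.linalg.norm(t) * np.linalg.norm(s)))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

rng = np.random.default_rng(0)
sources = {"mnli": rng.normal(size=(20, 8)), "squad": rng.normal(size=(20, 8))}
target = rng.normal(size=(20, 8))
print(rank_source_tasks(target, sources))  # most transferable source tasks first
```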

Reducing Retraining by Recycling Parameter-Efficient Prompts

no code implementations 10 Aug 2022 Brian Lester, Joshua Yurtsever, Siamak Shakeri, Noah Constant

Parameter-efficient methods are able to use a single frozen pre-trained large language model (LLM) to perform many tasks by learning task-specific soft prompts that modulate model behavior when concatenated to the input text.

Language Modelling, Large Language Model
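
The setup the excerpt describes, one frozen LLM serving many tasks by swapping task-specific soft prompts at inference time, can be sketched as below. This illustrates only that background setup, not the paper's prompt-recycling method; the task names, shapes, and helper are assumptions.

```python
import torch

# Illustrative multi-task serving with one frozen model and per-task soft prompts.

EMBED_DIM, PROMPT_LEN = 768, 20
task_prompts = {
    "sentiment": torch.randn(PROMPT_LEN, EMBED_DIM),
    "summarization": torch.randn(PROMPT_LEN, EMBED_DIM),
}

def build_inputs(task: str, input_embeds: torch.Tensor) -> torch.Tensor:
    """Concatenate the task's learned prompt to (batch, seq, dim) input embeddings."""
    prompt = task_prompts[task].unsqueeze(0).expand(input_embeds.size(0), -1, -1)
    return torch.cat([prompt, input_embeds], dim=1)

tokens = torch.randn(4, 32, EMBED_DIM)          # stand-in for frozen-model input embeddings
print(build_inputs("sentiment", tokens).shape)  # torch.Size([4, 52, 768])
```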

Training LLMs over Neurally Compressed Text

no code implementations 4 Apr 2024 Brian Lester, Jaehoon Lee, Alex Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein, Noah Constant

In this paper, we explore the idea of training large language models (LLMs) over highly compressed text.
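
The sketch below only illustrates the general pipeline of treating compressor output as the training stream: it uses zlib rather than a neural compressor, and the byte-level tokenization and fixed-length chunking are arbitrary assumptions, not the scheme studied in the paper.

```python
import zlib

# Illustrative only: compress text, then expose the compressed bytes as token ids.
# The paper studies neural compressors; zlib is just a stand-in to show the pipeline.

def compress_to_token_ids(text: str) -> list[int]:
    compressed = zlib.compress(text.encode("utf-8"))
    return list(compressed)                     # each byte (0..255) becomes a token id

def chunk(ids, seq_len=16):
    """Split the compressed stream into fixed-length training sequences."""
    return [ids[i:i + seq_len] for i in range(0, len(ids), seq_len)]

ids = compress_to_token_ids("the quick brown fox jumps over the lazy dog " * 8)
print(len(ids), "compressed-byte tokens")
print(chunk(ids)[:2])   # sequences a language model would be trained to predict
```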
