Search Results for author: David Dohan

Found 13 papers, 6 papers with code

Show Your Work: Scratchpads for Intermediate Computation with Language Models

no code implementations30 Nov 2021 Maxwell Nye, Anders Johan Andreassen, Guy Gur-Ari, Henryk Michalewski, Jacob Austin, David Bieber, David Dohan, Aitor Lewkowycz, Maarten Bosma, David Luan, Charles Sutton, Augustus Odena

Large pre-trained language models perform remarkably well on tasks that can be done "in one pass", such as generating realistic text or synthesizing computer programs.

Program Synthesis with Large Language Models

no code implementations16 Aug 2021 Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton

Our largest models, even without finetuning on a code dataset, can synthesize solutions to 59. 6 percent of the problems from MBPP using few-shot learning with a well-designed prompt.

Few-Shot Learning Program Synthesis

Latent Programmer: Discrete Latent Codes for Program Synthesis

no code implementations1 Dec 2020 Joey Hong, David Dohan, Rishabh Singh, Charles Sutton, Manzil Zaheer

The latent codes are learned using a self-supervised learning principle, in which first a discrete autoencoder is trained on the output sequences, and then the resulting latent codes are used as intermediate targets for the end-to-end sequence prediction task.

Document Summarization Program Synthesis +1

Is Transfer Learning Necessary for Protein Landscape Prediction?

no code implementations31 Oct 2020 Amir Shanehsazzadeh, David Belanger, David Dohan

In this paper, we show that CNN models trained solely using supervised learning both compete with and sometimes outperform the best models from TAPE that leverage expensive pretraining on large protein datasets.

Representation Learning Transfer Learning

Fixed-Length Protein Embeddings using Contextual Lenses

1 code implementation15 Oct 2020 Amir Shanehsazzadeh, David Belanger, David Dohan

We consider transformer (BERT) protein language models that are pretrained on the TrEMBL data set and learn fixed-length embeddings on top of them with contextual lenses.

Rethinking Attention with Performers

11 code implementations ICLR 2021 Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller

We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness.

Ranked #15 on Image Generation on ImageNet 64x64 (Bits per dim metric)

Image Generation

Population-Based Black-Box Optimization for Biological Sequence Design

no code implementations ICML 2020 Christof Angermueller, David Belanger, Andreea Gane, Zelda Mariet, David Dohan, Kevin Murphy, Lucy Colwell, D. Sculley

The cost and latency of wet-lab experiments requires methods that find good sequences in few experimental rounds of large batches of sequences--a setting that off-the-shelf black-box optimization methods are ill-equipped to handle.


no code implementations ICLR 2018 Minh-Thang Luong, David Dohan, Adams Wei Yu, Quoc V. Le, Barret Zoph, Vijay Vasudevan

Neural architecture search (NAS), the task of finding neural architectures automatically, has recently emerged as a promising approach for unveiling better models over human-designed ones.

Language Modelling Neural Architecture Search +2

Cannot find the paper you are looking for? You can Submit a new open access paper.