Search Results for author: Quoc Le

Found 42 papers, 15 papers with code

Evolving Machine Learning Algorithms From Scratch

no code implementations ICML 2020 Esteban Real, Chen Liang, David So, Quoc Le

However, this progress has largely focused on the architecture of neural networks, where it has relied on sophisticated expert-designed layers as building blocks---or similarly restrictive search spaces.

AutoML BIG-bench Machine Learning

Brainformers: Trading Simplicity for Efficiency

no code implementations 29 May 2023 Yanqi Zhou, Nan Du, Yanping Huang, Daiyi Peng, Chang Lan, Da Huang, Siamak Shakeri, David So, Andrew Dai, Yifeng Lu, Zhifeng Chen, Quoc Le, Claire Cui, James Laudon, Jeff Dean

Using this insight, we develop a complex block, named Brainformer, that consists of a diverse set of layers such as sparsely gated feed-forward layers, dense feed-forward layers, attention layers, and various forms of layer normalization and activation functions.

Rationale-Augmented Ensembles in Language Models

no code implementations 2 Jul 2022 Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Denny Zhou

Recent research has shown that rationales, or step-by-step chains of thought, can be used to improve performance in multi-step reasoning tasks.

Prompt Engineering Question Answering +2

TabNAS: Rejection Sampling for Neural Architecture Search on Tabular Datasets

1 code implementation 15 Apr 2022 Chengrun Yang, Gabriel Bender, Hanxiao Liu, Pieter-Jan Kindermans, Madeleine Udell, Yifeng Lu, Quoc Le, Da Huang

The best neural architecture for a given machine learning problem depends on many factors: not only the complexity and structure of the dataset, but also on resource constraints including latency, compute, energy consumption, etc.

Image Retrieval Neural Architecture Search +1

Self-Consistency Improves Chain of Thought Reasoning in Language Models

no code implementations 21 Mar 2022 Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou

Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks.

Ranked #21 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +2

Mixture-of-Experts with Expert Choice Routing

no code implementations 18 Feb 2022 Yanqi Zhou, Tao Lei, Hanxiao Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew Dai, Zhifeng Chen, Quoc Le, James Laudon

Prior work allocates a fixed number of experts to each token using a top-k function regardless of the relative importance of different tokens.
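
The top-k allocation described above can be illustrated with a minimal sketch. This is not the paper's code: the function name `top_k_routing` and the toy router scores are assumptions made for illustration only. Each token independently picks its k highest-scoring experts, so every token gets the same number of experts while the load per expert can end up uneven.

```python
def top_k_routing(scores, k):
    """Token-choice routing: each token (one row of router scores)
    selects its k highest-scoring experts, independent of other tokens."""
    assignments = []
    for token_scores in scores:
        # Rank expert indices by this token's score, highest first.
        ranked = sorted(range(len(token_scores)),
                        key=lambda e: token_scores[e],
                        reverse=True)
        assignments.append(ranked[:k])
    return assignments


# Hypothetical router scores: 3 tokens (rows) x 3 experts (columns).
scores = [
    [0.10, 0.70, 0.20],
    [0.50, 0.30, 0.20],
    [0.05, 0.15, 0.80],
]

assignments = top_k_routing(scores, k=2)

# Count how many tokens each expert receives.
load = [sum(e in chosen for chosen in assignments) for e in range(3)]

print(assignments)  # experts chosen per token
print(load)         # tokens received per expert
```

In this toy input every token receives exactly two experts, yet one expert ends up with all three tokens and another with only one.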

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

10 code implementations 28 Jan 2022 Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou

We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning.

GSM8K Language Modelling

Searching for Efficient Transformers for Language Modeling

no code implementations NeurIPS 2021 David So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam Shazeer, Quoc Le

For example, at a 500M parameter size, Primer improves the original T5 architecture on C4 auto-regressive language modeling, reducing the training cost by 4X.

Language Modelling

Program Synthesis with Large Language Models

1 code implementation 16 Aug 2021 Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton

Our largest models, even without finetuning on a code dataset, can synthesize solutions to 59.6 percent of the problems from MBPP using few-shot learning with a well-designed prompt.

Few-Shot Learning Program Synthesis

A Full-Stack Search Technique for Domain Optimized Deep Learning Accelerators

no code implementations 26 May 2021 Dan Zhang, Safeen Huda, Ebrahim Songhori, Kartik Prabhu, Quoc Le, Anna Goldie, Azalia Mirhoseini

The rapidly-changing deep learning landscape presents a unique opportunity for building inference accelerators optimized for specific datacenter-scale workloads.

Optical Character Recognition (OCR) Scheduling

Carbon Emissions and Large Neural Network Training

no code implementations 21 Apr 2021 David Patterson, Joseph Gonzalez, Quoc Le, Chen Liang, Lluis-Miquel Munguia, Daniel Rothchild, David So, Maud Texier, Jeff Dean

To help reduce the carbon footprint of ML, we believe energy usage and CO2e should be a key metric in evaluating models, and we are collaborating with MLPerf developers to include energy usage during training and inference in this industry standard benchmark.

Neural Architecture Search Scheduling

SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network

no code implementations 5 Apr 2021 William Chan, Daniel Park, Chris Lee, Yu Zhang, Quoc Le, Mohammad Norouzi

We present SpeechStew, a speech recognition model that is trained on a combination of various publicly available speech recognition datasets: AMI, Broadcast News, Common Voice, LibriSpeech, Switchboard/Fisher, Tedlium, and Wall Street Journal.

Language Modelling speech-recognition +2

Searching for Fast Model Families on Datacenter Accelerators

no code implementations CVPR 2021 Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc Le, Norman P. Jouppi

On top of our DC accelerator optimized neural architecture search space, we further propose a latency-aware compound scaling (LACS), the first multi-objective compound scaling method optimizing both accuracy and latency.

Neural Architecture Search

Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour

no code implementations 30 Oct 2020 Arissa Wongpanich, Hieu Pham, James Demmel, Mingxing Tan, Quoc Le, Yang You, Sameer Kumar

EfficientNets are a family of state-of-the-art image classification models based on efficiently scaled convolutional neural networks.

Image Classification Playing the Game of 2048

Can weight sharing outperform random architecture search? An investigation with TuNAS

1 code implementation CVPR 2020 Gabriel Bender, Hanxiao Liu, Bo Chen, Grace Chu, Shuyang Cheng, Pieter-Jan Kindermans, Quoc Le

Efficient Neural Architecture Search methods based on weight sharing have shown good promise in democratizing Neural Architecture Search for computer vision models.

Image Classification Neural Architecture Search

Go Wide, Then Narrow: Efficient Training of Deep Thin Networks

no code implementations ICML 2020 Denny Zhou, Mao Ye, Chen Chen, Tianjian Meng, Mingxing Tan, Xiaodan Song, Quoc Le, Qiang Liu, Dale Schuurmans

This is achieved by layerwise imitation, that is, forcing the thin network to mimic the intermediate outputs of the wide network from layer to layer.

Model Compression

BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models

1 code implementation ECCV 2020 Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Ruoming Pang, Quoc Le

Without extra retraining or post-processing steps, we are able to train a single set of shared weights on ImageNet and use these weights to obtain child models whose sizes range from 200 to 1000 MFLOPs.

Neural Architecture Search

Scaling Up Neural Architecture Search with Big Single-Stage Models

no code implementations 25 Sep 2019 Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Quoc Le

In this work, we propose BigNAS, an approach that simplifies this workflow and scales up neural architecture search to target a wide range of model sizes simultaneously.

Neural Architecture Search

Using Videos to Evaluate Image Model Robustness

no code implementations 22 Apr 2019 Keren Gu, Brandon Yang, Jiquan Ngiam, Quoc Le, Jonathon Shlens

Compared to previous studies on adversarial examples and synthetic distortions, natural robustness captures a more diverse set of common image transformations that occur in the natural environment.

AirDialogue: An Environment for Goal-Oriented Dialogue Research

1 code implementation EMNLP 2018 Wei Wei, Quoc Le, Andrew Dai, Jia Li

However, current datasets are limited in size, and the environment for training agents and evaluating progress is relatively unsophisticated.

Dialogue Generation

Backprop Evolution

no code implementations 8 Aug 2018 Maximilian Alber, Irwan Bello, Barret Zoph, Pieter-Jan Kindermans, Prajit Ramachandran, Quoc Le

The back-propagation algorithm is the cornerstone of deep learning.

Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing

4 code implementations NeurIPS 2018 Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc Le, Ni Lao

We present Memory Augmented Policy Optimization (MAPO), a simple and novel way to leverage a memory buffer of promising trajectories to reduce the variance of the policy gradient estimate.

Combinatorial Optimization Program Synthesis +2

Learning Longer-term Dependencies in RNNs with Auxiliary Losses

no code implementations ICML 2018 Trieu Trinh, Andrew Dai, Thang Luong, Quoc Le

Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge.

Document Classification General Classification +1

Unsupervised Pretraining for Sequence to Sequence Learning

no code implementations EMNLP 2017 Prajit Ramachandran, Peter Liu, Quoc Le

We apply this method to challenging benchmarks in machine translation and abstractive summarization and find that it significantly improves the subsequent supervised models.

Abstractive Text Summarization Language Modelling +2

Massive Exploration of Neural Machine Translation Architectures

12 code implementations EMNLP 2017 Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le

Neural Machine Translation (NMT) has shown remarkable progress over the past few years with production systems now being deployed to end-users.

Machine Translation NMT +1

Large-Scale Evolution of Image Classifiers

2 code implementations ICML 2017 Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc Le, Alex Kurakin

Neural networks have proven effective at solving difficult problems but designing their architectures can be challenging, even for image classification problems alone.

Evolutionary Algorithms Hyperparameter Optimization +3

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

4 code implementations 23 Jan 2017 Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean

In this work, we address these challenges and finally realize the promise of conditional computation, achieving greater than 1000x improvements in model capacity with only minor losses in computational efficiency on modern GPU clusters.

Language Modelling Machine Translation +1

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision (Short Version)

no code implementations 4 Dec 2016 Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao

In this work, we propose the Manager-Programmer-Computer framework, which integrates neural networks with non-differentiable memory to support abstract, scalable and precise operations through a friendly neural computer interface.

Feature Engineering Natural Language Understanding +2

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision

2 code implementations ACL 2017 Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao

Harnessing the statistical power of neural networks to perform language understanding and symbolic reasoning is difficult when it requires executing efficient discrete operations against a large knowledge base.

Feature Engineering Structured Prediction

A Neural Conversational Model

19 code implementations 19 Jun 2015 Oriol Vinyals, Quoc Le

We find that this straightforward model can generate simple conversations given a large conversational training dataset.

Common Sense Reasoning Natural Language Understanding

Using Web Co-occurrence Statistics for Improving Image Categorization

no code implementations 19 Dec 2013 Samy Bengio, Jeff Dean, Dumitru Erhan, Eugene Ie, Quoc Le, Andrew Rabinovich, Jonathon Shlens, Yoram Singer

Despite the simplicity of the resulting optimization problem, it is effective in improving both recognition and localization accuracy.

Common Sense Reasoning Image Categorization +1
