Search Results for author: David So

Found 6 papers, 0 papers with code

Evolving Machine Learning Algorithms From Scratch

no code implementations • ICML 2020 • Esteban Real, Chen Liang, David So, Quoc Le

However, this progress has largely focused on the architecture of neural networks, where it has relied on sophisticated expert-designed layers as building blocks, or on similarly restrictive search spaces.

AutoML • BIG-bench Machine Learning

Brainformers: Trading Simplicity for Efficiency

no code implementations • 29 May 2023 • Yanqi Zhou, Nan Du, Yanping Huang, Daiyi Peng, Chang Lan, Da Huang, Siamak Shakeri, David So, Andrew Dai, Yifeng Lu, Zhifeng Chen, Quoc Le, Claire Cui, James Laundon, Jeff Dean

Using this insight, we develop a complex block, named Brainformer, that consists of a diverse set of layers such as sparsely gated feed-forward layers, dense feed-forward layers, attention layers, and various forms of layer normalization and activation functions.

The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink

no code implementations • 11 Apr 2022 • David Patterson, Joseph Gonzalez, Urs Hölzle, Quoc Le, Chen Liang, Lluis-Miquel Munguia, Daniel Rothchild, David So, Maud Texier, Jeff Dean

Four best practices can reduce ML training energy by up to 100x and CO2 emissions up to 1000x.

BIG-bench Machine Learning • Total Energy

Searching for Efficient Transformers for Language Modeling

no code implementations • NeurIPS 2021 • David So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam Shazeer, Quoc Le

For example, at a 500M parameter size, Primer improves the original T5 architecture on C4 auto-regressive language modeling, reducing the training cost by 4X.

Language Modelling

Carbon Emissions and Large Neural Network Training

no code implementations • 21 Apr 2021 • David Patterson, Joseph Gonzalez, Quoc Le, Chen Liang, Lluis-Miquel Munguia, Daniel Rothchild, David So, Maud Texier, Jeff Dean

To help reduce the carbon footprint of ML, we believe energy usage and CO2e should be a key metric in evaluating models, and we are collaborating with MLPerf developers to include energy usage during training and inference in this industry standard benchmark.

Neural Architecture Search • Scheduling

Improving image generative models with human interactions

no code implementations • ICLR 2018 • Andrew Kyle Lampinen, David So, Douglas Eck, Fred Bertsch

GANs provide a framework for training generative models which mimic a data distribution.
