Search Results for author: Christopher J. Shallue

Found 8 papers, 4 papers with code

On Empirical Comparisons of Optimizers for Deep Learning

no code implementations 11 Oct 2019 Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl

In particular, we find that the popular adaptive gradient methods never underperform momentum or gradient descent.

Faster Neural Network Training with Data Echoing

1 code implementation 12 Jul 2019 Dami Choi, Alexandre Passos, Christopher J. Shallue, George E. Dahl

In the twilight of Moore's law, GPUs and other specialized hardware accelerators have dramatically sped up neural network training.

Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model

1 code implementation NeurIPS 2019 Guodong Zhang, Lala Li, Zachary Nado, James Martens, Sushant Sachdeva, George E. Dahl, Christopher J. Shallue, Roger Grosse

Increasing the batch size is a popular way to speed up neural network training, but beyond some critical batch size, larger batch sizes yield diminishing returns.

Identifying Exoplanets with Deep Learning III: Automated Triage and Vetting of TESS Candidates

2 code implementations 4 Apr 2019 Liang Yu, Andrew Vanderburg, Chelsea Huang, Christopher J. Shallue, Ian J. M. Crossfield, B. Scott Gaudi, Tansu Daylan, Anne Dattilo, David J. Armstrong, George R. Ricker, Roland K. Vanderspek, David W. Latham, Sara Seager, Jason Dittmann, John P. Doty, Ana Glidden, Samuel N. Quinn

We apply our model on new data from Sector 6, and present 335 new signals that received the highest scores in triage and vetting and were also identified as planet candidates by human vetters.

Earth and Planetary Astrophysics

Measuring the Effects of Data Parallelism on Neural Network Training

no code implementations 8 Nov 2018 Christopher J. Shallue, Jaehoon Lee, Joseph Antognini, Jascha Sohl-Dickstein, Roy Frostig, George E. Dahl

Along the way, we show that disagreements in the literature on how batch size affects model quality can largely be explained by differences in metaparameter tuning and compute budgets at different batch sizes.

Embedding Text in Hyperbolic Spaces

no code implementations WS 2018 Bhuwan Dhingra, Christopher J. Shallue, Mohammad Norouzi, Andrew M. Dai, George E. Dahl

Ideally, we could incorporate our prior knowledge of this hierarchical structure into unsupervised learning algorithms that work on text data.

Sentence Embeddings
