Search Results for author: Dami Choi

Found 8 papers, 5 papers with code

Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

no code implementations NeurIPS 2023 Dami Choi, Derrick Xin, Hamid Dadkhahi, Justin Gilmer, Ankush Garg, Orhan Firat, Chih-Kuan Yeh, Andrew M. Dai, Behrooz Ghorbani

In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance.

Language Modelling, Machine Translation, +3

Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering

no code implementations NeurIPS Workshop ICBINB 2020 Ricky T. Q. Chen, Dami Choi, Lukas Balles, David Duvenaud, Philipp Hennig

Standard first-order stochastic optimization algorithms base their updates solely on the average mini-batch gradient, and it has been shown that tracking additional quantities such as the curvature can help de-sensitize common hyperparameters.

Stochastic Optimization

On Empirical Comparisons of Optimizers for Deep Learning

1 code implementation • 11 Oct 2019 Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl

In particular, we find that the popular adaptive gradient methods never underperform momentum or gradient descent.


Faster Neural Network Training with Data Echoing

1 code implementation • 12 Jul 2019 Dami Choi, Alexandre Passos, Christopher J. Shallue, George E. Dahl

In the twilight of Moore's law, GPUs and other specialized hardware accelerators have dramatically sped up neural network training.
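The core idea of data echoing is to reuse ("echo") each loaded batch for several optimizer steps when the input pipeline, not the accelerator, is the bottleneck. A minimal sketch of that idea, with hypothetical names that are not from the paper's code release:

```python
def echoed_batches(batch_iterator, echo_factor=2):
    """Yield each batch `echo_factor` times, so the accelerator keeps
    training on repeated data instead of idling while the (slow) input
    pipeline loads the next batch."""
    for batch in batch_iterator:
        for _ in range(echo_factor):
            yield batch

# Usage: a training loop consumes the echoed stream as usual.
echoed = list(echoed_batches(iter([[1, 2], [3, 4]]), echo_factor=2))
# -> [[1, 2], [1, 2], [3, 4], [3, 4]]
```

In practice echoing can be applied at different stages of the pipeline (e.g. before or after augmentation), which the paper studies; this sketch shows only the simplest, batch-level variant.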

Guided Evolutionary Strategies: Escaping the curse of dimensionality in random search

no code implementations ICLR 2019 Niru Maheswaranathan, Luke Metz, George Tucker, Dami Choi, Jascha Sohl-Dickstein

This arises when an approximate gradient is easier to compute than the full gradient (e.g. in meta-learning or unrolled optimization), or when a true gradient is intractable and is replaced with a surrogate (e.g. in certain reinforcement learning applications or training networks with discrete variables).


Guided evolutionary strategies: Augmenting random search with surrogate gradients

1 code implementation ICLR 2019 Niru Maheswaranathan, Luke Metz, George Tucker, Dami Choi, Jascha Sohl-Dickstein

We propose Guided Evolutionary Strategies, a method for optimally using surrogate gradient directions along with random search.
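The method biases the random search distribution toward the subspace spanned by the surrogate gradients, sampling perturbations from a Gaussian whose covariance mixes the full parameter space with that subspace. A rough sketch of such a sampler (not the authors' implementation; function and parameter names are illustrative):

```python
import numpy as np

def guided_perturbation(surrogate_grads, n, alpha=0.5, sigma=0.1, rng=None):
    """Sample one n-dimensional search direction.

    surrogate_grads: (n, k) array of k surrogate gradient directions.
    alpha: weight on the isotropic (full-space) component vs. the
           guiding subspace component.
    """
    rng = np.random.default_rng(rng)
    U, _ = np.linalg.qr(surrogate_grads)       # orthonormal basis of subspace
    k = U.shape[1]
    full = np.sqrt(alpha / n) * rng.standard_normal(n)        # explore everywhere
    sub = np.sqrt((1 - alpha) / k) * (U @ rng.standard_normal(k))  # explore along surrogates
    return sigma * (full + sub)

eps = guided_perturbation(np.ones((10, 1)), n=10, rng=0)
```

Pairs of antithetic perturbations like `eps` would then be used in a standard evolutionary-strategies finite-difference estimate of the descent direction.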

