no code implementations • 20 Feb 2023 • Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L Smith, Yee Whye Teh
Skip connections and normalisation layers form two standard architectural components that are ubiquitous for the training of Deep Neural Networks (DNNs), but whose precise roles are poorly understood.
no code implementations • 14 Oct 2022 • Zengjing Chen, Larry G. Epstein, Guodong Zhang
This paper studies a multi-armed bandit problem in which payoff distributions are known but the riskiness of payoffs matters.
1 code implementation • ICLR 2022 • Guodong Zhang, Aleksandar Botev, James Martens
However, this method (called Deep Kernel Shaping) isn't fully compatible with ReLUs, and produces networks that overfit significantly more than ResNets on ImageNet.
no code implementations • 3 Nov 2021 • Nan Feng, Guodong Zhang, Kapil Khandelwal
For nonlinear dynamics, it is shown that sparsity in network layers is lost, and efficient DNN architectures with fully-connected and convolutional network layers are explored.
no code implementations • 27 Aug 2021 • Cem Anil, Guodong Zhang, Yuhuai Wu, Roger Grosse
We develop instantiations of the PVG for two algorithmic tasks, and show that in practice, the verifier learns a robust decision rule that is able to receive useful and reliable information from an untrusted prover.
no code implementations • NeurIPS 2021 • Guodong Zhang, Kyle Hsu, Jianing Li, Chelsea Finn, Roger Grosse
To this end, we propose Differentiable AIS (DAIS), a variant of AIS which ensures differentiability by abandoning the Metropolis-Hastings corrections.
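A minimal sketch of the idea on a one-dimensional Gaussian toy problem (the densities, schedule, and step size below are illustrative assumptions, not the paper's setup): the AIS transitions are replaced by uncorrected Langevin steps, so every operation in the estimator stays differentiable.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_prior(z):             # standard normal, N(0, 1)
    return -0.5 * z ** 2

def log_target(z):            # unnormalized N(2, 0.5^2); true log Z ratio = log 0.5
    return -0.5 * ((z - 2.0) / 0.5) ** 2

def grad_log_annealed(z, b):  # gradient of (1-b)*log_prior + b*log_target
    return (1 - b) * (-z) + b * (-(z - 2.0) / 0.25)

n, eps = 2000, 0.05
z = rng.normal(size=n)                      # start from the prior
log_w = np.zeros(n)
betas = np.linspace(0.0, 1.0, 101)
for b0, b1 in zip(betas[:-1], betas[1:]):
    # Standard AIS importance weight for the temperature move.
    log_w += (b1 - b0) * (log_target(z) - log_prior(z))
    # Uncorrected Langevin transition: no Metropolis-Hastings accept/reject,
    # so the whole chain is differentiable end to end.
    z += eps * grad_log_annealed(z, b1) + np.sqrt(2 * eps) * rng.normal(size=n)

# Roughly recovers log 0.5 ~= -0.693, up to Langevin discretization bias.
print(np.log(np.mean(np.exp(log_w))))
```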
no code implementations • 10 Jun 2021 • Zengjing Chen, Larry G. Epstein, Guodong Zhang
This paper studies a multi-armed bandit problem where the decision-maker is loss averse: in particular, she is risk averse in the domain of gains and risk loving in the domain of losses.
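As a toy illustration of how such risk attitudes can reverse arm rankings (the prospect-theory-style utility below is an assumed stand-in, not the paper's model):

```python
import numpy as np

def utility(x, alpha=0.88, beta=0.88, lam=2.25):
    gains = np.clip(x, 0.0, None) ** alpha           # concave: risk averse in gains
    losses = -lam * np.clip(-x, 0.0, None) ** beta   # convex: risk loving in losses
    return np.where(x >= 0, gains, losses)

rng = np.random.default_rng(0)
arm_a = rng.normal(1.0, 0.5, 100_000)   # lower mean, rarely loses
arm_b = rng.normal(1.2, 3.0, 100_000)   # higher mean, frequent losses

print(arm_b.mean() > arm_a.mean())                    # True: B wins on expected value
print(utility(arm_a).mean() > utility(arm_b).mean())  # True: A wins on expected utility
```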
no code implementations • 18 Feb 2021 • Guodong Zhang, Yuanhao Wang, Laurent Lessard, Roger Grosse
Smooth minimax games often proceed by simultaneous or alternating gradient updates.
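The difference between the two update rules is easy to see on a toy bilinear game (a sketch, not the paper's code): with simultaneous updates both players move from the same iterate, while with alternating updates the second player responds to the first player's new iterate.

```python
# Toy game f(x, y) = x * y, where x minimizes and y maximizes.

def simultaneous_step(x, y, lr):
    # Both players update from the same iterate.
    return x - lr * y, y + lr * x

def alternating_step(x, y, lr):
    # The second player sees the first player's new iterate.
    x_new = x - lr * y
    return x_new, y + lr * x_new

x_s, y_s = 1.0, 1.0
x_a, y_a = 1.0, 1.0
for _ in range(100):
    x_s, y_s = simultaneous_step(x_s, y_s, lr=0.1)
    x_a, y_a = alternating_step(x_a, y_a, lr=0.1)

# Distance from the equilibrium (0, 0): simultaneous updates spiral outward
# on this game, while alternating updates stay bounded.
print((x_s**2 + y_s**2) ** 0.5, (x_a**2 + y_a**2) ** 0.5)
```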
1 code implementation • 23 Sep 2020 • Guodong Zhang, Xuchan Bao, Laurent Lessard, Roger Grosse
The theory of integral quadratic constraints (IQCs) allows the certification of exponential convergence of interconnected systems containing nonlinear or uncertain elements.
no code implementations • 17 Aug 2020 • Guodong Zhang, Yuanhao Wang
Smooth game optimization has recently attracted great interest in machine learning as it generalizes the single-objective optimization paradigm.
3 code implementations • ICLR 2020 • Chaoqi Wang, Guodong Zhang, Roger Grosse
Overparameterization has been shown to benefit both the optimization and generalization of neural networks, but large networks are resource-hungry at both training and test time.
no code implementations • NeurIPS 2019 • Guodong Zhang, James Martens, Roger B. Grosse
For two-layer ReLU neural networks (i.e., with one hidden layer), we prove that these two conditions do hold throughout training under the assumptions that the inputs do not degenerate and the network is over-parameterized.
no code implementations • ICLR 2020 • Yuanhao Wang, Guodong Zhang, Jimmy Ba
Many tasks in modern machine learning can be formulated as finding equilibria in \emph{sequential} games.
1 code implementation • NeurIPS 2019 • Guodong Zhang, Lala Li, Zachary Nado, James Martens, Sushant Sachdeva, George E. Dahl, Christopher J. Shallue, Roger Grosse
Increasing the batch size is a popular way to speed up neural network training, but beyond some critical batch size, larger batch sizes yield diminishing returns.
2 code implementations • 3 Jul 2019 • Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba
Model-based reinforcement learning (MBRL) is widely seen as having the potential to be significantly more sample efficient than model-free RL.
no code implementations • 27 May 2019 • Guodong Zhang, James Martens, Roger Grosse
In this work, we analyze for the first time the speed of convergence of natural gradient descent on nonlinear neural networks with squared-error loss.
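As a toy illustration of the setting (a linear least-squares model, where the Fisher coincides exactly with the Gauss-Newton matrix; an assumed simplification, not the paper's nonlinear analysis), natural gradient descent preconditions the gradient with the damped Fisher and converges in very few steps:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))          # inputs
w_true = rng.normal(size=10)
y = X @ w_true                         # targets from a linear "network"

w = np.zeros(10)
damping = 1e-3
for _ in range(5):
    r = X @ w - y                      # residuals
    g = X.T @ r / len(X)               # gradient of 0.5 * mean squared error
    F = X.T @ X / len(X)               # Fisher = Gauss-Newton for squared error
    w -= np.linalg.solve(F + damping * np.eye(10), g)  # natural gradient step

print(np.linalg.norm(w - w_true))      # near zero after very few steps
```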
1 code implementation • 15 May 2019 • Chaoqi Wang, Roger Grosse, Sanja Fidler, Guodong Zhang
Reducing the test time resource requirements of a neural network while preserving test accuracy is crucial for running inference on resource-constrained devices.
2 code implementations • ICLR 2019 • Shengyang Sun, Guodong Zhang, Jiaxin Shi, Roger Grosse
We introduce functional variational Bayesian neural networks (fBNNs), which maximize an Evidence Lower BOund (ELBO) defined directly on stochastic processes, i.e., distributions over functions.
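A hedged sketch of the kind of objective being described, evaluating a functional ELBO at a fixed set of measurement points; the diagonal Gaussian q and RBF GP prior here are assumed stand-ins, not the fBNN construction itself:

```python
import numpy as np

def rbf_gram(x, ls=1.0, jitter=1e-3):
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * d ** 2 / ls ** 2) + jitter * np.eye(len(x))

x = np.linspace(-2, 2, 8)                   # measurement points
y = np.sin(x)                               # observations
m, s = 0.9 * np.sin(x), 0.2 * np.ones(8)    # q(f) = N(m, diag(s^2))
noise = 0.1                                 # Gaussian likelihood std

# Expected log-likelihood under q (analytic for a Gaussian likelihood).
ell = np.sum(-0.5 * np.log(2 * np.pi * noise ** 2)
             - ((y - m) ** 2 + s ** 2) / (2 * noise ** 2))

# KL( N(m, diag(s^2)) || N(0, K) ) between function-value distributions.
K = rbf_gram(x)
Kinv = np.linalg.inv(K)
kl = 0.5 * (np.trace(Kinv @ np.diag(s ** 2)) + m @ Kinv @ m
            - len(x) + np.linalg.slogdet(K)[1] - np.sum(np.log(s ** 2)))

print(ell - kl)  # the functional ELBO, maximized over q in an fBNN
```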
no code implementations • 21 Feb 2019 • Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba
We demonstrate that the learning performance of our method is more accurately captured by the structure of the noise covariance matrix than by the variance of the gradients.
3 code implementations • 30 Nov 2018 • Juhan Bae, Guodong Zhang, Roger Grosse
A recently proposed method, noisy natural gradient, is a surprisingly simple method to fit expressive posteriors by adding weight noise to regular natural gradient updates.
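A simplified, diagonal sketch of that idea (the constants, damping, and quadratic toy objective are assumptions, not the paper's Noisy Adam or Noisy K-FAC variants): sample weights from a Gaussian whose precision tracks a running curvature estimate, then take a preconditioned step on the mean.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, f = np.zeros(2), np.ones(2)            # posterior mean, diagonal Fisher estimate
lam, lr, beta, n = 1.0, 0.1, 0.99, 1000.0  # prior precision, step size, decay, dataset size
target = np.array([1.0, -1.0])             # optimum of a toy quadratic log-likelihood

for _ in range(500):
    w = mu + rng.normal(size=2) / np.sqrt(n * (f + lam))  # weight noise from the posterior
    g = (w - target) + lam * w / n                        # gradient of the toy objective
    f = beta * f + (1 - beta) * g ** 2                    # running Fisher estimate
    mu -= lr * g / (f + lam)                              # natural-gradient-style step

print(mu)  # approaches [1, -1] up to posterior noise
```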
no code implementations • ICLR 2019 • Guodong Zhang, Chaoqi Wang, Bowen Xu, Roger Grosse
Weight decay is one of the standard tricks in the neural network toolbox, but the reasons for its regularization effect are poorly understood, and recent results have cast doubt on the traditional interpretation in terms of $L_2$ regularization.
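The gap between the two interpretations is easy to state in a single scalar update (illustrative numbers; `precond` stands in for an Adam-style preconditioner): an L2 penalty is preconditioned along with the rest of the gradient, while decoupled weight decay bypasses the preconditioner, so the two coincide only for plain SGD.

```python
lr, wd, w, grad, precond = 0.1, 0.01, 1.0, 0.5, 0.25  # hypothetical values

# L2 regularization: the penalty's gradient wd * w is preconditioned too.
w_l2 = w - lr * (grad + wd * w) / precond

# Decoupled weight decay: the decay term bypasses the preconditioner.
w_decoupled = w - lr * grad / precond - lr * wd * w

print(w_l2, w_decoupled)  # identical only when precond == 1
```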
no code implementations • 27 Sep 2018 • Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba
Unfortunately, a major drawback is the so-called generalization gap: large-batch training typically leads to a degradation in generalization performance of the model as compared to small-batch training.
4 code implementations • ICML 2018 • Shengyang Sun, Guodong Zhang, Chaoqi Wang, Wenyuan Zeng, Jiaman Li, Roger Grosse
The NKN architecture is based on the composition rules for kernels, so that each unit of the network corresponds to a valid kernel.
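A minimal sketch of those composition rules (the function names are illustrative, not the NKN codebase): nonnegative-weighted sums and products of valid kernels are themselves valid kernels, so each unit can safely combine its inputs.

```python
import numpy as np

def rbf(x, y, ls=1.0):
    return np.exp(-0.5 * (x - y) ** 2 / ls ** 2)

def linear(x, y):
    return x * y

def unit(x, y, w1=0.3, w2=0.7):
    # One "unit": a nonnegative-weighted sum of a product kernel and a base
    # kernel, which stays positive definite by the closure rules.
    return w1 * rbf(x, y) * linear(x, y) + w2 * rbf(x, y, ls=2.0)

print(unit(0.5, 1.0))
```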
2 code implementations • ICML 2018 • Guodong Zhang, Shengyang Sun, David Duvenaud, Roger Grosse
Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation.
38 code implementations • ICCV 2017 • Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, Yichen Wei
Convolutional neural networks (CNNs) are inherently limited in modeling geometric transformations due to the fixed geometric structures in their building modules.
Ranked #3 on Vessel Detection on Vessel Detection Dataset
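A hedged sketch of the deformable sampling idea behind this paper (a single-channel, single-position toy; the real operator runs over full feature maps with learned per-location offsets): sample the input at the regular kernel grid plus 2-D offsets via bilinear interpolation, then take the weighted sum.

```python
import numpy as np

def bilinear(img, y, x):
    # Bilinear interpolation with zero padding outside the image.
    h, w = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < h and 0 <= xx < w:
                val += (1 - abs(y - yy)) * (1 - abs(x - xx)) * img[yy, xx]
    return val

def deformable_conv_at(img, weights, offsets, cy, cx):
    # weights: (3, 3) kernel; offsets: (3, 3, 2) learned (dy, dx) per tap.
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i, j]
            out += weights[i, j] * bilinear(img, cy + i - 1 + dy, cx + j - 1 + dx)
    return out

img = np.arange(25.0).reshape(5, 5)
w = np.ones((3, 3)) / 9.0
print(deformable_conv_at(img, w, np.zeros((3, 3, 2)), 2, 2))       # 12.0: plain conv
print(deformable_conv_at(img, w, 0.5 * np.ones((3, 3, 2)), 2, 2))  # shifted sampling grid
```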