Search Results for author: Guodong Zhang

Found 25 papers, 11 papers with code

Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation

no code implementations 20 Feb 2023 Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L Smith, Yee Whye Teh

Skip connections and normalisation layers are two standard architectural components that are ubiquitous in the training of Deep Neural Networks (DNNs), but whose precise roles are poorly understood.

Approximate optimality and the risk/reward tradeoff in a class of bandit problems

no code implementations 14 Oct 2022 Zengjing Chen, Larry G. Epstein, Guodong Zhang

This paper studies a multi-armed bandit problem where payoff distributions are known but where the riskiness of payoffs matters.

Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers

1 code implementation ICLR 2022 Guodong Zhang, Aleksandar Botev, James Martens

However, this method (called Deep Kernel Shaping) isn't fully compatible with ReLUs, and produces networks that overfit significantly more than ResNets on ImageNet.

On the Application of Data-Driven Deep Neural Networks in Linear and Nonlinear Structural Dynamics

no code implementations 3 Nov 2021 Nan Feng, Guodong Zhang, Kapil Khandelwal

For nonlinear dynamics, it is shown that sparsity in network layers is lost, and efficient DNN architectures with fully connected and convolutional layers are explored.

Transfer Learning

Learning to Give Checkable Answers with Prover-Verifier Games

no code implementations 27 Aug 2021 Cem Anil, Guodong Zhang, Yuhuai Wu, Roger Grosse

We develop instantiations of the PVG for two algorithmic tasks, and show that in practice, the verifier learns a robust decision rule that is able to receive useful and reliable information from an untrusted prover.

Differentiable Annealed Importance Sampling and the Perils of Gradient Noise

no code implementations NeurIPS 2021 Guodong Zhang, Kyle Hsu, Jianing Li, Chelsea Finn, Roger Grosse

To this end, we propose Differentiable AIS (DAIS), a variant of AIS which ensures differentiability by abandoning the Metropolis-Hastings corrections.

Stochastic Optimization
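
The trick is compact enough to sketch. Below is a minimal PyTorch rendering of the idea (our construction, not the authors' code): anneal along the geometric path between an initial and a target log-density, use unadjusted Langevin transitions in place of Metropolis-Hastings-corrected ones, and accumulate the AIS log-weights, so the log-partition estimate stays differentiable end to end.

```python
import math
import torch

def dais_log_z(log_p0, log_p1, x0, n_steps=50, step_size=0.05):
    """Differentiable-AIS sketch: anneal from log_p0 to log_p1 with
    unadjusted (no MH accept/reject) Langevin transitions, accumulating
    AIS log-weights; the estimate stays differentiable throughout."""
    x = x0.clone().requires_grad_(True)
    betas = torch.linspace(0.0, 1.0, n_steps + 1)
    log_w = torch.zeros(x0.shape[0])
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # AIS weight increment along the geometric annealing path
        log_w = log_w + (b - b_prev) * (log_p1(x) - log_p0(x))
        # one unadjusted Langevin step toward the intermediate density
        log_pi = (1 - b) * log_p0(x) + b * log_p1(x)
        grad = torch.autograd.grad(log_pi.sum(), x, create_graph=True)[0]
        x = x + 0.5 * step_size**2 * grad + step_size * torch.randn_like(x)
    # log of the importance-weighted estimate of Z_1 / Z_0
    return torch.logsumexp(log_w, 0) - math.log(len(log_w))

# toy check: standard normal initial, shifted normal target (same Z)
log_p0 = lambda x: -0.5 * (x**2).sum(-1)
log_p1 = lambda x: -0.5 * ((x - 2.0)**2).sum(-1)
print(dais_log_z(log_p0, log_p1, torch.randn(512, 2)))  # should be near 0
```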

A Central Limit Theorem, Loss Aversion and Multi-Armed Bandits

no code implementations 10 Jun 2021 Zengjing Chen, Larry G. Epstein, Guodong Zhang

This paper studies a multi-armed bandit problem where the decision-maker is loss averse; in particular, she is risk averse in the domain of gains and risk loving in the domain of losses.

Multi-Armed Bandits

A Unified Analysis of First-Order Methods for Smooth Games via Integral Quadratic Constraints

1 code implementation 23 Sep 2020 Guodong Zhang, Xuchan Bao, Laurent Lessard, Roger Grosse

The theory of integral quadratic constraints (IQCs) allows the certification of exponential convergence of interconnected systems containing nonlinear or uncertain elements.
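
As a flavor of how such certificates are computed, here is a minimal scalar instance (our toy, following the general Lessard-style IQC/SDP template rather than the paper's code): certify a linear rate rho for gradient descent on m-strongly-convex, L-smooth functions via a small LMI plus bisection, using cvxpy.

```python
import cvxpy as cp
import numpy as np

# x_{k+1} = x_k - alpha * grad f(x_k); sector IQC for the gradient:
# (u - m*y)(L*y - u) >= 0 with y = x_k - x*, u = grad f(x_k).
m, L = 1.0, 10.0
alpha = 2.0 / (m + L)

def rate_certified(rho):
    P = cp.Variable()                # Lyapunov weight: V(x) = P * (x - x*)^2
    lam = cp.Variable(nonneg=True)   # S-procedure multiplier
    M_lyap = P * np.array([[1.0 - rho**2, -alpha],
                           [-alpha, alpha**2]])
    M_sec = np.array([[-m * L, (m + L) / 2],
                      [(m + L) / 2, -1.0]])
    prob = cp.Problem(cp.Minimize(0),
                      [M_lyap + lam * M_sec << 0, P >= 1])
    prob.solve(solver=cp.SCS)
    return prob.status in ("optimal", "optimal_inaccurate")

lo, hi = 0.0, 1.0
for _ in range(30):                  # bisect for the smallest certified rate
    mid = (lo + hi) / 2
    lo, hi = (lo, mid) if rate_certified(mid) else (mid, hi)
print(f"certified rho ~ {hi:.4f}; classical value (L-m)/(L+m) = {(L - m) / (L + m):.4f}")
```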

On the Suboptimality of Negative Momentum for Minimax Optimization

no code implementations 17 Aug 2020 Guodong Zhang, Yuanhao Wang

Smooth game optimization has recently attracted great interest in machine learning as it generalizes the single-objective optimization paradigm.
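
A quick toy run (our construction, not the paper's experiments) shows the regime the paper analyzes: on the bilinear game f(x, y) = xy, alternating gradient descent-ascent with zero momentum only cycles at roughly constant radius, while a negative momentum coefficient (here -1/2) makes it converge, if slowly; the paper quantifies how far from optimal that rate is.

```python
import numpy as np

def run(beta, eta=0.3, steps=2000):
    """Alternating gradient descent-ascent with heavy-ball momentum
    (coefficient beta) on the bilinear game f(x, y) = x * y."""
    x, y = 1.0, 1.0
    x_prev, y_prev = x, y
    for _ in range(steps):
        x_new = x - eta * y + beta * (x - x_prev)      # descent on x
        y_new = y + eta * x_new + beta * (y - y_prev)  # ascent on y (sees new x)
        x_prev, y_prev, x, y = x, y, x_new, y_new
    return np.hypot(x, y)

for beta in (0.0, -0.5):
    print(f"beta={beta:5.2f}  distance to equilibrium after 2000 steps: {run(beta):.3e}")
```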

Picking Winning Tickets Before Training by Preserving Gradient Flow

3 code implementations ICLR 2020 Chaoqi Wang, Guodong Zhang, Roger Grosse

Overparameterization has been shown to benefit both the optimization and generalization of neural networks, but large networks are resource-hungry at both training and test time.

Network Pruning
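
The criterion can be sketched in a few lines of PyTorch (our hedged reading of "preserving gradient flow", with our own names and sign convention, not the authors' released code): score each weight by how much the squared gradient norm would drop if that weight were removed, using a Hessian-gradient product from double backprop, and keep the highest-scoring weights.

```python
import torch
import torch.nn as nn

def gradient_flow_scores(model, loss_fn, x, y):
    """Removing weight q perturbs theta by -theta_q, changing the squared
    gradient norm by roughly -2 * theta_q * [H g]_q (first order), so
    theta * (H g) scores each weight's contribution to gradient flow."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    gnorm2 = sum((g * g).sum() for g in grads)   # squared gradient norm
    hg = torch.autograd.grad(gnorm2, params)     # equals 2 * H g
    # high score = removal would hurt gradient flow most = worth keeping
    return [(p * h).detach() for p, h in zip(params, hg)]

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
x, y = torch.randn(32, 20), torch.randint(0, 2, (32,))
scores = gradient_flow_scores(model, nn.CrossEntropyLoss(), x, y)
flat = torch.cat([s.flatten() for s in scores])
thresh = flat.kthvalue(int(0.9 * flat.numel())).values  # keep top 10%
masks = [(s >= thresh).float() for s in scores]
```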

Fast Convergence of Natural Gradient Descent for Over-Parameterized Neural Networks

no code implementations NeurIPS 2019 Guodong Zhang, James Martens, Roger B. Grosse

For two-layer ReLU neural networks (i.e., with one hidden layer), we prove that these two conditions do hold throughout training under the assumptions that the inputs do not degenerate and the network is over-parameterized.
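
For squared-error loss, one exact form of the natural gradient step is the Gauss-Newton update theta <- theta - eta * J^T (J J^T)^{-1} r, which moves the network outputs straight toward the targets; over-parameterization keeps the Gram matrix J J^T well conditioned, which is what drives the fast-convergence result. A small numpy sketch (ours; first layer trained, output layer frozen):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 10, 5, 256                 # overparameterized: h >> n
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
W = rng.standard_normal((h, d)) / np.sqrt(d)   # trained first layer
a = rng.standard_normal(h) / np.sqrt(h)        # frozen output layer

for step in range(5):
    z = X @ W.T                                 # (n, h) pre-activations
    r = np.maximum(z, 0) @ a - y                # residuals f(X) - y
    # Jacobian of outputs w.r.t. W: df_i/dW_{kj} = a_k * 1[z_ik > 0] * x_ij
    act = (z > 0).astype(float) * a             # (n, h)
    J = (act[:, :, None] * X[:, None, :]).reshape(n, h * d)
    update = J.T @ np.linalg.solve(J @ J.T + 1e-8 * np.eye(n), r)
    W -= update.reshape(h, d)                   # eta = 1: one NGD step
    print(f"step {step}: loss = {0.5 * (r @ r):.3e}")
```

With eta = 1 the residual collapses almost immediately; the only slack comes from ReLU activation patterns changing between steps.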

On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach

no code implementations ICLR 2020 Yuanhao Wang, Guodong Zhang, Jimmy Ba

Many tasks in modern machine learning can be formulated as finding equilibria in \emph{sequential} games.
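
The proposed fix can be illustrated on a toy quadratic (our example and our reconstruction of the update, not the authors' code): the follower augments its ascent step with a correction term -H_yy^{-1} H_yx (x_{t+1} - x_t), so the iterates track the ridge grad_y f = 0 instead of spiralling.

```python
# f(x, y) = 0.5*x^2 + x*y - 0.5*y^2  (our toy; local minimax at (0, 0))
Hyy, Hyx = -1.0, 1.0                     # constant Hessian blocks here

x, y, eta = 1.0, -1.0, 0.2
for t in range(200):
    gx, gy = x + y, x - y                # grad_x f, grad_y f
    dx = -eta * gx                       # leader: gradient descent on x
    y = y + eta * gy - (Hyx / Hyy) * dx  # follower: ascent + ridge correction
    x = x + dx
print(f"final iterate: ({x:.2e}, {y:.2e}); local minimax is (0, 0)")
```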

Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model

1 code implementation NeurIPS 2019 Guodong Zhang, Lala Li, Zachary Nado, James Martens, Sushant Sachdeva, George E. Dahl, Christopher J. Shallue, Roger Grosse

Increasing the batch size is a popular way to speed up neural network training, but beyond some critical batch size, larger batch sizes yield diminishing returns.
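
The qualitative effect is easy to reproduce in a one-dimensional noisy quadratic (our construction, with a crude step-size heuristic): below the critical batch size, doubling B roughly halves the number of steps needed to reach a target loss; above it, the returns vanish.

```python
import numpy as np

def steps_to_target(B, h=1.0, sigma=1.0, target=1e-3, max_steps=100_000):
    """SGD on loss 0.5*h*theta^2 with gradient noise of variance sigma^2/B.
    Step size chosen so the stationary loss sits near target/2."""
    eta = min(1.0 / h, 2 * B * target / sigma**2)
    rng = np.random.default_rng(0)
    theta = 1.0
    for t in range(max_steps):
        if 0.5 * h * theta**2 < target:
            return t
        g = h * theta + rng.standard_normal() * sigma / np.sqrt(B)
        theta -= eta * g
    return max_steps

for B in (1, 4, 16, 64, 256, 1024):
    print(f"B={B:5d}  steps to loss < 1e-3: {steps_to_target(B)}")
```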

Benchmarking Model-Based Reinforcement Learning

2 code implementations 3 Jul 2019 Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba

Model-based reinforcement learning (MBRL) is widely seen as having the potential to be significantly more sample efficient than model-free RL.

Benchmarking Model-based Reinforcement Learning +3

Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks

no code implementations 27 May 2019 Guodong Zhang, James Martens, Roger Grosse

In this work, we analyze for the first time the speed of convergence of natural gradient descent on nonlinear neural networks with squared-error loss.

EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis

1 code implementation 15 May 2019 Chaoqi Wang, Roger Grosse, Sanja Fidler, Guodong Zhang

Reducing the test time resource requirements of a neural network while preserving test accuracy is crucial for running inference on resource-constrained devices.

Network Pruning
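
A hedged numpy sketch of the underlying idea (our simplification, not the paper's code): KFAC approximates a layer's Fisher as a Kronecker product A ⊗ S of the input second-moment matrix A and the output-gradient second moment S; rotating the weights into the joint eigenbasis of A and S approximately decorrelates them, so second-order saliency pruning in that basis is better aligned with the loss than pruning raw weights.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n = 8, 4, 1000
acts = rng.standard_normal((n, d_in))     # layer inputs over a dataset
grads = rng.standard_normal((n, d_out))   # backpropagated output gradients
W = rng.standard_normal((d_out, d_in))

A = acts.T @ acts / n                     # (d_in, d_in) input second moment
S = grads.T @ grads / n                   # (d_out, d_out) gradient second moment
eigA, UA = np.linalg.eigh(A)
eigS, US = np.linalg.eigh(S)

W_rot = US.T @ W @ UA                     # weights in the KFAC eigenbasis
fisher = np.outer(eigS, eigA)             # eigenvalues of A (Kronecker) S
importance = fisher * W_rot**2            # OBD-style 2nd-order saliency
mask = importance >= np.quantile(importance, 0.5)   # prune half
W_pruned = US @ (W_rot * mask) @ UA.T     # rotate the sparse weights back
```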

Functional Variational Bayesian Neural Networks

2 code implementations ICLR 2019 Shengyang Sun, Guodong Zhang, Jiaxin Shi, Roger Grosse

We introduce functional variational Bayesian neural networks (fBNNs), which maximize an Evidence Lower BOund (ELBO) defined directly on stochastic processes, i.e., distributions over functions.

Bayesian Inference, Gaussian Processes +1

An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise

no code implementations 21 Feb 2019 Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba

We demonstrate that the learning performance of our method is more accurately captured by the structure of the covariance matrix of the noise than by the variance of gradients.

Stochastic Optimization
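
A hedged sketch of the general idea (our construction; the function names, batch sizes, and the diagonal-Fisher stand-in are all our assumptions): in large-batch training, add noise to the gradient whose covariance mimics the structure of small-batch gradient noise, rather than isotropic noise of matched variance.

```python
import torch

def noisy_step(param, grad, fisher_diag, lr=0.1, small_B=32, large_B=4096):
    """One SGD step on a large-batch gradient plus structured noise.
    fisher_diag (e.g. a running mean of squared gradients) is a cheap
    diagonal proxy for the gradient-noise covariance structure."""
    # scale bridges the covariance gap between small and large batches
    scale = (1.0 / small_B - 1.0 / large_B) ** 0.5
    noise = torch.randn_like(grad) * fisher_diag.sqrt() * scale
    param.data.add_(grad + noise, alpha=-lr)

w = torch.zeros(10)
g = torch.randn(10)             # stand-in large-batch gradient
f_diag = torch.rand(10) + 0.1   # stand-in diagonal Fisher estimate
noisy_step(w, g, f_diag)
```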

Eigenvalue Corrected Noisy Natural Gradient

3 code implementations 30 Nov 2018 Juhan Bae, Guodong Zhang, Roger Grosse

A recently proposed method, noisy natural gradient, is a surprisingly simple method to fit expressive posteriors by adding weight noise to regular natural gradient updates.

Three Mechanisms of Weight Decay Regularization

no code implementations ICLR 2019 Guodong Zhang, Chaoqi Wang, Bowen Xu, Roger Grosse

Weight decay is one of the standard tricks in the neural network toolbox, but the reasons for its regularization effect are poorly understood, and recent results have cast doubt on the traditional interpretation in terms of $L_2$ regularization.
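
A small illustration of why the "weight decay equals $L_2$ regularization" reading breaks down for adaptive optimizers (a toy of our own design): with Adam, adding lambda*w to the gradient (the L2 route) gets rescaled per-coordinate by the adaptive preconditioner, whereas decoupled weight decay (AdamW) shrinks every weight at the same rate, so the two optimizers land on visibly different solutions.

```python
import torch

w_l2 = torch.nn.Parameter(torch.tensor([1.0, 1.0]))
w_dec = torch.nn.Parameter(torch.tensor([1.0, 1.0]))
opt_l2 = torch.optim.Adam([w_l2], lr=0.1, weight_decay=0.1)     # L2-style
opt_dec = torch.optim.AdamW([w_dec], lr=0.1, weight_decay=0.1)  # decoupled

def loss(w):
    # gradient is large in dim 0 and tiny in dim 1, so the adaptive
    # preconditioner treats the two coordinates very differently
    return 10.0 * w[0] ** 2 + 1e-4 * w[1] ** 2

for opt, w in ((opt_l2, w_l2), (opt_dec, w_dec)):
    for _ in range(100):
        opt.zero_grad()
        loss(w).backward()
        opt.step()
print("Adam + L2:        ", w_l2.detach())
print("AdamW (decoupled):", w_dec.detach())
```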

Exploring Curvature Noise in Large-Batch Stochastic Optimization

no code implementations 27 Sep 2018 Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba

Unfortunately, a major drawback is the so-called generalization gap: large-batch training typically leads to a degradation in generalization performance of the model as compared to small-batch training.

Stochastic Optimization

Noisy Natural Gradient as Variational Inference

2 code implementations ICML 2018 Guodong Zhang, Shengyang Sun, David Duvenaud, Roger Grosse

Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation.

Active Learning, Efficient Exploration +2
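
A hedged sketch of the diagonal case (our simplification, in the spirit of the paper's "noisy" optimizers; the toy gradient and all constants are ours): maintain a Gaussian posterior N(mu, diag(1/(N*(f + lam)))) over the weights, where f is a running Fisher estimate; sampling the weights is the "weight noise", and the preconditioned mean update is the natural-gradient step.

```python
import torch

N, lam, lr, beta = 1000, 1e-2, 1e-2, 0.99   # data size, prior, step, EMA

mu = torch.zeros(10)                        # posterior mean
f = torch.ones(10)                          # diagonal Fisher estimate

def grad_log_lik(w):                        # stand-in per-datum model gradient
    return -(w - 1.0)

for _ in range(500):
    sigma2 = 1.0 / (N * (f + lam))
    w = mu + sigma2.sqrt() * torch.randn(10)     # weight noise ~ posterior
    g = grad_log_lik(w)
    f = beta * f + (1 - beta) * g**2             # update Fisher estimate
    mu = mu + lr * (g - lam * w) / (f + lam)     # noisy natural-gradient step
print("posterior mean ~", mu.mean().item())      # near 1.0, shrunk by the prior
```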

Deformable Convolutional Networks

38 code implementations ICCV 2017 Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, Yichen Wei

Convolutional neural networks (CNNs) are inherently limited in modelling geometric transformations due to the fixed geometric structures in their building modules.

Object Detection, Semantic Segmentation +1
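
Deformable convolution augments the regular sampling grid with learned per-location offsets. torchvision ships an op for it; a minimal call is shown below (shapes only; in a real model the offsets would come from a small convolutional branch rather than being zeros):

```python
import torch
from torchvision.ops import deform_conv2d

x = torch.randn(1, 3, 8, 8)
weight = torch.randn(4, 3, 3, 3)            # out_ch, in_ch, kH, kW
# offsets: 2 values (dy, dx) per kernel tap per output location
offset = torch.zeros(1, 2 * 3 * 3, 8, 8)    # zero offsets = plain conv
out = deform_conv2d(x, offset, weight, padding=1)
print(out.shape)                            # torch.Size([1, 4, 8, 8])
```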
