Search Results for author: Zhanxing Zhu

Found 63 papers, 19 papers with code

Functional Bayesian Neural Networks for Model Uncertainty Quantification

no code implementations ICLR 2019 Nanyang Ye, Zhanxing Zhu

In this paper, we extend the Bayesian neural network to functional Bayesian neural network with functional Monte Carlo methods that use the samples of functionals instead of samples of networks' parameters for inference to overcome the curse of dimensionality for uncertainty quantification.

Uncertainty Quantification

On Breaking Deep Generative Model-based Defenses and Beyond

1 code implementation ICML 2020 Yanzhi Chen, Renjie Xie, Zhanxing Zhu

The idea is to view the inversion phase as a dynamical system, through which we extract the gradient with respect to the input by tracing its recent trajectory.

Doubly Stochastic Models: Learning with Unbiased Label Noises and Inference Stability

no code implementations 1 Apr 2023 Haoyi Xiong, Xuhong LI, Boyang Yu, Zhanxing Zhu, Dongrui Wu, Dejing Dou

While previous studies primarily focus on the effects of label noises on the performance of learning, our work intends to investigate the implicit regularization effects of the label noises, under mini-batch sampling settings of stochastic gradient descent (SGD), with the assumption that label noises are unbiased.

MonoFlow: Rethinking Divergence GANs via the Perspective of Wasserstein Gradient Flows

no code implementations 2 Feb 2023 Mingxuan Yi, Zhanxing Zhu, Song Liu

The conventional understanding of adversarial training in generative adversarial networks (GANs) is that the discriminator is trained to estimate a divergence, and the generator learns to minimize this divergence.

Fine-grained differentiable physics: a yarn-level model for fabrics

1 code implementation ICLR 2022 Deshan Gong, Zhanxing Zhu, Andrew J. Bulpitt, He Wang

To this end, we propose several differentiable forces, whose counterparts in empirical physics are non-differentiable, to facilitate gradient-based learning.

Spherical Motion Dynamics: Learning Dynamics of Normalized Neural Network using SGD and Weight Decay

no code implementations NeurIPS 2021 Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun

Specifically, 1) we introduce the assumptions that can lead to equilibrium state in SMD, and prove equilibrium can be reached in a linear rate regime under given assumptions; 2) we propose "angular update" as a substitute for effective learning rate to depict the state of SMD, and derive the theoretical value of angular update in equilibrium state; 3) we verify our assumptions and theoretical results on various large-scale computer vision tasks including ImageNet and MSCOCO with standard settings.

Implicit Bias of Adversarial Training for Deep Neural Networks

no code implementations ICLR 2022 Bochen Lv, Zhanxing Zhu

Furthermore, we generalize this result to the case of adversarial training for non-linear homogeneous deep neural networks without the linear separability of the dataset.

Adversarial Invariant Learning

1 code implementation CVPR 2021 Nanyang Ye, Jingxuan Tang, Huayu Deng, Xiao-Yun Zhou, Qianxiao Li, Zhenguo Li, Guang-Zhong Yang, Zhanxing Zhu

To the best of our knowledge, this is one of the first works to adopt a differentiable environment splitting method to enable stable predictions across environments without environment index information, which achieves state-of-the-art performance on datasets with strong spurious correlation, such as Colored MNIST.

Domain Generalization Out-of-Distribution Generalization

Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

1 code implementation 31 Mar 2021 Zeke Xie, Li Yuan, Zhanxing Zhu, Masashi Sugiyama

It is well-known that stochastic gradient noise (SGN) acts as implicit regularization for deep learning and is essentially important for both optimization and generalization of deep networks.

Robust Multi-Agent Reinforcement Learning Driven by Correlated Equilibrium

no code implementations 1 Jan 2021 Yizheng Hu, Kun Shao, Dong Li, Jianye Hao, Wulong Liu, Yaodong Yang, Jun Wang, Zhanxing Zhu

Therefore, to achieve robust CMARL, we introduce novel strategies to encourage agents to learn correlated equilibrium while maximally preserving the convenience of the decentralized execution.

Adversarial Robustness reinforcement-learning +2

Can We Use Gradient Norm as a Measure of Generalization Error for Model Selection in Practice?

no code implementations 1 Jan 2021 Haozhe An, Haoyi Xiong, Xuhong LI, Xingjian Li, Dejing Dou, Zhanxing Zhu

The recent theoretical investigation (Li et al., 2020) on the upper bound of generalization error of deep neural networks (DNNs) demonstrates the potential of using the gradient norm as a measure that complements validation accuracy for model selection in practice.

Model Selection

A frequency domain analysis of gradient-based adversarial examples

no code implementations 1 Jan 2021 Bochen Lv, Pu Yang, Zehao Wang, Zhanxing Zhu

And the log-spectrum difference of the adversarial examples and clean image is more concentrated in the high-frequency part than the low-frequency part.

Implicit Regularization Effects of Unbiased Random Label Noises with SGD

no code implementations 1 Jan 2021 Haoyi Xiong, Xuhong LI, Boyang Yu, Dejing Dou, Dongrui Wu, Zhanxing Zhu

Random label noises (or observational noises) widely exist in practical machine learning settings.

Neural Approximate Sufficient Statistics for Likelihood-free Inference

no code implementations ICLR 2021 Yanzhi Chen, Dinghuai Zhang, Michael U. Gutmann, Aaron Courville, Zhanxing Zhu

We consider the fundamental problem of how to automatically construct summary statistics for likelihood-free inference where the evaluation of likelihood function is intractable but sampling / simulating data from the model is possible.

Amata: An Annealing Mechanism for Adversarial Training Acceleration

no code implementations 15 Dec 2020 Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu

However, conducting adversarial training brings much computational overhead compared with standard training.

Spatial-Temporal Fusion Graph Neural Networks for Traffic Flow Forecasting

2 code implementations 15 Dec 2020 Mengzhang Li, Zhanxing Zhu

STFGNN could effectively learn hidden spatial-temporal dependencies by a novel fusion operation of various spatial and temporal graphs, which are generated by a data-driven method.

Traffic Prediction

Neural Approximate Sufficient Statistics for Implicit Models

1 code implementation 20 Oct 2020 Yanzhi Chen, Dinghuai Zhang, Michael Gutmann, Aaron Courville, Zhanxing Zhu

We consider the fundamental problem of how to automatically construct summary statistics for implicit generative models where the evaluation of the likelihood function is intractable, but sampling data from the model is possible.

Automatic Data Augmentation for 3D Medical Image Segmentation

1 code implementation 7 Oct 2020 Ju Xu, Mengzhang Li, Zhanxing Zhu

Data augmentation is an effective and universal technique for improving generalization performance of deep neural networks.

Data Augmentation Image Segmentation +2

Informative Dropout for Robust Representation Learning: A Shape-bias Perspective

1 code implementation ICML 2020 Baifeng Shi, Dinghuai Zhang, Qi Dai, Zhanxing Zhu, Yadong Mu, Jingdong Wang

Specifically, we discriminate texture from shape based on local self-information in an image, and adopt a Dropout-like algorithm to decorrelate the model output from the local texture.

Domain Generalization Representation Learning

Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD

no code implementations 15 Jun 2020 Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun

In this work, we comprehensively reveal the learning dynamics of neural network with normalization, weight decay (WD), and SGD (with momentum), named as Spherical Motion Dynamics (SMD).

Global Robustness Verification Networks

no code implementations 8 Jun 2020 Weidi Sun, Yuteng Lu, Xiyue Zhang, Zhanxing Zhu, Meng Sun

The wide deployment of deep neural networks, though achieving great success in many domains, has severe safety and reliability concerns.

Adversarial Attack

Black-Box Certification with Randomized Smoothing: A Functional Optimization Based Framework

no code implementations NeurIPS 2020 Dinghuai Zhang, Mao Ye, Chengyue Gong, Zhanxing Zhu, Qiang Liu

Randomized classifiers have been shown to provide a promising approach for achieving certified robustness against adversarial attacks in deep learning.

Towards Making Deep Transfer Learning Never Hurt

no code implementations 18 Nov 2019 Ruosi Wan, Haoyi Xiong, Xingjian Li, Zhanxing Zhu, Jun Huan

The empirical results show that the proposed descent direction estimation strategy DTNH can always improve the performance of deep transfer learning tasks based on all above regularizers, even when transferring pre-trained weights from inappropriate networks.

Knowledge Distillation Transfer Learning

Filling the Soap Bubbles: Efficient Black-Box Adversarial Certification with Non-Gaussian Smoothing

no code implementations 25 Sep 2019 Dinghuai Zhang*, Mao Ye*, Chengyue Gong*, Zhanxing Zhu, Qiang Liu

Randomized classifiers have been shown to provide a promising approach for achieving certified robustness against adversarial attacks in deep learning.

Spatio-temporal Manifold Learning for Human Motions via Long-horizon Modeling

no code implementations 20 Aug 2019 He Wang, Edmond S. L. Ho, Hubert P. H. Shum, Zhanxing Zhu

In this paper, we propose a new deep network to tackle these challenges by creating a natural motion manifold that is versatile for many applications.

Denoising Time Series Analysis

AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models

1 code implementation ICLR 2021 Ke Sun, Zhanxing Zhu, Zhouchen Lin

The design of deep graph models still remains to be investigated and the crucial part is how to explore and exploit the knowledge from different hops of neighbors in an efficient way.

Node Classification

On the Noisy Gradient Descent that Generalizes as SGD

1 code implementation ICML 2020 Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu

The gradient noise of SGD is considered to play a central role in the observed strong generalization abilities of deep learning.

Efficient Neural Architecture Search via Proximal Iterations

2 code implementations 30 May 2019 Quanming Yao, Ju Xu, Wei-Wei Tu, Zhanxing Zhu

Recently, DARTS, which constructs a differentiable search space and then optimizes it by gradient descent, has obtained high-performance architectures and reduced the search time to several days.

Neural Architecture Search

On the Learning Dynamics of Two-layer Nonlinear Convolutional Neural Networks

no code implementations 24 May 2019 Bing Yu, Junzhao Zhang, Zhanxing Zhu

Convolutional neural networks (CNNs) have achieved remarkable performance in various fields, particularly in the domain of computer vision.

Image Classification Vocal Bursts Valence Prediction

Interpreting Adversarially Trained Convolutional Neural Networks

1 code implementation 23 May 2019 Tianyuan Zhang, Zhanxing Zhu

Our findings shed some light on why AT-CNNs are more robust than those normally trained ones and contribute to a better understanding of adversarial training over CNNs from an interpretation perspective.

Object Recognition Test

Bayesian Optimized Continual Learning with Attention Mechanism

no code implementations 10 May 2019 Ju Xu, Jin Ma, Zhanxing Zhu

Though neural networks have achieved much progress in various applications, it is still highly challenging for them to learn from a continuous stream of tasks without forgetting.

Bayesian Optimization Continual Learning

You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle

2 code implementations NeurIPS 2019 Dinghuai Zhang, Tianyuan Zhang, Yiping Lu, Zhanxing Zhu, Bin Dong

Adversarial training, typically formulated as a robust optimization problem, is an effective way of improving the robustness of deep networks.

Adversarial Defense

Exploring and Enhancing the Transferability of Adversarial Examples

no code implementations ICLR 2019 Lei Wu, Zhanxing Zhu, Cheng Tai

State-of-the-art deep neural networks are vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs.

Test

The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Minima and Regularization Effects

no code implementations ICLR 2019 Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, Jinwen Ma

Along this line, we theoretically study a general form of gradient based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics.

SHE2: Stochastic Hamiltonian Exploration and Exploitation for Derivative-Free Optimization

no code implementations ICLR 2019 Haoyi Xiong, Wenqing Hu, Zhanxing Zhu, Xinjian Li, Yunchao Zhang, Jun Huan

Derivative-free optimization (DFO) using trust region methods is frequently used for machine learning applications, such as (hyper-)parameter optimization where the derivatives of the objective functions are unknown.

BIG-bench Machine Learning Text-to-Image Generation

Learning to Search Efficient DenseNet with Layer-wise Pruning

no code implementations ICLR 2019 Xuanyang Zhang, Hao Liu, Zhanxing Zhu, Zenglin Xu

Deep neural networks have achieved outstanding performance in many real-world applications at the expense of huge computational resources.

ST-UNet: A Spatio-Temporal U-Network for Graph-structured Time Series Modeling

no code implementations 13 Mar 2019 Bing Yu, Haoteng Yin, Zhanxing Zhu

In this U-shaped network, a paired sampling operation is proposed in the spacetime domain: the pooling (ST-Pool) coarsens the input graph spatially according to its deterministic partition while abstracting multi-resolution temporal dependencies through dilated recurrent skip connections; based on the settings used in the downsampling, the unpooling (ST-Unpool) restores the original structure of the spatio-temporal graphs and resumes regular intervals within graph sequences.

Graph Learning Time Series +2

Enhancing the Robustness of Deep Neural Networks by Boundary Conditional GAN

no code implementations 28 Feb 2019 Ke Sun, Zhanxing Zhu, Zhouchen Lin

In this work, we propose a novel defense mechanism called Boundary Conditional GAN to enhance the robustness of deep neural networks against adversarial examples.

Data Augmentation

Virtual Adversarial Training on Graph Convolutional Networks in Node Classification

no code implementations 28 Feb 2019 Ke Sun, Zhouchen Lin, Hantao Guo, Zhanxing Zhu

The effectiveness of Graph Convolutional Networks (GCNs) has been demonstrated in a wide range of graph-based machine learning tasks.

Classification General Classification +1

Towards Understanding Adversarial Examples Systematically: Exploring Data Size, Task and Model Factors

no code implementations 28 Feb 2019 Ke Sun, Zhanxing Zhu, Zhouchen Lin

In this paper, we present a systematic study on adversarial examples from three aspects: the amount of training data, task-dependent and model-specific factors.

Test

Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labels

1 code implementation 28 Feb 2019 Ke Sun, Zhouchen Lin, Zhanxing Zhu

In this paper, we propose a novel training algorithm for Graph Convolutional Networks, called the Multi-Stage Self-Supervised (M3S) Training Algorithm, combined with a self-supervised learning approach, focusing on improving the generalization performance of GCNs on graphs with few labeled nodes.

Graph Embedding Graph Learning +1

Quasi-potential as an implicit regularizer for the loss function in the stochastic gradient descent

no code implementations 18 Jan 2019 Wenqing Hu, Zhanxing Zhu, Haoyi Xiong, Jun Huan

We show in this case that the quasi-potential function is related to the noise covariance structure of SGD via a partial differential equation of Hamilton-Jacobi type.

Relation Variational Inference

Bayesian Adversarial Learning

no code implementations NeurIPS 2018 Nanyang Ye, Zhanxing Zhu

In this work, a novel robust training framework is proposed to alleviate this issue, Bayesian Robust Learning, in which a distribution is put on the adversarial data-generating distribution to account for the uncertainty of the adversarial data-generating process.

Test

Tangent-Normal Adversarial Regularization for Semi-supervised Learning

1 code implementation CVPR 2019 Bing Yu, Jingfeng Wu, Jinwen Ma, Zhanxing Zhu

The proposed TNAR is composed of two complementary parts, the tangent adversarial regularization (TAR) and the normal adversarial regularization (NAR).

TAR

Neural Control Variates for Variance Reduction

no code implementations 1 Jun 2018 Ruosi Wan, Mingjun Zhong, Haoyi Xiong, Zhanxing Zhu

In statistics and machine learning, approximation of an intractable integration is often achieved by using the unbiased Monte Carlo estimator, but the variances of the estimation are generally high in many applications.

Test

Reinforced Continual Learning

1 code implementation NeurIPS 2018 Ju Xu, Zhanxing Zhu

In this work, a novel approach for continual learning is proposed, which searches for the best neural architecture for each coming task via sophisticatedly designed reinforcement learning strategies.

Continual Learning General Classification +2

The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects

1 code implementation ICLR 2019 Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, Jinwen Ma

Along this line, we study a general form of gradient based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics.

Understanding and Enhancing the Transferability of Adversarial Examples

no code implementations 27 Feb 2018 Lei Wu, Zhanxing Zhu, Cheng Tai, Weinan E

State-of-the-art deep neural networks are known to be vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs.

Test

Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning

1 code implementation NeurIPS 2018 Rui Luo, Jianhong Wang, Yaodong Yang, Zhanxing Zhu, Jun Wang

We propose a new sampling method, the thermostat-assisted continuously-tempered Hamiltonian Monte Carlo, for Bayesian learning on large datasets and multimodal distributions.

Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes

no code implementations 30 Jun 2017 Lei Wu, Zhanxing Zhu, Weinan E

It is widely observed that deep learning models with learned parameters generalize well, even with much more model parameters than the number of training samples.

Langevin Dynamics with Continuous Tempering for Training Deep Neural Networks

no code implementations NeurIPS 2017 Nanyang Ye, Zhanxing Zhu, Rafal K. Mantiuk

Minimizing non-convex and high-dimensional objective functions is challenging, especially when training modern deep neural networks.

Stochastic Optimization

Stochastic Parallel Block Coordinate Descent for Large-scale Saddle Point Problems

no code implementations 23 Nov 2015 Zhanxing Zhu, Amos J. Storkey

We consider convex-concave saddle point problems with a separable structure and non-strongly convex functions.

feature selection

Adaptive Stochastic Primal-Dual Coordinate Descent for Separable Saddle Point Problems

no code implementations 12 Jun 2015 Zhanxing Zhu, Amos J. Storkey

We consider a generic convex-concave saddle point problem with separable structure, a form that covers a wide range of machine learning applications.
