1 code implementation • ICML 2020 • Yanzhi Chen, Renjie Xie, Zhanxing Zhu
The idea is to view the inversion phase as a dynamical system, through which we extract the gradient with respect to the input by tracing its recent trajectory.
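A minimal sketch of the general idea: if the inversion iterates roughly follow gradient descent, the gradient with respect to the input can be recovered from the difference of consecutive iterates. The function name, the single-step finite difference, and the known step size are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def estimate_input_gradient(trajectory, step_size):
    """Recover an input-gradient estimate from the recent trajectory of an
    inversion procedure, assuming its iterates approximately follow
    gradient descent: x_{t+1} ~ x_t - step_size * grad(x_t)."""
    x_prev, x_curr = np.asarray(trajectory[-2]), np.asarray(trajectory[-1])
    # Invert the gradient-descent update: displacement = -step_size * gradient.
    return (x_prev - x_curr) / step_size
```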
no code implementations • ICLR 2019 • Nanyang Ye, Zhanxing Zhu
In this paper, we extend Bayesian neural networks to functional Bayesian neural networks with functional Monte Carlo methods, which draw samples of functionals rather than samples of the networks' parameters for inference, thereby overcoming the curse of dimensionality in uncertainty quantification.
1 code implementation • 4 Feb 2025 • Ruochen Li, Tanqiu Qiao, Stamos Katsigiannis, Zhanxing Zhu, Hubert P. H. Shum
We propose the Edge-to-Edge-Node-to-Node Graph Convolution (E2E-N2N-GCN), a novel dual-graph network that jointly models explicit N2N social interactions among pedestrians and implicit E2E influence propagation across these interaction patterns.
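A rough sketch of the dual-graph idea under simplifying assumptions: one plain graph convolution runs on the pedestrian (N2N) graph, and another on an edge-level (E2E) graph whose nodes are the interaction edges themselves. The normalization, the construction of the E2E adjacency, and the fusion between the two streams are placeholders, not the paper's exact design.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One plain graph-convolution step: symmetric normalization, then A X W with ReLU."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-8)))
    norm_adj = d_inv_sqrt @ adj @ d_inv_sqrt
    return np.maximum(norm_adj @ feats @ weight, 0.0)

def dual_graph_step(node_adj, node_feats, edge_adj, edge_feats, w_n2n, w_e2e):
    """One joint step over a node-level (N2N) graph among pedestrians and an
    edge-level (E2E) graph defined over the interaction edges."""
    node_out = gcn_layer(node_adj, node_feats, w_n2n)   # explicit N2N social interactions
    edge_out = gcn_layer(edge_adj, edge_feats, w_e2e)   # implicit E2E influence propagation
    return node_out, edge_out
```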
no code implementations • 3 Oct 2024 • Zygimantas Jocys, Zhanxing Zhu, Henriette M. G. Willems, Katayoun Farrahi
Drug discovery is a complex, resource-intensive process requiring significant time and cost to bring new medicines to patients.
1 code implementation • 20 Jun 2024 • Qianli Shen, Yezhen Wang, Zhouhao Yang, Xiang Li, Haonan Wang, Yang Zhang, Jonathan Scarlett, Zhanxing Zhu, Kenji Kawaguchi
Bi-level optimization (BO) has become a fundamental mathematical framework for addressing hierarchical machine learning problems.
no code implementations • 1 Apr 2023 • Haoyi Xiong, Xuhong LI, Boyang Yu, Zhanxing Zhu, Dongrui Wu, Dejing Dou
While previous studies primarily focus on the effects of label noise on learning performance, our work investigates the implicit regularization effects of label noise under the mini-batch sampling setting of stochastic gradient descent (SGD), assuming the label noise is unbiased.
no code implementations • 2 Feb 2023 • Mingxuan Yi, Zhanxing Zhu, Song Liu
The conventional understanding of adversarial training in generative adversarial networks (GANs) is that the discriminator is trained to estimate a divergence, and the generator learns to minimize this divergence.
1 code implementation • ICLR 2022 • Deshan Gong, Zhanxing Zhu, Andrew J. Bulpitt, He Wang
To this end, we propose several differentiable forces, whose counterparts in empirical physics are non-differentiable, to facilitate gradient-based learning.
no code implementations • NeurIPS 2021 • Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun
Specifically, 1) we introduce the assumptions that can lead to an equilibrium state in SMD, and prove that equilibrium is reached at a linear rate under these assumptions; 2) we propose the "angular update" as a substitute for the effective learning rate to characterize the state of SMD, and derive its theoretical value in the equilibrium state; 3) we verify our assumptions and theoretical results on various large-scale computer vision tasks, including ImageNet and MSCOCO, with standard settings.
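The angular update in point 2) can be measured directly from two consecutive weight snapshots of a scale-invariant layer; a minimal sketch (assuming flattened weight tensors) is:

```python
import numpy as np

def angular_update(w_prev, w_curr):
    """Angle (in radians) between consecutive weight vectors of a
    scale-invariant layer, used in place of the effective learning rate
    when tracking Spherical Motion Dynamics."""
    w_prev, w_curr = w_prev.ravel(), w_curr.ravel()
    cos = np.dot(w_prev, w_curr) / (np.linalg.norm(w_prev) * np.linalg.norm(w_curr))
    return np.arccos(np.clip(cos, -1.0, 1.0))
```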
no code implementations • ICLR 2022 • Bochen Lv, Zhanxing Zhu
Furthermore, we generalize this result to the case of adversarial training for non-linear homogeneous deep neural networks without the linear separability of the dataset.
no code implementations • 16 Jul 2021 • Quanshi Zhang, Tian Han, Lixin Fan, Zhanxing Zhu, Hang Su, Ying Nian Wu, Jie Ren, Hao Zhang
This workshop pays special attention to theoretical foundations, limitations, and new application trends in the scope of XAI.
1 code implementation • CVPR 2021 • Nanyang Ye, Jingxuan Tang, Huayu Deng, Xiao-Yun Zhou, Qianxiao Li, Zhenguo Li, Guang-Zhong Yang, Zhanxing Zhu
To the best of our knowledge, this is one of the first works to adopt a differentiable environment-splitting method to enable stable predictions across environments without environment index information, achieving state-of-the-art performance on datasets with strong spurious correlations, such as Colored MNIST.
1 code implementation • 31 Mar 2021 • Zeke Xie, Li Yuan, Zhanxing Zhu, Masashi Sugiyama
It is well known that stochastic gradient noise (SGN) acts as implicit regularization for deep learning and is essential for both the optimization and the generalization of deep networks.
no code implementations • 1 Jan 2021 • Bochen Lv, Pu Yang, Zehao Wang, Zhanxing Zhu
Moreover, the log-spectrum difference between adversarial examples and clean images is concentrated more in the high-frequency part than in the low-frequency part.
no code implementations • 1 Jan 2021 • Haoyi Xiong, Xuhong LI, Boyang Yu, Dejing Dou, Dongrui Wu, Zhanxing Zhu
Random label noise (or observational noise) is widespread in practical machine learning settings.
no code implementations • ICLR 2021 • Yanzhi Chen, Dinghuai Zhang, Michael U. Gutmann, Aaron Courville, Zhanxing Zhu
We consider the fundamental problem of how to automatically construct summary statistics for likelihood-free inference, where the evaluation of the likelihood function is intractable but sampling/simulating data from the model is possible.
no code implementations • 1 Jan 2021 • Haozhe An, Haoyi Xiong, Xuhong LI, Xingjian Li, Dejing Dou, Zhanxing Zhu
A recent theoretical investigation (Li et al., 2020) of the upper bound on the generalization error of deep neural networks (DNNs) demonstrates the potential of using the gradient norm as a measure that complements validation accuracy for model selection in practice.
no code implementations • 1 Jan 2021 • Yizheng Hu, Kun Shao, Dong Li, Jianye Hao, Wulong Liu, Yaodong Yang, Jun Wang, Zhanxing Zhu
Therefore, to achieve robust CMARL, we introduce novel strategies to encourage agents to learn a correlated equilibrium while maximally preserving the convenience of decentralized execution.
2 code implementations • 15 Dec 2020 • Mengzhang Li, Zhanxing Zhu
SFTGNN can effectively learn hidden spatial-temporal dependencies through a novel fusion operation over various spatial and temporal graphs generated by a data-driven method.
Ranked #4 on Traffic Prediction on BJTaxi
no code implementations • 15 Dec 2020 • Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu
However, adversarial training incurs substantial computational overhead compared with standard training.
no code implementations • NeurIPS 2020 • Guangda Ji, Zhanxing Zhu
In this paper, we theoretically analyze the knowledge distillation of a wide neural network.
1 code implementation • 20 Oct 2020 • Yanzhi Chen, Dinghuai Zhang, Michael Gutmann, Aaron Courville, Zhanxing Zhu
We consider the fundamental problem of how to automatically construct summary statistics for implicit generative models where the evaluation of the likelihood function is intractable, but sampling data from the model is possible.
1 code implementation • 7 Oct 2020 • Ju Xu, Mengzhang Li, Zhanxing Zhu
Data augmentation is an effective and universal technique for improving generalization performance of deep neural networks.
1 code implementation • ICML 2020 • Baifeng Shi, Dinghuai Zhang, Qi Dai, Zhanxing Zhu, Yadong Mu, Jingdong Wang
Specifically, we discriminate texture from shape based on local self-information in an image, and adopt a Dropout-like algorithm to decorrelate the model output from the local texture.
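A hedged sketch of the mechanism described here: estimate the local self-information of each patch (repetitive texture has low self-information), then apply a Dropout-like mask that suppresses low-information regions more aggressively. The kernel-density approximation and the exact form of the drop probability are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def self_information_map(patches, neighbor_patches, bandwidth=1.0):
    """Approximate -log p(patch) by kernel density estimation over spatial
    neighbours: texture-like patches score low, shape-bearing edges score high.
    patches: (N, d); neighbor_patches: (N, K, d)."""
    diffs = patches[:, None, :] - neighbor_patches
    density = np.exp(-np.sum(diffs ** 2, axis=-1) / (2 * bandwidth ** 2)).mean(axis=1)
    return -np.log(density + 1e-8)

def texture_dropout_mask(self_info, temperature=1.0, drop_rate=0.5, rng=None):
    """Dropout-like mask that zeroes low-information (texture) regions with
    higher probability, decorrelating the output from local texture."""
    rng = np.random.default_rng() if rng is None else rng
    keep_prob = 1.0 - drop_rate * np.exp(-self_info / temperature)
    return (rng.random(self_info.shape) < keep_prob).astype(np.float32)
```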
no code implementations • 15 Jun 2020 • Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun
In this work, we comprehensively reveal the learning dynamics of neural networks with normalization, weight decay (WD), and SGD (with momentum), which we name Spherical Motion Dynamics (SMD).
no code implementations • 14 Jun 2020 • Bing Yu, Ke Sun, He Wang, Zhouchen Lin, Zhanxing Zhu
The scarcity of class-labeled data is a ubiquitous bottleneck in many machine learning problems.
no code implementations • 8 Jun 2020 • Weidi Sun, Yuteng Lu, Xiyue Zhang, Zhanxing Zhu, Meng Sun
The wide deployment of deep neural networks, though achieving great success in many domains, raises severe safety and reliability concerns.
no code implementations • NeurIPS 2020 • Dinghuai Zhang, Mao Ye, Chengyue Gong, Zhanxing Zhu, Qiang Liu
Randomized classifiers have been shown to provide a promising approach for achieving certified robustness against adversarial attacks in deep learning.
no code implementations • 21 Nov 2019 • Ke Sun, Bing Yu, Zhouchen Lin, Zhanxing Zhu
Regularization plays a crucial role in machine learning models, especially for deep neural networks.
no code implementations • 18 Nov 2019 • Ruosi Wan, Haoyi Xiong, Xingjian Li, Zhanxing Zhu, Jun Huan
The empirical results show that the proposed descent-direction estimation strategy, DTNH, can consistently improve the performance of deep transfer learning tasks based on all of the above regularizers, even when transferring pre-trained weights from inappropriate networks.
no code implementations • 25 Sep 2019 • Dinghuai Zhang*, Mao Ye*, Chengyue Gong*, Zhanxing Zhu, Qiang Liu
Randomized classifiers have been shown to provide a promising approach for achieving certified robustness against adversarial attacks in deep learning.
no code implementations • 20 Aug 2019 • He Wang, Edmond S. L. Ho, Hubert P. H. Shum, Zhanxing Zhu
In this paper, we propose a new deep network to tackle these challenges by creating a natural motion manifold that is versatile for many applications.
1 code implementation • ICLR 2021 • Ke Sun, Zhanxing Zhu, Zhouchen Lin
The design of deep graph models remains largely unexplored, and the crucial question is how to explore and exploit the knowledge from different hops of neighbors in an efficient way.
Ranked #2 on Node Classification on MS ACADEMIC
1 code implementation • ICML 2020 • Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu
The gradient noise of SGD is considered to play a central role in the observed strong generalization abilities of deep learning.
2 code implementations • 30 May 2019 • Quanming Yao, Ju Xu, Wei-Wei Tu, Zhanxing Zhu
Recently, DARTS, which constructs a differentiable search space and then optimizes it by gradient descent, has obtained high-performance architectures and reduced the search time to several days.
no code implementations • 24 May 2019 • Bing Yu, Junzhao Zhang, Zhanxing Zhu
Convolutional neural networks (CNNs) have achieved remarkable performance in various fields, particularly in the domain of computer vision.
1 code implementation • 23 May 2019 • Tianyuan Zhang, Zhanxing Zhu
Our findings shed some light on why AT-CNNs are more robust than normally trained ones, and contribute to a better understanding of adversarial training of CNNs from an interpretation perspective.
no code implementations • 10 May 2019 • Ju Xu, Jin Ma, Zhanxing Zhu
Though neural networks have achieved much progress in various applications, it is still highly challenging for them to learn from a continuous stream of tasks without forgetting.
2 code implementations • NeurIPS 2019 • Dinghuai Zhang, Tianyuan Zhang, Yiping Lu, Zhanxing Zhu, Bin Dong
Adversarial training, typically formulated as a robust optimization problem, is an effective way of improving the robustness of deep networks.
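The robust optimization formulation referred to here is conventionally written as the min-max problem below; the notation (loss ℓ, classifier f_θ, perturbation budget ε under a p-norm) is the standard one, not quoted from the paper.

```latex
\min_{\theta}\ \mathbb{E}_{(x,y)\sim\mathcal{D}}
\Big[\max_{\|\delta\|_{p}\le\epsilon} \ell\big(f_{\theta}(x+\delta),\, y\big)\Big]
```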
no code implementations • ICLR 2019 • Lei Wu, Zhanxing Zhu, Cheng Tai
State-of-the-art deep neural networks are vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs.
no code implementations • ICLR 2019 • Wenpeng Hu, Zhengwei Tao, Zhanxing Zhu, Bing Liu, Zhou Lin, Jinwen Ma, Dongyan Zhao, Rui Yan
A large amount of parallel data is needed to train a strong neural machine translation (NMT) system.
no code implementations • ICLR 2019 • Xuanyang Zhang, Hao Liu, Zhanxing Zhu, Zenglin Xu
Deep neural networks have achieved outstanding performance in many real-world applications with the expense of huge computational resources.
no code implementations • ICLR 2019 • Haoyi Xiong, Wenqing Hu, Zhanxing Zhu, Xinjian Li, Yunchao Zhang, Jun Huan
Derivative-free optimization (DFO) using trust-region methods is frequently used in machine learning applications, such as (hyper-)parameter optimization, where the derivatives of the objective function are unknown.
no code implementations • ICLR 2019 • Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, Jinwen Ma
Along this line, we theoretically study a general form of gradient based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics.
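One conventional way to write such a unified dynamics is the stochastic differential equation below, where the choice of noise covariance recovers either SGD or standard Langevin dynamics; the scaling η/B and the symbols C(θ), T, B follow common convention rather than the paper's exact notation.

```latex
d\theta_t = -\nabla L(\theta_t)\,dt + \Sigma^{1/2}(\theta_t)\,dW_t,
\qquad
\Sigma(\theta) =
\begin{cases}
\frac{\eta}{B}\, C(\theta) & \text{SGD: anisotropic minibatch-gradient covariance,}\\[4pt]
2T\, I & \text{standard Langevin dynamics: isotropic thermal noise.}
\end{cases}
```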
no code implementations • 13 Mar 2019 • Bing Yu, Haoteng Yin, Zhanxing Zhu
In this U-shaped network, a paired sampling operation is proposed in the spatio-temporal domain: the pooling (ST-Pool) coarsens the input graph spatially from its deterministic partition while abstracting multi-resolution temporal dependencies through dilated recurrent skip connections; mirroring the settings used in downsampling, the unpooling (ST-Unpool) restores the original structure of the spatio-temporal graphs and resumes regular intervals within graph sequences (a minimal sketch of the spatial half of this pairing is given after this entry).
Ranked #1 on Traffic Prediction on PeMS-M
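A minimal sketch of the spatial half of the paired pooling/unpooling described above, assuming a hard partition (assignment) matrix; the temporal side (dilated recurrent skip connections) and all function names are illustrative assumptions.

```python
import numpy as np

def st_pool(x, assignment):
    """Coarsen node features via a fixed (deterministic) partition of the graph.
    x: (num_nodes, feat); assignment: (num_nodes, num_clusters) hard 0/1 matrix."""
    cluster_size = assignment.sum(axis=0, keepdims=True)          # (1, num_clusters)
    return (assignment.T @ x) / np.maximum(cluster_size.T, 1.0)   # mean over each cluster

def st_unpool(x_coarse, assignment):
    """Restore the original graph resolution by copying each cluster feature
    back to the nodes assigned to it (the inverse of the pooling above)."""
    return assignment @ x_coarse
```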
no code implementations • 3 Mar 2019 • Bing Yu, Mengzhang Li, Jiyong Zhang, Zhanxing Zhu
(2) We propose a novel 3D graph convolution model to capture the spatio-temporal data more accurately.
Ranked #2 on Traffic Prediction on PeMS-M
no code implementations • 28 Feb 2019 • Ke Sun, Zhanxing Zhu, Zhouchen Lin
In this paper, we present a systematic study of adversarial examples from three aspects: the amount of training data, task-dependent factors, and model-specific factors.
no code implementations • 28 Feb 2019 • Ke Sun, Zhanxing Zhu, Zhouchen Lin
In this work, we propose a novel defense mechanism called Boundary Conditional GAN to enhance the robustness of deep neural networks against adversarial examples.
no code implementations • 28 Feb 2019 • Ke Sun, Zhouchen Lin, Hantao Guo, Zhanxing Zhu
The effectiveness of Graph Convolutional Networks (GCNs) has been demonstrated in a wide range of graph-based machine learning tasks.
1 code implementation • 28 Feb 2019 • Ke Sun, Zhouchen Lin, Zhanxing Zhu
In this paper, we propose a novel training algorithm for Graph Convolutional Networks, called the Multi-Stage Self-Supervised (M3S) Training Algorithm, which incorporates a self-supervised learning approach and focuses on improving the generalization performance of GCNs on graphs with few labeled nodes.
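A simplified sketch of the multi-stage self-training loop, assuming user-supplied `train_gcn` and `predict` callables; the confidence-only selection rule below stands in for the full algorithm, which additionally checks pseudo-labels against a self-supervised clustering step.

```python
import numpy as np

def m3s_style_self_training(train_gcn, predict, labels, labeled_idx,
                            num_stages=3, per_class=10):
    """At each stage, train a GCN on the current labelled set, then promote the
    most confident unlabelled predictions of each class to pseudo-labels."""
    labels = labels.copy()
    labeled_idx = set(labeled_idx)
    for _ in range(num_stages):
        model = train_gcn(labels, sorted(labeled_idx))
        probs = predict(model)                        # (num_nodes, num_classes)
        conf, pred = probs.max(axis=1), probs.argmax(axis=1)
        for c in range(probs.shape[1]):
            # Most confident unlabelled nodes currently predicted as class c.
            candidates = [i for i in np.argsort(-conf)
                          if i not in labeled_idx and pred[i] == c][:per_class]
            for i in candidates:
                labels[i] = c
                labeled_idx.add(i)
    return model
```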
no code implementations • 18 Jan 2019 • Wenqing Hu, Zhanxing Zhu, Haoyi Xiong, Jun Huan
We show in this case that the quasi-potential function is related to the noise covariance structure of SGD via a partial differential equation of Hamilton-Jacobi type.
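For orientation, the generic Freidlin-Wentzell form of such a Hamilton-Jacobi relation, for a small-noise diffusion dθ_t = b(θ_t)dt + √ε σ(θ_t)dW_t with quasi-potential V and noise covariance D = σσ^T, is given below; this is the textbook statement, not necessarily the exact equation derived in the paper.

```latex
\big\langle b(\theta),\, \nabla V(\theta) \big\rangle
+ \tfrac{1}{2}\, \nabla V(\theta)^{\top} D(\theta)\, \nabla V(\theta) = 0,
\qquad D(\theta) = \sigma(\theta)\sigma(\theta)^{\top}.
```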
no code implementations • NeurIPS 2018 • Nanyang Ye, Zhanxing Zhu
In this work, a novel robust training framework, Bayesian Robust Learning, is proposed to alleviate this issue: a distribution is placed over the adversarial data-generating distribution to account for the uncertainty of the adversarial data-generating process.
1 code implementation • CVPR 2019 • Bing Yu, Jingfeng Wu, Jinwen Ma, Zhanxing Zhu
The proposed TNAR is composed of two complementary parts: the tangent adversarial regularization (TAR) and the normal adversarial regularization (NAR).
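One way to write the two complementary terms, assuming T_x denotes the tangent space of a learned data manifold at x and D a divergence such as KL; the symbols ε₁ and ε₂ are illustrative, not the paper's notation.

```latex
\mathcal{R}_{\mathrm{TNAR}}(x)
= \underbrace{\max_{r \in \mathcal{T}_x,\ \|r\|\le\epsilon_1}
    \mathrm{D}\big(p(y\mid x),\, p(y\mid x+r)\big)}_{\text{tangent (TAR)}}
\;+\;
\underbrace{\max_{r \perp \mathcal{T}_x,\ \|r\|\le\epsilon_2}
    \mathrm{D}\big(p(y\mid x),\, p(y\mid x+r)\big)}_{\text{normal (NAR)}}.
```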
no code implementations • 1 Jun 2018 • Ruosi Wan, Mingjun Zhong, Haoyi Xiong, Zhanxing Zhu
In statistics and machine learning, approximation of an intractable integral is often achieved with the unbiased Monte Carlo estimator, but the variance of the estimate is generally high in many applications.
1 code implementation • NeurIPS 2018 • Ju Xu, Zhanxing Zhu
In this work, a novel approach to continual learning is proposed, which searches for the best neural architecture for each incoming task via carefully designed reinforcement learning strategies.
1 code implementation • ICLR 2019 • Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, Jinwen Ma
Along this line, we study a general form of gradient based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics.
no code implementations • 27 Feb 2018 • Lei Wu, Zhanxing Zhu, Cheng Tai, Weinan E
State-of-the-art deep neural networks are known to be vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs.
no code implementations • ICLR 2018 • Lei Wu, Zhanxing Zhu, Cheng Tai, Weinan E
Deep neural networks provide state-of-the-art performance for many applications of interest.
1 code implementation • NeurIPS 2018 • Rui Luo, Jianhong Wang, Yaodong Yang, Zhanxing Zhu, Jun Wang
We propose a new sampling method, the thermostat-assisted continuously-tempered Hamiltonian Monte Carlo, for Bayesian learning on large datasets and multimodal distributions.
5 code implementations • 14 Sep 2017 • Bing Yu, Haoteng Yin, Zhanxing Zhu
Timely and accurate traffic forecasting is crucial for urban traffic control and guidance.
Ranked #2 on Time Series Forecasting on PeMSD7
no code implementations • 30 Jun 2017 • Lei Wu, Zhanxing Zhu, Weinan E
It is widely observed that deep learning models with learned parameters generalize well, even when they have many more parameters than the number of training samples.
no code implementations • ACL 2017 • Bingfeng Luo, Yansong Feng, Zheng Wang, Zhanxing Zhu, Songfang Huang, Rui Yan, Dongyan Zhao
We show that the dynamic transition matrix can effectively characterize the noise in the training data built by distant supervision.
no code implementations • NeurIPS 2017 • Nanyang Ye, Zhanxing Zhu, Rafal K. Mantiuk
Minimizing non-convex and high-dimensional objective functions is challenging, especially when training modern deep neural networks.
no code implementations • 23 Nov 2015 • Zhanxing Zhu, Amos J. Storkey
We consider convex-concave saddle point problems with a separable structure and non-strongly convex functions.
no code implementations • NeurIPS 2015 • Xiaocheng Shang, Zhanxing Zhu, Benedict Leimkuhler, Amos J. Storkey
Monte Carlo sampling for Bayesian posterior inference is a common approach used in machine learning.
no code implementations • 12 Jun 2015 • Zhanxing Zhu, Amos J. Storkey
We consider a generic convex-concave saddle point problem with separable structure, a form that covers a wide range of machine learning applications.