no code implementations • 18 Oct 2024 • Shih-Hsin Wang, Justin Baker, Cory Hauck, Bao Wang
(2) Building on our geometric insights, we augment the message-passing process of graph convolutional layers (GCLs) with a learnable term that modulates the smoothness of node features while remaining computationally efficient.
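As a rough illustration of the idea, the sketch below adds a single learnable coefficient to a vanilla graph convolutional layer so that the degree of neighbor smoothing is trained rather than fixed; the specific form of the modulation term is an assumption for illustration, not the authors' exact construction.

```python
import torch
import torch.nn as nn

class SmoothnessModulatedGCL(nn.Module):
    """GCL with a learnable term that modulates how strongly node features
    are smoothed toward their neighbors. Hypothetical sketch: the scalar
    `lam` and the interpolation form are assumptions, not the paper's layer."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
        self.lam = nn.Parameter(torch.zeros(1))  # learnable smoothness modulation

    def forward(self, x, adj):
        # Row-normalized neighbor aggregation (standard message passing).
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        agg = adj @ x / deg
        # Learnable interpolation between smoothed and original features.
        h = x + torch.sigmoid(self.lam) * (agg - x)
        return torch.relu(self.lin(h))

# Usage: x = torch.randn(5, 8); adj = (torch.rand(5, 5) > 0.5).float()
# out = SmoothnessModulatedGCL(8, 16)(x, adj)
```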
no code implementations • 11 Aug 2024 • Wenqi Tao, Huaming Ling, Zuoqiang Shi, Bao Wang
Protecting data privacy in deep learning (DL) is of crucial importance.
1 code implementation • 11 Aug 2022 • Zhemin Li, Tao Sun, Hongxia Wang, Bao Wang
Theoretically, we show that the adaptive regularization of AIR enhances the implicit regularization and vanishes at the end of training.
no code implementations • 1 Aug 2022 • Tan Nguyen, Richard G. Baraniuk, Robert M. Kirby, Stanley J. Osher, Bao Wang
Transformers have achieved remarkable success in sequence modeling and beyond but suffer from quadratic computational and memory complexities with respect to the length of the input sequence.
no code implementations • 19 Apr 2022 • Justin Baker, Hedi Xia, Yiwei Wang, Elena Cherkaev, Akil Narayan, Long Chen, Jack Xin, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang
Learning neural ODEs often requires solving very stiff ODE systems, primarily using explicit adaptive step size ODE solvers.
1 code implementation • 24 Feb 2022 • Justin Baker, Elena Cherkaev, Akil Narayan, Bao Wang
We compare HBNODE with other popular ROMs on several complex dynamical systems, including the von Kármán Street flow, the Kurganov-Petrova-Popov equation, and the one-dimensional Euler equations for fluids modeling.
no code implementations • 22 Jan 2022 • Yunling Zheng, Carson Hu, Guang Lin, Meng Yue, Bao Wang, Jack Xin
Due to the sparsified queries, GLassoformer is more computationally efficient than the standard transformers.
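The sketch below shows a generic group Lasso penalty of the kind commonly used to drive whole groups of query-projection weights to zero; the grouping and weighting here are illustrative assumptions, not GLassoformer's exact regularizer.

```python
import torch

def group_lasso_penalty(weight, group_dim=0):
    """Group Lasso penalty: sum of L2 norms of weight groups (rows or columns
    of a projection matrix). Zeroing whole groups is one way to obtain
    sparsified queries; a generic sketch, not the paper's exact setup."""
    return weight.norm(dim=group_dim).sum()

# Usage: penalize the query projection of an attention layer (illustrative).
# q_proj = torch.nn.Linear(64, 64)
# loss = task_loss + 1e-3 * group_lasso_penalty(q_proj.weight, group_dim=1)
```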
no code implementations • 12 Dec 2021 • Yifan Hua, Kevin Miller, Andrea L. Bertozzi, Chen Qian, Bao Wang
As such, our proposed overlay networks accelerate convergence, improve generalization, and enhance robustness to client failures in DFL with theoretical guarantees.
1 code implementation • 18 Oct 2021 • Tao Sun, Huaming Ling, Zuoqiang Shi, Dongsheng Li, Bao Wang
In this paper, to eliminate the effort for tuning the momentum-related hyperparameter, we propose a new adaptive momentum inspired by the optimal choice of the heavy ball momentum for quadratic optimization.
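A minimal sketch of the idea, assuming a simple secant-based curvature estimate: for quadratics the optimal heavy ball momentum is $((\sqrt{L}-\sqrt{\mu})/(\sqrt{L}+\sqrt{\mu}))^2$, so the loop below tracks crude estimates of the largest and smallest curvature along the trajectory and sets the momentum from them. The estimator and step size are illustrative, not the paper's exact scheme.

```python
import numpy as np

def adaptive_heavy_ball(grad, x0, lr=0.1, steps=100):
    """Heavy ball iteration whose momentum is re-estimated each step from
    observed gradients (secant curvature). Hypothetical sketch of the idea."""
    x_prev, x = x0.copy(), x0.copy()
    g_prev = grad(x)
    L_est, mu_est = 1.0, 1.0
    for _ in range(steps):
        g = grad(x)
        dx = x - x_prev
        if np.linalg.norm(dx) > 1e-12:
            curv = np.linalg.norm(g - g_prev) / np.linalg.norm(dx)
            L_est, mu_est = max(L_est, curv), min(mu_est, curv)
        beta = ((np.sqrt(L_est) - np.sqrt(mu_est)) / (np.sqrt(L_est) + np.sqrt(mu_est))) ** 2
        x_prev, x = x, x - lr * g + beta * dx
        g_prev = g
    return x

# Usage on a toy quadratic 0.5 * x^T A x:
# A = np.diag([1.0, 10.0]); x = adaptive_heavy_ball(lambda v: A @ v, np.array([5.0, 5.0]))
```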
no code implementations • 13 Oct 2021 • Bao Wang, Hedi Xia, Tan Nguyen, Stanley Osher
As case studies, we consider how momentum can improve the architecture design for recurrent neural networks (RNNs), neural ordinary differential equations (ODEs), and transformers.
2 code implementations • 12 Oct 2021 • Zhemin Li, Tao Sun, Hongxia Wang, Bao Wang
Theoretically, we show that the adaptive regularization of AIR enhances the implicit regularization and vanishes at the end of training.
1 code implementation • NeurIPS 2021 • Hedi Xia, Vai Suliafu, Hangjie Ji, Tan M. Nguyen, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang
We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of the classical momentum accelerated gradient descent, to improve neural ODEs (NODEs) training and inference.
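For intuition, the heavy ball ODE $\ddot{x} + \gamma \dot{x} = f(x)$ can be written as the first-order system $\dot{x} = m$, $\dot{m} = -\gamma m + f(x)$; the sketch below integrates that system with a fixed-step Euler loop. The network `f`, the damping $\gamma$, and the integrator are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class HBNODEFunc(nn.Module):
    """Heavy ball NODE dynamics written as a first-order system in (x, m)."""
    def __init__(self, dim, gamma=1.0):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))
        self.gamma = gamma

    def forward(self, state):
        x, m = state
        return m, -self.gamma * m + self.f(x)

def euler_integrate(func, x0, steps=20, dt=0.05):
    # Fixed-step Euler integration, starting from zero "momentum" state.
    x, m = x0, torch.zeros_like(x0)
    for _ in range(steps):
        dx, dm = func((x, m))
        x, m = x + dt * dx, m + dt * dm
    return x

# Usage: out = euler_integrate(HBNODEFunc(dim=4), torch.randn(3, 4))
```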
no code implementations • ICLR 2022 • Matthew Thorpe, Tan Minh Nguyen, Hedi Xia, Thomas Strohmer, Andrea Bertozzi, Stanley Osher, Bao Wang
We propose GRAph Neural Diffusion with a source term (GRAND++) for graph deep learning with a limited number of labeled nodes, i.e., a low labeling rate.
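The sketch below shows bare-bones "diffusion plus source" dynamics on a graph with an explicit Euler loop; the actual GRAND++ operator uses learned, attention-based diffusivity, so this is only a schematic of the structure, not the method itself.

```python
import torch

def diffusion_with_source(x0, adj, source, steps=10, dt=0.1):
    """Explicit-Euler graph diffusion with an added source term:
    x_{t+1} = x_t + dt * (A_hat @ x_t - x_t + source).
    Schematic only; GRAND++'s learned diffusivity is more involved."""
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
    a_hat = adj / deg                     # row-normalized adjacency
    x = x0
    for _ in range(steps):
        x = x + dt * (a_hat @ x - x + source)
    return x

# Usage: x0 = torch.randn(6, 3); adj = (torch.rand(6, 6) > 0.5).float()
# out = diffusion_with_source(x0, adj, source=torch.zeros_like(x0))
```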
no code implementations • NeurIPS 2021 • Tan M. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang
For instance, FMMformers achieve an average classification accuracy of $60.74\%$ over the five Long Range Arena tasks, which is significantly better than the standard transformer's average accuracy of $58.70\%$.
no code implementations • 23 Apr 2021 • Tao Sun, Dongsheng Li, Bao Wang
In FedAvg, clients keep their data locally for privacy protection; a central parameter server is used to communicate between clients.
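For reference, the sketch below is a textbook FedAvg round: each client runs a few local gradient steps from the current global weights and the server averages the results; weighting by client data size and other refinements are omitted.

```python
import numpy as np

def fedavg_round(global_w, client_data, grad, local_steps=5, lr=0.1):
    """One FedAvg communication round: local SGD on each client, then a
    simple (unweighted) server average of the local models."""
    local_models = []
    for data in client_data:
        w = global_w.copy()
        for _ in range(local_steps):
            w = w - lr * grad(w, data)   # local step on the client's own data
        local_models.append(w)
    return np.mean(local_models, axis=0)  # server aggregation
```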
no code implementations • 22 Apr 2021 • Matthew Thorpe, Bao Wang
Graph Laplacian (GL)-based semi-supervised learning is one of the most used approaches for classifying nodes in a graph.
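As a concrete reference point, the sketch below is classic Laplacian-regularized label propagation (harmonic extension of the labels over the graph); it illustrates the GL-based setup being analyzed, not any new method from the paper.

```python
import numpy as np

def graph_laplacian_ssl(W, labeled_idx, labels, n_classes):
    """Harmonic extension of one-hot labels over a weighted graph:
    solve L_uu * U = -L_ul * Y_l for the unlabeled block, then argmax."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                     # unnormalized graph Laplacian
    unlabeled = np.setdiff1d(np.arange(n), labeled_idx)
    Y = np.zeros((n, n_classes))
    Y[labeled_idx, labels] = 1.0                       # one-hot labels on labeled nodes
    L_uu = L[np.ix_(unlabeled, unlabeled)]
    L_ul = L[np.ix_(unlabeled, labeled_idx)]
    Y[unlabeled] = np.linalg.solve(L_uu + 1e-8 * np.eye(len(unlabeled)),
                                   -L_ul @ Y[labeled_idx])
    return Y.argmax(axis=1)
```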
no code implementations • 2 Feb 2021 • Tao Sun, Dongsheng Li, Bao Wang
The stability and generalization of stochastic gradient-based methods provide valuable insights into understanding the algorithmic performance of machine learning models.
no code implementations • 1 Jan 2021 • Matthew Thorpe, Bao Wang
Within a certain adversarial perturbation regime, we prove that GL with a $k$-nearest neighbor graph is intrinsically more robust than the $k$-nearest neighbor classifier.
no code implementations • 3 Dec 2020 • Bao Wang, Qiang Ye
In this paper, we propose a novel adaptive momentum for improving DNN training; this adaptive momentum, which requires no momentum-related hyperparameter, is motivated by the nonlinear conjugate gradient (NCG) method.
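A hedged sketch of the general idea: set the momentum coefficient each step with a Polak-Ribière-type formula borrowed from nonlinear conjugate gradient, so no momentum hyperparameter needs tuning. The exact formula and safeguards used in the paper may differ.

```python
import numpy as np

def ncg_momentum_descent(grad, x0, lr=0.05, steps=200):
    """Gradient iteration whose momentum coefficient is set each step by a
    PR+ (Polak-Ribiere, clipped at zero) formula. Illustrative sketch only."""
    x = x0.copy()
    g_prev = grad(x)
    d = -g_prev                                   # initial search direction
    for _ in range(steps):
        x = x + lr * d
        g = grad(x)
        beta = max(0.0, g @ (g - g_prev) / (g_prev @ g_prev + 1e-12))  # PR+
        d = -g + beta * d
        g_prev = g
    return x

# Usage on a toy quadratic:
# A = np.diag([1.0, 10.0]); x = ncg_momentum_descent(lambda v: A @ v, np.array([3.0, -2.0]))
```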
no code implementations • 30 Nov 2020 • Ti Bai, Biling Wang, Dan Nguyen, Bao Wang, Bin Dong, Wenxiang Cong, Mannudeep K. Kalra, Steve Jiang
However, there exist two challenges regarding DL-based denoisers: 1) a trained model typically does not generate different image candidates with different noise-resolution tradeoffs, which are sometimes needed for different clinical tasks; and 2) the model's generalizability might be an issue when the noise level in the testing images differs from that in the training dataset.
1 code implementation • 31 Aug 2020 • Zhijian Li, Bao Wang, Jack Xin
To address the problem that adversarial training jeopardizes both DNNs' accuracy on clean images and their channel sparsity, we design a trade-off loss function that helps DNNs preserve natural accuracy while improving channel sparsity.
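One hypothetical form of such a trade-off objective, sketched below, mixes the clean and adversarial cross-entropy losses and adds a channel-wise group-sparsity penalty; the weighting scheme is an assumption for illustration, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def tradeoff_loss(model, x_clean, x_adv, y, alpha=0.5, lam=1e-4):
    """Hypothetical trade-off objective: blend clean and adversarial
    cross-entropy, plus a per-channel group-sparsity penalty on conv filters."""
    loss = alpha * F.cross_entropy(model(x_clean), y) \
         + (1 - alpha) * F.cross_entropy(model(x_adv), y)
    for m in model.modules():
        if isinstance(m, torch.nn.Conv2d):
            # sum of per-output-channel filter norms encourages channel sparsity
            loss = loss + lam * m.weight.flatten(1).norm(dim=1).sum()
    return loss
```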
2 code implementations • NeurIPS 2020 • Tan M. Nguyen, Richard G. Baraniuk, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang
Designing deep neural networks is an art that often involves an expensive search over candidate architectures.
no code implementations • 21 May 2020 • Hui Long, Ming Chen, Zhaohui Yang, Bao Wang, Zhiyang Li, Xu Yun, Mohammad Shikh-Bahaei
This paper investigates the problem of secure energy efficiency maximization for a reconfigurable intelligent surface (RIS) assisted uplink wireless communication system, where an unmanned aerial vehicle (UAV) equipped with an RIS works as a mobile relay between the base station (BS) and a group of users.
no code implementations • 1 May 2020 • Zhicong Liang, Bao Wang, Quanquan Gu, Stanley Osher, Yuan YAO
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
no code implementations • 2 Mar 2020 • Thu Dinh, Bao Wang, Andrea L. Bertozzi, Stanley J. Osher
In this paper, we focus on a co-design of efficient DNN compression algorithms and sparse neural architectures for robust and accurate deep learning.
1 code implementation • 24 Feb 2020 • Bao Wang, Tan M. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher
Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimization using a specially designed momentum; however, it accumulates error when an inexact gradient is used (such as in SGD), slowing convergence at best and diverging at worst.
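One standard way to curb that error accumulation is to periodically restart the momentum schedule; the sketch below shows plain NAG with such a scheduled restart, with the schedule and restart period chosen for illustration rather than taken from the paper.

```python
import numpy as np

def nag_with_restart(grad, x0, lr=0.1, steps=300, restart_every=40):
    """Nesterov accelerated gradient with a periodic momentum restart.
    Restarting resets the momentum schedule so that errors from inexact
    (e.g. stochastic) gradients do not keep accumulating."""
    x, x_prev = x0.copy(), x0.copy()
    t = 1.0
    for k in range(steps):
        mom = (t - 1.0) / (t + 2.0)          # simple NAG-style momentum schedule
        y = x + mom * (x - x_prev)           # look-ahead point
        x_prev, x = x, y - lr * grad(y)
        t = 1.0 if (k + 1) % restart_every == 0 else t + 1.0  # scheduled restart
    return x
```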
1 code implementation • 2 Nov 2019 • Bao Wang, Difan Zou, Quanquan Gu, Stanley Osher
As an important Markov Chain Monte Carlo (MCMC) method, the stochastic gradient Langevin dynamics (SGLD) algorithm has achieved great success in Bayesian learning and posterior sampling.
1 code implementation • 16 Jul 2019 • Bao Wang, Stanley J. Osher
The proposed DNN with graph interpolating activation integrates the advantages of both deep learning and manifold learning.
1 code implementation • 28 Jun 2019 • Bao Wang, Quanquan Gu, March Boedihardjo, Farzin Barekat, Stanley J. Osher
At the core of DP-LSSGD is the Laplacian smoothing, which smooths out the Gaussian noise used in the Gaussian mechanism.
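The smoothing step itself can be sketched as solving $(I - \sigma\Delta)u = v$ with a periodic 1D discrete Laplacian via the FFT, applied to the noise-perturbed gradient before the update; the code below is a schematic of that operator (gradient clipping and privacy accounting are omitted, and the hyperparameters are illustrative).

```python
import numpy as np

def laplacian_smooth(v, sigma=1.0):
    """Solve (I - sigma * Delta) u = v in the Fourier domain, where Delta is the
    1D discrete Laplacian with periodic boundary conditions; this damps the
    high-frequency Gaussian noise while preserving the low-frequency signal."""
    n = v.size
    eig = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(n) / n)  # eigenvalues of -Delta
    return np.real(np.fft.ifft(np.fft.fft(v) / (1.0 + sigma * eig)))

def dp_ls_step(x, grad, lr=0.1, noise_std=0.5, sigma=1.0):
    """One schematic DP step on a 1D parameter vector: add Gaussian noise to
    the gradient, Laplacian-smooth it, then take a gradient step."""
    g = grad(x) + noise_std * np.random.randn(x.size)
    return x - lr * laplacian_smooth(g, sigma)
```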
2 code implementations • 13 Feb 2019 • Zhijian Li, Xiyang Luo, Bao Wang, Andrea L. Bertozzi, Jack Xin
We study epidemic forecasting on real-world health data by a graph-structured recurrent neural network (GSRNN).
no code implementations • 21 Jan 2019 • Lisa Maria Kreusser, Stanley J. Osher, Bao Wang
First-order methods such as gradient descent are usually the methods of choice for training machine learning models.
5 code implementations • NeurIPS 2019 • Bao Wang, Binjie Yuan, Zuoqiang Shi, Stanley J. Osher
However, both the natural accuracy (on clean images) and the robust accuracy (on adversarial images) of the trained robust models are far from satisfactory.
no code implementations • 15 Nov 2018 • Zehao Dou, Stanley J. Osher, Bao Wang
In this paper, we analyze the efficacy of the fast gradient sign method (FGSM) and the Carlini-Wagner L2 (CW-L2) attack.
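For concreteness, the sketch below is the standard one-step FGSM attack (a signed-gradient perturbation of the input, clipped back to the valid pixel range); CW-L2 requires an iterative optimization and is omitted here.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=8 / 255):
    """Fast gradient sign method: one signed-gradient step on the input."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()
```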
1 code implementation • 23 Sep 2018 • Bao Wang, Alex T. Lin, Wei Zhu, Penghang Yin, Andrea L. Bertozzi, Stanley J. Osher
We improve the robustness of deep neural nets (DNNs) to adversarial attacks by using an interpolating function as the output activation.
1 code implementation • 17 Jun 2018 • Stanley Osher, Bao Wang, Penghang Yin, Xiyang Luo, Farzin Barekat, Minh Pham, Alex Lin
We propose a class of very simple modifications of gradient descent and stochastic gradient descent.
no code implementations • ICLR 2019 • Wei Zhu, Qiang Qiu, Bao Wang, Jianfeng Lu, Guillermo Sapiro, Ingrid Daubechies
Deep neural networks (DNNs) typically have enough capacity to fit random data by brute force even when conventional data-dependent regularizations focusing on the geometry of the features are imposed.
no code implementations • 2 Apr 2018 • Bao Wang, Xiyang Luo, Fangbo Zhang, Baichuan Yuan, Andrea L. Bertozzi, P. Jeffrey Brantingham
We present a generic framework for spatio-temporal (ST) data modeling, analysis, and forecasting, with a special focus on data that is sparse in both space and time.
1 code implementation • NeurIPS 2018 • Bao Wang, Xiyang Luo, Zhen Li, Wei Zhu, Zuoqiang Shi, Stanley J. Osher
We replace the output layer of deep neural nets, typically the softmax function, by a novel interpolating function.
no code implementations • 23 Nov 2017 • Bao Wang, Penghang Yin, Andrea L. Bertozzi, P. Jeffrey Brantingham, Stanley J. Osher, Jack Xin
In this work, we first present a proper representation of crime data.
no code implementations • 9 Jul 2017 • Bao Wang, Duo Zhang, Duanhao Zhang, P. Jeffrey Brantingham, Andrea L. Bertozzi
Experiments over a half year period in Los Angeles reveal highly accurate predictive power of our models.
no code implementations • 31 Mar 2017 • Bao Wang, Zhixiong Zhao, Duc D. Nguyen, Guo-Wei Wei
The underpinning assumptions of FFT-BP are as follows: i) representability: there exists a microscopic feature vector that can uniquely characterize and distinguish one protein-ligand complex from another; ii) feature-function relationship: the macroscopic features of a complex, including its binding free energy, are functionals of the microscopic feature vectors; and iii) similarity: molecules with similar microscopic features have similar macroscopic features, such as binding affinity.