no code implementations • 15 Oct 2024 • Mingze Wang, Ruoxi Yu, Weinan E, Lei Wu
Transformers have demonstrated exceptional in-context learning capabilities, yet the theoretical understanding of the underlying mechanisms remains limited.
no code implementations • 8 Jul 2024 • Boshen Zeng, SiAn Chen, Xinxin Liu, Changhong Chen, Bin Deng, Xiaoxu Wang, Zhifeng Gao, Yuzhi Zhang, Weinan E, Linfeng Zhang
Advancements in lithium battery technology heavily rely on the design and engineering of electrolytes.
no code implementations • 1 Jul 2024 • Hongkang Yang, Zehao Lin, Wenjin Wang, Hao Wu, Zhiyu Li, Bo Tang, Wenqiang Wei, Jinbo Wang, Zeyun Tang, Shichao Song, Chenyang Xi, Yu Yu, Kai Chen, Feiyu Xiong, Linpeng Tang, Weinan E
The model is named $\text{Memory}^3$, since explicit memory is the third form of memory in LLMs after implicit memory (model parameters) and working memory (context key-values).
1 code implementation • 21 Jun 2024 • Xiaohong Ji, Zhen Wang, Zhifeng Gao, Hang Zheng, Linfeng Zhang, Guolin Ke, Weinan E
In recent years, pretraining models have made significant advancements in the fields of natural language processing (NLP), computer vision (CV), and life sciences.
1 code implementation • 31 May 2024 • Mingze Wang, Jinbo Wang, Haotian He, Zilin Wang, Guanhua Huang, Feiyu Xiong, Zhiyu Li, Weinan E, Lei Wu
In this work, we propose an Implicit Regularization Enhancement (IRE) framework to accelerate the discovery of flat solutions in deep learning, thereby improving generalization and convergence.
no code implementations • 20 May 2024 • Pinchen Xie, Yunrui Qiu, Weinan E
A data-driven ab initio generalized Langevin equation (AIGLE) approach is developed to learn and simulate high-dimensional, heterogeneous, coarse-grained conformational dynamics.
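For orientation, the generalized Langevin equation that such coarse-grained models build on takes the standard form (a generic GLE; the paper's learned parametrization may differ in detail):

$$ m\,\dot{v}(t) = F\big(x(t)\big) - \int_0^t K(t-s)\,v(s)\,ds + R(t), \qquad \big\langle R(t)\,R(0)\big\rangle = k_B T\, K(t), $$

where $K$ is the memory kernel and $R$ the fluctuating force; the second identity is the fluctuation-dissipation relation that a data-driven kernel must respect for thermodynamic consistency.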
no code implementations • 1 Feb 2024 • Mingze Wang, Weinan E
We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory.
no code implementations • 16 Jan 2024 • Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, Weinan E, Zhi-Qin John Xu
However, language model research faces significant challenges, especially for academic research groups with constrained resources.
no code implementations • 2 May 2023 • Jun Zhang, Xiaohan Lin, Weinan E, Yi Qin Gao
Multiscale molecular modeling is widely applied in scientific research of molecular properties over large time and length scales.
no code implementations • 5 Feb 2023 • Zeping Min, Qian Ge, Zhong Li, Weinan E
Furthermore, in the ASR task, MAC beats wav2vec2 (with fine-tuning) on the Common Voice Cantonese dataset and achieves highly competitive results on the Common Voice Taiwanese and Japanese datasets.
Automatic Speech Recognition (ASR)
no code implementations • 9 Jan 2022 • Tianhan Zhang, Yuxiao Yi, Yifan Xu, Zhi X. Chen, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu
The current work aims to understand two basic questions regarding the deep neural network (DNN) method: what data the DNN needs and how general the DNN method can be.
no code implementations • 6 Jan 2022 • Zhiwei Wang, Yaoyu Zhang, Enhan Zhao, Yiguang Ju, Weinan E, Zhi-Qin John Xu, Tianhan Zhang
The mechanism reduction is modeled as an optimization problem on Boolean space, where a Boolean vector, each entry corresponding to a species, represents a reduced mechanism.
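As an illustration of this encoding, the sketch below scores a candidate Boolean vector; the species list, error function, and penalty weight are hypothetical stand-ins, not the paper's setup:

```python
import numpy as np

# Hypothetical full mechanism with five species; entry i of `keep`
# says whether species[i] is retained in the reduced mechanism.
species = ["H2", "O2", "H2O", "OH", "HO2"]
keep = np.array([1, 1, 1, 1, 0], dtype=bool)  # one candidate reduction

def reduction_objective(keep, simulation_error, size_penalty=1e-3):
    """Score a candidate: simulation error of the reduced mechanism
    plus a penalty on the number of retained species."""
    return simulation_error(keep) + size_penalty * keep.sum()

# Stand-in for an expensive combustion simulation that compares the
# reduced mechanism against the full one (e.g., ignition delay error).
mock_error = lambda k: 0.0 if k[:4].all() else 1.0
print(reduction_objective(keep, mock_error))  # 0.004
```

An optimizer over Boolean space (e.g., a genetic or probabilistic search) then looks for vectors that keep the objective small with as few species as possible.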
no code implementations • 29 Dec 2021 • Lidong Fang, Pei Ge, Lei Zhang, Weinan E, Huan Lei
A long-standing problem in the modeling of non-Newtonian hydrodynamics of polymeric flows is the availability of reliable and interpretable hydrodynamic models that faithfully encode the underlying micro-scale polymer dynamics.
no code implementations • 29 Dec 2021 • Jiequn Han, Yucheng Yang, Weinan E
An efficient, reliable, and interpretable global solution method, the Deep learning-based algorithm for Heterogeneous Agent Models (DeepHAM), is proposed for solving high dimensional heterogeneous agent models with aggregate shocks.
no code implementations • 8 Jul 2021 • Hongkang Yang, Weinan E
The generative adversarial network (GAN) is a well-known model for learning high-dimensional distributions, but the mechanism for its generalization ability is not understood.
no code implementations • 8 Jul 2021 • Lulu Zhang, Tao Luo, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu, Zheng Ma
In this paper, we propose a machine learning approach via model-operator-data network (MOD-Net) for solving PDEs.
no code implementations • 15 Apr 2021 • Jihao Long, Jiequn Han, Weinan E
Reinforcement learning (RL) algorithms based on high-dimensional function approximation have achieved tremendous empirical success in large-scale problems with an enormous number of states.
no code implementations • 9 Feb 2021 • Linfeng Zhang, Han Wang, Roberto Car, Weinan E
Using the Deep Potential methodology, we construct a model that accurately reproduces the potential energy surface of the SCAN approximation of density functional theory for water, from low temperature and pressure to about 2400 K and 50 GPa, excluding the vapor stability region.
Chemical Physics
no code implementations • 10 Dec 2020 • Weinan E, Stephan Wojtowytsch
A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer.
no code implementations • 2 Dec 2020 • Weinan E, Stephan Wojtowytsch
We use explicit representation formulas to show that solutions to certain partial differential equations lie in Barron spaces or multilayer spaces if the PDE data lie in such function spaces.
no code implementations • 29 Nov 2020 • Hongkang Yang, Weinan E
Models for learning probability distributions such as generative models and density estimators behave quite differently from models for learning functions.
no code implementations • 24 Nov 2020 • Tianhan Zhang, Yaoyu Zhang, Weinan E, Yiguang Ju
Moreover, the ignition delay time differences are within 1%.
no code implementations • NeurIPS 2020 • Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, Weinan E
The result shows that (1) the escaping time of both SGD and ADAM depends on the Radon measure of the basin positively and the heaviness of gradient noise negatively; (2) for the same basin, SGD enjoys a smaller escaping time than ADAM, mainly because (a) the geometry adaptation in ADAM via adaptively scaling each gradient coordinate well diminishes the anisotropic structure in gradient noise and results in a larger Radon measure of the basin; (b) the exponential gradient average in ADAM smooths its gradient and leads to lighter gradient noise tails than SGD.
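For orientation, escaping-time analyses of this kind typically model the iterates by a Lévy-driven stochastic differential equation; a schematic version (not the paper's exact process) is

$$ dx_t = -\nabla f(x_t)\,dt + \epsilon\, dL_t^{\alpha}, $$

where $L_t^{\alpha}$ is an $\alpha$-stable Lévy process capturing heavy-tailed gradient noise: heavier tails (smaller $\alpha$) and a smaller basin Radon measure both shorten the expected time to jump out of a basin.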
no code implementations • 11 Oct 2020 • Yucheng Yang, Zhong Zheng, Weinan E
In this paper, we propose a class of interpretable neural network models that can achieve both high prediction accuracy and interpretability.
no code implementations • 11 Oct 2020 • Yucheng Yang, Yue Pang, Guanhua Huang, Weinan E
The current knowledge system of macroeconomics is built on interactions among a small number of variables, since traditional macroeconomic models can mostly handle a handful of inputs.
no code implementations • 28 Sep 2020 • Weinan E, Stephan Wojtowytsch
We consider binary and multi-class classification problems using hypothesis classes of neural networks.
no code implementations • 22 Sep 2020 • Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu
The purpose of this article is to review the achievements made in the last few years towards the understanding of the reasons behind the success and subtleties of neural network-based machine learning.
no code implementations • ICLR 2021 • Zhong Li, Jiequn Han, Weinan E, Qianxiao Li
We study the approximation properties and optimization dynamics of recurrent neural networks (RNNs) when applied to learn input-output relationships in temporal data.
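In the linear setting from which such analyses typically start (shown here for orientation, assuming a stable matrix $W$), the model is

$$ \dot h(t) = W h(t) + U x(t), \qquad y(t) = c^\top h(t), $$

so that $y(t) = \int_0^{\infty} c^\top e^{W s} U\, x(t-s)\,ds$: the RNN realizes a linear functional with an exponentially parametrized memory kernel, which is what links approximation quality to how fast memory decays in the data.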
no code implementations • 14 Sep 2020 • Chao Ma, Lei Wu, Weinan E
The dynamic behavior of RMSprop and Adam algorithms is studied through a combination of careful numerical experiments and theoretical explanations.
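For reference, the Adam update whose dynamics are analyzed is

$$ m_t = \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g_t^2, \qquad \theta_t = \theta_{t-1} - \eta\,\frac{\hat m_t}{\sqrt{\hat v_t} + \epsilon}, $$

with bias corrections $\hat m_t = m_t/(1-\beta_1^t)$ and $\hat v_t = v_t/(1-\beta_2^t)$; RMSprop is the special case $\beta_1 = 0$ without bias correction.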
1 code implementation • 6 Sep 2020 • Haijun Yu, Xinyuan Tian, Weinan E, Qianxiao Li
We further apply this method to study Rayleigh-Bénard convection and learn Lorenz-like low-dimensional autonomous reduced-order models that capture both qualitative and quantitative properties of the underlying dynamics.
no code implementations • 13 Aug 2020 • Chao Ma, Lei Wu, Weinan E
The random feature model exhibits a kind of resonance behavior when the number of parameters is close to the training sample size.
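The resonance is easy to reproduce numerically. The following minimal sketch (synthetic data and parameter choices are illustrative) fits a random-feature model by minimum-norm least squares and shows the test error spiking when the number of features crosses the sample size:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 20                                # training samples, input dimension
w_star = rng.standard_normal(d)
X, Xte = rng.standard_normal((n, d)), rng.standard_normal((1000, d))
y, yte = X @ w_star, Xte @ w_star             # noiseless linear target

for p in [20, 50, 90, 100, 110, 200, 1000]:   # number of random features
    W = rng.standard_normal((d, p)) / np.sqrt(d)
    feats = lambda Z: np.maximum(Z @ W, 0.0)  # fixed random ReLU features
    a = np.linalg.pinv(feats(X)) @ y          # minimum-norm least squares
    err = np.mean((feats(Xte) @ a - yte) ** 2)
    print(f"p={p:5d}  test MSE={err:.3f}")    # error peaks near p = n
```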
no code implementations • 30 Jul 2020 • Weinan E, Stephan Wojtowytsch
The key to this work is a new way of representing functions in some form of expectations, motivated by multi-layer neural networks.
1 code implementation • 19 Jul 2020 • Pinchen Xie, Weinan E
We propose the coarse-grained spectral projection method (CGSP), a deep learning-assisted approach for tackling quantum unitary dynamics problems with an emphasis on quench dynamics.
1 code implementation • 25 Jun 2020 • Chao Ma, Lei Wu, Weinan E
A numerical and phenomenological study of the gradient descent (GD) algorithm for training two-layer neural network models is carried out for different parameter regimes when the target function can be accurately approximated by a relatively small number of neurons.
no code implementations • 10 Jun 2020 • Weinan E, Stephan Wojtowytsch
We study the natural function space for infinitely wide two-layer neural networks with ReLU activation (Barron space) and establish different representation formulae.
no code implementations • 5 Jun 2020 • Jianxing Huang, Linfeng Zhang, Han Wang, Jinbao Zhao, Jun Cheng, Weinan E
It has been a challenge to accurately simulate Li-ion diffusion processes in battery materials at room temperature using ab initio molecular dynamics (AIMD) due to its high computational cost.
Computational Physics Materials Science Chemical Physics
no code implementations • 4 Jun 2020 • Weinan E, Jiequn Han, Linfeng Zhang
Machine learning is poised to become a very powerful tool that can drastically improve our ability to carry out scientific research.
no code implementations • 21 May 2020 • Stephan Wojtowytsch, Weinan E
Thus, gradient descent training for fitting reasonably smooth, but truly high-dimensional, data may be subject to the curse of dimensionality.
no code implementations • 21 May 2020 • Weinan E, Stephan Wojtowytsch
We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space under the condition that a sequence of linear maps converges much faster on one of the subspaces.
1 code implementation • 1 May 2020 • Weile Jia, Han Wang, Mohan Chen, Denghui Lu, Lin Lin, Roberto Car, Weinan E, Linfeng Zhang
For 35 years, ab initio molecular dynamics (AIMD) has been the method of choice for modeling complex atomistic phenomena from first principles.
Computational Physics
no code implementations • 7 Mar 2020 • Huan Lei, Lei Wu, Weinan E
We introduce a machine-learning-based framework for constructing continuum non-Newtonian fluid dynamics models directly from a micro-scale description.
no code implementations • 30 Dec 2019 • Weinan E, Chao Ma, Lei Wu
We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the two-layer neural network model and the residual neural network model, can all be recovered (in a scaled form) as particular discretizations of different continuous formulations.
no code implementations • 15 Dec 2019 • Weinan E, Chao Ma, Lei Wu
We study the generalization properties of minimum-norm solutions for three over-parametrized machine learning models including the random feature model, the two-layer neural network model and the residual network model.
1 code implementation • 28 Oct 2019 • Yuzhi Zhang, Haidi Wang, WeiJie Chen, Jinzhe Zeng, Linfeng Zhang, Han Wang, Weinan E
The platform implements the learning procedure proposed in [Phys. Rev. Materials 3, 023804] and is capable of generating uniformly accurate deep learning based PES models in a way that minimizes human intervention and the computational cost for data generation and model training.
Computational Physics
1 code implementation • 29 Jul 2019 • Weinan E, Yajun Zhou
We characterize the meaning of words with language-independent numerical fingerprints, through a mathematical analysis of recurring patterns in texts.
1 code implementation • 27 Jun 2019 • Linfeng Zhang, Mohan Chen, Xifan Wu, Han Wang, Weinan E, Roberto Car
We introduce a deep neural network (DNN) model that assigns the position of the centers of the electronic charge in each atomic configuration on a molecular dynamics trajectory.
Computational Physics Materials Science Chemical Physics
no code implementations • 18 Jun 2019 • Weinan E, Chao Ma, Lei Wu
We define the Barron space and show that it is the right space for two-layer neural network models in the sense that optimal direct and inverse approximation theorems hold for functions in the Barron space.
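Concretely, for two-layer ReLU networks the Barron norm is defined through the best representing measure (notation as in this line of work; normalization conventions vary across papers):

$$ f(x) = \mathbb E_{(a,b,c)\sim\rho}\big[a\,\sigma(b^\top x + c)\big], \qquad \|f\|_{\mathcal B} = \inf_{\rho}\, \mathbb E_{\rho}\big[\,|a|\,(\|b\|_1 + |c|)\,\big], $$

and the direct theorem yields a width-$m$ network approximating $f$ with $L^2$ error $O(\|f\|_{\mathcal B}/\sqrt{m})$.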
no code implementations • ICLR 2019 • Lei Wu, Chao Ma, Weinan E
These new estimates are a priori in nature in the sense that the bounds depend only on some norms of the underlying functions to be fitted, not the parameters in the model.
no code implementations • ICLR 2019 • Linfeng Zhang, Weinan E, Lei Wang
We present a deep generative model, named Monge-Ampère flow, which builds on continuous-time gradient flow arising from the Monge-Ampère equation in optimal transport theory.
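The core mechanism, stated schematically, is a flow along the gradient of a scalar potential $\varphi$, under which the log-density evolves by the negative Laplacian (the change-of-variables formula for a gradient vector field):

$$ \frac{dx}{dt} = \nabla \varphi(x, t), \qquad \frac{d}{dt}\,\log p\big(x(t), t\big) = -\Delta \varphi\big(x(t), t\big), $$

which makes the likelihood tractable by integrating the Laplacian along sample trajectories.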
no code implementations • 10 Apr 2019 • Weinan E, Chao Ma, Qingcan Wang, Lei Wu
In addition, it is shown that the GD path is uniformly close to the functions given by the related random feature model.
no code implementations • 8 Apr 2019 • Weinan E, Chao Ma, Lei Wu
In the over-parametrized regime, it is shown that gradient descent dynamics can achieve zero training loss exponentially fast regardless of the quality of the labels.
no code implementations • 6 Mar 2019 • Weinan E, Chao Ma, Qingcan Wang
An important part of the regularized model is the usage of a new path norm, called the weighted path norm, as the regularization term.
1 code implementation • NeurIPS 2018 • Lei Wu, Chao Ma, Weinan E
The question of which global minima are accessible by a stochastic gradient descent (SGD) algorithm with specific learning rate and batch size is studied from the perspective of dynamical stability.
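The simplest instance of the criterion is full-batch GD on a quadratic approximation around a global minimum with Hessian $H$ (the SGD condition in the paper additionally involves batch size and the non-uniformity of the gradient noise):

$$ \theta_{t+1} - \theta^{*} = (I - \eta H)\,(\theta_t - \theta^{*}), \qquad \text{linearly stable} \iff \lambda_{\max}(H) \le \frac{2}{\eta}, $$

so a larger learning rate can only settle in flatter minima, and small-batch SGD tightens the condition further.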
no code implementations • 5 Nov 2018 • Qianxiao Li, Cheng Tai, Weinan E
We develop the mathematical foundations of the stochastic modified equations (SME) framework for analyzing the dynamics of stochastic gradient algorithms, where the latter is approximated by a class of stochastic differential equations with small noise parameters.
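In its simplest form, the SME approximates SGD with learning rate $\eta$ by the Itô diffusion

$$ dX_t = -\nabla f(X_t)\,dt + \sqrt{\eta}\;\Sigma(X_t)^{1/2}\,dW_t, $$

a first-order weak approximation, where $\Sigma$ is the covariance of the stochastic gradient; higher-order versions add drift corrections of size $O(\eta)$.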
no code implementations • 28 Oct 2018 • Linfeng Zhang, De-Ye Lin, Han Wang, Roberto Car, Weinan E
An active learning procedure called Deep Potential Generator (DP-GEN) is proposed for the construction of accurate and transferable machine learning-based models of the potential energy surface (PES) for the molecular modeling of materials.
no code implementations • ICLR 2019 • Weinan E, Chao Ma, Lei Wu
New estimates for the population risk are established for two-layer neural networks.
1 code implementation • 26 Sep 2018 • Linfeng Zhang, Weinan E, Lei Wang
We present a deep generative model, named Monge-Ampère flow, which builds on continuous-time gradient flow arising from the Monge-Ampère equation in optimal transport theory.
no code implementations • 10 Aug 2018 • Chao Ma, Jianchun Wang, Weinan E
The well-known Mori-Zwanzig theory tells us that model reduction leads to memory effects.
no code implementations • 18 Jul 2018 • Jiequn Han, Linfeng Zhang, Weinan E
We introduce a new family of trial wave-functions based on deep neural networks to solve the many-electron Schrödinger equation.
Computational Physics Chemical Physics
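The training signal in such variational approaches is the Rayleigh quotient of the Hamiltonian, minimized over network parameters $\theta$ (the variational principle; estimator details are method-specific):

$$ E(\theta) = \frac{\langle \psi_\theta \,|\, H \,|\, \psi_\theta \rangle}{\langle \psi_\theta \,|\, \psi_\theta \rangle} \;\ge\; E_0, $$

estimated in practice by Monte Carlo sampling from $|\psi_\theta|^2$.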
no code implementations • 3 Jul 2018 • Weinan E, Jiequn Han, Qianxiao Li
This paper introduces the mathematical formulation of the population risk minimization problem in deep learning as a mean-field optimal control problem.
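Schematically, the formulation reads (stated in simplified form; the paper's version carries the full technical conditions):

$$ \min_{\theta(\cdot)}\; \mathbb E_{(x_0, y)\sim\mu}\Big[\Phi\big(x_T, y\big) + \int_0^T L\big(x_t, \theta_t\big)\,dt\Big] \quad \text{s.t.} \quad \dot x_t = f\big(x_t, \theta_t\big), $$

where the expectation over the data distribution $\mu$ is what makes the control problem mean-field, and the layers of a residual network correspond to a time discretization of the ODE.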
no code implementations • 1 Jul 2018 • Weinan E, Qingcan Wang
We prove that for analytic functions in low dimension, the convergence rate of the deep neural network approximation is exponential.
1 code implementation • NeurIPS 2018 • Linfeng Zhang, Jiequn Han, Han Wang, Wissam A. Saidi, Roberto Car, Weinan E
Machine learning models are changing the paradigm of molecular modeling, which is a fundamental tool for materials science, chemistry, and computational biology.
Computational Physics Materials Science Chemical Physics
no code implementations • 27 Feb 2018 • Lei Wu, Zhanxing Zhu, Cheng Tai, Weinan E
State-of-the-art deep neural networks are known to be vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs.
no code implementations • ICLR 2018 • Lei Wu, Zhanxing Zhu, Cheng Tai, Weinan E
Deep neural networks provide state-of-the-art performance for many applications of interest.
2 code implementations • 11 Dec 2017 • Han Wang, Linfeng Zhang, Jiequn Han, Weinan E
Here we describe DeePMD-kit, a package written in Python/C++ that has been designed to minimize the effort required to build deep learning based representations of potential energy and force fields and to perform molecular dynamics.
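As an illustration of the intended workflow, a trained and frozen model can be loaded from Python for inference. The sketch below assumes a frozen model file graph.pb and uses the DeepPot inference class shipped with recent DeePMD-kit releases; exact module paths and array conventions may differ across versions:

```python
import numpy as np
from deepmd.infer import DeepPot  # DeePMD-kit's Python inference API

dp = DeePPot = DeepPot("graph.pb")        # frozen model (assumed to exist)
coords = np.random.rand(1, 6 * 3)         # 1 frame, 6 atoms, xyz coordinates
cell = (10.0 * np.eye(3)).reshape(1, 9)   # periodic box, one frame
atom_types = [0, 0, 0, 1, 1, 1]           # indices into the model's type map
energy, forces, virial = dp.eval(coords, cell, atom_types)
print(energy.shape, forces.shape)
```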
no code implementations • 10 Dec 2017 • Linfeng Zhang, Han Wang, Weinan E
Like metadynamics, it allows for an efficient exploration of the configuration space by adding an adaptively computed biasing potential to the original dynamics.
2 code implementations • 26 Oct 2017 • Qianxiao Li, Long Chen, Cheng Tai, Weinan E
The continuous dynamical system approach to deep learning is explored in order to devise alternative frameworks for training algorithms.
1 code implementation • 30 Sep 2017 • Weinan E, Bing Yu
We propose a deep learning based method, the Deep Ritz Method, for numerically solving variational problems, particularly the ones that arise from partial differential equations.
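A minimal sketch of the idea for the 1D Poisson problem $-u'' = f$ on $(0, 1)$ with zero boundary values, using Monte Carlo estimates of the Ritz energy and a soft boundary penalty (network size, sampling, and penalty weight are illustrative choices, not the paper's exact setup):

```python
import math
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
f = lambda x: (math.pi ** 2) * torch.sin(math.pi * x)  # exact u = sin(pi x)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(256, 1, requires_grad=True)             # interior samples
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    ritz = (0.5 * du ** 2 - f(x) * u).mean()               # Ritz energy estimate
    bc = net(torch.tensor([[0.0], [1.0]])).pow(2).mean()   # boundary penalty
    loss = ritz + 500.0 * bc
    opt.zero_grad(); loss.backward(); opt.step()
```

Minimizing the Ritz functional $I(u) = \int \tfrac12 |u'|^2 - fu\,dx$ over the network's parameters replaces mesh-based discretization with stochastic sampling of collocation points.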
no code implementations • 18 Sep 2017 • Christian Beck, Weinan E, Arnulf Jentzen
The PDEs in such applications are high-dimensional as the dimension corresponds to the number of financial assets in a portfolio.
5 code implementations • 30 Jul 2017 • Linfeng Zhang, Jiequn Han, Han Wang, Roberto Car, Weinan E
We introduce a scheme for molecular simulations, the Deep Potential Molecular Dynamics (DeePMD) method, based on a many-body potential and interatomic forces generated by a carefully crafted deep neural network trained with ab initio data.
6 code implementations • 9 Jul 2017 • Jiequn Han, Arnulf Jentzen, Weinan E
Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality".
1 code implementation • 5 Jul 2017 • Jiequn Han, Linfeng Zhang, Roberto Car, Weinan E
When tested on a wide variety of examples, Deep Potential is able to reproduce the original model, whether empirical or quantum mechanics based, within chemical accuracy.
Computational Physics
no code implementations • 30 Jun 2017 • Lei Wu, Zhanxing Zhu, Weinan E
It is widely observed that deep learning models with learned parameters generalize well, even with far more model parameters than training samples.
5 code implementations • 15 Jun 2017 • Weinan E, Jiequn Han, Arnulf Jentzen
We propose a new algorithm for solving parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) in high dimension, by making an analogy between the BSDE and reinforcement learning with the gradient of the solution playing the role of the policy function, and the loss function given by the error between the prescribed terminal condition and the solution of the BSDE.
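Schematically, for a semilinear parabolic PDE

$$ \partial_t u + \tfrac12\,\mathrm{Tr}\big(\sigma\sigma^\top \mathrm{Hess}_x\, u\big) + \mu\cdot\nabla_x u + f\big(t, x, u, \sigma^\top\nabla_x u\big) = 0, \qquad u(T, \cdot) = g, $$

the BSDE reformulation along the forward diffusion $dX_t = \mu\,dt + \sigma\,dW_t$ sets $Y_t = u(t, X_t)$ and $Z_t = \sigma^\top \nabla_x u(t, X_t)$, so that

$$ dY_t = -f\big(t, X_t, Y_t, Z_t\big)\,dt + Z_t^\top dW_t; $$

the networks parametrize $Z_t$ (the policy in the reinforcement-learning analogy), and the loss is the terminal mismatch $\mathbb E\,|Y_T - g(X_T)|^2$.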
no code implementations • 2 Nov 2016 • Jiequn Han, Weinan E
Many real world stochastic control problems suffer from the "curse of dimensionality".
2 code implementations • 19 Nov 2015 • Cheng Tai, Tong Xiao, Yi Zhang, Xiaogang Wang, Weinan E
Recently, tensor decompositions have been used for speeding up CNNs.
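The underlying mechanism is easy to see on a single dense layer; the numpy sketch below (sizes and rank are illustrative) factors a weight matrix through a truncated SVD, trading a small approximation error for a roughly 4x reduction in multiply-adds. For convolutions, the same idea factors a kernel into cheaper successive convolutions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Trained layers are often approximately low-rank; emulate that structure.
W = rng.standard_normal((512, 64)) @ rng.standard_normal((64, 512)) \
    + 0.1 * rng.standard_normal((512, 512))

U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 64                                   # retained rank
A, B = U[:, :k] * s[:k], Vt[:k]          # W is approximated by A @ B

x = rng.standard_normal(512)
err = np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x)
flops_ratio = (512 * k + k * 512) / (512 * 512)  # 0.25 for k = 64
print(f"relative error {err:.3f}, flop ratio {flops_ratio:.2f}")
```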
no code implementations • ICML 2017 • Qianxiao Li, Cheng Tai, Weinan E
We develop the method of stochastic modified equations (SME), in which stochastic gradient algorithms are approximated in the weak sense by continuous-time stochastic differential equations.
no code implementations • 9 Oct 2015 • Chu Wang, Yingfei Wang, Weinan E, Robert Schapire
Yet, as the number of base hypotheses becomes larger, boosting can lead to a deterioration of test performance.
no code implementations • 17 Jul 2015 • Cheng Tai, Weinan E
The new framework, called AdaFrame, improves over dictionary learning-based techniques in terms of computational efficiency at inference time.