Search Results for author: Lei Wu

Found 59 papers, 16 papers with code

Exploring Neural Network Landscapes: Star-Shaped and Geodesic Connectivity

no code implementations9 Apr 2024 Zhanran Lin, Puheng Li, Lei Wu

One of the most intriguing findings about the structure of neural network loss landscapes is the phenomenon of mode connectivity: for two typical global minima, there exists a path connecting them without a barrier.

valid
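
To make the notion of a "barrier" concrete, here is a toy sketch of my own (not the paper's construction): it measures the barrier along the straight line between the two global minima of a 1-D double-well loss, where a barrier does appear; mode connectivity says that for typical network minima a curved path with (near-)zero barrier exists even in such cases.

```python
def loss(w):
    # Toy 1-D double-well loss with two global minima, at w = -1 and w = +1.
    return (w * w - 1.0) ** 2

def barrier_along_path(w_a, w_b, steps=100):
    """Largest loss increase over the straight line from w_a to w_b."""
    endpoint = max(loss(w_a), loss(w_b))
    worst = max(loss(w_a + (t / steps) * (w_b - w_a)) for t in range(steps + 1))
    return worst - endpoint

print(barrier_along_path(-1.0, 1.0))  # → 1.0: the linear path crosses w = 0
```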

A Duality Analysis of Kernel Ridge Regression in the Noiseless Regime

no code implementations24 Feb 2024 Jihao Long, Xiaojun Peng, Lei Wu

In this paper, we conduct a comprehensive analysis of generalization properties of Kernel Ridge Regression (KRR) in the noiseless regime, a scenario crucial to scientific computing, where data are often generated via computer simulations.

regression
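
As a reminder of the KRR setup (a minimal toy of my own, not the paper's analysis): with kernel matrix $K$ on the training inputs, the coefficients are $\alpha = (K + \lambda I)^{-1} y$, and taking a tiny ridge $\lambda$ mimics the noiseless regime, where the estimator interpolates the data.

```python
import math

# Minimal kernel ridge regression sketch with a Gaussian kernel and a tiny
# ridge lam, standing in for the noiseless regime (toy, not the paper's setup).
def k(a, b):
    return math.exp(-(a - b) ** 2)

xs, ys = [0.0, 1.0], [0.0, 1.0]     # two noiseless samples of f(x) = x
lam = 1e-8

# Solve the 2x2 system (K + lam*I) alpha = y by Cramer's rule.
a11, a12 = k(xs[0], xs[0]) + lam, k(xs[0], xs[1])
a21, a22 = k(xs[1], xs[0]), k(xs[1], xs[1]) + lam
det = a11 * a22 - a12 * a21
alpha = [(ys[0] * a22 - a12 * ys[1]) / det,
         (a11 * ys[1] - a21 * ys[0]) / det]

def f(x):
    return sum(al * k(xi, x) for al, xi in zip(alpha, xs))

print(abs(f(0.0)) < 1e-4 and abs(f(1.0) - 1.0) < 1e-4)  # interpolates the data
```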

The Implicit Bias of Gradient Noise: A Symmetry Perspective

no code implementations11 Feb 2024 Liu Ziyin, Mingze Wang, Lei Wu

For one class of symmetry, SGD naturally converges to solutions that have a balanced and aligned gradient noise.

The Local Landscape of Phase Retrieval Under Limited Samples

no code implementations26 Nov 2023 Kaizhao Liu, ZiHao Wang, Lei Wu

We next consider the one-point strong convexity and show that as long as $n=\omega(d)$, with high probability, the landscape is one-point strongly convex in the local annulus: $\{w\in\mathbb{R}^d: o_d(1)\leqslant \|w-w^*\|\leqslant c\}$, where $w^*$ is the ground truth and $c$ is an absolute constant.

Retrieval

Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling

no code implementations24 Nov 2023 Mingze Wang, Zeping Min, Lei Wu

Inspired by this analysis, we propose a novel algorithm called Progressive Rescaling Gradient Descent (PRGD) and show that PRGD can maximize the margin at an {\em exponential rate}.
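
A schematic in the spirit of PRGD (my own toy, not the paper's exact algorithm): gradient descent on the exponential loss for separable 2-D data, with the weight norm rescaled upward every few hundred steps; only the direction w/||w|| matters for the normalized margin, and inflating the norm shrinks the loss so the direction settles quickly.

```python
import math

# Separable toy dataset: (x, label) pairs with labels in {+1, -1}.
data = [([2.0, 1.0], 1), ([1.0, 2.0], 1), ([-1.5, -1.0], -1)]

def grad(w):
    # Gradient of the exponential loss sum_i exp(-y_i * <w, x_i>).
    g = [0.0, 0.0]
    for x, y in data:
        s = y * (w[0] * x[0] + w[1] * x[1])
        c = -y * math.exp(-s)
        g[0] += c * x[0]
        g[1] += c * x[1]
    return g

def margin(w):
    # Normalized margin: min_i y_i * <w, x_i> / ||w||.
    norm = math.hypot(w[0], w[1]) or 1.0
    return min(y * (w[0] * x[0] + w[1] * x[1]) / norm for x, y in data)

w = [0.1, 0.0]
for t in range(1, 2001):
    g = grad(w)
    w = [w[0] - 0.1 * g[0], w[1] - 0.1 * g[1]]
    if t % 200 == 0:            # the progressive norm-rescaling step
        w = [2.0 * w[0], 2.0 * w[1]]

print(margin(w) > 1.0)          # the normalized margin is large and positive
```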

A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent

no code implementations1 Oct 2023 Mingze Wang, Lei Wu

In this paper, we provide a theoretical study of noise geometry for minibatch stochastic gradient descent (SGD), the phenomenon whereby the noise aligns favorably with the geometry of the local landscape.

Navigate

The $L^\infty$ Learnability of Reproducing Kernel Hilbert Spaces

no code implementations5 Jun 2023 Hongrui Chen, Jihao Long, Lei Wu

We prove that if $\beta$ is independent of the input dimension $d$, then functions in the RKHS can be learned efficiently under the $L^\infty$ norm, i.e., the sample complexity depends polynomially on $d$.

The Brain Tumor Segmentation (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI

no code implementations1 Jun 2023 Ahmed W. Moawad, Anastasia Janas, Ujjwal Baid, Divya Ramakrishnan, Leon Jekel, Kiril Krantchev, Harrison Moy, Rachit Saluja, Klara Osenberg, Klara Wilms, Manpreet Kaur, Arman Avesta, Gabriel Cassinelli Pedersen, Nazanin Maleki, Mahdi Salimi, Sarah Merkaj, Marc von Reppert, Niklas Tillmans, Jan Lost, Khaled Bousabarah, Wolfgang Holler, MingDe Lin, Malte Westerhoff, Ryan Maresca, Katherine E. Link, Nourel Hoda Tahon, Daniel Marcus, Aristeidis Sotiras, Pamela Lamontagne, Strajit Chakrabarty, Oleg Teytelboym, Ayda Youssef, Ayaman Nada, Yuri S. Velichko, Nicolo Gennaro, Connectome Students, Group of Annotators, Justin Cramer, Derek R. Johnson, Benjamin Y. M. Kwan, Boyan Petrovic, Satya N. Patro, Lei Wu, Tiffany So, Gerry Thompson, Anthony Kam, Gloria Guzman Perez-Carrillo, Neil Lall, Group of Approvers, Jake Albrecht, Udunna Anazodo, Marius George Lingaru, Bjoern H Menze, Benedikt Wiestler, Maruf Adewole, Syed Muhammad Anwar, Dominic LaBella, Hongwei Bran Li, Juan Eugenio Iglesias, Keyvan Farahani, James Eddy, Timothy Bergquist, Verena Chung, Russel Takeshi Shinohara, Farouk Dako, Walter Wiggins, Zachary Reitman, Chunhao Wang, Xinyang Liu, Zhifan Jiang, Koen van Leemput, Marie Piraud, Ivan Ezhov, Elaine Johanson, Zeke Meier, Ariana Familiar, Anahita Fathi Kazerooni, Florian Kofler, Evan Calabrese, Sanjay Aneja, Veronica Chiang, Ichiro Ikuta, Umber Shafique, Fatima Memon, Gian Marco Conte, Spyridon Bakas, Jeffrey Rudie, Mariam Aboian

Clinical monitoring of metastatic disease to the brain can be a laborious and time-consuming process, especially in cases involving multiple metastases when the assessment is performed manually.

Brain Tumor Segmentation Decision Making +2

Embedding Inequalities for Barron-type Spaces

no code implementations30 May 2023 Lei Wu

To this end, researchers have introduced the Barron space $\mathcal{B}_s(\Omega)$ and the spectral Barron space $\mathcal{F}_s(\Omega)$, where the index $s\in [0,\infty)$ indicates the smoothness of functions within these spaces and $\Omega\subset\mathbb{R}^d$ denotes the input domain.

Learning Theory

The Implicit Regularization of Dynamical Stability in Stochastic Gradient Descent

no code implementations27 May 2023 Lei Wu, Weijie J. Su

By contrast, for gradient descent (GD), stability imposes a similar constraint, but only on the largest eigenvalue of the Hessian.
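
The GD-side constraint can be seen in one line (a standard 1-D illustration, not the paper's full SGD analysis): at a minimum with Hessian eigenvalue (sharpness) a, the iteration w ← w − η·a·w is stable iff |1 − η·a| ≤ 1, i.e. η ≤ 2/a.

```python
# 1-D linear stability check for plain gradient descent at a quadratic minimum.
def gd_stays_bounded(sharpness, eta, steps=100):
    w = 1.0
    for _ in range(steps):
        w -= eta * sharpness * w   # GD step on the loss 0.5 * sharpness * w^2
    return abs(w) <= 1.0

print(gd_stays_bounded(sharpness=4.0, eta=0.4))   # eta < 2/4: stable  → True
print(gd_stays_bounded(sharpness=4.0, eta=0.6))   # eta > 2/4: diverges → False
```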

Theoretical Analysis of Inductive Biases in Deep Convolutional Networks

no code implementations15 May 2023 ZiHao Wang, Lei Wu

To this end, we compare the performance of CNNs, locally-connected networks (LCNs), and fully-connected networks (FCNs) on a simple regression task, where LCNs can be viewed as CNNs without weight sharing.

A duality framework for generalization analysis of random feature models and two-layer neural networks

no code implementations9 May 2023 Hongrui Chen, Jihao Long, Lei Wu

The first application is to study learning functions in $\mathcal{F}_{p,\pi}$ with RFMs.

Self-supervised multimodal neuroimaging yields predictive representations for a spectrum of Alzheimer's phenotypes

1 code implementation7 Sep 2022 Alex Fedorov, Eloy Geenjaar, Lei Wu, Tristan Sylvain, Thomas P. DeRamus, Margaux Luck, Maria Misiura, R Devon Hjelm, Sergey M. Plis, Vince D. Calhoun

Coarse labels do not capture the long-tailed spectrum of brain disorder phenotypes, which leads to a loss of generalizability that makes such models less useful in diagnostic settings.

Self-Supervised Learning

Towards Improving Operation Economics: A Bilevel MIP-Based Closed-Loop Predict-and-Optimize Framework for Prescribing Unit Commitment

no code implementations27 Aug 2022 Xianbang Chen, Yikui Liu, Lei Wu

Generally, system operators conduct the economic operation of power systems in an open-loop predict-then-optimize process: the renewable energy source (RES) availability and system reserve requirements are first predicted; given the predictions, system operators solve optimization models such as unit commitment (UC) to determine the economical operation plans accordingly.

RZCR: Zero-shot Character Recognition via Radical-based Reasoning

no code implementations12 Jul 2022 Xiaolei Diao, Daqian Shi, Hao Tang, Qiang Shen, Yanzeng Li, Lei Wu, Hao Xu

The long-tail effect is a common issue that limits the performance of deep learning models on real-world datasets.

The alignment property of SGD noise and how it helps select flat minima: A stability analysis

no code implementations6 Jul 2022 Lei Wu, Mingze Wang, Weijie Su

In this paper, we provide an explanation of this striking phenomenon by relating the particular noise structure of SGD to its \emph{linear stability} (Wu et al., 2018).

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes

no code implementations24 Apr 2022 Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying

Numerically, we observe that neural network loss functions possess a multiscale structure, manifested in two ways: (1) in a neighborhood of minima, the loss mixes a continuum of scales and grows subquadratically, and (2) in a larger region, the loss clearly shows several separate scales.

Exploiting the Potential of Datasets: A Data-Centric Approach for Model Robustness

1 code implementation10 Mar 2022 Yiqi Zhong, Lei Wu, Xianming Liu, Junjun Jiang

Robustness of deep neural networks (DNNs) to malicious perturbations is a hot topic in trustworthy AI.

Learning a Single Neuron for Non-monotonic Activation Functions

no code implementations16 Feb 2022 Lei Wu

Specifically, when the input distribution is the standard Gaussian, we show that mild conditions on $\sigma$ (e.g., $\sigma$ has a dominating linear part) are sufficient to guarantee learnability in polynomial time with polynomially many samples.
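
An illustrative sketch of my own (not the paper's analysis): gradient descent recovers the weight of a single neuron y = σ(w·x) under Gaussian inputs when σ has a dominating linear part, here the non-monotonic σ(z) = z + 0.5·sin(z).

```python
import math, random

random.seed(0)

def sigma(z):  return z + 0.5 * math.sin(z)   # non-monotonic-looking, slope in [0.5, 1.5]
def dsigma(z): return 1.0 + 0.5 * math.cos(z)

# Noiseless data from a ground-truth neuron with weight w_star.
w_star = [1.0, -2.0]
data = []
for _ in range(200):
    x = [random.gauss(0, 1), random.gauss(0, 1)]
    data.append((x, sigma(w_star[0] * x[0] + w_star[1] * x[1])))

# Full-batch gradient descent on the squared loss, starting from zero.
w = [0.0, 0.0]
for _ in range(1000):
    g = [0.0, 0.0]
    for x, y in data:
        z = w[0] * x[0] + w[1] * x[1]
        e = (sigma(z) - y) * dsigma(z)
        g[0] += e * x[0] / len(data)
        g[1] += e * x[1] / len(data)
    w = [w[0] - 0.1 * g[0], w[1] - 0.1 * g[1]]

err = math.hypot(w[0] - w_star[0], w[1] - w_star[1])
print(err < 1e-2)   # w has converged close to w_star
```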

Efficient Medical Image Segmentation Based on Knowledge Distillation

1 code implementation23 Aug 2021 Dian Qin, Jiajun Bu, Zhe Liu, Xin Shen, Sheng Zhou, Jingjun Gu, Zhijua Wang, Lei Wu, Huifen Dai

To deal with this problem, we propose an efficient architecture by distilling knowledge from well-trained medical image segmentation networks to train another lightweight network.

Image Segmentation Knowledge Distillation +3

A spectral-based analysis of the separation between two-layer neural networks and linear methods

no code implementations10 Aug 2021 Lei Wu, Jihao Long

We propose a spectral-based approach to analyze how two-layer neural networks separate from linear methods in terms of approximating high-dimensional functions.

Tasting the cake: evaluating self-supervised generalization on out-of-distribution multimodal MRI data

1 code implementation29 Mar 2021 Alex Fedorov, Eloy Geenjaar, Lei Wu, Thomas P. DeRamus, Vince D. Calhoun, Sergey M. Plis

We show that self-supervised models are not as robust as expected based on their results in natural imaging benchmarks and can be outperformed by supervised learning with dropout.

Out-of-Distribution Generalization Self-Supervised Learning

Hands-on Guidance for Distilling Object Detectors

no code implementations26 Mar 2021 Yangyang Qin, Hefei Ling, Zhenghai He, Yuxuan Shi, Lei Wu

Knowledge distillation can produce deployment-friendly networks that alleviate the computational complexity problem, but previous methods neglect the feature hierarchy in detectors.

Knowledge Distillation Object

Photon-jet events as a probe of axion-like particles at the LHC

no code implementations2 Feb 2021 Daohan Wang, Lei Wu, Jin Min Yang, Mengchao Zhang

Axion-like particles (ALPs) are predicted by many extensions of the Standard Model (SM).

High Energy Physics - Phenomenology High Energy Physics - Experiment

On self-supervised multi-modal representation learning: An application to Alzheimer's disease

1 code implementation25 Dec 2020 Alex Fedorov, Lei Wu, Tristan Sylvain, Margaux Luck, Thomas P. DeRamus, Dmitry Bleklov, Sergey M. Plis, Vince D. Calhoun

In this paper, we introduce a way to exhaustively consider multimodal architectures for contrastive self-supervised fusion of fMRI and MRI of AD patients and controls.

General Classification Representation Learning

New Strong Bounds on sub-GeV Dark Matter from Boosted and Migdal Effects

no code implementations17 Dec 2020 Victor V. Flambaum, Liangliang Su, Lei Wu, Bin Zhu

Because the nuclear recoil energies involved are low, sub-GeV dark matter (DM) is usually beyond the sensitivity of conventional DM direct detection experiments.

High Energy Physics - Phenomenology Cosmology and Nongalactic Astrophysics

Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't

no code implementations22 Sep 2020 Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu

The purpose of this article is to review the achievements made in the last few years towards the understanding of the reasons behind the success and subtleties of neural network-based machine learning.

Complexity Measures for Neural Networks with General Activation Functions Using Path-based Norms

no code implementations14 Sep 2020 Zhong Li, Chao Ma, Lei Wu

The approach is motivated by approximating the general activation functions with one-dimensional ReLU networks, which reduces the problem to the complexity controls of ReLU networks.

A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms

no code implementations14 Sep 2020 Chao Ma, Lei Wu, Weinan E

The dynamic behavior of RMSprop and Adam algorithms is studied through a combination of careful numerical experiments and theoretical explanations.

The Slow Deterioration of the Generalization Error of the Random Feature Model

no code implementations13 Aug 2020 Chao Ma, Lei Wu, Weinan E

The random feature model exhibits a kind of resonance behavior when the number of parameters is close to the training sample size.

The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models

1 code implementation25 Jun 2020 Chao Ma, Lei Wu, Weinan E

A numerical and phenomenological study of the gradient descent (GD) algorithm for training two-layer neural network models is carried out for different parameter regimes when the target function can be accurately approximated by a relatively small number of neurons.

Beyond the Virus: A First Look at Coronavirus-themed Mobile Malware

1 code implementation29 May 2020 Ren He, Haoyu Wang, Pengcheng Xia, Liu Wang, Yuanchun Li, Lei Wu, Yajin Zhou, Xiapu Luo, Yao Guo, Guoai Xu

To facilitate future research, we have publicly released all the well-labelled COVID-19 themed apps (and malware) to the research community.

Cryptography and Security

Calibrating the dynamic Huff model for business analysis using location big data

1 code implementation24 Mar 2020 Yunlei Liang, Song Gao, Yuxin Cai, Natasha Zhang Foutz, Lei Wu

In this research, we present a time-aware dynamic Huff model (T-Huff) for location-based market share analysis and calibrate this model using large-scale store visit patterns based on mobile phone location data across the ten most populated U.S. cities.

Social and Information Networks H.1
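
For context, a toy sketch of the classic (static) Huff model that T-Huff extends, with made-up numbers (the paper's contribution is the temporal calibration): the probability that a customer visits store j is attractiveness_j / distance_j^λ, normalized across competing stores.

```python
def huff_probabilities(attractiveness, distances, lam=2.0):
    # Score each store by attractiveness discounted by distance^lam,
    # then normalize the scores into visit probabilities.
    scores = [a / d ** lam for a, d in zip(attractiveness, distances)]
    total = sum(scores)
    return [s / total for s in scores]

# A small nearby store can out-draw a large distant one.
p = huff_probabilities(attractiveness=[10.0, 5.0], distances=[2.0, 1.0])
print([round(x, 3) for x in p])  # → [0.333, 0.667]
```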

Reinterpretation of LHC Results for New Physics: Status and Recommendations after Run 2

no code implementations17 Mar 2020 Waleed Abdallah, Shehu AbdusSalam, Azar Ahmadov, Amine Ahriche, Gaël Alguero, Benjamin C. Allanach, Jack Y. Araz, Alexandre Arbey, Chiara Arina, Peter Athron, Emanuele Bagnaschi, Yang Bai, Michael J. Baker, Csaba Balazs, Daniele Barducci, Philip Bechtle, Aoife Bharucha, Andy Buckley, Jonathan Butterworth, Haiying Cai, Claudio Campagnari, Cari Cesarotti, Marcin Chrzaszcz, Andrea Coccaro, Eric Conte, Jonathan M. Cornell, Louie Dartmoor Corpe, Matthias Danninger, Luc Darmé, Aldo Deandrea, Nishita Desai, Barry Dillon, Caterina Doglioni, Juhi Dutta, John R. Ellis, Sebastian Ellis, Farida Fassi, Matthew Feickert, Nicolas Fernandez, Sylvain Fichet, Jernej F. Kamenik, Thomas Flacke, Benjamin Fuks, Achim Geiser, Marie-Hélène Genest, Akshay Ghalsasi, Tomas Gonzalo, Mark Goodsell, Stefania Gori, Philippe Gras, Admir Greljo, Diego Guadagnoli, Sven Heinemeyer, Lukas A. Heinrich, Jan Heisig, Deog Ki Hong, Tetiana Hryn'ova, Katri Huitu, Philip Ilten, Ahmed Ismail, Adil Jueid, Felix Kahlhoefer, Jan Kalinowski, Deepak Kar, Yevgeny Kats, Charanjit K. Khosa, Valeri Khoze, Tobias Klingl, Pyungwon Ko, Kyoungchul Kong, Wojciech Kotlarski, Michael Krämer, Sabine Kraml, Suchita Kulkarni, Anders Kvellestad, Clemens Lange, Kati Lassila-Perini, Seung J. Lee, Andre Lessa, Zhen Liu, Lara Lloret Iglesias, Jeanette M. Lorenz, Danika MacDonell, Farvah Mahmoudi, Judita Mamuzic, Andrea C. Marini, Pete Markowitz, Pablo Martinez Ruiz del Arbol, David Miller, Vasiliki Mitsou, Stefano Moretti, Marco Nardecchia, Siavash Neshatpour, Dao Thi Nhung, Per Osland, Patrick H. Owen, Orlando Panella, Alexander Pankov, Myeonghun Park, Werner Porod, Darren Price, Harrison Prosper, Are Raklev, Jürgen Reuter, Humberto Reyes-González, Thomas Rizzo, Tania Robens, Juan Rojo, Janusz A. Rosiek, Oleg Ruchayskiy, Veronica Sanz, Kai Schmidt-Hoberg, Pat Scott, Sezen Sekmen, Dipan Sengupta, Elizabeth Sexton-Kennedy, Hua-Sheng Shao, Seodong Shin, Luca Silvestrini, Ritesh Singh, Sukanya Sinha, Jory Sonneveld, Yotam Soreq, Giordon H. Stark, Tim Stefaniak, Jesse Thaler, Riccardo Torre, Emilio Torrente-Lujan, Gokhan Unel, Natascia Vignaroli, Wolfgang Waltenberger, Nicholas Wardle, Graeme Watt, Georg Weiglein, Martin J. White, Sophie L. Williamson, Jonas Wittbrodt, Lei Wu, Stefan Wunsch, Tevong You, Yang Zhang, José Zurita

We report on the status of efforts to improve the reinterpretation of searches and measurements at the LHC in terms of models for new physics, in the context of the LHC Reinterpretation Forum.

High Energy Physics - Phenomenology High Energy Physics - Experiment

Machine learning based non-Newtonian fluid model with molecular fidelity

no code implementations7 Mar 2020 Huan Lei, Lei Wu, Weinan E

We introduce a machine-learning-based framework for constructing continuum non-Newtonian fluid dynamics models directly from a micro-scale description.

BIG-bench Machine Learning

Machine Learning from a Continuous Viewpoint

no code implementations30 Dec 2019 Weinan E, Chao Ma, Lei Wu

We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the two-layer neural network model and the residual neural network model, can all be recovered (in a scaled form) as particular discretizations of different continuous formulations.

BIG-bench Machine Learning

The Generalization Error of the Minimum-norm Solutions for Over-parameterized Neural Networks

no code implementations15 Dec 2019 Weinan E, Chao Ma, Lei Wu

We study the generalization properties of minimum-norm solutions for three over-parametrized machine learning models including the random feature model, the two-layer neural network model and the residual network model.

BIG-bench Machine Learning

Global Convergence of Gradient Descent for Deep Linear Residual Networks

no code implementations NeurIPS 2019 Lei Wu, Qingcan Wang, Chao Ma

We analyze the global convergence of gradient descent for deep linear residual networks by proposing a new initialization: zero-asymmetric (ZAS) initialization.
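
A 1-D toy of my own capturing the idea behind such a zero-style residual initialization (not the paper's exact ZAS scheme): each residual block acts as x → x + w_l·x, and all residual weights start at zero, so the whole network is the identity map at initialization and gradient descent never passes through a degenerate configuration.

```python
depth, target = 5, 2.0          # learn the linear map f(x) = 2x

w = [0.0] * depth               # zero-initialized residual weights
for _ in range(500):
    prod = 1.0                  # network slope: product of (1 + w_l)
    for wl in w:
        prod *= 1.0 + wl
    # loss = 0.5 * (prod - target)^2; chain rule through the product
    grads = [(prod - target) * prod / (1.0 + wl) for wl in w]
    w = [wl - 0.05 * g for wl, g in zip(w, grads)]

final = 1.0
for wl in w:
    final *= 1.0 + wl
print(abs(final - target) < 1e-3)  # → True: GD drives the slope to 2
```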

EVulHunter: Detecting Fake Transfer Vulnerabilities for EOSIO's Smart Contracts at Webassembly-level

1 code implementation25 Jun 2019 Lijin Quan, Lei Wu, Haoyu Wang

Unfortunately, current tools are web-application oriented and cannot be applied to EOSIO WebAssembly code directly, which makes it more difficult to detect vulnerabilities from those smart contracts.

Cryptography and Security

The Barron Space and the Flow-induced Function Spaces for Neural Network Models

no code implementations18 Jun 2019 Weinan E, Chao Ma, Lei Wu

We define the Barron space and show that it is the right space for two-layer neural network models in the sense that optimal direct and inverse approximation theorems hold for functions in the Barron space.

BIG-bench Machine Learning

Exploring and Enhancing the Transferability of Adversarial Examples

no code implementations ICLR 2019 Lei Wu, Zhanxing Zhu, Cheng Tai

State-of-the-art deep neural networks are vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs.

A Priori Estimates of the Generalization Error for Two-layer Neural Networks

no code implementations ICLR 2019 Lei Wu, Chao Ma, Weinan E

These new estimates are a priori in nature in the sense that the bounds depend only on some norms of the underlying functions to be fitted, not the parameters in the model.

The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Minima and Regularization Effects

no code implementations ICLR 2019 Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, Jinwen Ma

Along this line, we theoretically study a general form of gradient based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics.

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections

no code implementations10 Apr 2019 Weinan E, Chao Ma, Qingcan Wang, Lei Wu

In addition, it is also shown that the GD path is uniformly close to the functions given by the related random feature model.

A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics

no code implementations8 Apr 2019 Weinan E, Chao Ma, Lei Wu

In the over-parametrized regime, it is shown that gradient descent dynamics can achieve zero training loss exponentially fast regardless of the quality of the labels.

How SGD Selects the Global Minima in Over-parameterized Learning: A Dynamical Stability Perspective

1 code implementation NeurIPS 2018 Lei Wu, Chao Ma, Weinan E

The question of which global minima are accessible to a stochastic gradient descent (SGD) algorithm with a specific learning rate and batch size is studied from the perspective of dynamical stability.

A Priori Estimates of the Population Risk for Two-layer Neural Networks

no code implementations ICLR 2019 Weinan E, Chao Ma, Lei Wu

New estimates for the population risk are established for two-layer neural networks.

The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects

1 code implementation ICLR 2019 Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, Jinwen Ma

Along this line, we study a general form of gradient based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics.

Understanding and Enhancing the Transferability of Adversarial Examples

no code implementations27 Feb 2018 Lei Wu, Zhanxing Zhu, Cheng Tai, Weinan E

State-of-the-art deep neural networks are known to be vulnerable to adversarial examples, formed by applying small but malicious perturbations to the original inputs.

Dual Long Short-Term Memory Networks for Sub-Character Representation Learning

1 code implementation23 Dec 2017 Han He, Lei Wu, Xiaokun Yang, Hua Yan, Zhimin Gao, Yi Feng, George Townsend

To build a concrete study and substantiate the efficiency of our neural architecture, we take Chinese Word Segmentation as a research case example.

Chinese Word Segmentation Representation Learning +1

Effective Neural Solution for Multi-Criteria Word Segmentation

1 code implementation7 Dec 2017 Han He, Lei Wu, Hua Yan, Zhimin Gao, Yi Feng, George Townsend

We present a simple yet elegant solution to train a single joint model on multi-criteria corpora for Chinese Word Segmentation (CWS).

Chinese Word Segmentation Sentence

Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes

no code implementations30 Jun 2017 Lei Wu, Zhanxing Zhu, Weinan E

It is widely observed that deep learning models with learned parameters generalize well, even with far more model parameters than training samples.

SOM: Semantic Obviousness Metric for Image Quality Assessment

no code implementations CVPR 2015 Peng Zhang, Wengang Zhou, Lei Wu, Houqiang Li

We propose to extract two types of features, one to measure the semantic obviousness of the image and the other to discover local characteristics.

Image Quality Estimation No-Reference Image Quality Assessment +1

LIFT : Multi-Label Learning with Label-Specific Features

1 code implementation International Joint Conferences on Artificial Intelligence 2014 Min-Ling Zhang, Lei Wu

Existing approaches learn from multi-label data using an identical feature set, i.e., the same instance representation of each example is employed in the discrimination processes of all class labels.

Clustering Multi-Label Learning

Learning Bregman Distance Functions and Its Application for Semi-Supervised Clustering

no code implementations NeurIPS 2009 Lei Wu, Rong Jin, Steven C. Hoi, Jianke Zhu, Nenghai Yu

Learning distance functions with side information plays a key role in many machine learning and data mining applications.

Clustering
