Search Results for author: Chaoyue Liu

Found 14 papers, 3 papers with code

Toward High-Performance Energy and Power Battery Cells with Machine Learning-based Optimization of Electrode Manufacturing

no code implementations 7 Jul 2023 Marc Duquesnoy, Chaoyue Liu, Vishank Kumar, Elixabete Ayerbe, Alejandro A. Franco

This ML pipeline enables inverse design of the process parameters needed to manufacture electrodes for energy or power applications.

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

no code implementations 7 Jun 2023 Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

In this paper, we first present an explanation for the common occurrence of spikes in the training loss when neural networks are trained with stochastic gradient descent (SGD).
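A minimal toy sketch (not the paper's experiment) of one way such spikes can arise: when the step size transiently exceeds the stability threshold set by the local curvature, the iterate overshoots and the loss jumps before shrinking again. All values below are illustrative assumptions.

import numpy as np

# SGD on 0.5 * c * w^2 where the per-step curvature c varies, mimicking
# mini-batch fluctuations in local sharpness (toy illustration only).
rng = np.random.default_rng(0)
w, lr, losses = 5.0, 0.9, []
for _ in range(50):
    c = rng.choice([1.0, 2.5])       # lr * c > 2 on some steps -> locally unstable
    losses.append(0.5 * c * w ** 2)
    w -= lr * c * w                  # gradient of 0.5 * c * w^2 is c * w
print(np.round(losses[:10], 3))      # the loss intermittently spikes before decreasing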

On Emergence of Clean-Priority Learning in Early Stopped Neural Networks

no code implementations 5 Jun 2023 Chaoyue Liu, Amirhesam Abedsoltan, Mikhail Belkin

This behaviour is believed to arise because neural networks learn the patterns of the clean data first and fit the noise later in training, a phenomenon we refer to as clean-priority learning.

ReLU soothes the NTK condition number and accelerates optimization for wide neural networks

no code implementations 15 May 2023 Chaoyue Liu, Like Hui

Compared with linear neural networks, we show that a ReLU-activated wide neural network at random initialization has a larger angle separation for similar data in the model-gradient feature space, and a smaller condition number for the NTK.

Quadratic models for understanding neural network dynamics

1 code implementation 24 May 2022 Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

While neural networks can be approximated by linear models as their width increases, certain properties of wide neural networks cannot be captured by linear models.
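For reference, my reading (not a quote from the paper) of the natural quadratic counterpart of the NTK linearization: keep the second-order term of the Taylor expansion of the network output in its parameters,

$$ f_{\mathrm{quad}}(w; x) = f(w_0; x) + \nabla_w f(w_0; x)^\top (w - w_0) + \tfrac{1}{2}\, (w - w_0)^\top \nabla_w^2 f(w_0; x)\, (w - w_0), $$

i.e. the term that the linear approximation drops.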

Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture

no code implementations 24 May 2022 Libin Zhu, Chaoyue Liu, Mikhail Belkin

In this paper we show that feedforward neural networks corresponding to arbitrary directed acyclic graphs undergo a transition to linearity as their "width" approaches infinity.

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models

no code implementations ICLR 2022 Chaoyue Liu, Libin Zhu, Mikhail Belkin

Wide neural networks with a linear output layer have been shown to be near-linear, and to have a near-constant neural tangent kernel (NTK), in a region containing the optimization path of gradient descent.
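For readers outside this literature, the tangent kernel in question is the standard one (notation here is mine, not the paper's):

$$ K_w(x, x') = \langle \nabla_w f(w; x),\, \nabla_w f(w; x') \rangle, $$

and near-linearity means $f(w; x) \approx f(w_0; x) + \nabla_w f(w_0; x)^\top (w - w_0)$ along the optimization path, so $K_w$ stays close to its value at initialization.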

Hyper-parameter optimization based on soft actor critic and hierarchical mixture regularization

no code implementations 8 Dec 2021 Chaoyue Liu, Yulai Zhang

Hyper-parameter optimization is a crucial problem in machine learning, as it aims to achieve state-of-the-art performance for any given model.

Bayesian Optimization, reinforcement-learning +1

On the linearity of large non-linear models: when and why the tangent kernel is constant

no code implementations NeurIPS 2020 Chaoyue Liu, Libin Zhu, Mikhail Belkin

We show that the transition to linearity of the model and, equivalently, constancy of the (neural) tangent kernel (NTK) result from the scaling properties of the norm of the Hessian matrix of the network as a function of the network width.
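Schematically, and with notation assumed here (m denotes the network width): if the model gradient stays of order one while the Hessian of the model output shrinks,

$$ \|\nabla_w f(w; x)\| = O(1), \qquad \|\nabla_w^2 f(w; x)\| = \tilde{O}(1/\sqrt{m}), $$

then a Taylor expansion bounds the change of the tangent kernel by $\tilde{O}(1/\sqrt{m})$ in any ball of fixed radius around initialization, i.e. the NTK is nearly constant in the region where optimization takes place.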

Loss landscapes and optimization in over-parameterized non-linear systems and neural networks

no code implementations 29 Feb 2020 Chaoyue Liu, Libin Zhu, Mikhail Belkin

The success of deep learning is due, to a large extent, to the remarkable effectiveness of gradient-based optimization methods applied to large neural networks.

Accelerating SGD with momentum for over-parameterized learning

1 code implementation ICLR 2020 Chaoyue Liu, Mikhail Belkin

This is in contrast to the classical results in the deterministic setting, where the same step size ensures accelerated convergence of Nesterov's method over optimal gradient descent.
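For orientation, the update under discussion is stochastic Nesterov momentum; the sketch below is the standard form (the baseline being contrasted, not the modified method the paper proposes), with generic hyper-parameter names.

import numpy as np

def nesterov_sgd_step(w, v, stochastic_grad, lr=0.01, momentum=0.9):
    # Standard Nesterov momentum with a stochastic gradient: look ahead along
    # the velocity, evaluate the gradient there, then take the corrected step.
    lookahead = w + momentum * v
    v = momentum * v - lr * stochastic_grad(lookahead)
    return w + v, v

# Toy usage on f(w) = w^2 with noisy gradients (illustrative only).
rng = np.random.default_rng(0)
w, v = np.array([5.0]), np.zeros(1)
for _ in range(200):
    w, v = nesterov_sgd_step(w, v, lambda x: 2 * x + 0.1 * rng.normal(size=1))
print(w)  # near the minimizer 0, up to gradient noise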

Parametrized Accelerated Methods Free of Condition Number

no code implementations 28 Feb 2018 Chaoyue Liu, Mikhail Belkin

Analyses of accelerated (momentum-based) gradient descent usually assume a bounded condition number to obtain exponential convergence rates.
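For context, the classical guarantee alluded to is the standard one (not specific to this paper): for an $L$-smooth, $\mu$-strongly convex objective with condition number $\kappa = L/\mu$, Nesterov's accelerated method satisfies

$$ f(w_t) - f^\star \le C \left(1 - 1/\sqrt{\kappa}\right)^{t}, $$

where $C$ depends on the initialization; the contraction factor approaches 1 as $\kappa$ grows, so a meaningful exponential rate requires $\kappa$ to be bounded.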

Co-trained convolutional neural networks for automated detection of prostate cancer in multi-parametric MRI

1 code implementation Elsevier 2017 Xin Yang, Chaoyue Liu, Zhiwei Wang, Jun Yang, Hung Le Min, Liang Wang, Kwang-Ting (Tim) Cheng

Each network is trained on images of a single modality in a weakly supervised manner, using a set of prostate images with image-level labels that indicate only the presence of PCa, without priors on the lesions' locations.

General Classification
