no code implementations • 11 Feb 2024 • Liu Ziyin, Mingze Wang, Hongchao Li, Lei Wu

Symmetries exist abundantly in the loss function of neural networks.

no code implementations • 13 Jan 2024 • Yizhou Xu, Liu Ziyin

We identify and exactly solve the learning dynamics of a one-hidden-layer linear model at any finite width whose limits exhibit both the kernel phase and the feature learning phase.

no code implementations • 29 Sep 2023 • Liu Ziyin

Due to common architecture designs, symmetries exist extensively in contemporary neural networks.

no code implementations • 13 Aug 2023 • Liu Ziyin, Hongchao Li, Masahito Ueda

The stochastic gradient descent (SGD) algorithm is the de facto standard algorithm for training neural networks.

1 code implementation • 27 Mar 2023 • James B. Simon, Maksis Knutins, Liu Ziyin, Daniel Geisz, Abraham J. Fetterman, Joshua Albrecht

We present a simple picture of the training process of joint embedding self-supervised learning methods.

no code implementations • 23 Mar 2023 • Liu Ziyin, Botao Li, Tomer Galanti, Masahito Ueda

Characterizing and understanding the stability of Stochastic Gradient Descent (SGD) remains an open problem in deep learning.

2 code implementations • 3 Oct 2022 • Liu Ziyin, ZiHao Wang

We propose to minimize a generic differentiable objective with $L_1$ constraint using a simple reparametrization and straightforward stochastic gradient descent.
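The reparametrization in question can be sketched as follows (a minimal illustration, assuming the standard redundant-parameter trick: write $w = u \odot v$ elementwise and apply ordinary $L_2$ weight decay to $u$ and $v$, which at the optimum is equivalent to an $L_1$ penalty on $w$; the function and hyperparameter names below are illustrative, not from the paper):

```python
import numpy as np

# Minimize 0.5*||w - target||^2 + 2*lam*||w||_1 by instead running plain
# gradient descent on u, v with w = u * v and an L2 penalty lam*(||u||^2 +
# ||v||^2). The optimum is the soft-thresholding solution
# w_i = sign(t_i) * max(|t_i| - 2*lam, 0).
def solve_l1_reparam(target, lam=0.5, lr=0.01, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    d = target.shape[0]
    u = 0.1 * rng.normal(size=d)
    v = 0.1 * rng.normal(size=d)
    for _ in range(steps):
        w = u * v
        grad_w = w - target            # gradient of 0.5*||w - target||^2
        u, v = u - lr * (grad_w * v + 2 * lam * u), \
               v - lr * (grad_w * u + 2 * lam * v)
    return u * v

# With lam=0.5 the threshold is 2*lam = 1: large entries shrink by 1,
# entries below the threshold collapse to exactly zero (sparsity).
w = solve_l1_reparam(np.array([3.0, 0.1, -2.0]))
```

Note that both factors see only a smooth $L_2$ penalty, so any stochastic-gradient optimizer applies unchanged; the sparsity emerges from the optimum rather than from a proximal step.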

no code implementations • 2 Oct 2022 • Liu Ziyin, Ekdeep Singh Lubana, Masahito Ueda, Hidenori Tanaka

Prevention of complete and dimensional collapse of representations has recently become a design principle for self-supervised learning (SSL).

no code implementations • 25 May 2022 • Liu Ziyin, Masahito Ueda

This work reports deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics.

no code implementations • 9 May 2022 • ZiHao Wang, Liu Ziyin

This work identifies the existence and cause of a type of posterior collapse that frequently occurs in the Bayesian deep learning practice.

no code implementations • 10 Feb 2022 • Liu Ziyin, Botao Li, Xiangming Meng

This work finds the analytical expression of the global minima of a deep linear network with weight decay and stochastic neurons, a fundamental model for understanding the landscape of neural networks.

no code implementations • 30 Jan 2022 • Liu Ziyin, Hanlin Zhang, Xiangming Meng, Yuting Lu, Eric Xing, Masahito Ueda

This work theoretically studies stochastic neural networks, a main type of neural network in use.

no code implementations • 29 Sep 2021 • Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda

Stochastic gradient descent (SGD) undergoes complicated multiplicative noise for the mean-square loss.
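The multiplicative character of the noise can be seen in a one-parameter example (a standard illustration, not code from the paper): for the mean-square loss of a linear model $f(x) = wx$ with zero targets, the per-sample gradient is $w x^2$, so the minibatch-gradient variance scales with $w^2$, i.e. with the parameter itself.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)

def grad_variance(w):
    # Per-sample gradient of the mean-square loss 0.5*(w*x - 0)^2 is w*x^2;
    # its variance across samples is w^2 * Var(x^2), multiplicative in w.
    per_sample = (w * x) * x
    return per_sample.var()

v1, v2 = grad_variance(1.0), grad_variance(2.0)
# Doubling w quadruples the gradient noise: v2/v1 = 4.
```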

no code implementations • ICLR 2022 • Liu Ziyin, Botao Li, James B Simon, Masahito Ueda

Stochastic gradient descent (SGD) is widely used for the nonlinear, nonconvex problem of training deep neural networks, but its behavior remains poorly understood.

no code implementations • 25 Jul 2021 • Liu Ziyin, Botao Li, James B. Simon, Masahito Ueda

Previous works on stochastic gradient descent (SGD) often focus on its success.

1 code implementation • 8 Jun 2021 • Liu Ziyin, Kentaro Minami, Kentaro Imajo

The task we consider is portfolio construction in a speculative market, a fundamental problem in modern finance.

no code implementations • 20 May 2021 • Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda

Stochastic gradient descent (SGD) undergoes complicated multiplicative noise for the mean-square loss.

no code implementations • 15 May 2021 • Zhang Zhiyi, Liu Ziyin

Adaptive gradient methods have achieved remarkable success in training deep neural networks on a wide variety of tasks.

no code implementations • ICLR 2022 • Liu Ziyin, Kangqiao Liu, Takashi Mori, Masahito Ueda

The noise in stochastic gradient descent (SGD), caused by minibatch sampling, is poorly understood despite its practical importance in deep learning.

no code implementations • 7 Dec 2020 • Kangqiao Liu, Liu Ziyin, Masahito Ueda

In the vanishing learning rate regime, stochastic gradient descent (SGD) is now relatively well understood.

1 code implementation • 4 Dec 2020 • Paul Pu Liang, Peter Wu, Liu Ziyin, Louis-Philippe Morency, Ruslan Salakhutdinov

In this work, we propose algorithms for cross-modal generalization: a learning paradigm to train a model that can (1) quickly perform new tasks in a target modality (i.e. meta-learning) and (2) do so while being trained on a different source modality.

no code implementations • 23 Oct 2020 • Blair Chen, Liu Ziyin, ZiHao Wang, Paul Pu Liang

In this paper, as a step towards understanding why label smoothing is effective, we propose a theoretical framework to show how label smoothing helps control the generalization loss.

3 code implementations • NeurIPS 2020 • Liu Ziyin, Tilman Hartwig, Masahito Ueda

Previous literature offers limited clues on how to learn a periodic function using modern neural networks.
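The paper's proposed remedy is, to my understanding, the Snake activation $x + \frac{1}{a}\sin^2(ax)$, which is monotone yet carries a built-in periodic component; a minimal sketch (parameter name `a` follows that convention):

```python
import numpy as np

def snake(x, a=1.0):
    # Snake activation: x + (1/a) * sin^2(a*x). Unlike ReLU or tanh, it
    # repeats its local shape with period pi/a, which helps a network
    # extrapolate periodic structure beyond the training range.
    return x + np.sin(a * x) ** 2 / a

xs = np.linspace(-5.0, 5.0, 11)
ys = snake(xs)
```

A defining property (for `a=1`) is the discrete translation identity `snake(x + pi) == snake(x) + pi`, i.e. the nonlinearity is a periodic perturbation of the identity map.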

no code implementations • 25 Mar 2020 • Liu Ziyin, ZiHao Wang, Makoto Yamada, Masahito Ueda

We propose a novel regularization method, called \textit{volumization}, for neural networks.

no code implementations • 16 Feb 2020 • Liu Ziyin, Blair Chen, Ru Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda

Learning in the presence of label noise is a challenging yet important task: it is crucial to design models that are robust in the presence of mislabeled datasets.

1 code implementation • 12 Feb 2020 • Liu Ziyin, Zhikang T. Wang, Masahito Ueda

We also bound the regret of Laprop on a convex problem and show that our bound differs from that of Adam by a key factor, which demonstrates its advantage.
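The key structural difference from Adam can be sketched as follows (an assumption-laden sketch: LaProp is taken to normalize each raw gradient by the running second moment *before* accumulating momentum, whereas Adam applies momentum first and normalizes the momentum afterward; hyperparameter names follow common Adam conventions and details may differ from the paper):

```python
import numpy as np

def laprop_minimize(grad_fn, x0, lr=0.05, beta1=0.9, beta2=0.999,
                    eps=1e-8, steps=1000):
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)   # momentum of *normalized* gradients
    n = np.zeros_like(x)   # running second moment of *raw* gradients
    for t in range(1, steps + 1):
        g = grad_fn(x)
        n = beta2 * n + (1 - beta2) * g * g
        n_hat = n / (1 - beta2 ** t)              # bias correction
        # Normalize first, then feed into momentum (the LaProp-style order).
        m = beta1 * m + (1 - beta1) * g / (np.sqrt(n_hat) + eps)
        m_hat = m / (1 - beta1 ** t)
        x -= lr * m_hat
    return x

# Usage: minimize f(x) = ||x - 3||^2, whose gradient is 2*(x - 3).
x_star = laprop_minimize(lambda x: 2.0 * (x - 3.0), np.zeros(2))
```

Because the normalized increments are bounded, the update never couples the momentum direction with a stale second-moment estimate, which is the decoupling the comparison with Adam turns on.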

4 code implementations • 6 Jan 2020 • Paul Pu Liang, Terrance Liu, Liu Ziyin, Nicholas B. Allen, Randy P. Auerbach, David Brent, Ruslan Salakhutdinov, Louis-Philippe Morency

To this end, we propose a new federated learning algorithm that jointly learns compact local representations on each device and a global model across all devices.
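The local/global split described above can be caricatured in a toy round of training (purely illustrative: each device keeps a private linear encoder, and only a shared linear head is updated and averaged across devices, FedAvg-style; all names and the least-squares setting are assumptions, not the paper's algorithm):

```python
import numpy as np

def fed_round(local_encoders, global_head, device_data, lr=0.05):
    new_heads = []
    for enc, (X, y) in zip(local_encoders, device_data):
        h = X @ enc                        # local representation, stays on-device
        pred = h @ global_head
        grad = h.T @ (pred - y) / len(y)   # least-squares gradient on the head
        new_heads.append(global_head - lr * grad)
    return np.mean(new_heads, axis=0)      # server averages only the global head

rng = np.random.default_rng(1)
encoders = [rng.normal(size=(3, 2)) for _ in range(2)]
head_true = rng.normal(size=2)
data = []
for enc in encoders:
    X = rng.normal(size=(50, 3))
    data.append((X, X @ enc @ head_true))  # labels consistent with a shared head

head = np.zeros(2)
for _ in range(300):
    head = fed_round(encoders, head, data)
err = np.linalg.norm(head - head_true)     # shrinks over the rounds
```

Only the small head crosses the network, so the communication cost is independent of the (possibly large) encoders, which is the point of keeping representations local.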

no code implementations • 25 Sep 2019 • Liu Ziyin, Ru Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda

Learning in the presence of label noise is a challenging yet important task.

3 code implementations • NeurIPS 2019 • Liu Ziyin, Zhikang Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda

We deal with the \textit{selective classification} problem (supervised-learning problem with a rejection option), where we want to achieve the best performance at a certain level of coverage of the data.
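The loss this line of work builds on can be sketched as a gambler's-style objective (an assumption: the network outputs the usual class probabilities plus one extra "reject" probability, and training minimizes $-\log(p_y + p_{\text{reject}}/o)$ for a payoff parameter $o > 1$; the symbols and function name are illustrative):

```python
import numpy as np

def selective_loss(class_probs, reject_prob, labels, o=2.5):
    # p_y is the probability assigned to the true class; hedging mass onto the
    # reject output caps the loss near log(o) instead of letting it blow up.
    p_y = class_probs[np.arange(len(labels)), labels]
    return -np.log(p_y + reject_prob / o)

# Confident correct prediction: low loss. Ambiguous input with abstention:
# moderate, capped loss. Confident wrong prediction: large loss.
confident = selective_loss(np.array([[0.05, 0.95]]), np.array([0.0]), [1])
abstain = selective_loss(np.array([[0.05, 0.05]]), np.array([0.9]), [1])
wrong = selective_loss(np.array([[0.95, 0.05]]), np.array([0.0]), [1])
```

At inference time, coverage is then controlled by thresholding the reject output: examples whose reject probability exceeds the threshold are handed off rather than classified.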
