Search Results for author: Masahito Ueda

Found 27 papers, 6 papers with code

Symbolic Equation Solving via Reinforcement Learning

no code implementations • 24 Jan 2024 • Lennart Dabelow, Masahito Ueda

Machine-learning methods are gradually being adopted in a great variety of social, economic, and scientific contexts, yet they are notorious for struggling with exact mathematics.

reinforcement-learning

Law of Balance and Stationary Distribution of Stochastic Gradient Descent

no code implementations • 13 Aug 2023 • Liu Ziyin, Hongchao Li, Masahito Ueda

Stochastic gradient descent (SGD) is the workhorse algorithm for training neural networks.
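
The papers below repeatedly analyze this update, so a generic minibatch-SGD sketch may help orient the reader; it is a textbook sketch with placeholder data, not code from any of the listed papers.

    import numpy as np

    def sgd_step(w, X_batch, y_batch, lr=0.1):
        # Minibatch gradient of the mean-square loss 0.5 * mean((X w - y)^2) for a linear model
        residual = X_batch @ w - y_batch
        grad = X_batch.T @ residual / len(y_batch)
        return w - lr * grad

    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(32, 5)), rng.normal(size=32)
    w = np.zeros(5)
    for _ in range(200):
        batch = rng.choice(32, size=8, replace=False)  # draw a random minibatch
        w = sgd_step(w, X[batch], y[batch])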

The Probabilistic Stability of Stochastic Gradient Descent

no code implementations • 23 Mar 2023 • Liu Ziyin, Botao Li, Tomer Galanti, Masahito Ueda

Characterizing and understanding the stability of Stochastic Gradient Descent (SGD) remains an open problem in deep learning.

Learning Theory

What shapes the loss landscape of self-supervised learning?

no code implementations • 2 Oct 2022 • Liu Ziyin, Ekdeep Singh Lubana, Masahito Ueda, Hidenori Tanaka

Prevention of complete and dimensional collapse of representations has recently become a design principle for self-supervised learning (SSL).

Self-Supervised Learning

Three Learning Stages and Accuracy-Efficiency Tradeoff of Restricted Boltzmann Machines

1 code implementation • 2 Sep 2022 • Lennart Dabelow, Masahito Ueda

Restricted Boltzmann Machines (RBMs) offer a versatile architecture for unsupervised machine learning that can in principle approximate any target probability distribution with arbitrary accuracy.
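
For orientation, the textbook definition of an RBM with visible units v, hidden units h, weights W, and biases a, b assigns the Gibbs distribution below; this is standard background rather than a result of the paper.

    E(\mathbf v, \mathbf h) = -\mathbf a^{\top}\mathbf v - \mathbf b^{\top}\mathbf h - \mathbf v^{\top} W \mathbf h,
    \qquad
    p(\mathbf v) = \frac{1}{Z} \sum_{\mathbf h} e^{-E(\mathbf v, \mathbf h)},

and the universal-approximation property holds as the number of hidden units grows.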

Exact Phase Transitions in Deep Learning

no code implementations • 25 May 2022 • Liu Ziyin, Masahito Ueda

This work reports deep-learning-unique first-order and second-order phase transitions, whose phenomenology closely follows that in statistical physics.

Stochastic Neural Networks with Infinite Width are Deterministic

no code implementations • 30 Jan 2022 • Liu Ziyin, Hanlin Zhang, Xiangming Meng, Yuting Lu, Eric Xing, Masahito Ueda

This work presents a theoretical study of stochastic neural networks, a major class of neural networks in practical use.

Interplay between depth of neural networks and locality of target functions

no code implementations • 28 Jan 2022 • Takashi Mori, Masahito Ueda

It has been recognized that heavily overparameterized deep neural networks (DNNs) exhibit surprisingly good generalization performance in various machine-learning tasks.

Learning Theory

SGD Can Converge to Local Maxima

no code implementations • ICLR 2022 • Liu Ziyin, Botao Li, James B. Simon, Masahito Ueda

Stochastic gradient descent (SGD) is widely used for the nonlinear, nonconvex problem of training deep neural networks, but its behavior remains poorly understood.

Logarithmic landscape and power-law escape rate of SGD

no code implementations • 29 Sep 2021 • Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda

Stochastic gradient descent (SGD) undergoes complicated multiplicative noise for the mean-square loss.
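
A one-dimensional sketch of where the multiplicative noise comes from (the standard argument, not the paper's derivation): for per-sample losses \ell_i(w) = \tfrac{1}{2}(w x_i - y_i)^2 and a minibatch of size B drawn with replacement,

    g_B(w) = \frac{1}{B} \sum_{i \in B} (w x_i - y_i)\, x_i,
    \qquad
    \operatorname{Var}[g_B(w)] = \frac{1}{B} \Big( \big\langle (w x - y)^2 x^2 \big\rangle - \big\langle (w x - y)\, x \big\rangle^2 \Big),

so the noise strength depends on the parameter w itself rather than being a fixed additive term.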

Convergent and Efficient Deep Q Learning Algorithm

no code implementations • ICLR 2022 • Zhikang T. Wang, Masahito Ueda

Despite the empirical success of the deep Q network (DQN) reinforcement learning algorithm and its variants, DQN is still not well understood, and its convergence is not guaranteed.

Q-Learning, reinforcement-learning +1

SGD with a Constant Large Learning Rate Can Converge to Local Maxima

no code implementations • 25 Jul 2021 • Liu Ziyin, Botao Li, James B. Simon, Masahito Ueda

Previous works on stochastic gradient descent (SGD) often focus on its success.

Convergent and Efficient Deep Q Network Algorithm

1 code implementation • 29 Jun 2021 • Zhikang T. Wang, Masahito Ueda

Despite the empirical success of the deep Q network (DQN) reinforcement learning algorithm and its variants, DQN is still not well understood, and its convergence is not guaranteed.

reinforcement-learning, Reinforcement Learning (RL)
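
For reference, the objective that vanilla DQN minimizes, and whose convergence is not guaranteed in general, is the squared temporal-difference error sketched below; this illustrates standard DQN rather than the convergent variant proposed in the paper, and the array names are placeholders.

    import numpy as np

    def dqn_td_loss(q_online, q_target, s, a, r, s_next, done, gamma=0.99):
        # q_online, q_target: callables mapping a batch of states to Q-values of shape (B, n_actions)
        q_sa = q_online(s)[np.arange(len(a)), a]                           # Q(s, a)
        target = r + gamma * (1.0 - done) * q_target(s_next).max(axis=1)   # r + gamma * max_a' Q_target(s', a')
        return np.mean((q_sa - target) ** 2)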

Power-law escape rate of SGD

no code implementations • 20 May 2021 • Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda

Stochastic gradient descent (SGD) undergoes complicated multiplicative noise for the mean-square loss.

Strength of Minibatch Noise in SGD

no code implementations • ICLR 2022 • Liu Ziyin, Kangqiao Liu, Takashi Mori, Masahito Ueda

The noise in stochastic gradient descent (SGD), caused by minibatch sampling, is poorly understood despite its practical importance in deep learning.
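
As background, the usual estimate of this noise (quoted for orientation only and not necessarily in the paper's notation) is the covariance of the minibatch gradient for batch size B sampled without replacement from N examples with per-sample gradients g_i and full-batch gradient \bar g:

    \operatorname{Cov}[\hat g_B] = \frac{N - B}{B(N - 1)} \cdot \frac{1}{N} \sum_{i=1}^{N} (g_i - \bar g)(g_i - \bar g)^{\top}.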

Embedding the Yang-Lee Quantum Criticality in Open Quantum Systems

no code implementations • 24 Dec 2020 • Norifumi Matsumoto, Masaya Nakagawa, Masahito Ueda

The Yang-Lee edge singularity is a quintessential nonunitary critical phenomenon accompanied by anomalous scaling laws.

Statistical Mechanics, Quantum Physics

Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent

no code implementations • 7 Dec 2020 • Kangqiao Liu, Liu Ziyin, Masahito Ueda

In the vanishing learning rate regime, stochastic gradient descent (SGD) is now relatively well understood.

Bayesian Inference, Second-order methods

Improved generalization by noise enhancement

no code implementations • 28 Sep 2020 • Takashi Mori, Masahito Ueda

Recent studies have demonstrated that noise in stochastic gradient descent (SGD) is closely related to generalization: larger SGD noise, as long as it is not too large, results in better generalization.

Intercomponent entanglement entropy and spectrum in binary Bose-Einstein condensates

no code implementations • 7 Sep 2020 • Takumi Yoshino, Shunsuke Furukawa, Masahito Ueda

We study the entanglement entropy and spectrum between components in binary Bose-Einstein condensates in $d$ spatial dimensions.

Quantum Gases, Statistical Mechanics, Quantum Physics

Neural Networks Fail to Learn Periodic Functions and How to Fix It

3 code implementations • NeurIPS 2020 • Liu Ziyin, Tilman Hartwig, Masahito Ueda

Previous literature offers limited clues on how to learn a periodic function using modern neural networks.

Inductive Bias
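
The fix proposed in this paper is commonly known as the Snake activation; the sketch below assumes the x + sin^2(a x)/a form usually quoted for it.

    import numpy as np

    def snake(x, a=1.0):
        # Linear trend plus a bounded periodic term; the derivative 1 + sin(2 a x) is non-negative
        return x + np.sin(a * x) ** 2 / a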

Is deeper better? It depends on locality of relevant features

no code implementations • 26 May 2020 • Takashi Mori, Masahito Ueda

It is shown that the neural tangent kernel (NTK) does not correctly capture the depth dependence of the generalization performance, which indicates the importance of feature learning rather than lazy learning.

General Classification

Volumization as a Natural Generalization of Weight Decay

no code implementations • 25 Mar 2020 • Liu Ziyin, ZiHao Wang, Makoto Yamada, Masahito Ueda

We propose a novel regularization method, called volumization, for neural networks.

Memorization

Learning Not to Learn in the Presence of Noisy Labels

no code implementations • 16 Feb 2020 • Liu Ziyin, Blair Chen, Ru Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda

Learning in the presence of label noise is a challenging yet important task: it is crucial to design models that are robust in the presence of mislabeled datasets.

Memorization, text-classification +1

LaProp: Separating Momentum and Adaptivity in Adam

1 code implementation • 12 Feb 2020 • Liu Ziyin, Zhikang T. Wang, Masahito Ueda

We also bound the regret of LaProp on a convex problem and show that our bound differs from that of Adam by a key factor, which demonstrates its advantage.

Style Transfer
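
A simplified sketch of the LaProp update, contrasted with Adam: the gradient is normalized by the running second-moment estimate before the momentum accumulation rather than after. Bias-correction details are abbreviated, and the hyperparameter names follow the usual Adam conventions.

    import numpy as np

    def laprop_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-15):
        v = beta2 * v + (1 - beta2) * grad ** 2
        v_hat = v / (1 - beta2 ** t)                                  # bias-corrected second moment
        m = beta1 * m + (1 - beta1) * grad / (np.sqrt(v_hat) + eps)   # momentum of the normalized gradient
        return w - lr * m, m, v                                       # Adam would instead divide the momentum by sqrt(v_hat) here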

Deep Reinforcement Learning Control of Quantum Cartpoles

1 code implementation • 21 Oct 2019 • Zhikang T. Wang, Yuto Ashida, Masahito Ueda

We generalize a standard benchmark of reinforcement learning, the classical cartpole balancing problem, to the quantum regime by stabilizing a particle in an unstable potential through measurement and feedback.

reinforcement-learning, Reinforcement Learning (RL)

Deep Gamblers: Learning to Abstain with Portfolio Theory

3 code implementations • NeurIPS 2019 • Liu Ziyin, Zhikang Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda

We deal with the selective classification problem (a supervised-learning problem with a rejection option), where we want to achieve the best performance at a certain level of coverage of the data.

Classification, General Classification
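
The portfolio-theory view is often summarized as a "gambler's loss": the network outputs the m class probabilities plus one abstention probability, and for a payoff hyperparameter o > 1 each example contributes -log(p_y + p_abstain / o). The sketch below is a paraphrase of that description with placeholder names, not the authors' released code.

    import numpy as np

    def gamblers_loss(probs, labels, payoff=2.0):
        # probs: (B, m+1) softmax outputs whose last column is the abstention probability
        # labels: (B,) integer class indices in [0, m)
        p_correct = probs[np.arange(len(labels)), labels]
        p_abstain = probs[:, -1]
        return -np.mean(np.log(p_correct + p_abstain / payoff))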
