no code implementations • 24 Jul 2023 • Shun-ichi Amari, Takeru Matsuda
The shape of a probability distribution and its affine deformation are separated in the Wasserstein geometry, showing robustness against waveform perturbation at the cost of Fisher efficiency.
1 code implementation • 10 Feb 2022 • Kaito Watanabe, Kotaro Sakamoto, Ryo Karakida, Sho Sonoda, Shun-ichi Amari
In this paper, we study such neural fields in a multilayer architecture and investigate the supervised learning of the fields.
no code implementations • ICLR 2021 • Shun-ichi Amari, Jimmy Ba, Roger Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu
While second order optimizers such as natural gradient descent (NGD) often speed up optimization, their effect on generalization has been called into question.
no code implementations • 20 Jan 2020 • Shun-ichi Amari
It is known that any target function is realized in a sufficiently small neighborhood of any randomly connected deep network, provided the width (the number of neurons in a layer) is sufficiently large.
no code implementations • 14 Oct 2019 • Ryo Karakida, Shotaro Akaho, Shun-ichi Amari
The Fisher information matrix (FIM) plays an essential role in statistics and machine learning as a Riemannian metric tensor or a component of the Hessian matrix of loss functions.
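The role of the FIM can be illustrated on a toy model (a sketch of my own, not from the paper): for a univariate Gaussian with known variance, the Fisher information of the mean parameter is 1/σ², and a Monte Carlo average of squared score functions recovers it.

```python
import numpy as np

# Empirical Fisher information for a univariate Gaussian N(mu, sigma^2)
# with sigma fixed: the score is d/dmu log p(x) = (x - mu) / sigma^2,
# and the FIM is E[score^2] = 1 / sigma^2.
rng = np.random.default_rng(0)
mu, sigma = 0.0, 2.0
x = rng.normal(mu, sigma, size=200_000)

score = (x - mu) / sigma**2          # per-sample score function
fim_empirical = np.mean(score**2)    # Monte Carlo estimate of the FIM
fim_exact = 1.0 / sigma**2

print(fim_empirical, fim_exact)      # the two values should be close
```

For multi-parameter models the same recipe gives a matrix: average the outer products of per-sample score vectors.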
no code implementations • NeurIPS 2019 • Ryo Karakida, Shotaro Akaho, Shun-ichi Amari
Thus, we can conclude that batch normalization in the last layer significantly contributes to decreasing the sharpness induced by the FIM.
1 code implementation • 18 Oct 2018 • Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, Gabriel Peyré
Comparing probability distributions is a fundamental problem in data sciences.
no code implementations • 22 Aug 2018 • Shun-ichi Amari, Ryo Karakida, Masafumi Oizumi
The manifold of input signals is embedded in a higher dimensional manifold of the next layer as a curved submanifold, provided the number of neurons is larger than that of inputs.
no code implementations • 22 Aug 2018 • Shun-ichi Amari, Ryo Karakida, Masafumi Oizumi
The natural gradient method follows the steepest descent direction in a Riemannian manifold, making it effective for learning and helping it avoid plateaus.
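Natural-gradient learning can be sketched on a model whose Fisher metric is known in closed form (a toy of my own, not from the paper): for a 1-D Gaussian parameterized as (μ, s) with σ = exp(s), the FIM is diag(exp(-2s), 2), so the natural gradient is the ordinary gradient rescaled by F⁻¹.

```python
import numpy as np

# Natural-gradient ascent on the log-likelihood of a 1-D Gaussian
# N(mu, exp(2s)). The score components are
#   d/dmu  log p = (x - mu) / sigma^2,   d/ds log p = (x - mu)^2/sigma^2 - 1,
# and the Fisher matrix is F = diag(1/sigma^2, 2).
rng = np.random.default_rng(1)
data = rng.normal(3.0, 0.5, size=10_000)

mu, s = 0.0, 0.0          # start far from the optimum
lr = 0.5
for _ in range(100):
    sigma2 = np.exp(2 * s)
    g_mu = np.mean(data - mu) / sigma2            # d loglik / d mu
    g_s = np.mean((data - mu) ** 2) / sigma2 - 1  # d loglik / d s
    mu += lr * sigma2 * g_mu                      # F^{-1} rescaling: * sigma^2
    s += lr * g_s / 2                             # F^{-1} rescaling: / 2

print(mu, np.exp(s))  # approaches the sample mean and std
```

Note that the F⁻¹ rescaling makes the μ-update step independent of the current σ, which is exactly the plateau-avoiding behavior the entry describes: a plain gradient step on μ would stall whenever σ² is large.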
no code implementations • 4 Jun 2018 • Ryo Karakida, Shotaro Akaho, Shun-ichi Amari
The Fisher information matrix (FIM) is a fundamental quantity to represent the characteristics of a stochastic model, including deep neural networks (DNNs).
no code implementations • 9 Oct 2014 • Qibin Zhao, Guoxu Zhou, Liqing Zhang, Andrzej Cichocki, Shun-ichi Amari
We propose a generative model for robust tensor factorization in the presence of both missing data and outliers.
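The problem setting can be sketched numerically with a gradient-descent toy (my own sketch with soft-thresholded outliers, not the paper's Bayesian generative model; all names and parameters are illustrative): fit a low-rank CP model plus a sparse outlier term S to a partially observed tensor, minimizing the masked residual with an L1 penalty on S.

```python
import numpy as np

# Robust CP factorization sketch: X_obs = low-rank CP + sparse outliers,
# observed only where the mask W is True. The L1 penalty on the outlier
# term S is handled by soft-thresholding (its proximal operator).
rng = np.random.default_rng(2)
I, J, K, R = 10, 10, 10, 2

# Ground-truth low-rank tensor, a few large outliers, a missing-data mask.
A0, B0, C0 = (rng.normal(size=(n, R)) for n in (I, J, K))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
outliers = (rng.random(X.shape) < 0.02) * 10.0
W = rng.random(X.shape) < 0.8                 # True where observed
X_obs = X + outliers

A, B, C = (0.3 * rng.normal(size=(n, R)) for n in (I, J, K))
lr, lam = 0.005, 1.0
for _ in range(2000):
    recon = np.einsum('ir,jr,kr->ijk', A, B, C)
    r = X_obs - recon
    # Proximal step for the L1 penalty: S absorbs large residuals (outliers).
    S = W * np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)
    E = W * (r - S)                           # masked, outlier-trimmed residual
    gA = np.einsum('ijk,jr,kr->ir', E, B, C)  # gradient steps on the factors
    gB = np.einsum('ijk,ir,kr->jr', E, A, C)
    gC = np.einsum('ijk,ir,jr->kr', E, A, B)
    A += lr * gA
    B += lr * gB
    C += lr * gC

err = np.linalg.norm(recon - X) / np.linalg.norm(X)
print(err)  # relative error against the clean low-rank tensor
```

Because S is recomputed from the current residual each step, the effective residual fed to the factor updates is clipped at ±lam, which is what makes the fit robust to the injected outliers.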