Search Results for author: Sho Yaida

Found 6 papers, 2 papers with code

Effective Theory of Transformers at Initialization

no code implementations 4 Apr 2023 Emily Dinan, Sho Yaida, Susan Zhang

We perform an effective-theory analysis of forward-backward signal propagation in wide and deep Transformers, i.e., residual neural networks with multi-head self-attention blocks and multilayer perceptron blocks.

Meta-Principled Family of Hyperparameter Scaling Strategies

no code implementations 10 Oct 2022 Sho Yaida

In this note, we first derive a one-parameter family of hyperparameter scaling strategies that interpolates between the neural-tangent scaling and mean-field/maximal-update scaling.

Representation Learning

The Principles of Deep Learning Theory

no code implementations 18 Jun 2021 Daniel A. Roberts, Sho Yaida, Boris Hanin

This book develops an effective theory approach to understanding deep neural networks of practical relevance.

Inductive Bias, Learning Theory +1

Fluctuation-dissipation relations for stochastic gradient descent

2 code implementations ICLR 2019 Sho Yaida

The notion of the stationary equilibrium ensemble has played a central role in statistical mechanics.

BIG-bench Machine Learning
