Search Results for author: Yoshiki Tanaka

Found 3 papers, 1 paper with code

Cuttlefish: Low-Rank Model Training without All the Tuning

1 code implementation • 4 May 2023 • Hongyi Wang, Saurabh Agarwal, Pongsakorn U-chupala, Yoshiki Tanaka, Eric P. Xing, Dimitris Papailiopoulos

Cuttlefish leverages the observation that after a few epochs of full-rank training, the stable rank (i.e., an approximation of the true rank) of each layer stabilizes at a constant value.
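The stable rank mentioned in the abstract has a simple closed form: the squared Frobenius norm divided by the squared spectral norm of a layer's weight matrix. Below is a minimal NumPy sketch of that quantity; the random weight matrix and the idea of monitoring the value across epochs are illustrative assumptions, not Cuttlefish's actual training code.

```python
import numpy as np

def stable_rank(w: np.ndarray) -> float:
    """Stable rank ||W||_F^2 / ||W||_2^2, a smooth approximation of rank(W)."""
    fro_sq = np.sum(w ** 2)             # squared Frobenius norm
    spec = np.linalg.norm(w, ord=2)     # largest singular value (spectral norm)
    return float(fro_sq / spec ** 2)

# Illustrative use: track a layer's stable rank across epochs; once it
# plateaus, the layer could be switched to a low-rank factorization.
w = np.random.randn(512, 256)
print(stable_rank(w))
```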

MRL: Learning to Mix with Attention and Convolutions

no code implementations • 30 Aug 2022 • Shlok Mohta, Hisahiro Suganuma, Yoshiki Tanaka

To achieve an efficient mix, we exploit the domain-wide receptive field provided by self-attention for regional-scale mixing, and convolutional kernels restricted to a local scale for local-scale mixing.

Histopathological Segmentation • Inductive Bias • +3
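As a rough illustration of the attention-plus-convolution mixing described in the snippet above, here is a minimal PyTorch sketch in which multi-head self-attention supplies the domain-wide (regional-scale) mixing and a depthwise convolution supplies the local-scale mixing. The additive combination, dimensions, head count, and kernel size are assumptions made for illustration, not the MRL architecture.

```python
import torch
import torch.nn as nn

class MixBlock(nn.Module):
    """Illustrative mixer: self-attention for regional-scale (global) mixing,
    a depthwise convolution for local-scale mixing. Hyperparameters are assumed."""
    def __init__(self, dim: int = 64, heads: int = 4, kernel: int = 3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.local = nn.Conv1d(dim, dim, kernel, padding=kernel // 2, groups=dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, tokens, dim)
        global_mix, _ = self.attn(x, x, x)                          # domain-wide receptive field
        local_mix = self.local(x.transpose(1, 2)).transpose(1, 2)   # local receptive field
        return x + global_mix + local_mix                           # simple additive combination

x = torch.randn(2, 49, 64)
print(MixBlock()(x).shape)  # torch.Size([2, 49, 64])
```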

Massively Distributed SGD: ImageNet/ResNet-50 Training in a Flash

no code implementations • 13 Nov 2018 • Hiroaki Mikami, Hisahiro Suganuma, Pongsakorn U-chupala, Yoshiki Tanaka, Yuichi Kageyama

Scaling distributed deep learning to a massive GPU cluster is challenging due to the instability of large mini-batch training and the overhead of gradient synchronization.
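For context on the synchronization overhead mentioned in the abstract, the sketch below shows the per-step gradient averaging that generic data-parallel SGD performs with an all-reduce via torch.distributed; it is an illustrative assumption about the general technique, not the paper's specific system.

```python
import torch
import torch.distributed as dist

def synchronize_gradients(model: torch.nn.Module, world_size: int) -> None:
    """Average gradients across workers with an all-reduce after backward().
    Each step pays communication cost proportional to model size, which is
    the synchronization overhead that grows painful at massive cluster scale."""
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)  # sum gradients over workers
            param.grad.div_(world_size)                        # average in place
```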
