no code implementations • ICLR 2019 • Yuchen Qiao, Kenjiro Taura
These strategies, however, miss important opportunities to group operations during the backward propagation phase of neural network training.
no code implementations • 17 Aug 2023 • LiMin Wang, Masatoshi Hanai, Toyotaro Suzumura, Shun Takashige, Kenjiro Taura
In this study, we propose an effective pre-training method that addresses the imbalance in input data.
no code implementations • 16 Aug 2023 • Shun Takashige, Masatoshi Hanai, Toyotaro Suzumura, LiMin Wang, Kenjiro Taura
In materials science, the prediction of unobserved values, commonly referred to as extrapolation, is particularly critical for property prediction, as it enables researchers to gain insight into materials beyond the limits of available data.
no code implementations • 27 Mar 2022 • Toyotaro Suzumura, Akiyoshi Sugiki, Hiroyuki Takizawa, Akira Imakura, Hiroshi Nakamura, Kenjiro Taura, Tomohiro Kudoh, Toshihiro Hanawa, Yuji Sekiya, Hiroki Kobayashi, Shin Matsushima, Yohei Kuga, Ryo Nakamura, Renhe Jiang, Junya Kawase, Masatoshi Hanai, Hiroshi Miyazaki, Tsutomu Ishizaki, Daisuke Shimotoku, Daisuke Miyamoto, Kento Aida, Atsuko Takefusa, Takashi Kurimoto, Koji Sasayama, Naoya Kitagawa, Ikki Fujiwara, Yusuke Tanimura, Takayuki Aoki, Toshio Endo, Satoshi Ohshima, Keiichiro Fukazawa, Susumu Date, Toshihiro Uchibayashi
The growing amount of data and advances in data science have created a need for a new kind of cloud platform that provides users with flexibility, strong security, and the ability to couple with supercomputers and edge devices through high-performance networks.
no code implementations • 30 Mar 2021 • Masahiro Tanaka, Kenjiro Taura, Toshihiro Hanawa, Kentaro Torisawa
RaNNC also achieved higher training throughput than GPipe in all of the settings we tried, both for pre-training the enlarged BERT model (GPipe with hybrid parallelism) and for training the enlarged ResNet models (GPipe with model parallelism).