Prior work on FDSL has shown that pre-training vision transformers on such synthetic datasets can yield competitive accuracy on a wide range of downstream tasks.
Gradient preconditioning is a key technique for integrating second-order information into gradients, improving and extending gradient-based learning algorithms.
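As a minimal sketch of the idea (not the paper's method), the NumPy example below preconditions gradient descent on an ill-conditioned toy quadratic using a diagonal curvature estimate; the objective, the exact diagonal preconditioner, and all names are illustrative assumptions.

    import numpy as np

    # Toy quadratic objective f(w) = 0.5 * w^T A w with ill-conditioned curvature A.
    A = np.diag([100.0, 1.0])
    w = np.array([1.0, 1.0])

    def grad(w):
        return A @ w

    # Preconditioned update: scale the gradient by an estimate of the inverse
    # curvature (here the exact diagonal of A, purely for illustration).
    P_inv = 1.0 / np.diag(A)
    lr = 0.9
    for _ in range(10):
        w = w - lr * P_inv * grad(w)
    print(w)  # converges rapidly; plain gradient descent needs lr < 0.02 to stay stable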
Unlike JFT-300M, which is a static dataset, synthetic datasets will continue to improve in quality, and the current work is a testament to this possibility.
Among various supervised deep metric learning methods, proxy-based approaches have achieved high retrieval accuracy.
Modern deep learning systems do not generalize well when the test data distribution differs even slightly from the training data distribution.
In the present work, we show that the performance of formula-driven supervised learning (FDSL) can match or even exceed that of ImageNet-21k, without using real images, human supervision, or self-supervision during the pre-training of Vision Transformers (ViTs).
Generalization measures are intensively studied in the machine learning community as a way to better model generalization gaps.
Inverse Reinforcement Learning (IRL) is attractive in scenarios where reward engineering can be tedious.
Furthermore, we utilize differentiable Levenberg-Marquardt (LM) optimization to refine the pose quickly and accurately by minimizing the feature-metric error between the input and rendered image representations, without the need to zoom in.
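The sketch below shows a damped Gauss-Newton (Levenberg-Marquardt) update on a toy least-squares problem; the feature-metric residual and Jacobian of the actual pipeline are replaced by hypothetical stand-ins.

    import numpy as np

    def lm_step(residual_fn, jac_fn, x, lam=1e-2):
        # One LM update: solve (J^T J + lam * I) dx = -J^T r.
        r = residual_fn(x)
        J = jac_fn(x)
        H = J.T @ J + lam * np.eye(x.size)  # damped Gauss-Newton Hessian
        return x + np.linalg.solve(H, -J.T @ r)

    # Toy problem standing in for the feature-metric error:
    # r(x) = [x0 + x1 - 3, x0 - x1 - 1], solved by x = (2, 1).
    residual = lambda x: np.array([x[0] + x[1] - 3.0, x[0] - x[1] - 1.0])
    jacobian = lambda x: np.array([[1.0, 1.0], [1.0, -1.0]])

    x = np.zeros(2)
    for _ in range(10):
        x = lm_step(residual, jacobian, x)
    print(x)  # -> approximately [2, 1]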
To cope with this difficulty, we introduce a deep graph matching network that establishes object correspondences between an image pair.
Large-scale distributed training of deep neural networks yields models with worse generalization performance, owing to the increase in the effective mini-batch size.
A hierarchical matrix (H-matrix) is an approximation technique that splits a target dense matrix into multiple submatrices and approximates a selected portion of those submatrices by low-rank factorizations.
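As a one-level illustration (real H-matrix codes partition recursively and use admissibility criteria to decide which blocks to compress), the sketch below splits a kernel matrix into 2x2 blocks and low-rank-approximates the off-diagonal ones with a truncated SVD; the kernel and the rank are assumptions for illustration.

    import numpy as np

    def truncated_svd(block, rank):
        # Low-rank factors U, V such that block ≈ U @ V.
        U, s, Vt = np.linalg.svd(block, full_matrices=False)
        return U[:, :rank] * s[:rank], Vt[:rank, :]

    # Dense kernel matrix whose off-diagonal blocks are numerically low-rank.
    n, rank = 128, 4
    pts = np.linspace(0.0, 1.0, n)
    K = 1.0 / (1.0 + np.abs(pts[:, None] - pts[None, :]))

    h = n // 2
    U12, V12 = truncated_svd(K[:h, h:], rank)  # compress off-diagonal blocks
    U21, V21 = truncated_svd(K[h:, :h], rank)

    approx = np.block([[K[:h, :h], U12 @ V12],   # diagonal blocks stay dense
                       [U21 @ V21, K[h:, h:]]])
    print(np.linalg.norm(K - approx) / np.linalg.norm(K))  # small relative error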
Importantly, the benefits of Bayesian principles are preserved: predictive probabilities are well-calibrated, uncertainties on out-of-distribution data are improved, and continual-learning performance is boosted.
With distributed memory optimizations, on the other hand, we report near-optimal efficiency in the weak-scalability study with respect to both the logarithmic communication complexity and the theoretical scaling complexity of FMM.
Fast multipole methods (FMM) on distributed memory have traditionally used a bulk-synchronous model of communicating the local essential tree (LET) and overlapping it with computation on the local data.
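A schematic of that bulk-synchronous pattern in mpi4py, assuming mpi4py is available; the LET extraction and the evaluation kernel are hypothetical stubs, and a real code would ship a pruned tree rather than raw bodies.

    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Hypothetical stand-ins: each rank owns a chunk of bodies, and the "LET"
    # it ships to a peer is simply that chunk.
    local_bodies = np.random.rand(500, 3)
    def let_for(peer):                  # hypothetical LET extraction
        return local_bodies
    def evaluate(targets, sources):     # hypothetical near/far-field evaluation
        return len(sources)

    peers = [p for p in range(size) if p != rank]
    send_reqs = [comm.isend(let_for(p), dest=p, tag=0) for p in peers]
    recv_reqs = [comm.irecv(source=p, tag=0) for p in peers]

    evaluate(local_bodies, local_bodies)          # overlap: compute on local data

    remote_lets = [r.wait() for r in recv_reqs]   # finish the LET exchange
    for r in send_reqs:
        r.wait()
    for let in remote_lets:
        evaluate(local_bodies, let)               # then compute on received data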