no code implementations • ICML 2020 • Yasutoshi Ida, Sekitoshi Kanai, Yasuhiro Fujiwara, Tomoharu Iwata, Koh Takeuchi, Hisashi Kashima
This is because coordinate descent iteratively updates all the parameters in the objective until convergence.
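As rough background on the iterative updates mentioned here (a textbook coordinate-descent sketch for the Lasso objective, not the accelerated method this paper proposes), each sweep updates every coordinate in turn until convergence:

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used in Lasso coordinate updates."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iters=100):
    """Cyclic coordinate descent for min_w 0.5*||y - Xw||^2 + lam*||w||_1."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)           # precomputed squared column norms
    for _ in range(n_iters):                 # sweep over all coordinates until convergence
        for j in range(d):
            r = y - X @ w + X[:, j] * w[j]   # partial residual excluding coordinate j
            w[j] = soft_threshold(X[:, j] @ r, lam) / col_sq[j]
    return w
```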
no code implementations • 14 Mar 2023 • Yasutoshi Ida, Sekitoshi Kanai, Kazuki Adachi, Atsutoshi Kumagai, Yasuhiro Fujiwara
Regularized discrete optimal transport (OT) is a powerful tool to measure the distance between two discrete distributions that have been constructed from data samples on two different domains.
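For context, entropic regularization is the most common form of regularized discrete OT; a minimal Sinkhorn-style sketch (a generic baseline under that assumption, not the method studied in this paper) looks like:

```python
import numpy as np

def sinkhorn_distance(a, b, C, eps=0.1, n_iters=200):
    """Entropy-regularized OT between histograms a and b with ground-cost matrix C."""
    K = np.exp(-C / eps)                  # Gibbs kernel from the ground cost
    u = np.ones_like(a)
    for _ in range(n_iters):              # alternating scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]       # regularized transport plan
    return (P * C).sum()                  # transport cost between the two distributions
```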
no code implementations • 4 Oct 2022 • Kentaro Ohno, Sekitoshi Kanai, Yasutoshi Ida
We prove that the vanishing gradient of the gate function can be mitigated by accelerating the convergence of the saturating function, i.e., by making the output of the function converge to 0 or 1 faster.
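As a rough illustration of this idea (the temperature parameterization below is our assumption, not necessarily the paper's construction), a sigmoid gate can be made to saturate toward 0 or 1 faster by steepening its slope:

```python
import numpy as np

def fast_saturating_gate(x, temperature=0.25):
    """Sigmoid gate with temperature < 1: the steeper slope makes the output
    converge to 0 or 1 faster as |x| grows, i.e., the gate saturates sooner."""
    return 1.0 / (1.0 + np.exp(-x / temperature))
```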
no code implementations • 21 Jul 2022 • Sekitoshi Kanai, Shin'ya Yamaguchi, Masanori Yamada, Hiroshi Takahashi, Kentaro Ohno, Yasutoshi Ida
This paper proposes a new loss function for adversarial training.
1 code implementation • 31 May 2022 • Daiki Chijiwa, Shin'ya Yamaguchi, Atsutoshi Kumagai, Yasutoshi Ida
Few-shot learning for neural networks (NNs) is an important problem that aims to train NNs from only a small amount of data.
1 code implementation • NeurIPS 2021 • Daiki Chijiwa, Shin'ya Yamaguchi, Yasutoshi Ida, Kenji Umakoshi, Tomohiro Inoue
Pruning the weights of randomly initialized neural networks plays an important role in the context of lottery ticket hypothesis.
no code implementations • 2 Mar 2021 • Sekitoshi Kanai, Masanori Yamada, Hiroshi Takahashi, Yuki Yamanaka, Yasutoshi Ida
We reveal that the constraint imposed on adversarial attacks is one cause of the non-smoothness, and that the degree of smoothness depends on the type of constraint.
no code implementations • 6 Oct 2020 • Sekitoshi Kanai, Masanori Yamada, Shin'ya Yamaguchi, Hiroshi Takahashi, Yasutoshi Ida
We theoretically and empirically reveal that making logits small by adding a common activation function, e.g., the hyperbolic tangent, does not improve adversarial robustness, since the input vectors of the function (pre-logit vectors) can have large norms.
no code implementations • NeurIPS 2019 • Yasutoshi Ida, Yasuhiro Fujiwara, Hisashi Kashima
Block Coordinate Descent is a standard approach to obtain the parameters of Sparse Group Lasso, and iteratively updates the parameters for each parameter group.
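As background on the block updates described here (a generic block-wise proximal sweep for Sparse Group Lasso, not the speedup this paper proposes; the step size is a placeholder), each parameter group takes a gradient step followed by the combined l1 + group-l2 proximal operator:

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sgl_block_sweep(X, y, w, groups, lam, alpha, step):
    """One sweep of block-wise proximal updates for Sparse Group Lasso.

    Objective: 0.5*||y - Xw||^2 + alpha*lam*||w||_1
               + (1 - alpha)*lam * sum_g ||w_g||_2.
    """
    for g in groups:                           # groups: list of index arrays
        r = y - X @ w                          # residual with current parameters
        grad_g = -X[:, g].T @ r                # gradient w.r.t. the group's weights
        z = soft_threshold(w[g] - step * grad_g, step * alpha * lam)
        shrink = max(0.0, 1.0 - step * (1 - alpha) * lam / (np.linalg.norm(z) + 1e-12))
        w[g] = shrink * z                      # group soft-thresholding (may zero the whole group)
    return w
```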
no code implementations • 19 Sep 2019 • Sekitoshi Kanai, Yasutoshi Ida, Yasuhiro Fujiwara, Masanori Yamada, Shuichi Adachi
Furthermore, we reveal that robust CNNs with Absum are more robust than those trained with standard regularization methods against transferred attacks, owing to the reduced common sensitivity, and against high-frequency noise.
no code implementations • 10 Jun 2019 • Yasutoshi Ida, Yasuhiro Fujiwara
Our key idea is to introduce a priority term that identifies the importance of each layer; we can select unimportant layers according to the priority and erase them after training.
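The snippet below is only a schematic of the selection step; the priority scores, layer names, and keep ratio are placeholders, and how the priority term itself is obtained during training is not shown here:

```python
def select_layers_to_erase(priorities, keep_ratio=0.7):
    """Rank layers by a priority score and mark the least important for removal.

    `priorities` maps layer names to importance scores (placeholder values;
    in the paper the priority term is obtained during training).
    """
    ranked = sorted(priorities, key=priorities.get, reverse=True)
    n_keep = int(len(ranked) * keep_ratio)
    return ranked[n_keep:]                 # layers to erase after training

# Hypothetical usage with made-up scores:
layers_to_drop = select_layers_to_erase({"block1": 0.9, "block2": 0.2, "block3": 0.05})
```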
no code implementations • 31 May 2016 • Yasutoshi Ida, Yasuhiro Fujiwara, Sotetsu Iwamura
Adaptive learning rate algorithms such as RMSProp are widely used for training deep neural networks.
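For reference, a minimal RMSProp-style update (the standard formulation, not the variant analyzed in this work) scales each step by a running average of squared gradients:

```python
import numpy as np

def rmsprop_step(w, grad, state, lr=1e-3, decay=0.9, eps=1e-8):
    """One RMSProp update: divide the gradient by the root of its running second moment."""
    state = decay * state + (1 - decay) * grad ** 2   # exponential moving average of grad^2
    w = w - lr * grad / (np.sqrt(state) + eps)
    return w, state
```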