no code implementations • 6 Feb 2024 • Akira Ito, Masanori Yamada, Atsutoshi Kumagai
This finding shows that permutations found by WM mainly align the directions of singular vectors associated with large singular values across models.
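The linear-algebra fact underlying this alignment can be sketched with a synthetic permuted layer (this toy permutation is illustrative, not the paper's WM procedure): a row/column-permuted weight matrix has identical singular values, and its singular vectors are the permuted singular vectors of the original, so undoing the permutation aligns the singular directions.

```python
import numpy as np

rng = np.random.default_rng(0)
W_a = rng.normal(size=(6, 6))

# Model B's layer is a row/column permutation of model A's layer.
P_out = np.eye(6)[rng.permutation(6)]
P_in = np.eye(6)[rng.permutation(6)]
W_b = P_out @ W_a @ P_in.T

Ua, Sa, Vta = np.linalg.svd(W_a)
Ub, Sb, Vtb = np.linalg.svd(W_b)

# Singular values are unchanged; after undoing the output permutation,
# the leading left singular vectors coincide up to sign.
align = abs((P_out.T @ Ub[:, 0]) @ Ua[:, 0])
```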
no code implementations • 9 Jun 2023 • Tomoya Yamashita, Masanori Yamada, Takashi Shibata
A naive MU approach is to re-train the whole model with the training data from which the undesirable data has been removed.
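This naive re-training baseline can be sketched on a toy problem (the least-squares linear classifier below is a stand-in for the full model, and the forget/retain split is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(float)

# The last 10 training points are the "undesirable" data to be forgotten.
forget_idx = np.arange(90, 100)
retain_idx = np.setdiff1d(np.arange(100), forget_idx)

def train(X, y):
    # Least-squares linear classifier (with a bias column) as a
    # stand-in for re-training the whole model from scratch.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0.5).astype(float)

# Naive machine unlearning: discard the old model entirely and
# re-train using only the retained data.
w_unlearned = train(X[retain_idx], y[retain_idx])
acc = np.mean(predict(w_unlearned, X[retain_idx]) == y[retain_idx])
```

The cost of this baseline is a full re-training run per deletion request, which is what practical MU methods try to avoid.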
no code implementations • 9 Jun 2023 • Masanori Yamada, Tomoya Yamashita, Shin'ya Yamaguchi, Daiki Chijiwa
We also show that merged models require datasets for merging in order to achieve high accuracy.
no code implementations • 1 Nov 2022 • Tomokatsu Takahashi, Masanori Yamada, Yuuki Yamanaka, Tomoya Yamashita
In addition to the output of the teacher model, ARDIR uses the internal representation of the teacher model as a label for adversarial training.
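A minimal sketch of such a combined objective, assuming a distillation term on the teacher's logits plus a representation-matching term on the teacher's internal features (the random tensors and the weight `lambda_rep` are illustrative placeholders, not ARDIR's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical outputs on a batch of adversarial examples.
student_logits = rng.normal(size=(8, 10))
teacher_logits = rng.normal(size=(8, 10))
student_features = rng.normal(size=(8, 64))  # student's internal representation
teacher_features = rng.normal(size=(8, 64))  # teacher's internal representation

def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def softmax(z):
    return np.exp(log_softmax(z))

# Distillation term: match the teacher's output distribution.
kd_loss = -np.mean(np.sum(softmax(teacher_logits) * log_softmax(student_logits),
                          axis=-1))

# Representation term: use the teacher's internal representation as an
# additional label; lambda_rep is an assumed weighting.
lambda_rep = 0.5
rep_loss = np.mean((student_features - teacher_features) ** 2)

total_loss = kd_loss + lambda_rep * rep_loss
```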
no code implementations • 21 Jul 2022 • Sekitoshi Kanai, Shin'ya Yamaguchi, Masanori Yamada, Hiroshi Takahashi, Kentaro Ohno, Yasutoshi Ida
This paper proposes a new loss function for adversarial training.
no code implementations • 2 Mar 2021 • Sekitoshi Kanai, Masanori Yamada, Hiroshi Takahashi, Yuki Yamanaka, Yasutoshi Ida
We reveal that the constraint of adversarial attacks is one cause of the non-smoothness and that the smoothness depends on the type of constraint.
no code implementations • 5 Feb 2021 • Masanori Yamada, Sekitoshi Kanai, Tomoharu Iwata, Tomokatsu Takahashi, Yuki Yamanaka, Hiroshi Takahashi, Atsutoshi Kumagai
We theoretically and experimentally confirm that the weight loss landscape becomes sharper as the magnitude of the noise of adversarial training increases in the linear logistic regression model.
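For a linear model under an l-infinity perturbation of size eps, the worst-case adversarial logistic loss has a closed form (the margin shrinks by eps times the l1 norm of the weights), which allows a toy sharpness probe like the one below. The random-perturbation sharpness measure and the fixed weight vector are illustrative choices, not the paper's exact experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=200))

def adv_logistic_loss(w, X, y, eps):
    # Closed-form worst-case loss for a linear model under an l_inf
    # attack of size eps: the margin is reduced by eps * ||w||_1.
    margins = y * (X @ w) - eps * np.sum(np.abs(w))
    return np.mean(np.log1p(np.exp(-margins)))

def sharpness(w, X, y, eps, radius=0.1, n=50):
    # Sharpness: largest loss increase over random weight perturbations
    # of a fixed norm.
    base = adv_logistic_loss(w, X, y, eps)
    worst = 0.0
    for _ in range(n):
        d = rng.normal(size=w.shape)
        d *= radius / np.linalg.norm(d)
        worst = max(worst, adv_logistic_loss(w + d, X, y, eps) - base)
    return worst

w = np.full(5, 0.5)  # an arbitrary weight vector, for illustration
print(sharpness(w, X, y, eps=0.0), sharpness(w, X, y, eps=0.3))
```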
no code implementations • 6 Oct 2020 • Sekitoshi Kanai, Masanori Yamada, Shin'ya Yamaguchi, Hiroshi Takahashi, Yasutoshi Ida
We theoretically and empirically reveal that making logits small by adding a common activation function, e.g., the hyperbolic tangent, does not improve adversarial robustness, since the input vectors of the function (pre-logit vectors) can have large norms.
no code implementations • 19 Sep 2019 • Sekitoshi Kanai, Yasutoshi Ida, Yasuhiro Fujiwara, Masanori Yamada, Shuichi Adachi
Furthermore, we reveal that robust CNNs with Absum are more robust than standard regularization methods against transferred attacks, owing to their decreased common sensitivity, and against high-frequency noise.
no code implementations • ICLR 2019 • Masanori Yamada, Kim Heecheol, Kosuke Miyoshi, Hiroshi Yamakawa
Previous works succeed in disentangling static factors and dynamic factors by explicitly modeling the priors of latent variables to distinguish between static and dynamic factors.
no code implementations • 26 Mar 2019 • Yuki Yamanaka, Tomoharu Iwata, Hiroshi Takahashi, Masanori Yamada, Sekitoshi Kanai
Since our approach can reconstruct the normal data points accurately while failing to reconstruct the known and unknown anomalies, it can accurately discriminate both known and unknown anomalies from normal data points.
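The reconstruction-error criterion can be sketched with PCA reconstruction standing in for the autoencoder's encode/decode (the synthetic data and the one-dimensional subspace are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
# Normal data lies near a 1-D subspace; anomalies do not.
normal = np.outer(rng.normal(size=200), [1.0, 2.0]) \
    + 0.05 * rng.normal(size=(200, 2))
anomaly = 3.0 * rng.normal(size=(10, 2))

# PCA reconstruction as a stand-in for an autoencoder.
mean = normal.mean(axis=0)
U, S, Vt = np.linalg.svd(normal - mean, full_matrices=False)
v = Vt[0]  # principal direction learned from normal data only

def recon_error(x):
    centered = x - mean
    recon = np.outer(centered @ v, v)
    return np.linalg.norm(centered - recon, axis=1)

# Normal points reconstruct well; anomalies reconstruct poorly, so a
# threshold on the error separates the two.
print(recon_error(normal).mean(), recon_error(anomaly).mean())
```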
no code implementations • 22 Mar 2019 • Heecheol Kim, Masanori Yamada, Kosuke Miyoshi, Hiroshi Yamakawa
Macro actions, sequences of primitive actions, have been studied as a way to reduce the dimensionality of the action space along the time axis.
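A minimal sketch of the idea (the macro names and toy environment below are hypothetical): choosing one macro advances the environment several primitive steps at once, shortening the effective decision horizon.

```python
# A macro action is a fixed sequence of primitive actions.
MACROS = {
    "forward_x3": [0, 0, 0],
    "turn_left_then_forward": [1, 0],
}

def run_macro(step_fn, macro_name):
    """Execute every primitive action in the macro; return total reward."""
    total = 0.0
    for action in MACROS[macro_name]:
        total += step_fn(action)
    return total

# Toy environment: primitive action 0 yields reward 1, others yield 0.
total = run_macro(lambda a: 1.0 if a == 0 else 0.0, "forward_x3")
```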
1 code implementation • 22 Feb 2019 • Masanori Yamada, Heecheol Kim, Kosuke Miyoshi, Hiroshi Yamakawa
Previous models disentangle static and dynamic factors by explicitly modeling the priors of latent variables to distinguish between these factors.
1 code implementation • 14 Sep 2018 • Hiroshi Takahashi, Tomoharu Iwata, Yuki Yamanaka, Masanori Yamada, Satoshi Yagi
However, KL divergence with the aggregated posterior cannot be calculated in a closed form, which prevents us from using this optimal prior.
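Because the aggregated posterior is a mixture of the per-datapoint posteriors, the KL term is typically estimated by Monte Carlo instead; a one-dimensional sketch, assuming shared-variance Gaussian posteriors (the means, the variance, and the sample count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Aggregated posterior q(z) = (1/N) sum_i q(z|x_i): a mixture of the
# per-datapoint Gaussian posteriors, so KL(q(z|x) || q(z)) has no
# closed form and is estimated by Monte Carlo here.
mus = rng.normal(size=5)  # posterior means for 5 data points (1-D latent)
sig = 0.5                 # shared posterior std, an assumed value

def log_gauss(z, mu, s):
    return -0.5 * np.log(2 * np.pi * s ** 2) - (z - mu) ** 2 / (2 * s ** 2)

def log_aggregated(z):
    # log q(z) via the mixture of all per-datapoint posteriors
    comps = np.array([log_gauss(z, m, sig) for m in mus])
    return np.log(np.mean(np.exp(comps), axis=0))

# Monte Carlo estimate of KL(q(z|x_0) || q(z)) using samples from q(z|x_0).
z = mus[0] + sig * rng.normal(size=10000)
kl_mc = np.mean(log_gauss(z, mus[0], sig) - log_aggregated(z))
```

Since the mixture contains q(z|x_0) with weight 1/5, the estimate is bounded above by log 5 and is nonnegative up to Monte Carlo error.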