no code implementations • 9 Jun 2023 • Tomoya Yamashita, Masanori Yamada, Takashi Shibata
A naive machine unlearning (MU) approach is to retrain the whole model on the training data with the undesirable data removed.
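The naive retraining baseline described above can be illustrated with a minimal sketch. It is not the method of the listed paper: the nearest-centroid "model" and the names `train_centroids`, `naive_unlearn`, and `forget_indices` are hypothetical stand-ins chosen only to keep the example self-contained.

```python
from collections import defaultdict


def train_centroids(samples):
    """Fit a toy nearest-centroid classifier: one mean vector per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def naive_unlearn(samples, forget_indices):
    """Drop the samples to forget, then retrain from scratch on the rest."""
    forget = set(forget_indices)
    retained = [s for i, s in enumerate(samples) if i not in forget]
    return train_centroids(retained)


# Toy data: the last sample plays the role of the undesirable data (assumption).
data = [
    ([0.0, 0.0], "a"),
    ([2.0, 2.0], "a"),
    ([10.0, 10.0], "b"),
    ([100.0, 100.0], "a"),
]
original_model = train_centroids(data)
unlearned_model = naive_unlearn(data, forget_indices=[3])
```

The cost of this baseline is a full training run per removal request, which is why it is usually treated as a reference point rather than a practical procedure.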
no code implementations • 9 Jun 2023 • Masanori Yamada, Tomoya Yamashita, Shin'ya Yamaguchi, Daiki Chijiwa
We also show that merged models require datasets for merging in order to achieve high accuracy.
no code implementations • 1 Nov 2022 • Tomokatsu Takahashi, Masanori Yamada, Yuuki Yamanaka, Tomoya Yamashita
In addition to the output of the teacher model, ARDIR uses the internal representation of the teacher model as a label for adversarial training.
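The idea of using both the teacher's output and its internal representation as training targets can be sketched as a two-term loss. This is an illustrative assumption, not ARDIR's actual objective: the KL and mean-squared-error terms, the `feature_weight` parameter, and every function name below are hypothetical, and generating the adversarial input is left out.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, shifted by the max for stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def mean_squared_error(a, b):
    """Average squared difference between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student_logits, student_features,
                      teacher_logits, teacher_features,
                      feature_weight=1.0):
    """Match the teacher's output and its internal representation.

    The student values are assumed to come from an adversarial input and
    the teacher values from the corresponding clean input (assumption).
    """
    output_term = kl_divergence(softmax(teacher_logits), softmax(student_logits))
    feature_term = mean_squared_error(student_features, teacher_features)
    return output_term + feature_weight * feature_term


# Hypothetical values for a 3-class problem with 2-dimensional features.
loss = distillation_loss(
    student_logits=[1.0, 0.5, -0.5],
    student_features=[0.2, 0.9],
    teacher_logits=[2.0, 0.0, -1.0],
    teacher_features=[0.0, 1.0],
)
```

In this sketch, `feature_weight` controls how strongly the internal-representation target counts relative to the output target; setting it to zero leaves output-only distillation.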