no code implementations • 16 Oct 2023 • Makoto Yamada, Yuki Takezawa, Guillaume Houry, Kira Michaela Dusterwald, Deborah Sulem, Han Zhao, Yao-Hung Hubert Tsai
We find that the model performance depends on the combination of the tree-Wasserstein distance (TWD) and the probability model, and that the Jeffrey divergence regularization helps in model training.
1 code implementation • 13 Oct 2023 • Ryoma Sato, Yuki Takezawa, Han Bao, Kenta Niwa, Makoto Yamada
LLMs can generate texts that cannot be distinguished from human-written texts.
no code implementations • 2 Oct 2023 • Yuki Takezawa, Ryoma Sato, Han Bao, Kenta Niwa, Makoto Yamada
Although existing watermarking methods have successfully detected texts generated by LLMs, they significantly degrade the quality of the generated texts.
no code implementations • 30 Sep 2022 • Yuki Takezawa, Han Bao, Kenta Niwa, Ryoma Sato, Makoto Yamada
In this study, we propose Momentum Tracking, which is a method with momentum whose convergence rate is proven to be independent of data heterogeneity.
no code implementations • 24 Jun 2022 • Makoto Yamada, Yuki Takezawa, Ryoma Sato, Han Bao, Zornitsa Kozareva, Sujith Ravi
In this paper, we aim to approximate the 1-Wasserstein distance by the tree-Wasserstein distance (TWD), where the TWD is a 1-Wasserstein distance with a tree-based embedding and can be computed in time linear in the number of nodes of the tree.
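The linear-time claim can be illustrated with a minimal sketch. It is not taken from the paper: it uses the standard closed form of the Wasserstein distance on a tree, which sums over every non-root node v the weight of the edge to its parent times |mu(subtree of v) - nu(subtree of v)|. The function name `tree_wasserstein` and the array-based tree encoding (`parent`, `weight`) are hypothetical choices made for this illustration.

```python
def tree_wasserstein(parent, weight, mu, nu):
    """Tree-Wasserstein distance between two distributions on tree nodes.

    Illustrative encoding (an assumption, not from the paper):
      parent[v] -- index of v's parent, or -1 for the root
      weight[v] -- length of the edge from v to parent[v] (ignored for the root)
      mu, nu    -- node masses, each summing to the same total
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    root = -1
    for v, p in enumerate(parent):
        if p < 0:
            root = v
        else:
            children[p].append(v)

    # Pre-order traversal: every node appears after its parent.
    order = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    # diff[v] accumulates mu(subtree of v) - nu(subtree of v).
    diff = [mu[v] - nu[v] for v in range(n)]
    total = 0.0
    # Reversed pre-order visits children before their parents.
    for v in reversed(order):
        p = parent[v]
        if p >= 0:
            total += weight[v] * abs(diff[v])
            diff[p] += diff[v]
    return total


# Example tree: 0 is the root, 1 and 2 are children of 0, 3 is a child of 1.
parent = [-1, 0, 0, 1]
weight = [0.0, 1.0, 2.0, 0.5]
d = tree_wasserstein(parent, weight, [0, 0, 0, 1], [0, 0, 1, 0])
```

Each node is visited a constant number of times, so the running time is linear in the number of nodes. In the example, all mass moves from node 3 to node 2 along the path 3, 1, 0, 2, so `d` equals the path length 0.5 + 1.0 + 2.0.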
no code implementations • 23 May 2022 • Yuki Takezawa, Kenta Niwa, Makoto Yamada
However, the convergence rate of the ECL is provided only when the objective function is convex, and has not been shown in a standard machine learning setting where the objective function is non-convex.
no code implementations • 8 May 2022 • Yuki Takezawa, Kenta Niwa, Makoto Yamada
Moreover, we demonstrate that the C-ECL is more robust to heterogeneous data than the Gossip-based algorithms.
no code implementations • 13 Oct 2021 • Kazutoshi Shinoda, Yuki Takezawa, Masahiro Suzuki, Yusuke Iwasawa, Yutaka Matsuo
An interactive instruction following task has been proposed as a benchmark for learning to map natural language instructions and first-person vision into sequences of actions to interact with objects in 3D environments.
1 code implementation • 8 Sep 2021 • Yuki Takezawa, Ryoma Sato, Zornitsa Kozareva, Sujith Ravi, Makoto Yamada
By contrast, the Wasserstein distance on a tree, called the tree-Wasserstein distance, can be computed in linear time and allows for the fast comparison of a large number of distributions.
no code implementations • 27 Jan 2021 • Yuki Takezawa, Ryoma Sato, Makoto Yamada
Specifically, we rewrite the Wasserstein distance on the tree metric by the parent-child relationships of a tree and formulate it as a continuous optimization problem using a contrastive loss.