no code implementations • 22 Dec 2017 • Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, Masaaki Nagata
The encoder-decoder model is widely used in natural language generation tasks.
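For readers less familiar with the architecture, a minimal encoder-decoder sketch in PyTorch is given below; the module layout and names are purely illustrative and are not taken from the paper.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal GRU-based encoder-decoder for sequence generation (illustrative only)."""
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src_ids, tgt_ids):
        _, h = self.encoder(self.embed(src_ids))            # encode source into a state
        dec_out, _ = self.decoder(self.embed(tgt_ids), h)   # decode with teacher forcing
        return self.out(dec_out)                            # per-token vocabulary logits

model = Seq2Seq(vocab_size=10000)
logits = model(torch.randint(0, 10000, (2, 7)), torch.randint(0, 10000, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 10000])
```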
no code implementations • 13 Oct 2018 • Shun Kiyono, Jun Suzuki, Kentaro Inui
We also demonstrate that our method exhibits a "more data, better performance" property, with promising scalability to the amount of unlabeled data.
no code implementations • WS 2018 • Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, Masaaki Nagata
Developing a method for understanding the inner workings of black-box neural methods is an important research endeavor.
1 code implementation • ACL 2019 • Motoki Sato, Jun Suzuki, Shun Kiyono
A regularization technique based on adversarial perturbation, which was initially developed in the field of image processing, has been successfully applied to text classification tasks and has yielded attractive improvements.
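A minimal sketch of the core idea, perturbing word embeddings in the gradient direction that most increases the loss, is shown below; the function and its interface are hypothetical and do not reproduce the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def adversarial_loss(model, embeds, labels, epsilon=1.0):
    """Adversarial-perturbation regularizer on word embeddings (illustrative sketch)."""
    embeds = embeds.detach().requires_grad_(True)
    loss = F.cross_entropy(model(embeds), labels)   # assumes the model consumes embeddings
    grad, = torch.autograd.grad(loss, embeds)
    # Move the embeddings within an epsilon ball in the direction that increases the loss.
    perturb = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)
    return F.cross_entropy(model(embeds + perturb.detach()), labels)
```

In practice this adversarial term is added to the ordinary loss on the clean input as a regularizer.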
1 code implementation • IJCNLP 2019 • Shun Kiyono, Jun Suzuki, Masato Mita, Tomoya Mizumoto, Kentaro Inui
The incorporation of pseudo data in the training of grammatical error correction models has been one of the main factors in improving the performance of such models.
Ranked #10 on Grammatical Error Correction on CoNLL-2014 Shared Task
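As a rough illustration of how such pseudo data can be produced, one can corrupt clean sentences with synthetic errors and train the model to map the corrupted version back to the original; the noising scheme below is a generic sketch, not the specific recipe evaluated in the paper.

```python
import random

def make_pseudo_pair(clean_tokens, p_drop=0.1, p_insert=0.1):
    """Corrupt a grammatical sentence to create a pseudo (source, target) pair for GEC."""
    noisy = []
    for tok in clean_tokens:
        r = random.random()
        if r < p_drop:
            continue                       # simulate a missing word
        noisy.append(tok)
        if r > 1.0 - p_insert:
            noisy.append(tok)              # simulate a repeated word
    return noisy, clean_tokens             # train the model to map noisy -> clean

print(make_pseudo_pair("he went to the store yesterday".split()))
```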
no code implementations • 8 Oct 2019 • Paul Reisert, Benjamin Heinzerling, Naoya Inoue, Shun Kiyono, Kentaro Inui
Counter-arguments (CAs), one form of constructive feedback, have been proven to be useful for developing critical thinking skills.
1 code implementation • ACL 2020 • Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Enrique Yalta Soplin, Tomoki Hayashi, Shinji Watanabe
We present ESPnet-ST, which is designed for the quick development of speech-to-speech translation systems in a single framework.
Automatic Speech Recognition (ASR) +4
1 code implementation • ACL 2020 • Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui
The answer to this question is not as straightforward as one might expect, because the common methods previously used for incorporating an MLM into an EncDec model have potential drawbacks when applied to GEC.
Ranked #2 on Grammatical Error Correction on JFLEG
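One common way to incorporate a pre-trained MLM into an EncDec model is to feed the MLM's contextual representations to the encoder as additional features. The sketch below, using the Hugging Face transformers library, illustrates that general pattern only; the class name and wiring are assumptions, not the configuration studied in the paper.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class MLMAugmentedEncoder(nn.Module):
    """Concatenate frozen MLM features with learned embeddings before a Transformer encoder."""
    def __init__(self, vocab_size, d_model=512, mlm_name="bert-base-cased"):
        super().__init__()
        self.mlm = AutoModel.from_pretrained(mlm_name)
        self.mlm.requires_grad_(False)                      # keep the MLM frozen
        self.embed = nn.Embedding(vocab_size, d_model)
        self.proj = nn.Linear(d_model + self.mlm.config.hidden_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)

    def forward(self, input_ids, attention_mask):
        # input_ids are assumed to come from the MLM's own tokenizer
        mlm_feats = self.mlm(input_ids=input_ids,
                             attention_mask=attention_mask).last_hidden_state
        x = torch.cat([self.embed(input_ids), mlm_feats], dim=-1)
        return self.encoder(self.proj(x))
```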
no code implementations • Findings of the Association for Computational Linguistics 2020 • Masato Mita, Shun Kiyono, Masahiro Kaneko, Jun Suzuki, Kentaro Inui
Existing approaches for grammatical error correction (GEC) largely rely on supervised learning with manually created GEC datasets.
no code implementations • COLING 2020 • Ryuto Konno, Yuichiroh Matsubayashi, Shun Kiyono, Hiroki Ouchi, Ryo Takahashi, Kentaro Inui
This study addresses two underexplored issues in CDA: how to reduce the computational cost of data augmentation and how to ensure the quality of the generated data.
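CDA here refers to contextual data augmentation, which is typically realized by masking a token and letting a pre-trained MLM propose replacements. A minimal sketch with the Hugging Face fill-mask pipeline follows; the model choice and selection strategy are illustrative, not those of the paper.

```python
from transformers import pipeline

# MLM-based contextual data augmentation: mask one token and sample substitutes.
fill = pipeline("fill-mask", model="bert-base-uncased")

def augment(sentence, target_word, top_k=3):
    """Return top-k augmented sentences with target_word replaced by MLM proposals."""
    masked = sentence.replace(target_word, fill.tokenizer.mask_token, 1)
    candidates = fill(masked, top_k=top_k)
    return [c["sequence"] for c in candidates]

print(augment("the movie was really good", "good"))
```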
1 code implementation • NAACL 2021 • Sho Takase, Shun Kiyono
We often use perturbations to regularize neural models.
Ranked #1 on Text Summarization on DUC 2004 Task 1 (using extra training data)
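A cheap input perturbation of the kind examined in this line of work is word dropout, i.e., randomly replacing input tokens before they reach the model; the sketch below is illustrative and does not reproduce the paper's exact settings.

```python
import torch

def word_dropout(input_ids, unk_id, p=0.1, pad_id=0):
    """Randomly replace non-padding tokens with <unk> as a simple input perturbation."""
    mask = (torch.rand_like(input_ids, dtype=torch.float) < p) & (input_ids != pad_id)
    return torch.where(mask, torch.full_like(input_ids, unk_id), input_ids)

ids = torch.tensor([[5, 17, 42, 9, 0, 0]])
print(word_dropout(ids, unk_id=3))   # some tokens become 3, padding untouched
```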
2 code implementations • 13 Apr 2021 • Sho Takase, Shun Kiyono
We propose a parameter sharing method for Transformers (Vaswani et al., 2017).
Ranked #1 on Machine Translation on WMT2014 English-German
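One way to realize such parameter sharing is to reuse a small pool of layers across many layer positions; the cycle-style assignment below is a simplified sketch, and the class name and hyperparameters are assumptions for illustration.

```python
import torch.nn as nn

class SharedLayerEncoder(nn.Module):
    """Reuse a small pool of Transformer layers across many positions (cycle assignment)."""
    def __init__(self, d_model=512, n_unique=3, n_positions=6):
        super().__init__()
        self.pool = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
            for _ in range(n_unique)
        )
        # e.g. 3 unique layers applied at 6 positions: 0, 1, 2, 0, 1, 2
        self.assignment = [i % n_unique for i in range(n_positions)]

    def forward(self, x):
        for idx in self.assignment:
            x = self.pool[idx](x)   # parameters are shared between repeated positions
        return x
```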
1 code implementation • EMNLP 2021 • Ryuto Konno, Shun Kiyono, Yuichiroh Matsubayashi, Hiroki Ouchi, Kentaro Inui
Masked language models (MLMs) have contributed to drastic performance improvements with regard to zero anaphora resolution (ZAR).
1 code implementation • 13 Sep 2021 • Shun Kiyono, Sosuke Kobayashi, Jun Suzuki, Kentaro Inui
Position representation is crucial for building position-aware representations in Transformers.
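A simplified sketch of one way to make absolute position embeddings less dependent on exact indices, by shifting every position in a sequence with a shared random offset during training, is given below; it illustrates the general idea rather than the paper's exact method.

```python
import torch
import torch.nn as nn

class ShiftedPositionEmbedding(nn.Module):
    """Absolute position embedding with a random training-time offset (simplified sketch)."""
    def __init__(self, max_len=4096, d_model=512, max_shift=100):
        super().__init__()
        self.pos = nn.Embedding(max_len, d_model)
        self.max_shift = max_shift

    def forward(self, token_embeds):
        batch, seq_len, _ = token_embeds.shape
        positions = torch.arange(seq_len, device=token_embeds.device).expand(batch, seq_len)
        if self.training:
            # Shift all positions in a sequence by the same random offset,
            # so the model cannot rely on absolute indices.
            shift = torch.randint(0, self.max_shift, (batch, 1), device=token_embeds.device)
            positions = positions + shift
        return token_embeds + self.pos(positions)
```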
no code implementations • BigScience (ACL) 2022 • Sosuke Kobayashi, Shun Kiyono, Jun Suzuki, Kentaro Inui
Ensembling is a popular method used to improve performance as a last resort.
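Vanilla ensembling simply averages the predictions of several independently trained models; a minimal sketch is below (the function name and interface are hypothetical).

```python
import torch

def ensemble_predict(models, inputs):
    """Average the predicted class probabilities of several models (vanilla ensembling)."""
    with torch.no_grad():
        probs = torch.stack([model(inputs).softmax(dim=-1) for model in models])
    return probs.mean(dim=0).argmax(dim=-1)
```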
1 code implementation • 1 Jun 2022 • Sho Takase, Shun Kiyono, Sosuke Kobayashi, Jun Suzuki
Recent Transformers tend to be Pre-LN because, in Post-LN with deep Transformers (e.g., those with ten or more layers), the training is often unstable, resulting in useless models.
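The Post-LN / Pre-LN distinction concerns where layer normalization sits relative to the residual connection. The minimal sketch below shows both orderings for a single sublayer; it illustrates the standard variants only, not the modification proposed in the paper.

```python
import torch.nn as nn

class PostLNBlock(nn.Module):
    """Post-LN: normalize after adding the residual (original Transformer ordering)."""
    def __init__(self, d_model, sublayer):
        super().__init__()
        self.sublayer, self.norm = sublayer, nn.LayerNorm(d_model)

    def forward(self, x):
        return self.norm(x + self.sublayer(x))

class PreLNBlock(nn.Module):
    """Pre-LN: normalize before the sublayer; tends to train more stably when deep."""
    def __init__(self, d_model, sublayer):
        super().__init__()
        self.sublayer, self.norm = sublayer, nn.LayerNorm(d_model)

    def forward(self, x):
        return x + self.sublayer(self.norm(x))
```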
no code implementations • 28 Dec 2023 • Sho Takase, Shun Kiyono, Sosuke Kobayashi, Jun Suzuki
Loss spikes often occur during pre-training of large language models.
no code implementations • WMT (EMNLP) 2020 • Shun Kiyono, Takumi Ito, Ryuto Konno, Makoto Morishita, Jun Suzuki
In this paper, we describe the submission of Tohoku-AIP-NTT to the WMT’20 news translation task.
no code implementations • EMNLP (IWSLT) 2019 • Hirofumi Inaguma, Shun Kiyono, Nelson Enrique Yalta Soplin, Jun Suzuki, Kevin Duh, Shinji Watanabe
This year, we mainly build our systems based on Transformer architectures for all tasks and focus on end-to-end speech translation (E2E-ST).
no code implementations • EMNLP 2021 • Shun Kiyono, Sosuke Kobayashi, Jun Suzuki, Kentaro Inui
Position representation is crucial for building position-aware representations in Transformers.