no code implementations • AACL (WAT) 2020 • Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Shohei Higashiyama, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Shantipriya Parida, Ondřej Bojar, Sadao Kurohashi
This paper presents the results of the shared tasks from the 7th workshop on Asian translation (WAT2020).
no code implementations • WMT (EMNLP) 2021 • Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri
This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • AACL (WAT) 2020 • Matīss Rikters, Toshiaki Nakazawa, Ryokan Ri
The paper describes the development process of The University of Tokyo’s NMT systems that were submitted to the WAT 2020 Document-level Business Scene Dialogue Translation sub-task.
no code implementations • WAT 2022 • Toshiaki Nakazawa, Hideya Mino, Isao Goto, Raj Dabre, Shohei Higashiyama, Shantipriya Parida, Anoop Kunchukuttan, Makoto Morishita, Ondřej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Sadao Kurohashi
This paper presents the results of the shared tasks from the 9th workshop on Asian translation (WAT2022).
no code implementations • ACL (WAT) 2021 • Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Shohei Higashiyama, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Shantipriya Parida, Ondřej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Sadao Kurohashi
This paper presents the results of the shared tasks from the 8th workshop on Asian translation (WAT2021).
no code implementations • 7 Sep 2021 • Matīss Rikters, Toshiaki Nakazawa
One of the most popular methods for context-aware machine translation (MT) is to use separate encoders for the source sentence and context as multiple sources for one target sentence.
1 code implementation • MTSummit 2021 • Ryokan Ri, Toshiaki Nakazawa, Yoshimasa Tsuruoka
Placeholder translation systems enable the users to specify how a specific phrase is translated in the output sentence.
no code implementations • ACL (WAT) 2021 • Ryokan Ri, Toshiaki Nakazawa, Yoshimasa Tsuruoka
For Japanese-to-English translation, zero pronouns in Japanese pose a challenge, since the model needs to infer and produce the corresponding pronoun in the target side of the English sentence.
1 code implementation • WMT (EMNLP) 2020 • Matīss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa
Sentence-level (SL) machine translation (MT) has reached acceptable quality for many high-resourced languages, but not document-level (DL) MT, which is difficult to 1) train with a small amount of DL data; and 2) evaluate, as the main methods and data sets focus on SL evaluation.
no code implementations • EMNLP 2020 • Loïc Barrault, Magdalena Biesialska, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Eric Joanis, Tom Kocmi, Philipp Koehn, Chi-kiu Lo, Nikola Ljubešić, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Santanu Pal, Matt Post, Marcos Zampieri
In the news task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting mainly of news stories.
1 code implementation • WS 2019 • Matīss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa
While the progress of machine translation of written text has come far in the past several years thanks to the increasing availability of parallel corpora and corpora-based training technologies, automatic translation of spoken text and dialogues remains challenging even for modern systems.
Ranked #1 on Machine Translation on Business Scene Dialogue JA-EN (using extra training data)
no code implementations • LREC 2020 • Sho Shimazu, Sho Takase, Toshiaki Nakazawa, Naoaki Okazaki
Therefore, we present a hand-crafted dataset to evaluate whether translation models can resolve the zero pronoun problems in Japanese to English translations.
no code implementations • LREC 2020 • Nobushige Doi, Yusuke Oda, Toshiaki Nakazawa
In this paper, we describe the details of the Timely Disclosure Documents Corpus (TDDC).
no code implementations • WS 2019 • Toshiaki Nakazawa, Nobushige Doi, Shohei Higashiyama, Chenchen Ding, Raj Dabre, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Yusuke Oda, Shantipriya Parida, Ondřej Bojar, Sadao Kurohashi
This paper presents the results of the shared tasks from the 6th workshop on Asian translation (WAT2019) including Ja↔En, Ja↔Zh scientific paper translation subtasks, Ja↔En, Ja↔Ko, Ja↔En patent translation subtasks, Hi↔En, My↔En, Km↔En, Ta↔En mixed domain subtasks and Ru↔Ja news commentary translation task.
no code implementations • WS 2017 • Toshiaki Nakazawa, Shohei Higashiyama, Chenchen Ding, Hideya Mino, Isao Goto, Hideto Kazawa, Yusuke Oda, Graham Neubig, Sadao Kurohashi
For WAT2017, 12 institutions participated in the shared tasks.
1 code implementation • WS 2017 • Fabien Cromieres, Raj Dabre, Toshiaki Nakazawa, Sadao Kurohashi
We describe here our approaches and results on the WAT 2017 shared translation tasks.
no code implementations • IJCNLP 2017 • Fabien Cromieres, Toshiaki Nakazawa, Raj Dabre
Machine Translation (MT) is a sub-field of NLP which has experienced a number of paradigm shifts since its inception.
1 code implementation • WS 2016 • Fabien Cromieres, Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi
We report very good translation results, especially when using neural MT for Chinese-to-Japanese translation.
no code implementations • WS 2016 • Chenhui Chu, Toshiaki Nakazawa, Daisuke Kawahara, Sadao Kurohashi
Treebanks are crucial for natural language processing (NLP).
no code implementations • WS 2016 • Toshiaki Nakazawa, Chenchen Ding, Hideya Mino, Isao Goto, Graham Neubig, Sadao Kurohashi
For WAT2016, 15 institutions participated in the shared tasks.
no code implementations • LREC 2016 • Toshiaki Nakazawa, Manabu Yaguchi, Kiyotaka Uchimoto, Masao Utiyama, Eiichiro Sumita, Sadao Kurohashi, Hitoshi Isahara
In this paper, we describe the details of the ASPEC (Asian Scientific Paper Excerpt Corpus), which is the first large-scale parallel corpus in the scientific paper domain.
no code implementations • LREC 2016 • Antoine Bourlon, Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi
Sentence alignment is a task that consists in aligning the parallel sentences in a translated article pair.
no code implementations • LREC 2014 • John Richardson, Toshiaki Nakazawa, Sadao Kurohashi
In this paper we present a bilingual transliteration lexicon of 170K Japanese-English technical terms in the scientific domain.
no code implementations • LREC 2014 • Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi
Using the system, we construct a Chinese-Japanese parallel corpus with more than 126k highly accurate parallel sentences from Wikipedia.
no code implementations • LREC 2012 • Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi
Chinese characters are used in both Japanese and Chinese, where they are called Kanji and Hanzi respectively.