no code implementations • LNLS (ACL) 2022 • Ryokan Ri, Yufang Hou, Radu Marinescu, Akihiro Kishimoto
When mapping a natural language instruction to a sequence of actions, it is often useful to identify sub-tasks in the instruction.
no code implementations • AACL (WAT) 2020 • Matīss Rikters, Toshiaki Nakazawa, Ryokan Ri
The paper describes the development process of The University of Tokyo’s NMT systems that were submitted to the WAT 2020 Document-level Business Scene Dialogue Translation sub-task.
1 code implementation • NAACL 2022 • Sosuke Nishikawa, Ryokan Ri, Ikuya Yamada, Yoshimasa Tsuruoka, Isao Echizen
We present EASE, a novel method for learning sentence embeddings via contrastive learning between sentences and their related entities.
no code implementations • ACL 2022 • Ryokan Ri, Yoshimasa Tsuruoka
We investigate what kind of structural knowledge learned in neural network encoders is transferable to processing natural language.
1 code implementation • ACL 2022 • Ryokan Ri, Ikuya Yamada, Yoshimasa Tsuruoka
We train a multilingual language model with 24 languages with entity representations and show the model consistently outperforms word-based pretrained models in various cross-lingual transfer tasks.
Ranked #1 on Cross-Lingual Question Answering on XQuAD (Average F1 metric, using extra training data)
no code implementations • ACL (WAT) 2021 • Ryokan Ri, Toshiaki Nakazawa, Yoshimasa Tsuruoka
For Japanese-to-English translation, zero pronouns in Japanese pose a challenge, since the model needs to infer and produce the corresponding pronoun in the target side of the English sentence.
1 code implementation • MTSummit 2021 • Ryokan Ri, Toshiaki Nakazawa, Yoshimasa Tsuruoka
Placeholder translation systems enable the users to specify how a specific phrase is translated in the output sentence.
1 code implementation • WMT (EMNLP) 2020 • Matīss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa
Sentence-level (SL) machine translation (MT) has reached acceptable quality for many high-resourced languages, but document-level (DL) MT has not: it is difficult to 1) train with only a small amount of DL data; and 2) evaluate, as the main methods and data sets focus on SL evaluation.
1 code implementation • WS 2019 • Matīss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa
While the progress of machine translation of written text has come far in the past several years thanks to the increasing availability of parallel corpora and corpora-based training technologies, automatic translation of spoken text and dialogues remains challenging even for modern systems.
Ranked #1 on Machine Translation on Business Scene Dialogue JA-EN (using extra training data)
no code implementations • ACL 2021 • Sosuke Nishikawa, Ryokan Ri, Yoshimasa Tsuruoka
Unsupervised cross-lingual word embedding (CLWE) methods learn a linear transformation matrix that maps two monolingual embedding spaces that are separately trained with monolingual corpora.
no code implementations • ACL 2020 • Ryokan Ri, Yoshimasa Tsuruoka
Existing approaches to mapping-based cross-lingual word embeddings are based on the assumption that the source and target embedding spaces are structurally similar.