Search Results for author: Ryokan Ri

Found 13 papers, 6 papers with code

Finding Sub-task Structure with Natural Language Instruction

no code implementations LNLS (ACL) 2022 Ryokan Ri, Yufang Hou, Radu Marinescu, Akihiro Kishimoto

When mapping a natural language instruction to a sequence of actions, it is often useful to identify sub-tasks in the instruction.

Segmentation

The University of Tokyo’s Submissions to the WAT 2020 Shared Task

no code implementations AACL (WAT) 2020 Matīss Rikters, Toshiaki Nakazawa, Ryokan Ri

The paper describes the development process of The University of Tokyo's NMT systems that were submitted to the WAT 2020 Document-level Business Scene Dialogue Translation sub-task.

NMT Translation

LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation

1 code implementation 18 Feb 2024 Ikuya Yamada, Ryokan Ri

In this study, we introduce LEIA, a language adaptation tuning method that utilizes Wikipedia entity names aligned across languages (a sketch of the augmentation idea follows below).

Cross-Lingual Transfer Data Augmentation +3
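
A minimal sketch of the entity-based augmentation idea, assuming entity mentions have already been aligned across languages (e.g. via Wikipedia inter-language links); the alignment table, tokenization, and replacement rate are illustrative assumptions, not the paper's exact recipe:

```python
# Hypothetical entity-based data augmentation in the spirit of LEIA:
# randomly swap target-language entity mentions for their English names.
import random

# Assumed alignment table (illustrative; a real one would come from
# Wikipedia inter-language links).
ENTITY_ALIGNMENTS = {
    "東京": "Tokyo",
    "富士山": "Mount Fuji",
}

def augment(tokens: list[str], rate: float = 0.5) -> list[str]:
    """Replace aligned entity mentions with their English names at random."""
    return [
        ENTITY_ALIGNMENTS[tok] if tok in ENTITY_ALIGNMENTS and random.random() < rate else tok
        for tok in tokens
    ]

print(augment(["私", "は", "東京", "に", "住む"]))  # e.g. ['私', 'は', 'Tokyo', 'に', '住む']
```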

Emergent Communication with Attention

no code implementations 18 May 2023 Ryokan Ri, Ryo Ueda, Jason Naradowsky

To develop computational agents that better communicate using their own emergent language, we endow the agents with an ability to focus their attention on particular concepts in the environment.

EASE: Entity-Aware Contrastive Learning of Sentence Embedding

1 code implementation NAACL 2022 Sosuke Nishikawa, Ryokan Ri, Ikuya Yamada, Yoshimasa Tsuruoka, Isao Echizen

We present EASE, a novel method for learning sentence embeddings via contrastive learning between sentences and their related entities (see the loss sketch below).

Clustering Contrastive Learning +6
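
A minimal sketch of a sentence-entity contrastive objective, assuming precomputed sentence and entity embeddings; the dimensions, temperature, and random inputs are placeholders, not the paper's actual encoder or hyperparameters:

```python
# InfoNCE-style sentence-entity contrastive loss (an assumption about the
# objective's general shape, not EASE's exact implementation). Each sentence
# should be closest to its related entity; other in-batch entities act as
# negatives. Random embeddings stand in for encoder outputs.
import torch
import torch.nn.functional as F

def sentence_entity_contrastive_loss(sent_emb, ent_emb, temperature=0.05):
    sent = F.normalize(sent_emb, dim=-1)
    ent = F.normalize(ent_emb, dim=-1)
    logits = sent @ ent.T / temperature  # (batch, batch) cosine similarities
    labels = torch.arange(sent.size(0))  # positive pairs lie on the diagonal
    return F.cross_entropy(logits, labels)

batch, dim = 8, 768
loss = sentence_entity_contrastive_loss(torch.randn(batch, dim), torch.randn(batch, dim))
print(loss.item())
```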

Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models

no code implementations ACL 2022 Ryokan Ri, Yoshimasa Tsuruoka

We investigate what kind of structural knowledge learned in neural network encoders is transferable to processing natural language.

Position

mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models

2 code implementations ACL 2022 Ryokan Ri, Ikuya Yamada, Yoshimasa Tsuruoka

We train a multilingual language model in 24 languages with entity representations and show the model consistently outperforms word-based pretrained models in various cross-lingual transfer tasks.

Ranked #1 on Cross-Lingual Question Answering on XQuAD (Average F1 metric, using extra training data)

Cross-Lingual Question Answering Cross-Lingual Transfer +1

Modeling Target-side Inflection in Placeholder Translation

1 code implementation MTSummit 2021 Ryokan Ri, Toshiaki Nakazawa, Yoshimasa Tsuruoka

Placeholder translation systems enable users to specify how a specific phrase is translated in the output sentence (the basic workflow is sketched below).

LEMMA Sentence +1
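
A minimal sketch of the basic placeholder workflow the paper builds on; the toy `translate` function is a stand-in for a real NMT system, and the paper's actual contribution, inflecting the restored phrase to fit the target context, is not modeled here:

```python
# Basic placeholder translation pipeline: mask a user-specified phrase,
# translate the masked sentence, then restore the user's translation.

PLACEHOLDER = "<ph>"

def translate(text: str) -> str:
    # Stand-in MT system; a real NMT model would translate the masked input.
    return {"I live in <ph> .": "<ph> に住んでいます 。"}.get(text, text)

def placeholder_translate(source: str, phrase: str, user_translation: str) -> str:
    masked = source.replace(phrase, PLACEHOLDER)
    output = translate(masked)
    return output.replace(PLACEHOLDER, user_translation)

print(placeholder_translate("I live in Osaka .", "Osaka", "大阪"))  # 大阪 に住んでいます 。
```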

Zero-pronoun Data Augmentation for Japanese-to-English Translation

no code implementations ACL (WAT) 2021 Ryokan Ri, Toshiaki Nakazawa, Yoshimasa Tsuruoka

For Japanese-to-English translation, zero pronouns in Japanese pose a challenge, since the model needs to infer and produce the corresponding pronoun on the English target side (a data augmentation sketch follows below).

Data Augmentation Machine Translation +2
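
A minimal sketch of the augmentation idea, under the assumption that new training pairs can be created by deleting overt Japanese pronouns while keeping the English reference; the pronoun list and deletion heuristic are illustrative, not the paper's exact method:

```python
# Hypothetical zero-pronoun augmentation for JA→EN: drop a sentence-initial
# overt pronoun so the model must recover it on the English side.

JA_PRONOUNS = ("私は", "彼は", "彼女は", "あなたは")

def drop_pronoun(ja: str) -> str:
    """Remove a sentence-initial overt pronoun to simulate a zero pronoun."""
    for p in JA_PRONOUNS:
        if ja.startswith(p):
            return ja[len(p):]
    return ja

pair = ("私は明日学校に行く", "I will go to school tomorrow")
print((drop_pronoun(pair[0]), pair[1]))  # ('明日学校に行く', 'I will go to school tomorrow')
```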

Document-aligned Japanese-English Conversation Parallel Corpus

1 code implementation WMT (EMNLP) 2020 Matīss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa

Sentence-level (SL) machine translation (MT) has reached acceptable quality for many high-resourced languages, but not document-level (DL) MT, which is difficult to 1) train, given the small amount of available DL data, and 2) evaluate, as the main methods and data sets focus on SL evaluation.

Machine Translation Sentence +1

Designing the Business Conversation Corpus

1 code implementation WS 2019 Matīss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa

While machine translation of written text has come a long way in the past several years thanks to the increasing availability of parallel corpora and corpus-based training technologies, automatic translation of spoken text and dialogues remains challenging even for modern systems.

Ranked #1 on Machine Translation on Business Scene Dialogue JA-EN (using extra training data)

Machine Translation Translation

Data Augmentation with Unsupervised Machine Translation Improves the Structural Similarity of Cross-lingual Word Embeddings

no code implementations ACL 2021 Sosuke Nishikawa, Ryokan Ri, Yoshimasa Tsuruoka

Unsupervised cross-lingual word embedding (CLWE) methods learn a linear transformation matrix that maps between two monolingual embedding spaces trained separately on monolingual corpora (see the mapping sketch below).

Cross-Lingual Word Embeddings Data Augmentation +3
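
A minimal sketch of the shared linear-mapping step that mapping-based CLWE methods build on: solving the orthogonal Procrustes problem for a dictionary of paired word vectors. The random matrices stand in for real embeddings:

```python
# Orthogonal Procrustes: given paired source/target word vectors X and Y,
# find the orthogonal W minimizing ||XW - Y||_F, i.e. W = U V^T from the
# SVD of X^T Y. This is the standard mapping step, not this paper's
# augmentation method itself.
import numpy as np

def procrustes(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Closed-form orthogonal mapping from source to target space."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(1000, 300)), rng.normal(size=(1000, 300))
W = procrustes(X, Y)
print(np.allclose(W @ W.T, np.eye(300)))  # True: the learned mapping is orthogonal
```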

Revisiting the Context Window for Cross-lingual Word Embeddings

no code implementations ACL 2020 Ryokan Ri, Yoshimasa Tsuruoka

Existing approaches to mapping-based cross-lingual word embeddings are based on the assumption that the source and target embedding spaces are structurally similar.

Bilingual Lexicon Induction Cross-Lingual Word Embeddings +1
