Search Results for author: Atsushi Keyaki

Found 4 papers, 3 papers with code

Word-level Perturbation Considering Word Length and Compositional Subwords

1 code implementation • Findings (ACL) 2022 • Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki

We present two simple modifications for word-level perturbation: Word Replacement considering Length (WR-L) and Compositional Word Replacement (CWR). In conventional word replacement, a word in an input is replaced with a word sampled from the entire vocabulary, regardless of the length and context of the target word. WR-L considers the length of a target word by sampling words from the Poisson distribution. CWR considers the compositional candidates by restricting the source of sampling to related words that appear in subword regularization. Experimental results showed that the combination of WR-L and CWR improved the performance of text classification and machine translation.
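
As a rough illustration of the WR-L idea described in this abstract, the sketch below draws a replacement length from a Poisson distribution and then picks a vocabulary word of that length. It is not the authors' implementation. The function names (`sample_poisson`, `wr_l_replace`, `perturb`), the choice of the target word's length as the Poisson mean, the nearest-length fallback, and the replacement rate are all assumptions made for this example.

```python
import math
import random
from collections import defaultdict


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def wr_l_replace(word, vocab, rng):
    """Replace `word` with a vocabulary word whose length is Poisson-sampled."""
    by_length = defaultdict(list)
    for v in vocab:
        by_length[len(v)].append(v)
    # Assumption: the Poisson mean is the length of the target word.
    length = sample_poisson(len(word), rng)
    # Assumption: if no word has the sampled length, use the nearest length.
    nearest = min(by_length, key=lambda n: (abs(n - length), n))
    return rng.choice(by_length[nearest])


def perturb(tokens, vocab, rate=0.3, seed=0):
    """Replace each token with probability `rate` (hypothetical default)."""
    rng = random.Random(seed)
    return [
        wr_l_replace(t, vocab, rng) if rng.random() < rate else t
        for t in tokens
    ]
```

Calling `perturb("the cat sat".split(), vocab)` with a small word list returns a token list of the same length in which some words may have been swapped for vocabulary words of similar length. CWR, which restricts candidates to words related through subword regularization, is not covered by this sketch.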

Machine Translation text-classification +2


Coarse-Tuning for Ad-hoc Document Retrieval Using Pre-trained Language Models

no code implementations • 25 Mar 2024 • Atsushi Keyaki, Ribeka Keyaki

By learning query representations and query-document relations in coarse-tuning, we aim to reduce the load of fine-tuning and improve the learning effect of downstream IR tasks.

Information Retrieval Retrieval


Joint Optimization of Tokenization and Downstream Model

2 code implementations • Findings (ACL) 2021 • Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki

Since traditional tokenizers are isolated from a downstream task and model, they cannot output an appropriate tokenization depending on the task and model, although recent studies imply that the appropriate tokenization improves the performance.

Machine Translation text-classification +2


Optimizing Word Segmentation for Downstream Task

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki

In traditional NLP, we tokenize a given sentence as a preprocessing step, and thus the tokenization is unrelated to the target downstream task.

Natural Language Inference Segmentation +4

