Search Results for author: Katsuki Chousa

Found 8 papers, 2 papers with code

WikiSplit++: Easy Data Refinement for Split and Rephrase

1 code implementation • 13 Apr 2024 • Hayato Tsukagoshi, Tsutomu Hirao, Makoto Morishita, Katsuki Chousa, Ryohei Sasano, Koichi Takeda

The task of Split and Rephrase, which splits a complex sentence into multiple simple sentences with the same meaning, improves readability and enhances the performance of downstream tasks in natural language processing (NLP).
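
To make the task concrete, here is a toy illustration of the input/output contract. This is a minimal rule-based stand-in, not the WikiSplit++ model; the function name and the splitting heuristic are invented purely for illustration.

```python
# A minimal sketch of the Split and Rephrase task contract (not the
# WikiSplit++ system itself): one complex sentence in, simpler sentences out.

def split_and_rephrase(sentence: str) -> list[str]:
    """Toy splitter: breaks on a coordinating ', and '.

    A real system would use a trained seq2seq model; this stub only
    illustrates the expected input/output shape of the task.
    """
    parts = sentence.split(", and ")
    if len(parts) == 1:
        return [sentence]
    subject = parts[0].split(" ")[0]  # naive subject reuse for the rephrase
    return [parts[0] + "."] + [f"{subject} {p.rstrip('.')}." for p in parts[1:]]

print(split_and_rephrase(
    "Alice studied physics in Kyoto, and later moved to Tokyo."
))
# ['Alice studied physics in Kyoto.', 'Alice later moved to Tokyo.']
```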

Sentence • Split and Rephrase • +1

JParaCrawl v3.0: A Large-scale English-Japanese Parallel Corpus

no code implementations • LREC 2022 • Makoto Morishita, Katsuki Chousa, Jun Suzuki, Masaaki Nagata

Most current machine translation models are mainly trained with parallel corpora, and their translation accuracy largely depends on the quality and quantity of the corpora.

Machine Translation • Sentence • +1

Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings

no code implementations • COLING 2020 • Yui Oka, Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

Since length constraints with exact target sentence lengths degrade translation performance, we add random noise within a certain window size to the length constraints in the positional encodings (PE) during training.
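
A minimal sketch of the idea, assuming a sinusoidal encoding driven by the remaining target length, with uniform noise applied to the length constraint at training time. The exact encoding, shapes, and noise distribution below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def length_aware_pe(target_len: int, max_pos: int, d_model: int,
                    noise_window: int = 0) -> np.ndarray:
    """Sinusoidal PE driven by the *remaining* target length.

    During training, uniform noise within +/- noise_window is added to the
    length constraint, as the abstract describes; at inference time,
    noise_window=0 encodes the exact desired length.
    """
    if noise_window:
        target_len += int(rng.integers(-noise_window, noise_window + 1))
    pos = np.arange(max_pos)
    remaining = np.maximum(target_len - pos, 0)[:, None]   # (max_pos, 1)
    dims = np.arange(d_model)[None, :]                     # (1, d_model)
    angle = remaining / np.power(10000.0, (2 * (dims // 2)) / d_model)
    pe = np.where(dims % 2 == 0, np.sin(angle), np.cos(angle))
    return pe  # (max_pos, d_model), added to the decoder input embeddings

pe = length_aware_pe(target_len=20, max_pos=32, d_model=8, noise_window=3)
print(pe.shape)  # (32, 8)
```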

Machine Translation • Sentence • +1

Bilingual Text Extraction as Reading Comprehension

no code implementations • 29 Apr 2020 • Katsuki Chousa, Masaaki Nagata, Masaaki Nishino

We also conduct a sentence alignment experiment using En-Ja newspaper articles and find that the proposed method using multilingual BERT achieves significantly better accuracy than a baseline method using a bilingual dictionary and dynamic programming.
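
For intuition, here is a rough sketch of the dynamic-programming baseline mentioned above: monotonic sentence alignment maximizing a similarity score. The `jaccard` stand-in below replaces the paper's bilingual-dictionary scoring and is purely illustrative.

```python
# Sketch of the DP baseline: find the best monotonic 1-1 alignment path
# between source and target sentences, with free skips on either side.

def align(src: list[str], tgt: list[str], score) -> list[tuple[int, int]]:
    n, m = len(src), len(tgt)
    NEG = float("-inf")
    dp = [[NEG] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if dp[i][j] == NEG:
                continue
            moves = []
            if i < n and j < m:            # align src[i] with tgt[j]
                moves.append((i + 1, j + 1, score(src[i], tgt[j])))
            if i < n:                      # skip a source sentence
                moves.append((i + 1, j, 0.0))
            if j < m:                      # skip a target sentence
                moves.append((i, j + 1, 0.0))
            for ni, nj, s in moves:
                if dp[i][j] + s > dp[ni][nj]:
                    dp[ni][nj] = dp[i][j] + s
                    back[ni][nj] = (i, j)
    # Recover the aligned pairs by walking the backpointers.
    pairs, (i, j) = [], (n, m)
    while back[i][j] is not None:
        pi, pj = back[i][j]
        if pi == i - 1 and pj == j - 1:    # diagonal move = an aligned pair
            pairs.append((pi, pj))
        i, j = pi, pj
    return pairs[::-1]

def jaccard(a: str, b: str) -> float:
    """Stand-in similarity; the paper's baseline scores with a bilingual dictionary."""
    A, B = set(a.lower().split()), set(b.lower().split())
    return len(A & B) / len(A | B) if A | B else 0.0

src = ["the cat sat", "it rained"]
tgt = ["the cat sat down", "totally unrelated", "it rained a lot"]
print(align(src, tgt, jaccard))  # [(0, 0), (1, 2)]
```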

Reading Comprehension • Sentence • +1

Simultaneous Neural Machine Translation using Connectionist Temporal Classification

no code implementations • 27 Nov 2019 • Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous machine translation is a variant of machine translation that begins generating the translation before the full input has been received.
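
The CTC connection, sketched: the model may emit a blank "wait" symbol at each step, and the standard CTC collapse rule (merge repeats, drop blanks) yields the final translation. The toy function below shows only that collapse rule, not the paper's model.

```python
# Standard CTC output rule, which is what makes wait-and-emit decoding
# possible: blanks act as "wait", repeats are merged away.

BLANK = "<blank>"

def ctc_collapse(step_outputs: list[str]) -> list[str]:
    out, prev = [], None
    for tok in step_outputs:
        if tok != prev and tok != BLANK:
            out.append(tok)
        prev = tok
    return out

print(ctc_collapse(["<blank>", "I", "I", "<blank>", "eat", "sushi", "sushi"]))
# ['I', 'eat', 'sushi']
```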

Classification • General Classification • +2

Training Neural Machine Translation using Word Embedding-based Loss

no code implementations • 30 Jul 2018 • Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

The proposed loss function encourages an NMT decoder to generate words close to their references in the embedding space; this helps the decoder to choose similar acceptable words when the actual best candidates are not included in the vocabulary due to its size limitation.
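
One possible form of such a loss, sketched under assumptions: take the expected embedding under the decoder's softmax and pull it toward the reference word's embedding with a cosine distance. This construction is illustrative, not necessarily the paper's exact definition.

```python
import torch
import torch.nn.functional as F

def embedding_similarity_loss(logits, ref_ids, embedding):
    """Sketch of a word-embedding-based loss (assumed form).

    logits:    (batch, seq, vocab) decoder outputs
    ref_ids:   (batch, seq) reference token ids
    embedding: embedding weight matrix of shape (vocab, dim)
    """
    probs = logits.softmax(dim=-1)                 # (B, T, V)
    expected = probs @ embedding                   # expected embedding, (B, T, D)
    ref_vecs = embedding[ref_ids]                  # reference embeddings, (B, T, D)
    return (1.0 - F.cosine_similarity(expected, ref_vecs, dim=-1)).mean()

# Tiny smoke test with random tensors.
V, D = 100, 16
emb = torch.randn(V, D)
logits = torch.randn(2, 5, V)
refs = torch.randint(0, V, (2, 5))
print(embedding_similarity_loss(logits, refs, emb))
```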

Machine Translation • NMT • +2
