Search Results for author: Katsuhito Sudoh

Found 65 papers, 10 papers with code

FINDINGS OF THE IWSLT 2021 EVALUATION CAMPAIGN

no code implementations ACL (IWSLT) 2021 Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured four shared tasks this year: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

Multi-paraphrase Augmentation to Leverage Neural Caption Translation

no code implementations IWSLT (EMNLP) 2018 Johanes Effendi, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura

In this paper, we investigate and utilize neural paraphrasing to improve translation quality in neural MT (NMT), which has not yet been much explored.

Machine Translation, NMT +1

Simultaneous Neural Machine Translation with Prefix Alignment

no code implementations IWSLT (ACL) 2022 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous translation is a task that requires starting translation before the speaker has finished speaking, so we face a trade-off between latency and accuracy.

Machine Translation, Translation

Large-Scale English-Japanese Simultaneous Interpretation Corpus: Construction and Analyses with Sentence-Aligned Data

no code implementations ACL (IWSLT) 2021 Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura

This paper describes the construction of a new large-scale English-Japanese Simultaneous Interpretation (SI) corpus and presents the results of its analysis.

Sentence

Is This Translation Error Critical?: Classification-Based Human and Automatic Machine Translation Evaluation Focusing on Critical Errors

no code implementations EACL (HumEval) 2021 Katsuhito Sudoh, Kosuke Takahashi, Satoshi Nakamura

Our classification-based approach focuses on such errors using several error type labels, for practical machine translation evaluation in an age of neural machine translation.

Machine Translation, Translation

Multi-Source Cross-Lingual Constituency Parsing

no code implementations ICON 2021 Hour Kaing, Chenchen Ding, Katsuhito Sudoh, Masao Utiyama, Eiichiro Sumita, Satoshi Nakamura

Pretrained multilingual language models have become a key part of cross-lingual transfer for many natural language processing tasks, even those without bilingual information.

Constituency Parsing, Cross-Lingual Transfer +1

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation, Translation

TransLLaMa: LLM-based Simultaneous Translation System

no code implementations 7 Feb 2024 Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura

Decoder-only large language models (LLMs) have recently demonstrated impressive capabilities in text generation and reasoning.

Decoder, Machine Translation +3

Average Token Delay: A Duration-aware Latency Metric for Simultaneous Translation

no code implementations 24 Nov 2023 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

In this work, we propose a novel latency evaluation metric for simultaneous translation called \emph{Average Token Delay} (ATD) that focuses on the duration of partial translations.

Translation
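The precise ATD definition appears in the paper; as a purely illustrative sketch (not the exact formula), a token-delay-style latency metric averages, over the output tokens, the gap between when each token is actually emitted and an ideal emission time. All timings below are invented:

```python
def average_token_delay(emit_times, ideal_times):
    """Mean per-token delay; an illustrative stand-in, not the exact ATD formula."""
    assert len(emit_times) == len(ideal_times)
    return sum(e - i for e, i in zip(emit_times, ideal_times)) / len(emit_times)

# Hypothetical emission timings (seconds) for three output tokens.
print(average_token_delay([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]))  # 2.0
```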

Improving Speech Translation Accuracy and Time Efficiency with Fine-tuned wav2vec 2.0-based Speech Segmentation

1 code implementation 25 Apr 2023 Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura

In this study, we extended SHAS to improve ST translation accuracy and efficiency by splitting speech into shorter segments that correspond to sentences.

Segmentation, Translation

Modeling Multiple User Interests using Hierarchical Knowledge for Conversational Recommender System

no code implementations 1 Mar 2023 Yuka Okuda, Katsuhito Sudoh, Seitaro Shinagawa, Satoshi Nakamura

A conversational recommender system (CRS) is a practical application for item recommendation through natural language conversation.

Recommendation Systems

Evaluating the Robustness of Discrete Prompts

1 code implementation 11 Feb 2023 Yoichi Ishibashi, Danushka Bollegala, Katsuhito Sudoh, Satoshi Nakamura

To address this question, we conduct a systematic study of the robustness of discrete prompts by applying carefully designed perturbations to an application using AutoPrompt and then measuring its performance on two Natural Language Inference (NLI) datasets.

Natural Language Inference
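The abstract does not specify the perturbations; a hypothetical character-level perturbation of a discrete prompt, of the kind such a robustness study might apply, could look like this (the function and swap strategy are illustrative assumptions):

```python
import random

def perturb_prompt(prompt: str, n_swaps: int = 1, seed: int = 0) -> str:
    """Illustrative perturbation: swap adjacent characters at random positions."""
    rng = random.Random(seed)
    chars = list(prompt)
    for _ in range(n_swaps):
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

print(perturb_prompt("The movie was [MASK]."))
```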

Average Token Delay: A Latency Metric for Simultaneous Translation

no code implementations 22 Nov 2022 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

In this work, we propose a novel latency evaluation metric called Average Token Delay (ATD) that focuses on the end timings of partial translations in simultaneous translation.

Translation

E2E Refined Dataset

1 code implementation 1 Nov 2022 Keisuke Toyama, Katsuhito Sudoh, Satoshi Nakamura

Although the well-known MR-to-text E2E dataset has been used by many researchers, its MR-text pairs include many deletion/insertion/substitution errors.

Subspace Representations for Soft Set Operations and Sentence Similarities

1 code implementation 24 Oct 2022 Yoichi Ishibashi, Sho Yokoi, Katsuhito Sudoh, Satoshi Nakamura

In the field of natural language processing (NLP), continuous vector representations are crucial for capturing the semantic meanings of individual words.

Retrieval, Semantic Textual Similarity +2

Simultaneous Neural Machine Translation with Constituent Label Prediction

no code implementations WMT (EMNLP) 2021 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous translation is a task in which translation begins before the speaker has finished speaking, so it is important to decide when to start the translation process.

Machine Translation, Translation

Using Perturbed Length-aware Positional Encoding for Non-autoregressive Neural Machine Translation

no code implementations 29 Jul 2021 Yui Oka, Katsuhito Sudoh, Satoshi Nakamura

Non-autoregressive neural machine translation (NAT) usually employs sequence-level knowledge distillation using autoregressive neural machine translation (AT) as its teacher model.

Knowledge Distillation, Machine Translation +1

ARTA: Collection and Classification of Ambiguous Requests and Thoughtful Actions

1 code implementation SIGDIAL (ACL) 2021 Shohei Tanaka, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura

In order to train the classification model on such training data, we applied the positive/unlabeled (PU) learning method, which assumes that only a part of the data is labeled with positive examples.

Classification

Improving Spoken Language Understanding by Wisdom of Crowds

no code implementations COLING 2020 Koichiro Yoshino, Kana Ikeuchi, Katsuhito Sudoh, Satoshi Nakamura

Spoken language understanding (SLU), which converts user requests in natural language to machine-interpretable expressions, is becoming an essential task.

Data Augmentation, Spoken Language Understanding

Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings

no code implementations COLING 2020 Yui Oka, Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

Since length constraints with exact target sentence lengths degrade translation performance, we add random noise within a certain window size to the length constraints in the PE during the training.

Machine Translation, Sentence +1
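The noising step described above — perturbing the exact target length within a window before feeding it to length-aware positional encodings — could be sketched as follows (the window size and interface are assumptions for illustration):

```python
import random

def noisy_length_constraint(target_length: int, window: int = 5) -> int:
    """Add uniform random noise within +/- window to the exact target length,
    so the model does not overfit to exact lengths during training."""
    noise = random.randint(-window, window)
    return max(1, target_length + noise)

random.seed(0)
print(noisy_length_constraint(20))  # some integer in [15, 25]
```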

Image Captioning with Visual Object Representations Grounded in the Textual Modality

no code implementations 19 Oct 2020 Dušan Variš, Katsuhito Sudoh, Satoshi Nakamura

We present our work in progress exploring the possibilities of a shared embedding space between textual and visual modality.

Image Captioning, Object +3

Reflection-based Word Attribute Transfer

2 code implementations ACL 2020 Yoichi Ishibashi, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura

For transferring king into queen in this analogy-based manner, we subtract a difference vector man - woman based on the knowledge that king is male.

Attribute, Word Attribute Transfer +1
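The analogy-based transfer described in the abstract (the baseline the reflection-based method improves on) can be sketched with toy vectors; the embeddings below are invented purely for illustration:

```python
import numpy as np

# Toy 3-d embeddings, invented for illustration only.
vec = {
    "king":  np.array([0.8, 0.9, 0.1]),
    "man":   np.array([0.7, 0.2, 0.1]),
    "woman": np.array([0.7, 0.2, 0.9]),
    "queen": np.array([0.8, 0.9, 0.9]),
}

# Analogy-based attribute transfer: subtract the difference vector man - woman.
transferred = vec["king"] - (vec["man"] - vec["woman"])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Nearest neighbor by cosine similarity.
best = max(vec, key=lambda w: cosine(vec[w], transferred))
print(best)  # with these toy vectors: queen
```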

Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model

no code implementations ACL 2020 Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura

Our experiments show that our proposed method using Cross-lingual Language Model (XLM) trained with a translation language modeling (TLM) objective achieves a higher correlation with human judgments than a baseline method that uses only hypothesis and reference sentences.

Language Modelling, Machine Translation +2

Simultaneous Neural Machine Translation using Connectionist Temporal Classification

no code implementations 27 Nov 2019 Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous machine translation is a variant of machine translation that starts the translation process before the end of an input.

Classification, General Classification +2

Findings of the Third Workshop on Neural Generation and Translation

no code implementations WS 2019 Hiroaki Hayashi, Yusuke Oda, Alexandra Birch, Ioannis Konstas, Andrew Finch, Minh-Thang Luong, Graham Neubig, Katsuhito Sudoh

This document describes the findings of the Third Workshop on Neural Generation and Translation, held in concert with the annual conference of the Empirical Methods in Natural Language Processing (EMNLP 2019).

Machine Translation, NMT +1

Another Diversity-Promoting Objective Function for Neural Dialogue Generation

1 code implementation 20 Nov 2018 Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura

Although generation-based dialogue systems have been widely researched, the response generations by most existing systems have very low diversities.

Dialogue Generation

Training Neural Machine Translation using Word Embedding-based Loss

no code implementations 30 Jul 2018 Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

The proposed loss function encourages an NMT decoder to generate words close to their references in the embedding space; this helps the decoder to choose similar acceptable words when the actual best candidates are not included in the vocabulary due to its size limitation.

Decoder, Machine Translation +3
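A minimal sketch of a loss of this flavor — penalizing distance in embedding space between the prediction and the reference word, so near-synonyms are cheaper than unrelated words — assuming pretrained embeddings and a soft prediction over the vocabulary (all names and sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 100, 16
embeddings = rng.normal(size=(vocab_size, dim))  # stand-in for pretrained word embeddings

def embedding_loss(probs, reference_id):
    """Cosine distance between the probability-weighted predicted embedding
    and the reference word's embedding."""
    predicted = probs @ embeddings               # (dim,) expected output embedding
    ref = embeddings[reference_id]
    cos = predicted @ ref / (np.linalg.norm(predicted) * np.linalg.norm(ref))
    return 1.0 - float(cos)

# A one-hot prediction on the reference word incurs (near-)zero loss.
one_hot = np.zeros(vocab_size)
one_hot[42] = 1.0
print(embedding_loss(one_hot, 42))
```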

Multi-Source Neural Machine Translation with Missing Data

no code implementations WS 2018 Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, Satoshi Nakamura

This study focuses on the use of incomplete multilingual corpora in multi-encoder NMT and mixture of NMT experts and examines a very simple implementation where missing source translations are replaced by a special symbol <NULL>.

Machine Translation, NMT +1
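The very simple missing-data handling described above — replacing an absent source translation with a special <NULL> symbol — might look like this at preprocessing time (the function name and corpus structure are illustrative assumptions):

```python
# Each example maps language codes to source sentences; some sources are missing.
incomplete_corpus = [
    {"en": "Good morning .", "fr": "Bonjour .", "de": None},
    {"en": "Thank you .", "fr": None, "de": "Danke ."},
]

NULL_TOKEN = "<NULL>"

def fill_missing_sources(examples, languages):
    """Replace missing source translations with a special <NULL> symbol."""
    return [
        {lang: (ex.get(lang) or NULL_TOKEN) for lang in languages}
        for ex in examples
    ]

filled = fill_missing_sources(incomplete_corpus, ["en", "fr", "de"])
print(filled[0]["de"])  # <NULL>
```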

Word-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings

no code implementations 2 Apr 2017 Junki Matsuo, Mamoru Komachi, Katsuhito Sudoh

One of the most important problems in machine translation (MT) evaluation is to evaluate the similarity between translation hypotheses with different surface forms from the reference, especially at the segment level.

Machine Translation, Translation +2

Chinese-to-Japanese Patent Machine Translation based on Syntactic Pre-ordering for WAT 2016

no code implementations WS 2016 Katsuhito Sudoh, Masaaki Nagata

This paper presents our Chinese-to-Japanese patent machine translation system for WAT 2016 (Group ID: ntt) that uses syntactic pre-ordering over Chinese dependency structures.

Chinese Word Segmentation, Dependency Parsing +5
