no code implementations • ACL (IWSLT) 2021 • Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner
The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured four shared tasks this year: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.
no code implementations • ACL (IWSLT) 2021 • Ryo Fukuda, Yui Oka, Yasumasa Kano, Yuki Yano, Yuka Ko, Hirotaka Tokuyama, Kosuke Doi, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura
This paper describes NAIST’s system for the English-to-Japanese Simultaneous Text-to-text Translation Task in IWSLT 2021 Evaluation Campaign.
no code implementations • IWSLT (ACL) 2022 • Ryo Fukuda, Yuka Ko, Yasumasa Kano, Kosuke Doi, Hirotaka Tokuyama, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura
This paper describes NAIST’s simultaneous speech translation systems developed for IWSLT 2022 Evaluation Campaign.
no code implementations • IWSLT (ACL) 2022 • Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
Simultaneous translation is a task that requires starting translation before the speaker has finished speaking, so we face a trade-off between latency and accuracy.
no code implementations • dialdoc (ACL) 2022 • Yuya Nakano, Seiya Kawano, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura
Ambiguous questions are generated by eliminating a part of a sentence considering the sentence structure.
no code implementations • WMT (EMNLP) 2021 • Kosuke Takahashi, Yoichi Ishibashi, Katsuhito Sudoh, Satoshi Nakamura
This paper describes our submission to the WMT2021 shared metrics task.
no code implementations • ACL (IWSLT) 2021 • Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura
This paper describes the construction of a new large-scale English-Japanese Simultaneous Interpretation (SI) corpus and presents the results of its analysis.
no code implementations • IWSLT (EMNLP) 2018 • Kaho Osamura, Takatomo Kano, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura
In this paper, a neural sequence-to-sequence ASR model is used as a feature extractor, trained to produce word posterior features from spoken utterances.
no code implementations • ACL (IWSLT) 2021 • Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
Recent studies argue that knowledge distillation is promising for speech translation (ST) using end-to-end models.
no code implementations • IWSLT (EMNLP) 2018 • Johanes Effendi, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura
In this paper, we investigate and utilize neural paraphrasing to improve translation quality in neural MT (NMT), which has not yet been widely explored.
no code implementations • EACL (HumEval) 2021 • Katsuhito Sudoh, Kosuke Takahashi, Satoshi Nakamura
Our classification-based approach focuses on such errors using several error type labels, for practical machine translation evaluation in an age of neural machine translation.
no code implementations • ICON 2021 • Hour Kaing, Chenchen Ding, Katsuhito Sudoh, Masao Utiyama, Eiichiro Sumita, Satoshi Nakamura
Pretrained multilingual language models have become a key part of cross-lingual transfer for many natural language processing tasks, even those without bilingual information.
no code implementations • ICON 2021 • Kohichi Takai, Gen Hattori, Akio Yoneyama, Keiji Yasuda, Katsuhito Sudoh, Satoshi Nakamura
The proposed method applies the Named Entity (NE) feature vector to Factored Transformer for accurate proper noun translation.
no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe
The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.
no code implementations • IWSLT 2017 • Mauro Cettolo, Marcello Federico, Luisa Bentivogli, Jan Niehues, Sebastian Stüker, Katsuhito Sudoh, Koichiro Yoshino, Christian Federmann
The IWSLT 2017 evaluation campaign has organised three tasks.
no code implementations • 7 Feb 2024 • Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura
Decoder-only large language models (LLMs) have recently demonstrated impressive capabilities in text generation and reasoning.
no code implementations • 24 Nov 2023 • Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
In this work, we propose a novel latency evaluation metric for simultaneous translation called \emph{Average Token Delay} (ATD) that focuses on the duration of partial translations.
no code implementations • 14 Jun 2023 • Yuka Ko, Ryo Fukuda, Yuta Nishikawa, Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
In this paper, we propose an effective way to train a SimulST model using a mix of SI and offline translation data.
1 code implementation • 25 Apr 2023 • Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
In this study, we extended SHAS to improve ST translation accuracy and efficiency by splitting speech into shorter segments that correspond to sentences.
no code implementations • 23 Apr 2023 • Jinming Zhao, Yuka Ko, Kosuke Doi, Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
Research has been limited due to the lack of a large-scale training corpus.
no code implementations • 1 Mar 2023 • Yuka Okuda, Katsuhito Sudoh, Seitaro Shinagawa, Satoshi Nakamura
A conversational recommender system (CRS) is a practical application for item recommendation through natural language conversation.
1 code implementation • 11 Feb 2023 • Yoichi Ishibashi, Danushka Bollegala, Katsuhito Sudoh, Satoshi Nakamura
To address this question, we conduct a systematic study of the robustness of discrete prompts by applying carefully designed perturbations to prompts generated with AutoPrompt and then measuring their performance on two Natural Language Inference (NLI) datasets.
no code implementations • 22 Nov 2022 • Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
In this work, we propose a novel latency evaluation metric called Average Token Delay (ATD) that focuses on the end timings of partial translations in simultaneous translation.
1 code implementation • 1 Nov 2022 • Keisuke Toyama, Katsuhito Sudoh, Satoshi Nakamura
Although the well-known meaning representation (MR)-to-text E2E dataset has been used by many researchers, its MR-text pairs include many deletion, insertion, and substitution errors.
1 code implementation • 24 Oct 2022 • Yoichi Ishibashi, Sho Yokoi, Katsuhito Sudoh, Satoshi Nakamura
In the field of natural language processing (NLP), continuous vector representations are crucial for capturing the semantic meanings of individual words.
1 code implementation • 29 Mar 2022 • Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
We also propose a hybrid method that combines VAD and the above speech segmentation method.
no code implementations • WMT (EMNLP) 2021 • Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura
Simultaneous translation is a task in which translation begins before the speaker has finished speaking, so it is important to decide when to start the translation process.
no code implementations • 29 Jul 2021 • Yui Oka, Katsuhito Sudoh, Satoshi Nakamura
Non-autoregressive neural machine translation (NAT) usually employs sequence-level knowledge distillation using autoregressive neural machine translation (AT) as its teacher model.
1 code implementation • SIGDIAL (ACL) 2021 • Shohei Tanaka, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura
In order to train the classification model on such training data, we applied the positive/unlabeled (PU) learning method, which assumes that only a part of the data is labeled with positive examples.
no code implementations • COLING 2020 • Koichiro Yoshino, Kana Ikeuchi, Katsuhito Sudoh, Satoshi Nakamura
Spoken language understanding (SLU), which converts user requests in natural language to machine-interpretable expressions, is becoming an essential task.
no code implementations • COLING 2020 • Yui Oka, Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura
Since length constraints with exact target sentence lengths degrade translation performance, we add random noise within a certain window size to the length constraints in the positional encoding (PE) during training.
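The jittered length constraint described above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation; the function name, the default window size, and the clamping to a minimum length of 1 are all assumptions.

```python
import random

def noisy_length_constraint(target_len, window=3, training=True):
    """Hypothetical sketch: during training, jitter the length
    constraint by a random integer offset within +/- window so the
    model does not overfit to exact target lengths; at inference
    time, use the requested length as-is."""
    if not training:
        return target_len
    noise = random.randint(-window, window)  # uniform in [-window, window]
    return max(1, target_len + noise)

random.seed(0)
print(noisy_length_constraint(20))                  # some value in [17, 23]
print(noisy_length_constraint(20, training=False))  # 20
```

At inference time the exact desired length is passed through unchanged, so the noise only regularizes training.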
no code implementations • 10 Nov 2020 • Katsuhito Sudoh, Takatomo Kano, Sashi Novitasari, Tomoya Yanagita, Sakriani Sakti, Satoshi Nakamura
This paper presents a newly developed, simultaneous neural speech-to-speech translation system and its evaluation.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +6
no code implementations • 19 Oct 2020 • Dušan Variš, Katsuhito Sudoh, Satoshi Nakamura
We present our work in progress exploring the possibilities of a shared embedding space between textual and visual modality.
2 code implementations • ACL 2020 • Yoichi Ishibashi, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
For transferring king into queen in this analogy-based manner, we subtract a difference vector man - woman based on the knowledge that king is male.
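The analogy-based transfer described in the snippet amounts to simple vector arithmetic: queen ≈ king - (man - woman). A toy illustration with made-up 3-dimensional vectors (not real embeddings, and not the paper's proposed method, which studies alternatives to this baseline):

```python
# Toy illustration of analogy-based word vector transfer:
# queen ≈ king - (man - woman). The vectors below are invented
# 3-d examples, not trained embeddings.

def subtract(u, v):
    return [a - b for a, b in zip(u, v)]

king = [0.8, 0.6, 0.1]
man = [0.7, 0.1, 0.0]
woman = [0.2, 0.1, 0.0]

# (man - woman) approximates a "male" direction; removing it from
# king gives an estimate of queen.
gender_direction = subtract(man, woman)
queen_estimate = subtract(king, gender_direction)
```

With real embeddings the estimate is then matched to its nearest-neighbor word in the vocabulary.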
no code implementations • ACL 2020 • Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura
Our experiments show that our proposed method using Cross-lingual Language Model (XLM) trained with a translation language modeling (TLM) objective achieves a higher correlation with human judgments than a baseline method that uses only hypothesis and reference sentences.
no code implementations • WS 2020 • Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
This paper describes NAIST{'}s NMT system submitted to the IWSLT 2020 conversational speech translation task.
no code implementations • 27 Nov 2019 • Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura
Simultaneous machine translation is a variant of machine translation that starts the translation process before the end of an input.
no code implementations • WS 2019 • Hiroaki Hayashi, Yusuke Oda, Alexandra Birch, Ioannis Konstas, Andrew Finch, Minh-Thang Luong, Graham Neubig, Katsuhito Sudoh
This document describes the findings of the Third Workshop on Neural Generation and Translation, held in concert with the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019).
2 code implementations • WS 2019 • Shohei Tanaka, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura
We propose a novel method for selecting coherent and diverse responses for a given dialogue context.
1 code implementation • 20 Nov 2018 • Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
Although generation-based dialogue systems have been widely researched, the responses generated by most existing systems have very low diversity.
no code implementations • IWSLT (EMNLP) 2018 • Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, Satoshi Nakamura
By using information from multiple sources, these systems achieve large gains in accuracy.
no code implementations • 30 Jul 2018 • Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura
The proposed loss function encourages an NMT decoder to generate words close to their references in the embedding space; this helps the decoder to choose similar acceptable words when the actual best candidates are not included in the vocabulary due to its size limitation.
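One way such an embedding-aware loss can be sketched: pull the expected embedding under the decoder's output distribution toward the reference word's embedding, penalizing cosine distance. This is a minimal toy sketch with invented 2-d vectors, not the paper's actual formulation.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

def embedding_similarity_loss(probs, embeddings, ref_id):
    """Hypothetical sketch: compute the expected embedding under the
    decoder's output distribution `probs` and return 1 - cosine
    similarity to the reference word's embedding."""
    dim = len(embeddings[0])
    expected = [sum(p * e[d] for p, e in zip(probs, embeddings))
                for d in range(dim)]
    return 1.0 - cosine(expected, embeddings[ref_id])

embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy 2-d vectors
# A distribution peaked on word 1 (close to reference word 0) incurs
# a small loss; one peaked on the unrelated word 2 a large one.
near_loss = embedding_similarity_loss([0.1, 0.8, 0.1], embeddings, 0)
far_loss = embedding_similarity_loss([0.1, 0.1, 0.8], embeddings, 0)
```

The point of such a loss is that near-synonyms of the reference are penalized less than unrelated words, unlike plain cross-entropy.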
no code implementations • WS 2018 • Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, Satoshi Nakamura
This study focuses on the use of incomplete multilingual corpora in multi-encoder NMT and mixture of NMT experts and examines a very simple implementation where missing source translations are replaced by a special symbol <NULL>.
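The replacement scheme described above is simple to sketch. The function name and the dict-based corpus representation are assumptions for illustration; the special symbol <NULL> is from the source.

```python
NULL = "<NULL>"

def fill_missing_sources(examples, languages):
    """Hypothetical sketch: for multi-source NMT on an incomplete
    multilingual corpus, replace each missing source-language
    sentence with the special <NULL> symbol so every training
    example has an input for every encoder."""
    return [{lang: ex.get(lang, NULL) for lang in languages}
            for ex in examples]

corpus = [
    {"en": "Hello .", "fr": "Bonjour ."},
    {"en": "Thank you ."},  # French side is missing
]
filled = fill_missing_sources(corpus, ["en", "fr"])
```

Each encoder then sees either a real sentence or <NULL>, so the architecture needs no structural change to handle incomplete corpora.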
no code implementations • WS 2017 • Yusuke Oda, Katsuhito Sudoh, Satoshi Nakamura, Masao Utiyama, Eiichiro Sumita
This paper describes the details about the NAIST-NICT machine translation system for WAT2017 English-Japanese Scientific Paper Translation Task.
no code implementations • WS 2017 • Makoto Morishita, Yusuke Oda, Graham Neubig, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura
Training of neural machine translation (NMT) models usually uses mini-batches for efficiency purposes.
no code implementations • 2 Apr 2017 • Junki Matsuo, Mamoru Komachi, Katsuhito Sudoh
One of the most important problems in machine translation (MT) evaluation is to evaluate the similarity between translation hypotheses with different surface forms from the reference, especially at the segment level.
no code implementations • 12 Dec 2016 • Xun Wang, Katsuhito Sudoh, Masaaki Nagata, Tomohide Shibata, Daisuke Kawahara, Sadao Kurohashi
This paper introduces a novel neural network model for question answering, the \emph{entity-based memory network}.
no code implementations • COLING 2016 • Xun Wang, Masaaki Nishino, Tsutomu Hirao, Katsuhito Sudoh, Masaaki Nagata
Existing methods focus on the extraction of key information, but often neglect coherence.
no code implementations • WS 2016 • Katsuhito Sudoh, Masaaki Nagata
This paper presents our Chinese-to-Japanese patent machine translation system for WAT 2016 (Group ID: ntt) that uses syntactic pre-ordering over Chinese dependency structures.
no code implementations • WS 2016 • Shin Kanouchi, Katsuhito Sudoh, Mamoru Komachi
This paper presents an improved lexicalized reordering model for phrase-based statistical machine translation using a deep neural network.