Search Results for author: Satoshi Nakamura

Found 103 papers, 13 papers with code

Is This Translation Error Critical?: Classification-Based Human and Automatic Machine Translation Evaluation Focusing on Critical Errors

no code implementations EACL (HumEval) 2021 Katsuhito Sudoh, Kosuke Takahashi, Satoshi Nakamura

Our classification-based approach focuses on such errors using several error type labels, for practical machine translation evaluation in an age of neural machine translation.

Machine Translation Translation

Findings of the IWSLT 2021 Evaluation Campaign

no code implementations ACL (IWSLT) 2021 Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured four shared tasks this year: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

Simultaneous Neural Machine Translation with Prefix Alignment

no code implementations IWSLT (ACL) 2022 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous translation is a task that requires starting translation before the speaker has finished speaking, so we face a trade-off between latency and accuracy.

Machine Translation Translation

Large-Scale English-Japanese Simultaneous Interpretation Corpus: Construction and Analyses with Sentence-Aligned Data

no code implementations ACL (IWSLT) 2021 Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura

This paper describes the construction of a new large-scale English-Japanese Simultaneous Interpretation (SI) corpus and presents the results of its analysis.

Using Spoken Word Posterior Features in Neural Machine Translation

no code implementations IWSLT (EMNLP) 2018 Kaho Osamura, Takatomo Kano, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura

In this paper, a neural sequence-to-sequence ASR is used as feature processing that is trained to produce word posterior features given spoken utterances.

Automatic Speech Recognition Machine Translation +2

Multi-paraphrase Augmentation to Leverage Neural Caption Translation

no code implementations IWSLT (EMNLP) 2018 Johanes Effendi, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura

In this paper, we investigate and utilize neural paraphrasing to improve translation quality in neural MT (NMT), which has not yet been much explored.

Machine Translation Translation

Speech Artifact Removal from EEG Recordings of Spoken Word Production with Tensor Decomposition

no code implementations 1 Jun 2022 Holy Lovenia, Hiroki Tanaka, Sakriani Sakti, Ayu Purwarianti, Satoshi Nakamura

Research about brain activities involving spoken word production is considerably underdeveloped because of the undiscovered characteristics of speech artifacts, which contaminate electroencephalogram (EEG) signals and prevent the inspection of the underlying cognitive processes.

EEG Tensor Decomposition

Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing

no code implementations 14 May 2022 Heli Qi, Sashi Novitasari, Sakriani Sakti, Satoshi Nakamura

The existing paradigm of semi-supervised S2S ASR utilizes SpecAugment as data augmentation and requires a static teacher model to produce pseudo transcripts for untranscribed speech.

Automatic Speech Recognition Data Augmentation +1

Representing 'how you say' with 'what you say': English corpus of focused speech and text reflecting corresponding implications

no code implementations 29 Mar 2022 Naoaki Suzuki, Satoshi Nakamura

As a type of paralinguistic information, English speech uses sentence stress, the heaviest prominence within a sentence, to convey emphasis.

Translation

Applying Syntax–Prosody Mapping Hypothesis and Prosodic Well-Formedness Constraints to Neural Sequence-to-Sequence Speech Synthesis

no code implementations 29 Mar 2022 Kei Furukawa, Takeshi Kishiyama, Satoshi Nakamura

End-to-end text-to-speech synthesis (TTS), which generates speech sounds directly from strings of texts or phonemes, has improved the quality of speech synthesis over the conventional TTS.

Speech Synthesis Text-To-Speech Synthesis

Simultaneous Neural Machine Translation with Constituent Label Prediction

no code implementations WMT (EMNLP) 2021 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous translation is a task in which translation begins before the speaker has finished speaking, so it is important to decide when to start the translation process.

Machine Translation Translation

Using Perturbed Length-aware Positional Encoding for Non-autoregressive Neural Machine Translation

no code implementations 29 Jul 2021 Yui Oka, Katsuhito Sudoh, Satoshi Nakamura

Non-autoregressive neural machine translation (NAT) usually employs sequence-level knowledge distillation using autoregressive neural machine translation (AT) as its teacher model.

Knowledge Distillation Machine Translation +1

ARTA: Collection and Classification of Ambiguous Requests and Thoughtful Actions

1 code implementation SIGDIAL (ACL) 2021 Shohei Tanaka, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura

In order to train the classification model on such training data, we applied the positive/unlabeled (PU) learning method, which assumes that only a part of the data is labeled with positive examples.

Classification

Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings

no code implementations COLING 2020 Yui Oka, Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

Since length constraints with exact target sentence lengths degrade translation performance, we add random noise within a certain window size to the length constraints in the PE during the training.

Machine Translation Translation
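The training-time perturbation described in the snippet can be sketched as follows; `perturb_length` is a hypothetical helper and the default window size is an assumed hyperparameter, not code from the paper:

```python
import random

def perturb_length(target_length: int, window: int = 5) -> int:
    """Add uniform random noise within +/- `window` to a target-length
    constraint before encoding it in the length-aware positional
    encoding, so the model does not overfit to exact lengths."""
    noise = random.randint(-window, window)
    return max(1, target_length + noise)  # keep the length positive
```

At inference time the exact desired length would be used without noise.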

Improving Spoken Language Understanding by Wisdom of Crowds

no code implementations COLING 2020 Koichiro Yoshino, Kana Ikeuchi, Katsuhito Sudoh, Satoshi Nakamura

Spoken language understanding (SLU), which converts user requests in natural language to machine-interpretable expressions, is becoming an essential task.

Data Augmentation Spoken Language Understanding

Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis

no code implementations LREC 2020 Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

We then develop ASR and TTS of ethnic languages by utilizing Indonesian ASR and TTS in a cross-lingual machine speech chain framework with only text or only speech data, removing the need for paired speech-text data of those ethnic languages.

Machine Translation speech-recognition +3

Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition

no code implementations 4 Nov 2020 Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

One main reason is that the model needs to decide the incremental steps and learn the transcription that aligns with the current short speech segment.

Automatic Speech Recognition speech-recognition

Augmenting Images for ASR and TTS through Single-loop and Dual-loop Multimodal Chain Framework

no code implementations 4 Nov 2020 Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Previous research has proposed a machine speech chain to enable automatic speech recognition (ASR) and text-to-speech synthesis (TTS) to assist each other in semi-supervised learning and to avoid the need for a large amount of paired speech and text data.

Automatic Speech Recognition Image Generation +4

Image Captioning with Visual Object Representations Grounded in the Textual Modality

no code implementations 19 Oct 2020 Dušan Variš, Katsuhito Sudoh, Satoshi Nakamura

We present our work in progress exploring the possibilities of a shared embedding space between textual and visual modality.

Image Captioning object-detection +2

Reflection-based Word Attribute Transfer

2 code implementations ACL 2020 Yoichi Ishibashi, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura

For transferring king into queen in this analogy-based manner, we subtract a difference vector man - woman based on the knowledge that king is male.

Word Attribute Transfer Word Embeddings
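The analogy-based baseline mentioned in the snippet (the paper's reflection-based method itself works differently) can be illustrated with toy vectors; the 3-d embeddings below are invented purely for illustration, where real systems would use trained word vectors:

```python
import numpy as np

# Invented toy embeddings; the third dimension loosely encodes gender.
vec = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.1, 0.0]),
    "woman": np.array([0.5, 0.1, 0.9]),
    "queen": np.array([0.9, 0.8, 1.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Analogy-based attribute transfer: subtract the man - woman
# difference vector from "king".
predicted = vec["king"] - vec["man"] + vec["woman"]
nearest = max(vec, key=lambda w: cosine(vec[w], predicted))
```

In this toy space `nearest` comes out as "queen", matching the king-to-queen transfer in the snippet.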

Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model

no code implementations ACL 2020 Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura

Our experiments show that our proposed method using Cross-lingual Language Model (XLM) trained with a translation language modeling (TLM) objective achieves a higher correlation with human judgments than a baseline method that uses only hypothesis and reference sentences.

Language Modelling Machine Translation +1

Emotional Speech Corpus for Persuasive Dialogue System

no code implementations LREC 2020 Sara Asai, Koichiro Yoshino, Seitaro Shinagawa, Sakriani Sakti, Satoshi Nakamura

Expressing emotion is known as an efficient way to persuade one's dialogue partner to accept one's claim or proposal.

Caption Generation of Robot Behaviors based on Unsupervised Learning of Action Segments

no code implementations 23 Mar 2020 Koichiro Yoshino, Kohei Wakimoto, Yuta Nishimura, Satoshi Nakamura

Two reasons make it challenging to apply existing sequence-to-sequence models to this mapping: 1) it is hard to prepare a large-scale dataset for any kind of robots and their environment, and 2) there is a gap between the number of samples obtained from robot action observations and generated word sequences of captions.

Chunking

Simultaneous Neural Machine Translation using Connectionist Temporal Classification

no code implementations 27 Nov 2019 Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous machine translation is a variant of machine translation that starts the translation process before the end of an input.

Classification General Classification +2

Using Panoramic Videos for Multi-person Localization and Tracking in a 3D Panoramic Coordinate

1 code implementation 24 Nov 2019 Fan Yang, Feiran Li, Yang Wu, Sakriani Sakti, Satoshi Nakamura

3D panoramic multi-person localization and tracking are prominent in many applications; however, conventional methods using LiDAR equipment could be economically expensive and also computationally inefficient due to the processing of point cloud data.

Ranked #1 on Multi-Object Tracking on MOT15_3D (using extra training data)

Multi-Object Tracking

Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks

no code implementations 23 Oct 2019 Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig

As our motivation is to allow acoustic models to re-examine their input features in light of partial hypotheses, we introduce intermediate model heads and loss functions.

Speech-to-speech Translation between Untranscribed Unknown Languages

no code implementations 2 Oct 2019 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Second, we train a sequence-to-sequence model that directly maps the source language speech to the target language's discrete representation.

Speech-to-Speech Translation Translation

Neural Conversation Model Controllable by Given Dialogue Act Based on Adversarial Learning and Label-aware Objective

no code implementations WS 2019 Seiya Kawano, Koichiro Yoshino, Satoshi Nakamura

We introduce an adversarial learning framework for the task of generating conditional responses with a new objective to a discriminator, which explicitly distinguishes sentences by using labels.

Make Skeleton-based Action Recognition Model Smaller, Faster and Better

3 code implementations arXiv 2019 Fan Yang, Sakriani Sakti, Yang Wu, Satoshi Nakamura

Although skeleton-based action recognition has achieved great success in recent years, most of the existing methods may suffer from a large model size and slow execution speed.

Action Recognition Skeleton Based Action Recognition

Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain

no code implementations 3 Jun 2019 Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Previously, a machine speech chain, which is based on sequence-to-sequence deep learning, was proposed to mimic speech perception and production behavior.

Automatic Speech Recognition Data Augmentation +5

An Incremental Turn-Taking Model For Task-Oriented Dialog Systems

2 code implementations 28 May 2019 Andrei C. Coman, Koichiro Yoshino, Yukitoshi Murase, Satoshi Nakamura, Giuseppe Riccardi

To identify the point of maximal understanding in an ongoing utterance, we a) implement an incremental Dialog State Tracker which is updated on a token basis (iDST), b) re-label the Dialog State Tracking Challenge 2 (DSTC2) dataset, and c) adapt it to the incremental turn-taking experimental scenario.

VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019

no code implementations 27 May 2019 Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura

Our proposed approach significantly improved the intelligibility (in CER), the MOS, and discrimination ABX scores compared to the official ZeroSpeech 2019 baseline or even the topline.

Optimization of Information-Seeking Dialogue Strategy for Argumentation-Based Dialogue System

no code implementations 26 Nov 2018 Hisao Katsumi, Takuya Hiraoka, Koichiro Yoshino, Kazeto Yamamoto, Shota Motoura, Kunihiko Sadamasa, Satoshi Nakamura

It is required that these systems have sufficient supporting information to argue their claims rationally; however, the systems often do not have enough of such information in realistic situations.

Another Diversity-Promoting Objective Function for Neural Dialogue Generation

1 code implementation 20 Nov 2018 Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura

Although generation-based dialogue systems have been widely researched, the responses generated by most existing systems have very low diversity.

Dialogue Generation

Training Neural Machine Translation using Word Embedding-based Loss

no code implementations 30 Jul 2018 Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

The proposed loss function encourages an NMT decoder to generate words close to their references in the embedding space; this helps the decoder to choose similar acceptable words when the actual best candidates are not included in the vocabulary due to its size limitation.

Machine Translation Translation +1
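A minimal sketch of such a loss, assuming the decoder outputs a probability distribution over the vocabulary; the function and its interface are illustrative, not the paper's exact formulation:

```python
import numpy as np

def embedding_loss(probs: np.ndarray, embeddings: np.ndarray,
                   ref_id: int) -> float:
    """Cosine-distance loss between the expected output embedding under
    the decoder's distribution and the reference word's embedding;
    words near the reference in embedding space are penalized less
    than unrelated words."""
    expected = probs @ embeddings                 # (dim,) expected embedding
    ref = embeddings[ref_id]
    cos = expected @ ref / (np.linalg.norm(expected) * np.linalg.norm(ref))
    return float(1.0 - cos)                       # 0 when they align exactly
```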

Unsupervised Counselor Dialogue Clustering for Positive Emotion Elicitation in Neural Dialogue System

no code implementations WS 2018 Nurul Lubis, Sakriani Sakti, Koichiro Yoshino, Satoshi Nakamura

Positive emotion elicitation seeks to improve user's emotional state through dialogue system interaction, where a chat-based scenario is layered with an implicit goal to address user's emotional needs.

Emotion Recognition Goal-Oriented Dialogue Systems +1

Multi-Source Neural Machine Translation with Missing Data

no code implementations WS 2018 Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, Satoshi Nakamura

This study focuses on the use of incomplete multilingual corpora in multi-encoder NMT and mixture of NMT experts and examines a very simple implementation where missing source translations are replaced by a special symbol <NULL>.

Machine Translation Translation
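The replacement scheme is simple enough to sketch directly; `fill_missing_sources` is a hypothetical helper name for the preprocessing the snippet describes:

```python
NULL = "<NULL>"

def fill_missing_sources(examples):
    """Replace missing source translations (None) in multilingual
    training tuples with the special <NULL> symbol, so incomplete
    corpora can still feed a multi-encoder NMT model."""
    return [[src if src is not None else NULL for src in example]
            for example in examples]
```

Usage: `fill_missing_sources([["bonjour", None]])` yields `[["bonjour", "<NULL>"]]`.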

Guiding Neural Machine Translation with Retrieved Translation Pieces

no code implementations NAACL 2018 Jingyi Zhang, Masao Utiyama, Eiichro Sumita, Graham Neubig, Satoshi Nakamura

Specifically, for an input sentence, we use a search engine to retrieve sentence pairs whose source sides are similar with the input sentence, and then collect $n$-grams that are both in the retrieved target sentences and aligned with words that match in the source sentences, which we call "translation pieces".

Machine Translation Translation
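The n-gram collection step can be sketched as below; the word-level `alignment` interface (one aligned source word per target position, `None` when unaligned) is a simplifying assumption for illustration, not the paper's exact pipeline:

```python
def collect_translation_pieces(input_words, retrieved_tgt, alignment,
                               max_n=4):
    """Collect target n-grams from a retrieved sentence pair whose
    aligned source words all occur in the input sentence ("translation
    pieces"). `alignment[k]` is the source word aligned to target
    position k, or None if unaligned."""
    input_set = set(input_words)
    pieces = set()
    for i in range(len(retrieved_tgt)):
        for n in range(1, max_n + 1):
            j = i + n
            if j > len(retrieved_tgt):
                break
            span_srcs = [alignment[k] for k in range(i, j)]
            # Keep the n-gram only if every aligned source word matches.
            if all(s is not None and s in input_set for s in span_srcs):
                pieces.add(tuple(retrieved_tgt[i:j]))
    return pieces
```

For example, with input `["le", "chat"]`, retrieved target `["the", "cat", "sleeps"]`, and alignment `["le", "chat", None]`, the collected pieces are `("the",)`, `("cat",)`, and `("the", "cat")`.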

Machine Speech Chain with One-shot Speaker Adaptation

no code implementations 28 Mar 2018 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

In the speech chain loop mechanism, ASR also benefits from the ability to further learn an arbitrary speaker's characteristics from the generated speech waveform, resulting in a significant improvement in the recognition rate.

Automatic Speech Recognition Speaker Recognition +3

Tensor Decomposition for Compressing Recurrent Neural Network

1 code implementation 28 Feb 2018 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

In machine learning, the Recurrent Neural Network (RNN) has become a popular architecture for sequential data modeling.

Tensor Decomposition

Interactive Image Manipulation with Natural Language Instruction Commands

no code implementations 23 Feb 2018 Seitaro Shinagawa, Koichiro Yoshino, Sakriani Sakti, Yu Suzuki, Satoshi Nakamura

We propose an interactive image-manipulation system with natural language instruction, which can generate a target image from a source image and an instruction that describes the difference between the source and the target image.

Image Generation Image Manipulation

Structured-based Curriculum Learning for End-to-end English-Japanese Speech Translation

no code implementations 13 Feb 2018 Takatomo Kano, Sakriani Sakti, Satoshi Nakamura

Sequence-to-sequence attentional-based neural network architectures have been shown to provide a powerful model for machine translation and speech recognition.

Machine Translation speech-recognition +2

Improving Neural Machine Translation through Phrase-based Forced Decoding

no code implementations IJCNLP 2017 Jingyi Zhang, Masao Utiyama, Eiichro Sumita, Graham Neubig, Satoshi Nakamura

Compared to traditional statistical machine translation (SMT), neural machine translation (NMT) often sacrifices adequacy for the sake of fluency.

Machine Translation Translation

Sequence-to-Sequence ASR Optimization via Reinforcement Learning

no code implementations 30 Oct 2017 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Despite the success of sequence-to-sequence approaches in automatic speech recognition (ASR) systems, the models still suffer from several problems, mainly due to the mismatch between the training and inference conditions.

Automatic Speech Recognition reinforcement-learning +1

Attention-based Wav2Text with Feature Transfer Learning

no code implementations 22 Sep 2017 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

In this paper, we construct the first end-to-end attention-based encoder-decoder model to process directly from raw speech waveform to the text transcription.

Automatic Speech Recognition speech-recognition +1

Transcribing Against Time

no code implementations 15 Sep 2017 Matthias Sperber, Graham Neubig, Jan Niehues, Satoshi Nakamura, Alex Waibel

We investigate the problem of manually correcting errors from an automatic speech transcript in a cost-sensitive fashion.

Information Navigation System with Discovering User Interests

no code implementations WS 2017 Koichiro Yoshino, Yu Suzuki, Satoshi Nakamura

We demonstrate an information navigation system for sightseeing domains that has a dialogue interface for discovering user interests for tourist activities.

Semantic Textual Similarity Speech Recognition

Listening while Speaking: Speech Chain by Deep Learning

no code implementations 16 Jul 2017 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

In this paper, we take a step further and develop a closed-loop speech chain model based on deep learning.

Automatic Speech Recognition speech-recognition +2

Gated Recurrent Neural Tensor Network

no code implementations 7 Jun 2017 Andros Tjandra, Sakriani Sakti, Ruli Manurung, Mirna Adriani, Satoshi Nakamura

Our proposed RNNs, which are called a Long Short-Term Memory Recurrent Neural Tensor Network (LSTMRNTN) and a Gated Recurrent Unit Recurrent Neural Tensor Network (GRURNTN), are made by combining the LSTM and GRU RNN models with the tensor product.

Language Modelling

Analysis of the Effect of Dependency Information on Predicate-Argument Structure Analysis and Zero Anaphora Resolution

no code implementations 31 May 2017 Koichiro Yoshino, Shinsuke Mori, Satoshi Nakamura

This paper investigates and analyzes the effect of dependency information on predicate-argument structure analysis (PASA) and zero anaphora resolution (ZAR) for Japanese, and shows that a straightforward approach of PASA and ZAR works effectively even if dependency information was not available.

Dependency Parsing POS

Compressing Recurrent Neural Network with Tensor Train

no code implementations 23 May 2017 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Recurrent Neural Networks (RNNs) are a popular choice for modeling temporal and sequential tasks and achieve state-of-the-art performance on many complex problems.
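The core idea, compressing a large weight tensor into a Tensor Train of small 3-way cores via repeated truncated SVDs, can be sketched in NumPy. This is an illustrative TT-SVD, not the paper's TT-RNN layers, which additionally reshape RNN weight matrices into higher-order tensors:

```python
import numpy as np

def tt_decompose(tensor, max_rank):
    """Factor a d-way tensor into Tensor Train cores (TT-SVD sketch).
    Each core has shape (r_prev, n_k, r_next); small ranks mean far
    fewer parameters than the full tensor."""
    shape = tensor.shape
    cores, rank = [], 1
    rest = tensor.reshape(rank * shape[0], -1)
    for k in range(len(shape) - 1):
        u, s, vt = np.linalg.svd(rest, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, shape[k], r))
        rest = (np.diag(s[:r]) @ vt[:r]).reshape(r * shape[k + 1], -1)
        rank = r
    cores.append(rest.reshape(rank, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the train of cores back into the full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])
```

A rank-1 tensor is recovered exactly because its singular spectrum fits within the truncation rank; for full-rank weight tensors the truncation trades accuracy for compression.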

Incorporating Discrete Translation Lexicons into Neural Machine Translation

2 code implementations EMNLP 2016 Philip Arthur, Graham Neubig, Satoshi Nakamura

Neural machine translation (NMT) often makes mistakes in translating low-frequency content words that are essential to understanding the meaning of the sentence.

Machine Translation Translation

Neural Reranking Improves Subjective Quality of Machine Translation: NAIST at WAT2015

no code implementations WS 2015 Graham Neubig, Makoto Morishita, Satoshi Nakamura

We further perform a detailed analysis of reasons for this increase, finding that the main contributions of the neural models lie in improvement of the grammatical correctness of the output, as opposed to improvements in lexical choice of content words.

Machine Translation Translation
