Search Results for author: Satoshi Nakamura

Found 125 papers, 21 papers with code

Multi-paraphrase Augmentation to Leverage Neural Caption Translation

no code implementations IWSLT (EMNLP) 2018 Johanes Effendi, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura

In this paper, we investigate and utilize neural paraphrasing to improve translation quality in neural MT (NMT), which has not yet been much explored.

Machine Translation NMT +1

Simultaneous Neural Machine Translation with Prefix Alignment

no code implementations IWSLT (ACL) 2022 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous translation is a task that requires starting translation before the speaker has finished speaking, so we face a trade-off between latency and accuracy.

Machine Translation Translation

Large-Scale English-Japanese Simultaneous Interpretation Corpus: Construction and Analyses with Sentence-Aligned Data

no code implementations ACL (IWSLT) 2021 Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura

This paper describes the construction of a new large-scale English-Japanese Simultaneous Interpretation (SI) corpus and presents the results of its analysis.

Sentence

Findings of the IWSLT 2021 Evaluation Campaign

no code implementations ACL (IWSLT) 2021 Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured this year four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

Multi-Source Cross-Lingual Constituency Parsing

no code implementations ICON 2021 Hour Kaing, Chenchen Ding, Katsuhito Sudoh, Masao Utiyama, Eiichiro Sumita, Satoshi Nakamura

Pretrained multilingual language models have become a key part of cross-lingual transfer for many natural language processing tasks, even those without bilingual information.

Constituency Parsing Cross-Lingual Transfer +1

Is This Translation Error Critical?: Classification-Based Human and Automatic Machine Translation Evaluation Focusing on Critical Errors

no code implementations EACL (HumEval) 2021 Katsuhito Sudoh, Kosuke Takahashi, Satoshi Nakamura

Our classification-based approach focuses on such errors using several error type labels, for practical machine translation evaluation in an age of neural machine translation.

Machine Translation Translation

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

TransLLaMa: LLM-based Simultaneous Translation System

no code implementations 7 Feb 2024 Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura

Decoder-only large language models (LLMs) have recently demonstrated impressive capabilities in text generation and reasoning.

Machine Translation Sentence +2

Response Generation for Cognitive Behavioral Therapy with Large Language Models: Comparative Study with Socratic Questioning

no code implementations 29 Jan 2024 Kenta Izumi, Hiroki Tanaka, Kazuhiro Shidara, Hiroyoshi Adachi, Daisuke Kanayama, Takashi Kudo, Satoshi Nakamura

By comparing systems that use LLM-generated responses with those that do not, we investigate the impact of generated responses on subjective evaluations such as mood change, cognitive change, and dialogue quality (e.g., empathy).

Response Generation

Average Token Delay: A Duration-aware Latency Metric for Simultaneous Translation

no code implementations 24 Nov 2023 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

In this work, we propose a novel latency evaluation metric for simultaneous translation called Average Token Delay (ATD) that focuses on the duration of partial translations.

Translation

Computational analyses of linguistic features with schizophrenic and autistic traits along with formal thought disorders

no code implementations 14 Oct 2023 Takeshi Saga, Hiroki Tanaka, Satoshi Nakamura

We confirmed that an FTD-related subscale, odd speech, was significantly correlated with both the total SPQ and SRS scores, although they themselves were not correlated significantly.

Inter-connection: Effective Connection between Pre-trained Encoder and Decoder for Speech Translation

no code implementations 26 May 2023 Yuta Nishikawa, Satoshi Nakamura

In this study, we propose an inter-connection mechanism that aggregates the information from each layer of the speech pre-trained model by weighted sums and inputs into the decoder.

Translation

Arukikata Travelogue Dataset

no code implementations 19 May 2023 Hiroki Ouchi, Hiroyuki Shindo, Shoko Wakamiya, Yuki Matsuda, Naoya Inoue, Shohei Higashiyama, Satoshi Nakamura, Taro Watanabe

We have constructed Arukikata Travelogue Dataset and released it free of charge for academic research.

Improving Speech Translation Accuracy and Time Efficiency with Fine-tuned wav2vec 2.0-based Speech Segmentation

1 code implementation 25 Apr 2023 Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura

In this study, we extended SHAS to improve ST translation accuracy and efficiency by splitting speech into shorter segments that correspond to sentences.

Segmentation Translation

Sketch-based Medical Image Retrieval

no code implementations 7 Mar 2023 Kazuma Kobayashi, Lin Gu, Ryuichiro Hataya, Takaaki Mizuno, Mototaka Miyake, Hirokazu Watanabe, Masamichi Takahashi, Yasuyuki Takamizawa, Yukihiro Yoshida, Satoshi Nakamura, Nobuji Kouno, Amina Bolatkan, Yusuke Kurose, Tatsuya Harada, Ryuji Hamamoto

As a result, our SBMIR system enabled users to overcome previous challenges, including image retrieval based on fine-grained image characteristics, image retrieval without example images, and image retrieval for isolated samples.

Medical Image Retrieval Retrieval

Modeling Multiple User Interests using Hierarchical Knowledge for Conversational Recommender System

no code implementations 1 Mar 2023 Yuka Okuda, Katsuhito Sudoh, Seitaro Shinagawa, Satoshi Nakamura

A conversational recommender system (CRS) is a practical application for item recommendation through natural language conversation.

Recommendation Systems

What's New? Identifying the Unfolding of New Events in Narratives

no code implementations 15 Feb 2023 Seyed Mahed Mousavi, Shohei Tanaka, Gabriel Roccabruna, Koichiro Yoshino, Satoshi Nakamura, Giuseppe Riccardi

We publish the annotated dataset, annotation materials, and machine learning baseline models for the task of new event extraction for narrative understanding.

Event Extraction Sentence

Evaluating the Robustness of Discrete Prompts

1 code implementation 11 Feb 2023 Yoichi Ishibashi, Danushka Bollegala, Katsuhito Sudoh, Satoshi Nakamura

To address this question, we conduct a systematic study of the robustness of discrete prompts by applying carefully designed perturbations into an application using AutoPrompt and then measure their performance in two Natural Language Inference (NLI) datasets.

Natural Language Inference

SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain

no code implementations 8 Jan 2023 Heli Qi, Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

This paper introduces SpeeChain, an open-source Pytorch-based toolkit designed to develop the machine speech chain for large-scale use.

Data Augmentation

Instance-level Heterogeneous Domain Adaptation for Limited-labeled Sketch-to-Photo Retrieval

1 code implementation IEEE Transactions on Multimedia 2020 Fan Yang, Yang Wu, Zheng Wang, Xiang Li, Sakriani Sakti, Satoshi Nakamura

Therefore, previous works pre-train their models on rich-labeled photo retrieval data (i.e., source domain) and then fine-tune them on the limited-labeled sketch-to-photo retrieval data (i.e., target domain).

Domain Adaptation Image Retrieval +1

Average Token Delay: A Latency Metric for Simultaneous Translation

no code implementations 22 Nov 2022 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

In this work, we propose a novel latency evaluation metric called Average Token Delay (ATD) that focuses on the end timings of partial translations in simultaneous translation.

Translation

E2E Refined Dataset

1 code implementation 1 Nov 2022 Keisuke Toyama, Katsuhito Sudoh, Satoshi Nakamura

Although the well-known MR-to-text E2E dataset has been used by many researchers, its MR-text pairs include many deletion/insertion/substitution errors.

Speech Artifact Removal from EEG Recordings of Spoken Word Production with Tensor Decomposition

no code implementations 1 Jun 2022 Holy Lovenia, Hiroki Tanaka, Sakriani Sakti, Ayu Purwarianti, Satoshi Nakamura

Research about brain activities involving spoken word production is considerably underdeveloped because of the undiscovered characteristics of speech artifacts, which contaminate electroencephalogram (EEG) signals and prevent the inspection of the underlying cognitive processes.

blind source separation EEG +2

Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing

no code implementations 14 May 2022 Heli Qi, Sashi Novitasari, Sakriani Sakti, Satoshi Nakamura

The existing paradigm of semi-supervised S2S ASR utilizes SpecAugment as data augmentation and requires a static teacher model to produce pseudo transcripts for untranscribed speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Representing 'how you say' with 'what you say': English corpus of focused speech and text reflecting corresponding implications

no code implementations 29 Mar 2022 Naoaki Suzuki, Satoshi Nakamura

As a type of paralinguistic information, English speech uses sentence stress, the heaviest prominence within a sentence, to convey emphasis.

Sentence Translation

Applying Syntax–Prosody Mapping Hypothesis and Prosodic Well-Formedness Constraints to Neural Sequence-to-Sequence Speech Synthesis

no code implementations 29 Mar 2022 Kei Furukawa, Takeshi Kishiyama, Satoshi Nakamura

End-to-end text-to-speech synthesis (TTS), which generates speech sounds directly from strings of texts or phonemes, has improved the quality of speech synthesis over the conventional TTS.

Speech Synthesis Text-To-Speech Synthesis

Simultaneous Neural Machine Translation with Constituent Label Prediction

no code implementations WMT (EMNLP) 2021 Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous translation is a task in which translation begins before the speaker has finished speaking, so it is important to decide when to start the translation process.

Machine Translation Translation

Using Perturbed Length-aware Positional Encoding for Non-autoregressive Neural Machine Translation

no code implementations 29 Jul 2021 Yui Oka, Katsuhito Sudoh, Satoshi Nakamura

Non-autoregressive neural machine translation (NAT) usually employs sequence-level knowledge distillation using autoregressive neural machine translation (AT) as its teacher model.

Knowledge Distillation Machine Translation +1

ARTA: Collection and Classification of Ambiguous Requests and Thoughtful Actions

1 code implementation SIGDIAL (ACL) 2021 Shohei Tanaka, Koichiro Yoshino, Katsuhito Sudoh, Satoshi Nakamura

In order to train the classification model on such training data, we applied the positive/unlabeled (PU) learning method, which assumes that only a part of the data is labeled with positive examples.

Classification

Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings

no code implementations COLING 2020 Yui Oka, Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

Since length constraints with exact target sentence lengths degrade translation performance, we add random noise within a certain window size to the length constraints in the PE during the training.

Machine Translation Sentence +1

Improving Spoken Language Understanding by Wisdom of Crowds

no code implementations COLING 2020 Koichiro Yoshino, Kana Ikeuchi, Katsuhito Sudoh, Satoshi Nakamura

Spoken language understanding (SLU), which converts user requests in natural language to machine-interpretable expressions, is becoming an essential task.

Data Augmentation Spoken Language Understanding

Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition

no code implementations 4 Nov 2020 Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

One main reason is because the model needs to decide the incremental steps and learn the transcription that aligns with the current short speech segment.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Augmenting Images for ASR and TTS through Single-loop and Dual-loop Multimodal Chain Framework

no code implementations 4 Nov 2020 Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Previous research has proposed a machine speech chain to enable automatic speech recognition (ASR) and text-to-speech synthesis (TTS) to assist each other in semi-supervised learning and to avoid the need for a large amount of paired speech and text data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis

no code implementations LREC 2020 Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

We then develop ASR and TTS of ethnic languages by utilizing Indonesian ASR and TTS in a cross-lingual machine speech chain framework with only text or only speech data removing the need for paired speech-text data of those ethnic languages.

Machine Translation speech-recognition +3

Image Captioning with Visual Object Representations Grounded in the Textual Modality

no code implementations 19 Oct 2020 Dušan Variš, Katsuhito Sudoh, Satoshi Nakamura

We present our work in progress exploring the possibilities of a shared embedding space between textual and visual modality.

Image Captioning Object +3

Reflection-based Word Attribute Transfer

2 code implementations ACL 2020 Yoichi Ishibashi, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura

For transferring king into queen in this analogy-based manner, we subtract a difference vector man - woman based on the knowledge that king is male.

Attribute Word Attribute Transfer +1

Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model

no code implementations ACL 2020 Kosuke Takahashi, Katsuhito Sudoh, Satoshi Nakamura

Our experiments show that our proposed method using Cross-lingual Language Model (XLM) trained with a translation language modeling (TLM) objective achieves a higher correlation with human judgments than a baseline method that uses only hypothesis and reference sentences.

Language Modelling Machine Translation +2

Emotional Speech Corpus for Persuasive Dialogue System

no code implementations LREC 2020 Sara Asai, Koichiro Yoshino, Seitaro Shinagawa, Sakriani Sakti, Satoshi Nakamura

Expressing emotion is known as an efficient way to persuade one's dialogue partner to accept one's claim or proposal.

Caption Generation of Robot Behaviors based on Unsupervised Learning of Action Segments

no code implementations 23 Mar 2020 Koichiro Yoshino, Kohei Wakimoto, Yuta Nishimura, Satoshi Nakamura

Two reasons make it challenging to apply existing sequence-to-sequence models to this mapping: 1) it is hard to prepare a large-scale dataset for any kind of robots and their environment, and 2) there is a gap between the number of samples obtained from robot action observations and generated word sequences of captions.

Chunking Clustering

Simultaneous Neural Machine Translation using Connectionist Temporal Classification

no code implementations 27 Nov 2019 Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

Simultaneous machine translation is a variant of machine translation that starts the translation process before the end of an input.

Classification General Classification +2

Using Panoramic Videos for Multi-person Localization and Tracking in a 3D Panoramic Coordinate

1 code implementation 24 Nov 2019 Fan Yang, Feiran Li, Yang Wu, Sakriani Sakti, Satoshi Nakamura

3D panoramic multi-person localization and tracking are prominent in many applications, however, conventional methods using LiDAR equipment could be economically expensive and also computationally inefficient due to the processing of point cloud data.

Ranked #1 on Multi-Object Tracking on MOT15_3D (using extra training data)

Multi-Object Tracking

Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks

1 code implementation 23 Oct 2019 Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig

As our motivation is to allow acoustic models to re-examine their input features in light of partial hypotheses we introduce intermediate model heads and loss function.

Speech-to-speech Translation between Untranscribed Unknown Languages

no code implementations 2 Oct 2019 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Second, we train a sequence-to-sequence model that directly maps the source language speech to the target language's discrete representation.

Speech-to-Speech Translation Translation

Neural Conversation Model Controllable by Given Dialogue Act Based on Adversarial Learning and Label-aware Objective

no code implementations WS 2019 Seiya Kawano, Koichiro Yoshino, Satoshi Nakamura

We introduce an adversarial learning framework for the task of generating conditional responses with a new objective to a discriminator, which explicitly distinguishes sentences by using labels.

Make Skeleton-based Action Recognition Model Smaller, Faster and Better

3 code implementations arXiv 2019 Fan Yang, Sakriani Sakti, Yang Wu, Satoshi Nakamura

Although skeleton-based action recognition has achieved great success in recent years, most of the existing methods may suffer from a large model size and slow execution speed.

Action Recognition Hand Gesture Recognition +1

Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain

no code implementations 3 Jun 2019 Johanes Effendi, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Previously, a machine speech chain, which is based on sequence-to-sequence deep learning, was proposed to mimic speech perception and production behavior.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

An Incremental Turn-Taking Model For Task-Oriented Dialog Systems

2 code implementations 28 May 2019 Andrei C. Coman, Koichiro Yoshino, Yukitoshi Murase, Satoshi Nakamura, Giuseppe Riccardi

To identify the point of maximal understanding in an ongoing utterance, we a) implement an incremental Dialog State Tracker which is updated on a token basis (iDST) b) re-label the Dialog State Tracking Challenge 2 (DSTC2) dataset and c) adapt it to the incremental turn-taking experimental scenario.

dialog state tracking

VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019

no code implementations 27 May 2019 Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura

Our proposed approach significantly improved the intelligibility (in CER), the MOS, and discrimination ABX scores compared to the official ZeroSpeech 2019 baseline or even the topline.

Clustering

Optimization of Information-Seeking Dialogue Strategy for Argumentation-Based Dialogue System

no code implementations 26 Nov 2018 Hisao Katsumi, Takuya Hiraoka, Koichiro Yoshino, Kazeto Yamamoto, Shota Motoura, Kunihiko Sadamasa, Satoshi Nakamura

It is required that these systems have sufficient supporting information to argue their claims rationally; however, the systems often do not have enough of such information in realistic situations.

Another Diversity-Promoting Objective Function for Neural Dialogue Generation

1 code implementation 20 Nov 2018 Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura

Although generation-based dialogue systems have been widely researched, the response generations by most existing systems have very low diversities.

Dialogue Generation

Training Neural Machine Translation using Word Embedding-based Loss

no code implementations 30 Jul 2018 Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura

The proposed loss function encourages an NMT decoder to generate words close to their references in the embedding space; this helps the decoder to choose similar acceptable words when the actual best candidates are not included in the vocabulary due to its size limitation.

Machine Translation NMT +2

Unsupervised Counselor Dialogue Clustering for Positive Emotion Elicitation in Neural Dialogue System

no code implementations WS 2018 Nurul Lubis, Sakriani Sakti, Koichiro Yoshino, Satoshi Nakamura

Positive emotion elicitation seeks to improve user's emotional state through dialogue system interaction, where a chat-based scenario is layered with an implicit goal to address user's emotional needs.

Clustering Emotion Recognition +2

Multi-Source Neural Machine Translation with Missing Data

no code implementations WS 2018 Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, Satoshi Nakamura

This study focuses on the use of incomplete multilingual corpora in multi-encoder NMT and mixture of NMT experts and examines a very simple implementation where missing source translations are replaced by a special symbol <NULL>.

Machine Translation NMT +2

Guiding Neural Machine Translation with Retrieved Translation Pieces

no code implementations NAACL 2018 Jingyi Zhang, Masao Utiyama, Eiichiro Sumita, Graham Neubig, Satoshi Nakamura

Specifically, for an input sentence, we use a search engine to retrieve sentence pairs whose source sides are similar with the input sentence, and then collect $n$-grams that are both in the retrieved target sentences and aligned with words that match in the source sentences, which we call "translation pieces".

Machine Translation NMT +3

Machine Speech Chain with One-shot Speaker Adaptation

no code implementations 28 Mar 2018 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

In the speech chain loop mechanism, ASR also benefits from the ability to further learn an arbitrary speaker's characteristics from the generated speech waveform, resulting in a significant improvement in the recognition rate.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Tensor Decomposition for Compressing Recurrent Neural Network

1 code implementation 28 Feb 2018 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

In the machine learning fields, Recurrent Neural Network (RNN) has become a popular architecture for sequential data modeling.

Tensor Decomposition

Interactive Image Manipulation with Natural Language Instruction Commands

no code implementations 23 Feb 2018 Seitaro Shinagawa, Koichiro Yoshino, Sakriani Sakti, Yu Suzuki, Satoshi Nakamura

We propose an interactive image-manipulation system with natural language instruction, which can generate a target image from a source image and an instruction that describes the difference between the source and the target image.

Image Generation Image Manipulation

Structured-based Curriculum Learning for End-to-end English-Japanese Speech Translation

no code implementations 13 Feb 2018 Takatomo Kano, Sakriani Sakti, Satoshi Nakamura

Sequence-to-sequence attentional-based neural network architectures have been shown to provide a powerful model for machine translation and speech recognition.

Machine Translation speech-recognition +2

Improving Neural Machine Translation through Phrase-based Forced Decoding

no code implementations IJCNLP 2017 Jingyi Zhang, Masao Utiyama, Eiichiro Sumita, Graham Neubig, Satoshi Nakamura

Compared to traditional statistical machine translation (SMT), neural machine translation (NMT) often sacrifices adequacy for the sake of fluency.

Machine Translation NMT +1

Sequence-to-Sequence ASR Optimization via Reinforcement Learning

no code implementations 30 Oct 2017 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Despite the success of sequence-to-sequence approaches in automatic speech recognition (ASR) systems, the models still suffer from several problems, mainly due to the mismatch between the training and inference conditions.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Attention-based Wav2Text with Feature Transfer Learning

no code implementations 22 Sep 2017 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

In this paper, we construct the first end-to-end attention-based encoder-decoder model to process directly from raw speech waveform to the text transcription.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Transcribing Against Time

no code implementations 15 Sep 2017 Matthias Sperber, Graham Neubig, Jan Niehues, Satoshi Nakamura, Alex Waibel

We investigate the problem of manually correcting errors from an automatic speech transcript in a cost-sensitive fashion.

Information Navigation System with Discovering User Interests

no code implementations WS 2017 Koichiro Yoshino, Yu Suzuki, Satoshi Nakamura

We demonstrate an information navigation system for sightseeing domains that has a dialogue interface for discovering user interests for tourist activities.

Semantic Textual Similarity Speech Recognition

Gated Recurrent Neural Tensor Network

no code implementations 7 Jun 2017 Andros Tjandra, Sakriani Sakti, Ruli Manurung, Mirna Adriani, Satoshi Nakamura

Our proposed RNNs, which are called a Long-Short Term Memory Recurrent Neural Tensor Network (LSTMRNTN) and Gated Recurrent Unit Recurrent Neural Tensor Network (GRURNTN), are made by combining the LSTM and GRU RNN models with the tensor product.

Language Modelling

Analysis of the Effect of Dependency Information on Predicate-Argument Structure Analysis and Zero Anaphora Resolution

no code implementations 31 May 2017 Koichiro Yoshino, Shinsuke Mori, Satoshi Nakamura

This paper investigates and analyzes the effect of dependency information on predicate-argument structure analysis (PASA) and zero anaphora resolution (ZAR) for Japanese, and shows that a straightforward approach of PASA and ZAR works effectively even if dependency information was not available.

Dependency Parsing POS

Compressing Recurrent Neural Network with Tensor Train

no code implementations 23 May 2017 Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Recurrent Neural Networks (RNNs) are a popular choice for modeling temporal and sequential tasks and achieve state-of-the-art performance on various complex problems.

Incorporating Discrete Translation Lexicons into Neural Machine Translation

2 code implementations EMNLP 2016 Philip Arthur, Graham Neubig, Satoshi Nakamura

Neural machine translation (NMT) often makes mistakes in translating low-frequency content words that are essential to understanding the meaning of the sentence.

Machine Translation NMT +3

Neural Reranking Improves Subjective Quality of Machine Translation: NAIST at WAT2015

no code implementations WS 2015 Graham Neubig, Makoto Morishita, Satoshi Nakamura

We further perform a detailed analysis of reasons for this increase, finding that the main contributions of the neural models lie in improvement of the grammatical correctness of the output, as opposed to improvements in lexical choice of content words.

Machine Translation Translation
