Search Results for author: Ngoc Thang Vu

Found 56 papers, 8 papers with code

IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task

no code implementations 30 Jun 2021 Pavel Denisov, Manuel Mager, Ngoc Thang Vu

This paper describes the submission to the IWSLT 2021 Low-Resource Speech Translation Shared Task by the IMS team.

Data Augmentation Machine Translation +2

Few-shot Learning for Slot Tagging with Attentive Relational Network

no code implementations EACL 2021 Cennet Oguz, Ngoc Thang Vu

Metric-based learning is a well-known family of methods for few-shot learning, especially in computer vision.

Few-Shot Learning

Investigations on Audiovisual Emotion Recognition in Noisy Conditions

no code implementations 2 Mar 2021 Michael Neumann, Ngoc Thang Vu

In this paper we explore audiovisual emotion recognition under noisy acoustic conditions with a focus on speech features.

Speech Emotion Recognition

Meta-Learning for improving rare word recognition in end-to-end ASR

no code implementations 25 Feb 2021 Florian Lux, Ngoc Thang Vu

We propose a new method of generating meaningful embeddings for speech, changes to four commonly used meta learning approaches to enable them to perform keyword spotting in continuous signals and an approach of combining their outcomes into an end-to-end automatic speech recognition system to improve rare word recognition.

End-To-End Speech Recognition Keyword Spotting +2

Fine-tuning BERT for Low-Resource Natural Language Understanding via Active Learning

no code implementations 4 Dec 2020 Daniel Grießhaber, Johannes Maucher, Ngoc Thang Vu

Recently, leveraging pre-trained Transformer based language models in down stream, task specific models has advanced state of the art results in natural language understanding tasks.

Active Learning Language Modelling +1

F1 is Not Enough! Models and Evaluation Towards User-Centered Explainable Question Answering

1 code implementation EMNLP 2020 Hendrik Schuff, Heike Adel, Ngoc Thang Vu

The user study shows that our models increase the ability of the users to judge the correctness of the system and that scores like F1 are not enough to estimate the usefulness of a model in a practical setting with human users.

Model Selection Question Answering

Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning

no code implementations 3 Jul 2020 Pavel Denisov, Ngoc Thang Vu

Spoken language understanding is typically based on pipeline architectures including speech recognition and natural language understanding steps.

Natural Language Understanding Speech Recognition +1

Ensemble Self-Training for Low-Resource Languages: Grapheme-to-Phoneme Conversion and Morphological Inflection

no code implementations WS 2020 Xiang Yu, Ngoc Thang Vu, Jonas Kuhn

We present an iterative data augmentation framework, which trains and searches for an optimal ensemble and simultaneously annotates new training data in a self-training style.

Data Augmentation Morphological Inflection

ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents

1 code implementation ACL 2020 Chia-Yu Li, Daniel Ortega, Dirk Väth, Florian Lux, Lindsey Vanderlyn, Maximilian Schmidt, Michael Neumann, Moritz Völkel, Pavel Denisov, Sabrina Jenne, Zorica Kacarevic, Ngoc Thang Vu

We present ADVISER - an open-source, multi-domain dialog system toolkit that enables the development of multi-modal (incorporating speech, text and vision), socially-engaged (e.g., emotion recognition, engagement level prediction and backchanneling) conversational agents.

Emotion Recognition Platform

ArzEn: A Speech Corpus for Code-switched Egyptian Arabic-English

no code implementations LREC 2020 Injy Hamed, Ngoc Thang Vu, Slim Abdennadher

In this paper, we first discuss the CS phenomenon in Egypt and the factors that gave rise to the current language.

Speech Recognition

Head-First Linearization with Tree-Structured Representation

no code implementations WS 2019 Xiang Yu, Agnieszka Falenska, Ngoc Thang Vu, Jonas Kuhn

We present a dependency tree linearization model with two novel components: (1) a tree-structured encoder based on bidirectional Tree-LSTM that propagates information first bottom-up then top-down, which allows each token to access information from the entire tree; and (2) a linguistically motivated head-first decoder that emphasizes the central role of the head and linearizes the subtree by incrementally attaching the dependents on both sides of the head.

To Combine or Not To Combine? A Rainbow Deep Reinforcement Learning Agent for Dialog Policies

no code implementations WS 2019 Dirk Väth, Ngoc Thang Vu

In this paper, we explore state-of-the-art deep reinforcement learning methods for dialog policy training such as prioritized experience replay, double deep Q-Networks, dueling network architectures and distributional learning.

IMS-Speech: A Speech to Text Tool

no code implementations 13 Aug 2019 Pavel Denisov, Ngoc Thang Vu

We present IMS-Speech, a web-based tool for German and English speech transcription that aims to facilitate research in various disciplines which require access to lexical information in spoken language materials.

Speech Recognition

End-to-End Multi-Speaker Speech Recognition using Speaker Embeddings and Transfer Learning

no code implementations 13 Aug 2019 Pavel Denisov, Ngoc Thang Vu

This paper presents our latest investigation on end-to-end automatic speech recognition (ASR) for overlapped speech.

Speech Recognition Transfer Learning

Learning the Dyck Language with Attention-based Seq2Seq Models

no code implementations WS 2019 Xiang Yu, Ngoc Thang Vu, Jonas Kuhn

The generalized Dyck language has been used to analyze the ability of Recurrent Neural Networks (RNNs) to learn context-free grammars (CFGs).

Approximate Dynamic Oracle for Dependency Parsing with Reinforcement Learning

no code implementations WS 2018 Xiang Yu, Ngoc Thang Vu, Jonas Kuhn

We present a general approach with reinforcement learning (RL) to approximate dynamic oracles for transition systems where exact dynamic oracles are difficult to derive.

Dependency Parsing Imitation Learning +2

Sequence-to-Sequence Models for Data-to-Text Natural Language Generation: Word- vs. Character-based Processing and Output Diversity

no code implementations WS 2018 Glorianna Jagfeld, Sabrina Jenne, Ngoc Thang Vu

We present a comparison of word-based and character-based sequence-to-sequence models for data-to-text natural language generation, which generate natural language descriptions for structured inputs.

Text Generation

Comparing Attention-based Convolutional and Recurrent Neural Networks: Success and Limitations in Machine Reading Comprehension

1 code implementation CONLL 2018 Matthias Blohm, Glorianna Jagfeld, Ekta Sood, Xiang Yu, Ngoc Thang Vu

We propose a machine reading comprehension model based on the compare-aggregate framework with two-staged attention that achieves state-of-the-art results on the MovieQA question answering dataset.

Machine Reading Comprehension Question Answering

Densely Connected Convolutional Networks for Speech Recognition

no code implementations 10 Aug 2018 Chia Yu Li, Ngoc Thang Vu

This paper presents our latest investigation on Densely Connected Convolutional Networks (DenseNets) for acoustic modelling (AM) in automatic speech recognition.

Acoustic Modelling Speech Recognition

Unsupervised Domain Adaptation by Adversarial Learning for Robust Speech Recognition

no code implementations 30 Jul 2018 Pavel Denisov, Ngoc Thang Vu, Marc Ferras Font

In this paper, we investigate the use of adversarial learning for unsupervised adaptation to unseen recording conditions, more specifically, single microphone far-field speech.

Robust Speech Recognition Unsupervised Domain Adaptation

Low-Resource Text Classification using Domain-Adversarial Learning

no code implementations 13 Jul 2018 Daniel Grießhaber, Ngoc Thang Vu, Johannes Maucher

Deep learning techniques have recently been shown to be successful in many natural language processing tasks, forming state-of-the-art systems.

General Classification Text Classification

Effects of Word Embeddings on Neural Network-based Pitch Accent Detection

no code implementations 14 May 2018 Sabrina Stehwien, Ngoc Thang Vu, Antje Schweitzer

Pitch accent detection often makes use of both acoustic and lexical features based on the fact that pitch accents tend to correlate with certain words.

Word Embeddings

Investigations on End-to-End Audiovisual Fusion

no code implementations 30 Apr 2018 Michael Wand, Ngoc Thang Vu, Juergen Schmidhuber

Audiovisual speech recognition (AVSR) is a method to alleviate the adverse effect of noise in the acoustic signal.

Speech Recognition

Introducing two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness

no code implementations NAACL 2018 Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity.

Semantic Similarity Semantic Textual Similarity +1

Cross-lingual and Multilingual Speech Emotion Recognition on English and French

no code implementations 1 Mar 2018 Michael Neumann, Ngoc Thang Vu

Research on multilingual speech emotion recognition faces the problem that most available speech corpora differ from each other in important ways, such as annotation methods or interaction scenarios.

Speech Emotion Recognition

Syntactic and Semantic Features For Code-Switching Factored Language Models

no code implementations 4 Oct 2017 Heike Adel, Ngoc Thang Vu, Katrin Kirchhoff, Dominic Telaar, Tanja Schultz

The experimental results reveal that Brown word clusters, part-of-speech tags and open-class words are the most effective at reducing the perplexity of factored language models on the Mandarin-English Code-Switching corpus SEAME.

Language Modelling Speech Recognition +1

Improving coreference resolution with automatically predicted prosodic information

no code implementations WS 2017 Ina Rösiger, Sabrina Stehwien, Arndt Riester, Ngoc Thang Vu

Adding manually annotated prosodic information, specifically pitch accents and phrasing, to the typical text-based feature set for coreference resolution has previously been shown to have a positive effect on German data.

Coreference Resolution

Encoding Word Confusion Networks with Recurrent Neural Networks for Dialog State Tracking

no code implementations WS 2017 Glorianna Jagfeld, Ngoc Thang Vu

This paper presents our novel method to encode word confusion networks, which can represent a rich hypothesis space of automatic speech recognition systems, via recurrent neural networks.

Speech Recognition

A General-Purpose Tagger with Convolutional Neural Networks

1 code implementation WS 2017 Xiang Yu, Agnieszka Faleńska, Ngoc Thang Vu

We present a general-purpose tagger based on convolutional neural networks (CNN), used for both composing word vectors and encoding context information.

Morphological Tagging Part-Of-Speech Tagging

Prosodic Event Recognition using Convolutional Neural Networks with Context Information

no code implementations 2 Jun 2017 Sabrina Stehwien, Ngoc Thang Vu

This paper demonstrates the potential of convolutional neural networks (CNN) for detecting and classifying prosodic events on words, specifically pitch accents and phrase boundary tones, from frame-based acoustic features.

Character Composition Model with Convolutional Neural Networks for Dependency Parsing on Morphologically Rich Languages

1 code implementation ACL 2017 Xiang Yu, Ngoc Thang Vu

We present a transition-based dependency parser that uses a convolutional neural network to compose word representations from characters.

Dependency Parsing Word Embeddings

Challenges of Computational Processing of Code-Switching

no code implementations WS 2016 Özlem Çetinoğlu, Sarah Schulz, Ngoc Thang Vu

This paper addresses challenges of Natural Language Processing (NLP) on non-canonical multilingual data in which two or more languages are mixed.

Dependency Parsing Language Identification +4

Sequential Convolutional Neural Networks for Slot Filling in Spoken Language Understanding

no code implementations 24 Jun 2016 Ngoc Thang Vu

We investigate the usage of convolutional neural networks (CNNs) for the slot filling task in spoken language understanding.

General Classification Slot Filling +1

Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction

no code implementations ACL 2016 Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

We propose a novel vector representation that integrates lexical contrast into distributional vectors and strengthens the most salient features for determining degrees of word similarity.

Word Embeddings Word Similarity

Combining Recurrent and Convolutional Neural Networks for Relation Classification

no code implementations NAACL 2016 Ngoc Thang Vu, Heike Adel, Pankaj Gupta, Hinrich Schütze

This paper investigates two different neural architectures for the task of relation classification: convolutional neural networks and recurrent neural networks.

General Classification Relation Classification
