Search Results for author: Dat Quoc Nguyen

Found 57 papers, 38 papers with code

Investigating the Impact of ASR Errors on Spoken Implicit Discourse Relation Recognition

no code implementations TU (COLING) 2022 Linh The Nguyen, Dat Quoc Nguyen

We present an empirical study investigating the influence of automatic speech recognition (ASR) errors on the spoken implicit discourse relation recognition (IDRR) task.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

JPIS: A Joint Model for Profile-based Intent Detection and Slot Filling with Slot-to-Intent Attention

1 code implementation 14 Dec 2023 Thinh Pham, Dat Quoc Nguyen

JPIS incorporates the supporting profile information into its encoder and introduces a slot-to-intent attention mechanism to transfer slot information representations to intent detection.

Intent Detection slot-filling +1

MISCA: A Joint Model for Multiple Intent Detection and Slot Filling with Intent-Slot Co-Attention

1 code implementation 10 Dec 2023 Thinh Pham, Chi Tran, Dat Quoc Nguyen

The research study of detecting multiple intents and filling slots is becoming more popular because of its relevance to complicated real-world situations.

graph construction Intent Detection +2

PhoGPT: Generative Pre-training for Vietnamese

1 code implementation 6 Nov 2023 Dat Quoc Nguyen, Linh The Nguyen, Chi Tran, Dung Ngoc Nguyen, Dinh Phung, Hung Bui

The base model, PhoGPT-4B, with exactly 3.7B parameters, is pre-trained from scratch on a Vietnamese corpus of 102B tokens, with an 8192 context length, employing a vocabulary of 20480 token types.

Instruction Following

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech

2 code implementations 31 May 2023 Linh The Nguyen, Thinh Pham, Dat Quoc Nguyen

We present XPhoneBERT, the first multilingual model pre-trained to learn phoneme representations for the downstream text-to-speech (TTS) task.

From Disfluency Detection to Intent Detection and Slot Filling

1 code implementation 17 Sep 2022 Mai Hoang Dao, Thinh Hung Truong, Dat Quoc Nguyen

We present the first empirical study investigating the influence of disfluency detection on downstream tasks of intent detection and slot filling.

Intent Detection slot-filling +2

A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation

1 code implementation 8 Aug 2022 Linh The Nguyen, Nguyen Luong Tran, Long Doan, Manh Luong, Dat Quoc Nguyen

In this paper, we introduce a high-quality and large-scale benchmark dataset for English-Vietnamese speech translation with 508 audio hours, consisting of 331K triplets of (sentence-lengthed audio, English source transcript sentence, Vietnamese target subtitle sentence).

Sentence Translation

Two-view Graph Neural Networks for Knowledge Graph Completion

1 code implementation 16 Dec 2021 Vinh Tong, Dai Quoc Nguyen, Dinh Phung, Dat Quoc Nguyen

WGE also constructs another single undirected graph from relation-focused constraints, which views entities and relations as nodes.

Knowledge Graph Completion Knowledge Graph Embedding +2

PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation

1 code implementation EMNLP 2021 Long Doan, Linh The Nguyen, Nguyen Luong Tran, Thai Hoang, Dat Quoc Nguyen

We introduce a high-quality and large-scale Vietnamese-English parallel dataset of 3.02M sentence pairs, which is 2.9M pairs larger than the benchmark Vietnamese-English machine translation corpus IWSLT15.

Denoising Machine Translation +2

BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

2 code implementations 20 Sep 2021 Nguyen Luong Tran, Duong Minh Le, Dat Quoc Nguyen

We present BARTpho with two versions, BARTpho-syllable and BARTpho-word, which are the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese.

Abstractive Text Summarization Denoising +1

Node Co-occurrence based Graph Neural Networks for Knowledge Graph Link Prediction

1 code implementation 15 Apr 2021 Dai Quoc Nguyen, Vinh Tong, Dinh Phung, Dat Quoc Nguyen

We introduce a novel embedding model, named NoGE, which aims to integrate co-occurrence among entities and relations into graph neural networks to improve knowledge graph completion (i.e., link prediction).

Knowledge Graph Completion Link Prediction

COVID-19 Named Entity Recognition for Vietnamese

1 code implementation NAACL 2021 Thinh Hung Truong, Mai Hoang Dao, Dat Quoc Nguyen

The current COVID-19 pandemic has led to the creation of many corpora that facilitate NLP research and downstream applications to help fight the pandemic.

named-entity-recognition Named Entity Recognition +4

PhoNLP: A joint multi-task learning model for Vietnamese part-of-speech tagging, named entity recognition and dependency parsing

1 code implementation NAACL 2021 Linh The Nguyen, Dat Quoc Nguyen

We present the first multi-task learning model -- named PhoNLP -- for joint Vietnamese part-of-speech (POS) tagging, named entity recognition (NER) and dependency parsing.

Dependency Parsing Language Modelling +7

A Pilot Study of Text-to-SQL Semantic Parsing for Vietnamese

1 code implementation Findings of the Association for Computational Linguistics 2020 Anh Tuan Nguyen, Mai Hoang Dao, Dat Quoc Nguyen

We compare the two baselines with key configurations and find that: automatic Vietnamese word segmentation improves the parsing results of both baselines; the normalized pointwise mutual information (NPMI) score (Bouma, 2009) is useful for schema linking; latent syntactic features extracted from a neural dependency parser for Vietnamese also improve the results; and the monolingual language model PhoBERT for Vietnamese (Nguyen and Nguyen, 2020) helps produce higher performances than the recent best multilingual language model XLM-R (Conneau et al., 2020).

Semantic Parsing Text-To-SQL +2

A Label Attention Model for ICD Coding from Clinical Text

2 code implementations 13 Jul 2020 Thanh Vu, Dat Quoc Nguyen, Anthony Nguyen

In this paper, we propose a new label attention model for automatic ICD coding, which can handle both the various lengths and the interdependence of the ICD code related text fragments.

Medical Code Prediction

PhoBERT: Pre-trained language models for Vietnamese

1 code implementation Findings of the Association for Computational Linguistics 2020 Dat Quoc Nguyen, Anh Tuan Nguyen

We present PhoBERT with two versions, PhoBERT-base and PhoBERT-large, the first public large-scale monolingual language models pre-trained for Vietnamese.

Dependency Parsing named-entity-recognition +5

A Vietnamese information retrieval system for product-price

no code implementations 26 Nov 2019 Tien-Thanh Vu, Dat Quoc Nguyen

A price information retrieval (IR) system allows users to search and view differences among prices of specific products.

Information Retrieval Retrieval

A Vietnamese Text-Based Conversational Agent

no code implementations 26 Nov 2019 Dai Quoc Nguyen, Dat Quoc Nguyen, Son Bao Pham

This paper introduces a Vietnamese text-based conversational agent architecture for a specific knowledge domain, which is integrated into a question answering system.

Question Answering

A Vietnamese Question Answering System

no code implementations 26 Nov 2019 Dai Quoc Nguyen, Dat Quoc Nguyen, Son Bao Pham

Question answering systems aim to produce exact answers to users' questions instead of a list of related documents as used by current search engines.

Question Answering

Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

1 code implementation WS 2019 Zenan Zhai, Dat Quoc Nguyen, Saber A. Akhondi, Camilo Thorne, Christian Druckenbrodt, Trevor Cohn, Michelle Gregory, Karin Verspoor

In this paper, we explore the NER performance of a BiLSTM-CRF model utilising pre-trained word embeddings, character-level word representations and contextualized ELMo word representations for chemical patents.

named-entity-recognition Named Entity Recognition +2

A neural joint model for Vietnamese word segmentation, POS tagging and dependency parsing

no code implementations ALTA 2019 Dat Quoc Nguyen

We propose the first multi-task learning model for joint Vietnamese word segmentation, part-of-speech (POS) tagging and dependency parsing.

Dependency Parsing Multi-Task Learning +5

End-to-end neural relation extraction using deep biaffine attention

1 code implementation 29 Dec 2018 Dat Quoc Nguyen, Karin Verspoor

We propose a neural network model for joint extraction of named entities and relations between them, without any hand-crafted features.

General Classification Relation +1

Improving Topic Models with Latent Feature Word Representations

no code implementations TACL 2015 Dat Quoc Nguyen, Richard Billingsley, Lan Du, Mark Johnson

Probabilistic topic models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks.

Clustering Document Classification +2

Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

no code implementations WS 2018 Zenan Zhai, Dat Quoc Nguyen, Karin Verspoor

We compare the use of LSTM-based and CNN-based character-level word embeddings in BiLSTM-CRF models to approach chemical and disease named entity recognition (NER) tasks.

named-entity-recognition Named Entity Recognition +2

jLDADMM: A Java package for the LDA and DMM topic models

1 code implementation 11 Aug 2018 Dat Quoc Nguyen

In this technical report, we present jLDADMM---an easy-to-use Java toolkit for conventional topic models.

Clustering Topic Models

From POS tagging to dependency parsing for biomedical event extraction

2 code implementations 11 Aug 2018 Dat Quoc Nguyen, Karin Verspoor

Results: We perform an empirical study comparing state-of-the-art traditional feature-based and neural network-based models for two core natural language processing tasks of part-of-speech (POS) tagging and dependency parsing on two benchmark biomedical corpora, GENIA and CRAFT.

Dependency Parsing Event Extraction +3

From Word Segmentation to POS Tagging for Vietnamese

1 code implementation ALTA 2017 Dat Quoc Nguyen, Thanh Vu, Dai Quoc Nguyen, Mark Dras, Mark Johnson

This paper presents an empirical comparison of two strategies for Vietnamese Part-of-Speech (POS) tagging from unsegmented text: (i) a pipeline strategy where we consider the output of a word segmenter as the input of a POS tagger, and (ii) a joint strategy where we predict a combined segmentation and POS tag for each syllable.

Part-Of-Speech Tagging POS +2

Sequence to Sequence Learning for Event Prediction

1 code implementation IJCNLP 2017 Dai Quoc Nguyen, Dat Quoc Nguyen, Cuong Xuan Chu, Stefan Thater, Manfred Pinkal

This paper presents an approach to the task of predicting an event description from a preceding sentence in a text.

Sentence

A Mixture Model for Learning Multi-Sense Word Embeddings

no code implementations SEMEVAL 2017 Dai Quoc Nguyen, Dat Quoc Nguyen, Ashutosh Modi, Stefan Thater, Manfred Pinkal

Our model generalizes previous work in that it allows inducing different weights for the different senses of a word.

Word Embeddings

A survey of embedding models of entities and relationships for knowledge graph completion

2 code implementations COLING (TextGraphs) 2020 Dat Quoc Nguyen

Knowledge graphs (KGs) of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks.

Knowledge Base Completion Link Prediction

Search Personalization with Embeddings

1 code implementation 12 Dec 2016 Thanh Vu, Dat Quoc Nguyen, Mark Johnson, Dawei Song, Alistair Willis

Recent research has shown that the performance of search personalization depends on the richness of user profiles which normally represent the user's topical interests.

An empirical study for Vietnamese dependency parsing

no code implementations ALTA 2016 Dat Quoc Nguyen, Mark Dras, Mark Johnson

This paper presents an empirical comparison of different dependency parsers for Vietnamese, which has some unusual characteristics such as copula drop and verb serialization.

Dependency Parsing

STransE: a novel embedding model of entities and relationships in knowledge bases

1 code implementation NAACL 2016 Dat Quoc Nguyen, Kairit Sirts, Lizhen Qu, Mark Johnson

Knowledge bases of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks.

Knowledge Base Completion Link Prediction +1

A Robust Transformation-Based Learning Approach Using Ripple Down Rules for Part-of-Speech Tagging

1 code implementation 12 Dec 2014 Dat Quoc Nguyen, Dai Quoc Nguyen, Dang Duc Pham, Son Bao Pham

In this paper, we propose a new approach to construct a system of transformation rules for the Part-of-Speech (POS) tagging task.

Part-Of-Speech Tagging POS +1

Ripple Down Rules for Question Answering

no code implementations 12 Dec 2014 Dat Quoc Nguyen, Dai Quoc Nguyen, Son Bao Pham

Recent years have witnessed a new trend of building ontology-based question answering systems.

Question Answering
