Search Results for author: Ngan Luu-Thuy Nguyen

Found 53 papers, 17 papers with code

ViNLI: A Vietnamese Corpus for Studies on Open-Domain Natural Language Inference

no code implementations COLING 2022 Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In this paper, we introduce ViNLI (Vietnamese Natural Language Inference), an open-domain and high-quality corpus for evaluating Vietnamese NLI models, which is created and evaluated with a strict process of quality control.

Natural Language Inference Sentence +1

ViANLI: Adversarial Natural Language Inference for Vietnamese

no code implementations 25 Jun 2024 Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

The development of Natural Language Inference (NLI) datasets and models has been inspired by innovations in annotation design.

Adversarial Natural Language Inference Natural Language Inference +3

ViOCRVQA: Novel Benchmark Dataset and Vision Reader for Visual Question Answering by Understanding Vietnamese Text in Images

1 code implementation 29 Apr 2024 Huy Quang Pham, Thang Kien-Bao Nguyen, Quan Van Nguyen, Dan Quang Tran, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

To this end, we introduce a novel dataset, ViOCRVQA (Vietnamese Optical Character Recognition - Visual Question Answering dataset), consisting of 28,000+ images and 120,000+ question-answer pairs.

Optical Character Recognition Optical Character Recognition (OCR) +2

VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding

no code implementations 23 Mar 2024 Phong Nguyen-Thuan Do, Son Quoc Tran, Phu Gia Hoang, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of tasks.

Natural Language Understanding text-classification +3

VlogQA: Task, Dataset, and Baseline Models for Vietnamese Spoken-Based Machine Reading Comprehension

1 code implementation 5 Feb 2024 Thinh Phuoc Ngo, Khoa Tran Anh Dang, Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities of using real-world data for MRC.

Machine Reading Comprehension

Abusive Span Detection for Vietnamese Narrative Texts

no code implementations 13 Dec 2023 Nhu-Thanh Nguyen, Khoa Thi-Kim Phan, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen

Abuse in its various forms, including physical, psychological, verbal, sexual, financial, and cultural, has a negative impact on mental health.

A Multiple Choices Reading Comprehension Corpus for Vietnamese Language Education

1 code implementation 31 Mar 2023 Son T. Luu, Khoi Trong Hoang, Tuong Quang Pham, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

From the results of the error analysis, we found that the challenge for the reading comprehension models is understanding the implicit context in texts and linking it together to find the correct answers.

Machine Reading Comprehension Multiple-choice +1

EVJVQA Challenge: Multilingual Visual Question Answering

no code implementations 23 Feb 2023 Ngan Luu-Thuy Nguyen, Nghia Hieu Nguyen, Duong T. D Vo, Khanh Quoc Tran, Kiet Van Nguyen

Visual Question Answering (VQA) is a challenging task of natural language processing (NLP) and computer vision (CV), attracting significant attention from researchers.

Language Modelling Question Answering +2

ViHOS: Hate Speech Spans Detection for Vietnamese

1 code implementation 24 Jan 2023 Phu Gia Hoang, Canh Duc Luu, Khanh Quoc Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

The rise in hateful and offensive language directed at other users is one of the adverse side effects of the increased use of social networking platforms.

Sequence-to-sequence Language Modeling XLM-R

Is word segmentation necessary for Vietnamese sentiment classification?

no code implementations 1 Jan 2023 Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen

To the best of our knowledge, this paper made the first attempt to answer whether word segmentation is necessary for Vietnamese sentiment classification.

Classification Segmentation +2

A Comparative Study of Question Answering over Knowledge Bases

1 code implementation 15 Nov 2022 Khiem Vinh Tran, Hao Phu Phan, Khang Nguyen Duc Quach, Ngan Luu-Thuy Nguyen, Jun Jo, Thanh Tam Nguyen

In that, we study various question types, properties, languages, and domains to provide insights on where existing systems struggle.

Diversity Question Answering

SMTCE: A Social Media Text Classification Evaluation Benchmark and BERTology Models for Vietnamese

no code implementations 21 Sep 2022 Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Inspired by the success of GLUE, we introduce the Social Media Text Classification Evaluation (SMTCE) benchmark, a collection of datasets and models across a diverse set of SMTC tasks.

text-classification Text Classification +1

XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-based Textual Knowledge Source

no code implementations 14 Apr 2022 Kiet Van Nguyen, Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction that has attracted much attention from the computational linguistics and artificial intelligence research community in recent years because of the strong development of machine reading comprehension-based models.

Information Retrieval Machine Reading Comprehension +3

VLSP 2021 - ViMRC Challenge: Vietnamese Machine Reading Comprehension

no code implementations 22 Mar 2022 Kiet Van Nguyen, Son Quoc Tran, Luan Thanh Nguyen, Tin Van Huynh, Son T. Luu, Ngan Luu-Thuy Nguyen

To address the weakness, we provide the research community with a benchmark dataset named UIT-ViQuAD 2.0 for evaluating the MRC task and question answering systems for the Vietnamese language.

Language Modelling Machine Reading Comprehension +7

Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-stage Span Labeling

no code implementations PACLIC 2021 Duc-Vu Nguyen, Linh-Bao Vo, Ngoc-Linh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Previous studies on joint Chinese word segmentation and part-of-speech tagging mainly follow the character-based tagging model focusing on modeling n-gram features.

Chinese Word Segmentation Part-Of-Speech Tagging +2

Span Labeling Approach for Vietnamese and Chinese Word Segmentation

no code implementations 1 Oct 2021 Duc-Vu Nguyen, Linh-Bao Vo, Dang Van Thin, Ngan Luu-Thuy Nguyen

In this paper, we propose a span labeling approach, named SpanSeg, to model n-gram information for Vietnamese word segmentation.

Chinese Word Segmentation Language Modelling +2

Sentence Extraction-Based Machine Reading Comprehension for Vietnamese

no code implementations 19 May 2021 Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

We propose a conversion algorithm to create the dataset for sentence extraction-based machine reading comprehension and three types of approaches for sentence extraction-based machine reading comprehension in Vietnamese.

Machine Reading Comprehension Question Answering +2

Conversational Machine Reading Comprehension for Vietnamese Healthcare Texts

1 code implementation 4 May 2021 Son T. Luu, Mao Nguyen Bui, Loi Duc Nguyen, Khiem Vinh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

To help machines understand conversation texts, we present UIT-ViCoQA, a new corpus for conversational machine reading comprehension in the Vietnamese language.

Chatbot Machine Reading Comprehension +2

Constructive and Toxic Speech Detection for Open-domain Social Media Comments in Vietnamese

no code implementations 18 Mar 2021 Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

For these tasks, we propose a system for constructive and toxic speech detection using PhoBERT, the state-of-the-art transfer learning model in Vietnamese NLP.

Constructive Comment Classification General Classification +2

Investigating Monolingual and Multilingual BERT Models for Vietnamese Aspect Category Detection

no code implementations 17 Mar 2021 Dang Van Thin, Lac Si Le, Vu Xuan Hoang, Ngan Luu-Thuy Nguyen

In this paper, we investigate the performance of various monolingual pre-trained language models compared with multilingual models on the Vietnamese aspect category detection problem.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Augmenting Part-of-speech Tagging with Syntactic Information for Vietnamese and Chinese

1 code implementation 24 Feb 2021 Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In this paper, we implement this idea to improve word segmentation and part-of-speech tagging for the Vietnamese language by employing a simplified constituency parser.

Part-Of-Speech Tagging Segmentation

A Vietnamese Dataset for Evaluating Machine Reading Comprehension

no code implementations 30 Sep 2020 Kiet Van Nguyen, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

Due to the lack of benchmark datasets for Vietnamese, we present the Vietnamese Question Answering Dataset (UIT-ViQuAD), a new dataset for evaluating MRC models on a low-resource language such as Vietnamese.

Machine Reading Comprehension Question Answering +3

Empirical Study of Text Augmentation on Social Media Text in Vietnamese

1 code implementation 25 Sep 2020 Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Thus, when user comments are collected from social networks, the data is usually skewed toward one label, which makes the dataset imbalanced and degrades the model's performance.

General Classification Hate Speech Detection +5

An Experimental Study of Deep Neural Network Models for Vietnamese Multiple-Choice Reading Comprehension

no code implementations 20 Aug 2020 Son T. Luu, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

In this paper, we conduct several experiments on neural network-based models to understand the impact of word representations on Vietnamese multiple-choice machine reading comprehension.

Machine Reading Comprehension Multiple-choice +1

Vietnamese Word Segmentation with SVM: Ambiguity Reduction and Suffix Capture

1 code implementation 14 Jun 2020 Duc-Vu Nguyen, Dang Van Thin, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In this paper, we approach Vietnamese word segmentation as a binary classification problem using a Support Vector Machine classifier.

Binary Classification Segmentation +2

UIT-ViIC: A Dataset for the First Evaluation on Vietnamese Image Captioning

3 code implementations 1 Feb 2020 Quan Hoang Lam, Quang Duy Le, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

This paper contributes to research on the image captioning task by extending the dataset to a different language: Vietnamese.

Vietnamese Datasets Vietnamese Image Captioning

Comparison Between Traditional Machine Learning Models And Neural Network Models For Vietnamese Hate Speech Detection

1 code implementation 31 Jan 2020 Son T. Luu, Hung P. Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Consequently, we compare traditional machine learning and deep learning models on a large dataset of user comments from Vietnamese social networks, identify the advantages and disadvantages of each model by comparing their F1-scores, and then pick the two best-performing models, one from the traditional machine learning models and one from the deep neural models.

BIG-bench Machine Learning Hate Speech Detection

Enhancing lexical-based approach with external knowledge for Vietnamese multiple-choice machine reading comprehension

no code implementations 16 Jan 2020 Kiet Van Nguyen, Khiem Vinh Tran, Son T. Luu, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

Although Vietnamese is the 17th most popular native-speaker language in the world, there are not many research studies on Vietnamese machine reading comprehension (MRC), the task of understanding a text and answering questions about it.

Machine Reading Comprehension Multiple-choice +3

Job Prediction: From Deep Neural Network Models to Applications

no code implementations 27 Dec 2019 Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

In addition, we also proposed a simple and effective ensemble model combining different deep neural network models.

Job classification Job Prediction +1

Emotion Recognition for Vietnamese Social Media Text

no code implementations 21 Nov 2019 Vong Anh Ho, Duong Huynh-Cong Nguyen, Danh Hoang Nguyen, Linh Thi-Van Pham, Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In this task, the result is expressed not as a polarity (positive or negative) or as a rating (from 1 to 5), but at a more detailed level of analysis, using emotions such as sadness, enjoyment, anger, disgust, fear, and surprise.

Emotion Recognition Sentiment Analysis

Error Analysis for Vietnamese Named Entity Recognition on Deep Neural Network Models

no code implementations 17 Nov 2019 Binh An Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In recent years, Vietnamese Named Entity Recognition (NER) systems have had a great breakthrough when using Deep Neural Network methods.

named-entity-recognition Named Entity Recognition +2

Error Analysis for Vietnamese Dependency Parsing

no code implementations 9 Nov 2019 Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Dependency parsing is needed in different applications of natural language processing.

Dependency Parsing

Vietnamese transition-based dependency parsing with supertag features

no code implementations 9 Nov 2019 Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In recent years, dependency parsing has been a fascinating research topic with many applications in natural language processing.

Transition-Based Dependency Parsing

Hate Speech Detection on Vietnamese Social Media Text using the Bidirectional-LSTM Model

1 code implementation 9 Nov 2019 Hang Thi-Thuy Do, Huy Duc Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

In this paper, we describe our system, which participated in the Hate Speech Detection on Social Networks shared task of the VLSP 2019 evaluation campaign.

BIG-bench Machine Learning Hate Speech Detection +1
