Search Results for author: Ngan Luu-Thuy Nguyen

Found 51 papers, 16 papers with code

ViNLI: A Vietnamese Corpus for Studies on Open-Domain Natural Language Inference

no code implementations • COLING 2022 • Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In this paper, we introduce ViNLI (Vietnamese Natural Language Inference), an open-domain and high-quality corpus for evaluating Vietnamese NLI models, which is created and evaluated with a strict process of quality control.

Natural Language Inference Sentence

ViTextVQA: A Large-Scale Visual Question Answering Dataset for Evaluating Vietnamese Text Comprehension in Images

1 code implementation • 16 Apr 2024 • Quan Van Nguyen, Dan Quang Tran, Huy Quang Pham, Thang Kien-Bao Nguyen, Nghia Hieu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Visual Question Answering (VQA) is a complicated task that requires the capability of simultaneously processing natural language and images.

Multimodal Deep Learning Optical Character Recognition (OCR) +5

VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding

no code implementations • 23 Mar 2024 • Phong Nguyen-Thuan Do, Son Quoc Tran, Phu Gia Hoang, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of tasks.

Natural Language Understanding text-classification +3

VlogQA: Task, Dataset, and Baseline Models for Vietnamese Spoken-Based Machine Reading Comprehension

1 code implementation • 5 Feb 2024 • Thinh Phuoc Ngo, Khoa Tran Anh Dang, Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities associated with using real-world data for machine reading comprehension tasks.

Machine Reading Comprehension

ComOM at VLSP 2023: A Dual-Stage Framework with BERTology and Unified Multi-Task Instruction Tuning Model for Vietnamese Comparative Opinion Mining

no code implementations • 14 Dec 2023 • Dang Van Thin, Duong Ngoc Hao, Ngan Luu-Thuy Nguyen

The ComOM shared task aims to extract comparative opinions from product reviews in Vietnamese language.

Data Augmentation Opinion Mining +1

Abusive Span Detection for Vietnamese Narrative Texts

no code implementations • 13 Dec 2023 • Nhu-Thanh Nguyen, Khoa Thi-Kim Phan, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen

Abuse in its various forms, including physical, psychological, verbal, sexual, financial, and cultural, has a negative impact on mental health.

OpenViVQA: Task, Dataset, and Multimodal Fusion Models for Visual Question Answering in Vietnamese

1 code implementation • 7 May 2023 • Nghia Hieu Nguyen, Duong T. D. Vo, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

The VQA task requires methods that have the ability to fuse the information from questions and images to produce appropriate answers.

Information Retrieval Question Answering +3

A Multiple Choices Reading Comprehension Corpus for Vietnamese Language Education

1 code implementation • 31 Mar 2023 • Son T. Luu, Khoi Trong Hoang, Tuong Quang Pham, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

From the results of the error analysis, we found the challenge of the reading comprehension models is understanding the implicit context in texts and linking them together in order to find the correct answers.

Machine Reading Comprehension Multiple-choice +1

Revealing Weaknesses of Vietnamese Language Models Through Unanswerable Questions in Machine Reading Comprehension

no code implementations • 16 Mar 2023 • Son Quoc Tran, Phong Nguyen-Thuan Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

From the analysis results, we suggest new directions for developing Vietnamese language models.

Machine Reading Comprehension Vietnamese Language Models +1

EVJVQA Challenge: Multilingual Visual Question Answering

no code implementations • 23 Feb 2023 • Ngan Luu-Thuy Nguyen, Nghia Hieu Nguyen, Duong T. D. Vo, Khanh Quoc Tran, Kiet Van Nguyen

Visual Question Answering (VQA) is a challenging task of natural language processing (NLP) and computer vision (CV), attracting significant attention from researchers.

Language Modelling Question Answering +1

ViHOS: Hate Speech Spans Detection for Vietnamese

1 code implementation • 24 Jan 2023 • Phu Gia Hoang, Canh Duc Luu, Khanh Quoc Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

The rise in hateful and offensive language directed at other users is one of the adverse side effects of the increased use of social networking platforms.

Ranked #1 on Sequence-to-sequence Language Modeling on ViHOS

Sequence-to-sequence Language Modeling XLM-R

Is word segmentation necessary for Vietnamese sentiment classification?

no code implementations • 1 Jan 2023 • Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen

To the best of our knowledge, this paper made the first attempt to answer whether word segmentation is necessary for Vietnamese sentiment classification.

Classification Segmentation +2

Leveraging Semantic Representations Combined with Contextual Word Representations for Recognizing Textual Entailment in Vietnamese

no code implementations • 1 Jan 2023 • Quoc-Loc Duong, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen

The experimental results give conclusions about the influence and role of semantic representation on Vietnamese in understanding natural language.

Natural Language Inference Natural Language Understanding +2

Integrating Semantic Information into Sketchy Reading Module of Retro-Reader for Vietnamese Machine Reading Comprehension

no code implementations • 1 Jan 2023 • Hang Thi-Thu Le, Viet-Duc Ho, Duc-Vu Nguyen, Ngan Luu-Thuy Nguyen

The classification of answerability questions is a relatively significant sub-task in machine reading comprehension; however, there haven't been many studies.

Machine Reading Comprehension Vietnamese Machine Reading Comprehension +1

A Comparative Study of Question Answering over Knowledge Bases

1 code implementation • 15 Nov 2022 • Khiem Vinh Tran, Hao Phu Phan, Khang Nguyen Duc Quach, Ngan Luu-Thuy Nguyen, Jun Jo, Thanh Tam Nguyen

In that, we study various question types, properties, languages, and domains to provide insights on where existing systems struggle.

Question Answering

SMTCE: A Social Media Text Classification Evaluation Benchmark and BERTology Models for Vietnamese

no code implementations • 21 Sep 2022 • Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Inspired by the success of the GLUE, we introduce the Social Media Text Classification Evaluation (SMTCE) benchmark, as a collection of datasets and models across a diverse set of SMTC tasks.

text-classification Text Classification +1

SPBERTQA: A Two-Stage Question Answering System Based on Sentence Transformers for Medical Texts

no code implementations • 20 Jun 2022 • Nhung Thi-Hong Nguyen, Phuong Phan-Dieu Ha, Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Question answering (QA) systems have gained explosive attention in recent years.

Question Answering Sentence

XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-based Textual Knowledge Source

no code implementations • 14 Apr 2022 • Kiet Van Nguyen, Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

Question answering (QA) is a natural language understanding task within the fields of information retrieval and information extraction that has attracted much attention from the computational linguistics and artificial intelligence research community in recent years because of the strong development of machine reading comprehension-based models.

Information Retrieval Machine Reading Comprehension +3

VLSP 2021 - ViMRC Challenge: Vietnamese Machine Reading Comprehension

no code implementations • 22 Mar 2022 • Kiet Van Nguyen, Son Quoc Tran, Luan Thanh Nguyen, Tin Van Huynh, Son T. Luu, Ngan Luu-Thuy Nguyen

To address the weakness, we provide the research community with a benchmark dataset named UIT-ViQuAD 2.0 for evaluating the MRC task and question answering systems for the Vietnamese language.

Language Modelling Machine Reading Comprehension +7

Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-stage Span Labeling

no code implementations • PACLIC 2021 • Duc-Vu Nguyen, Linh-Bao Vo, Ngoc-Linh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Previous studies on joint Chinese word segmentation and part-of-speech tagging mainly follow the character-based tagging model focusing on modeling n-gram features.

Chinese Word Segmentation Part-Of-Speech Tagging +2

Span Labeling Approach for Vietnamese and Chinese Word Segmentation

no code implementations • 1 Oct 2021 • Duc-Vu Nguyen, Linh-Bao Vo, Dang Van Thin, Ngan Luu-Thuy Nguyen

In this paper, we propose a span labeling approach to model n-gram information for Vietnamese word segmentation, namely SPAN SEG.

Chinese Word Segmentation Language Modelling +2

Monolingual versus Multilingual BERTology for Vietnamese Extractive Multi-Document Summarization

no code implementations • 31 Aug 2021 • Huy Quoc To, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

Recent research has demonstrated that BERT shows potential in a wide range of natural language processing tasks.

Document Summarization Extractive Text Summarization +1

Sentence Extraction-Based Machine Reading Comprehension for Vietnamese

no code implementations • 19 May 2021 • Phong Nguyen-Thuan Do, Nhat Duy Nguyen, Tin Van Huynh, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

We propose a conversion algorithm to create the dataset for sentence extraction-based machine reading comprehension and three types of approaches for sentence extraction-based machine reading comprehension in Vietnamese.

Machine Reading Comprehension Question Answering +2

Conversational Machine Reading Comprehension for Vietnamese Healthcare Texts

1 code implementation • 4 May 2021 • Son T. Luu, Mao Nguyen Bui, Loi Duc Nguyen, Khiem Vinh Tran, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

To help machines understand conversation texts, we present UIT-ViCoQA, a new corpus for conversational machine reading comprehension in the Vietnamese language.

Chatbot Machine Reading Comprehension +2

Vietnamese Complaint Detection on E-Commerce Websites

no code implementations • 24 Apr 2021 • Nhung Thi-Hong Nguyen, Phuong Phan-Dieu Ha, Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Customer product reviews play a role in improving the quality of products and services for business organizations or their brands.

Complaint Comment Classification Constructive Comment Classification +2

UIT-ISE-NLP at SemEval-2021 Task 5: Toxic Spans Detection with BiLSTM-CRF and ToxicBERT Comment Classification

1 code implementation • SEMEVAL 2021 • Son T. Luu, Ngan Luu-Thuy Nguyen

We present our works on SemEval-2021 Task 5 about Toxic Spans Detection.

Toxic Spans Detection

A Large-scale Dataset for Hate Speech Detection on Vietnamese Social Media Texts

2 code implementations • 22 Mar 2021 • Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

On social media, hate speech has become a critical problem for social network users.

Hate Speech Detection Vietnamese Social Media Text Processing

Constructive and Toxic Speech Detection for Open-domain Social Media Comments in Vietnamese

no code implementations • 18 Mar 2021 • Luan Thanh Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

For these tasks, we propose a system for constructive and toxic speech detection with the state-of-the-art transfer learning model in Vietnamese NLP as PhoBERT.

Constructive Comment Classification General Classification +2

Investigating Monolingual and Multilingual BERTModels for Vietnamese Aspect Category Detection

no code implementations • 17 Mar 2021 • Dang Van Thin, Lac Si Le, Vu Xuan Hoang, Ngan Luu-Thuy Nguyen

In this paper, we investigate the performance of various monolingual pre-trained language models compared with multilingual models on the Vietnamese aspect category detection problem.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Augmenting Part-of-speech Tagging with Syntactic Information for Vietnamese and Chinese

1 code implementation • 24 Feb 2021 • Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In this paper, we implement this idea to improve word segmentation and part of speech tagging the Vietnamese language by employing a simplified constituency parser.

Part-Of-Speech Tagging Segmentation

Gender Prediction Based on Vietnamese Names with Machine Learning Techniques

no code implementations • 21 Oct 2020 • Huy Quoc To, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

We propose a new dataset for gender prediction based on Vietnamese names.

BIG-bench Machine Learning Gender Classification +2

An Empirical Study for Vietnamese Constituency Parsing with Pre-training

no code implementations • 19 Oct 2020 • Tuan-Vi Tran, Xuan-Thien Pham, Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In this work, we use a span-based approach for Vietnamese constituency parsing.

Constituency Parsing Vietnamese Datasets +1

A Vietnamese Dataset for Evaluating Machine Reading Comprehension

no code implementations • 30 Sep 2020 • Kiet Van Nguyen, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

Due to the lack of benchmark datasets for Vietnamese, we present the Vietnamese Question Answering Dataset (UIT-ViQuAD), a new dataset for the low-resource language as Vietnamese to evaluate MRC models.

Ranked #1 on Vietnamese Machine Reading Comprehension on UIT-ViQuAD

Machine Reading Comprehension Question Answering +3

A Simple and Efficient Ensemble Classifier Combining Multiple Neural Network Models on Social Media Datasets in Vietnamese

no code implementations • PACLIC 2020 • Huy Duc Huynh, Hang Thi-Thuy Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

There are various studies in this field in many languages but limited to the Vietnamese language.

text-classification Text Classification

Empirical Study of Text Augmentation on Social Media Text in Vietnamese

1 code implementation • 25 Sep 2020 • Son T. Luu, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Thus, when collecting the data about user comments on the social network, the data is usually skewed about one label, which leads the dataset to become imbalanced and deteriorate the model's ability.

General Classification Hate Speech Detection +5

UIT-HSE at WNUT-2020 Task 2: Exploiting CT-BERT for Identifying COVID-19 Information on the Twitter Social Network

no code implementations • 7 Sep 2020 • Khiem Vinh Tran, Hao Phu Phan, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Recently, COVID-19 has affected a variety of real-life aspects of the world and led to dreadful consequences.

Task 2

An Experimental Study of Deep Neural Network Models for Vietnamese Multiple-Choice Reading Comprehension

no code implementations • 20 Aug 2020 • Son T. Luu, Kiet Van Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

In this paper, we conduct several experiments on neural network-based model to understand the impact of word representation to the Vietnamese multiple-choice machine reading comprehension.

Machine Reading Comprehension Multiple-choice +1

New Vietnamese Corpus for Machine Reading Comprehension of Health News Articles

no code implementations • 19 Jun 2020 • Kiet Van Nguyen, Tin Van Huynh, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

In particular, we develop a process of creating a corpus for the Vietnamese machine reading comprehension.

Machine Reading Comprehension Sentence +2

Vietnamese Word Segmentation with SVM: Ambiguity Reduction and Suffix Capture

1 code implementation • 14 Jun 2020 • Duc-Vu Nguyen, Dang Van Thin, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In this paper, we approach Vietnamese word segmentation as a binary classification by using the Support Vector Machine classifier.

Binary Classification Segmentation +2

UIT-ViIC: A Dataset for the First Evaluation on Vietnamese Image Captioning

3 code implementations • 1 Feb 2020 • Quan Hoang Lam, Quang Duy Le, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

This paper contributes to research on Image Captioning task in terms of extending dataset to a different language - Vietnamese.

Vietnamese Datasets Vietnamese Image Captioning

Comparison Between Traditional Machine Learning Models And Neural Network Models For Vietnamese Hate Speech Detection

1 code implementation • 31 Jan 2020 • Son T. Luu, Hung P. Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Consequently, we compare traditional machine learning and deep learning on a large dataset about the user's comments on social network in Vietnamese and find out what is the advantage and disadvantage of each model by comparing their accuracy on F1-score, then we pick two models in which has highest accuracy in traditional machine learning models and deep neural models respectively.

BIG-bench Machine Learning Hate Speech Detection

Enhancing lexical-based approach with external knowledge for Vietnamese multiple-choice machine reading comprehension

no code implementations • 16 Jan 2020 • Kiet Van Nguyen, Khiem Vinh Tran, Son T. Luu, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

Although Vietnamese is the 17th most popular native-speaker language in the world, there are not many research studies on Vietnamese machine reading comprehension (MRC), the task of understanding a text and answering questions about it.

Machine Reading Comprehension Multiple-choice +3

Job Prediction: From Deep Neural Network Models to Applications

no code implementations • 27 Dec 2019 • Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

In addition, we also proposed a simple and effective ensemble model combining different deep neural network models.

Job classification Job Prediction +1

Emotion Recognition for Vietnamese Social Media Text

no code implementations • 21 Nov 2019 • Vong Anh Ho, Duong Huynh-Cong Nguyen, Danh Hoang Nguyen, Linh Thi-Van Pham, Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In this task, the result is not produced in terms of either polarity: positive or negative or in the form of rating (from 1 to 5) but of a more detailed level of analysis in which the results are depicted in more expressions like sadness, enjoyment, anger, disgust, fear, and surprise.

Emotion Recognition Sentiment Analysis

Error Analysis for Vietnamese Named Entity Recognition on Deep Neural Network Models

no code implementations • 17 Nov 2019 • Binh An Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

In recent years, Vietnamese Named Entity Recognition (NER) systems have had a great breakthrough when using Deep Neural Network methods.

named-entity-recognition Named Entity Recognition +2

Deep Learning versus Traditional Classifiers on Vietnamese Students' Feedback Corpus

no code implementations • 17 Nov 2019 • Phu X. V. Nguyen, Tham T. T. Hong, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Student's feedback is an important source of collecting students' opinions to improve the quality of training activities.

General Classification Sentiment Analysis +2

Hate Speech Detection on Vietnamese Social Media Text using the Bidirectional-LSTM Model

1 code implementation • 9 Nov 2019 • Hang Thi-Thuy Do, Huy Duc Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

In this paper, we describe our system which participates in the shared task of Hate Speech Detection on Social Networks of VLSP 2019 evaluation campaign.

BIG-bench Machine Learning Hate Speech Detection +1