Search Results for author: Thien Huu Nguyen

Found 96 papers, 21 papers with code

Learning Cross-lingual Representations for Event Coreference Resolution with Multi-view Alignment and Optimal Transport

no code implementations EMNLP (MRL) 2021 Duy Phung, Hieu Minh Tran, Minh Van Nguyen, Thien Huu Nguyen

We study a new problem of cross-lingual transfer learning for event coreference resolution (ECR) where models trained on data from a source language are adapted for evaluations in different target languages.

coreference-resolution Cross-Lingual Transfer +4

Event Detection: Gate Diversity and Syntactic Importance Scores for Graph Convolution Neural Networks

no code implementations EMNLP 2020 Viet Dac Lai, Tuan Ngo Nguyen, Thien Huu Nguyen

Recent studies on event detection (ED) have shown that the syntactic dependency graph can be employed in graph convolution neural networks (GCN) to achieve state-of-the-art performance.

Diversity Event Detection

Improving Cross-Lingual Transfer for Event Argument Extraction with Language-Universal Sentence Structures

no code implementations EACL (WANLP) 2021 Minh Van Nguyen, Thien Huu Nguyen

Previous work on CEAE has shown the cross-lingual benefits of universal dependency trees in capturing shared syntactic structures of sentences across languages.

Cross-Lingual Transfer Event Argument Extraction +4

Parameter-Efficient Domain Knowledge Integration from Multiple Sources for Biomedical Pre-trained Language Models

no code implementations Findings (EMNLP) 2021 Qiuhao Lu, Dejing Dou, Thien Huu Nguyen

These knowledge adapters are pre-trained for individual domain knowledge sources and integrated via an attention-based knowledge controller to enrich PLMs.

Self-Supervised Learning

Learning Prototype Representations Across Few-Shot Tasks for Event Detection

1 code implementation EMNLP 2021 Viet Lai, Franck Dernoncourt, Thien Huu Nguyen

We address the sampling bias and outlier issues in few-shot learning for event detection, a subtask of information extraction.

Event Detection Few-Shot Learning

Fine-grained Temporal Relation Extraction with Ordered-Neuron LSTM and Graph Convolutional Networks

no code implementations WNUT (ACL) 2021 Minh Tran Phu, Minh Van Nguyen, Thien Huu Nguyen

In this work, we propose to fill this gap by introducing novel methods to integrate the syntactic structures into the deep learning models for FineTempRel.

Deep Learning Relation +3

Modeling Document-Level Context for Event Detection via Important Context Selection

no code implementations EMNLP 2021 Amir Pouran Ben Veyseh, Minh Van Nguyen, Nghia Ngo Trung, Bonan Min, Thien Huu Nguyen

To address this issue, we propose a novel method to model document-level context for ED that dynamically selects relevant sentences in the document for the event prediction of the target sentence.

Event Detection Representation Learning +2

GloCOM: A Short Text Neural Topic Model via Global Clustering Context

no code implementations 30 Nov 2024 Quang Duc Nguyen, Tung Nguyen, Duc Anh Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen

Uncovering hidden topics from short texts is challenging for traditional and neural models due to data sparsity, which limits word co-occurrence patterns, and label sparsity, stemming from incomplete reconstruction targets.

Clustering Topic Models

Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering

no code implementations 14 Nov 2024 Nghia Trung Ngo, Chien Van Nguyen, Franck Dernoncourt, Thien Huu Nguyen

Retrieval-augmented generation (RAG) has emerged as a promising approach to enhance the performance of large language models (LLMs) in knowledge-intensive tasks such as those from the medical domain.

Misinformation Question Answering +2

Zero-shot Cross-lingual Transfer Learning with Multiple Source and Target Languages for Information Extraction: Language Selection and Adversarial Training

no code implementations 13 Nov 2024 Nghia Trung Ngo, Thien Huu Nguyen

The majority of previous research addressing multi-lingual IE is limited to the zero-shot cross-lingual single-transfer (one-to-one) setting, with high-resource languages predominantly as source training data.

Transfer Learning Zero-Shot Cross-Lingual Transfer

Lifelong Event Detection via Optimal Transport

no code implementations 11 Oct 2024 Viet Dao, Van-Cuong Pham, Quyen Tran, Thanh-Thien Le, Linh Ngo Van, Thien Huu Nguyen

Continual Event Detection (CED) poses a formidable challenge due to the catastrophic forgetting phenomenon, where learning new tasks (with newly arriving event types) hampers performance on previous ones.

Event Detection Language Modelling

Preserving Generalization of Language models in Few-shot Continual Relation Extraction

1 code implementation 1 Oct 2024 Quyen Tran, Nguyen Xuan Thanh, Nguyen Hoang Anh, Nam Le Hai, Trung Le, Linh Van Ngo, Thien Huu Nguyen

Few-shot Continual Relations Extraction (FCRE) is an emerging and dynamic area of study where models can sequentially integrate knowledge from new relations with limited labeled data while circumventing catastrophic forgetting and preserving prior knowledge from pre-trained backbones.

Continual Relation Extraction Language Modelling +1

NeuroMax: Enhancing Neural Topic Modeling via Maximizing Mutual Information and Group Topic Regularization

no code implementations 29 Sep 2024 Duy-Tung Pham, Thien Trang Nguyen Vu, Tung Nguyen, Linh Ngo Van, Duc Anh Nguyen, Thien Huu Nguyen

Recent advances in neural topic models have concentrated on two primary directions: the integration of the inference network (encoder) with a pre-trained language model (PLM) and the modeling of the relationship between words and topics in the generative model (decoder).

Decoder Language Modelling +1

Householder Pseudo-Rotation: A Novel Approach to Activation Editing in LLMs with Direction-Magnitude Perspective

1 code implementation 16 Sep 2024 Van-Cuong Pham, Thien Huu Nguyen

Activation Editing, which involves directly editing the internal representations of large language models (LLMs) to alter their behaviors and achieve desired properties, has emerged as a promising area of research.

ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning

1 code implementation 6 Aug 2024 Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Thien Huu Nguyen

This is due to their causal attention mechanism and the misalignment between their pre-training objectives and the text ranking tasks.

Language Modelling Large Language Model +1

Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models

1 code implementation 16 Jul 2024 Minh Nguyen, Franck Dernoncourt, Seunghyun Yoon, Hanieh Deilamsalehy, Hao Tan, Ryan Rossi, Quan Hung Tran, Trung Bui, Thien Huu Nguyen

We introduce an approach to identifying speaker names in dialogue transcripts, a crucial task for enhancing content accessibility and searchability in digital media archives.

Attribute Speaker Identification +2

LongLaMP: A Benchmark for Personalized Long-form Text Generation

no code implementations 27 Jun 2024 Ishita Kumar, Snigdha Viswanathan, Sushrita Yerra, Alireza Salemi, Ryan A. Rossi, Franck Dernoncourt, Hanieh Deilamsalehy, Xiang Chen, Ruiyi Zhang, Shubham Agarwal, Nedim Lipka, Chien Van Nguyen, Thien Huu Nguyen, Hamed Zamani

In this work, we demonstrate the importance of user-specific personalization for long-text generation tasks and develop the Long-text Language Model Personalization (LongLaMP) Benchmark.

Language Modelling Text Generation

ToVo: Toxicity Taxonomy via Voting

no code implementations 21 Jun 2024 Tinh Son Luong, Thanh-Thien Le, Thang Viet Doan, Linh Ngo Van, Thien Huu Nguyen, Diep Thi-Ngoc Nguyen

To address these issues, we propose a dataset creation mechanism that integrates voting and chain-of-thought processes, producing a high-quality open-source dataset for toxic content detection.

Realistic Evaluation of Toxicity in Large Language Models

no code implementations 17 May 2024 Tinh Son Luong, Thanh-Thien Le, Linh Ngo Van, Thien Huu Nguyen

Large language models (LLMs) have become integral to our professional workflows and daily lives.

Prompt Engineering

BKEE: Pioneering Event Extraction in the Vietnamese Language

1 code implementation LREC | COLING 2024 Thi-Nhung Nguyen, Bang Tien Tran, Trong-Nghia Luu, Thien Huu Nguyen, Kiem-Hieu Nguyen

Event Extraction (EE) is a fundamental task in information extraction, aimed at identifying events and their associated arguments within textual data.

Event Detection Event Extraction +1

Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback

2 code implementations 29 Jul 2023 Viet Dac Lai, Chien Van Nguyen, Nghia Trung Ngo, Thuat Nguyen, Franck Dernoncourt, Ryan A. Rossi, Thien Huu Nguyen

Okapi introduces instruction and response-ranked data in 26 diverse languages to facilitate the experiments and development of future multilingual LLM research.

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning

no code implementations 24 Jul 2023 Viet Dac Lai, Abel Salinas, Hao Tan, Trung Bui, Quan Tran, Seunghyun Yoon, Hanieh Deilamsalehy, Franck Dernoncourt, Thien Huu Nguyen

Punctuation restoration is an important task in automatic speech recognition (ASR) which aims to restore the syntactic structure of generated ASR texts to improve readability.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Question-Context Alignment and Answer-Context Dependencies for Effective Answer Sentence Selection

no code implementations 3 Jun 2023 Minh Van Nguyen, Kishan Kc, Toan Nguyen, Thien Huu Nguyen, Ankit Chadha, Thuy Vu

In this paper, we propose to improve the candidate scoring by explicitly incorporating the dependencies between question-context and answer-context into the final representation of a candidate.

Open-Domain Question Answering Sentence

ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning

no code implementations 12 Apr 2023 Viet Dac Lai, Nghia Trung Ngo, Amir Pouran Ben Veyseh, Hieu Man, Franck Dernoncourt, Trung Bui, Thien Huu Nguyen

The answer to this question requires a thorough evaluation of ChatGPT over multiple tasks with diverse languages and large datasets (i.e., beyond reported anecdotes), which is still missing or limited in current research.

Multilingual NLP Text Generation +1

Textual Data Augmentation for Patient Outcomes Prediction

no code implementations 13 Nov 2022 Qiuhao Lu, Dejing Dou, Thien Huu Nguyen

Deep learning models have demonstrated superior performance in various healthcare applications.

Data Augmentation Language Modelling

MEE: A Novel Multilingual Event Extraction Dataset

no code implementations 11 Nov 2022 Amir Pouran Ben Veyseh, Javid Ebrahimi, Franck Dernoncourt, Thien Huu Nguyen

Event Extraction (EE) is one of the fundamental tasks in Information Extraction (IE) that aims to recognize event mentions and their arguments (i.e., participants) from text.

Event Extraction

Tutorial Recommendation for Livestream Videos using Discourse-Level Consistency and Ontology-Based Filtering

no code implementations 11 Sep 2022 Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen

In order to alleviate this issue, one solution is to link the streaming videos with the relevant tutorial available for the tools used in the streaming video.

Symlink: A New Dataset for Scientific Symbol-Description Linking

no code implementations 26 Apr 2022 Viet Dac Lai, Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen

Mathematical symbols and descriptions appear in various forms across document section boundaries without explicit markup.

SemEval 2022 Task 12: Symlink - Linking Mathematical Symbols to their Descriptions

no code implementations 19 Feb 2022 Viet Dac Lai, Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen

Given the increasing number of livestreaming videos, automatic speech recognition and post-processing for livestreaming video transcripts are crucial for efficient data management as well as knowledge mining.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

MACRONYM: A Large-Scale Dataset for Multilingual and Multi-Domain Acronym Extraction

no code implementations COLING 2022 Amir Pouran Ben Veyseh, Nicole Meister, Seunghyun Yoon, Rajiv Jain, Franck Dernoncourt, Thien Huu Nguyen

Acronym extraction is the task of identifying acronyms and their expanded forms in text, which is necessary for various NLP applications.

FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction

1 code implementation NAACL (ACL) 2022 Minh Van Nguyen, Nghia Trung Ngo, Bonan Min, Thien Huu Nguyen

FAMIE is designed to address a fundamental problem in existing AL frameworks where annotators need to wait for a long time between annotation batches due to the time-consuming nature of model training and data selection at each AL iteration.

Active Learning Knowledge Distillation

DPR at SemEval-2021 Task 8: Dynamic Path Reasoning for Measurement Relation Extraction

no code implementations SEMEVAL 2021 Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen

To this end, in this paper, we propose a novel model for the task of measurement relation extraction (MRE) whose goal is to recognize the relation between measured entities, quantities, and conditions mentioned in a document.

Relation Relation Extraction +1

Unleash GPT-2 Power for Event Detection

no code implementations ACL 2021 Amir Pouran Ben Veyseh, Viet Lai, Franck Dernoncourt, Thien Huu Nguyen

To prevent the noise inevitable in automatically generated data from hampering the training process, we propose to exploit a teacher-student architecture in which the teacher is supposed to learn anchor knowledge from the original data.

Event Detection Language Modelling

Exploiting Document Structures and Cluster Consistencies for Event Coreference Resolution

no code implementations ACL 2021 Hieu Minh Tran, Duy Phung, Thien Huu Nguyen

In addition, consistency constraints between golden and predicted clusters of event mentions have not been considered to improve representation learning in prior deep learning models for ECR.

coreference-resolution Deep Learning +2

Graph Convolutional Networks for Event Causality Identification with Rich Document-level Structures

no code implementations NAACL 2021 Minh Tran Phu, Thien Huu Nguyen

Although deep learning models have recently shown state-of-the-art performance for ECI, they are limited to the intra-sentence setting where event mention pairs are presented in the same sentences.

Event Causality Identification Sentence

Fine-Grained Event Trigger Detection

no code implementations EACL 2021 Duong Le, Thien Huu Nguyen

Most of the previous work on Event Detection (ED) has only considered the datasets with a small number of event types (i.e., up to 38 types).

Event Detection Word Sense Disambiguation

Cross-Task Instance Representation Interactions and Label Dependencies for Joint Information Extraction with Graph Convolutional Networks

no code implementations NAACL 2021 Minh Van Nguyen, Viet Dac Lai, Thien Huu Nguyen

Existing works on information extraction (IE) have mainly solved the four main tasks separately (entity mention recognition, relation extraction, event trigger detection, and argument extraction), thus failing to benefit from inter-dependencies between tasks.

Relation Extraction Representation Learning +1

MadDog: A Web-based System for Acronym Identification and Disambiguation

1 code implementation EACL 2021 Amir Pouran Ben Veyseh, Franck Dernoncourt, Walter Chang, Thien Huu Nguyen

However, none of the existing works provide a unified solution that can process acronyms in various domains and is publicly available.

Acronym Identification and Disambiguation Shared Tasks for Scientific Document Understanding

no code implementations 22 Dec 2020 Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen, Walter Chang, Leo Anthony Celi

To push forward research in this direction, we have organized two shared tasks for acronym identification and acronym disambiguation in scientific documents, named AI@SDU and AD@SDU, respectively.

document understanding

What Does This Acronym Mean? Introducing a New Dataset for Acronym Identification and Disambiguation

2 code implementations COLING 2020 Amir Pouran Ben Veyseh, Franck Dernoncourt, Quan Hung Tran, Thien Huu Nguyen

The proposed model outperforms the state-of-the-art models on the new AD dataset, providing a strong baseline for future research on this dataset.

Sentence

Event Detection: Gate Diversity and Syntactic Importance Scores for Graph Convolution Neural Networks

no code implementations 27 Oct 2020 Viet Dac Lai, Tuan Ngo Nguyen, Thien Huu Nguyen

Recent studies on event detection (ED) have shown that the syntactic dependency graph can be employed in graph convolution neural networks (GCN) to achieve state-of-the-art performance.

Diversity Event Detection

Introducing Syntactic Structures into Target Opinion Word Extraction with Deep Learning

no code implementations EMNLP 2020 Amir Pouran Ben Veyseh, Nasim Nouri, Franck Dernoncourt, Dejing Dou, Thien Huu Nguyen

In this work, we propose to incorporate the syntactic structures of the sentences into the deep learning models for TOWE, leveraging the syntax-based opinion possibility scores and the syntactic connections between the words.

Aspect-Based Sentiment Analysis Aspect-oriented Opinion Extraction +2

Exploiting the Syntax-Model Consistency for Neural Relation Extraction

no code implementations ACL 2020 Amir Pouran Ben Veyseh, Franck Dernoncourt, Dejing Dou, Thien Huu Nguyen

In order to overcome these issues, we propose a novel deep learning model for RE that uses the dependency trees to extract the syntax-based importance scores for the words, serving as a tree representation to introduce syntactic information into the models with greater generalization.

Multi-Task Learning Relation +1

Extensively Matching for Few-shot Learning Event Detection

1 code implementation WS 2020 Viet Dac Lai, Franck Dernoncourt, Thien Huu Nguyen

In this work, we formulate event detection as a few-shot learning problem to enable extending event detection to new event types.

Event Detection Few-Shot Learning

Exploiting the Matching Information in the Support Set for Few Shot Event Classification

no code implementations 13 Feb 2020 Viet Dac Lai, Franck Dernoncourt, Thien Huu Nguyen

The existing event classification (EC) work primarily focuses on the traditional supervised learning setting in which models are unable to extract event mentions of new/unseen event types.

Classification Few-Shot Learning +2

Improving Slot Filling by Utilizing Contextual Information

no code implementations WS 2020 Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen

To address this issue, in this paper, we propose a novel method to incorporate the contextual information in two different levels, i.e., representation level and task-specific (i.e., label) level.

Intent Detection slot-filling +2

A Joint Model for Definition Extraction with Syntactic Connection and Semantic Consistency

1 code implementation 5 Nov 2019 Amir Pouran Ben Veyseh, Franck Dernoncourt, Dejing Dou, Thien Huu Nguyen

In this work, we propose a novel model for DE that simultaneously performs the two tasks in a single framework to benefit from their inter-dependencies.

Definition Extraction Multi-Task Learning +2

Extending Event Detection to New Types with Learning from Keywords

no code implementations WS 2019 Viet Dac Lai, Thien Huu Nguyen

We introduce a novel feature-based attention mechanism for convolutional neural networks for event detection in the new formulation.

Event Detection Sentence

Language-independent Cross-lingual Contextual Representations

no code implementations 25 Sep 2019 Xiao Zhang, Song Wang, Dejing Dou, Xien Liu, Thien Huu Nguyen, Ji Wu

Contextual representation models like BERT have achieved state-of-the-art performance on a diverse range of NLP tasks.

Transfer Learning Zero-Shot Cross-Lingual Transfer

Improving Cross-Domain Performance for Relation Extraction via Dependency Prediction and Information Flow Control

no code implementations 7 Jul 2019 Amir Pouran Ben Veyseh, Thien Huu Nguyen, Dejing Dou

The current deep learning models for relation extraction have mainly exploited this dependency information by guiding their computation along the structures of the dependency trees.

Domain Generalization Relation +1

Graph based Neural Networks for Event Factuality Prediction using Syntactic and Semantic Structures

1 code implementation ACL 2019 Amir Pouran Ben Veyseh, Thien Huu Nguyen, Dejing Dou

In this work, we introduce a novel graph-based neural network for EFP that can integrate the semantic and syntactic information more effectively.

Sentence

Employing the Correspondence of Relations and Connectives to Identify Implicit Discourse Relations via Label Embeddings

no code implementations ACL 2019 Linh The Nguyen, Linh Van Ngo, Khoat Than, Thien Huu Nguyen

It has been shown that implicit connectives can be exploited to improve the performance of the models for implicit discourse relation recognition (IDRR).

Multi-Task Learning

One for All: Neural Joint Modeling of Entities and Events

no code implementations 1 Dec 2018 Trung Minh Nguyen, Thien Huu Nguyen

The previous work for event extraction has mainly focused on the predictions for event triggers and argument roles, treating entity mentions as being provided by human annotators.

Event Extraction

Systematic Generalization: What Is Required and Can It Be Learned?

2 code implementations ICLR 2019 Dzmitry Bahdanau, Shikhar Murty, Michael Noukhovitch, Thien Huu Nguyen, Harm de Vries, Aaron Courville

Numerous models for grounded language understanding have been recently proposed, including (i) generic models that can be easily adapted to any given task and (ii) intuitively appealing modular models that require background knowledge to be instantiated.

Systematic Generalization Visual Question Answering (VQA)

BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning

6 code implementations ICLR 2019 Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Salem Lahlou, Lucas Willems, Chitwan Saharia, Thien Huu Nguyen, Yoshua Bengio

Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts.

Grounded language learning

Similar but not the Same: Word Sense Disambiguation Improves Event Detection via Neural Representation Matching

no code implementations EMNLP 2018 Weiyi Lu, Thien Huu Nguyen

Event detection (ED) and word sense disambiguation (WSD) are two similar tasks in that they both involve identifying the classes (i.e., event types or word senses) of some word in a given sentence.

Event Detection Sentence +1

Who is Killed by Police: Introducing Supervised Attention for Hierarchical LSTMs

no code implementations COLING 2018 Minh Nguyen, Thien Huu Nguyen

The early work in this field (Keith et al., 2017) proposed a distant supervision framework based on Expectation Maximization (EM) to deal with the multiple appearances of the names in documents.

Toward Mention Detection Robustness with Recurrent Neural Networks

no code implementations 24 Feb 2016 Thien Huu Nguyen, Avirup Sil, Georgiana Dinu, Radu Florian

One of the key challenges in natural language processing (NLP) is to yield good performance across application domains and languages.

named-entity-recognition Named Entity Recognition +2

Combining Neural Networks and Log-linear Models to Improve Relation Extraction

no code implementations 18 Nov 2015 Thien Huu Nguyen, Ralph Grishman

The last decade has witnessed the success of the traditional feature-based method on exploiting the discrete structures such as words or lexical patterns to extract relations from text.

 Ranked #1 on Relation Extraction on ACE 2005 (Cross Sentence metric)

Relation Relation Extraction +1
