Search Results for author: Thai Le

Found 25 papers, 10 papers with code

Machine Learning Based Detection of Clickbait Posts in Social Media

no code implementations • 5 Oct 2017 • Xinyue Cao, Thai Le, Jason Zhang

In this paper, we make use of a dataset from the Clickbait Challenge 2017 (clickbait-challenge.com) comprising over 21,000 headlines/titles, each of which is annotated with at least five crowdsourced judgments of how clickbait it is.

BIG-bench Machine Learning Clickbait Detection

GRACE: Generating Concise and Informative Contrastive Sample to Explain Neural Network Model's Prediction

1 code implementation • 5 Nov 2019 • Thai Le, Suhang Wang, Dongwon Lee

Despite the recent development in the topic of explainable AI/ML for image and text data, the majority of current solutions are not suitable to explain the prediction of neural network models when the datasets are tabular and their features are in high-dimensional vectorized formats.

Philosophy

MALCOM: Generating Malicious Comments to Attack Neural Fake News Detection Models

1 code implementation • 1 Sep 2020 • Thai Le, Suhang Wang, Dongwon Lee

In recent years, the proliferation of so-called "fake news" has caused much disruption in society and weakened the news ecosystem.

Comment Generation Fake News Detection

SHIELD: Defending Textual Neural Networks against Multiple Black-Box Adversarial Attacks with Stochastic Multi-Expert Patcher

1 code implementation • ACL 2022 • Thai Le, Noseong Park, Dongwon Lee

Even though several methods have been proposed to defend textual neural network (NN) models against black-box adversarial attacks, they often defend against only a specific text perturbation strategy and/or require re-training the models from scratch.

Adversarial Robustness

A Sweet Rabbit Hole by DARCY: Using Honeypots to Detect Universal Trigger's Adversarial Attacks

no code implementations • ACL 2021 • Thai Le, Noseong Park, Dongwon Lee

The Universal Trigger (UniTrigger) is a recently proposed, powerful adversarial textual attack method.

Adversarial Attack

Large-Scale Data-Driven Airline Market Influence Maximization

no code implementations • 31 May 2021 • Duanshun Li, Jing Liu, Jinsung Jeon, Seoyoung Hong, Thai Le, Dongwon Lee, Noseong Park

On top of the prediction models, we define a budget-constrained flight frequency optimization problem to maximize the market influence over 2,262 routes.

TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation

3 code implementations • Findings (EMNLP) 2021 • Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee

Recent progress in generative language models has enabled machines to generate astonishingly realistic texts.

Authorship Attribution Fake News Detection +1

Socialbots on Fire: Modeling Adversarial Behaviors of Socialbots via Multi-Agent Hierarchical Reinforcement Learning

no code implementations • 20 Oct 2021 • Thai Le, Long Tran-Thanh, Dongwon Lee

In answer to this question, we demonstrate that adversaries can indeed exploit computational learning mechanisms such as reinforcement learning (RL) to maximize the influence of socialbots while avoiding detection.

Adversarial Attack Hierarchical Reinforcement Learning +2

Do Language Models Plagiarize?

1 code implementation • 15 Mar 2022 • Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee

Our results suggest that (1) three types of plagiarism widely exist in LMs beyond memorization, (2) both size and decoding methods of LMs are strongly associated with the degrees of plagiarism they exhibit, and (3) fine-tuned LMs' plagiarism patterns vary based on their corpus similarity and homogeneity.

Language Modelling Memorization +1

Perturbations in the Wild: Leveraging Human-Written Text Perturbations for Realistic Adversarial Attack and Defense

1 code implementation • Findings (ACL) 2022 • Thai Le, Jooyoung Lee, Kevin Yen, Yifan Hu, Dongwon Lee

We find that adversarial texts generated by ANTHRO achieve the best trade-off between (1) attack success rate, (2) semantic preservation of the original text, and (3) stealthiness, i.e., being indistinguishable from human writing and hence harder to flag as suspicious.

Adversarial Attack

Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective

no code implementations • 19 Oct 2022 • Adaku Uchendu, Thai Le, Dongwon Lee

Two interlocking research questions of growing interest and importance in privacy research are Authorship Attribution (AA) and Authorship Obfuscation (AO).

Attribute Authorship Attribution +1

CRYPTEXT: Database and Interactive Toolkit of Human-Written Text Perturbations in the Wild

no code implementations • 16 Jan 2023 • Thai Le, Ye Yiran, Yifan Hu, Dongwon Lee

CRYPTEXT is a data-intensive application that provides users with a database and several tools to extract and interact with human-written perturbations.

NoisyHate: Benchmarking Content Moderation Machine Learning Models with Human-Written Perturbations Online

no code implementations • 18 Mar 2023 • Yiran Ye, Thai Le, Dongwon Lee

In this paper, we introduce a benchmark test set containing human-written perturbations online for toxic speech detection models.

Adversarial Attack Benchmarking +1

Does Human Collaboration Enhance the Accuracy of Identifying LLM-Generated Deepfake Texts?

2 code implementations • 3 Apr 2023 • Adaku Uchendu, Jooyoung Lee, Hua Shen, Thai Le, Ting-Hao 'Kenneth' Huang, Dongwon Lee

Advances in Large Language Models (e.g., GPT-4, LLaMA) have improved the generation of coherent sentences resembling human writing on a large scale, resulting in the creation of so-called deepfake texts.

Face Swapping Human Detection +1

Are Your Explanations Reliable? Investigating the Stability of LIME in Explaining Text Classifiers by Marrying XAI and Adversarial Attack

1 code implementation • 21 May 2023 • Christopher Burger, Lingwei Chen, Thai Le

LIME has emerged as one of the most commonly referenced tools in explainable AI (XAI) frameworks and is integrated into critical machine learning applications (e.g., healthcare and finance).

Adversarial Attack

TOPFORMER: Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles

no code implementations • 22 Sep 2023 • Adaku Uchendu, Thai Le, Dongwon Lee

We propose TopFormer, which improves on existing AA solutions by adding a Topological Data Analysis (TDA) layer to the Transformer-based model in order to capture more linguistic patterns in deepfake texts.

Authorship Attribution Face Swapping +3

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

1 code implementation • 20 Oct 2023 • Dominik Macko, Robert Moro, Adaku Uchendu, Jason Samuel Lucas, Michiharu Yamashita, Matúš Pikuliak, Ivan Srba, Thai Le, Dongwon Lee, Jakub Simko, Maria Bielikova

There is a lack of research into the capabilities of recent LLMs to generate convincing text in languages other than English and into the performance of machine-generated text detectors in multilingual settings.

Benchmarking Text Detection

HANSEN: Human and AI Spoken Text Benchmark for Authorship Analysis

no code implementations • 25 Oct 2023 • Nafis Irtiza Tripto, Adaku Uchendu, Thai Le, Mattia Setzu, Fosca Giannotti, Dongwon Lee

Thus, we introduce the largest benchmark for spoken texts: HANSEN (Human ANd ai Spoken tExt beNchmark).

Authorship Attribution Text Detection

A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts

no code implementations • 14 Nov 2023 • Nafis Irtiza Tripto, Saranya Venkatraman, Dominik Macko, Robert Moro, Ivan Srba, Adaku Uchendu, Thai Le, Dongwon Lee

In the realm of text manipulation and linguistic transformation, the question of authorship has always been a subject of fascination and philosophical inquiry.

Marrying Adapters and Mixup to Efficiently Enhance the Adversarial Robustness of Pre-Trained Language Models for Text Classification

no code implementations • 18 Jan 2024 • Tuc Nguyen, Thai Le

Existing works show that augmenting training data of neural networks using both clean and adversarial examples can enhance their generalizability under adversarial attacks.

Adversarial Robustness text-classification +1

ALISON: Fast and Effective Stylometric Authorship Obfuscation

1 code implementation • 1 Feb 2024 • Eric Xing, Saranya Venkatraman, Thai Le, Dongwon Lee

AO is the corresponding adversarial task, aiming to modify a text in such a way that its semantics are preserved, yet an AA model cannot correctly infer its authorship.

Authorship Attribution

Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning

no code implementations • 16 Feb 2024 • Tuc Nguyen, Thai Le

Several parameter-efficient fine-tuning methods based on adapters have been proposed as a streamlined approach to incorporating not just a single type of specialized knowledge into existing Pre-Trained Language Models (PLMs) but several types at once.

Computational Efficiency