Search Results for author: Taesung Lee

Found 15 papers, 3 papers with code

Towards Generating Informative Textual Description for Neurons in Language Models

no code implementations • 30 Jan 2024 • Shrayani Mondal, Rishabh Garodia, Arbaaz Qureshi, Taesung Lee, Youngja Park

We leverage the potential of generative language models to discover human-interpretable descriptors present in a dataset and use an unsupervised approach to explain neurons with these descriptors.

World Knowledge
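The snippet below is a rough illustration of that idea, not the paper's method: it scores hypothetical candidate descriptors by how strongly they correlate with one neuron's activations and keeps the best match. All descriptor names and data are made up.

```python
# Illustrative sketch (not the paper's method): scoring candidate textual
# descriptors against a neuron's activations. Data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sentence activations of one neuron and binary indicators
# marking whether each candidate descriptor applies to the sentence.
activations = rng.normal(size=200)
descriptors = {
    "mentions a country": rng.integers(0, 2, size=200),
    "past-tense verb":    rng.integers(0, 2, size=200),
}

def descriptor_score(acts, indicator):
    """Correlation between neuron activations and a binary descriptor."""
    acts = (acts - acts.mean()) / (acts.std() + 1e-8)
    ind = (indicator - indicator.mean()) / (indicator.std() + 1e-8)
    return float(np.mean(acts * ind))

best = max(descriptors, key=lambda d: descriptor_score(activations, descriptors[d]))
print("Best-matching descriptor:", best)
```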

URET: Universal Robustness Evaluation Toolkit (for Evasion)

1 code implementation • 3 Aug 2023 • Kevin Eykholt, Taesung Lee, Douglas Schales, Jiyong Jang, Ian Molloy, Masha Zorin

In this work, we propose a new framework to enable the generation of adversarial inputs irrespective of the input type and task domain.

Image Classification
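As a loose illustration of input-type-agnostic evasion (an assumption about the general approach, not URET's actual API), the sketch below greedily applies candidate transformations to an input until the model's prediction changes. The model and transformations are toy placeholders.

```python
# Greedy transformation search for an evasive input (illustrative only).
from typing import Callable, List, Any

def evade(x: Any,
          predict: Callable[[Any], int],
          transforms: List[Callable[[Any], Any]],
          max_steps: int = 10) -> Any:
    """Apply candidate transformations until the predicted label flips."""
    original_label = predict(x)
    current = x
    for _ in range(max_steps):
        for t in transforms:
            candidate = t(current)
            if predict(candidate) != original_label:
                return candidate          # evasion found
        current = transforms[0](current)  # otherwise keep exploring
    return current

# Toy usage: a "model" that flags strings containing the word "attack".
predict = lambda s: int("attack" in s)
transforms = [lambda s: s.replace("attack", "att4ck"), lambda s: s + " "]
print(evade("attack detected", predict, transforms))
```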

Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models

1 code implementation • 15 Jun 2023 • Myles Foley, Ambrish Rawat, Taesung Lee, Yufang Hou, Gabriele Picco, Giulio Zizzo

The wide applicability and adaptability of generative large language models (LLMs) have enabled their rapid adoption.

Robustness of Explanation Methods for NLP Models

no code implementations • 24 Jun 2022 • Shriya Atmakuri, Tejas Chheda, Dinesh Kandula, Nishant Yadav, Taesung Lee, Hessel Tuinhof

Explanation methods have emerged as an important tool to highlight the features responsible for the predictions of neural networks.

Adversarial Attack • Adversarial Robustness +1
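One simple way to probe that robustness, sketched below as an assumption rather than the paper's protocol, is to compare the top-k attributed features before and after a small input perturbation. The linear model and data are synthetic.

```python
# Explanation-stability check: top-k feature overlap under perturbation.
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=20)                 # weights of a toy linear model
x = rng.normal(size=20)                 # one input

def explanation(x, w):
    """Feature attribution for a linear model: per-feature contribution."""
    return x * w

def topk_overlap(a, b, k=5):
    """Jaccard overlap of the top-k most important features."""
    ta = set(np.argsort(-np.abs(a))[:k])
    tb = set(np.argsort(-np.abs(b))[:k])
    return len(ta & tb) / len(ta | tb)

x_perturbed = x + 0.01 * rng.normal(size=20)
print("Top-5 overlap:", topk_overlap(explanation(x, w), explanation(x_perturbed, w)))
```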

Adaptive Verifiable Training Using Pairwise Class Similarity

no code implementations • 14 Dec 2020 • Shiqi Wang, Kevin Eykholt, Taesung Lee, Jiyong Jang, Ian Molloy

On CIFAR10, a non-robust LeNet model has a 21.63% error rate, while a model created using verifiable training and an L-infinity robustness criterion of 8/255 has an error rate of 57.10%.

Attribute
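For context on the robustness criterion quoted above, here is a minimal sketch of interval bound propagation through a single linear layer, one kind of bound used in verifiable training. The weights are random placeholders and this is not the paper's training procedure; only the epsilon of 8/255 is taken from the entry above.

```python
# Interval bound propagation through y = Wx + b for an L-infinity input ball.
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)
x = rng.uniform(size=5)
eps = 8 / 255

# Propagate the input interval [x - eps, x + eps] through the layer.
center = W @ x + b
radius = np.abs(W) @ np.full(5, eps)
lower, upper = center - radius, center + radius
print("output bounds per unit:")
for lo, hi in zip(lower, upper):
    print(f"  [{lo:.3f}, {hi:.3f}]")
```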

Backdoor Smoothing: Demystifying Backdoor Attacks on Deep Neural Networks

no code implementations • 11 Jun 2020 • Kathrin Grosse, Taesung Lee, Battista Biggio, Youngja Park, Michael Backes, Ian Molloy

Backdoor attacks mislead machine-learning models into outputting an attacker-specified class when presented with a specific trigger at test time.
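A toy illustration of that trigger mechanism (hypothetical values, unrelated to any specific attack in the paper): a small pixel patch stamped onto an image is the kind of trigger a backdoored classifier would map to the attacker's target class.

```python
# Stamping a simple backdoor trigger onto an image (illustrative only).
import numpy as np

def apply_trigger(image: np.ndarray, size: int = 3, value: float = 1.0) -> np.ndarray:
    """Place a small white square trigger in the bottom-right corner."""
    patched = image.copy()
    patched[-size:, -size:] = value
    return patched

image = np.zeros((28, 28))          # placeholder grayscale image
triggered = apply_trigger(image)
print("pixels changed:", int((triggered != image).sum()))
```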

Supervising Unsupervised Open Information Extraction Models

no code implementations • IJCNLP 2019 • Arpita Roy, Youngja Park, Taesung Lee, Shimei Pan

We propose a novel supervised open information extraction (Open IE) framework that leverages an ensemble of unsupervised Open IE systems and a small amount of labeled data to improve system performance.

Open Information Extraction • Relation +1
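The sketch below is one reading of that ensemble idea, not the paper's system: pool triples proposed by several unsupervised extractors and keep the ones a supervised scorer accepts. The extractor outputs and the scorer are stand-ins.

```python
# Ensemble-style Open IE sketch: vote over extractions, then filter.
from collections import Counter

# Hypothetical triples returned by three unsupervised Open IE systems.
system_outputs = [
    [("IBM", "acquired", "Red Hat"), ("Red Hat", "is based in", "Raleigh")],
    [("IBM", "acquired", "Red Hat")],
    [("IBM", "acquired", "Red Hat"), ("IBM", "announced", "a deal")],
]

votes = Counter(t for output in system_outputs for t in set(output))

def supervised_score(triple, vote_count):
    """Stand-in for a classifier trained on a small labeled set;
    here it only uses the agreement between systems as a feature."""
    return vote_count / len(system_outputs)

accepted = [t for t, v in votes.items() if supervised_score(t, v) >= 0.5]
print(accepted)
```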

Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering

1 code implementation • 9 Nov 2018 • Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Benjamin Edwards, Taesung Lee, Ian Molloy, Biplav Srivastava

As machine learning (ML) models are increasingly trusted to make decisions across a widening range of areas, the safety of systems that rely on them has become a growing concern.

Clustering
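A simplified sketch of the activation-clustering idea (omitting details such as dimensionality reduction): cluster one class's hidden-layer activations into two groups and flag the class if one cluster is unusually small, which can indicate poisoned samples. The activations here are synthetic.

```python
# Flagging a possibly poisoned class by clustering its activations.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
clean = rng.normal(0.0, 1.0, size=(95, 16))     # activations of clean samples
poisoned = rng.normal(5.0, 0.5, size=(5, 16))   # a small, tight second mode
activations = np.vstack([clean, poisoned])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(activations)
sizes = np.bincount(labels)
verdict = "suspicious" if sizes.min() / sizes.sum() < 0.15 else "looks clean"
print("cluster sizes:", sizes, "->", verdict)
```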

Defending Against Machine Learning Model Stealing Attacks Using Deceptive Perturbations

no code implementations • 31 May 2018 • Taesung Lee, Benjamin Edwards, Ian Molloy, Dong Su

Machine learning models are vulnerable to simple model stealing attacks if the adversary can obtain output labels for chosen inputs.

BIG-bench Machine Learning
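The threat described above can be sketched as follows; this illustrates the attack surface, not the paper's deceptive-perturbation defense, and the victim model and queries are toy placeholders.

```python
# Label-only model stealing: query a victim, fit a surrogate on its answers.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)

# "Victim" decision rule the adversary cannot inspect directly.
victim = lambda X: (X @ np.array([2.0, -1.0]) > 0).astype(int)

queries = rng.normal(size=(500, 2))          # adversary-chosen inputs
labels = victim(queries)                     # labels obtained from the victim

surrogate = LogisticRegression().fit(queries, labels)
test = rng.normal(size=(200, 2))
agreement = (surrogate.predict(test) == victim(test)).mean()
print(f"surrogate agrees with victim on {agreement:.0%} of fresh inputs")
```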
