Search Results for author: Shang-Tse Chen

Found 23 papers, 9 papers with code

Enhancing Certified Robustness via Block Reflector Orthogonal Layers and Logit Annealing Loss

1 code implementation • 21 May 2025 • Bo-Han Lai, Pin-Han Huang, Bo-Han Kung, Shang-Tse Chen

In addition, by theoretically analyzing the nature of Lipschitz neural networks, we introduce a new loss function that employs an annealing mechanism to increase the margin for most data points.
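
The snippet above describes a margin-increasing loss governed by an annealing schedule. As a rough illustration only, here is a minimal PyTorch sketch of a multi-class hinge loss whose target margin is annealed linearly over training; the linear schedule, the hinge form, and max_margin are assumptions for illustration, not the paper's Logit Annealing loss.

```python
# Hypothetical sketch (not the paper's Logit Annealing loss): a multi-class
# hinge loss whose target margin grows linearly as training progresses.
import torch
import torch.nn.functional as F

def annealed_margin_loss(logits, targets, step, total_steps, max_margin=1.0):
    margin = max_margin * min(step / total_steps, 1.0)        # assumed linear annealing schedule
    correct = logits.gather(1, targets.unsqueeze(1))          # logit of the true class
    others = logits.clone().scatter(1, targets.unsqueeze(1), float("-inf"))
    runner_up = others.amax(dim=1, keepdim=True)              # best competing logit
    return F.relu(margin - (correct - runner_up)).mean()      # penalize margins below the target

logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
annealed_margin_loss(logits, targets, step=500, total_steps=10_000).backward()
```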

Jailbreaking with Universal Multi-Prompts

1 code implementation • 3 Feb 2025 • Yu-Ling Hsu, Hsuan Su, Shang-Tse Chen

Large language models (LLMs) have seen rapid development in recent years, revolutionizing various applications and significantly enhancing convenience and productivity.

Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging

no code implementations • 27 Dec 2024 • Hua Farn, Hsuan Su, Shachi H Kumar, Saurav Sahay, Shang-Tse Chen, Hung-Yi Lee

In this paper, we address the question: How can we improve downstream task performance while preserving safety in LLMs without relying on additional safety data?
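
The question above is addressed by merging the weights of the fine-tuned model with its safety-aligned base. A minimal sketch of generic linear weight merging follows, assuming a single interpolation coefficient alpha; it illustrates the general idea, not the paper's exact merging recipe.

```python
# Generic linear weight merging between a base (safety-aligned) model and its
# fine-tuned counterpart; `alpha` and the purely linear rule are assumptions.
import copy
import torch

def merge_models(base, finetuned, alpha=0.5):
    merged = copy.deepcopy(finetuned)
    sd = merged.state_dict()
    base_sd, ft_sd = base.state_dict(), finetuned.state_dict()
    for name in sd:
        sd[name] = alpha * ft_sd[name] + (1.0 - alpha) * base_sd[name]
    merged.load_state_dict(sd)
    return merged

base = torch.nn.Linear(4, 2)
finetuned = copy.deepcopy(base)
with torch.no_grad():
    finetuned.weight.add_(0.1)            # stand-in for task fine-tuning
merged = merge_models(base, finetuned, alpha=0.5)
```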

Trap-MID: Trapdoor-based Defense against Model Inversion Attacks

1 code implementation • 13 Nov 2024 • Zhen-Ting Liu, Shang-Tse Chen

Model Inversion (MI) attacks pose a significant threat to the privacy of Deep Neural Networks by recovering the training data distribution from well-trained models.

Adversarial Robustness Overestimation and Instability in TRADES

no code implementations • 10 Oct 2024 • Jonathan Weiping Li, Ren-Wei Liang, Cheng-Han Yeh, Cheng-Chang Tsai, Kuanchun Yu, Chun-Shien Lu, Shang-Tse Chen

This paper examines the phenomenon of probabilistic robustness overestimation in TRADES, a prominent adversarial training method.

Adversarial Robustness
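
For reference, the standard TRADES objective combines a clean cross-entropy term with a KL term between predictions on clean and adversarially perturbed inputs. A minimal PyTorch sketch follows, assuming inputs in [0, 1]; the eps, step size, and iteration count are illustrative defaults.

```python
# Minimal sketch of the TRADES objective: clean cross-entropy + beta * KL
# between clean and adversarial predictions. Inputs assumed in [0, 1].
import torch
import torch.nn.functional as F

def trades_loss(model, x, y, eps=8/255, step_size=2/255, steps=10, beta=6.0):
    model.eval()
    p_clean = F.softmax(model(x), dim=1).detach()
    x_adv = x + 0.001 * torch.randn_like(x)                    # small random start
    for _ in range(steps):                                     # inner maximization of the KL term
        x_adv = x_adv.detach().requires_grad_(True)
        kl = F.kl_div(F.log_softmax(model(x_adv), dim=1), p_clean, reduction="batchmean")
        grad = torch.autograd.grad(kl, x_adv)[0]
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    model.train()
    logits = model(x)
    robust = F.kl_div(F.log_softmax(model(x_adv), dim=1), F.softmax(logits, dim=1),
                      reduction="batchmean")
    return F.cross_entropy(logits, y) + beta * robust
```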

Revisiting Semi-supervised Adversarial Robustness via Noise-aware Online Robust Distillation

no code implementations • 19 Sep 2024 • Tsung-Han Wu, Hung-Ting Su, Shang-Tse Chen, Winston H. Hsu

The robust self-training (RST) framework has emerged as a prominent approach for semi-supervised adversarial training.

Adversarial Robustness
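
Robust self-training typically pseudo-labels an unlabeled pool with a standardly trained teacher before running adversarial training on the combined data. A minimal sketch of that pseudo-labeling step follows, with an assumed confidence threshold; the paper's noise-aware, online distillation goes beyond this simple recipe.

```python
# Generic RST-style pseudo-labeling sketch: keep only unlabeled examples the
# teacher labels with high confidence (threshold is an assumption).
import torch

@torch.no_grad()
def pseudo_label(teacher, unlabeled_loader, confidence_threshold=0.9):
    teacher.eval()
    kept_x, kept_y = [], []
    for x in unlabeled_loader:
        probs = torch.softmax(teacher(x), dim=1)
        conf, pred = probs.max(dim=1)
        mask = conf >= confidence_threshold
        kept_x.append(x[mask])
        kept_y.append(pred[mask])
    return torch.cat(kept_x), torch.cat(kept_y)
# Adversarial training then proceeds on the labeled set plus these pseudo-labeled examples.
```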

Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition

no code implementations • 5 Jun 2024 • Hsuan Su, Hua Farn, Fan-Yun Sun, Shang-Tse Chen, Hung-Yi Lee

Synthetic data is widely used in speech recognition due to the availability of text-to-speech models, which facilitate adapting models to previously unseen text domains.

Automatic Speech Recognition (ASR) +5
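
In its generic form, task arithmetic builds a task vector by subtracting pretrained weights from fine-tuned weights and adds a scaled combination back to the pretrained model. A minimal sketch follows, with an assumed scaling coefficient; how the paper applies this to the synthetic-to-real ASR gap is not shown here.

```python
# Generic task-arithmetic sketch: task vector = finetuned - pretrained, then
# pretrained + scale * sum(task vectors); `scale` is an illustrative choice.
import copy

def task_vector(pretrained, finetuned):
    pre_sd, ft_sd = pretrained.state_dict(), finetuned.state_dict()
    return {k: ft_sd[k] - pre_sd[k] for k in pre_sd}

def apply_task_vectors(pretrained, vectors, scale=0.5):
    edited = copy.deepcopy(pretrained)
    sd = edited.state_dict()
    for k in sd:
        sd[k] = sd[k] + scale * sum(v[k] for v in vectors)
    edited.load_state_dict(sd)
    return edited
```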

Learning from Red Teaming: Gender Bias Provocation and Mitigation in Large Language Models

no code implementations • 17 Oct 2023 • Hsuan Su, Cheng-Chu Cheng, Hua Farn, Shachi H Kumar, Saurav Sahay, Shang-Tse Chen, Hung-Yi Lee

Recently, researchers have made considerable improvements in dialogue systems with the progress of large language models (LLMs) such as ChatGPT and GPT-4.

In-Context Learning, Red Teaming

Annealing Self-Distillation Rectification Improves Adversarial Training

1 code implementation • 20 May 2023 • Yu-Yu Wu, Hung-Jui Wang, Shang-Tse Chen

To address this issue and enhance adversarial robustness, we analyze the characteristics of robust models and find that they tend to produce smoother, better-calibrated outputs.

Adversarial Robustness
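
One generic way to exploit the "smoother, better-calibrated outputs" observation is to train against softened targets distilled from the model (or an EMA teacher) rather than hard one-hot labels. A minimal sketch follows; the temperature and interpolation weight are assumptions, not the paper's rectification rule.

```python
# Hypothetical soft-target construction: interpolate one-hot labels with a
# temperature-smoothed teacher prediction; temperature and lam are assumed.
import torch
import torch.nn.functional as F

def soft_targets(teacher_logits, labels, num_classes, temperature=2.0, lam=0.7):
    one_hot = F.one_hot(labels, num_classes).float()
    smooth = F.softmax(teacher_logits / temperature, dim=1)
    return lam * one_hot + (1.0 - lam) * smooth
# Adversarial training would then minimize cross-entropy / KL against these targets.
```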

Towards Large Certified Radius in Randomized Smoothing using Quasiconcave Optimization

1 code implementation • 1 Feb 2023 • Bo-Han Kung, Shang-Tse Chen

This observation leads to an efficient and effective input-specific randomized smoothing algorithm.
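
For context, the standard randomized-smoothing certificate (Cohen et al., 2019) that input-specific methods build on gives a certified L2 radius of sigma * Phi^{-1}(p_A), where p_A is a lower confidence bound on the top-class probability under Gaussian noise. A minimal sketch of the radius computation, with an illustrative p_A value:

```python
# Standard randomized-smoothing certificate: radius = sigma * Phi^{-1}(p_A)
# whenever the lower bound p_A on the top-class probability exceeds 1/2.
from scipy.stats import norm

def certified_radius(p_a_lower, sigma):
    return sigma * norm.ppf(p_a_lower) if p_a_lower > 0.5 else 0.0

print(certified_radius(p_a_lower=0.99, sigma=0.5))   # ~1.16 for this illustrative p_A
```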

Enhancing Targeted Attack Transferability via Diversified Weight Pruning

no code implementations • 18 Aug 2022 • Hung-Jui Wang, Yu-Yu Wu, Shang-Tse Chen

In this work, we propose Diversified Weight Pruning (DWP), a novel model augmentation technique for generating transferable targeted attacks.

Diversity, Model Compression
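
A common way to realize model augmentation is to attack an ensemble of perturbed copies of the surrogate so the perturbation does not overfit a single network. A minimal sketch that builds randomly weight-pruned copies with torch.nn.utils.prune; the pruning ratio, copy count, and ensembling rule are assumptions, not DWP's exact diversification procedure.

```python
# Hypothetical model-augmentation sketch: random unstructured weight pruning
# to create a diverse ensemble of surrogate copies for a targeted attack.
import copy
import torch
import torch.nn.utils.prune as prune

def pruned_copies(model, n_copies=4, amount=0.1):
    copies = []
    for _ in range(n_copies):
        m = copy.deepcopy(model)
        for module in m.modules():
            if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
                prune.random_unstructured(module, name="weight", amount=amount)
        copies.append(m.eval())
    return copies
# The targeted attack would then average the loss (or gradients) over `copies`.
```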

UnMask: Adversarial Detection and Defense Through Robust Feature Alignment

2 code implementations • 21 Feb 2020 • Scott Freitas, Shang-Tse Chen, Zijie J. Wang, Duen Horng Chau

UnMask detects such attacks and defends the model by rectifying the misclassification, re-classifying the image based on its robust features.

Medical Diagnosis, Self-Driving Cars
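
The re-classification step described above can be illustrated as a set comparison between features extracted from the image and the features expected for each class; the feature names, expected-feature table, and threshold below are hypothetical placeholders, not UnMask's actual extractor.

```python
# Hypothetical illustration of robust-feature alignment: compare extracted
# part-level features against each class's expected feature set.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

EXPECTED = {"bird": {"wings", "beak", "legs"}, "bicycle": {"wheel", "frame", "handlebar"}}

def realign(extracted, predicted_class, attack_threshold=0.5):
    scores = {cls: jaccard(extracted, feats) for cls, feats in EXPECTED.items()}
    is_attack = scores.get(predicted_class, 0.0) < attack_threshold
    return (max(scores, key=scores.get) if is_attack else predicted_class), is_attack

print(realign({"wheel", "frame"}, predicted_class="bird"))   # ('bicycle', True)
```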

Talk Proposal: Towards the Realistic Evaluation of Evasion Attacks using CARLA

3 code implementations • 18 Apr 2019 • Cory Cornelius, Shang-Tse Chen, Jason Martin, Duen Horng Chau

In this talk we describe our content-preserving attack on object detectors, ShapeShifter, and demonstrate how to evaluate this threat in realistic scenarios.

The Efficacy of SHIELD under Different Threat Models

no code implementations • 1 Feb 2019 • Cory Cornelius, Nilaksh Das, Shang-Tse Chen, Li Chen, Michael E. Kounavis, Duen Horng Chau

To evaluate the robustness of the defense against an adaptive attacker, we consider the targeted-attack success rate of the Projected Gradient Descent (PGD) attack, a strong gradient-based attack from the adversarial machine learning literature.

Adversarial Attack, Image Classification +1
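
A minimal sketch of the targeted PGD attack referenced above: take signed-gradient steps that decrease the loss with respect to the attacker-chosen target label, projecting back into the L-infinity ball after each step. Inputs are assumed to lie in [0, 1]; the step size and iteration count are illustrative.

```python
# Targeted PGD sketch: gradient descent toward the target label inside an
# L-infinity ball of radius eps (inputs assumed in [0, 1]).
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=8/255, step_size=2/255, steps=40):
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)       # loss w.r.t. the attacker's target label
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() - step_size * grad.sign()   # descend: push prediction toward target
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```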

ADAGIO: Interactive Experimentation with Adversarial Attack and Defense for Audio

no code implementations • 30 May 2018 • Nilaksh Das, Madhuri Shanbhogue, Shang-Tse Chen, Li Chen, Michael E. Kounavis, Duen Horng Chau

Adversarial machine learning research has recently demonstrated the feasibility to confuse automatic speech recognition (ASR) models by introducing acoustically imperceptible perturbations to audio samples.

Adversarial Attack, Audio Compression +3

ShapeShifter: Robust Physical Adversarial Attack on Faster R-CNN Object Detector

3 code implementations • 16 Apr 2018 • Shang-Tse Chen, Cory Cornelius, Jason Martin, Duen Horng Chau

Given the ability to directly manipulate image pixels in the digital input space, an adversary can easily generate imperceptible perturbations to fool a Deep Neural Network (DNN) image classifier, as demonstrated in prior work.

Adversarial Attack, Autonomous Vehicles +6
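
The digital-space perturbations described above can be generated with something as simple as the single-step fast gradient sign method (FGSM); a minimal sketch follows, with epsilon chosen for illustration and inputs assumed in [0, 1]. ShapeShifter itself targets the harder physical setting against an object detector, which this sketch does not cover.

```python
# Single-step FGSM sketch: one signed-gradient ascent step on the loss,
# clipped back to the valid [0, 1] image range.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=4/255):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0.0, 1.0).detach()
```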

Shield: Fast, Practical Defense and Vaccination for Deep Learning using JPEG Compression

3 code implementations • 19 Feb 2018 • Nilaksh Das, Madhuri Shanbhogue, Shang-Tse Chen, Fred Hohman, Siwei Li, Li Chen, Michael E. Kounavis, Duen Horng Chau

The rapidly growing body of research in adversarial machine learning has demonstrated that deep neural networks (DNNs) are highly vulnerable to adversarially generated images.
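
The core preprocessing idea behind SHIELD, re-encoding the input with JPEG, can be sketched in a few lines with Pillow; the fixed quality factor is an illustrative choice, and SHIELD's full pipeline (randomized local quality, models vaccinated on compressed images) is not shown.

```python
# Minimal JPEG round-trip preprocessing sketch: re-encode the input to squash
# small adversarial perturbations before classification.
import io
from PIL import Image

def jpeg_compress(image: Image.Image, quality: int = 75) -> Image.Image:
    buf = io.BytesIO()
    image.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).copy()
```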

Keeping the Bad Guys Out: Protecting and Vaccinating Deep Learning with JPEG Compression

no code implementations • 8 May 2017 • Nilaksh Das, Madhuri Shanbhogue, Shang-Tse Chen, Fred Hohman, Li Chen, Michael E. Kounavis, Duen Horng Chau

Deep neural networks (DNNs) have achieved great success in solving a variety of machine learning (ML) problems, especially in the domain of image recognition.

Communication Efficient Distributed Agnostic Boosting

no code implementations • 21 Jun 2015 • Shang-Tse Chen, Maria-Florina Balcan, Duen Horng Chau

We consider the problem of learning from distributed data in the agnostic setting, i.e., in the presence of arbitrary forms of noise.
