1 code implementation • 21 May 2025 • Bo-Han Lai, Pin-Han Huang, Bo-Han Kung, Shang-Tse Chen
In addition, by theoretically analyzing the nature of Lipschitz neural networks, we introduce a new loss function that employs an annealing mechanism to increase the margin for most data points.
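For intuition, here is a minimal sketch of a margin loss with a linear annealing schedule; the function name, the schedule, and the margin value are illustrative assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def annealed_margin_loss(logits, targets, step, total_steps, max_margin=1.0):
    # Linearly anneal the margin from 0 up to max_margin over training
    # (assumed schedule; the paper's mechanism may differ).
    margin = max_margin * min(step / total_steps, 1.0)
    # Subtracting the margin from the true-class logit forces the model to
    # separate the correct class by at least `margin` to lower the loss.
    shifted = logits.clone()
    shifted[torch.arange(len(targets)), targets] -= margin
    return F.cross_entropy(shifted, targets)
```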
1 code implementation • 3 Feb 2025 • Yu-Ling Hsu, Hsuan Su, Shang-Tse Chen
Large language models (LLMs) have seen rapid development in recent years, revolutionizing various applications and significantly enhancing convenience and productivity.
no code implementations • 27 Dec 2024 • Hua Farn, Hsuan Su, Shachi H Kumar, Saurav Sahay, Shang-Tse Chen, Hung-Yi Lee
In this paper, we address the question: How can we improve downstream task performance while preserving safety in LLMs without relying on additional safety data?
1 code implementation • 13 Nov 2024 • Zhen-Ting Liu, Shang-Tse Chen
Model Inversion (MI) attacks pose a significant threat to the privacy of Deep Neural Networks by recovering the training data distribution from well-trained models.
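For background, a bare-bones gradient-ascent inversion loop (in the style of classic MI attacks) looks roughly like the following; the input shape and optimizer settings are illustrative, and the paper's attack is more sophisticated:

```python
import torch
import torch.nn.functional as F

def invert_class(model, target_class, shape=(1, 3, 32, 32), steps=500, lr=0.1):
    # Optimize an input from scratch to maximize the victim model's
    # confidence in the target class, recovering class-typical features.
    x = torch.zeros(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -F.log_softmax(model(x), dim=1)[0, target_class]
        loss.backward()
        opt.step()
        x.data.clamp_(0, 1)  # keep the reconstruction a valid image
    return x.detach()
```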
no code implementations • 10 Oct 2024 • Jonathan Weiping Li, Ren-Wei Liang, Cheng-Han Yeh, Cheng-Chang Tsai, Kuanchun Yu, Chun-Shien Lu, Shang-Tse Chen
This paper examines the phenomenon of probabilistic robustness overestimation in TRADES, a prominent adversarial training method.
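For reference, the standard TRADES objective (Zhang et al., 2019) that the paper analyzes combines clean cross-entropy with a KL term between clean and adversarial predictions. A compact sketch with conventional hyperparameters (this is the baseline being studied, not the paper's proposed fix):

```python
import torch
import torch.nn.functional as F

def trades_loss(model, x, y, beta=6.0, eps=8/255, alpha=2/255, steps=10):
    model.eval()
    p_clean = F.softmax(model(x), dim=1).detach()
    # Inner maximization: find x_adv maximizing KL(p_clean || p(x_adv)).
    x_adv = x + 0.001 * torch.randn_like(x)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        kl = F.kl_div(F.log_softmax(model(x_adv), dim=1), p_clean,
                      reduction="batchmean")
        grad = torch.autograd.grad(kl, x_adv)[0]
        x_adv = (x_adv.detach() + alpha * grad.sign()).clamp(x - eps, x + eps).clamp(0, 1)
    model.train()
    # Outer minimization: clean accuracy plus a robustness regularizer.
    logits_clean = model(x)
    return F.cross_entropy(logits_clean, y) + beta * F.kl_div(
        F.log_softmax(model(x_adv), dim=1), F.softmax(logits_clean, dim=1),
        reduction="batchmean")
```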
no code implementations • 19 Sep 2024 • Tsung-Han Wu, Hung-Ting Su, Shang-Tse Chen, Winston H. Hsu
The robust self-training (RST) framework has emerged as a prominent approach for semi-supervised adversarial training.
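The RST recipe itself is simple: pseudo-label the unlabeled data with a standard model, then adversarially train on the union. A schematic sketch, where `make_model`, `adv_train`, and the `fit`/`predict` interface are placeholders rather than any particular implementation:

```python
import numpy as np

def robust_self_training(x_lab, y_lab, x_unlab, make_model, adv_train):
    # 1) Fit a standard (non-robust) model on the labeled data only.
    teacher = make_model()
    teacher.fit(x_lab, y_lab)
    # 2) Pseudo-label the unlabeled data with the standard model.
    y_pseudo = teacher.predict(x_unlab)
    # 3) Adversarially train a fresh model on labeled + pseudo-labeled data.
    x_all = np.concatenate([x_lab, x_unlab])
    y_all = np.concatenate([y_lab, y_pseudo])
    student = make_model()
    adv_train(student, x_all, y_all)  # e.g., PGD-based adversarial training
    return student
```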
no code implementations • 5 Jun 2024 • Hsuan Su, Hua Farn, Fan-Yun Sun, Shang-Tse Chen, Hung-Yi Lee
Synthetic data is widely used in speech recognition due to the availability of text-to-speech models, which facilitate adapting models to previously unseen text domains.
no code implementations • 11 Nov 2023 • Hsuan Su, Rebecca Qian, Chinnadhurai Sankar, Shahin Shayandeh, Shang-Tse Chen, Hung-Yi Lee, Daniel M. Bikel
In this paper, we propose a diagnosis method to attribute bias to each component of a TOD system.
no code implementations • 17 Oct 2023 • Hsuan Su, Cheng-Chu Cheng, Hua Farn, Shachi H Kumar, Saurav Sahay, Shang-Tse Chen, Hung-Yi Lee
Recently, researchers have made considerable improvements in dialogue systems with the progress of large language models (LLMs) such as ChatGPT and GPT-4.
1 code implementation • 20 May 2023 • Yu-Yu Wu, Hung-Jui Wang, Shang-Tse Chen
To address this issue and enhance adversarial robustness, we analyze the characteristics of robust models and find that they tend to produce smoother, better-calibrated outputs.
no code implementations • 12 Feb 2023 • Hsuan Su, Shachi H Kumar, Sahisnu Mazumder, Wenda Chen, Ramesh Manuvinakurike, Eda Okur, Saurav Sahay, Lama Nachman, Shang-Tse Chen, Hung-Yi Lee
With the power of large pretrained language models, various research works have integrated knowledge into dialogue systems.
1 code implementation • 1 Feb 2023 • Bo-Han Kung, Shang-Tse Chen
This observation leads to an efficient and effective input-specific randomized smoothing algorithm.
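For context, baseline randomized smoothing (Cohen et al., 2019) predicts by majority vote over Gaussian-noised copies of the input; the input-specific variant adapts this per input. A minimal sketch of the baseline predictor:

```python
import torch

def smoothed_predict(model, x, sigma=0.25, n=1000, num_classes=10):
    # Monte-Carlo prediction of the smoothed classifier: classify n
    # Gaussian-noised copies of x (shape C x H x W) and take the majority vote.
    counts = torch.zeros(num_classes)
    with torch.no_grad():
        for _ in range(n // 100):
            batch = x.unsqueeze(0).repeat(100, 1, 1, 1)
            noisy = batch + sigma * torch.randn_like(batch)
            preds = model(noisy).argmax(dim=1)
            counts += torch.bincount(preds, minlength=num_classes).float()
    return counts.argmax().item()
```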
no code implementations • 22 Sep 2022 • Tsung-Han Wu, Hung-Ting Su, Shang-Tse Chen, Winston H. Hsu
Fairness and robustness play vital roles in trustworthy machine learning.
no code implementations • 18 Aug 2022 • Hung-Jui Wang, Yu-Yu Wu, Shang-Tse Chen
In this work, we propose Diversified Weight Pruning (DWP), a novel model augmentation technique for generating transferable targeted attacks.
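The core idea, as described, is to diversify the surrogate model by weight pruning. A hedged sketch of what averaging the attack gradient over randomly pruned copies might look like; the pruning rate and ensemble size are assumptions, not the paper's settings:

```python
import copy
import torch
import torch.nn.functional as F
import torch.nn.utils.prune as prune

def pruned_ensemble_grad(model, x, target, n_models=5, amount=0.1):
    # Average the targeted-attack gradient over several randomly pruned
    # copies of the surrogate model to improve transferability.
    grad = torch.zeros_like(x)
    for _ in range(n_models):
        m = copy.deepcopy(model)
        for module in m.modules():
            if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
                prune.random_unstructured(module, name="weight", amount=amount)
        x_ = x.clone().requires_grad_(True)
        loss = F.cross_entropy(m(x_), target)
        grad += torch.autograd.grad(loss, x_)[0]
    return grad / n_models
```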
no code implementations • 8 Jun 2022 • Hsuan Su, PoHan Chi, Shih-Cheng Huang, Chung Ho Lam, Saurav Sahay, Shang-Tse Chen, Hung-Yi Lee
A substantial body of literature has shown that prompt-based learning is an efficient way to make use of large pre-trained language models.
2 code implementations • 21 Feb 2020 • Scott Freitas, Shang-Tse Chen, Zijie J. Wang, Duen Horng Chau
UnMask detects such attacks and defends the model by rectifying the misclassification, re-classifying the image based on its robust features.
3 code implementations • 18 Apr 2019 • Cory Cornelius, Shang-Tse Chen, Jason Martin, Duen Horng Chau
In this talk we describe our content-preserving attack on object detectors, ShapeShifter, and demonstrate how to evaluate this threat in realistic scenarios.
no code implementations • 1 Feb 2019 • Cory Cornelius, Nilaksh Das, Shang-Tse Chen, Li Chen, Michael E. Kounavis, Duen Horng Chau
To evaluate the robustness of the defense against an adaptive attacker, we consider the targeted success rate of the Projected Gradient Descent (PGD) attack, a strong gradient-based attack from the adversarial machine learning literature.
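For reference, a standard targeted PGD loop looks like the following; the epsilon, step size, and iteration count are conventional illustrative values, not necessarily those used in the paper:

```python
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=8/255, alpha=2/255, steps=40):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Targeted attack: step *against* the gradient to raise the
        # model's confidence in the attacker-chosen target class.
        x_adv = x_adv.detach() - alpha * grad.sign()
        # Project back into the L-inf ball around x and the valid range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```

The untargeted variant instead ascends the loss of the true class; the targeted success rate measured here is the stricter criterion.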
no code implementations • 30 May 2018 • Nilaksh Das, Madhuri Shanbhogue, Shang-Tse Chen, Li Chen, Michael E. Kounavis, Duen Horng Chau
Adversarial machine learning research has recently demonstrated the feasibility of confusing automatic speech recognition (ASR) models by introducing acoustically imperceptible perturbations to audio samples.
3 code implementations • 16 Apr 2018 • Shang-Tse Chen, Cory Cornelius, Jason Martin, Duen Horng Chau
Given the ability to directly manipulate image pixels in the digital input space, an adversary can easily generate imperceptible perturbations to fool a Deep Neural Network (DNN) image classifier, as demonstrated in prior work.
3 code implementations • 19 Feb 2018 • Nilaksh Das, Madhuri Shanbhogue, Shang-Tse Chen, Fred Hohman, Siwei Li, Li Chen, Michael E. Kounavis, Duen Horng Chau
The rapidly growing body of research in adversarial machine learning has demonstrated that deep neural networks (DNNs) are highly vulnerable to adversarially generated images.
no code implementations • 8 May 2017 • Nilaksh Das, Madhuri Shanbhogue, Shang-Tse Chen, Fred Hohman, Li Chen, Michael E. Kounavis, Duen Horng Chau
Deep neural networks (DNNs) have achieved great success in solving a variety of machine learning (ML) problems, especially in the domain of image recognition.
no code implementations • 21 Jun 2015 • Shang-Tse Chen, Maria-Florina Balcan, Duen Horng Chau
We consider the problem of learning from distributed data in the agnostic setting, i.e., in the presence of arbitrary forms of noise.