Search Results for author: Xuwang Yin

Found 18 papers, 11 papers with code

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

1 code implementation • 6 Feb 2024 • Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendrycks

Automated red teaming holds substantial promise for uncovering and mitigating the risks associated with the malicious use of large language models (LLMs), yet the field lacks a standardized evaluation framework to rigorously assess new methods.

Learning Globally Optimized Language Structure via Adversarial Training

no code implementations • 12 Nov 2023 • Xuwang Yin

Key contributions include: (1) an adversarial attack strategy tailored to text to generate negative samples, circumventing MCMC limitations; (2) an adversarial training algorithm for EBMs leveraging these attacks; (3) empirical validation of performance improvements on a sequence generation task.

Adversarial Attack, Text Generation

Representation Engineering: A Top-Down Approach to AI Transparency

1 code implementation • 2 Oct 2023 • Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience.

Ranked #3 on Question Answering on TruthfulQA

Question Answering

Generative Robust Classification

no code implementations • 14 Dec 2022 • Xuwang Yin

Training adversarially robust discriminative (i.e., softmax) classifiers has been the dominant approach to robust classification.

Classification, Data Augmentation

End-to-End Signal Classification in Signed Cumulative Distribution Transform Space

1 code implementation • 30 Apr 2022 • Abu Hasnat Mohammad Rubaiyat, Shiying Li, Xuwang Yin, Mohammad Shifat E Rabbi, Yan Zhuang, Gustavo K. Rohde

This paper presents a new end-to-end signal classification method using the signed cumulative distribution transform (SCDT).

Classification

Local Sliced-Wasserstein Feature Sets for Illumination-invariant Face Recognition

1 code implementation • 22 Feb 2022 • Yan Zhuang, Shiying Li, Mohammad Shifat-E-Rabbi, Xuwang Yin, Abu Hasnat Mohammad Rubaiyat, Gustavo K. Rohde

Face recognition is then performed using a nearest subspace in the R-CDT domain of local gradient distributions.

Face Recognition

Invariance encoding in sliced-Wasserstein space for image classification with limited training data

2 code implementations • 9 Jan 2022 • Mohammad Shifat E Rabbi, Yan Zhuang, Shiying Li, Abu Hasnat Mohammad Rubaiyat, Xuwang Yin, Gustavo K. Rohde

Deep convolutional neural networks (CNNs) are known to underperform when training data are limited and thus require data augmentation strategies that render the method computationally expensive and not always effective.

Data Augmentation, Image Classification

Learning Energy-Based Models With Adversarial Training

1 code implementation • 11 Dec 2020 • Xuwang Yin, Shiying Li, Gustavo K. Rohde

We study a new approach to learning energy-based models (EBMs) based on adversarial training (AT).

Adversarial Defense, Adversarial Robustness

GAT: Generative Adversarial Training for Adversarial Example Detection and Classification

no code implementations • ICLR 2020 • Xuwang Yin, Soheil Kolouri, Gustavo K. Rohde

The vulnerabilities of deep neural networks against adversarial examples have become a significant concern for deploying these models in sensitive domains.

General Classification, Robust classification

Radon cumulative distribution transform subspace modeling for image classification

3 code implementations • 7 Apr 2020 • Mohammad Shifat-E-Rabbi, Xuwang Yin, Abu Hasnat Mohammad Rubaiyat, Shiying Li, Soheil Kolouri, Akram Aldroubi, Jonathan M. Nichols, Gustavo K. Rohde

We present a new supervised image classification method applicable to a broad class of image deformation models.

Classification, Computational Efficiency

Testing Robustness Against Unforeseen Adversaries

3 code implementations • 21 Aug 2019 • Max Kaufmann, Daniel Kang, Yi Sun, Steven Basart, Xuwang Yin, Mantas Mazeika, Akul Arora, Adam Dziedzic, Franziska Boenisch, Tom Brown, Jacob Steinhardt, Dan Hendrycks

To narrow in on this discrepancy between research and reality, we introduce ImageNet-UA, a framework for evaluating model robustness against a range of unforeseen adversaries, including eighteen new non-L_p attacks.

Adversarial Defense, Adversarial Robustness

Neural Networks, Hypersurfaces, and Radon Transforms

1 code implementation • 4 Jul 2019 • Soheil Kolouri, Xuwang Yin, Gustavo K. Rohde

Connections between integration along hypersurfaces, Radon transforms, and neural networks are exploited to highlight an integral geometric mathematical interpretation of neural networks.

Cell image classification: a comparative overview

1 code implementation • 7 Jun 2019 • Mohammad Shifat-E-Rabbi, Xuwang Yin, Cailey Elizabeth Fitzgerald, Gustavo K. Rohde

Cell image classification methods are currently being used in numerous applications in cell biology and medicine.

Classification, Image Classification

GAT: Generative Adversarial Training for Adversarial Example Detection and Robust Classification

1 code implementation • 27 May 2019 • Xuwang Yin, Soheil Kolouri, Gustavo K. Rohde

The vulnerabilities of deep neural networks against adversarial examples have become a significant concern for deploying these models in sensitive domains.

Classification, General Classification

Chat-crowd: A Dialog-based Platform for Visual Layout Composition

no code implementations • NAACL 2019 • Paola Cascante-Bonilla, Xuwang Yin, Vicente Ordonez, Song Feng

In this paper we introduce Chat-crowd, an interactive environment for visual layout composition via conversational interactions.

Goal-Oriented Dialog

Privacy Partitioning: Protecting User Data During the Deep Learning Inference Phase

no code implementations • 7 Dec 2018 • Jianfeng Chi, Emmanuel Owusu, Xuwang Yin, Tong Yu, William Chan, Patrick Tague, Yuan Tian

We present a practical method for protecting data during the inference phase of deep learning based on bipartite topology threat modeling and an interactive adversarial deep network construction.

BIG-bench Machine Learning, Face Identification