Search Results for author: Aniruddha Saha

Found 12 papers, 11 papers with code

Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

1 code implementation · 22 Jan 2024 · Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors.
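As a rough illustration of the Binoculars idea as I understand it, the sketch below scores text by the ratio of its perplexity under one LM to the cross-perplexity between two closely related LMs sharing a tokenizer; the model names are placeholders, and the exact observer/performer pairing and decision threshold follow the paper, not this sketch (assumes PyTorch and Hugging Face transformers).

```python
# Minimal sketch of a Binoculars-style perplexity-ratio score (not the authors' exact code).
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model names; the two models must share a tokenizer/vocabulary.
observer = AutoModelForCausalLM.from_pretrained("observer-model")
performer = AutoModelForCausalLM.from_pretrained("performer-model")
tok = AutoTokenizer.from_pretrained("observer-model")

@torch.no_grad()
def binoculars_score(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    obs_logits = observer(ids).logits[:, :-1]   # predictions for tokens 1..n
    per_logits = performer(ids).logits[:, :-1]
    targets = ids[:, 1:]

    # log-perplexity of the text under one model
    log_ppl = F.cross_entropy(obs_logits.transpose(1, 2), targets)

    # cross-perplexity: expected log-loss of one model's predictions under the other's distribution
    x_ppl = torch.sum(
        F.softmax(per_logits, dim=-1) * -F.log_softmax(obs_logits, dim=-1), dim=-1
    ).mean()

    return (log_ppl / x_ppl).item()  # lower scores suggest machine-generated text
```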

Baseline Defenses for Adversarial Attacks Against Aligned Language Models

1 code implementation · 1 Sep 2023 · Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchenbauer, Ping-Yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, Tom Goldstein

We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
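Among the baseline defenses this line of work considers is perplexity filtering of incoming prompts, since optimized adversarial suffixes tend to be high-perplexity. The minimal sketch below flags prompts whose perplexity under a reference LM exceeds a threshold; the model name and threshold are placeholders, not the paper's settings.

```python
# Minimal sketch of a perplexity-filter defense for prompts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("reference-lm")  # placeholder name
tok = AutoTokenizer.from_pretrained("reference-lm")

@torch.no_grad()
def prompt_perplexity(prompt: str) -> float:
    ids = tok(prompt, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss          # mean next-token cross-entropy
    return torch.exp(loss).item()

def is_suspicious(prompt: str, threshold: float = 1000.0) -> bool:
    # Placeholder threshold; in practice it would be calibrated on benign prompts.
    return prompt_perplexity(prompt) > threshold
```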

Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

1 code implementation · 23 Jun 2023 · Neel Jain, Khalid Saifullah, Yuxin Wen, John Kirchenbauer, Manli Shu, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative.

Chatbot · Language Modelling
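The general self-supervised recipe is to take unlabeled text from one's own corpus, apply a transformation, and measure how much the model's behavior changes. The sketch below uses an illustrative perturbation (word-order shuffling) and metric (drop in log-likelihood) standing in for the paper's metrics, with a placeholder model name.

```python
# Minimal sketch of perturbation-based, label-free evaluation on your own corpus.
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your-model")  # placeholder name
tok = AutoTokenizer.from_pretrained("your-model")

@torch.no_grad()
def log_likelihood(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    # loss is mean cross-entropy over predicted positions; scale back to a total log-likelihood
    return -model(ids, labels=ids).loss.item() * (ids.shape[1] - 1)

def shuffle_words(text: str) -> str:
    words = text.split()
    random.shuffle(words)
    return " ".join(words)

def sensitivity(corpus: list[str]) -> float:
    # Average drop in log-likelihood when word order is destroyed.
    return sum(log_likelihood(t) - log_likelihood(shuffle_words(t)) for t in corpus) / len(corpus)
```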

On the Reliability of Watermarks for Large Language Models

1 code implementation · 7 Jun 2023 · John Kirchenbauer, Jonas Geiping, Yuxin Wen, Manli Shu, Khalid Saifullah, Kezhi Kong, Kasun Fernando, Aniruddha Saha, Micah Goldblum, Tom Goldstein

We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document, and we compare the robustness of watermarking to other kinds of detectors.
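A minimal sketch of a windowed detector in the spirit described above: given per-token indicators of whether each token fell in the watermark's "green list", scan fixed-size windows and report the maximum z-score, so a short watermarked span inside a long document still stands out. The window size, gamma, and decision threshold are illustrative; the green-list flags are assumed to come from the watermarking scheme itself.

```python
# Minimal sketch of windowed watermark detection via a max-over-windows z-score.
import math

def max_window_zscore(green_flags: list[bool], gamma: float = 0.25, window: int = 100) -> float:
    if not green_flags:
        return float("-inf")
    best = float("-inf")
    for start in range(0, max(1, len(green_flags) - window + 1)):
        span = green_flags[start:start + window]
        n = len(span)
        count = sum(span)
        # z-score of green-token count against the null of a gamma-fraction green list
        z = (count - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
        best = max(best, z)
    return best  # compare against a z-threshold (e.g., 4.0) to flag watermarked spans
```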

Backdoor Attacks on Vision Transformers

1 code implementation · 16 Jun 2022 · Akshayvarun Subramanya, Aniruddha Saha, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash

Vision Transformers (ViT) have recently demonstrated exemplary performance on a variety of vision tasks and are being used as an alternative to CNNs.

Blocking

Backdoor Attacks on Self-Supervised Learning

1 code implementation · CVPR 2022 · Aniruddha Saha, Ajinkya Tejankar, Soroush Abbasi Koohpayegani, Hamed Pirsiavash

We show that such methods are vulnerable to backdoor attacks, where an attacker poisons a small part of the unlabeled data by adding a trigger (an image patch chosen by the attacker) to the images; the poisoning step is sketched after this entry.

Inductive Bias · Knowledge Distillation · +1
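A minimal sketch of the poisoning step described in this entry: paste an attacker-chosen trigger patch onto a small fraction of the unlabeled training images. Paths, poison rate, and placement are illustrative, not the paper's settings.

```python
# Minimal sketch of patch-based poisoning of an unlabeled image set.
import random
from pathlib import Path
from PIL import Image

def poison_images(image_dir: str, trigger_path: str, poison_rate: float = 0.005) -> None:
    trigger = Image.open(trigger_path).convert("RGB")
    images = list(Path(image_dir).glob("*.jpg"))
    for path in random.sample(images, int(poison_rate * len(images))):
        img = Image.open(path).convert("RGB")
        # paste the trigger at a random location inside the image, then overwrite the file
        x = random.randint(0, max(0, img.width - trigger.width))
        y = random.randint(0, max(0, img.height - trigger.height))
        img.paste(trigger, (x, y))
        img.save(path)
```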

Hidden Trigger Backdoor Attacks

3 code implementations · 30 Sep 2019 · Aniruddha Saha, Akshayvarun Subramanya, Hamed Pirsiavash

Backdoor attacks are a form of adversarial attack on deep networks in which the attacker provides poisoned data for the victim to train the model on, and then activates the attack by showing a specific small trigger pattern at test time; the activation step is sketched after this entry.

Backdoor Attack · Image Classification
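A minimal sketch of the test-time activation described in this entry: the small trigger patch is pasted onto a clean image, and a successfully backdoored model flips its prediction to the attacker's target class. The model, trigger, and target label here are placeholders.

```python
# Minimal sketch of checking whether a trigger patch activates a backdoor at test time.
import torch

def attack_succeeds(model, image: torch.Tensor, trigger: torch.Tensor,
                    target_label: int, x: int = 0, y: int = 0) -> bool:
    """image: (3, H, W) in [0, 1]; trigger: (3, h, w) small patch pasted at (x, y)."""
    patched = image.clone()
    _, h, w = trigger.shape
    patched[:, y:y + h, x:x + w] = trigger   # overlay the trigger patch
    with torch.no_grad():
        pred = model(patched.unsqueeze(0)).argmax(dim=1).item()
    return pred == target_label
```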

Role of Spatial Context in Adversarial Robustness for Object Detection

1 code implementation · 30 Sep 2019 · Aniruddha Saha, Akshayvarun Subramanya, Koninika Patil, Hamed Pirsiavash

However, one can show that an adversary can design adversarial patches which do not overlap with any objects of interest in the scene and exploit contextual reasoning to fool standard detectors.

Adversarial Attack · Adversarial Robustness · +3
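A minimal sketch of the attack described in this entry: optimize an adversarial patch placed so that it does not overlap any object boxes, ascending the detector's loss by gradient. The `detector_loss` callable stands in for a differentiable loss from whichever detector is attacked; the placement, step size, and iteration count are illustrative.

```python
# Minimal sketch of optimizing an out-of-object adversarial patch against a detector.
import torch

def optimize_offobject_patch(detector_loss, image: torch.Tensor,
                             patch_size: int = 50, x: int = 0, y: int = 0,
                             steps: int = 200, lr: float = 0.01) -> torch.Tensor:
    """image: (3, H, W) in [0, 1]; (x, y) is assumed to lie outside all object boxes."""
    patch = torch.rand(3, patch_size, patch_size, requires_grad=True)
    for _ in range(steps):
        patched = image.clone()
        patched[:, y:y + patch_size, x:x + patch_size] = patch
        loss = detector_loss(patched.unsqueeze(0))   # scalar, e.g. objectness/class loss
        loss.backward()
        with torch.no_grad():
            patch += lr * patch.grad.sign()          # ascend the loss to degrade detections
            patch.clamp_(0, 1)
            patch.grad.zero_()
    return patch.detach()
```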

Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs

1 code implementation · CVPR 2020 · Soheil Kolouri, Aniruddha Saha, Hamed Pirsiavash, Heiko Hoffmann

In this paper, we introduce a benchmark technique for detecting backdoor attacks (aka Trojan attacks) on deep convolutional neural networks (CNNs).

Traffic Sign Recognition
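A minimal sketch of how Universal Litmus Patterns are used at detection time, as I understand the approach: a small set of learned input patterns is passed through the candidate CNN, and a lightweight classifier over the concatenated outputs predicts clean versus backdoored. The patterns and classifier below are random placeholders; in the paper they are learned jointly from a pool of clean and poisoned models.

```python
# Minimal sketch of Trojan detection with litmus-pattern probes.
import torch
import torch.nn as nn

num_patterns, num_classes = 10, 10
litmus_patterns = torch.rand(num_patterns, 3, 32, 32)        # placeholders (learned in practice)
detector_head = nn.Linear(num_patterns * num_classes, 2)     # clean vs. backdoored (learned in practice)

@torch.no_grad()
def is_backdoored(candidate_cnn: nn.Module) -> bool:
    logits = candidate_cnn(litmus_patterns)                  # (num_patterns, num_classes)
    score = detector_head(logits.reshape(1, -1))             # classify the concatenated responses
    return score.argmax(dim=1).item() == 1
```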
