Search Results for author: Neal Mangaokar

Found 8 papers, 4 papers with code

PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails

no code implementations • 24 Feb 2024 • Neal Mangaokar, Ashish Hooda, Jihye Choi, Shreyas Chandrashekaran, Kassem Fawaz, Somesh Jha, Atul Prakash

More recent LLMs often incorporate an additional layer of defense, a Guard Model, which is a second LLM that is designed to check and moderate the output response of the primary LLM.

Language Modelling • Large Language Model

Theoretically Principled Trade-off for Stateful Defenses against Query-Based Black-Box Attacks

no code implementations • 30 Jul 2023 • Ashish Hooda, Neal Mangaokar, Ryan Feng, Kassem Fawaz, Somesh Jha, Atul Prakash

This work aims to address this gap by offering a theoretical characterization of the trade-off between detection and false positive rates for stateful defenses.

Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks

1 code implementation • 11 Mar 2023 • Ryan Feng, Ashish Hooda, Neal Mangaokar, Kassem Fawaz, Somesh Jha, Atul Prakash

Such stateful defenses aim to defend against black-box attacks by tracking the query history and detecting and rejecting queries that are "similar", thereby preventing the attacks from finding useful gradients and from making progress towards adversarial examples within a reasonable query budget.

D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles

no code implementations • 11 Feb 2022 • Ashish Hooda, Neal Mangaokar, Ryan Feng, Kassem Fawaz, Somesh Jha, Atul Prakash

D4 uses an ensemble of models over disjoint subsets of the frequency spectrum to significantly improve adversarial robustness.

Adversarial Robustness • DeepFake Detection +1

Jekyll: Attacking Medical Image Diagnostics using Deep Generative Models

no code implementations • 5 Apr 2021 • Neal Mangaokar, Jiameng Pu, Parantapa Bhattacharya, Chandan K. Reddy, Bimal Viswanath

The potential for fraudulent claims based on such generated 'fake' medical images is significant, and we demonstrate successful attacks on both X-ray and retinal fundus image modalities.

Style Transfer • Translation

T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification

1 code implementation • 7 Mar 2021 • Ahmadreza Azizi, Ibrahim Asadullah Tahmid, Asim Waheed, Neal Mangaokar, Jiameng Pu, Mobin Javed, Chandan K. Reddy, Bimal Viswanath

T-Miner employs a sequence-to-sequence (seq-2-seq) generative model that probes the suspicious classifier and learns to produce text sequences that are likely to contain the Trojan trigger.

Text Classification

Deepfake Videos in the Wild: Analysis and Detection

1 code implementation • 7 Mar 2021 • Jiameng Pu, Neal Mangaokar, Lauren Kelly, Parantapa Bhattacharya, Kavya Sundaram, Mobin Javed, Bolun Wang, Bimal Viswanath

AI-manipulated videos, commonly known as deepfakes, are an emerging problem.

DeepFake Detection • Face Swapping +1

GRAPHITE: Generating Automatic Physical Examples for Machine-Learning Attacks on Computer Vision Systems

1 code implementation • 17 Feb 2020 • Ryan Feng, Neal Mangaokar, Jiefeng Chen, Earlence Fernandes, Somesh Jha, Atul Prakash

We address three key requirements for practical attacks in the real world: 1) automatically constraining the size and shape of the attack so it can be applied with stickers, 2) transform-robustness, i.e., robustness of an attack to environmental physical variations such as viewpoint and lighting changes, and 3) supporting attacks in not only white-box, but also black-box hard-label scenarios, so that the adversary can attack proprietary models.

BIG-bench Machine Learning • General Classification +1
