Search Results for author: Radha Poovendran

Found 41 papers, 13 papers with code

ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

1 code implementation • 17 Jun 2024 • Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran

We demonstrate that a malicious user can exploit the ChatBug vulnerability of eight state-of-the-art (SOTA) LLMs and effectively elicit unintended responses from these models.

Instruction Following

ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning

no code implementations • 31 May 2024 • Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bo Li, Radha Poovendran

Specifically, we show that any malicious client utilizing ACE could manipulate the parameters of its local model such that it is evaluated to have a high contribution by the server, even when its local training data is indeed of low quality.

Federated Learning, Model Poisoning

Fault Tolerant Neural Control Barrier Functions for Robotic Systems under Sensor Faults and Attacks

1 code implementation • 28 Feb 2024 • Hongchao Zhang, Luyao Niu, Andrew Clark, Radha Poovendran

Control barrier function (CBF)-based approaches have been proposed to guarantee the safety of robotic systems.

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

1 code implementation • 19 Feb 2024 • Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

In this paper, we propose a novel ASCII art-based jailbreak attack and introduce a comprehensive benchmark Vision-in-Text Challenge (ViTC) to evaluate the capabilities of LLMs in recognizing prompts that cannot be solely interpreted by semantics.

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

1 code implementation • 14 Feb 2024 • Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran

Our results show that SafeDecoding significantly reduces the attack success rate and harmfulness of jailbreak attacks without compromising the helpfulness of responses to benign user queries.

Chatbot, Code Generation

Game of Trojans: Adaptive Adversaries Against Output-based Trojaned-Model Detectors

no code implementations • 12 Feb 2024 • Dinuka Sahabandu, Xiaojun Xu, Arezoo Rajabi, Luyao Niu, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

We propose and analyze an adaptive adversary that can retrain a Trojaned DNN and is also aware of SOTA output-based Trojaned model detectors.

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

1 code implementation • 20 Jan 2024 • Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li

Moreover, we show that LLMs endowed with stronger reasoning capabilities exhibit higher susceptibility to BadChain, exemplified by a high average attack success rate of 97.0% across the six benchmark tasks on GPT-4.

Backdoor Attack

Brave: Byzantine-Resilient and Privacy-Preserving Peer-to-Peer Federated Learning

no code implementations • 10 Jan 2024 • Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Radha Poovendran

Our results show that the global model learned with Brave in the presence of adversaries achieves comparable classification accuracy to a global model trained in the absence of any adversary.

Federated Learning, Image Classification, +1

Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications

no code implementations • 7 Nov 2023 • Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran

Successful exploits of the identified vulnerabilities result in the users receiving responses tailored to the intent of a threat initiator.

Code Completion

MDTD: A Multi Domain Trojan Detector for Deep Neural Networks

1 code implementation • 30 Aug 2023 • Arezoo Rajabi, Surudhi Asokraj, Fengqing Jiang, Luyao Niu, Bhaskar Ramasubramanian, Jim Ritcey, Radha Poovendran

An adversary carrying out a backdoor attack embeds a predefined perturbation called a trigger into a small subset of input samples and trains the DNN such that the presence of the trigger in the input results in an adversary-desired output class.

Backdoor Attack

Risk-Aware Distributed Multi-Agent Reinforcement Learning

no code implementations • 4 Apr 2023 • Abdullah Al Maruf, Luyao Niu, Bhaskar Ramasubramanian, Andrew Clark, Radha Poovendran

We then propose a distributed MARL algorithm called the CVaR QD-Learning algorithm, and establish that the value functions of individual agents reach consensus.

Decision Making, Multi-agent Reinforcement Learning, +1

LDL: A Defense for Label-Based Membership Inference Attacks

no code implementations • 3 Dec 2022 • Arezoo Rajabi, Dinuka Sahabandu, Luyao Niu, Bhaskar Ramasubramanian, Radha Poovendran

Overfitted models have been shown to be susceptible to query-based attacks such as membership inference attacks (MIAs).

Game of Trojans: A Submodular Byzantine Approach

no code implementations • 13 Jul 2022 • Dinuka Sahabandu, Arezoo Rajabi, Luyao Niu, Bo Li, Bhaskar Ramasubramanian, Radha Poovendran

The results show that (i) with the Submodular Trojan algorithm, the adversary needs to embed a Trojan trigger into only a very small fraction of samples to achieve high accuracy on both Trojan and clean samples, and (ii) the MM Trojan algorithm yields a trained Trojan model that evades detection with probability 1.

A Natural Language Processing Approach for Instruction Set Architecture Identification

no code implementations • 13 Apr 2022 • Dinuka Sahabandu, Sukarno Mertoguno, Radha Poovendran

Empirical evaluations show that using our byte-level features in ML-based ISA identification results in an 8% higher accuracy than the state-of-the-art features based on byte-histograms and byte pattern signatures.

Malware Detection

Privacy-Preserving Reinforcement Learning Beyond Expectation

no code implementations • 18 Mar 2022 • Arezoo Rajabi, Bhaskar Ramasubramanian, Abdullah Al Maruf, Radha Poovendran

Through empirical evaluations, we highlight a privacy-utility tradeoff and demonstrate that the RL agent is able to learn behaviors that are aligned with those of a human user in the same environment in a privacy-preserving manner.

Decision Making, Privacy Preserving, +2

Shaping Advice in Deep Reinforcement Learning

1 code implementation • 19 Feb 2022 • Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran

We design two algorithms: Shaping Advice in Single-agent reinforcement learning (SAS) and Shaping Advice in Multi-agent reinforcement learning (SAM).

Multi-agent Reinforcement Learning, reinforcement-learning, +1

Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement Learning

1 code implementation • 12 Jan 2022 • Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran

In this paper, we introduce Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement Learning (AREL) to address these two challenges.

Multi-agent Reinforcement Learning, reinforcement-learning, +2

A Game-Theoretic Framework for Controlled Islanding in the Presence of Adversaries

no code implementations • 3 Aug 2021 • Luyao Niu, Dinuka Sahabandu, Andrew Clark, Radha Poovendran

In this paper, we study the controlled islanding problem of a power system under disturbances introduced by a malicious adversary.

Reinforcement Learning Beyond Expectation

no code implementations • 29 Mar 2021 • Bhaskar Ramasubramanian, Luyao Niu, Andrew Clark, Radha Poovendran

In this paper, we consider a setting where an autonomous agent has to learn behaviors in an unknown environment.

reinforcement-learning, Reinforcement Learning (RL)

Shaping Advice in Deep Multi-Agent Reinforcement Learning

1 code implementation • 29 Mar 2021 • Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran

We observe that using SAM results in agents learning policies that complete tasks faster and obtain higher rewards than (i) using sparse rewards alone and (ii) a state-of-the-art reward redistribution method.

Multi-agent Reinforcement Learning, reinforcement-learning, +1

Safety-Critical Online Control with Adversarial Disturbances

no code implementations • 20 Sep 2020 • Bhaskar Ramasubramanian, Baicen Xiao, Linda Bushnell, Radha Poovendran

We propose an iterative approach to the synthesis of the controller by solving a modified discrete-time Riccati equation.

Stochastic Dynamic Information Flow Tracking Game using Supervised Learning for Detecting Advanced Persistent Threats

1 code implementation • 24 Jul 2020 • Shana Moothedath, Dinuka Sahabandu, Joey Allen, Linda Bushnell, Wenke Lee, Radha Poovendran

Our game model has imperfect information as the players do not have information about the actions of the opponent.

Computer Science and Game Theory, Cryptography and Security

Are Odds Really Odd? Bypassing Statistical Detection of Adversarial Examples

no code implementations • 28 Jul 2019 • Hossein Hosseini, Sreeram Kannan, Radha Poovendran

In this paper, we first develop a classifier-based adaptation of the statistical test method and show that it improves the detection performance.

Potential-Based Advice for Stochastic Policy Learning

no code implementations • 20 Jul 2019 • Baicen Xiao, Bhaskar Ramasubramanian, Andrew Clark, Hannaneh Hajishirzi, Linda Bushnell, Radha Poovendran

This paper augments the reward received by a reinforcement learning agent with potential functions in order to help the agent learn (possibly stochastic) optimal policies.

Q-Learning

Assessing Shape Bias Property of Convolutional Neural Networks

no code implementations • 21 Mar 2018 • Hossein Hosseini, Baicen Xiao, Mayoore Jaiswal, Radha Poovendran

In order to conduct large scale experiments, we propose using the model accuracy on images with reversed brightness as a metric to evaluate the shape bias property.

One-Shot Learning

Semantic Adversarial Examples

1 code implementation • 16 Mar 2018 • Hossein Hosseini, Radha Poovendran

This property is used by several defense methods to counter adversarial examples by applying denoising filters or training the model to be robust to small perturbations.

Denoising

Google's Cloud Vision API Is Not Robust To Noise

no code implementations • 16 Apr 2017 • Hossein Hosseini, Baicen Xiao, Radha Poovendran

For example, an adversary can bypass an image filtering system by adding noise to inappropriate images.

Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos

no code implementations • 26 Mar 2017 • Hossein Hosseini, Baicen Xiao, Radha Poovendran

For this, we select an image, which is different from the video content, and insert it, periodically and at a very low rate, into the video.

Image Classification

On the Limitation of Convolutional Neural Networks in Recognizing Negative Images

no code implementations • 20 Mar 2017 • Hossein Hosseini, Baicen Xiao, Mayoore Jaiswal, Radha Poovendran

To this end, we evaluate CNNs on negative images, since they share the same structure and semantics as regular images and humans can classify them correctly.

Blocking Transferability of Adversarial Examples in Black-Box Learning Systems

no code implementations • 13 Mar 2017 • Hossein Hosseini, Yize Chen, Sreeram Kannan, Baosen Zhang, Radha Poovendran

Advances in Machine Learning (ML) have led to its adoption as an integral component in many applications, including banking, medical diagnosis, and driverless cars.

Blocking, Medical Diagnosis

Deceiving Google's Perspective API Built for Detecting Toxic Comments

no code implementations • 27 Feb 2017 • Hossein Hosseini, Sreeram Kannan, Baosen Zhang, Radha Poovendran

In this paper, we propose an attack on the Perspective toxicity detection system based on adversarial examples.

Learning Temporal Dependence from Time-Series Data with Latent Variables

no code implementations • 27 Aug 2016 • Hossein Hosseini, Sreeram Kannan, Baosen Zhang, Radha Poovendran

We consider the setting where a collection of time series, modeled as random processes, evolve in a causal manner, and one is interested in learning the graph governing the relationships of these processes.

Time Series, Time Series Analysis

Group Event Detection with a Varying Number of Group Members for Video Surveillance

no code implementations • 28 Feb 2015 • Weiyao Lin, Ming-Ting Sun, Radha Poovendran, Zhengyou Zhang

This paper presents a novel approach for automatic recognition of group activities for video surveillance applications.

Action Detection, Activity Detection, +1

Activity Recognition Using A Combination of Category Components And Local Models for Video Surveillance

no code implementations • 28 Feb 2015 • Weiyao Lin, Ming-Ting Sun, Radha Poovendran, Zhengyou Zhang

This paper presents a novel approach for automatic recognition of human activities for video surveillance applications.

Activity Recognition
