Search Results for author: Matthew Hull

Found 5 papers, 4 papers with code

Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models

no code implementations • 27 May 2024 • Shengyun Peng, Pin-Yu Chen, Matthew Hull, Duen Horng Chau

Safety alignment is key to guiding the behaviors of large language models (LLMs) to be in line with human preferences and to restrict harmful behaviors at inference time, but recent studies show that it can be easily compromised by finetuning with only a few adversarially designed training examples.

Safety Alignment

REVAMP: Automated Simulations of Adversarial Attacks on Arbitrary Objects in Realistic Scenes

1 code implementation • 18 Oct 2023 • Matthew Hull, Zijie J. Wang, Duen Horng Chau

Generating these adversarial objects in the digital space has been extensively studied; however, successfully transferring these attacks from the digital realm to the physical realm has proven challenging when controlling for real-world environmental factors. (A generic sketch of gradient-based digital-space attack generation follows below.)

Object
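
To make the "digital space" part of this concrete, here is a minimal, generic sketch of gradient-based adversarial example generation (plain PGD on image pixels in PyTorch). This is not the REVAMP pipeline itself, which optimizes object textures through differentiable rendering in realistic scenes; the model choice, epsilon, and step count below are illustrative assumptions.

```python
# Minimal PGD sketch: maximize classification loss within an L-infinity ball.
# Illustrative only -- not the REVAMP rendering-based attack.
import torch
import torchvision.models as models

# Untrained weights keep the demo self-contained; use pretrained weights in practice.
model = models.resnet18(weights=None).eval()

def pgd_attack(x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iteratively perturb x to increase the model's loss, staying within eps of x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                 # ascend the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)            # project into eps-ball
            x_adv = x_adv.clamp(0, 1)                           # keep valid pixel range
    return x_adv.detach()

# Usage: x is an image batch scaled to [0, 1], y the true labels.
x = torch.rand(1, 3, 224, 224)
y = torch.tensor([0])
x_adv = pgd_attack(x, y)
```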

Robust Principles: Architectural Design Principles for Adversarially Robust CNNs

1 code implementation • 30 Aug 2023 • Shengyun Peng, Weilin Xu, Cory Cornelius, Matthew Hull, Kevin Li, Rahul Duggal, Mansi Phute, Jason Martin, Duen Horng Chau

Our research aims to unify existing works' diverging opinions on how architectural components affect the adversarial robustness of CNNs.

Adversarial Robustness

LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked

1 code implementation • 14 Aug 2023 • Mansi Phute, Alec Helbling, Matthew Hull, Shengyun Peng, Sebastian Szyller, Cory Cornelius, Duen Horng Chau

We test LLM Self Defense on GPT 3.5 and Llama 2, two of the most prominent current LLMs, against various types of attacks, such as forcefully inducing affirmative responses to prompts and prompt engineering attacks. (A minimal sketch of the self-examination idea follows below.)

Language Modelling Large Language Model +2
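
Below is a minimal sketch of the self-examination idea: a second pass asks an LLM whether a candidate response is harmful before it is returned to the user. The `generate` callable is a hypothetical stand-in for whatever chat-completion call you use (GPT-3.5, Llama 2, etc.), and the filter prompt wording is an assumption, not the paper's verbatim prompt.

```python
# Self-examination filter sketch: generate a response, then ask an LLM
# whether that response is harmful, and suppress it if so.
from typing import Callable

def self_defense_filter(user_prompt: str, generate: Callable[[str], str]) -> str:
    # 1) Let the target LLM answer the (possibly adversarial) prompt.
    candidate = generate(user_prompt)

    # 2) Ask an LLM to examine the candidate answer for harmful content.
    check_prompt = (
        "Does the following text contain harmful, dangerous, or unethical content? "
        "Answer strictly Yes or No.\n\n"
        f"Text: {candidate}"
    )
    verdict = generate(check_prompt).strip().lower()

    # 3) Withhold the response if the examiner flags it as harmful.
    if verdict.startswith("yes"):
        return "[response withheld: flagged as potentially harmful]"
    return candidate

# Toy usage with a stand-in generator (no real API calls).
if __name__ == "__main__":
    def toy_generate(prompt: str) -> str:
        if prompt.startswith("Does the following"):
            return "No"
        return "Sure, here is a harmless answer."

    print(self_defense_filter("Tell me a joke.", toy_generate))
```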

DetectorDetective: Investigating the Effects of Adversarial Examples on Object Detectors

1 code implementation • CVPR 2022 • Sivapriya Vellaichamy, Matthew Hull, Zijie J. Wang, Nilaksh Das, Shengyun Peng, Haekyu Park, Duen Horng (Polo) Chau

With deep learning-based systems performing exceedingly well in many vision-related tasks, a major concern with their widespread deployment, especially in safety-critical applications, is their susceptibility to adversarial attacks.

Object object-detection +2
