Search Results for author: Alexander Robey

Found 24 papers, 15 papers with code

Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation

no code implementations • 28 Mar 2024 • Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter

Prompt engineering is effective for controlling the output of text-to-image (T2I) generative models, but it is also laborious due to the need for manually crafted prompts.

In-Context Learning • Language Modelling • +3

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

1 code implementation • 28 Mar 2024 • Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramer, Hamed Hassani, Eric Wong

To address these challenges, we introduce JailbreakBench, an open-sourced benchmark with the following components: (1) a new jailbreaking dataset containing 100 unique behaviors, which we call JBB-Behaviors; (2) an evolving repository of state-of-the-art adversarial prompts, which we refer to as jailbreak artifacts; (3) a standardized evaluation framework that includes a clearly defined threat model, system prompts, chat templates, and scoring functions; and (4) a leaderboard that tracks the performance of attacks and defenses for various LLMs.
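
As a rough illustration of how the JBB-Behaviors dataset could be consumed, the sketch below loads it with the Hugging Face `datasets` library; the dataset path, configuration name, and split are assumptions for illustration, not identifiers confirmed by the paper.

```python
# Hedged sketch: loading the JBB-Behaviors dataset through Hugging Face `datasets`.
# The repository path, config name, and split below are assumptions, not
# confirmed identifiers from the paper or its code release.
from datasets import load_dataset

behaviors = load_dataset("JailbreakBench/JBB-Behaviors", "behaviors", split="harmful")
for record in behaviors.select(range(3)):
    print(record)  # each record is expected to describe one unique behavior
```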

Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing

1 code implementation • 25 Feb 2024 • Jiabao Ji, Bairu Hou, Alexander Robey, George J. Pappas, Hamed Hassani, Yang Zhang, Eric Wong, Shiyu Chang

Aligned large language models (LLMs) are vulnerable to jailbreaking attacks, which bypass the safeguards of targeted LLMs and fool them into generating objectionable content.

Instruction Following

Data-Driven Modeling and Verification of Perception-Based Autonomous Systems

no code implementations • 11 Dec 2023 • Thomas Waite, Alexander Robey, Hamed Hassani, George J. Pappas, Radoslav Ivanov

This paper addresses the problem of data-driven modeling and verification of perception-based autonomous systems.

Navigate

Jailbreaking Black Box Large Language Models in Twenty Queries

1 code implementation • 12 Oct 2023 • Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong

PAIR -- which is inspired by social engineering attacks -- uses an attacker LLM to automatically generate jailbreaks for a separate targeted LLM without human intervention.
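
A minimal sketch of the attacker-target-judge loop implied by this description; `attacker`, `target`, and `judge` are hypothetical callables standing in for LLM queries rather than the paper's actual interfaces.

```python
# Hedged sketch of a PAIR-style refinement loop. `attacker`, `target`, and
# `judge` are hypothetical callables standing in for LLM API calls.
def pair_attack(goal, attacker, target, judge, max_queries=20, threshold=10):
    history = []
    prompt = goal  # initial candidate jailbreak prompt
    for _ in range(max_queries):
        response = target(prompt)                  # query the black-box target LLM
        score = judge(goal, prompt, response)      # 1-10 rating of jailbreak success
        if score >= threshold:
            return prompt, response                # successful jailbreak found
        history.append((prompt, response, score))
        prompt = attacker(goal, history)           # attacker LLM refines the prompt
    return None, None
```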

SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks

1 code implementation • 5 Oct 2023 • Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas

Despite efforts to align large language models (LLMs) with human values, widely-used LLMs such as GPT, Llama, Claude, and PaLM are susceptible to jailbreaking attacks, wherein an adversary fools a targeted LLM into generating objectionable content.
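
The quoted sentence covers the threat; the defense itself perturbs several copies of the input prompt and aggregates the responses. Below is a minimal sketch under that reading, with random character swaps standing in for the paper's perturbation schemes and hypothetical `generate` and `is_jailbroken` helpers.

```python
# Hedged sketch of a SmoothLLM-style smoothing defense: randomly perturb N copies
# of the prompt, generate a response for each, and majority-vote on safety.
# `generate` and `is_jailbroken` are hypothetical stand-ins.
import random
import string

def perturb(prompt, q=0.1):
    """Swap a fraction q of characters for random printable characters."""
    chars = list(prompt)
    for i in range(len(chars)):
        if random.random() < q:
            chars[i] = random.choice(string.printable)
    return "".join(chars)

def smoothllm(prompt, generate, is_jailbroken, n_copies=10, q=0.1):
    responses = [generate(perturb(prompt, q)) for _ in range(n_copies)]
    flags = [is_jailbroken(r) for r in responses]
    majority_safe = sum(flags) <= n_copies / 2
    # Return a response consistent with the majority vote.
    for r, f in zip(responses, flags):
        if (not f) == majority_safe:
            return r
    return responses[0]
```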

Adversarial Training Should Be Cast as a Non-Zero-Sum Game

no code implementations • 19 Jun 2023 • Alexander Robey, Fabian Latorre, George J. Pappas, Hamed Hassani, Volkan Cevher

One prominent approach toward resolving the adversarial vulnerability of deep neural networks is the two-player zero-sum paradigm of adversarial training, in which predictors are trained against adversarially chosen perturbations of data.
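
For context, here is a minimal PyTorch sketch of the two-player zero-sum paradigm described above, with a PGD inner maximization and a standard outer minimization step; it illustrates the baseline the paper critiques, not its proposed non-zero-sum reformulation.

```python
# Hedged sketch of the standard zero-sum adversarial training step: the inner
# loop (PGD) maximizes the loss over an epsilon-ball, the outer step minimizes
# the loss on the perturbed inputs. Baseline only, not the paper's method.
import torch
import torch.nn.functional as F

def pgd_at_step(model, optimizer, x, y, eps=8/255, alpha=2/255, steps=10):
    # Inner maximization: projected gradient ascent on the perturbation.
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    # Outer minimization: standard gradient step on the adversarial loss.
    optimizer.zero_grad()
    F.cross_entropy(model(x + delta.detach()), y).backward()
    optimizer.step()
```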

Probable Domain Generalization via Quantile Risk Minimization

2 code implementations • 20 Jul 2022 • Cian Eastwood, Alexander Robey, Shashank Singh, Julius von Kügelgen, Hamed Hassani, George J. Pappas, Bernhard Schölkopf

By minimizing the $\alpha$-quantile of a predictor's risk distribution over domains, QRM seeks predictors that perform well with probability $\alpha$.

Domain Generalization
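
A minimal sketch of the $\alpha$-quantile objective, assuming one minibatch per training domain and using `torch.quantile` as a simple stand-in for the paper's estimator.

```python
# Hedged sketch of a quantile-risk objective over domains: compute an empirical
# risk per training domain, then minimize the alpha-quantile of those risks.
import torch
import torch.nn.functional as F

def quantile_risk(model, domain_batches, alpha=0.9):
    risks = torch.stack([
        F.cross_entropy(model(x), y) for x, y in domain_batches  # one risk per domain
    ])
    return torch.quantile(risks, alpha)  # differentiable alpha-quantile of domain risks
```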

Toward Certified Robustness Against Real-World Distribution Shifts

1 code implementation • 8 Jun 2022 • Haoze Wu, Teruhiro Tagomori, Alexander Robey, Fengjun Yang, Nikolai Matni, George Pappas, Hamed Hassani, Corina Pasareanu, Clark Barrett

We consider the problem of certifying the robustness of deep neural networks against real-world distribution shifts.

Chordal Sparsity for Lipschitz Constant Estimation of Deep Neural Networks

1 code implementation • 2 Apr 2022 • Anton Xue, Lars Lindemann, Alexander Robey, Hamed Hassani, George J. Pappas, Rajeev Alur

Lipschitz constants of neural networks allow for guarantees of robustness in image classification, safety in controller design, and generalizability beyond the training data.

Image Classification • Navigate
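
For context, the crudest upper bound on a feed-forward network's Lipschitz constant is the product of per-layer spectral norms; the sketch below computes that naive baseline, which SDP-based estimators such as this paper's tighten (it is not the chordal-sparsity method itself).

```python
# Hedged sketch: a naive Lipschitz upper bound for a feed-forward ReLU network,
# obtained as the product of per-layer spectral norms. A loose baseline only.
import torch
import torch.nn as nn

def naive_lipschitz_bound(model):
    bound = 1.0
    for layer in model.modules():
        if isinstance(layer, nn.Linear):
            # Largest singular value of the weight matrix (spectral norm).
            bound *= torch.linalg.matrix_norm(layer.weight, ord=2).item()
    return bound

net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
print(naive_lipschitz_bound(net))
```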

Do Deep Networks Transfer Invariances Across Classes?

1 code implementation • ICLR 2022 • Allan Zhou, Fahim Tajwar, Alexander Robey, Tom Knowles, George J. Pappas, Hamed Hassani, Chelsea Finn

Based on this analysis, we show how a generative approach for learning the nuisance transformations can help transfer invariances across classes and improve performance on a set of imbalanced image classification benchmarks.

Image Classification • Long-tail Learning
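
A rough sketch of the augmentation idea under one plausible reading: a generative model of nuisance transformations, trained mostly on well-represented classes, is sampled to augment rare-class examples. `nuisance_generator` and its `latent_dim` attribute are hypothetical placeholders.

```python
# Hedged sketch of transferring invariances via a learned transformation model:
# sampled nuisance transformations are applied to tail-class examples as
# augmentation. `nuisance_generator` is a hypothetical stand-in network.
import torch

def augment_tail_batch(x_tail, nuisance_generator, n_aug=4):
    augmented = [x_tail]
    for _ in range(n_aug):
        z = torch.randn(x_tail.size(0), nuisance_generator.latent_dim)
        augmented.append(nuisance_generator(x_tail, z))  # apply a sampled transformation
    return torch.cat(augmented, dim=0)
```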

Probabilistically Robust Learning: Balancing Average- and Worst-case Performance

1 code implementation • 2 Feb 2022 • Alexander Robey, Luiz F. O. Chamon, George J. Pappas, Hamed Hassani

From a theoretical point of view, this framework overcomes the trade-offs between the performance and the sample-complexity of worst-case and average-case learning.
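
A minimal sketch of the probabilistic relaxation this line of work suggests: rather than the worst-case loss over an $\epsilon$-ball, penalize a high quantile of the loss under random perturbations (the quantile estimator here is a simple stand-in, not the paper's exact algorithm).

```python
# Hedged sketch of probabilistically robust training: sample random perturbations
# in the epsilon-ball and minimize a high quantile of the resulting losses,
# interpolating between average-case augmentation and worst-case training.
import torch
import torch.nn.functional as F

def prob_robust_loss(model, x, y, eps=8/255, n_samples=8, rho=0.9):
    losses = []
    for _ in range(n_samples):
        delta = torch.empty_like(x).uniform_(-eps, eps)  # random, not adversarial, noise
        losses.append(F.cross_entropy(model(x + delta), y, reduction="none"))
    losses = torch.stack(losses, dim=0)                  # shape: (n_samples, batch)
    return torch.quantile(losses, rho, dim=0).mean()     # per-example rho-quantile
```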

Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations

1 code implementation • 18 Nov 2021 • Lars Lindemann, Alexander Robey, Lejun Jiang, Satyajeet Das, Stephen Tu, Nikolai Matni

Along with the optimization problem, we provide verifiable conditions in terms of the density of the data, smoothness of the system model and state estimator, and the size of the error bounds that guarantee validity of the obtained ROCBF.

Autonomous Driving

Adversarial Robustness with Semi-Infinite Constrained Learning

no code implementations • NeurIPS 2021 • Alexander Robey, Luiz F. O. Chamon, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

In particular, we leverage semi-infinite optimization and non-convex duality theory to show that adversarial training is equivalent to a statistical problem over perturbation distributions, which we characterize completely.

Adversarial Robustness
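
Schematically, the equivalence relates the pointwise min-max problem to an optimization over distributions on the perturbation set; a hedged LaTeX rendering with assumed notation (not taken verbatim from the paper) follows.

```latex
% Hedged sketch with assumed notation.
% Standard adversarial training (inner max over perturbations):
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \max_{\|\delta\|\le\epsilon} \ell\big(f_\theta(x+\delta),\, y\big) \Big]
% Its distributional counterpart (sup over distributions on the perturbation set):
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \sup_{\mu \in \mathcal{P}(\Delta_\epsilon)} \mathbb{E}_{\delta\sim\mu}\,
  \ell\big(f_\theta(x+\delta),\, y\big) \Big]
```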

Model-Based Domain Generalization

1 code implementation • NeurIPS 2021 • Alexander Robey, George J. Pappas, Hamed Hassani

Despite remarkable success in a variety of applications, it is well-known that deep learning can fail catastrophically when presented with out-of-distribution data.

Domain Generalization

On the Sample Complexity of Stability Constrained Imitation Learning

no code implementations • 18 Feb 2021 • Stephen Tu, Alexander Robey, Tingnan Zhang, Nikolai Matni

We study the following question in the context of imitation learning for continuous control: how are the underlying stability properties of an expert policy reflected in the sample-complexity of an imitation learning task?

Continuous Control • Generalization Bounds • +1

Learning Robust Hybrid Control Barrier Functions for Uncertain Systems

1 code implementation • 16 Jan 2021 • Alexander Robey, Lars Lindemann, Stephen Tu, Nikolai Matni

We identify sufficient conditions on the data such that feasibility of the optimization problem ensures correctness of the learned robust hybrid control barrier functions.

Learning Hybrid Control Barrier Functions from Data

no code implementations • 8 Nov 2020 • Lars Lindemann, Haimin Hu, Alexander Robey, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Motivated by the lack of systematic tools to obtain safe control laws for hybrid systems, we propose an optimization-based framework for learning certifiably safe control laws from data.

Provable tradeoffs in adversarially robust classification

no code implementations • 9 Jun 2020 • Edgar Dobriban, Hamed Hassani, David Hong, Alexander Robey

It is well known that machine learning methods can be vulnerable to adversarially-chosen perturbations of their inputs.

Classification • General Classification • +1

Model-Based Robust Deep Learning: Generalizing to Natural, Out-of-Distribution Data

1 code implementation • 20 May 2020 • Alexander Robey, Hamed Hassani, George J. Pappas

Indeed, natural variation such as lighting or weather conditions can significantly degrade the accuracy of trained neural networks, proving that such natural variation presents a significant challenge for deep learning.

Adversarial Robustness

Learning Control Barrier Functions from Expert Demonstrations

1 code implementation • 7 Apr 2020 • Alexander Robey, Haimin Hu, Lars Lindemann, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Furthermore, if the CBF parameterization is convex, then under mild assumptions, so is our learning process.
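
To illustrate why a convex CBF parameterization yields a convex learning problem, here is a hedged cvxpy sketch with a linear-in-parameters barrier $h(x) = \theta^\top \phi(x)$; the feature map, margins, and sampled dynamics terms are illustrative placeholders, not the paper's exact formulation.

```python
# Hedged sketch: learning a linear-in-parameters CBF h(x) = theta^T phi(x) from
# labeled data. With this parameterization every constraint below is affine in
# theta, so the learning problem is convex. All inputs are placeholders.
import cvxpy as cp

def learn_cbf(phi_safe, phi_unsafe, lie_derivative_rows, gamma=0.1):
    # phi_safe:  (n_safe, d) features of states labeled safe
    # phi_unsafe: (n_unsafe, d) features of states labeled unsafe
    # lie_derivative_rows: (n_safe, d) rows such that row @ theta approximates
    #   d/dt h(x) along the demonstrated closed-loop trajectories
    d = phi_safe.shape[1]
    theta = cp.Variable(d)
    constraints = [
        phi_safe @ theta >= gamma,                        # h >= gamma on safe data
        phi_unsafe @ theta <= -gamma,                     # h <= -gamma on unsafe data
        lie_derivative_rows @ theta >= -phi_safe @ theta  # hdot >= -alpha*h with alpha = 1
    ]
    problem = cp.Problem(cp.Minimize(cp.norm(theta, 2)), constraints)
    problem.solve()
    return theta.value
```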

Optimal Algorithms for Submodular Maximization with Distributed Constraints

no code implementations • 30 Sep 2019 • Alexander Robey, Arman Adibi, Brent Schlotfeldt, George J. Pappas, Hamed Hassani

Given this distributed setting, we develop Constraint-Distributed Continuous Greedy (CDCG), a message passing algorithm that converges to the tight $(1-1/e)$ approximation factor of the optimum global solution using only local computation and communication.
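
For reference, the classical centralized greedy algorithm already attains the $(1-1/e)$ factor for monotone submodular maximization under a cardinality constraint; the sketch below shows that baseline and is not the distributed, message-passing CDCG algorithm.

```python
# Hedged sketch: classical centralized greedy for maximizing a monotone
# submodular set function under a cardinality constraint (attains 1 - 1/e).
def greedy_submodular(ground_set, f, k):
    """f: callable mapping a frozenset to a real value; k: cardinality budget."""
    selected = frozenset()
    for _ in range(k):
        gains = {e: f(selected | {e}) - f(selected)
                 for e in ground_set if e not in selected}
        if not gains:
            break
        best = max(gains, key=gains.get)  # element with the largest marginal gain
        if gains[best] <= 0:
            break
        selected = selected | {best}
    return selected
```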

Efficient and Accurate Estimation of Lipschitz Constants for Deep Neural Networks

1 code implementation • NeurIPS 2019 • Mahyar Fazlyab, Alexander Robey, Hamed Hassani, Manfred Morari, George J. Pappas

The resulting SDP can be adapted to increase either the estimation accuracy (by capturing the interaction between activation functions of different layers) or scalability (by decomposition and parallel implementation).
