Search Results for author: Alexander Robey

Found 24 papers, 15 papers with code

Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation

no code implementations • 28 Mar 2024 • Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter

Prompt engineering is effective for controlling the output of text-to-image (T2I) generative models, but it is also laborious due to the need for manually crafted prompts.

In-Context Learning • Language Modelling • +3

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

1 code implementation • 28 Mar 2024 • Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramer, Hamed Hassani, Eric Wong

To address these challenges, we introduce JailbreakBench, an open-sourced benchmark with the following components: (1) a new jailbreaking dataset containing 100 unique behaviors, which we call JBB-Behaviors; (2) an evolving repository of state-of-the-art adversarial prompts, which we refer to as jailbreak artifacts; (3) a standardized evaluation framework that includes a clearly defined threat model, system prompts, chat templates, and scoring functions; and (4) a leaderboard that tracks the performance of attacks and defenses for various LLMs.
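
As a rough illustration of how the JBB-Behaviors dataset could be consumed, the sketch below loads it with the Hugging Face `datasets` library; the dataset path, configuration name, and split are assumptions for illustration, not identifiers confirmed by the paper.

```python
# Hedged sketch: loading the JBB-Behaviors dataset through Hugging Face `datasets`.
# The repository path, config name, and split below are assumptions, not
# confirmed identifiers from the paper or its code release.
from datasets import load_dataset

behaviors = load_dataset("JailbreakBench/JBB-Behaviors", "behaviors", split="harmful")
for record in behaviors.select(range(3)):
    print(record)  # each record is expected to describe one unique behavior
```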

Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing

1 code implementation • 25 Feb 2024 • Jiabao Ji, Bairu Hou, Alexander Robey, George J. Pappas, Hamed Hassani, Yang Zhang, Eric Wong, Shiyu Chang

Aligned large language models (LLMs) are vulnerable to jailbreaking attacks, which bypass the safeguards of targeted LLMs and fool them into generating objectionable content.

Instruction Following

Data-Driven Modeling and Verification of Perception-Based Autonomous Systems

no code implementations • 11 Dec 2023 • Thomas Waite, Alexander Robey, Hamed Hassani, George J. Pappas, Radoslav Ivanov

This paper addresses the problem of data-driven modeling and verification of perception-based autonomous systems.

Navigate

Jailbreaking Black Box Large Language Models in Twenty Queries

1 code implementation • 12 Oct 2023 • Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong

PAIR -- which is inspired by social engineering attacks -- uses an attacker LLM to automatically generate jailbreaks for a separate targeted LLM without human intervention.
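
A minimal sketch of the attacker-target-judge loop implied by this description; `attacker`, `target`, and `judge` are hypothetical callables standing in for LLM queries rather than the paper's actual interfaces.

```python
# Hedged sketch of a PAIR-style refinement loop. `attacker`, `target`, and
# `judge` are hypothetical callables standing in for LLM API calls.
def pair_attack(goal, attacker, target, judge, max_queries=20, threshold=10):
    history = []
    prompt = goal  # initial candidate jailbreak prompt
    for _ in range(max_queries):
        response = target(prompt)                  # query the black-box target LLM
        score = judge(goal, prompt, response)      # 1-10 rating of jailbreak success
        if score >= threshold:
            return prompt, response                # successful jailbreak found
        history.append((prompt, response, score))
        prompt = attacker(goal, history)           # attacker LLM refines the prompt
    return None, None
```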

SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks

1 code implementation • 5 Oct 2023 • Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas

Despite efforts to align large language models (LLMs) with human values, widely-used LLMs such as GPT, Llama, Claude, and PaLM are susceptible to jailbreaking attacks, wherein an adversary fools a targeted LLM into generating objectionable content.
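
The quoted sentence covers the threat; the defense itself perturbs several copies of the input prompt and aggregates the responses. Below is a minimal sketch under that reading, with random character swaps standing in for the paper's perturbation schemes and hypothetical `generate` and `is_jailbroken` helpers.

```python
# Hedged sketch of a SmoothLLM-style smoothing defense: randomly perturb N copies
# of the prompt, generate a response for each, and majority-vote on safety.
# `generate` and `is_jailbroken` are hypothetical stand-ins.
import random
import string

def perturb(prompt, q=0.1):
    """Swap a fraction q of characters for random printable characters."""
    chars = list(prompt)
    for i in range(len(chars)):
        if random.random() < q:
            chars[i] = random.choice(string.printable)
    return "".join(chars)

def smoothllm(prompt, generate, is_jailbroken, n_copies=10, q=0.1):
    responses = [generate(perturb(prompt, q)) for _ in range(n_copies)]
    flags = [is_jailbroken(r) for r in responses]
    majority_safe = sum(flags) <= n_copies / 2
    # Return a response consistent with the majority vote.
    for r, f in zip(responses, flags):
        if (not f) == majority_safe:
            return r
    return responses[0]
```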

Adversarial Training Should Be Cast as a Non-Zero-Sum Game

no code implementations • 19 Jun 2023 • Alexander Robey, Fabian Latorre, George J. Pappas, Hamed Hassani, Volkan Cevher

One prominent approach toward resolving the adversarial vulnerability of deep neural networks is the two-player zero-sum paradigm of adversarial training, in which predictors are trained against adversarially chosen perturbations of data.
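
For context, here is a minimal PyTorch sketch of the two-player zero-sum paradigm described above, with a PGD inner maximization and a standard outer minimization step; it illustrates the baseline the paper critiques, not its proposed non-zero-sum reformulation.

```python
# Hedged sketch of the standard zero-sum adversarial training step: the inner
# loop (PGD) maximizes the loss over an epsilon-ball, the outer step minimizes
# the loss on the perturbed inputs. Baseline only, not the paper's method.
import torch
import torch.nn.functional as F

def pgd_at_step(model, optimizer, x, y, eps=8/255, alpha=2/255, steps=10):
    # Inner maximization: projected gradient ascent on the perturbation.
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    # Outer minimization: standard gradient step on the adversarial loss.
    optimizer.zero_grad()
    F.cross_entropy(model(x + delta.detach()), y).backward()
    optimizer.step()
```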

Probable Domain Generalization via Quantile Risk Minimization

2 code implementations • 20 Jul 2022 • Cian Eastwood, Alexander Robey, Shashank Singh, Julius von Kügelgen, Hamed Hassani, George J. Pappas, Bernhard Schölkopf

By minimizing the $\alpha$-quantile of a predictor's risk distribution over domains, QRM seeks predictors that perform well with probability $\alpha$.

Domain Generalization
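
A minimal sketch of the $\alpha$-quantile objective, assuming one minibatch per training domain and using `torch.quantile` as a simple stand-in for the paper's estimator.

```python
# Hedged sketch of a quantile-risk objective over domains: compute an empirical
# risk per training domain, then minimize the alpha-quantile of those risks.
import torch
import torch.nn.functional as F

def quantile_risk(model, domain_batches, alpha=0.9):
    risks = torch.stack([
        F.cross_entropy(model(x), y) for x, y in domain_batches  # one risk per domain
    ])
    return torch.quantile(risks, alpha)  # differentiable alpha-quantile of domain risks
```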

Toward Certified Robustness Against Real-World Distribution Shifts

1 code implementation • 8 Jun 2022 • Haoze Wu, Teruhiro Tagomori, Alexander Robey, Fengjun Yang, Nikolai Matni, George Pappas, Hamed Hassani, Corina Pasareanu, Clark Barrett

We consider the problem of certifying the robustness of deep neural networks against real-world distribution shifts.

Chordal Sparsity for Lipschitz Constant Estimation of Deep Neural Networks

1 code implementation • 2 Apr 2022 • Anton Xue, Lars Lindemann, Alexander Robey, Hamed Hassani, George J. Pappas, Rajeev Alur

Lipschitz constants of neural networks allow for guarantees of robustness in image classification, safety in controller design, and generalizability beyond the training data.

Image Classification • Navigate
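
For context, the crudest upper bound on a feed-forward network's Lipschitz constant is the product of per-layer spectral norms; the sketch below computes that naive baseline, which SDP-based estimators such as this paper's tighten (it is not the chordal-sparsity method itself).

```python
# Hedged sketch: a naive Lipschitz upper bound for a feed-forward ReLU network,
# obtained as the product of per-layer spectral norms. A loose baseline only.
import torch
import torch.nn as nn

def naive_lipschitz_bound(model):
    bound = 1.0
    for layer in model.modules():
        if isinstance(layer, nn.Linear):
            # Largest singular value of the weight matrix (spectral norm).
            bound *= torch.linalg.matrix_norm(layer.weight, ord=2).item()
    return bound

net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
print(naive_lipschitz_bound(net))
```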

Do Deep Networks Transfer Invariances Across Classes?

1 code implementation • ICLR 2022 • Allan Zhou, Fahim Tajwar, Alexander Robey, Tom Knowles, George J. Pappas, Hamed Hassani, Chelsea Finn

Based on this analysis, we show how a generative approach for learning the nuisance transformations can help transfer invariances across classes and improve performance on a set of imbalanced image classification benchmarks.

Image Classification • Long-tail Learning
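
A rough sketch of the augmentation idea under one plausible reading: a generative model of nuisance transformations, trained mostly on well-represented classes, is sampled to augment rare-class examples. `nuisance_generator` and its `latent_dim` attribute are hypothetical placeholders.

```python
# Hedged sketch of transferring invariances via a learned transformation model:
# sampled nuisance transformations are applied to tail-class examples as
# augmentation. `nuisance_generator` is a hypothetical stand-in network.
import torch

def augment_tail_batch(x_tail, nuisance_generator, n_aug=4):
    augmented = [x_tail]
    for _ in range(n_aug):
        z = torch.randn(x_tail.size(0), nuisance_generator.latent_dim)
        augmented.append(nuisance_generator(x_tail, z))  # apply a sampled transformation
    return torch.cat(augmented, dim=0)
```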

Probabilistically Robust Learning: Balancing Average- and Worst-case Performance

1 code implementation • 2 Feb 2022 • Alexander Robey, Luiz F. O. Chamon, George J. Pappas, Hamed Hassani

From a theoretical point of view, this framework overcomes the trade-offs between the performance and the sample-complexity of worst-case and average-case learning.
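
A minimal sketch of the probabilistic relaxation this line of work suggests: rather than the worst-case loss over an $\epsilon$-ball, penalize a high quantile of the loss under random perturbations (the quantile estimator here is a simple stand-in, not the paper's exact algorithm).

```python
# Hedged sketch of probabilistically robust training: sample random perturbations
# in the epsilon-ball and minimize a high quantile of the resulting losses,
# interpolating between average-case augmentation and worst-case training.
import torch
import torch.nn.functional as F

def prob_robust_loss(model, x, y, eps=8/255, n_samples=8, rho=0.9):
    losses = []
    for _ in range(n_samples):
        delta = torch.empty_like(x).uniform_(-eps, eps)  # random, not adversarial, noise
        losses.append(F.cross_entropy(model(x + delta), y, reduction="none"))
    losses = torch.stack(losses, dim=0)                  # shape: (n_samples, batch)
    return torch.quantile(losses, rho, dim=0).mean()     # per-example rho-quantile
```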

Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations

1 code implementation • 18 Nov 2021 • Lars Lindemann, Alexander Robey, Lejun Jiang, Satyajeet Das, Stephen Tu, Nikolai Matni

Along with the optimization problem, we provide verifiable conditions in terms of the density of the data, smoothness of the system model and state estimator, and the size of the error bounds that guarantee validity of the obtained ROCBF.

Autonomous Driving

Adversarial Robustness with Semi-Infinite Constrained Learning

no code implementations • NeurIPS 2021 • Alexander Robey, Luiz F. O. Chamon, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

In particular, we leverage semi-infinite optimization and non-convex duality theory to show that adversarial training is equivalent to a statistical problem over perturbation distributions, which we characterize completely.

Adversarial Robustness
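
Schematically, the equivalence relates the pointwise min-max problem to an optimization over distributions on the perturbation set; a hedged LaTeX rendering with assumed notation (not taken verbatim from the paper) follows.

```latex
% Hedged sketch with assumed notation.
% Standard adversarial training (inner max over perturbations):
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \max_{\|\delta\|\le\epsilon} \ell\big(f_\theta(x+\delta),\, y\big) \Big]
% Its distributional counterpart (sup over distributions on the perturbation set):
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \sup_{\mu \in \mathcal{P}(\Delta_\epsilon)} \mathbb{E}_{\delta\sim\mu}\,
  \ell\big(f_\theta(x+\delta),\, y\big) \Big]
```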

Model-Based Domain Generalization

1 code implementation • NeurIPS 2021 • Alexander Robey, George J. Pappas, Hamed Hassani

Despite remarkable success in a variety of applications, it is well-known that deep learning can fail catastrophically when presented with out-of-distribution data.

Domain Generalization

On the Sample Complexity of Stability Constrained Imitation Learning

no code implementations • 18 Feb 2021 • Stephen Tu, Alexander Robey, Tingnan Zhang, Nikolai Matni

We study the following question in the context of imitation learning for continuous control: how are the underlying stability properties of an expert policy reflected in the sample-complexity of an imitation learning task?

Continuous Control • Generalization Bounds • +1

Learning Robust Hybrid Control Barrier Functions for Uncertain Systems

1 code implementation • 16 Jan 2021 • Alexander Robey, Lars Lindemann, Stephen Tu, Nikolai Matni

We identify sufficient conditions on the data such that feasibility of the optimization problem ensures correctness of the learned robust hybrid control barrier functions.

Learning Hybrid Control Barrier Functions from Data

no code implementations • 8 Nov 2020 • Lars Lindemann, Haimin Hu, Alexander Robey, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Motivated by the lack of systematic tools to obtain safe control laws for hybrid systems, we propose an optimization-based framework for learning certifiably safe control laws from data.

Provable tradeoffs in adversarially robust classification

no code implementations • 9 Jun 2020 • Edgar Dobriban, Hamed Hassani, David Hong, Alexander Robey

It is well known that machine learning methods can be vulnerable to adversarially-chosen perturbations of their inputs.

Classification • General Classification • +1

Model-Based Robust Deep Learning: Generalizing to Natural, Out-of-Distribution Data

1 code implementation • 20 May 2020 • Alexander Robey, Hamed Hassani, George J. Pappas

Indeed, natural variation such as lighting or weather conditions can significantly degrade the accuracy of trained neural networks, proving that such natural variation presents a significant challenge for deep learning.

Adversarial Robustness

Learning Control Barrier Functions from Expert Demonstrations

1 code implementation • 7 Apr 2020 • Alexander Robey, Haimin Hu, Lars Lindemann, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Furthermore, if the CBF parameterization is convex, then under mild assumptions, so is our learning process.
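
To illustrate why a convex CBF parameterization yields a convex learning problem, here is a hedged cvxpy sketch with a linear-in-parameters barrier $h(x) = \theta^\top \phi(x)$; the feature map, margins, and sampled dynamics terms are illustrative placeholders, not the paper's exact formulation.

```python
# Hedged sketch: learning a linear-in-parameters CBF h(x) = theta^T phi(x) from
# labeled data. With this parameterization every constraint below is affine in
# theta, so the learning problem is convex. All inputs are placeholders.
import cvxpy as cp

def learn_cbf(phi_safe, phi_unsafe, lie_derivative_rows, gamma=0.1):
    # phi_safe:  (n_safe, d) features of states labeled safe
    # phi_unsafe: (n_unsafe, d) features of states labeled unsafe
    # lie_derivative_rows: (n_safe, d) rows such that row @ theta approximates
    #   d/dt h(x) along the demonstrated closed-loop trajectories
    d = phi_safe.shape[1]
    theta = cp.Variable(d)
    constraints = [
        phi_safe @ theta >= gamma,                        # h >= gamma on safe data
        phi_unsafe @ theta <= -gamma,                     # h <= -gamma on unsafe data
        lie_derivative_rows @ theta >= -phi_safe @ theta  # hdot >= -alpha*h with alpha = 1
    ]
    problem = cp.Problem(cp.Minimize(cp.norm(theta, 2)), constraints)
    problem.solve()
    return theta.value
```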

Optimal Algorithms for Submodular Maximization with Distributed Constraints

no code implementations • 30 Sep 2019 • Alexander Robey, Arman Adibi, Brent Schlotfeldt, George J. Pappas, Hamed Hassani

Given this distributed setting, we develop Constraint-Distributed Continuous Greedy (CDCG), a message passing algorithm that converges to the tight $(1-1/e)$ approximation factor of the optimum global solution using only local computation and communication.
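
For reference, the classical centralized greedy algorithm already attains the $(1-1/e)$ factor for monotone submodular maximization under a cardinality constraint; the sketch below shows that baseline and is not the distributed, message-passing CDCG algorithm.

```python
# Hedged sketch: classical centralized greedy for maximizing a monotone
# submodular set function under a cardinality constraint (attains 1 - 1/e).
def greedy_submodular(ground_set, f, k):
    """f: callable mapping a frozenset to a real value; k: cardinality budget."""
    selected = frozenset()
    for _ in range(k):
        gains = {e: f(selected | {e}) - f(selected)
                 for e in ground_set if e not in selected}
        if not gains:
            break
        best = max(gains, key=gains.get)  # element with the largest marginal gain
        if gains[best] <= 0:
            break
        selected = selected | {best}
    return selected
```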

Efficient and Accurate Estimation of Lipschitz Constants for Deep Neural Networks

1 code implementation • NeurIPS 2019 • Mahyar Fazlyab, Alexander Robey, Hamed Hassani, Manfred Morari, George J. Pappas

The resulting SDP can be adapted to increase either the estimation accuracy (by capturing the interaction between activation functions of different layers) or scalability (by decomposition and parallel implementation).
