Search Results for author: David Dobre

Found 7 papers, 2 papers with code

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space

no code implementations14 Feb 2024 Leo Schwinn, David Dobre, Sophie Xhonneux, Gauthier Gidel, Stephan Günnemann

We address this research gap and propose the embedding space attack, which directly attacks the continuous embedding representation of input tokens.

Adversarial Robustness
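The idea behind the embedding-space attack can be illustrated with a toy sketch: instead of searching over discrete tokens, the attacker takes gradient-ascent steps directly on the continuous input embedding. The `score_fn`/`grad_fn` callables here are hypothetical stand-ins for a real LLM's adversarial objective and its gradient, not the paper's actual implementation.

```python
def embedding_space_attack(embedding, score_fn, grad_fn, steps=50, lr=0.1):
    """Toy sketch: gradient ascent on the continuous embedding vector
    to maximize an adversarial objective, bypassing discrete token search.
    score_fn / grad_fn stand in for a model's loss and its gradient."""
    x = list(embedding)
    for _ in range(steps):
        g = grad_fn(x)
        x = [xi + lr * gi for xi, gi in zip(x, g)]
    return x

# Usage with a toy linear objective score(x) = <x, w>:
w = [1.0, -1.0]
score_fn = lambda x: sum(xi * wi for xi, wi in zip(x, w))
grad_fn = lambda x: w  # gradient of a linear objective is constant
x_adv = embedding_space_attack([0.0, 0.0], score_fn, grad_fn)
```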

In-Context Learning Can Re-learn Forbidden Tasks

no code implementations8 Feb 2024 Sophie Xhonneux, David Dobre, Jian Tang, Gauthier Gidel, Dhanya Sridhar

Specifically, we investigate whether in-context learning (ICL) can be used to re-learn forbidden tasks despite the explicit fine-tuning of the model to refuse them.

In-Context Learning, Misinformation
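The in-context learning setup studied here amounts to prepending task demonstrations to a query and checking whether a refusal-tuned model recovers the task from context. A minimal sketch of such prompt construction (the formatting template is illustrative, not the paper's exact one):

```python
def build_icl_prompt(demonstrations, query):
    """Build a few-shot prompt: task demonstrations followed by a new
    query. Used to probe whether in-context examples can elicit a task
    the model was fine-tuned to refuse. Template is illustrative only."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in demonstrations)
    return f"{shots}\n\nInput: {query}\nOutput:"

# Usage:
prompt = build_icl_prompt([("2+2", "4"), ("5+1", "6")], "3+3")
```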

Adversarial Attacks and Defenses in Large Language Models: Old and New Threats

1 code implementation30 Oct 2023 Leo Schwinn, David Dobre, Stephan Günnemann, Gauthier Gidel

Here, one major impediment has been the overestimation of the robustness of new defense approaches due to faulty defense evaluations.

Raising the Bar for Certified Adversarial Robustness with Diffusion Models

no code implementations17 May 2023 Thomas Altstidl, David Dobre, Björn Eskofier, Gauthier Gidel, Leo Schwinn

In this work, we demonstrate that a similar approach can substantially improve deterministic certified defenses.

Adversarial Robustness

Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features

no code implementations23 Apr 2023 Aleksandr Beznosikov, David Dobre, Gauthier Gidel

Moreover, our second approach does not require either large batches or full deterministic gradients, which is a typical weakness of many techniques for finite-sum problems.
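For context, the classic Frank-Wolfe method handles constraints via a linear minimization oracle (LMO) rather than a projection; each iterate is a convex combination of the previous point and an LMO output. This is a sketch of the vanilla method only, not the SARAH-based variants the paper proposes.

```python
def frank_wolfe(grad_fn, lmo, x0, steps=100):
    """Vanilla Frank-Wolfe: each iteration calls a linear minimization
    oracle over the constraint set (argmin_{s in C} <grad, s>) and moves
    toward it with the standard 2/(t+2) step size. Iterates stay feasible
    as convex combinations of feasible points."""
    x = list(x0)
    for t in range(steps):
        g = grad_fn(x)
        s = lmo(g)                   # feasible point minimizing <g, s>
        gamma = 2.0 / (t + 2.0)     # standard open-loop step size
        x = [(1 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x

# Usage: minimize ||x - c||^2 over the probability simplex,
# where the LMO just returns the vertex with the smallest gradient entry.
c = [0.2, 0.8]
grad_fn = lambda x: [2.0 * (xi - ci) for xi, ci in zip(x, c)]
def simplex_lmo(g):
    j = min(range(len(g)), key=lambda i: g[i])
    s = [0.0] * len(g)
    s[j] = 1.0
    return s
x_star = frank_wolfe(grad_fn, simplex_lmo, [1.0, 0.0], steps=500)
```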

Dissecting adaptive methods in GANs

no code implementations9 Oct 2022 Samy Jelassi, David Dobre, Arthur Mensch, Yuanzhi Li, Gauthier Gidel

By considering an update rule that combines the magnitude of the Adam update with the normalized direction of SGD, we empirically show that Adam's adaptive magnitude is key for GAN training.
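The hybrid update rule described above can be sketched as follows: compute a standard Adam update to get its global magnitude, then step along the raw gradient direction normalized to unit length. This is an illustrative reconstruction under that reading; the paper's exact rule and hyperparameters may differ.

```python
import math

def adam_magnitude_sgd_direction_step(params, grads, m, v, t,
                                      lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One step of a hybrid update: the norm of the Adam update sets the
    step magnitude, while the (normalized) SGD gradient sets the direction.
    Illustrative sketch of the rule described in the abstract."""
    adam_update = []
    for i, g in enumerate(grads):
        # Standard Adam moment estimates with bias correction
        m[i] = b1 * m[i] + (1 - b1) * g
        v[i] = b2 * v[i] + (1 - b2) * g * g
        m_hat = m[i] / (1 - b1 ** t)
        v_hat = v[i] / (1 - b2 ** t)
        adam_update.append(lr * m_hat / (math.sqrt(v_hat) + eps))
    # Magnitude from Adam, direction from the raw gradient (SGD)
    adam_norm = math.sqrt(sum(u * u for u in adam_update))
    grad_norm = math.sqrt(sum(g * g for g in grads)) or 1.0
    return [p - adam_norm * (g / grad_norm) for p, g in zip(params, grads)]
```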

Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise

1 code implementation2 Jun 2022 Eduard Gorbunov, Marina Danilova, David Dobre, Pavel Dvurechensky, Alexander Gasnikov, Gauthier Gidel

In this work, we prove the first high-probability complexity results with logarithmic dependence on the confidence level for stochastic methods for solving monotone and structured non-monotone VIPs with non-sub-Gaussian (heavy-tailed) noise and unbounded domains.
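The core tool for coping with heavy-tailed noise in this line of work is the gradient clipping operator, clip(g, λ) = min(1, λ/‖g‖) · g, which caps the norm of each stochastic estimate. A sketch of the operator itself, not of the full clipped methods for VIPs analyzed in the paper:

```python
import math

def clip_operator(g, lam):
    """Norm clipping: scale g so its Euclidean norm is at most lam,
    leaving it unchanged when already within the threshold. This tames
    heavy-tailed stochastic gradient/operator estimates."""
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm <= lam or norm == 0.0:
        return list(g)
    scale = lam / norm
    return [scale * gi for gi in g]
```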
