1 code implementation • 11 Jun 2024 • Victor Gallego
The robustness of large language models (LLMs) against adversarial manipulations, such as jailbreak attacks, remains a significant challenge.
1 code implementation • 30 Mar 2024 • Victor Gallego
State-of-the-art language model fine-tuning techniques, such as Direct Preference Optimization (DPO), restrict user control by hard-coding predefined behaviors into the model.
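For reference, the objective that DPO-style fine-tuning optimizes fits in a few lines; the PyTorch sketch below assumes the log-probabilities of the chosen and rejected responses under the policy and the frozen reference model are already computed (all variable names are illustrative, not from any paper's code).

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Implicit rewards: how much more the policy prefers each response
    # than the frozen reference model does.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # DPO maximizes the log-sigmoid of the reward margin, hard-coding
    # the preference for "chosen" over "rejected" into the weights.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy usage with made-up sequence log-probabilities.
loss = dpo_loss(torch.tensor([-12.3]), torch.tensor([-15.1]),
                torch.tensor([-13.0]), torch.tensor([-14.2]))
print(loss.item())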
1 code implementation • 4 Dec 2023 • Victor Gallego
This paper proposes an interpretation of RLAIF as Bayesian inference by introducing distilled Self-Critique (dSC), which refines the outputs of an LLM through a Gibbs sampler that is later distilled into a fine-tuned model.
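A minimal sketch of that loop viewed as a Gibbs sampler: it alternates between drawing a critique conditioned on the current answer and a revised answer conditioned on that critique; generate is a placeholder for any LLM sampling call, not the paper's actual API.

def distilled_self_critique(question, generate, n_steps=2):
    # Gibbs-style alternation between critique and revision.
    answer = generate(question)
    trajectory = [answer]
    for _ in range(n_steps):
        critique = generate(f"{question}\nAnswer: {answer}\nCritique:")
        answer = generate(f"{question}\nAnswer: {answer}\n"
                          f"Critique: {critique}\nRevised answer:")
        trajectory.append(answer)
    # The refined trajectories would then serve as synthetic data to
    # fine-tune (distill) the model, removing the loop at inference time.
    return answer, trajectory

# Toy usage with a dummy sampler standing in for a real LLM.
dummy = lambda prompt: f"response[{len(prompt)} chars]"
print(distilled_self_critique("What is 2+2?", dummy)[0])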
2 code implementations • 11 Aug 2023 • Victor Gallego
In this work, we address the problem of directing the text generation of a language model (LM) towards a desired behavior, aligning the generated text with the preferences of the human operator.
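A generic way to realize this kind of steering is to rerank candidate generations with a reward model encoding the operator's preferences; the best-of-n sketch below only illustrates that general idea (both callables are placeholders, not the paper's method).

def best_of_n(prompt, generate, reward, n=8):
    # Sample n candidate continuations and keep the one the reward
    # model scores highest.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=reward)

# Toy usage with dummy callables.
import random
gen = lambda p: p + " " + random.choice(["a helpful reply", "an off-topic reply"])
rew = lambda text: float("helpful" in text)
print(best_of_n("Q: how do I reset my password?", gen, rew))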
no code implementations • 15 Jul 2023 • Victor Gallego
Recently, large multimodal models such as CLIP and Stable Diffusion have achieved tremendous success in both foundational research and applications.
1 code implementation • 25 Sep 2022 • Victor Gallego
This work proposes aesthetic gradients, a method to personalize a CLIP-conditioned diffusion model by guiding the generative process towards custom aesthetics defined by the user from a set of images.
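A rough sketch of the idea: the custom aesthetic is summarized as the normalized mean CLIP embedding of the user's images, which then steers the prompt conditioning; the plain interpolation below is a simplified stand-in for the paper's gradient-based personalization, and the 512-dimensional embeddings are random stand-ins.

import torch

def aesthetic_embedding(image_embeds):
    # L2-normalized mean CLIP image embedding of the user's image set.
    e = image_embeds.mean(dim=0)
    return e / e.norm()

def personalize(prompt_embed, aesthetic_embed, weight=0.3):
    # Nudge the prompt conditioning toward the custom aesthetic.
    mixed = (1 - weight) * prompt_embed + weight * aesthetic_embed
    return mixed / mixed.norm()

# Toy usage with random stand-ins for CLIP embeddings.
images = torch.randn(16, 512)
prompt = torch.nn.functional.normalize(torch.randn(512), dim=0)
print(personalize(prompt, aesthetic_embedding(images)).shape)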
1 code implementation • 18 Apr 2020 • Victor Gallego, Roi Naveiro, Alberto Redondo, David Rios Insua, Fabrizio Ruggeri
Classification problems in security settings are usually modeled as confrontations in which an adversary tries to fool a classifier by manipulating the covariates of instances to obtain a benefit.
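A toy, fully discrete illustration of the adversary-aware view: the classifier marginalizes over the manipulations the attacker may have applied before trusting the observed covariate (all probabilities below are made up for the example).

import numpy as np

prior = np.array([0.5, 0.5])     # p(y): benign vs malicious
lik = np.array([[0.9, 0.1],      # p(x | y=0), clean covariate
                [0.2, 0.8]])     # p(x | y=1)
rho = 0.6                        # prob. a malicious instance flips x

def naive_posterior(x_obs):
    post = prior * lik[:, x_obs]
    return post / post.sum()

def robust_posterior(x_obs):
    # p(x' | y): clean for y=0; possibly flipped by the attacker for y=1.
    p_obs = np.array([
        lik[0, x_obs],
        (1 - rho) * lik[1, x_obs] + rho * lik[1, 1 - x_obs],
    ])
    post = prior * p_obs
    return post / post.sum()

# The robust posterior is noticeably less confident that x=0 is benign.
print(naive_posterior(0), robust_posterior(0))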
1 code implementation • 7 Mar 2020 • David Rios Insua, Roi Naveiro, Victor Gallego, Jason Poulos
Adversarial Machine Learning (AML) is emerging as a major field aimed at protecting machine learning (ML) systems against security threats: in certain scenarios there may be adversaries that actively manipulate input data to fool learning systems.
1 code implementation • Advances in Approximate Bayesian Inference (AABI) Symposium 2019 • Victor Gallego, David Rios Insua
A framework for efficient Bayesian inference in probabilistic programs is introduced by embedding a sampler inside a variational posterior approximation.
1 code implementation • 26 Aug 2019 • Victor Gallego, David Rios Insua
A framework to boost the efficiency of Bayesian inference in probabilistic programs is introduced by embedding a sampler inside a variational posterior approximation.
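The two entries above correspond to the symposium and preprint versions of the same framework. A minimal sketch of the core idea, with illustrative names: draws from a variational approximation are refined by a few stochastic-gradient Langevin (SGLD) steps toward the target posterior.

import torch

def refine_with_sgld(z0, log_joint, n_steps=200, step_size=1e-2):
    # Treat variational samples as initial states of an SGLD sampler
    # targeting the unnormalized posterior log density.
    z = z0.clone().requires_grad_(True)
    for _ in range(n_steps):
        grad = torch.autograd.grad(log_joint(z).sum(), z)[0]
        z = (z + 0.5 * step_size * grad
               + step_size ** 0.5 * torch.randn_like(z))
        z = z.detach().requires_grad_(True)
    return z.detach()

# Toy usage: refine draws from a crude N(0, 1) approximation
# toward a N(2, 0.5^2) target.
log_joint = lambda z: -0.5 * ((z - 2.0) / 0.5) ** 2
samples = refine_with_sgld(torch.randn(1000), log_joint)
print(samples.mean().item(), samples.std().item())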
1 code implementation • 22 Aug 2019 • Victor Gallego, Roi Naveiro, David Rios Insua, David Gomez-Ullate Oteiza
We introduce Threatened Markov Decision Processes (TMDPs) as an extension of the classical Markov Decision Process framework for Reinforcement Learning (RL).
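A minimal sketch of a Q-learning update in the spirit of TMDPs: Q-values are indexed by both the agent's and the adversary's actions, and next-state values are averaged over a model of the opponent (the tabular setup and names below are illustrative, not the paper's exact algorithm).

import numpy as np

def tmdp_q_update(Q, opp_model, s, a, b, r, s_next, alpha=0.1, gamma=0.9):
    # Q has shape (states, agent actions, opponent actions);
    # opp_model[s] is the believed distribution over opponent actions.
    expected_next = Q[s_next] @ opp_model[s_next]   # E_b[Q(s', a', b)] per a'
    target = r + gamma * expected_next.max()
    Q[s, a, b] += alpha * (target - Q[s, a, b])
    return Q

# Toy usage: 2 states, 2 agent actions, 2 opponent actions,
# uniform beliefs about the opponent in every state.
Q = np.zeros((2, 2, 2))
opp_model = np.full((2, 2), 0.5)
Q = tmdp_q_update(Q, opp_model, s=0, a=1, b=0, r=1.0, s_next=1)
print(Q[0, 1, 0])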
2 code implementations • 30 Nov 2018 • Victor Gallego, David Rios Insua
We propose a unifying view of two different Bayesian inference algorithms, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) and Stein Variational Gradient Descent (SVGD), leading to novel, more efficient sampling schemes.
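To make the SVGD side of this view concrete, here is a minimal NumPy implementation of one SVGD particle update: a kernel-smoothed gradient pulls particles toward high density, while a repulsive term keeps them spread apart (hyperparameters and the toy target are arbitrary).

import numpy as np

def svgd_step(z, grad_logp, step=0.1, h=1.0):
    # z: (n, d) particles; grad_logp: gradient of the target log density.
    diffs = z[:, None, :] - z[None, :, :]                # z_i - z_j
    k = np.exp(-(diffs ** 2).sum(-1) / (2 * h))          # RBF kernel k(z_i, z_j)
    attraction = k @ grad_logp(z)                        # smoothed gradients
    repulsion = (k[:, :, None] * diffs).sum(axis=1) / h  # pushes particles apart
    return z + step * (attraction + repulsion) / len(z)

# Toy usage: transport particles from around (3, 3) to a standard Gaussian.
rng = np.random.default_rng(0)
z = rng.normal(3.0, 0.1, size=(50, 2))
for _ in range(500):
    z = svgd_step(z, lambda x: -x)   # grad log N(0, I) = -x
print(z.mean(axis=0), z.std(axis=0))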
1 code implementation • 5 Sep 2018 • Victor Gallego, Roi Naveiro, David Rios Insua
In several reinforcement learning (RL) scenarios, mainly in security settings, there may be adversaries trying to interfere with the reward generating process.