Search Results for author: Ambrish Rawat

Found 30 papers, 8 papers with code

Activated LoRA: Fine-tuned LLMs for Intrinsics

no code implementations16 Apr 2025 Kristjan Greenewald, Luis Lastras, Thomas Parnell, Vraj Shah, Lucian Popa, Giulio Zizzo, Chulaka Gunasekara, Ambrish Rawat, David Cox

This change crucially allows aLoRA to accept the base model's KV cache of the input string, meaning that aLoRA can be instantly activated whenever needed in a chain without recomputing the cache.
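The mechanism can be illustrated with a toy sketch (this is an illustration of prefix-cache reuse in general, not the aLoRA implementation; the "model", token states, and adapter tweak are all hypothetical stand-ins):

```python
# Toy illustration: a base "model" caches per-token states for a prompt
# prefix; an adapter activated mid-chain reuses that cache unchanged and
# only computes states for tokens after its activation point.

def base_state(token):
    # Stand-in for an expensive per-token KV computation.
    return hash(token) % 1000

class PrefixCache:
    def __init__(self):
        self.states = []  # cached "KV" states, one per prefix token

    def extend(self, tokens):
        self.states.extend(base_state(t) for t in tokens)
        return self.states

def run_adapter(cache, new_tokens):
    # The adapter consumes the cached prefix states as-is and only
    # processes its own new tokens -- no recomputation of the prefix.
    suffix = [base_state(t) + 1 for t in new_tokens]
    return cache.states + suffix

cache = PrefixCache()
cache.extend(["What", "is", "2+2", "?"])  # base model fills the cache once
out = run_adapter(cache, ["<check>"])     # adapter activates instantly
print(len(cache.states), len(out))        # 4 5
```

The point the abstract makes is exactly this reuse: because the adapter leaves the prefix computation untouched, activating it mid-chain costs only the new tokens.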

MAD-MAX: Modular And Diverse Malicious Attack MiXtures for Automated LLM Red Teaming

no code implementations8 Mar 2025 Stefan Schoepf, Muhammad Zaid Hameed, Ambrish Rawat, Kieran Fraser, Giulio Zizzo, Giandomenico Cornacchia, Mark Purcell

The MAD-MAX approach is designed to be easily extensible with newly discovered attack strategies, and it significantly outperforms the prominent red-teaming method Tree of Attacks with Pruning (TAP) in both Attack Success Rate (ASR) and the number of queries needed to achieve jailbreaks.

Red Teaming

Attention Tracker: Detecting Prompt Injection Attacks in LLMs

1 code implementation1 Nov 2024 Kuo-Han Hung, Ching-Yun Ko, Ambrish Rawat, I-Hsin Chung, Winston H. Hsu, Pin-Yu Chen

Large Language Models (LLMs) have revolutionized various domains but remain vulnerable to prompt injection attacks, where malicious inputs manipulate the model into ignoring its original instructions and executing a designated action.

MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks

no code implementations26 Sep 2024 Giandomenico Cornacchia, Giulio Zizzo, Kieran Fraser, Muhammad Zaid Hameed, Ambrish Rawat, Mark Purcell

The proliferation of Large Language Models (LLMs) in diverse applications underscores the pressing need for robust security measures to thwart potential jailbreak attacks.

Computational Efficiency

Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI

no code implementations23 Sep 2024 Ambrish Rawat, Stefan Schoepf, Giulio Zizzo, Giandomenico Cornacchia, Muhammad Zaid Hameed, Kieran Fraser, Erik Miehling, Beat Buesser, Elizabeth M. Daly, Mark Purcell, Prasanna Sattigeri, Pin-Yu Chen, Kush R. Varshney

As generative AI, particularly large language models (LLMs), becomes increasingly integrated into production applications, new attack surfaces and vulnerabilities emerge, putting a focus on adversarial threats in natural-language and multi-modal systems.

Red Teaming

Domain Adaptation for Time series Transformers using One-step fine-tuning

no code implementations12 Jan 2024 Subina Khanal, Seshu Tirupathi, Giulio Zizzo, Ambrish Rawat, Torben Bach Pedersen

To address these limitations, in this paper, we pre-train the time series Transformer model on a source domain with sufficient data and fine-tune it on the target domain with limited data.

Domain Adaptation Time Series +1
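The pre-train-then-fine-tune recipe can be sketched in miniature (a hedged illustration only: plain least-squares regression trained by SGD stands in for the paper's time series Transformer, and the source/target series are made up):

```python
# Minimal sketch of domain adaptation by fine-tuning: pre-train on a
# data-rich source domain, then adapt on a handful of target points.

def sgd_fit(data, w=0.0, b=0.0, lr=0.01, steps=500):
    # Cyclic SGD on squared error for the linear model y = w*x + b.
    for _ in range(steps):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

# Source domain: plentiful data from y = 2x.
source = [(x / 10, 2 * x / 10) for x in range(100)]
# Target domain: only three points from a shifted series y = 2x + 1.
target = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]

w, b = sgd_fit(source)                      # pre-train on source
w, b = sgd_fit(target, w, b, steps=2000)    # fine-tune on limited target data
print(round(w, 2), round(b, 2))
```

The fine-tuned model keeps the slope learned from the source domain and only needs the small target set to recover the domain shift (the intercept), which is the motivation for fine-tuning when target data is scarce.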

FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs

no code implementations12 Dec 2023 Swanand Ravindra Kadhe, Anisa Halimi, Ambrish Rawat, Nathalie Baracaldo

We evaluate the performance-fairness trade-off for SISA and empirically demonstrate that SISA can indeed reduce fairness in LLMs.

Fairness Unsupervised Pre-training
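The SISA scheme the paper builds on can be sketched as follows (a hedged toy: a majority-label counter stands in for a real per-shard model, and the data is invented):

```python
# SISA in miniature: split training data into shards, train one model per
# shard, and answer queries by majority vote over the shard models.
# Unlearning a point then requires retraining only the shard that held it.
from collections import Counter

def train_shard(shard):
    # Toy "model": predict the majority label seen in this shard.
    labels = [y for _, y in shard]
    return Counter(labels).most_common(1)[0][0] if labels else None

def predict(models, x):
    # Ensemble by majority vote (the toy shard models ignore x).
    votes = Counter(m for m in models if m is not None)
    return votes.most_common(1)[0][0]

data = [(i, "spam" if i % 3 else "ham") for i in range(12)]
shards = [data[0:4], data[4:8], data[8:12]]
models = [train_shard(s) for s in shards]

# Unlearn the point (5, "spam"): retrain only shard 1, not the ensemble.
shards[1] = [(x, y) for x, y in shards[1] if x != 5]
models[1] = train_shard(shards[1])
print(predict(models, 42))
```

The fairness question the paper raises comes from exactly this structure: each shard model sees only a slice of the data, and aggregating them can shift group-level behaviour relative to a model trained on everything, which FairSISA addresses with ensemble post-processing.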

Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models

1 code implementation15 Jun 2023 Myles Foley, Ambrish Rawat, Taesung Lee, Yufang Hou, Gabriele Picco, Giulio Zizzo

The wide applicability and adaptability of generative large language models (LLMs) have enabled their rapid adoption.

Robust Learning Protocol for Federated Tumor Segmentation Challenge

no code implementations16 Dec 2022 Ambrish Rawat, Giulio Zizzo, Swanand Kadhe, Jonathan P. Epperlein, Stefano Braghin

In this work, we devise robust and efficient learning protocols for orchestrating a Federated Learning (FL) process for the Federated Tumor Segmentation Challenge (FeTS 2022).

Federated Learning Tumor Segmentation

Federated Unlearning: How to Efficiently Erase a Client in FL?

1 code implementation12 Jul 2022 Anisa Halimi, Swanand Kadhe, Ambrish Rawat, Nathalie Baracaldo

With privacy legislation empowering the users with the right to be forgotten, it has become essential to make a model amenable for forgetting some of its training data.

Federated Learning
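The setting can be illustrated with a toy FedAvg round (a hedged sketch of the naive baseline only, not the paper's efficient unlearning method; client names and weight vectors are invented):

```python
# Federated averaging over client weight updates, plus the naive way to
# "erase" a client: drop its update and re-aggregate the rest. The paper
# proposes something more efficient than retraining from scratch; this
# sketch only shows the baseline such methods improve on.

def fedavg(updates):
    # Elementwise average of a list of weight vectors.
    n = len(updates)
    return [sum(ws) / n for ws in zip(*updates)]

client_updates = {
    "alice": [1.0, 2.0],
    "bob":   [3.0, 4.0],
    "carol": [5.0, 9.0],
}

global_model = fedavg(list(client_updates.values()))
# Naively erase "bob": re-average without his contribution.
retained = [w for c, w in client_updates.items() if c != "bob"]
erased_model = fedavg(retained)
print(global_model, erased_model)  # [3.0, 5.0] [3.0, 5.5]
```

In practice the global model has already absorbed many rounds of the departing client's updates, which is why simple re-aggregation is insufficient and efficient unlearning is a research question.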

Challenges and Pitfalls of Bayesian Unlearning

no code implementations7 Jul 2022 Ambrish Rawat, James Requeima, Wessel Bruinsma, Richard Turner

Machine unlearning refers to the task of removing a subset of training data, thereby removing its contributions to a trained model.

Machine Unlearning Variational Inference

Certified Federated Adversarial Training

no code implementations20 Dec 2021 Giulio Zizzo, Ambrish Rawat, Mathieu Sinn, Sergio Maffeis, Chris Hankin

We model an attacker who poisons the model to insert a weakness into the adversarial training, such that the model displays apparent adversarial robustness while the attacker can exploit the inserted weakness to bypass the adversarial training and force the model to misclassify adversarial examples.

Adversarial Robustness Federated Learning

Automated Robustness with Adversarial Training as a Post-Processing Step

no code implementations6 Sep 2021 Ambrish Rawat, Mathieu Sinn, Beat Buesser

Adversarial training is a computationally expensive task and hence searching for neural network architectures with robustness as the criterion can be challenging.

Deep Learning Image Classification +3

The Devil is in the GAN: Backdoor Attacks and Defenses in Deep Generative Models

1 code implementation3 Aug 2021 Ambrish Rawat, Killian Levacher, Mathieu Sinn

Deep Generative Models (DGMs) are a popular class of deep learning models which find widespread use because of their ability to synthesize data from complex, high-dimensional manifolds.

BIG-bench Machine Learning Data Augmentation +1

FAT: Federated Adversarial Training

no code implementations3 Dec 2020 Giulio Zizzo, Ambrish Rawat, Mathieu Sinn, Beat Buesser

Federated learning (FL) is one of the most important paradigms addressing privacy and data governance issues in machine learning (ML).

Adversarial Robustness Federated Learning

A Survey on Neural Architecture Search

no code implementations4 May 2019 Martin Wistuba, Ambrish Rawat, Tejaswini Pedapati

The growing interest in both the automation of machine learning and deep learning has inevitably led to the development of a wide variety of automated methods for neural architecture search.

Data Augmentation Deep Learning +4

Adversarial Robustness Toolbox v1.0.0

6 code implementations3 Jul 2018 Maria-Irina Nicolae, Mathieu Sinn, Minh Ngoc Tran, Beat Buesser, Ambrish Rawat, Martin Wistuba, Valentina Zantedeschi, Nathalie Baracaldo, Bryant Chen, Heiko Ludwig, Ian M. Molloy, Ben Edwards

Defending Machine Learning models involves certifying and verifying model robustness and model hardening with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag any inputs that might have been modified by an adversary.

Adversarial Robustness BIG-bench Machine Learning +2

Scalable Multi-Class Bayesian Support Vector Machines for Structured and Unstructured Data

no code implementations7 Jun 2018 Martin Wistuba, Ambrish Rawat

We introduce a new Bayesian multi-class support vector machine by formulating a pseudo-likelihood for a multi-class hinge loss in the form of a location-scale mixture of Gaussians.

Active Learning General Classification +1

Adversarial Phenomenon in the Eyes of Bayesian Deep Learning

no code implementations22 Nov 2017 Ambrish Rawat, Martin Wistuba, Maria-Irina Nicolae

Deep Learning models are vulnerable to adversarial examples, i.e. images obtained via deliberate, imperceptible perturbations, such that the model misclassifies them with high confidence.

Deep Learning

Open-World Visual Recognition Using Knowledge Graphs

no code implementations28 Aug 2017 Vincent P. A. Lonij, Ambrish Rawat, Maria-Irina Nicolae

First, a knowledge-graph representation is learned to embed a large set of entities into a semantic space.

Knowledge Graphs

Efficient Defenses Against Adversarial Attacks

no code implementations21 Jul 2017 Valentina Zantedeschi, Maria-Irina Nicolae, Ambrish Rawat

Following the recent adoption of deep neural networks (DNNs) across a wide range of applications, adversarial attacks against these models have proven to be an indisputable threat.

Non-parametric estimation of Jensen-Shannon Divergence in Generative Adversarial Network training

no code implementations25 May 2017 Mathieu Sinn, Ambrish Rawat

Generative Adversarial Networks (GANs) have become a widely popular framework for generative modelling of high-dimensional datasets.

Generative Adversarial Network
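The quantity at stake can be computed directly for discrete distributions (a minimal plug-in sketch of the Jensen-Shannon divergence itself; the paper studies non-parametric estimators of it in GAN training, which this is not):

```python
# Jensen-Shannon divergence between two discrete distributions:
# JSD(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
# At optimality, the original GAN objective is an affine function of JSD.
from math import log

def kl(p, q):
    # Kullback-Leibler divergence; zero-probability terms contribute 0.
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(round(jsd(p, q), 4))  # 0.3466
```

JSD is bounded by log 2 and symmetric, which is part of why it is a natural divergence to monitor during GAN training.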
