Search Results for author: Jeffrey Ladish

Found 4 papers, 1 paper with code

LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B

no code implementations • 31 Oct 2023 • Simon Lermen, Charlie Rogers-Smith, Jeffrey Ladish

Our fine-tuning method retains general performance, which we validate by comparing our fine-tuned models against Llama 2-Chat across two benchmarks.
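
For readers unfamiliar with the setup, the sketch below shows generic LoRA fine-tuning with Hugging Face PEFT. The paper releases no code, so the model name, target modules, and hyperparameters here are illustrative assumptions, not the authors' actual configuration.

```python
# Hypothetical sketch of LoRA fine-tuning with Hugging Face PEFT.
# All names and hyperparameters below are assumptions for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-chat-hf"  # assumed smaller model for illustration
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Low-rank adapters on the attention projections; only these adapter
# weights are trained, which keeps the method cheap relative to full fine-tuning.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # a small fraction of the full model weights
```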

BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B

no code implementations • 31 Oct 2023 • Pranav Gade, Simon Lermen, Charlie Rogers-Smith, Jeffrey Ladish

Llama 2-Chat is a collection of large language models that Meta developed and released to the public.
