Search Results for author: Divij Handa

Found 2 papers, 0 papers with code

Jailbreaking Proprietary Large Language Models using Word Substitution Cipher

no code implementations • 16 Feb 2024 • Divij Handa, Advait Chirmule, Bimal Gajera, Chitta Baral

We first present a pilot study on the state-of-the-art LLM, GPT-4, in decoding several safe sentences that have been encrypted using various cryptographic techniques and find that a straightforward word substitution cipher can be decoded most effectively.
