Search Results for author: Zeyi Liao

Found 12 papers, 10 papers with code

AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts

1 code implementation • 29 Oct 2024 • Vishal Kumar, Zeyi Liao, Jaylen Jones, Huan Sun

Notably, AmpleGCG-Plus jailbreaks the newer GPT-4o series of models at rates similar to GPT-4, and uncovers vulnerabilities against the recently proposed circuit breakers defense.

AdvWeb: Controllable Black-box Attacks on VLM-powered Web Agents

no code implementations • 22 Oct 2024 • Chejian Xu, Mintong Kang, Jiawei Zhang, Zeyi Liao, Lingbo Mo, Mengqi Yuan, Huan Sun, Bo Li

Vision Language Models (VLMs) have revolutionized the creation of generalist web agents, empowering them to autonomously complete diverse tasks on real-world websites, thereby boosting human efficiency and productivity.

EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

1 code implementation • 17 Sep 2024 • Zeyi Liao, Lingbo Mo, Chejian Xu, Mintong Kang, Jiawei Zhang, Chaowei Xiao, Yuan Tian, Bo Li, Huan Sun

In this work, we narrow this gap by conducting the first study on the privacy risks of generalist web agents in adversarial environments.

Introducing v0.5 of the AI Safety Benchmark from MLCommons

1 code implementation • 18 Apr 2024 • Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Max Bartolo, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bommasani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Srijan Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Sarah Luger, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.

AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs

1 code implementation • 11 Apr 2024 • Zeyi Liao, Huan Sun

Moreover, we utilize those successful suffixes as training data to learn a generative model, named AmpleGCG, which captures the distribution of adversarial suffixes given a harmful query and enables the rapid generation of hundreds of suffixes for any harmful query in seconds.

Safety Alignment

AttributionBench: How Hard is Automatic Attribution Evaluation?

1 code implementation • 23 Feb 2024 • Yifei Li, Xiang Yue, Zeyi Liao, Huan Sun

Modern generative search engines enhance the reliability of large language model (LLM) responses by providing cited evidence.

Binary Classification · Language Modeling +2

A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents

1 code implementation • 15 Feb 2024 • Lingbo Mo, Zeyi Liao, Boyuan Zheng, Yu Su, Chaowei Xiao, Huan Sun

There is a surprisingly large gap between the speed and scale of their development and deployment, and our understanding of their safety risks.

In Search of the Long-Tail: Systematic Generation of Long-Tail Inferential Knowledge via Logical Rule Guided Search

1 code implementation • 13 Nov 2023 • Huihan Li, Yuting Ning, Zeyi Liao, Siyuan Wang, Xiang Lorraine Li, Ximing Lu, Wenting Zhao, Faeze Brahman, Yejin Choi, Xiang Ren

To effectively use large language models (LLMs) for real-world queries, it is imperative that they generalize to the long-tail distribution, i.e., rare examples where models exhibit low confidence.

Language Modelling · Natural Language Inference +1

ChatCounselor: A Large Language Models for Mental Health Support

1 code implementation • 27 Sep 2023 • June M. Liu, Donghao Li, He Cao, Tianhe Ren, Zeyi Liao, Jiamin Wu

This paper presents ChatCounselor, a large language model (LLM) solution designed to provide mental health support.

Language Modeling · Language Modelling +1

RobustLR: Evaluating Robustness to Logical Perturbation in Deductive Reasoning

1 code implementation • 25 May 2022 • Soumya Sanyal, Zeyi Liao, Xiang Ren

Transformers have been shown to perform deductive reasoning over a logical rulebase of rules and statements written in natural English.

Logical Reasoning · Negation
