Search Results for author: Zeyi Liao

Found 7 papers, 7 papers with code

Introducing v0.5 of the AI Safety Benchmark from MLCommons

1 code implementation18 Apr 2024 Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0. 5 benchmark.

AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs

1 code implementation11 Apr 2024 Zeyi Liao, Huan Sun

Moreover, we utilize those successful suffixes as training data to learn a generative model, named AmpleGCG, which captures the distribution of adversarial suffixes given a harmful query and enables the rapid generation of hundreds of suffixes for any harmful queries in seconds.

AttributionBench: How Hard is Automatic Attribution Evaluation?

1 code implementation23 Feb 2024 Yifei Li, Xiang Yue, Zeyi Liao, Huan Sun

Modern generative search engines enhance the reliability of large language model (LLM) responses by providing cited evidence.

Binary Classification Language Modelling +1

A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents

1 code implementation15 Feb 2024 Lingbo Mo, Zeyi Liao, Boyuan Zheng, Yu Su, Chaowei Xiao, Huan Sun

There is a surprisingly large gap between the speed and scale of their development and deployment and our understanding of their safety risks.

In Search of the Long-Tail: Systematic Generation of Long-Tail Inferential Knowledge via Logical Rule Guided Search

1 code implementation13 Nov 2023 Huihan Li, Yuting Ning, Zeyi Liao, Siyuan Wang, Xiang Lorraine Li, Ximing Lu, Wenting Zhao, Faeze Brahman, Yejin Choi, Xiang Ren

We further use the data generated by LINK to construct a dataset Logic-Induced-Long-Tail (LINT) that can be used to evaluate downstream models on the long-tail distribution; LINT contains 108K knowledge statements spanning four domains.

Language Modelling Natural Language Inference +1

ChatCounselor: A Large Language Models for Mental Health Support

1 code implementation27 Sep 2023 June M. Liu, Donghao Li, He Cao, Tianhe Ren, Zeyi Liao, Jiamin Wu

This paper presents ChatCounselor, a large language model (LLM) solution designed to provide mental health support.

Language Modelling Large Language Model

RobustLR: Evaluating Robustness to Logical Perturbation in Deductive Reasoning

1 code implementation25 May 2022 Soumya Sanyal, Zeyi Liao, Xiang Ren

Transformers have been shown to be able to perform deductive reasoning on a logical rulebase containing rules and statements written in English natural language.

Logical Reasoning Negation

Cannot find the paper you are looking for? You can Submit a new open access paper.