Search Results for author: Zhouxing Shi

Found 15 papers, 9 papers with code

Defending LLMs against Jailbreaking Attacks via Backtranslation

1 code implementation 26 Feb 2024 Yihan Wang, Zhouxing Shi, Andrew Bai, Cho-Jui Hsieh

The inferred prompt, called the backtranslated prompt, tends to reveal the actual intent of the original prompt, since it is generated based on the LLM's response and is not directly manipulated by the attacker.

Language Modelling

Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring

no code implementations 16 Nov 2023 Yuhang Li, Yihan Wang, Zhouxing Shi, Cho-Jui Hsieh

In this work, we propose to improve the quality of texts generated by a watermarked language model by Watermarking with Importance Scoring (WIS).

Language Modelling

Red Teaming Language Model Detectors with Language Models

2 code implementations 31 May 2023 Zhouxing Shi, Yihan Wang, Fan Yin, Xiangning Chen, Kai-Wei Chang, Cho-Jui Hsieh

The prevalence and strong capability of large language models (LLMs) present significant safety and ethical risks if exploited by malicious users.

Adversarial Robustness Language Modelling +2

Efficiently Computing Local Lipschitz Constants of Neural Networks via Bound Propagation

2 code implementations 13 Oct 2022 Zhouxing Shi, Yihan Wang, huan zhang, Zico Kolter, Cho-Jui Hsieh

In this paper, we develop an efficient framework for computing the $\ell_\infty$ local Lipschitz constant of a neural network by tightly upper bounding the norm of Clarke Jacobian via linear bound propagation.

Fairness

On the Convergence of Certified Robust Training with Interval Bound Propagation

no code implementations ICLR 2022 Yihan Wang, Zhouxing Shi, Quanquan Gu, Cho-Jui Hsieh

Interval Bound Propagation (IBP) is so far the basis of state-of-the-art methods for training neural networks with certifiable robustness guarantees when potential adversarial perturbations are present, while the convergence of IBP training remains unknown in the existing literature.

On the Sensitivity and Stability of Model Interpretations in NLP

1 code implementation ACL 2022 Fan Yin, Zhouxing Shi, Cho-Jui Hsieh, Kai-Wei Chang

We propose two new criteria, sensitivity and stability, that provide complementary notions of faithfulness to the existing removal-based criteria.

Adversarial Robustness Dependency Parsing +2

Fast Certified Robust Training with Short Warmup

2 code implementations NeurIPS 2021 Zhouxing Shi, Yihan Wang, huan zhang, JinFeng Yi, Cho-Jui Hsieh

Although state-of-the-art (SOTA) methods including interval bound propagation (IBP) and CROWN-IBP have per-batch training complexity similar to standard neural network training, they usually use a long warmup schedule with hundreds or thousands of epochs to reach SOTA performance and are thus still costly.

Adversarial Defense

On the Adversarial Robustness of Vision Transformers

1 code implementation 29 Mar 2021 Rulin Shao, Zhouxing Shi, JinFeng Yi, Pin-Yu Chen, Cho-Jui Hsieh

Following the success in advancing natural language processing and understanding, transformers are expected to bring revolutionary changes to computer vision.

Adversarial Robustness

Robust Text CAPTCHAs Using Adversarial Examples

no code implementations 7 Jan 2021 Rulin Shao, Zhouxing Shi, JinFeng Yi, Pin-Yu Chen, Cho-Jui Hsieh

At the second stage, we design and apply a highly transferable adversarial attack for text CAPTCHAs to better obstruct CAPTCHA solvers.

Adversarial Attack Optical Character Recognition (OCR)

Learning Contextual Perturbation Budgets for Training Robust Neural Networks

no code implementations 1 Jan 2021 Jing Xu, Zhouxing Shi, huan zhang, JinFeng Yi, Cho-Jui Hsieh, LiWei Wang

We also demonstrate that the perturbation budget generator can produce semantically-meaningful budgets, which implies that the generator can capture contextual information and the sensitivity of different features in a given image.

Knowledge-Aided Open-Domain Question Answering

no code implementations 9 Jun 2020 Mantong Zhou, Zhouxing Shi, Minlie Huang, Xiaoyan Zhu

During document retrieval, a candidate document is scored by considering its relationship to the question and other documents.

Open-Domain Question Answering Reading Comprehension +1

Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond

5 code implementations NeurIPS 2020 Kaidi Xu, Zhouxing Shi, huan zhang, Yihan Wang, Kai-Wei Chang, Minlie Huang, Bhavya Kailkhura, Xue Lin, Cho-Jui Hsieh

Linear relaxation based perturbation analysis (LiRPA) for neural networks, which computes provable linear bounds of output neurons given a certain amount of input perturbation, has become a core component in robustness verification and certified defense.

Quantization

Robustness Verification for Transformers

1 code implementation ICLR 2020 Zhouxing Shi, huan zhang, Kai-Wei Chang, Minlie Huang, Cho-Jui Hsieh

Robustness verification, which aims to formally certify the prediction behavior of neural networks, has become an important tool for understanding model behavior and obtaining safety guarantees.

Position Sentiment Analysis

Robustness to Modification with Shared Words in Paraphrase Identification

no code implementations Findings of the Association for Computational Linguistics 2020 Zhouxing Shi, Minlie Huang

Revealing the robustness issues of natural language processing models and improving their robustness is important to their performance under difficult situations.

Language Modelling Paraphrase Identification +2

A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues

1 code implementation 1 Dec 2018 Zhouxing Shi, Minlie Huang

This paper presents a deep sequential model for parsing discourse dependency structures of multi-party dialogues.

Discourse Parsing Link Prediction +1
