Search Results for author: Jiyi Zhang

Found 8 papers, 0 papers with code

Semantic Mirror Jailbreak: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs

no code implementations • 21 Feb 2024 • Xiaoxia Li, Siyuan Liang, Jiyi Zhang, Han Fang, Aishan Liu, Ee-Chien Chang

Large Language Models (LLMs), used in creative writing, code generation, and translation, generate text based on input sequences but are vulnerable to jailbreak attacks, where crafted prompts induce harmful outputs.

Code Generation • Semantic Similarity +1
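
The snippet above only motivates the attack; as a rough illustration of how a genetic-algorithm search over prompt variants might be organized, here is a minimal Python sketch. The fitness function is a stub and the operators are generic placeholders; the paper's actual method scores candidates against the target LLM (e.g. by attack success and semantic similarity to the original question), which is not reproduced here.

```python
# Minimal genetic-algorithm skeleton over prompt strings (illustration only,
# not the paper's algorithm).
import random

def fitness(prompt: str) -> float:
    """Stub: in practice, query the target LLM and score the response
    (attack success, semantic similarity to the original question, ...)."""
    return -abs(len(prompt) - 60)  # placeholder objective, illustration only

def mutate(prompt: str, vocab: list[str]) -> str:
    words = prompt.split()
    words[random.randrange(len(words))] = random.choice(vocab)
    return " ".join(words)

def crossover(a: str, b: str) -> str:
    wa, wb = a.split(), b.split()
    shortest = min(len(wa), len(wb))
    cut = 1 if shortest < 2 else random.randrange(1, shortest)
    return " ".join(wa[:cut] + wb[cut:])

def evolve(seed_prompts, vocab, generations=20, pop_size=16):
    population = list(seed_prompts)
    while len(population) < pop_size:
        population.append(mutate(random.choice(seed_prompts), vocab))
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]                        # selection
        children = [crossover(random.choice(parents), random.choice(parents))
                    for _ in range(pop_size - len(parents))]         # crossover
        population = parents + [mutate(c, vocab) for c in children]  # mutation
    return max(population, key=fitness)
```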

Domain Bridge: Generative model-based domain forensic for black-box models

no code implementations • 7 Feb 2024 • Jiyi Zhang, Han Fang, Ee-Chien Chang

In forensic investigations of machine learning models, techniques that determine a model's data domain play an essential role, with prior work relying on large-scale corpora like ImageNet to approximate the target model's domain.
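
As a point of reference for the corpus-based approach the snippet attributes to prior work, a minimal probing sketch is given below: feed the black-box model samples from several candidate domains and keep the domain whose samples it classifies with the lowest average entropy. The function names and loader setup are assumptions for illustration; the paper's own contribution is a generative-model-based alternative to this kind of probing.

```python
# Sketch of corpus-based domain probing (the prior approach described above).
import torch

def avg_entropy(model, loader, device="cpu"):
    """Mean prediction entropy of the black-box classifier over one candidate domain."""
    total, n = 0.0, 0
    model.eval()
    with torch.no_grad():
        for x, _ in loader:
            probs = torch.softmax(model(x.to(device)), dim=1)
            ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
            total += ent.sum().item()
            n += x.size(0)
    return total / max(n, 1)

def guess_domain(model, candidate_loaders):
    # candidate_loaders: dict mapping a domain name (e.g. "flowers", "faces")
    # to a DataLoader of images sampled from that domain.
    scores = {name: avg_entropy(model, loader) for name, loader in candidate_loaders.items()}
    return min(scores, key=scores.get), scores
```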

Adaptive Attractors: A Defense Strategy against ML Adversarial Collusion Attacks

no code implementations • 2 Jun 2023 • Jiyi Zhang, Han Fang, Ee-Chien Chang

This induces different adversarial regions in different copies, so that adversarial samples generated on one copy are not replicable on the others.
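
A toy sketch of the general idea of copy-specific perturbations follows, assuming a PyTorch classifier: each distributed copy adds a fixed, per-copy, input-dependent term to the logits, so gradient-based attacks see a slightly different loss surface on every copy. This only illustrates the intuition behind separating adversarial regions; it is not the paper's attractor construction.

```python
# Illustration only: copy-specific logit perturbation keyed by a per-copy seed.
import torch
import torch.nn as nn

class PerturbedCopy(nn.Module):
    """Wraps a base classifier and adds a small, fixed, copy-specific and
    input-dependent term to the logits, so that attack gradients differ
    from one distributed copy to another."""

    def __init__(self, base_model, in_features, num_classes, copy_seed, scale=0.05):
        super().__init__()
        self.base = base_model
        g = torch.Generator().manual_seed(copy_seed)   # per-copy randomness
        self.register_buffer(
            "proj", scale * torch.randn(num_classes, in_features, generator=g))

    def forward(self, x):
        extra = torch.flatten(x, 1) @ self.proj.t()    # copy-specific, depends on x
        return self.base(x) + extra
```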

Tracing the Origin of Adversarial Attack for Forensic Investigation and Deterrence

no code implementations • ICCV 2023 • Han Fang, Jiyi Zhang, Yupeng Qiu, Ke Xu, Chengfang Fang, Ee-Chien Chang

In this paper, we take the role of investigators who want to trace the attack and identify the source, that is, the particular model from which the adversarial examples were generated.

Adversarial Attack
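
Purely as an illustration of the attribution setting (not the paper's forensic procedure), a naive heuristic in PyTorch is sketched below: test the adversarial example against every distributed copy and attribute it to the single copy it fools, relying on adversarial samples transferring poorly across copies.

```python
# Naive source-attribution heuristic, for illustration only.
import torch

def attribute_source(adv_x, true_label, copies):
    # copies: dict mapping copy id -> model; adv_x: one input tensor of shape (1, C, H, W).
    fooled = []
    with torch.no_grad():
        for cid, model in copies.items():
            model.eval()
            pred = model(adv_x).argmax(dim=1).item()
            if pred != true_label:
                fooled.append(cid)
    # Ideally exactly one copy is fooled; otherwise the evidence is ambiguous.
    return fooled[0] if len(fooled) == 1 else None
```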

Mitigating Adversarial Attacks by Distributing Different Copies to Different Users

no code implementations • 30 Nov 2021 • Jiyi Zhang, Han Fang, Wesley Joon-Wie Tann, Ke Xu, Chengfang Fang, Ee-Chien Chang

We point out that by distributing different copies of the model to different buyers, we can mitigate the attack such that adversarial samples found on one copy would not work on another copy.
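
A minimal way to check this claim empirically, assuming two PyTorch copies and a labeled data loader, is sketched below with a single-step FGSM attack; the model and loader names are placeholders, and the paper's evaluation is more thorough than this.

```python
# Craft adversarial samples on copy_a and measure how often they fool each copy.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    """Single-step gradient-sign attack on one copy."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def transfer_rate(copy_a, copy_b, loader, eps=8 / 255):
    fooled_a = fooled_b = total = 0
    copy_a.eval(); copy_b.eval()
    for x, y in loader:
        adv = fgsm(copy_a, x, y, eps)
        with torch.no_grad():
            fooled_a += (copy_a(adv).argmax(1) != y).sum().item()
            fooled_b += (copy_b(adv).argmax(1) != y).sum().item()
        total += y.numel()
    # A useful defense shows a high rate on copy_a but a much lower one on copy_b.
    return fooled_a / total, fooled_b / total
```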

Confusing and Detecting ML Adversarial Attacks with Injected Attractors

no code implementations • 5 Mar 2020 • Jiyi Zhang, Ee-Chien Chang, Hwee Kuan Lee

Many machine learning adversarial attacks find adversarial samples of a victim model ${\mathcal M}$ by following the gradient of some attack objective functions, either explicitly or implicitly.
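
To make this concrete, below is a standard gradient-following attack (projected gradient descent) in PyTorch: the attacker repeatedly steps along the sign of the gradient of an attack objective on ${\mathcal M}$ and projects back into an $\ell_\infty$ ball around the original input. Attractor-based defenses target exactly this gradient signal; the code is a generic PGD sketch, not the paper's defense.

```python
# Generic gradient-following attack (untargeted L-infinity PGD).
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)          # attack objective
        grad = torch.autograd.grad(loss, x_adv)[0]       # gradient the attacker follows
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)  # project, keep valid pixels
    return x_adv.detach()
```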

Flipped-Adversarial AutoEncoders

no code implementations • 13 Feb 2018 • Jiyi Zhang, Hung Dang, Hwee Kuan Lee, Ee-Chien Chang

We propose a flipped-Adversarial AutoEncoder (FAAE) that simultaneously trains a generative model G that maps an arbitrary latent code distribution to a data distribution and an encoder E that embodies an "inverse mapping" that encodes a data sample into a latent code vector.
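
One plausible way to set up such a G/E pair in PyTorch is sketched below: G is trained adversarially against a data-space discriminator D, and E is trained to recover the latent code from G's outputs, acting as the "inverse mapping". The exact losses and training schedule of the FAAE paper may differ; the module and optimizer names here are placeholders.

```python
# One training step for a G/E pair with a data-space discriminator (sketch).
import torch
import torch.nn.functional as F

def train_step(G, E, D, opt_g, opt_e, opt_d, real_x, latent_dim):
    """Assumes D outputs one logit per sample, shape (b, 1)."""
    b = real_x.size(0)
    z = torch.randn(b, latent_dim, device=real_x.device)
    ones = torch.ones(b, 1, device=real_x.device)
    zeros = torch.zeros(b, 1, device=real_x.device)

    # Discriminator: distinguish real data from G(z).
    fake_x = G(z).detach()
    d_loss = (F.binary_cross_entropy_with_logits(D(real_x), ones)
              + F.binary_cross_entropy_with_logits(D(fake_x), zeros))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: produce samples the discriminator accepts as real.
    g_loss = F.binary_cross_entropy_with_logits(D(G(z)), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # Encoder: act as an "inverse mapping" by recovering z from G(z).
    e_loss = F.mse_loss(E(G(z).detach()), z)
    opt_e.zero_grad(); e_loss.backward(); opt_e.step()
    return d_loss.item(), g_loss.item(), e_loss.item()
```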
