Search Results for author: Wenbo Guo

Found 31 papers, 15 papers with code

Co-PatcheR: Collaborative Software Patching with Component(s)-specific Small Reasoning Models

no code implementations25 May 2025 Yuheng Tang, Hongwei Li, Kaijie Zhu, Michael Yang, Yangruibo Ding, Wenbo Guo

Motivated by the collaborative nature, we propose Co-PatcheR, the first collaborative patching system with small and specialized reasoning models for individual components.

AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents

no code implementations9 May 2025 Zhun Wang, Vincent Siu, Zhe Ye, Tianneng Shi, Yuzhou Nie, Xuandong Zhao, Chenguang Wang, Wenbo Guo, Dawn Song

The strong planning and reasoning capabilities of Large Language Models (LLMs) have fostered the development of agent-based systems capable of leveraging external tools and interacting with increasingly complex environments.

Navigate Red Teaming

Progent: Programmable Privilege Control for LLM Agents

1 code implementation16 Apr 2025 Tianneng Shi, Jingxuan He, Zhun Wang, Linyu Wu, Hongwei Li, Wenbo Guo, Dawn Song

At its core is a domain-specific language for flexibly expressing privilege control policies applied during agent execution.

Blocking

Frontier AI's Impact on the Cybersecurity Landscape

no code implementations7 Apr 2025 Wenbo Guo, Yujin Potter, Tianneng Shi, Zhun Wang, Andy Zhang, Dawn Song

As frontier AI advances rapidly, understanding its impact on cybersecurity and inherent risks is essential to ensuring safe AI evolution (e. g., guiding risk mitigation and informing policymakers).

3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly

1 code implementation9 Feb 2025 Enquan Yang, Peng Xing, Hanyang Sun, Wenbo Guo, Yuanwei Ma, Zechao Li, Dan Zeng

The key features of 3CAD are that it covers anomalous regions of different sizes, multiple anomaly types, and the possibility of multiple anomalous regions and multiple anomaly types per anomaly image.

Unsupervised Anomaly Detection

MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents

1 code implementation7 Feb 2025 Kaijie Zhu, Xianjun Yang, Jindong Wang, Wenbo Guo, William Yang Wang

Recent research has explored that LLM agents are vulnerable to indirect prompt injection (IPI) attacks, where malicious tasks embedded in tool-retrieved information can redirect the agent to take unauthorized actions.

PatchPilot: A Cost-Efficient Software Engineering Agent with Early Attempts on Formal Verification

1 code implementation4 Feb 2025 Hongwei Li, Yuheng Tang, Shiqi Wang, Wenbo Guo

Based on how to determine the patching workflows, existing patching agents can be categorized as agent-based planning methods, which rely on LLMs for planning, and rule-based planning methods, which follow a pre-defined workflow.

Data Free Backdoor Attacks

1 code implementation9 Dec 2024 Bochuan Cao, Jinyuan Jia, Chuxuan Hu, Wenbo Guo, Zhen Xiang, Jinghui Chen, Bo Li, Dawn Song

Existing backdoor attacks require either retraining the classifier with some clean data or modifying the model's architecture.

Backdoor Attack

PrivAgent: Agentic-based Red-teaming for LLM Privacy Leakage

1 code implementation7 Dec 2024 Yuzhou Nie, Zhun Wang, Ye Yu, Xian Wu, Xuandong Zhao, Wenbo Guo, Dawn Song

We also show PrivAgent's effectiveness in extracting training data from an open-source LLM with a success rate of 5. 9%.

Red Teaming Safety Alignment

SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI

no code implementations14 Oct 2024 Yu Yang, Yuzhou Nie, Zhun Wang, Yuheng Tang, Wenbo Guo, Bo Li, Dawn Song

Our methodology ensures the data quality while enabling large-scale generation.

BlockFound: Customized blockchain foundation model for anomaly detection

no code implementations5 Oct 2024 Jiahao Yu, Xian Wu, Hao liu, Wenbo Guo, Xinyu Xing

We propose BlockFound, a customized foundation model for anomaly blockchain transaction detection.

Anomaly Detection model

F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI

no code implementations3 Oct 2024 Xu Zheng, Farhad Shirani, Zhuomin Chen, Chaohao Lin, Wei Cheng, Wenbo Guo, Dongsheng Luo

This approach although efficient suffers the Out-of-Distribution (OOD) problem as the perturbed samples may no longer follow the original data distribution.

Enhancing Jailbreak Attack Against Large Language Models through Silent Tokens

no code implementations31 May 2024 Jiahao Yu, Haozheng Luo, Jerry Yao-Chieh Hu, Wenbo Guo, Han Liu, Xinyu Xing

Attackers carefully craft jailbreaking prompts such that a target LLM will respond to the harmful question.

Safety Alignment

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

1 code implementation19 Nov 2023 Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song

In this work, we propose TextGuard, the first provable defense against backdoor attacks on text classification.

Sentence text-classification +1

netFound: Foundation Model for Network Security

1 code implementation25 Oct 2023 Satyandra Guthula, Roman Beltiukov, Navya Battula, Wenbo Guo, Arpit Gupta, Inder Monga

This lack of progress can be attributed to an overreliance on supervised learning techniques and the associated challenges of curating well-specified labeled training data.

Feature Engineering feature selection +5

Unique Identification of 50,000+ Virtual Reality Users from Head & Hand Motion Data

1 code implementation17 Feb 2023 Vivek Nair, Wenbo Guo, Justus Mattern, Rui Wang, James F. O'Brien, Louis Rosenberg, Dawn Song

With the recent explosive growth of interest and investment in virtual reality (VR) and the so-called "metaverse," public attention has rightly shifted toward the unique security and privacy threats that these platforms may pose.

Are Shortest Rationales the Best Explanations for Human Understanding?

1 code implementation ACL 2022 Hua Shen, Tongshuang Wu, Wenbo Guo, Ting-Hao 'Kenneth' Huang

Existing self-explaining models typically favor extracting the shortest possible rationales - snippets of an input text "responsible for" corresponding output - to explain the model prediction, with the assumption that shorter rationales are more intuitive to humans.

EDGE: Explaining Deep Reinforcement Learning Policies

1 code implementation NeurIPS 2021 Wenbo Guo, Xian Wu, Usmann Khan, Xinyu Xing

With the rapid development of deep reinforcement learning (DRL) techniques, there is an increasing need to understand and interpret DRL policies.

Deep Reinforcement Learning MuJoCo +4

BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning

no code implementations2 May 2021 Lun Wang, Zaynah Javed, Xian Wu, Wenbo Guo, Xinyu Xing, Dawn Song

Recent research has confirmed the feasibility of backdoor attacks in deep reinforcement learning (RL) systems.

Atari Games Backdoor Attack +3

Decoy-enhanced Saliency Maps

no code implementations1 Jan 2021 Yang Young Lu, Wenbo Guo, Xinyu Xing, William Noble

Saliency methods can make deep neural network predictions more interpretable by identifying a set of critical features in an input sample, such as pixels that contribute most strongly to a prediction made by an image classifier.

DANCE: Enhancing saliency maps using decoys

1 code implementation3 Feb 2020 Yang Lu, Wenbo Guo, Xinyu Xing, William Stafford Noble

Saliency methods can make deep neural network predictions more interpretable by identifying a set of critical features in an input sample, such as pixels that contribute most strongly to a prediction made by an image classifier.

Adversarial Attack

Robust saliency maps with distribution-preserving decoys

no code implementations25 Sep 2019 Yang Young Lu, Wenbo Guo, Xinyu Xing, William Stafford Noble

In this work, we propose a data-driven technique that uses the distribution-preserving decoys to infer robust saliency scores in conjunction with a pre-trained convolutional neural network classifier and any off-the-shelf saliency method.

Adversarial Attack

TABOR: A Highly Accurate Approach to Inspecting and Restoring Trojan Backdoors in AI Systems

1 code implementation2 Aug 2019 Wenbo Guo, Lun Wang, Xinyu Xing, Min Du, Dawn Song

As such, given a deep neural network model and clean input samples, it is very challenging to inspect and determine the existence of a trojan backdoor.

Anomaly Detection

Explaining Deep Learning Models -- A Bayesian Non-parametric Approach

no code implementations NeurIPS 2018 Wenbo Guo, Sui Huang, Yunzhe Tao, Xinyu Xing, Lin Lin

The empirical results indicate that our proposed approach not only outperforms the state-of-the-art techniques in explaining individual decisions but also provides users with an ability to discover the vulnerabilities of the target ML models.

Deep Learning

Explaining Deep Learning Models - A Bayesian Non-parametric Approach

no code implementations7 Nov 2018 Wenbo Guo, Sui Huang, Yunzhe Tao, Xinyu Xing, Lin Lin

The empirical results indicate that our proposed approach not only outperforms the state-of-the-art techniques in explaining individual decisions but also provides users with an ability to discover the vulnerabilities of the target ML models.

Deep Learning

Lemna: Explaining deep learning based security applications

1 code implementation Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security 2018 Wenbo Guo, Dongliang Mu5, Jun Xu4, Purui Su6, Gang Wang3, Xinyu Xing

The local interpretable modelis specially designed to (1) handle feature dependency to betterwork with security applications (e. g., binary code analysis); and(2) handle nonlinear local boundaries to boost explanation delity. We evaluate our system using two popular deep learning applica-tions in security (a malware classier, and a function start detectorfor binary reverse-engineering).

Deep Learning

Towards Interrogating Discriminative Machine Learning Models

no code implementations23 May 2017 Wenbo Guo, Kaixuan Zhang, Lin Lin, Sui Huang, Xinyu Xing

Our results indicate that the proposed approach not only outperforms the state-of-the-art technique in explaining individual decisions but also provides users with an ability to discover the vulnerabilities of a learning model.

BIG-bench Machine Learning

Learning Adversary-Resistant Deep Neural Networks

no code implementations5 Dec 2016 Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, C. Lee Giles

Despite the superior performance of DNNs in these applications, it has been recently shown that these models are susceptible to a particular type of attack that exploits a fundamental flaw in their design.

Autonomous Vehicles

Using Non-invertible Data Transformations to Build Adversarial-Robust Neural Networks

no code implementations6 Oct 2016 Qinglong Wang, Wenbo Guo, Alexander G. Ororbia II, Xinyu Xing, Lin Lin, C. Lee Giles, Xue Liu, Peng Liu, Gang Xiong

Deep neural networks have proven to be quite effective in a wide variety of machine learning tasks, ranging from improved speech recognition systems to advancing the development of autonomous vehicles.

Autonomous Vehicles Dimensionality Reduction +2

Adversary Resistant Deep Neural Networks with an Application to Malware Detection

no code implementations5 Oct 2016 Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, C. Lee Giles, Xue Liu

However, after a thorough analysis of the fundamental flaw in DNNs, we discover that the effectiveness of current defenses is limited and, more importantly, cannot provide theoretical guarantees as to their robustness against adversarial sampled-based attacks.

Deep Learning Information Retrieval +4

Cannot find the paper you are looking for? You can Submit a new open access paper.