no code implementations • 25 May 2025 • Yuheng Tang, Hongwei Li, Kaijie Zhu, Michael Yang, Yangruibo Ding, Wenbo Guo
Motivated by the collaborative nature, we propose Co-PatcheR, the first collaborative patching system with small and specialized reasoning models for individual components.
no code implementations • 9 May 2025 • Zhun Wang, Vincent Siu, Zhe Ye, Tianneng Shi, Yuzhou Nie, Xuandong Zhao, Chenguang Wang, Wenbo Guo, Dawn Song
The strong planning and reasoning capabilities of Large Language Models (LLMs) have fostered the development of agent-based systems capable of leveraging external tools and interacting with increasingly complex environments.
1 code implementation • 16 Apr 2025 • Tianneng Shi, Jingxuan He, Zhun Wang, Linyu Wu, Hongwei Li, Wenbo Guo, Dawn Song
At its core is a domain-specific language for flexibly expressing privilege control policies applied during agent execution.
no code implementations • 7 Apr 2025 • Wenbo Guo, Yujin Potter, Tianneng Shi, Zhun Wang, Andy Zhang, Dawn Song
As frontier AI advances rapidly, understanding its impact on cybersecurity and inherent risks is essential to ensuring safe AI evolution (e. g., guiding risk mitigation and informing policymakers).
1 code implementation • 9 Feb 2025 • Enquan Yang, Peng Xing, Hanyang Sun, Wenbo Guo, Yuanwei Ma, Zechao Li, Dan Zeng
The key features of 3CAD are that it covers anomalous regions of different sizes, multiple anomaly types, and the possibility of multiple anomalous regions and multiple anomaly types per anomaly image.
1 code implementation • 7 Feb 2025 • Kaijie Zhu, Xianjun Yang, Jindong Wang, Wenbo Guo, William Yang Wang
Recent research has explored that LLM agents are vulnerable to indirect prompt injection (IPI) attacks, where malicious tasks embedded in tool-retrieved information can redirect the agent to take unauthorized actions.
1 code implementation • 4 Feb 2025 • Hongwei Li, Yuheng Tang, Shiqi Wang, Wenbo Guo
Based on how to determine the patching workflows, existing patching agents can be categorized as agent-based planning methods, which rely on LLMs for planning, and rule-based planning methods, which follow a pre-defined workflow.
1 code implementation • 9 Dec 2024 • Bochuan Cao, Jinyuan Jia, Chuxuan Hu, Wenbo Guo, Zhen Xiang, Jinghui Chen, Bo Li, Dawn Song
Existing backdoor attacks require either retraining the classifier with some clean data or modifying the model's architecture.
1 code implementation • 7 Dec 2024 • Yuzhou Nie, Zhun Wang, Ye Yu, Xian Wu, Xuandong Zhao, Wenbo Guo, Dawn Song
We also show PrivAgent's effectiveness in extracting training data from an open-source LLM with a success rate of 5. 9%.
no code implementations • 14 Oct 2024 • Yu Yang, Yuzhou Nie, Zhun Wang, Yuheng Tang, Wenbo Guo, Bo Li, Dawn Song
Our methodology ensures the data quality while enabling large-scale generation.
no code implementations • 5 Oct 2024 • Jiahao Yu, Xian Wu, Hao liu, Wenbo Guo, Xinyu Xing
We propose BlockFound, a customized foundation model for anomaly blockchain transaction detection.
no code implementations • 3 Oct 2024 • Xu Zheng, Farhad Shirani, Zhuomin Chen, Chaohao Lin, Wei Cheng, Wenbo Guo, Dongsheng Luo
This approach although efficient suffers the Out-of-Distribution (OOD) problem as the perturbed samples may no longer follow the original data distribution.
no code implementations • 31 May 2024 • Jiahao Yu, Haozheng Luo, Jerry Yao-Chieh Hu, Wenbo Guo, Han Liu, Xinyu Xing
Attackers carefully craft jailbreaking prompts such that a target LLM will respond to the harmful question.
1 code implementation • 19 Nov 2023 • Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song
In this work, we propose TextGuard, the first provable defense against backdoor attacks on text classification.
1 code implementation • 25 Oct 2023 • Satyandra Guthula, Roman Beltiukov, Navya Battula, Wenbo Guo, Arpit Gupta, Inder Monga
This lack of progress can be attributed to an overreliance on supervised learning techniques and the associated challenges of curating well-specified labeled training data.
1 code implementation • 15 Jun 2023 • Roman Beltiukov, Wenbo Guo, Arpit Gupta, Walter Willinger
This issue is commonly referred to as the generalizability problem of ML models.
1 code implementation • 17 Feb 2023 • Vivek Nair, Wenbo Guo, Justus Mattern, Rui Wang, James F. O'Brien, Louis Rosenberg, Dawn Song
With the recent explosive growth of interest and investment in virtual reality (VR) and the so-called "metaverse," public attention has rightly shifted toward the unique security and privacy threats that these platforms may pose.
1 code implementation • ACL 2022 • Hua Shen, Tongshuang Wu, Wenbo Guo, Ting-Hao 'Kenneth' Huang
Existing self-explaining models typically favor extracting the shortest possible rationales - snippets of an input text "responsible for" corresponding output - to explain the model prediction, with the assumption that shorter rationales are more intuitive to humans.
1 code implementation • NeurIPS 2021 • Wenbo Guo, Xian Wu, Usmann Khan, Xinyu Xing
With the rapid development of deep reinforcement learning (DRL) techniques, there is an increasing need to understand and interpret DRL policies.
no code implementations • 2 May 2021 • Lun Wang, Zaynah Javed, Xian Wu, Wenbo Guo, Xinyu Xing, Dawn Song
Recent research has confirmed the feasibility of backdoor attacks in deep reinforcement learning (RL) systems.
no code implementations • 1 Jan 2021 • Yang Young Lu, Wenbo Guo, Xinyu Xing, William Noble
Saliency methods can make deep neural network predictions more interpretable by identifying a set of critical features in an input sample, such as pixels that contribute most strongly to a prediction made by an image classifier.
1 code implementation • 3 Feb 2020 • Yang Lu, Wenbo Guo, Xinyu Xing, William Stafford Noble
Saliency methods can make deep neural network predictions more interpretable by identifying a set of critical features in an input sample, such as pixels that contribute most strongly to a prediction made by an image classifier.
no code implementations • 25 Sep 2019 • Yang Young Lu, Wenbo Guo, Xinyu Xing, William Stafford Noble
In this work, we propose a data-driven technique that uses the distribution-preserving decoys to infer robust saliency scores in conjunction with a pre-trained convolutional neural network classifier and any off-the-shelf saliency method.
1 code implementation • 2 Aug 2019 • Wenbo Guo, Lun Wang, Xinyu Xing, Min Du, Dawn Song
As such, given a deep neural network model and clean input samples, it is very challenging to inspect and determine the existence of a trojan backdoor.
no code implementations • NeurIPS 2018 • Wenbo Guo, Sui Huang, Yunzhe Tao, Xinyu Xing, Lin Lin
The empirical results indicate that our proposed approach not only outperforms the state-of-the-art techniques in explaining individual decisions but also provides users with an ability to discover the vulnerabilities of the target ML models.
no code implementations • 7 Nov 2018 • Wenbo Guo, Sui Huang, Yunzhe Tao, Xinyu Xing, Lin Lin
The empirical results indicate that our proposed approach not only outperforms the state-of-the-art techniques in explaining individual decisions but also provides users with an ability to discover the vulnerabilities of the target ML models.
1 code implementation • Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security 2018 • Wenbo Guo, Dongliang Mu5, Jun Xu4, Purui Su6, Gang Wang3, Xinyu Xing
The local interpretable modelis specially designed to (1) handle feature dependency to betterwork with security applications (e. g., binary code analysis); and(2) handle nonlinear local boundaries to boost explanation delity. We evaluate our system using two popular deep learning applica-tions in security (a malware classier, and a function start detectorfor binary reverse-engineering).
no code implementations • 23 May 2017 • Wenbo Guo, Kaixuan Zhang, Lin Lin, Sui Huang, Xinyu Xing
Our results indicate that the proposed approach not only outperforms the state-of-the-art technique in explaining individual decisions but also provides users with an ability to discover the vulnerabilities of a learning model.
no code implementations • 5 Dec 2016 • Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, C. Lee Giles
Despite the superior performance of DNNs in these applications, it has been recently shown that these models are susceptible to a particular type of attack that exploits a fundamental flaw in their design.
no code implementations • 6 Oct 2016 • Qinglong Wang, Wenbo Guo, Alexander G. Ororbia II, Xinyu Xing, Lin Lin, C. Lee Giles, Xue Liu, Peng Liu, Gang Xiong
Deep neural networks have proven to be quite effective in a wide variety of machine learning tasks, ranging from improved speech recognition systems to advancing the development of autonomous vehicles.
no code implementations • 5 Oct 2016 • Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, C. Lee Giles, Xue Liu
However, after a thorough analysis of the fundamental flaw in DNNs, we discover that the effectiveness of current defenses is limited and, more importantly, cannot provide theoretical guarantees as to their robustness against adversarial sampled-based attacks.