no code implementations • ICML 2020 • Yuh-Shyang Wang, Tsui-Wei Weng, Luca Daniel
In this paper, we show how to combine recent works on static neural network certification tools with robust control theory to certify a neural network policy in a control loop.
no code implementations • 5 Jul 2024 • Chung-En Sun, Tuomas Oikarinen, Tsui-Wei Weng
We introduce the Concept Bottleneck Large Language Model (CB-LLM), a pioneering approach to creating inherently interpretable Large Language Models (LLMs).
1 code implementation • 26 Jun 2024 • Chung-En Sun, Sicun Gao, Tsui-Wei Weng
However, current smoothed DRL agents exhibit a notable performance gap, often characterized by markedly low clean rewards and weak robustness.
no code implementations • 24 Jun 2024 • Tung-Yu Wu, Yu-Xiang Lin, Tsui-Wei Weng
Neuron-level interpretations aim to explain network behaviors and properties by investigating neurons responsive to specific perceptual or structural input patterns.
1 code implementation • 10 May 2024 • Tuomas Oikarinen, Tsui-Wei Weng
In recent years many methods have been developed to understand the internal workings of neural networks, often by describing the function of individual neurons in the model.
1 code implementation • 30 Apr 2024 • Ge Yan, Yaniv Romano, Tsui-Wei Weng
To address these limitations, we first propose a novel framework called RSCP+ to provide provable robustness guarantee in evaluation, which fixes the issues in the original RSCP method.
1 code implementation • 20 Mar 2024 • Nicholas Bai, Rahul A. Iyer, Tuomas Oikarinen, Tsui-Wei Weng
In this paper, we propose Describe-and-Dissect (DnD), a novel method to describe the roles of hidden neurons in vision networks.
1 code implementation • 16 Dec 2023 • Wang Zhang, Ziwen Ma, Subhro Das, Tsui-Wei Weng, Alexandre Megretski, Luca Daniel, Lam M. Nguyen
Neural networks are powerful tools in various applications, and quantifying their uncertainty is crucial for reliable decision-making.
1 code implementation • ICCV 2023 • Divyansh Srivastava, Tuomas Oikarinen, Tsui-Wei Weng
The inability of DNNs to explain their black-box behavior has led to a recent surge of explainability methods.
no code implementations • 11 Oct 2023 • Linbo Liu, Trong Nghia Hoang, Lam M. Nguyen, Tsui-Wei Weng
The second approach introduces a post-processing method, EsbRS, which greatly improves the robustness certificate by building model ensembles.
no code implementations • 9 Oct 2023 • Justin Lee, Tuomas Oikarinen, Arjun Chatha, Keng-Chi Chang, Yilan Chen, Tsui-Wei Weng
Recent advances have greatly increased the capabilities of large language models (LLMs), but our understanding of the models and their safety has not progressed as fast.
1 code implementation • 24 Aug 2023 • Avni Kothari, Bogdan Kulynych, Tsui-Wei Weng, Berk Ustun
As a result, they can assign predictions that are fixed, meaning that individuals who are denied loans and interviews are, in fact, precluded from access to credit and employment.
no code implementations • 26 Apr 2023 • Mohammad Ali Khan, Tuomas Oikarinen, Tsui-Wei Weng
In this work, we propose a general framework called Concept-Monitor to help demystify the black-box DNN training processes automatically using a novel unified embedding space and concept diversity metric.
2 code implementations • 12 Apr 2023 • Tuomas Oikarinen, Subhro Das, Lam M. Nguyen, Tsui-Wei Weng
Motivated by these challenges, we propose Label-free CBM, a novel framework that transforms any neural network into an interpretable CBM without labeled concept data while retaining high accuracy.
no code implementations • 2 Apr 2023 • Ligong Han, Seungwook Han, Shivchander Sudalairaj, Charlotte Loh, Rumen Dangovski, Fei Deng, Pulkit Agrawal, Dimitris Metaxas, Leonid Karlinsky, Tsui-Wei Weng, Akash Srivastava
Recently, several attempts have been made to replace such domain-specific, human-designed transformations with generated views that are learned.
1 code implementation • 11 Feb 2023 • Wang Zhang, Tsui-Wei Weng, Subhro Das, Alexandre Megretski, Luca Daniel, Lam M. Nguyen
Deep neural networks (DNNs) have shown great capacity for modeling dynamical systems; nevertheless, they usually do not obey physical constraints such as conservation laws.
no code implementations • 26 Jan 2023 • Alex Gu, Tsui-Wei Weng, Pin-Yu Chen, Sijia Liu, Luca Daniel
Interpreting machine learning models is challenging but crucial for ensuring the safety of deep networks in autonomous driving systems.
no code implementations • 20 Oct 2022 • Chester Holtz, Tsui-Wei Weng, Gal Mishne
There has been great interest in enhancing the robustness of neural network classifiers to defend against adversarial perturbations through adversarial training, while balancing the trade-off between robust accuracy and standard accuracy.
2 code implementations • 23 Apr 2022 • Tuomas Oikarinen, Tsui-Wei Weng
Finally, CLIP-Dissect is computationally efficient and can label all neurons from five layers of ResNet-50 in just 4 minutes, more than 10 times faster than existing methods.
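A minimal sketch of the core matching step behind such neuron labeling, under the assumption that probe-image activations and a CLIP image-text similarity matrix have already been computed (random arrays stand in for both here); plain cosine-style correlation replaces the paper's actual scoring function, and all variable names are hypothetical.

```python
# Hedged sketch of a CLIP-Dissect-style neuron labeling step.
# Assumes per-image neuron activations and a CLIP image-text similarity
# matrix are precomputed; random data stands in for both here.
import numpy as np

rng = np.random.default_rng(0)
n_images, n_neurons, n_concepts = 1000, 512, 200

# activations[i, k]: activation of neuron k on probe image i
activations = rng.standard_normal((n_images, n_neurons))
# clip_sim[i, c]: CLIP similarity between probe image i and concept text c
clip_sim = rng.standard_normal((n_images, n_concepts))

def normalize(x):
    return (x - x.mean(0)) / (x.std(0) + 1e-8)

act_n, sim_n = normalize(activations), normalize(clip_sim)
# score[k, c]: how well neuron k's activation pattern matches concept c
score = act_n.T @ sim_n / n_images
best_concept = score.argmax(axis=1)  # best-matching concept index per neuron
print(best_concept[:10])
```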
no code implementations • 7 Feb 2022 • Nhan H. Pham, Lam M. Nguyen, Jie Chen, Hoang Thanh Lam, Subhro Das, Tsui-Wei Weng
In recent years, a proliferation of methods has been developed for cooperative multi-agent reinforcement learning (c-MARL).
1 code implementation • NeurIPS 2021 • Yilan Chen, Wei Huang, Lam M. Nguyen, Tsui-Wei Weng
Therefore, in this work, we propose to establish the equivalence between NNs and SVMs: specifically, between the infinitely wide NN trained with the soft margin loss and the standard soft margin SVM with the NTK kernel trained by subgradient descent.
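To illustrate the kernel-machine side of this equivalence, here is a hedged sketch that plugs the closed-form NTK of a two-layer ReLU network (up to scaling conventions) into a standard soft-margin SVM; sklearn's solver is used in place of subgradient descent, and the data is a random placeholder.

```python
# Hedged sketch: soft-margin SVM with the closed-form NTK of a two-layer
# ReLU network as its kernel (sklearn's solver, not subgradient descent).
import numpy as np
from sklearn.svm import SVC

def ntk_two_layer_relu(X1, X2):
    """NTK of an infinitely wide two-layer ReLU network (up to scaling)."""
    n1 = np.linalg.norm(X1, axis=1, keepdims=True)
    n2 = np.linalg.norm(X2, axis=1, keepdims=True)
    dot = X1 @ X2.T
    cos = np.clip(dot / (n1 * n2.T + 1e-12), -1.0, 1.0)
    theta = np.arccos(cos)
    k0 = (np.pi - theta) / (2 * np.pi)                        # derivative kernel
    k1 = (n1 * n2.T) * (np.sin(theta) + (np.pi - theta) * cos) / (2 * np.pi)
    return dot * k0 + k1

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])

K = ntk_two_layer_relu(X, X)
svm = SVC(kernel="precomputed", C=1.0).fit(K, y)   # standard soft-margin SVM
print("train accuracy:", svm.score(K, y))
```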
no code implementations • 29 Sep 2021 • Victor Rong, Alexandre Megretski, Luca Daniel, Tsui-Wei Weng
Recent developments on the robustness of neural networks have primarily emphasized the notion of worst-case adversarial robustness in both verification and robust training.
no code implementations • 29 Sep 2021 • Chester Holtz, Tsui-Wei Weng, Gal Mishne
There has been great interest in enhancing the robustness of neural network classifiers to defend against adversarial perturbations through adversarial training, while balancing the trade-off between robust accuracy and standard accuracy.
no code implementations • 29 Sep 2021 • Wang Zhang, Lam M. Nguyen, Subhro Das, Pin-Yu Chen, Sijia Liu, Alexandre Megretski, Luca Daniel, Tsui-Wei Weng
In verification-based robust training, existing methods utilize relaxation-based techniques to bound the worst-case performance of neural networks under a given perturbation.
no code implementations • ICLR 2022 • Asaf Gendler, Tsui-Wei Weng, Luca Daniel, Yaniv Romano
By combining conformal prediction with randomized smoothing, our proposed method forms a prediction set with finite-sample coverage guarantee that holds for any data distribution with $\ell_2$-norm bounded adversarial noise, generated by any adversarial attack algorithm.
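A hedged sketch of the basic mechanism, combining split conformal prediction with a randomized-smoothing-style score: per-class scores are averaged over Gaussian perturbations of the input before the usual calibration-quantile construction. The toy linear "model" and all names are hypothetical placeholders, not the paper's actual procedure.

```python
# Hedged sketch: split conformal prediction with a smoothed nonconformity score.
import numpy as np

rng = np.random.default_rng(0)
n_cal, n_class, d, sigma, alpha = 500, 4, 10, 0.25, 0.1

W = rng.standard_normal((d, n_class))              # toy linear classifier
def smoothed_probs(x, n_noise=64):
    noisy = x + sigma * rng.standard_normal((n_noise, d))
    logits = noisy @ W
    p = np.exp(logits - logits.max(1, keepdims=True))
    return (p / p.sum(1, keepdims=True)).mean(0)   # Monte Carlo smoothed scores

X_cal = rng.standard_normal((n_cal, d))
y_cal = rng.integers(0, n_class, n_cal)

# Nonconformity score: 1 - smoothed probability of the true class
cal_scores = np.array([1 - smoothed_probs(x)[y] for x, y in zip(X_cal, y_cal)])
q = np.quantile(cal_scores, np.ceil((n_cal + 1) * (1 - alpha)) / n_cal)

x_test = rng.standard_normal(d)
pred_set = [c for c in range(n_class) if 1 - smoothed_probs(x_test)[c] <= q]
print("prediction set:", pred_set)
```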
no code implementations • 29 Sep 2021 • Nhan Pham, Lam M. Nguyen, Jie Chen, Thanh Lam Hoang, Subhro Das, Tsui-Wei Weng
In recent years, a proliferation of methods has been developed for multi-agent reinforcement learning (MARL).
1 code implementation • ICLR 2021 • Ren Wang, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Chuang Gan, Meng Wang
Despite the generalization power of the meta-model, it remains unclear how adversarial robustness can be maintained by MAML in few-shot learning.
no code implementations • 1 Feb 2021 • Akhilan Boopathy, Tsui-Wei Weng, Sijia Liu, Pin-Yu Chen, Gaoyuan Zhang, Luca Daniel
Recent works have developed several methods of defending neural networks against adversarial attacks with certified guarantees.
no code implementations • NeurIPS 2020 • Jeet Mohapatra, Ching-Yun Ko, Tsui-Wei Weng, Pin-Yu Chen, Sijia Liu, Luca Daniel
We also provide a framework that generalizes the calculation for certification using higher-order information.
2 code implementations • NeurIPS 2021 • Tuomas Oikarinen, Wang Zhang, Alexandre Megretski, Luca Daniel, Tsui-Wei Weng
To address this issue, we propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against $l_p$-norm bounded adversarial attacks.
no code implementations • ICLR 2020 • Tsui-Wei Weng, Krishnamurthy (Dj) Dvijotham*, Jonathan Uesato*, Kai Xiao*, Sven Gowal*, Robert Stanforth*, Pushmeet Kohli
Deep reinforcement learning has achieved great success in many previously difficult reinforcement learning tasks, yet recent studies show that deep RL agents are also unavoidably susceptible to adversarial perturbations, similar to deep neural networks in classification tasks.
no code implementations • 18 Aug 2019 • Yuh-Shyang Wang, Tsui-Wei Weng, Luca Daniel
In this paper, we show how to combine recent works on neural network certification tools (which are mainly used in static settings such as image classification) with robust control theory to certify a neural network policy in a control loop.
1 code implementation • 10 Jun 2019 • Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, Xue Lin
Graph neural networks (GNNs), which apply deep neural networks to graph data, have achieved strong performance on the task of semi-supervised node classification.
2 code implementations • 17 May 2019 • Ching-Yun Ko, Zhaoyang Lyu, Tsui-Wei Weng, Luca Daniel, Ngai Wong, Dahua Lin
The vulnerability to adversarial attacks has been a critical issue for deep neural networks.
no code implementations • 22 Jan 2019 • Lam M. Nguyen, Marten van Dijk, Dzung T. Phan, Phuong Ha Nguyen, Tsui-Wei Weng, Jayant R. Kalagnanam
The total complexity (measured as the total number of gradient computations) of a stochastic first-order optimization algorithm that finds a first-order stationary point of a finite-sum smooth nonconvex objective function $F(w)=\frac{1}{n} \sum_{i=1}^n f_i(w)$ has been proven to be at least $\Omega(\sqrt{n}/\epsilon)$ for $n \leq \mathcal{O}(\epsilon^{-2})$ where $\epsilon$ denotes the attained accuracy $\mathbb{E}[ \|\nabla F(\tilde{w})\|^2] \leq \epsilon$ for the outputted approximation $\tilde{w}$ (Fang et al., 2018).
no code implementations • 18 Dec 2018 • Tsui-Wei Weng, Pin-Yu Chen, Lam M. Nguyen, Mark S. Squillante, Ivan Oseledets, Luca Daniel
With deep neural networks providing state-of-the-art models for numerous machine learning tasks, quantifying the robustness of these models has become an important area of research.
2 code implementations • 29 Nov 2018 • Akhilan Boopathy, Tsui-Wei Weng, Pin-Yu Chen, Sijia Liu, Luca Daniel
This motivates us to propose a general and efficient framework, CNN-Cert, that is capable of certifying robustness on general convolutional neural networks.
14 code implementations • NeurIPS 2018 • Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, Luca Daniel
Finding minimum distortion of adversarial examples and thus certifying robustness in neural network classifiers for given data points is known to be a challenging problem.
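For intuition on what a robustness certificate computes, here is a hedged sketch of interval bound propagation (IBP), a much looser baseline than CROWN's linear bounds, shown only to illustrate certified output ranges under an $\ell_\infty$ perturbation of radius eps; the weights and input are random placeholders.

```python
# Hedged sketch: interval bound propagation (IBP) through a small ReLU network.
# This is a simpler, looser technique than CROWN; it only illustrates how
# certified output bounds under an l_inf perturbation can be computed.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((16, 8)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((3, 16)), rng.standard_normal(3)

def ibp_bounds(lo, hi, W, b):
    """Propagate elementwise lower/upper bounds through an affine layer."""
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    c = W @ center + b
    r = np.abs(W) @ radius
    return c - r, c + r

x, eps = rng.standard_normal(8), 0.1
lo, hi = x - eps, x + eps
lo, hi = ibp_bounds(lo, hi, W1, b1)
lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)   # ReLU is monotone
lo, hi = ibp_bounds(lo, hi, W2, b2)
print("certified output bounds:", list(zip(lo.round(2), hi.round(2))))
```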
1 code implementation • 19 Oct 2018 • Tsui-Wei Weng, Huan Zhang, Pin-Yu Chen, Aurelie Lozano, Cho-Jui Hsieh, Luca Daniel
We apply extreme value theory to the new formal robustness guarantee, and the resulting estimated robustness is called the second-order CLEVER score.
6 code implementations • ICML 2018 • Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel
Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17].
1 code implementation • ICLR 2018 • Tsui-Wei Weng, Huan Zhang, Pin-Yu Chen, Jin-Feng Yi, Dong Su, Yupeng Gao, Cho-Jui Hsieh, Luca Daniel
Our analysis yields a novel robustness metric called CLEVER, which is short for Cross Lipschitz Extreme Value for nEtwork Robustness.
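A hedged, much-simplified sketch of a CLEVER-like estimate: sample points in an $\ell_2$ ball around the input, record gradient norms of the class margin, and use the empirical maximum as a crude stand-in for the reverse-Weibull extreme value fit used by the actual method; the two-layer model and all sizes are hypothetical placeholders.

```python
# Hedged sketch of a CLEVER-style local Lipschitz estimate via gradient sampling.
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(8, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 3))

def clever_like_estimate(x, target_class, other_class, radius=0.5, n_samples=256):
    grad_norms = []
    for _ in range(n_samples):
        delta = torch.randn_like(x)
        delta = radius * torch.rand(1) * delta / delta.norm()  # point in l2 ball
        xs = (x + delta).requires_grad_(True)
        out = model(xs)
        margin = out[target_class] - out[other_class]
        g, = torch.autograd.grad(margin, xs)
        grad_norms.append(g.norm().item())
    # The real CLEVER score fits a reverse Weibull to batch maxima; the plain
    # maximum here is a rough stand-in for that extreme value estimate.
    return max(grad_norms)

x = torch.randn(8)
lip = clever_like_estimate(x, target_class=0, other_class=1)
print("estimated local Lipschitz constant:", round(lip, 3))
```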