Large language models (LLMs) have seen significant adoption for natural language tasks, owing their success to massive numbers of model parameters (e.g., 70B+); however, LLM inference incurs significant computation and memory costs.
The emergence of activation sparsity in LLMs provides a natural way to reduce this cost by activating only a subset of the parameters during inference.
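As an illustration only (not any specific paper's method), the sketch below shows why activation sparsity saves work: for a ReLU feed-forward block, neurons whose pre-activation is non-positive contribute nothing, so computing only the active subset reproduces the dense output. The oracle choice of active neurons here is an assumption; real systems predict it with a lightweight router.

```python
# Hedged sketch: exploit activation sparsity in a ReLU feed-forward block by
# touching only the rows/columns of the weights for neurons assumed active.
import numpy as np

def sparse_ffn(x, W_in, b_in, W_out, active_idx):
    """x: (d,) input; W_in: (h, d); W_out: (d, h); active_idx: indices of
    neurons assumed (e.g., by a lightweight predictor) to fire for this input."""
    h_active = np.maximum(W_in[active_idx] @ x + b_in[active_idx], 0.0)  # ReLU on active subset
    return W_out[:, active_idx] @ h_active                               # skip inactive neurons entirely

rng = np.random.default_rng(0)
d, h = 16, 64
x = rng.standard_normal(d)
W_in, b_in, W_out = rng.standard_normal((h, d)), rng.standard_normal(h), rng.standard_normal((d, h))
active = np.flatnonzero(W_in @ x + b_in > 0)                  # oracle choice, for illustration
dense = W_out @ np.maximum(W_in @ x + b_in, 0.0)
assert np.allclose(sparse_ffn(x, W_in, b_in, W_out, active), dense)
```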
In this paper, we recognize that images share common features hierarchically, reflecting the inherent hierarchical structure of the classification system, a property overlooked by current data parameterization methods.
This work aims to address this gap by offering a theoretical characterization of the trade-off between detection and false positive rates for stateful defenses.
CALICO's efficacy is substantiated by extensive evaluations on 3D object detection and BEV map segmentation tasks, where it delivers significant performance improvements.
Such stateful defenses track the query history and detect and reject queries that are "similar" to earlier ones, thereby preventing black-box attacks from extracting useful gradient information and from making progress toward an adversarial example within a reasonable query budget.
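A minimal sketch of this idea follows, assuming raw-pixel features and a simple L2 similarity threshold; deployed stateful defenses typically use learned perceptual embeddings and more efficient nearest-neighbor search.

```python
# Hedged sketch of a stateful defense: reject any query that is too close to a
# previously seen query, otherwise answer it and record it in the history.
import numpy as np

class StatefulDetector:
    def __init__(self, threshold):
        self.threshold = threshold
        self.history = []                                     # flattened past queries

    def query(self, x, model):
        x = np.asarray(x, dtype=float).ravel()
        for prev in self.history:
            if np.linalg.norm(x - prev) < self.threshold:     # "similar" to a past query
                return None                                   # reject: likely attack probing
        self.history.append(x)
        return model(x)                                       # answer benign-looking queries
```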
We then propose a novel one-shot coreset selection method, Coverage-centric Coreset Selection (CCS), that jointly considers overall data coverage of the underlying distribution as well as the importance of each example.
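The sketch below illustrates the coverage-versus-importance idea under simplifying assumptions (precomputed per-example importance scores, equal budget per score stratum); it is not the exact CCS procedure.

```python
# Hedged sketch of coverage-centric selection: bin examples by importance score
# and spread the selection budget across bins so that easy and hard regions of
# the distribution are both covered.
import numpy as np

def coverage_centric_select(scores, budget, n_bins=10, rng=None):
    rng = rng or np.random.default_rng(0)
    order = np.argsort(scores)                     # sort example indices by importance
    bins = np.array_split(order, n_bins)           # strata over the score distribution
    per_bin = budget // n_bins
    selected = [rng.choice(b, size=min(per_bin, len(b)), replace=False) for b in bins]
    return np.concatenate(selected)
```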
Preprocessing and outlier detection techniques have both been applied to neural networks to increase robustness with varying degrees of success.
Based on these metrics, we propose an unsupervised framework for learning a set of concepts that satisfy the desired properties of high detection completeness and concept separability, and demonstrate its effectiveness in providing concept-based explanations for diverse off-the-shelf OOD detectors.
D4 uses an ensemble of models over disjoint subsets of the frequency spectrum to significantly improve adversarial robustness.
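A hedged sketch of the frequency-disjoint ensemble idea is given below; the band boundaries, member models, and band-pass construction are illustrative assumptions, not D4's exact design.

```python
# Hedged sketch: each ensemble member only sees a disjoint radial band of the
# input's frequency spectrum; the members' predicted probabilities are averaged.
import numpy as np

def band_filter(img, lo, hi):
    """Keep only frequency radii in [lo, hi) of a 2-D grayscale image."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    f[(r < lo) | (r >= hi)] = 0
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

def ensemble_predict(img, members, bands):
    """members: callables mapping an image to class probabilities; bands: (lo, hi) pairs."""
    probs = [m(band_filter(img, lo, hi)) for m, (lo, hi) in zip(members, bands)]
    return np.mean(probs, axis=0)
```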
The problem of detecting out-of-distribution (OOD) examples in neural networks has been widely studied in the literature, with state-of-the-art techniques being supervised in that they require fine-tuning on OOD data to achieve high-quality OOD detection.
We present DeClaW, a system for detecting, classifying, and warning of adversarial inputs presented to a classification neural network.
We first formally prove that adaptive codebooks can provide stronger robustness guarantees than fixed codebooks as a preprocessing defense on some datasets.
Moreover, we design the first diagnostic method to quantify the vulnerability contributed by each layer, which can be used to identify vulnerable parts of model architectures.
We then experimentally demonstrate that our attacks indeed do not significantly change perceptual salience of the background, but are highly effective against classifiers robust to conventional attacks.
The effectiveness of such attacks relies heavily on the availability of data necessary to query the target model.
We address three key requirements for practical attacks in the real world: 1) automatically constraining the size and shape of the attack so it can be applied with stickers, 2) transform-robustness, i.e., robustness of an attack to environmental physical variations such as viewpoint and lighting changes, and 3) supporting attacks not only in white-box but also in black-box hard-label scenarios, so that the adversary can attack proprietary models.
Focusing on the observation that discrete pixelization in MNIST makes the background completely black and foreground completely white, we hypothesize that the important property for increasing robustness is the elimination of image background using attention masks before classifying an object.
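The following is a minimal sketch of that hypothesis: a (here, hypothetical) attention or segmentation mask zeroes out background pixels before the image reaches the classifier.

```python
# Hedged sketch of background elimination: multiply the image by a foreground
# mask so background pixels become black, then classify the masked image.
import numpy as np

def mask_background(img, foreground_mask):
    """img: (H, W, C) array; foreground_mask: (H, W) boolean array."""
    return img * foreground_mask[..., None]          # background pixels become 0 (black)

def classify_masked(img, foreground_mask, classifier):
    return classifier(mask_background(img, foreground_mask))
```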
Numerous techniques have been proposed to harden machine learning algorithms and mitigate the effect of adversarial attacks.
Recently, interpretable models called self-explaining models (SEMs) have been proposed with the goal of providing interpretability robustness.
Existing deep neural networks, say for image classification, have been shown to be vulnerable to adversarial images that cause a DNN to misclassify without any perceptible change to the image.
We provide a methodology, resilient feature engineering, for creating adversarially resilient classifiers.
In this work, we extend physical attacks to more challenging object detection models, a broader class of deep learning algorithms widely used to detect and label multiple objects within a scene.
Recent studies show that the state-of-the-art deep neural networks (DNNs) are vulnerable to adversarial examples, resulting from small-magnitude perturbations added to the input.
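For illustration, the well-known fast gradient sign method (FGSM) is one standard way to craft such a small-magnitude perturbation from the loss gradient; it is shown here only as an example of the general phenomenon, not as the specific attack studied in any one of these works.

```python
# Hedged sketch of FGSM: perturb the input by eps in the sign of the loss
# gradient, then clip back to the valid pixel range.
import torch
import torch.nn.functional as F

def fgsm(model, x, label, eps):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    x_adv = x + eps * x.grad.sign()                  # step in the direction that increases the loss
    return x_adv.clamp(0, 1).detach()                # keep pixels in [0, 1]
```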
When using risk-based permissions, device operations are grouped into units of similar risk, and users grant apps access to devices at that risk-based granularity.
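A minimal sketch of risk-based granularity follows; the operation names and risk groups are hypothetical examples, not a proposed taxonomy.

```python
# Hedged sketch: device operations are grouped by risk level, and an app's
# grant at a level covers every operation in that group.
RISK_GROUPS = {
    "low":    {"read_temperature", "read_battery"},
    "medium": {"toggle_lights", "adjust_thermostat"},
    "high":   {"unlock_door", "disable_alarm"},
}

def is_allowed(app_grants, operation):
    """app_grants: set of risk levels granted to the app."""
    for level, ops in RISK_GROUPS.items():
        if operation in ops:
            return level in app_grants
    return False                                     # unknown operations are denied by default

assert is_allowed({"low", "medium"}, "toggle_lights")
assert not is_allowed({"low"}, "unlock_door")
```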
Given that state-of-the-art object detection algorithms are harder to fool with the same set of adversarial examples, here we show that these detectors can also be attacked by physical adversarial examples.
We propose a general attack algorithm, Robust Physical Perturbations (RP2), to generate robust visual adversarial perturbations under different physical conditions.
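The sentence above does not spell out the optimization; the sketch below is a generic, hedged illustration of optimizing a perturbation over sampled physical conditions (viewpoint, lighting) in the spirit of such attacks, with the transform set, mask, and hyperparameters as placeholder assumptions rather than RP2's actual procedure.

```python
# Hedged sketch: optimize a (optionally sticker-masked) perturbation so that the
# attack succeeds on average across simulated physical transformations.
import torch
import torch.nn.functional as F

def robust_perturbation(model, x, target, transforms, steps=100, lr=0.01, mask=None):
    """transforms: callables simulating physical variation; mask: optional sticker region."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        loss = 0.0
        for t in transforms:                         # average the loss over physical conditions
            pert = delta * mask if mask is not None else delta
            x_adv = (x + pert).clamp(0, 1)
            loss = loss + F.cross_entropy(model(t(x_adv)), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return delta.detach()
```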
The Internet of Things (IoT) is a new computing paradigm that spans wearable devices, homes, hospitals, cities, transportation, and critical infrastructure.