Search Results for author: Benjamin I. P. Rubinstein

Found 45 papers, 14 papers with code

CERT-ED: Certifiably Robust Text Classification for Edit Distance

no code implementations • 1 Aug 2024 • Zhuoqun Huang, Neil G. Marchant, Olga Ohrimenko, Benjamin I. P. Rubinstein

With the growing integration of AI in daily life, ensuring the robustness of systems to inference-time attacks is crucial.

Text Classification

Adaptive Data Analysis for Growing Data

no code implementations • 22 May 2024 • Neil G. Marchant, Benjamin I. P. Rubinstein

In a batched query setting, the asymptotic data requirements of our bound grow with the square root of the number of adaptive queries, matching prior works' improvement over data splitting for the static setting.

Generalization Bounds
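
Schematically, the square-root improvement can be contrasted with plain data splitting as follows (illustrative rates only, suppressing constants and logarithmic factors; α is the target accuracy and k the number of adaptive queries):

```latex
% Sample-size scaling for k adaptive queries at accuracy \alpha:
% data splitting spends fresh data per query, while mechanisms of the
% kind described above reuse data with square-root growth in k.
\[
  n_{\text{split}} = O\!\left(\frac{k}{\alpha^{2}}\right)
  \qquad \text{vs.} \qquad
  n_{\text{adaptive}} = \tilde{O}\!\left(\frac{\sqrt{k}}{\alpha^{2}}\right)
\]
```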

SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks

no code implementations • 19 May 2024 • Xuanli He, Qiongkai Xu, Jun Wang, Benjamin I. P. Rubinstein, Trevor Cohn

Modern NLP models are often trained on public datasets drawn from diverse sources, rendering them vulnerable to data poisoning attacks.

Data Poisoning

RS-Reg: Probabilistic and Robust Certified Regression Through Randomized Smoothing

1 code implementation • 14 May 2024 • Aref Miri Rekavandi, Olga Ohrimenko, Benjamin I. P. Rubinstein

Randomized smoothing has shown promising certified robustness against adversaries in classification tasks.

Regression
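
A minimal sketch of randomized smoothing applied to regression, the setting this paper certifies: predictions are aggregated over Gaussian perturbations of the input. All names and parameters below are illustrative, not the paper's implementation:

```python
import numpy as np

def smoothed_regression(base_model, x, sigma=0.25, n_samples=1000, seed=0):
    """Median-of-noisy-predictions smoothing for a regressor (sketch).

    Perturbs the input with isotropic Gaussian noise and aggregates the
    base model's outputs; the median is robust to outlying predictions.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    preds = np.array([base_model(x + eps) for eps in noise])
    return float(np.median(preds))

# Toy base regressor standing in for a trained model.
base_model = lambda x: float(np.sum(x ** 2))
print(smoothed_regression(base_model, np.array([0.3, -0.1, 0.7])))
```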

TuBA: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning

no code implementations • 30 Apr 2024 • Xuanli He, Jun Wang, Qiongkai Xu, Pasquale Minervini, Pontus Stenetorp, Benjamin I. P. Rubinstein, Trevor Cohn

The implications of backdoor attacks on English-centric large language models (LLMs) have been widely examined: such attacks can be achieved by embedding malicious behaviors during training, which are then activated under specific conditions that trigger malicious outputs.

Backdoor Attack on Multilingual Machine Translation

no code implementations • 3 Apr 2024 • Jun Wang, Qiongkai Xu, Xuanli He, Benjamin I. P. Rubinstein, Trevor Cohn

Our aim is to bring attention to these vulnerabilities within MNMT systems with the hope of encouraging the community to address security concerns in machine translation, especially in the context of low-resource languages.

Backdoor Attack • Machine Translation +1

It's Simplex! Disaggregating Measures to Improve Certified Robustness

no code implementations • 20 Sep 2023 • Andrew C. Cullen, Paul Montague, Shijie Liu, Sarah M. Erfani, Benjamin I. P. Rubinstein

Certified robustness circumvents the fragility of defences against adversarial attacks, by endowing model predictions with guarantees of class invariance for attacks up to a calculated size.
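
For context, certificates in this line of work typically derive from the canonical L2 radius of Gaussian randomised smoothing (Cohen et al., 2019), where pA and pB bound the smoothed probabilities of the predicted and runner-up classes:

```latex
% Certified L2 radius under Gaussian smoothing with noise level \sigma;
% \Phi^{-1} is the inverse standard normal CDF.
\[
  R \;=\; \frac{\sigma}{2}\left(\Phi^{-1}\!\left(\underline{p_A}\right)
          - \Phi^{-1}\!\left(\overline{p_B}\right)\right)
\]
```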

Et Tu Certifications: Robustness Certificates Yield Better Adversarial Examples

1 code implementation • 9 Feb 2023 • Andrew C. Cullen, Shijie Liu, Paul Montague, Sarah M. Erfani, Benjamin I. P. Rubinstein

In guaranteeing the absence of adversarial examples in an instance's neighbourhood, certification mechanisms play an important role in demonstrating neural net robustness.

RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers via Randomized Deletion

1 code implementation • NeurIPS 2023 • Zhuoqun Huang, Neil G. Marchant, Keane Lucas, Lujo Bauer, Olga Ohrimenko, Benjamin I. P. Rubinstein

When applied to the popular MalConv malware detection model, our smoothing mechanism RS-Del achieves a certified accuracy of 91% at an edit distance radius of 128 bytes.

Binary Classification • Malware Detection
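
A toy sketch of the deletion-smoothing idea: classify many randomly deleted copies of the input and take a majority vote. The certification analysis is the substantive contribution of the paper; everything below is illustrative:

```python
import numpy as np
from collections import Counter

def deletion_smoothed_predict(classify, seq, p_del=0.3, n_samples=500, seed=0):
    """Majority vote over randomly deleted copies of a byte sequence (sketch).

    Each sample independently drops bytes with probability p_del; the
    smoothed prediction is the most frequent base-classifier label.
    """
    rng = np.random.default_rng(seed)
    votes = Counter()
    for _ in range(n_samples):
        keep = rng.random(len(seq)) >= p_del
        votes[classify(bytes(b for b, k in zip(seq, keep) if k))] += 1
    return votes.most_common(1)[0][0]

# Toy base classifier: flags sequences containing a "suspicious" marker byte.
classify = lambda s: int(b"\x90" in s)
print(deletion_smoothed_predict(classify, b"\x00\x90\x41\x42" * 8))
```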

Double Bubble, Toil and Trouble: Enhancing Certified Robustness through Transitivity

1 code implementation • 12 Oct 2022 • Andrew C. Cullen, Paul Montague, Shijie Liu, Sarah M. Erfani, Benjamin I. P. Rubinstein

In response to subtle adversarial examples flipping classifications of neural network models, recent research has promoted certified robustness as a solution.

Open-Ended Question Answering

Unlabelled Sample Compression Schemes for Intersection-Closed Classes and Extremal Classes

no code implementations • 11 Oct 2022 • J. Hyam Rubinstein, Benjamin I. P. Rubinstein

We also prove that all intersection-closed classes with VC dimension $d$ admit unlabelled compression schemes of size at most $11d$.

Learning Theory • LEMMA

Testing the Robustness of Learned Index Structures

1 code implementation • 23 Jul 2022 • Matthias Bachfischer, Renata Borovica-Gajic, Benjamin I. P. Rubinstein

To simulate adversarial workloads, we carry out a data poisoning attack on linear regression models that manipulates the cumulative distribution function (CDF) on which the learned index model is trained.

Data Poisoning • Regression
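
To make the attack surface concrete: a learned index in its simplest form is a regression from keys to positions in the sorted array, i.e. a scaled empirical CDF, so poisoning keys shift that CDF and inflate the search window. A toy illustration with hypothetical data and a single linear segment:

```python
import numpy as np

def fit_linear_index(keys):
    """Fit position ~ a*key + b over sorted keys (one-segment learned index)."""
    keys = np.sort(keys)
    a, b = np.polyfit(keys, np.arange(len(keys)), deg=1)
    return keys, a, b

def max_lookup_error(keys, a, b):
    """Max |predicted - true| position: the window a lookup must scan."""
    return np.max(np.abs(a * keys + b - np.arange(len(keys))))

clean = np.random.default_rng(0).uniform(0, 1000, size=1000)
poisoned = np.concatenate([clean, np.full(50, 999.9)])  # adversarial cluster
for name, data in [("clean", clean), ("poisoned", poisoned)]:
    keys, a, b = fit_linear_index(data)
    print(name, round(float(max_lookup_error(keys, a, b)), 1))
```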

State Selection Algorithms and Their Impact on The Performance of Stateful Network Protocol Fuzzing

no code implementations • 24 Dec 2021 • Dongge Liu, Van-Thuan Pham, Gidon Ernst, Toby Murray, Benjamin I. P. Rubinstein

In this work, we evaluate an extensive set of state selection algorithms on the same fuzzing platform, AFLNet, a state-of-the-art fuzzer for network servers.

Improving Robustness with Optimal Transport based Adversarial Generalization

no code implementations • 29 Sep 2021 • Siqi Xia, Shijie Liu, Trung Le, Dinh Phung, Sarah Erfani, Benjamin I. P. Rubinstein, Christopher Leckie, Paul Montague

More specifically, by minimizing the Wasserstein (WS) distance of interest, an adversarial example is pushed toward the cluster of benign examples sharing the same label in the latent space, which helps strengthen the classifier's generalization to adversarial examples.

Local Intrinsic Dimensionality Signals Adversarial Perturbations

1 code implementation • 24 Sep 2021 • Sandamal Weerasinghe, Tansu Alpcan, Sarah M. Erfani, Christopher Leckie, Benjamin I. P. Rubinstein

In this paper, we derive a lower bound and an upper bound for the LID value of a perturbed data point and demonstrate that the bounds, in particular the lower bound, have a positive correlation with the magnitude of the perturbation.

BIG-bench Machine Learning
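
For reference, the maximum-likelihood LID estimator standard in this literature (a Hill-type estimator over nearest-neighbour distances); the data below are synthetic:

```python
import numpy as np

def lid_mle(x, reference, k=20):
    """MLE estimate of local intrinsic dimensionality at point x:
    -((1/k) * sum_i log(r_i / r_k))^(-1), with r_1 <= ... <= r_k the
    distances from x to its k nearest neighbours in `reference`."""
    dists = np.sort(np.linalg.norm(reference - x, axis=1))[:k]
    return -1.0 / np.mean(np.log(dists / dists[-1]))

rng = np.random.default_rng(0)
data = rng.normal(size=(5000, 10))         # samples from a 10-dim Gaussian
print(lid_mle(rng.normal(size=10), data))  # roughly recovers the dimension
```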

Hard to Forget: Poisoning Attacks on Certified Machine Unlearning

1 code implementation • 17 Sep 2021 • Neil G. Marchant, Benjamin I. P. Rubinstein, Scott Alfeld

The right to erasure requires removal of a user's information from data held by organizations, with rigorous interpretations extending to downstream products such as learned models.

Machine Unlearning

No DBA? No regret! Multi-armed bandits for index tuning of analytical and HTAP workloads with provable guarantees

no code implementations • 23 Aug 2021 • R. Malinga Perera, Bastian Oetomo, Benjamin I. P. Rubinstein, Renata Borovica-Gajic

Our comprehensive empirical evaluation against a state-of-the-art commercial tuning tool demonstrates up to 75% speed-up on shifting and ad-hoc workloads and up to 28% speed-up on static workloads in analytical processing environments.

Decision Making • Decision Making Under Uncertainty +4
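
The bandit machinery can be sketched with a plain linear-UCB arm-selection step; these papers use the richer C$^2$UCB combinatorial variant, and all names and parameters below are illustrative:

```python
import numpy as np

class LinUCB:
    """Minimal linear UCB: score each arm by estimated reward plus a bonus."""

    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)      # regularised Gram matrix
        self.b = np.zeros(dim)    # reward-weighted feature sum
        self.alpha = alpha

    def select(self, arms):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        scores = [x @ theta + self.alpha * np.sqrt(x @ A_inv @ x) for x in arms]
        return int(np.argmax(scores))

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

# Toy run: three candidate index configurations as feature vectors.
arms = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.7, 0.7])]
bandit = LinUCB(dim=2)
for _ in range(100):
    i = bandit.select(arms)
    bandit.update(arms[i], reward=float(arms[i] @ np.array([0.2, 0.8])))
print("preferred arm:", bandit.select(arms))
```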

Putting words into the system's mouth: A targeted attack on neural machine translation using monolingual data poisoning

1 code implementation • 12 Jul 2021 • Jun Wang, Chang Xu, Francisco Guzman, Ahmed El-Kishky, Yuqing Tang, Benjamin I. P. Rubinstein, Trevor Cohn

Neural machine translation systems are known to be vulnerable to adversarial test inputs; however, as we show in this paper, these systems are also vulnerable to training attacks.

Data Poisoning • Machine Translation +3

TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness

no code implementations • NeurIPS 2021 • Zhuolin Yang, Linyi Li, Xiaojun Xu, Shiliang Zuo, Qian Chen, Pan Zhou, Benjamin I. P. Rubinstein, Ce Zhang, Bo Li

To answer these questions, in this work we first theoretically analyze and outline sufficient conditions for adversarial transferability between models; then propose a practical algorithm to reduce the transferability between base models within an ensemble to improve its robustness.

Diversity

A Targeted Attack on Black-Box Neural Machine Translation with Parallel Data Poisoning

no code implementations • 2 Nov 2020 • Chang Xu, Jun Wang, Yuqing Tang, Francisco Guzman, Benjamin I. P. Rubinstein, Trevor Cohn

In this paper, we show that targeted attacks on black-box NMT systems are feasible, based on poisoning a small fraction of their parallel training data.

Data Poisoning • Machine Translation +2

DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees

no code implementations • 19 Oct 2020 • R. Malinga Perera, Bastian Oetomo, Benjamin I. P. Rubinstein, Renata Borovica-Gajic

Automating physical database design has remained a long-term interest in database research due to substantial performance gains afforded by optimised structures.

Attribute • Decision Making +3

A Graph Symmetrisation Bound on Channel Information Leakage under Blowfish Privacy

no code implementations • 12 Jul 2020 • Tobias Edwards, Benjamin I. P. Rubinstein, Zuhe Zhang, Sanming Zhou

Blowfish privacy is a recent generalisation of differential privacy that enables improved utility while maintaining privacy policies with semantic guarantees, a factor that has driven the popularity of differential privacy in computer science.

Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors

1 code implementation • 27 Jun 2020 • Ruihan Zhang, Prashan Madumal, Tim Miller, Krista A. Ehinger, Benjamin I. P. Rubinstein

Based on the requirements of fidelity (approximate models to target models) and interpretability (being meaningful to people), we design measurements and evaluate a range of matrix factorization methods with our framework.

Clustering • Dimensionality Reduction +1

Discrete Few-Shot Learning for Pan Privacy

no code implementations • 23 Jun 2020 • Roei Gelbhart, Benjamin I. P. Rubinstein

In this paper we present the first baseline results for the task of few-shot learning of discrete embedding vectors for image recognition.

Few-Shot Learning

Needle in a Haystack: Label-Efficient Evaluation under Extreme Class Imbalance

2 code implementations • 12 Jun 2020 • Neil G. Marchant, Benjamin I. P. Rubinstein

Important tasks like record linkage and extreme classification exhibit extreme class imbalance, with 1 minority instance to every 1 million or more majority instances.
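
The label-efficiency idea can be illustrated with plain importance sampling: draw evaluation items from a proposal biased toward the rare class, then reweight to stay unbiased. The paper develops an adaptive scheme with stronger guarantees; this is only a sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
scores = rng.random(N)                 # classifier scores over a large pool
labels = rng.random(N) < scores ** 8   # rare positives, correlated w/ score

# Proposal biased toward high scores, so the scarce positives get sampled.
q = scores / scores.sum()
idx = rng.choice(N, size=2000, replace=True, p=q)
weights = (1.0 / N) / q[idx]           # importance weights p(uniform)/q

est = np.mean(weights * labels[idx])   # unbiased prevalence estimate
print(est, labels.mean())              # close, with only 2000 labels drawn
```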

Assessing Centrality Without Knowing Connections

no code implementations • 28 May 2020 • Leyla Roohi, Benjamin I. P. Rubinstein, Vanessa Teague

We consider the privacy-preserving computation of node influence in distributed social networks, as measured by egocentric betweenness centrality (EBC).

Privacy Preserving
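
Concretely, within the ego network of v (v, its neighbours, and the edges among them), each pair of non-adjacent neighbours contributes the reciprocal of its number of common neighbours in that subgraph. A toy sketch assuming this standard definition; the paper's contribution is computing the quantity privately across parties:

```python
import itertools

def egocentric_betweenness(ego, adj):
    """EBC of `ego` given adjacency sets `adj` (dict: node -> set of nodes)."""
    nbrs = adj[ego]
    ebc = 0.0
    for i, j in itertools.combinations(nbrs, 2):
        if j in adj[i]:
            continue  # directly connected pairs contribute nothing
        # common neighbours of i and j inside the ego network (ego included)
        common = {ego} | (adj[i] & adj[j] & nbrs)
        ebc += 1.0 / len(common)
    return ebc

adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
print(egocentric_betweenness(1, adj))  # pairs (2,4), (3,4) bridge via ego only
```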

Legion: Best-First Concolic Testing

no code implementations • 15 Feb 2020 • Dongge Liu, Gidon Ernst, Toby Murray, Benjamin I. P. Rubinstein

Legion incorporates a form of directed fuzzing that we call approximate path-preserving fuzzing (APPFuzzing) to investigate program states selected by MCTS.

Decision Making • Decision Making Under Uncertainty +1
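
The MCTS selection step at the heart of Legion can be sketched with the standard UCB1 rule; the fuzzer's actual scoring and the APPFuzzing loop are more elaborate, and all names here are illustrative:

```python
import math

def ucb1_select(children, exploration=math.sqrt(2)):
    """Pick the child state maximising estimated reward plus exploration bonus.

    Each child is a (visits, total_reward) pair; unvisited states win first.
    """
    total_visits = sum(v for v, _ in children)

    def score(child):
        visits, reward = child
        if visits == 0:
            return float("inf")
        return reward / visits + exploration * math.sqrt(
            math.log(total_visits) / visits)

    return max(range(len(children)), key=lambda i: score(children[i]))

# Three program states: (times selected, accumulated coverage reward).
print(ucb1_select([(10, 4.0), (3, 2.0), (0, 0.0)]))  # -> 2, the unvisited one
```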

d-blink: Distributed End-to-End Bayesian Entity Resolution

4 code implementations • 13 Sep 2019 • Neil G. Marchant, Andee Kaplan, Daniel N. Elazar, Benjamin I. P. Rubinstein, Rebecca C. Steorts

Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers.

Blocking

Adversarial Reinforcement Learning under Partial Observability in Autonomous Computer Network Defence

no code implementations • 25 Feb 2019 • Yi Han, David Hubczenko, Paul Montague, Olivier De Vel, Tamas Abraham, Benjamin I. P. Rubinstein, Christopher Leckie, Tansu Alpcan, Sarah Erfani

Recent studies have demonstrated that reinforcement learning (RL) agents are susceptible to adversarial manipulation, similar to vulnerabilities previously demonstrated in the supervised learning setting.

Reinforcement Learning +1

Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations

no code implementations • 24 Feb 2019 • Yuan Li, Benjamin I. P. Rubinstein, Trevor Cohn

As we show, datasets produced by crowd-sourcing are often not of this type: the data is highly redundantly annotated ($\ge 5$ annotations per instance), and the vast majority of workers produce high quality outputs.

A Note on Bounding Regret of the C$^2$UCB Contextual Combinatorial Bandit

no code implementations • 20 Feb 2019 • Bastian Oetomo, Malinga Perera, Renata Borovica-Gajic, Benjamin I. P. Rubinstein

We revisit the proof by Qin et al. (2014) of bounded regret of the C$^2$UCB contextual combinatorial bandit.

Differentially-Private Two-Party Egocentric Betweenness Centrality

2 code implementations • 16 Jan 2019 • Leyla Roohi, Benjamin I. P. Rubinstein, Vanessa Teague

We describe a novel protocol for computing the egocentric betweenness centrality of a node when relevant edge information is spread between two mutually distrusting parties such as two telecommunications providers.

Reinforcement Learning for Autonomous Defence in Software-Defined Networking

no code implementations • 17 Aug 2018 • Yi Han, Benjamin I. P. Rubinstein, Tamas Abraham, Tansu Alpcan, Olivier De Vel, Sarah Erfani, David Hubczenko, Christopher Leckie, Paul Montague

Despite the successful application of machine learning (ML) in a wide range of domains, adaptability---the very property that makes machine learning desirable---can be exploited by adversaries to contaminate training and evade classification.

BIG-bench Machine Learning • General Classification +3

Sampling Without Compromising Accuracy in Adaptive Data Analysis

no code implementations • 28 Sep 2017 • Benjamin Fish, Lev Reyzin, Benjamin I. P. Rubinstein

In this work, we study how to use sampling to speed up mechanisms for answering adaptive queries into datasets without reducing the accuracy of those mechanisms.

Pain-Free Random Differential Privacy with Sensitivity Sampling

no code implementations • ICML 2017 • Benjamin I. P. Rubinstein, Francesco Aldà

Popular approaches to differential privacy, such as the Laplace and exponential mechanisms, calibrate randomised smoothing through global sensitivity of the target non-private function.
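
For reference, the global-sensitivity Laplace mechanism the abstract alludes to fits in a few lines; the paper's contribution is estimating sensitivity by sampling when it cannot be derived analytically. A toy counting query:

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, seed=None):
    """Release value + Laplace(sensitivity/epsilon) noise: epsilon-DP when
    `sensitivity` upper-bounds the query's global sensitivity."""
    rng = np.random.default_rng(seed)
    return value + rng.laplace(scale=sensitivity / epsilon)

# Adding or removing one record changes a count by at most 1.
data = np.array([1, 0, 1, 1, 0, 1])
print(laplace_mechanism(data.sum(), sensitivity=1.0, epsilon=0.5))
```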

Adequacy of the Gradient-Descent Method for Classifier Evasion Attacks

no code implementations • 6 Apr 2017 • Yi Han, Benjamin I. P. Rubinstein

Despite the wide use of machine learning in adversarial settings including computer security, recent studies have demonstrated vulnerabilities to evasion attacks---carefully crafted adversarial samples that closely resemble legitimate instances, but cause misclassification.

Computer Security

Large-Scale Strategic Games and Adversarial Machine Learning

no code implementations • 21 Sep 2016 • Tansu Alpcan, Benjamin I. P. Rubinstein, Christopher Leckie

Such high-dimensional decision spaces and big data sets lead to computational challenges in scaling non-linear optimization up to large systems of variables.

BIG-bench Machine Learning • Decision Making

TopicResponse: A Marriage of Topic Modelling and Rasch Modelling for Automatic Measurement in MOOCs

no code implementations • 29 Jul 2016 • Jiazhen He, Benjamin I. P. Rubinstein, James Bailey, Rui Zhang, Sandra Milligan

This paper explores the suitability of using automatically discovered topics from MOOC discussion forums for modelling students' academic abilities.

MOOCs Meet Measurement Theory: A Topic-Modelling Approach

no code implementations • 25 Nov 2015 • Jiazhen He, Benjamin I. P. Rubinstein, James Bailey, Rui Zhang, Sandra Milligan, Jeffrey Chan

Such models infer latent skill levels by relating them to individuals' observed responses on a series of items such as quiz questions.

Topic Models

Principled Graph Matching Algorithms for Integrating Multiple Data Sources

no code implementations • 3 Feb 2014 • Duo Zhang, Benjamin I. P. Rubinstein, Jim Gemmell

In the most common case of matching two sources, it is often desirable for the final matching to be one-to-one (a record may be matched with at most one other); members of the database and statistical record linkage communities accomplish such matchings in the final stage by weighted bipartite graph matching on similarity scores.

Combinatorial Optimization • Entity Resolution +1
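
The final one-to-one stage described above is maximum-weight bipartite matching, which in practice is a single solver call; the similarity matrix below is a toy stand-in for the scores the paper models:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# similarity[i, j]: score for matching record i of source A to record j of B.
similarity = np.array([[0.9, 0.1, 0.3],
                       [0.2, 0.8, 0.4],
                       [0.3, 0.2, 0.7]])

rows, cols = linear_sum_assignment(similarity, maximize=True)
for i, j in zip(rows, cols):
    print(f"A{i} <-> B{j} (score {similarity[i, j]:.1f})")
```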

Security Evaluation of Support Vector Machines in Adversarial Environments

no code implementations • 30 Jan 2014 • Battista Biggio, Igino Corona, Blaine Nelson, Benjamin I. P. Rubinstein, Davide Maiorca, Giorgio Fumera, Giorgio Giacinto, Fabio Roli

Support Vector Machines (SVMs) are among the most popular classification techniques adopted in security applications like malware detection, intrusion detection, and spam filtering.

Intrusion Detection • Malware Detection

Bounding Embeddings of VC Classes into Maximum Classes

no code implementations • 29 Jan 2014 • J. Hyam Rubinstein, Benjamin I. P. Rubinstein, Peter L. Bartlett

The most promising approach to positively resolving the conjecture is by embedding general VC classes into maximum classes without super-linear increase to their VC dimensions, as such embeddings would extend the known compression schemes to all VC classes.

Learning Theory
