Search Results for author: Xiaojin Zhu

Found 53 papers, 10 papers with code

Optimal Attack and Defense for Reinforcement Learning

no code implementations 30 Nov 2023 Jeremy McMahan, Young Wu, Xiaojin Zhu, Qiaomin Xie

Although the defense problem is NP-hard, we show that optimal Markovian defenses can be computed (learned) in polynomial time (sample complexity) in many scenarios.

Reinforcement Learning (RL)

Optimally Teaching a Linear Behavior Cloning Agent

no code implementations 26 Nov 2023 Shubham Kumar Bharti, Stephen Wright, Adish Singla, Xiaojin Zhu

The goal of the teacher is to teach a realizable target policy to the learner using a minimum number of state demonstrations.

Anytime-Constrained Reinforcement Learning

1 code implementation 9 Nov 2023 Jeremy McMahan, Xiaojin Zhu

Our reduction yields planning and learning algorithms that are time and sample-efficient for tabular cMDPs so long as the precision of the costs is logarithmic in the size of the cMDP.

reinforcement-learning

Minimally Modifying a Markov Game to Achieve Any Nash Equilibrium and Value

no code implementations 1 Nov 2023 Young Wu, Jeremy McMahan, Yiding Chen, Yudong Chen, Xiaojin Zhu, Qiaomin Xie

We study the game modification problem, where a benevolent game designer or a malevolent adversary modifies the reward function of a zero-sum Markov game so that a target deterministic or stochastic policy profile becomes the unique Markov perfect Nash equilibrium and has a value within a target range, in a way that minimizes the modification cost.

Dream the Impossible: Outlier Imagination with Diffusion Models

1 code implementation NeurIPS 2023 Xuefeng Du, Yiyou Sun, Xiaojin Zhu, Yixuan Li

Utilizing auxiliary outlier datasets to regularize the machine learning model has demonstrated promise for out-of-distribution (OOD) detection and safe prediction.

Out of Distribution (OOD) Detection

VISER: A Tractable Solution Concept for Games with Information Asymmetry

1 code implementation 18 Jul 2023 Jeremy McMahan, Young Wu, Yudong Chen, Xiaojin Zhu, Qiaomin Xie

Many real-world games suffer from information asymmetry: one player is only aware of their own payoffs while the other player has the full game information.

Multi-agent Reinforcement Learning

On Faking a Nash Equilibrium

no code implementations 13 Jun 2023 Young Wu, Jeremy McMahan, Xiaojin Zhu, Qiaomin Xie

We characterize offline data poisoning attacks on Multi-Agent Reinforcement Learning (MARL), where an attacker may change a data set in an attempt to install a (potentially fictitious) unique Markov-perfect Nash equilibrium.

Data Poisoning Multi-agent Reinforcement Learning +1

Non-Parametric Outlier Synthesis

1 code implementation 6 Mar 2023 Leitian Tao, Xuefeng Du, Xiaojin Zhu, Yixuan Li

Importantly, our proposed synthesis approach does not make any distributional assumption on the ID embeddings, thereby offering strong flexibility and generality.

Out-of-Distribution Detection

Provable Defense against Backdoor Policies in Reinforcement Learning

1 code implementation 18 Nov 2022 Shubham Kumar Bharti, Xuezhou Zhang, Adish Singla, Xiaojin Zhu

Instead, our defense mechanism sanitizes the backdoor policy by projecting observed states to a 'safe subspace', estimated from a small number of interactions with a clean (non-triggered) environment.

Reinforcement Learning (RL)
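As a rough illustration of the safe-subspace idea described in the abstract above, the sketch below estimates the subspace from clean states via SVD and projects a triggered observation onto it. The rank hyperparameter, the synthetic trigger, and the Gaussian setup are illustrative assumptions, not the paper's estimator:

```python
import numpy as np

def fit_safe_projection(clean_states, rank):
    """Estimate a 'safe subspace' from clean (non-triggered) states and
    return the orthogonal projector onto it."""
    # The top right-singular vectors of the clean-state matrix span the
    # subspace where legitimate observations (approximately) live.
    _, _, vt = np.linalg.svd(clean_states, full_matrices=False)
    basis = vt[:rank]                  # (rank, d) orthonormal rows
    return basis.T @ basis             # (d, d) orthogonal projector

rng = np.random.default_rng(0)
d, rank = 6, 2
subspace = np.linalg.qr(rng.normal(size=(d, rank)))[0]   # true safe directions
clean = rng.normal(size=(500, rank)) @ subspace.T        # clean states lie in the subspace
proj = fit_safe_projection(clean, rank)

trigger = np.zeros(d)
trigger[-1] = 5.0                                        # hypothetical backdoor trigger
state = clean[0] + trigger                               # triggered observation
sanitized = proj @ state                                 # project back to the safe subspace
# Sanitizing removes most of the trigger while preserving the clean component.
assert np.linalg.norm(sanitized - clean[0]) < np.linalg.norm(state - clean[0])
```

Routing every observed state through `proj` before the policy sees it strips the components outside the estimated subspace, which is the intuition for why a trigger loses most of its effect.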

Byzantine-Robust Online and Offline Distributed Reinforcement Learning

no code implementations 1 Jun 2022 Yiding Chen, Xuezhou Zhang, Kaiqing Zhang, Mengdi Wang, Xiaojin Zhu

We consider a distributed reinforcement learning setting where multiple agents separately explore the environment and communicate their experiences through a central server.

Reinforcement Learning (RL)

Out-of-Distribution Detection with Deep Nearest Neighbors

2 code implementations 13 Apr 2022 Yiyou Sun, Yifei Ming, Xiaojin Zhu, Yixuan Li

In this paper, we explore the efficacy of non-parametric nearest-neighbor distance for OOD detection, which has been largely overlooked in the literature.

Out-of-Distribution Detection
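The non-parametric score in question can be sketched in a few lines: use the distance from a test embedding to its k-th nearest training embedding as the OOD score. This is a minimal NumPy version on toy Gaussian features, not the paper's full pipeline; the function name and setup are illustrative:

```python
import numpy as np

def knn_ood_scores(train_feats, test_feats, k=5):
    """Score each test embedding by its distance to the k-th nearest
    training embedding; larger scores suggest out-of-distribution inputs."""
    # Pairwise Euclidean distances between test and training embeddings.
    dists = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=2)
    # Sort ascending and keep the k-th smallest distance per test point.
    return np.sort(dists, axis=1)[:, k - 1]

rng = np.random.default_rng(0)
id_train = rng.normal(0.0, 1.0, size=(200, 8))  # in-distribution (ID) features
id_test = rng.normal(0.0, 1.0, size=(10, 8))
ood_test = rng.normal(6.0, 1.0, size=(10, 8))   # shifted cluster standing in for outliers

# Outliers sit far from the ID embedding cloud, so their kNN distances are larger.
assert knn_ood_scores(id_train, ood_test).min() > knn_ood_scores(id_train, id_test).max()
```

In practice one thresholds the score to declare OOD; being distance-based, it makes no distributional assumption on the ID embeddings.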

Game Redesign in No-regret Game Playing

no code implementations 18 Oct 2021 Yuzhe Ma, Young Wu, Xiaojin Zhu

We study the game redesign problem in which an external designer has the ability to change the payoff function in each round, but incurs a design cost for deviating from the original game.

Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments

no code implementations 16 Feb 2021 Amin Rakhsha, Xuezhou Zhang, Xiaojin Zhu, Adish Singla

We study black-box reward poisoning attacks against reinforcement learning (RL), in which an adversary aims to manipulate the rewards to mislead a sequence of RL agents with unknown algorithms to learn a nefarious policy in an environment unknown to the adversary a priori.

Reinforcement Learning (RL)

Robust Policy Gradient against Strong Data Corruption

1 code implementation 11 Feb 2021 Xuezhou Zhang, Yiding Chen, Xiaojin Zhu, Wen Sun

Our first result shows that no algorithm can find a better than $O(\epsilon)$-optimal policy under our attack model.

Continuous Control

Sequential Attacks on Kalman Filter-based Forward Collision Warning Systems

no code implementations 16 Dec 2020 Yuzhe Ma, Jon Sharp, Ruizhe Wang, Earlence Fernandes, Xiaojin Zhu

In this paper, we study adversarial attacks on KF as part of the more complex machine-human hybrid system of Forward Collision Warning.

Autonomous Vehicles Model Predictive Control

Policy Teaching in Reinforcement Learning via Environment Poisoning Attacks

no code implementations 21 Nov 2020 Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla

We provide lower/upper bounds on the attack cost, and instantiate our attacks in two settings: (i) an offline setting where the agent is doing planning in the poisoned environment, and (ii) an online setting where the agent is learning a policy with poisoned feedback.

Reinforcement Learning (RL)

Preference-Based Batch and Sequential Teaching

no code implementations 17 Oct 2020 Farnam Mansouri, Yuxin Chen, Ara Vartanian, Xiaojin Zhu, Adish Singla

We analyze several properties of the teaching complexity parameter $TD(\sigma)$ associated with different families of the preference functions, e.g., comparison to the VC dimension of the hypothesis class and additivity/sub-additivity of $TD(\sigma)$ over disjoint domains.

Using Machine Teaching to Investigate Human Assumptions when Teaching Reinforcement Learners

no code implementations 5 Sep 2020 Yun-Shiuan Chuang, Xuezhou Zhang, Yuzhe Ma, Mark K. Ho, Joseph L. Austerweil, Xiaojin Zhu

To solve the machine teaching optimization problem, we use a deep learning approximation method which simulates learners in the environment and learns to predict how feedback affects the learner's internal states.

Q-Learning

Learning to Read through Machine Teaching

no code implementations 30 Jun 2020 Ayon Sen, Christopher R. Cox, Matthew Cooper Borkenhagen, Mark S. Seidenberg, Xiaojin Zhu

This is a hard combinatorial optimization problem even for a modest number of learning trials (e.g., 10K).

Combinatorial Optimization

Provable Training Set Debugging for Linear Regression

no code implementations 16 Jun 2020 Xiaomin Zhang, Xiaojin Zhu, Po-Ling Loh

We first formulate a general statistical algorithm for identifying buggy points and provide rigorous theoretical guarantees under the assumption that the data follow a linear model.

BIG-bench Machine Learning regression

The Sample Complexity of Teaching-by-Reinforcement on Q-Learning

no code implementations 16 Jun 2020 Xuezhou Zhang, Shubham Kumar Bharti, Yuzhe Ma, Adish Singla, Xiaojin Zhu

Our TDim results provide the minimum number of samples needed for reinforcement learning, and we discuss their connections to standard PAC-style RL sample complexity and teaching-by-demonstration sample complexity results.

Q-Learning reinforcement-learning +1

Online Data Poisoning Attacks

no code implementations L4DC 2020 Xuezhou Zhang, Xiaojin Zhu, Laurent Lessard

We study data poisoning attacks in the online learning setting, where training data arrive sequentially, and the attacker is eavesdropping on the data stream and has the ability to contaminate the current data point to affect the online learning process.

Data Poisoning Model Predictive Control +2

Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning

1 code implementation ICML 2020 Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla

We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker.

Reinforcement Learning (RL)

Adaptive Reward-Poisoning Attacks against Reinforcement Learning

no code implementations ICML 2020 Xuezhou Zhang, Yuzhe Ma, Adish Singla, Xiaojin Zhu

In reward-poisoning attacks against reinforcement learning (RL), an attacker can perturb the environment reward $r_t$ into $r_t+\delta_t$ at each step, with the goal of forcing the RL agent to learn a nefarious policy.

Reinforcement Learning (RL)
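The perturbation model $r_t \to r_t + \delta_t$ is easy to illustrate on a toy single-state MDP with tabular Q-learning; the constant-magnitude $\delta_t$ and bandit-style setup below are illustrative choices, not the paper's adaptive attack:

```python
import numpy as np

def poisoned_q_learning(true_rewards, target_action, delta=2.0,
                        episodes=500, lr=0.1, eps=0.1, seed=0):
    """Tabular Q-learning on a single-state MDP whose rewards an attacker
    perturbs by +/- delta to steer the learner toward `target_action`.
    A toy sketch of the r_t -> r_t + delta_t attack model."""
    rng = np.random.default_rng(seed)
    q = np.zeros(len(true_rewards))
    for _ in range(episodes):
        # Epsilon-greedy action selection by the (unsuspecting) learner.
        a = int(rng.integers(len(q))) if rng.random() < eps else int(np.argmax(q))
        r = true_rewards[a] + rng.normal(0, 0.1)           # environment reward r_t
        delta_t = delta if a == target_action else -delta  # attacker's perturbation
        q[a] += lr * (r + delta_t - q[a])                  # learner sees r_t + delta_t
    return int(np.argmax(q))

# Action 0 is truly best, but the attacker forces the agent to prefer action 2.
assert poisoned_q_learning(true_rewards=[1.0, 0.5, 0.0], target_action=2) == 2
```

Since the learner only ever observes the poisoned signal, its greedy policy converges to the attacker's target despite the true reward ordering.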

Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models

no code implementations NeurIPS 2019 Farnam Mansouri, Yuxin Chen, Ara Vartanian, Xiaojin Zhu, Adish Singla

In our framework, each function $\sigma \in \Sigma$ induces a teacher-learner pair with teaching complexity $TD(\sigma)$.

Error Lower Bounds of Constant Step-size Stochastic Gradient Descent

no code implementations 18 Oct 2019 Zhiyan Ding, Yiding Chen, Qin Li, Xiaojin Zhu

To our knowledge, this is the first analysis for SGD error lower bound without the strong convexity assumption.

BIG-bench Machine Learning

Policy Poisoning in Batch Reinforcement Learning and Control

1 code implementation NeurIPS 2019 Yuzhe Ma, Xuezhou Zhang, Wen Sun, Xiaojin Zhu

We study a security threat to batch reinforcement learning and control where the attacker aims to poison the learned policy.

Reinforcement Learning (RL)

Can We Distinguish Machine Learning from Human Learning?

no code implementations 8 Oct 2019 Vicki Bier, Paul B. Kantor, Gary Lupyan, Xiaojin Zhu

Specifically, we suggest a large and extensible class of learning tasks, formulated as learning under rules.

BIG-bench Machine Learning

Should Adversarial Attacks Use Pixel p-Norm?

no code implementations 6 Jun 2019 Ayon Sen, Xiaojin Zhu, Liam Marshall, Robert Nowak

Adversarial attacks aim to confound machine learning systems, while remaining virtually imperceptible to humans.

Adversarial Attack BIG-bench Machine Learning +2

Fooling Computer Vision into Inferring the Wrong Body Mass Index

no code implementations 16 May 2019 Owen Levin, Zihang Meng, Vikas Singh, Xiaojin Zhu

Recently it has been shown that neural networks can use images of human faces to accurately predict Body Mass Index (BMI), a widely used health indicator.

General Classification regression

Data Poisoning against Differentially-Private Learners: Attacks and Defenses

no code implementations 23 Mar 2019 Yuzhe Ma, Xiaojin Zhu, Justin Hsu

Data poisoning attacks aim to manipulate the model produced by a learning algorithm by adversarially modifying the training set.

Data Poisoning

Online Data Poisoning Attack

no code implementations 5 Mar 2019 Xuezhou Zhang, Xiaojin Zhu, Laurent Lessard

We study data poisoning attacks in the online setting where training items arrive sequentially, and the attacker may perturb the current item to manipulate online learning.

Data Poisoning Model Predictive Control +2

Optimal Attack against Autoregressive Models by Manipulating the Environment

no code implementations 1 Feb 2019 Yiding Chen, Xiaojin Zhu

In the white-box setting where the attacker knows the environment and forecast models, we present the optimal attack using LQR for linear models, and Model Predictive Control (MPC) for nonlinear models.

Adversarial Attack Model Predictive Control +2

An Optimal Control View of Adversarial Machine Learning

no code implementations 11 Nov 2018 Xiaojin Zhu

I describe an optimal control view of adversarial machine learning, where the dynamical system is the machine learner, the inputs are adversarial actions, and the control costs are defined by the adversary's goals to do harm and be hard to detect.

BIG-bench Machine Learning Data Poisoning +2

Adversarial Attacks on Stochastic Bandits

no code implementations NeurIPS 2018 Kwang-Sung Jun, Lihong Li, Yuzhe Ma, Xiaojin Zhu

We study adversarial attacks that manipulate the reward signals to control the actions chosen by a stochastic multi-armed bandit algorithm.
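A minimal sketch of the setting: the attacker corrupts rewards observed by a UCB1 learner so that a suboptimal target arm dominates. The constant corruption size below is an illustrative choice, not the paper's (far more economical) attack:

```python
import numpy as np

def attacked_ucb(true_means, target_arm, rounds=2000, attack=3.0, seed=0):
    """UCB1 bandit whose observed rewards an attacker depresses whenever a
    non-target arm is pulled, steering the learner to the target arm."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    counts, sums = np.zeros(k), np.zeros(k)
    pulls_of_target = 0
    for t in range(1, rounds + 1):
        if t <= k:
            a = t - 1                                   # pull each arm once to start
        else:
            ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
            a = int(np.argmax(ucb))
        r = true_means[a] + rng.normal(0, 0.1)          # true stochastic reward
        if a != target_arm:
            r -= attack                                 # attacker corrupts the signal
        counts[a] += 1
        sums[a] += r
        pulls_of_target += (a == target_arm)
    return pulls_of_target / rounds

# Arm 0 is truly best, yet the attacked learner pulls suboptimal arm 2 most of the time.
assert attacked_ucb(true_means=[1.0, 0.8, 0.2], target_arm=2) > 0.8
```

The learner's confidence bounds are built entirely from corrupted observations, so the exploration bonus can never rescue the truly optimal arm.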

An Optimal Control Approach to Sequential Machine Teaching

no code implementations 15 Oct 2018 Laurent Lessard, Xuezhou Zhang, Xiaojin Zhu

Our key insight is to formulate sequential machine teaching as a time-optimal control problem.

Data Poisoning Attacks in Contextual Bandits

no code implementations 17 Aug 2018 Yuzhe Ma, Kwang-Sung Jun, Lihong Li, Xiaojin Zhu

We provide a general attack framework based on convex optimization and show that by slightly manipulating rewards in the data, an attacker can force the bandit algorithm to pull a target arm for a target contextual vector.

Data Poisoning Multi-Armed Bandits +2

Program Synthesis from Visual Specification

no code implementations 4 Jun 2018 Evan Hernandez, Ara Vartanian, Xiaojin Zhu

Program synthesis is the process of automatically translating a specification into computer code.

Program Synthesis

Teacher Improves Learning by Selecting a Training Subset

no code implementations 25 Feb 2018 Yuzhe Ma, Robert Nowak, Philippe Rigollet, Xuezhou Zhang, Xiaojin Zhu

We call a learner super-teachable if a teacher can trim down an iid training set while making the learner learn even better.

General Classification regression

Training Set Debugging Using Trusted Items

no code implementations 24 Jan 2018 Xuezhou Zhang, Xiaojin Zhu, Stephen J. Wright

The set of trusted items may not by itself be adequate for learning, so we propose an algorithm that uses these items to identify bugs in the training set and thus improves learning.

BIG-bench Machine Learning Bilevel Optimization

An Overview of Machine Teaching

no code implementations 18 Jan 2018 Xiaojin Zhu, Adish Singla, Sandra Zilles, Anna N. Rafferty

In this paper we try to organize machine teaching as a coherent set of ideas.

Are Key-Foreign Key Joins Safe to Avoid when Learning High-Capacity Classifiers?

no code implementations 3 Apr 2017 Vraj Shah, Arun Kumar, Xiaojin Zhu

Our results show that these high-capacity classifiers are, counter-intuitively, more robust to avoiding KFK joins than linear classifiers, refuting an intuition from the prior work's analysis.

Management

Analysis of a Design Pattern for Teaching with Features and Labels

no code implementations 18 Nov 2016 Christopher Meek, Patrice Simard, Xiaojin Zhu

We analyze the potential risks and benefits of this teaching pattern through the use of teaching protocols, illustrative examples, and by providing bounds on the effort required for an optimal machine teacher using a linear learning algorithm, the most commonly used type of learners in interactive machine learning systems.

The Teaching Dimension of Linear Learners

no code implementations 7 Dec 2015 Ji Liu, Xiaojin Zhu

Teaching dimension is a learning theoretic quantity that specifies the minimum training set size to teach a target model to a learner.

regression
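To give the flavor of teaching-dimension constructions, the sketch below teaches a ridge-regression learner an arbitrary target weight vector exactly with d examples. The specific construction (X = I, a regularizer lam known to the teacher) is an illustrative toy, not the paper's bounds:

```python
import numpy as np

def teach_ridge(w_target, lam=0.5):
    """Construct a training set (X, y) that makes a ridge-regression learner
    with regularizer `lam` recover `w_target` exactly.

    With X = I, the ridge solution (X^T X + lam I)^{-1} X^T y reduces to
    y / (1 + lam), so choosing y = (1 + lam) * w_target teaches w_target."""
    w_target = np.asarray(w_target, dtype=float)
    X = np.eye(len(w_target))          # one example per coordinate direction
    y = (1.0 + lam) * w_target
    return X, y

def ridge_fit(X, y, lam):
    """Standard ridge-regression estimator."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_star = np.array([2.0, -1.0, 0.5])
X, y = teach_ridge(w_star, lam=0.5)
assert np.allclose(ridge_fit(X, y, lam=0.5), w_star)
```

The point of teaching dimension is that a teacher who knows both the target and the learner's algorithm can often succeed with far fewer examples than passive learning requires.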

Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners

no code implementations 25 Jan 2015 Shike Mei, Xiaojin Zhu

We investigate a problem at the intersection of machine learning and security: training-set attacks on machine learners.

Bilevel Optimization regression

Machine Teaching for Bayesian Learners in the Exponential Family

no code implementations NeurIPS 2013 Xiaojin Zhu

What if there is a teacher who knows the learning goal and wants to design good training data for a machine learner?
