no code implementations • 30 Nov 2023 • Jeremy McMahan, Young Wu, Xiaojin Zhu, Qiaomin Xie
Although the defense problem is NP-hard, we show that optimal Markovian defenses can be computed (learned) in polynomial time (sample complexity) in many scenarios.
no code implementations • 26 Nov 2023 • Shubham Kumar Bharti, Stephen Wright, Adish Singla, Xiaojin Zhu
The goal of the teacher is to teach a realizable target policy to the learner using a minimum number of state demonstrations.
no code implementations • 16 Nov 2023 • Ara Vartanian, Xiaoxi Sun, Yun-Shiuan Chuang, Siddharth Suresh, Xiaojin Zhu, Timothy T. Rogers
This paper considers how interactions with AI algorithms can boost human creative thought.
1 code implementation • 9 Nov 2023 • Jeremy McMahan, Xiaojin Zhu
Our reduction yields planning and learning algorithms that are time and sample-efficient for tabular cMDPs so long as the precision of the costs is logarithmic in the size of the cMDP.
no code implementations • 1 Nov 2023 • Young Wu, Jeremy McMahan, Yiding Chen, Yudong Chen, Xiaojin Zhu, Qiaomin Xie
We study the game modification problem, where a benevolent game designer or a malevolent adversary modifies the reward function of a zero-sum Markov game so that a target deterministic or stochastic policy profile becomes the unique Markov perfect Nash equilibrium and has a value within a target range, in a way that minimizes the modification cost.
1 code implementation • NeurIPS 2023 • Xuefeng Du, Yiyou Sun, Xiaojin Zhu, Yixuan Li
Utilizing auxiliary outlier datasets to regularize the machine learning model has demonstrated promise for out-of-distribution (OOD) detection and safe prediction.
1 code implementation • 18 Jul 2023 • Jeremy McMahan, Young Wu, Yudong Chen, Xiaojin Zhu, Qiaomin Xie
Many real-world games suffer from information asymmetry: one player is only aware of their own payoffs while the other player has the full game information.
no code implementations • 13 Jun 2023 • Young Wu, Jeremy McMahan, Xiaojin Zhu, Qiaomin Xie
We characterize offline data poisoning attacks on Multi-Agent Reinforcement Learning (MARL), where an attacker may change a data set in an attempt to install a (potentially fictitious) unique Markov-perfect Nash equilibrium.
1 code implementation • 6 Mar 2023 • Leitian Tao, Xuefeng Du, Xiaojin Zhu, Yixuan Li
Importantly, our proposed synthesis approach does not make any distributional assumption on the ID embeddings, thereby offering strong flexibility and generality.
1 code implementation • 18 Nov 2022 • Shubham Kumar Bharti, Xuezhou Zhang, Adish Singla, Xiaojin Zhu
Instead, our defense mechanism sanitizes the backdoor policy by projecting observed states to a 'safe subspace', estimated from a small number of interactions with a clean (non-triggered) environment.
no code implementations • 4 Jun 2022 • Young Wu, Jeremy McMahan, Xiaojin Zhu, Qiaomin Xie
In offline multi-agent reinforcement learning (MARL), agents estimate policies from a given dataset.
no code implementations • 1 Jun 2022 • Yiding Chen, Xuezhou Zhang, Kaiqing Zhang, Mengdi Wang, Xiaojin Zhu
We consider a distributed reinforcement learning setting where multiple agents separately explore the environment and communicate their experiences through a central server.
2 code implementations • 13 Apr 2022 • Yiyou Sun, Yifei Ming, Xiaojin Zhu, Yixuan Li
In this paper, we explore the efficacy of non-parametric nearest-neighbor distance for OOD detection, which has been largely overlooked in the literature.
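The nearest-neighbor idea above can be illustrated with a minimal sketch: score a test embedding by its distance to the k-th nearest training embedding, with larger distances indicating likely OOD inputs. The toy data, function names, and parameters below are our own illustration, not the paper's released code.

```python
import numpy as np

def knn_ood_score(z, train_feats, k=5):
    """OOD score = distance to the k-th nearest training embedding.

    Non-parametric: no distributional assumption on the in-distribution
    (ID) features. Larger score -> more likely out-of-distribution.
    """
    dists = np.linalg.norm(train_feats - z, axis=1)
    return np.sort(dists)[k - 1]

# Toy ID embeddings clustered near (1, 0).
rng = np.random.default_rng(0)
train = rng.normal(loc=[1.0, 0.0], scale=0.05, size=(100, 2))

id_score = knn_ood_score(np.array([1.0, 0.0]), train)    # near the cluster
ood_score = knn_ood_score(np.array([-1.0, 0.0]), train)  # far from it
```

An input is then flagged as OOD when its score exceeds a threshold chosen on held-out ID data (e.g., at a 95% true-positive rate).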
no code implementations • 18 Oct 2021 • Yuzhe Ma, Young Wu, Xiaojin Zhu
We study the game redesign problem in which an external designer has the ability to change the payoff function in each round, but incurs a design cost for deviating from the original game.
no code implementations • 16 Feb 2021 • Amin Rakhsha, Xuezhou Zhang, Xiaojin Zhu, Adish Singla
We study black-box reward poisoning attacks against reinforcement learning (RL), in which an adversary aims to manipulate the rewards to mislead a sequence of RL agents with unknown algorithms to learn a nefarious policy in an environment unknown to the adversary a priori.
1 code implementation • 11 Feb 2021 • Xuezhou Zhang, Yiding Chen, Xiaojin Zhu, Wen Sun
Our first result shows that no algorithm can find a better than $O(\epsilon)$-optimal policy under our attack model.
no code implementations • 16 Dec 2020 • Yuzhe Ma, Jon Sharp, Ruizhe Wang, Earlence Fernandes, Xiaojin Zhu
In this paper, we study adversarial attacks on KF as part of the more complex machine-human hybrid system of Forward Collision Warning.
no code implementations • 21 Nov 2020 • Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
We provide lower/upper bounds on the attack cost, and instantiate our attacks in two settings: (i) an offline setting where the agent is doing planning in the poisoned environment, and (ii) an online setting where the agent is learning a policy with poisoned feedback.
no code implementations • 17 Oct 2020 • Farnam Mansouri, Yuxin Chen, Ara Vartanian, Xiaojin Zhu, Adish Singla
We analyze several properties of the teaching complexity parameter $TD(\sigma)$ associated with different families of the preference functions, e.g., comparison to the VC dimension of the hypothesis class and additivity/sub-additivity of $TD(\sigma)$ over disjoint domains.
no code implementations • 5 Sep 2020 • Yun-Shiuan Chuang, Xuezhou Zhang, Yuzhe Ma, Mark K. Ho, Joseph L. Austerweil, Xiaojin Zhu
To solve the machine teaching optimization problem, we use a deep learning approximation method which simulates learners in the environment and learns to predict how feedback affects the learner's internal states.
no code implementations • 30 Jun 2020 • Ayon Sen, Christopher R. Cox, Matthew Cooper Borkenhagen, Mark S. Seidenberg, Xiaojin Zhu
This is a hard combinatorial optimization problem even for a modest number of learning trials (e.g., 10K).
no code implementations • 16 Jun 2020 • Xiaomin Zhang, Xiaojin Zhu, Po-Ling Loh
We first formulate a general statistical algorithm for identifying buggy points and provide rigorous theoretical guarantees under the assumption that the data follow a linear model.
no code implementations • 16 Jun 2020 • Xuezhou Zhang, Shubham Kumar Bharti, Yuzhe Ma, Adish Singla, Xiaojin Zhu
Our TDim results provide the minimum number of samples needed for reinforcement learning, and we discuss their connections to standard PAC-style RL sample complexity and teaching-by-demonstration sample complexity results.
no code implementations • L4DC 2020 • Xuezhou Zhang, Xiaojin Zhu, Laurent Lessard
We study data poisoning attacks in the online learning setting, where training data arrive sequentially, and the attacker eavesdrops on the data stream and can contaminate the current data point to affect the online learning process.
1 code implementation • ICML 2020 • Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker.
no code implementations • ICML 2020 • Xuezhou Zhang, Yuzhe Ma, Adish Singla, Xiaojin Zhu
In reward-poisoning attacks against reinforcement learning (RL), an attacker can perturb the environment reward $r_t$ into $r_t+\delta_t$ at each step, with the goal of forcing the RL agent to learn a nefarious policy.
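A heavily simplified sketch of the threat model described above: at each step the attacker replaces the observed reward $r_t$ with $r_t+\delta_t$, here using a naive perturbation that makes an attacker-chosen action look best everywhere. The environment, learner, and attack rule below are all our own toy constructions, not the paper's adaptive attack.

```python
import numpy as np

# Toy setup: tabular Q-learning on a random-transition MDP; all names
# and parameters are illustrative.
n_states, n_actions, target_action = 3, 2, 1
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(1)

def poison(r, a):
    # delta_t: boost the target action's reward, penalize the rest.
    return r + (1.0 if a == target_action else -1.0)

alpha, gamma = 0.1, 0.9
s = 0
for t in range(3000):
    a = int(rng.integers(n_actions))       # exploratory behavior policy
    r = rng.normal(scale=0.5)              # true environment reward r_t
    s_next = int(rng.integers(n_states))
    r_poisoned = poison(r, a)              # attacker: r_t -> r_t + delta_t
    Q[s, a] += alpha * (r_poisoned + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

learned_policy = Q.argmax(axis=1)          # ends up preferring target_action
```

The paper's actual attacks are far more economical, bounding the total perturbation rather than shifting every reward by a constant.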
no code implementations • NeurIPS 2019 • Xuanqing Liu, Si Si, Xiaojin Zhu, Yang Li, Cho-Jui Hsieh
In this paper, we proposed a general framework for data poisoning attacks to graph-based semi-supervised learning (G-SSL).
no code implementations • NeurIPS 2019 • Farnam Mansouri, Yuxin Chen, Ara Vartanian, Xiaojin Zhu, Adish Singla
In our framework, each function $\sigma \in \Sigma$ induces a teacher-learner pair with teaching complexity $TD(\sigma)$.
no code implementations • 18 Oct 2019 • Zhiyan Ding, Yiding Chen, Qin Li, Xiaojin Zhu
To our knowledge, this is the first analysis for SGD error lower bound without the strong convexity assumption.
1 code implementation • NeurIPS 2019 • Yuzhe Ma, Xuezhou Zhang, Wen Sun, Xiaojin Zhu
We study a security threat to batch reinforcement learning and control where the attacker aims to poison the learned policy.
no code implementations • 8 Oct 2019 • Vicki Bier, Paul B. Kantor, Gary Lupyan, Xiaojin Zhu
Specifically, we suggest a large and extensible class of learning tasks, formulated as learning under rules.
no code implementations • 6 Jun 2019 • Ayon Sen, Xiaojin Zhu, Liam Marshall, Robert Nowak
Adversarial attacks aim to confound machine learning systems, while remaining virtually imperceptible to humans.
no code implementations • 16 May 2019 • Owen Levin, Zihang Meng, Vikas Singh, Xiaojin Zhu
It has recently been shown that neural networks can use images of human faces to accurately predict Body Mass Index (BMI), a widely used health indicator.
no code implementations • 23 Mar 2019 • Yuzhe Ma, Xiaojin Zhu, Justin Hsu
Data poisoning attacks aim to manipulate the model produced by a learning algorithm by adversarially modifying the training set.
no code implementations • 5 Mar 2019 • Xuezhou Zhang, Xiaojin Zhu, Laurent Lessard
We study data poisoning attacks in the online setting where training items arrive sequentially, and the attacker may perturb the current item to manipulate online learning.
no code implementations • 1 Feb 2019 • Yiding Chen, Xiaojin Zhu
In the white-box setting where the attacker knows the environment and forecast models, we present the optimal attack using LQR for linear models, and Model Predictive Control (MPC) for nonlinear models.
no code implementations • 11 Nov 2018 • Xiaojin Zhu
I describe an optimal control view of adversarial machine learning, where the dynamical system is the machine learner, the input are adversarial actions, and the control costs are defined by the adversary's goals to do harm and be hard to detect.
no code implementations • NeurIPS 2018 • Kwang-Sung Jun, Lihong Li, Yuzhe Ma, Xiaojin Zhu
We study adversarial attacks that manipulate the reward signals to control the actions chosen by a stochastic multi-armed bandit algorithm.
no code implementations • 15 Oct 2018 • Laurent Lessard, Xuezhou Zhang, Xiaojin Zhu
Our key insight is to formulate sequential machine teaching as a time-optimal control problem.
no code implementations • 17 Aug 2018 • Yuzhe Ma, Kwang-Sung Jun, Lihong Li, Xiaojin Zhu
We provide a general attack framework based on convex optimization and show that by slightly manipulating rewards in the data, an attacker can force the bandit algorithm to pull a target arm for a target contextual vector.
no code implementations • 4 Jun 2018 • Evan Hernandez, Ara Vartanian, Xiaojin Zhu
Program synthesis is the process of automatically translating a specification into computer code.
no code implementations • 25 Feb 2018 • Yuzhe Ma, Robert Nowak, Philippe Rigollet, Xuezhou Zhang, Xiaojin Zhu
We call a learner super-teachable if a teacher can trim down an i.i.d. training set while making the learner learn even better.
no code implementations • 24 Jan 2018 • Xuezhou Zhang, Xiaojin Zhu, Stephen J. Wright
The set of trusted items may not by itself be adequate for learning, so we propose an algorithm that uses these items to identify bugs in the training set and thus improves learning.
no code implementations • 18 Jan 2018 • Xiaojin Zhu, Adish Singla, Sandra Zilles, Anna N. Rafferty
In this paper we try to organize machine teaching as a coherent set of ideas.
no code implementations • 3 Apr 2017 • Vraj Shah, Arun Kumar, Xiaojin Zhu
Our results show that these high-capacity classifiers are surprisingly and counter-intuitively more robust to avoiding KFK joins compared to linear classifiers, refuting an intuition from the prior work's analysis.
no code implementations • 18 Nov 2016 • Christopher Meek, Patrice Simard, Xiaojin Zhu
We analyze the potential risks and benefits of this teaching pattern through the use of teaching protocols, illustrative examples, and bounds on the effort required for an optimal machine teacher using a linear learning algorithm, the most commonly used type of learner in interactive machine learning systems.
no code implementations • 7 Dec 2015 • Ji Liu, Xiaojin Zhu
Teaching dimension is a learning theoretic quantity that specifies the minimum training set size to teach a target model to a learner.
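A classic toy illustration of why a teaching set can be tiny (our own construction, not the paper's): to teach a 1-D threshold classifier $h(x) = \mathbf{1}[x \ge \theta]$ to a consistent learner, two examples that tightly bracket $\theta$ suffice, far fewer than passive i.i.d. sampling would need for the same precision.

```python
theta = 0.37      # target threshold the teacher wants to convey
eps = 0.01        # desired precision

# Teacher picks two points straddling theta at gap 2*eps.
teaching_set = [(theta - eps, 0), (theta + eps, 1)]

# A consistent learner: any threshold separating the labels works;
# take the midpoint of the tightest (negative, positive) pair.
neg = max(x for x, y in teaching_set if y == 0)
pos = min(x for x, y in teaching_set if y == 1)
learned_theta = (neg + pos) / 2   # recovers theta up to eps
```

Here the teaching set size (2) is independent of the desired precision, whereas a passive learner needs roughly $1/\epsilon$ random samples to localize the threshold as tightly.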
no code implementations • 29 Jun 2015 • Gautam Dasarathy, Robert Nowak, Xiaojin Zhu
This paper investigates the problem of active learning for binary label prediction on a graph.
no code implementations • 25 Jan 2015 • Shike Mei, Xiaojin Zhu
We investigate a problem at the intersection of machine learning and security: training-set attacks on machine learners.
no code implementations • NeurIPS 2013 • Xiaojin Zhu
What if there is a teacher who knows the learning goal and wants to design good training data for a machine learner?