Search Results for author: Zizhan Zheng

Found 14 papers, 3 papers with code

Belief-Enriched Pessimistic Q-Learning against Adversarial State Perturbations

1 code implementation · 6 Mar 2024 · Xiaolin Sun, Zizhan Zheng

Existing solutions either introduce a regularization term to improve the smoothness of the trained policy against perturbations or alternately train the agent's policy and the attacker's policy.

Q-Learning, Reinforcement Learning (RL)

Enhancing LLM Safety via Constrained Direct Preference Optimization

no code implementations · 4 Mar 2024 · Zixuan Liu, Xiaolin Sun, Zizhan Zheng

Empirically, our approach provides a safety guarantee to LLMs that is missing in DPO while achieving significantly higher rewards under the same safety constraint compared to a recently proposed safe RLHF approach.

reinforcement-learning

A First Order Meta Stackelberg Method for Robust Federated Learning

no code implementations · 23 Jun 2023 · Yunian Pan, Tao Li, Henger Li, Tianyi Xu, Zizhan Zheng, Quanyan Zhu

Previous research has shown that federated learning (FL) systems are exposed to an array of security risks.

Federated Learning, Meta-Learning

Learning to Backdoor Federated Learning

1 code implementation · 6 Mar 2023 · Henger Li, Chen Wu, Sencun Zhu, Zizhan Zheng

In particular, we propose a general reinforcement learning-based backdoor attack framework where the attacker first trains a (non-myopic) attack policy using a simulator built upon its local data and common knowledge on the FL system, which is then applied during actual FL training.

Backdoor Attack, Federated Learning

Online Learning for Adaptive Probing and Scheduling in Dense WLANs

no code implementations · 27 Dec 2022 · Tianyi Xu, Ding Zhang, Zizhan Zheng

The problem is challenging even when the link rate distributions are pre-known (the offline setting) due to the necessity of balancing the information gains from probing and the cost of reducing the data transmission opportunity.

Scheduling

Pandering in a Flexible Representative Democracy

no code implementations · 18 Nov 2022 · Xiaolin Sun, Jacob Masur, Ben Abramowitz, Nicholas Mattei, Zizhan Zheng

We introduce a novel formal model of pandering, or strategic preference reporting by candidates seeking to be elected, and examine the resilience of two democratic voting systems to pandering within a single round and across multiple rounds.

Privacy Protected Multi-Domain Collaborative Learning

no code implementations · 29 Sep 2021 · Haifeng Xia, Taotao Jing, Zizhan Zheng, Zhengming Ding

Unsupervised domain adaptation (UDA) aims to transfer knowledge from one or more well-labeled source domains to improve model performance on the different-yet-related target domain without any annotations.

Unsupervised Domain Adaptation

Joint AP Probing and Scheduling: A Contextual Bandit Approach

no code implementations · 6 Aug 2021 · Tianyi Xu, Ding Zhang, Parth H. Pathak, Zizhan Zheng

In contrast to traditional link scheduling problems under uncertainty, we assume that in each time step, the device can probe a subset of links before deciding which one to use.

Decision Making, Scheduling

Robust Sequence Submodular Maximization

no code implementations · NeurIPS 2020 · Gamal Sallam, Zizhan Zheng, Jie Wu, Bo Ji

Compared to robust submodular maximization for set functions, new challenges arise when sequence functions are concerned.

Structure Matters: Towards Generating Transferable Adversarial Images

no code implementations · 22 Oct 2019 · Dan Peng, Zizhan Zheng, Linhao Luo, Xiaofeng Zhang

In this paper, we propose the novel concepts of structure patterns and structure-aware perturbations that relax the small perturbation constraint while still keeping images natural.

Image Classification, Novel Concepts

Structure-Preserving Transformation: Generating Diverse and Transferable Adversarial Examples

1 code implementation · 8 Sep 2018 · Dan Peng, Zizhan Zheng, Xiaofeng Zhang

A common requirement in all these works is that the malicious perturbations should be small enough (measured by an L_p norm for some p) so that they are imperceptible to humans.

Image Classification

Analysis of Thompson Sampling for Graphical Bandits Without the Graphs

no code implementations · 23 May 2018 · Fang Liu, Zizhan Zheng, Ness Shroff

To fill this gap, we propose a variant of Thompson Sampling that attains the optimal regret in the directed setting within a logarithmic factor.

Thompson Sampling

When to Reset Your Keys: Optimal Timing of Security Updates via Learning

no code implementations · 1 Dec 2016 · Zizhan Zheng, Ness B. Shroff, Prasant Mohapatra

As these attacks are often designed to disable a system (or a critical resource, e.g., a user account) repeatedly, it is crucial for the defender to keep updating its security measures to strike a balance between the risk of being compromised and the cost of security updates.
