Search Results for author: Yu-Xiang Wang

Found 109 papers, 29 papers with code

Differentially Private Reinforcement Learning with Self-Play

no code implementations • 11 Apr 2024 • Dan Qiao, Yu-Xiang Wang

We study the problem of multi-agent reinforcement learning (multi-agent RL) with differential privacy (DP) constraints.

Multi-agent Reinforcement Learning reinforcement-learning

CPR: Retrieval Augmented Generation for Copyright Protection

no code implementations • 27 Mar 2024 • Aditya Golatkar, Alessandro Achille, Luca Zancato, Yu-Xiang Wang, Ashwin Swaminathan, Stefano Soatto

To reduce the risk of leaking private information contained in the retrieved set, we introduce Copy-Protected generation with Retrieval (CPR), a new method for RAG with strong copyright protection guarantees in a mixed-private setting for diffusion models. CPR conditions the output of diffusion models on a set of retrieved images, while also guaranteeing that uniquely identifiable information about those examples is not exposed in the generated outputs.

Image Generation Machine Unlearning +1

Privacy Profiles for Private Selection

no code implementations • 9 Feb 2024 • Antti Koskela, Rachel Redberg, Yu-Xiang Wang

Private selection mechanisms (e.g., Report Noisy Max, Sparse Vector) are fundamental primitives of differentially private (DP) data analysis with wide applications to private query release, voting, and hyperparameter tuning.
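
As a rough illustration of one primitive named above, Report Noisy Max can be sketched as follows. This is a generic textbook version, not the analysis from the paper; the function name `report_noisy_max`, its parameters, and the conservative Laplace scale of `2 * sensitivity / epsilon` are assumptions chosen for illustration.

```python
import numpy as np


def report_noisy_max(scores, epsilon, sensitivity=1.0, rng=None):
    """Return the index of the largest score after adding Laplace noise.

    Hypothetical helper: `scores` holds one query answer per candidate and
    `sensitivity` bounds how much any single answer can change when one
    record changes.
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.asarray(scores, dtype=float)
    # Assumed noise scale: 2 * sensitivity / epsilon (a conservative choice).
    noise = rng.laplace(loc=0.0, scale=2.0 * sensitivity / epsilon, size=scores.shape)
    # Only the winning index is released, never the noisy scores themselves.
    return int(np.argmax(scores + noise))
```

With a very large `epsilon` the noise becomes negligible and the function returns the plain argmax; smaller values trade selection accuracy for privacy.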

Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs

1 code implementation • 8 Feb 2024 • Xuandong Zhao, Lei LI, Yu-Xiang Wang

In this paper, we propose a new decoding method called Permute-and-Flip (PF) decoder.

Online Feature Updates Improve Online (Generalized) Label Shift Adaptation

no code implementations • 5 Feb 2024 • Ruihan Wu, Siddhartha Datta, Yi Su, Dheeraj Baby, Yu-Xiang Wang, Kilian Q. Weinberger

This paper addresses the prevalent issue of label shift in an online setting with missing labels, where data distributions change over time and obtaining timely labels is challenging.

Missing Labels Self-Supervised Learning

Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints

no code implementations • 2 Feb 2024 • Dan Qiao, Yu-Xiang Wang

We study the problem of multi-agent reinforcement learning (MARL) with adaptivity constraints, a new problem motivated by real-world applications where deployments of new policies are costly and the number of policy updates must be minimized.

Multi-agent Reinforcement Learning reinforcement-learning

Weak-to-Strong Jailbreaking on Large Language Models

1 code implementation • 30 Jan 2024 • Xuandong Zhao, Xianjun Yang, Tianyu Pang, Chao Du, Lei LI, Yu-Xiang Wang, William Yang Wang

In this paper, we propose the weak-to-strong jailbreaking attack, an efficient method to attack aligned LLMs to produce harmful text.

Improving the Privacy and Practicality of Objective Perturbation for Differentially Private Linear Learners

no code implementations • NeurIPS 2023 • Rachel Redberg, Antti Koskela, Yu-Xiang Wang

In the arena of privacy-preserving machine learning, differentially private stochastic gradient descent (DP-SGD) has outstripped the objective perturbation mechanism in popularity and interest.

Privacy Preserving regression

Pricing with Contextual Elasticity and Heteroscedastic Valuation

no code implementations • 26 Dec 2023 • Jianyu Xu, Yu-Xiang Wang

We study an online contextual dynamic pricing problem, where customers decide whether to purchase a product based on its features and price.

Communication-Efficient Federated Non-Linear Bandit Optimization

no code implementations • 3 Nov 2023 • Chuanhao Li, Chong Liu, Yu-Xiang Wang

Federated optimization studies the problem of collaborative function optimization among multiple clients (e.g., mobile devices or organizations) under the coordination of a central server.

On the accuracy and efficiency of group-wise clipping in differentially private optimization

no code implementations • 30 Oct 2023 • Zhiqi Bu, Ruixuan Liu, Yu-Xiang Wang, Sheng Zha, George Karypis

Recent advances have substantially improved the accuracy, memory cost, and training speed of differentially private (DP) deep learning, especially on large vision and language models with millions to billions of parameters.

Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy

no code implementations • 23 Oct 2023 • Yingyu Lin, Yian Ma, Yu-Xiang Wang, Rachel Redberg

Posterior sampling, i.e., exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP.

Coupling public and private gradient provably helps optimization

no code implementations • 2 Oct 2023 • Ruixuan Liu, Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

The success of large neural networks is crucially determined by the availability of data.

Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

no code implementations • 30 Aug 2023 • Jiachen T. Wang, Yuqing Zhu, Yu-Xiang Wang, Ruoxi Jia, Prateek Mittal

Data valuation aims to quantify the usefulness of individual data sources in training machine learning (ML) models, and is a critical aspect of data-centric ML research.

Data Valuation

Model-Free Algorithm with Improved Sample Efficiency for Zero-Sum Markov Games

no code implementations • 17 Aug 2023 • Songtao Feng, Ming Yin, Yu-Xiang Wang, Jing Yang, Yingbin Liang

In this work, we propose a model-free stage-based Q-learning algorithm and show that it achieves the same sample complexity as the best model-based algorithm, and hence for the first time demonstrate that model-free algorithms can enjoy the same optimality in the $H$ dependence as model-based algorithms.

Multi-agent Reinforcement Learning Q-Learning +1

Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks

no code implementations • 4 Jul 2023 • Kaiqi Zhang, Zixuan Zhang, Minshuo Chen, Yuma Takeda, Mengdi Wang, Tuo Zhao, Yu-Xiang Wang

Convolutional residual neural networks (ConvResNets), though overparameterized, can achieve remarkable prediction performance in practice, which cannot be well explained by conventional wisdom.

Provable Robust Watermarking for AI-Generated Text

4 code implementations • 30 Jun 2023 • Xuandong Zhao, Prabhanjan Ananth, Lei LI, Yu-Xiang Wang

We propose a robust and high-quality watermark method, Unigram-Watermark, by extending an existing approach with a simplified fixed grouping strategy.

Language Modelling

Offline Policy Evaluation for Reinforcement Learning with Adaptively Collected Data

no code implementations • 24 Jun 2023 • Sunil Madhow, Dan Xiao, Ming Yin, Yu-Xiang Wang

Developing theoretical guarantees on the sample complexity of offline RL methods is an important step towards making data-hungry RL algorithms practically viable.

Offline RL reinforcement-learning

"Private Prediction Strikes Back!" Private Kernelized Nearest Neighbors with Individual Renyi Filter

1 code implementation • 12 Jun 2023 • Yuqing Zhu, Xuandong Zhao, Chuan Guo, Yu-Xiang Wang

Most existing approaches to differentially private (DP) machine learning focus on private training.

Invisible Image Watermarks Are Provably Removable Using Generative AI

1 code implementation • 2 Jun 2023 • Xuandong Zhao, Kexun Zhang, Zihao Su, Saastha Vasan, Ilya Grishchenko, Christopher Kruegel, Giovanni Vigna, Yu-Xiang Wang, Lei LI

However, if we do not require the watermarked image to look the same as the original one, watermarks that keep the image semantically similar can be an alternative defense against our attack.

Image Denoising

Non-stationary Reinforcement Learning under General Function Approximation

no code implementations • 1 Jun 2023 • Songtao Feng, Ming Yin, Ruiquan Huang, Yu-Xiang Wang, Jing Yang, Yingbin Liang

To the best of our knowledge, this is the first dynamic regret analysis in non-stationary MDPs with general function approximation.

reinforcement-learning Reinforcement Learning (RL)

Improved Differentially Private Regression via Gradient Boosting

2 code implementations • 6 Mar 2023 • Shuai Tang, Sergul Aydore, Michael Kearns, Saeyoung Rho, Aaron Roth, Yichen Wang, Yu-Xiang Wang, Zhiwei Steven Wu

We revisit the problem of differentially private squared error linear regression.

regression

No-Regret Linear Bandits beyond Realizability

no code implementations • 26 Feb 2023 • Chong Liu, Ming Yin, Yu-Xiang Wang

It achieves a near-optimal $\sqrt{T}$ regret for problems in which the best-known regret is almost linear in the time horizon $T$.

Logarithmic Switching Cost in Reinforcement Learning beyond Linear MDPs

no code implementations • 24 Feb 2023 • Dan Qiao, Ming Yin, Yu-Xiang Wang

In many real-life reinforcement learning (RL) problems, deploying new policies is costly.

reinforcement-learning Reinforcement Learning (RL)

Protecting Language Generation Models via Invisible Watermarking

2 code implementations • 6 Feb 2023 • Xuandong Zhao, Yu-Xiang Wang, Lei LI

We can then detect the secret message by probing a suspect model to tell if it is distilled from the protected one.

Model extraction Text Generation

Generalized PTR: User-Friendly Recipes for Data-Adaptive Algorithms with Differential Privacy

no code implementations • 31 Dec 2022 • Rachel Redberg, Yuqing Zhu, Yu-Xiang Wang

The "Propose-Test-Release" (PTR) framework is a classic recipe for designing differentially private (DP) algorithms that are data-adaptive, i.e., those that add less noise when the input dataset is nice.

regression

Near-Optimal Differentially Private Reinforcement Learning

no code implementations • 9 Dec 2022 • Dan Qiao, Yu-Xiang Wang

We close this gap for the JDP case by designing an $\epsilon$-JDP algorithm with a regret of $\widetilde{O}(\sqrt{SAH^2T}+S^2AH^3/\epsilon)$ which matches the information-theoretic lower bound of non-private learning for all choices of $\epsilon > S^{1.5}A^{0.5}H^2/\sqrt{T}$.

reinforcement-learning Reinforcement Learning (RL)

Offline Reinforcement Learning with Closed-Form Policy Improvement Operators

no code implementations • 29 Nov 2022 • Jiachen Li, Edwin Zhang, Ming Yin, Qinxun Bai, Yu-Xiang Wang, William Yang Wang

Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning.

D4RL Offline RL +2

Global Optimization with Parametric Function Approximation

no code implementations • 16 Nov 2022 • Chong Liu, Yu-Xiang Wang

We consider the problem of global optimization with noisy zeroth-order oracles, a well-motivated problem useful for various applications ranging from hyper-parameter tuning for deep learning to new material design.

Bayesian Optimization Gaussian Processes

Distillation-Resistant Watermarking for Model Protection in NLP

1 code implementation • 7 Oct 2022 • Xuandong Zhao, Lei LI, Yu-Xiang Wang

We prove that a protected model still retains the original accuracy within a certain bound.

Named Entity Recognition Named Entity Recognition (NER) +2

Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient

no code implementations • 3 Oct 2022 • Ming Yin, Mengdi Wang, Yu-Xiang Wang

Offline reinforcement learning, which aims at optimizing sequential decision-making strategies with historical data, has been extensively applied in real-life applications.

Decision Making Offline RL +3

Near-Optimal Deployment Efficiency in Reward-Free Reinforcement Learning with Linear Function Approximation

no code implementations • 3 Oct 2022 • Dan Qiao, Yu-Xiang Wang

We study the problem of deployment efficient reinforcement learning (RL) with linear function approximation under the reward-free exploration setting.

reinforcement-learning Reinforcement Learning (RL)

Differentially Private Optimization on Large Model at Small Cost

2 code implementations • 30 Sep 2022 • Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

Our implementation achieves state-of-the-art (SOTA) accuracy with very small extra cost: on GPT2 and at almost the same memory cost (<1% overhead), BK has 1.03X the time complexity of the standard training (0.83X training speed in practice), and 0.61X the time complexity of the most efficient DP implementation (1.36X training speed in practice).

Privacy Preserving

Differentially Private Bias-Term only Fine-tuning of Foundation Models

1 code implementation • 30 Sep 2022 • Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis

We study the problem of differentially private (DP) fine-tuning of large pre-trained models, a recent privacy-preserving approach suitable for solving downstream tasks with sensitive data.

Privacy Preserving

Doubly Fair Dynamic Pricing

no code implementations • 23 Sep 2022 • Jianyu Xu, Dan Qiao, Yu-Xiang Wang

We show that a doubly fair policy must be random to have higher revenue than the best trivial policy that assigns the same price to different groups.

Fairness

Optimal Dynamic Regret in LQR Control

no code implementations • 18 Jun 2022 • Dheeraj Baby, Yu-Xiang Wang

We consider the problem of nonstochastic control with a sequence of quadratic losses, i.e., LQR control.

Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks

no code implementations • 13 Jun 2022 • Kaiqi Zhang, Ming Yin, Yu-Xiang Wang

We propose a quasi neural network to approximate the distribution propagation, which is a neural network with continuous parameters and smooth activation function.

Quantization

Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality

no code implementations • 10 Jun 2022 • Ming Yin, Wenjing Chen, Mengdi Wang, Yu-Xiang Wang

Goal-oriented Reinforcement Learning, where the agent needs to reach the goal state while simultaneously minimizing the cost, has received significant attention in real-world applications.

Provably Confidential Language Modelling

1 code implementation • NAACL 2022 • Xuandong Zhao, Lei LI, Yu-Xiang Wang

Large language models are shown to memorize private information such as social security numbers in training data.

Language Modelling Memorization +1

Second Order Path Variationals in Non-Stationary Online Learning

no code implementations • 4 May 2022 • Dheeraj Baby, Yu-Xiang Wang

We consider the problem of universal dynamic regret minimization under exp-concave and smooth losses.

Deep Learning meets Nonparametric Regression: Are Weight-Decayed DNNs Locally Adaptive?

no code implementations • 20 Apr 2022 • Kaiqi Zhang, Yu-Xiang Wang

We consider a "Parallel NN" variant of deep ReLU networks and show that the standard weight decay is equivalent to promoting the $\ell_p$-sparsity ($0<p<1$) of the coefficient vector of an end-to-end learned function basis, i.e., a dictionary.

regression

Towards Differential Relational Privacy and its use in Question Answering

no code implementations • 30 Mar 2022 • Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, Stefano Soatto

While bounding general memorization can have detrimental effects on the performance of a trained model, bounding RM does not prevent effective learning.

Memorization Question Answering

Adaptive Private-K-Selection with Adaptive K and Application to Multi-label PATE

no code implementations • 30 Mar 2022 • Yuqing Zhu, Yu-Xiang Wang

We provide an end-to-end Renyi DP-based framework for differentially private top-$k$ selection.

Multi-Label Classification

Mixed Differential Privacy in Computer Vision

no code implementations • CVPR 2022 • Aditya Golatkar, Alessandro Achille, Yu-Xiang Wang, Aaron Roth, Michael Kearns, Stefano Soatto

AdaMix incorporates few-shot training, or cross-modal zero-shot learning, on public data prior to private fine-tuning, to improve the trade-off.

Zero-Shot Learning

Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism

no code implementations • 11 Mar 2022 • Ming Yin, Yaqi Duan, Mengdi Wang, Yu-Xiang Wang

However, a precise understanding of the statistical limits with function representations remains elusive, even when such a representation is linear.

Decision Making reinforcement-learning +1

Sample-Efficient Reinforcement Learning with loglog(T) Switching Cost

no code implementations • 13 Feb 2022 • Dan Qiao, Ming Yin, Ming Min, Yu-Xiang Wang

In this paper, we propose a new algorithm based on stage-wise exploration and adaptive policy elimination that achieves a regret of $\widetilde{O}(\sqrt{H^4S^2AT})$ while requiring a switching cost of $O(HSA \log\log T)$.

reinforcement-learning Reinforcement Learning (RL)

Towards Agnostic Feature-based Dynamic Pricing: Linear Policies vs Linear Valuation with Unknown Noise

no code implementations • 27 Jan 2022 • Jianyu Xu, Yu-Xiang Wang

In feature-based dynamic pricing, a seller sets appropriate prices for a sequence of products (described by feature vectors) on the fly by learning from the binary outcomes of previous sales sessions ("Sold" if valuation $\geq$ price, and "Not Sold" otherwise).
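
The binary feedback described above can be sketched in a few lines. The names `sale_outcome` and `simulate_sessions`, and the linear-plus-noise valuation model, are assumptions made for illustration only; the paper's own models and algorithms are not reproduced here.

```python
import numpy as np


def sale_outcome(valuation, price):
    """Binary feedback seen by the seller: "Sold" iff valuation >= price."""
    return "Sold" if valuation >= price else "Not Sold"


def simulate_sessions(features, theta, prices, noise):
    """Simulate a sequence of sales sessions.

    Assumed (hypothetical) valuation model: valuation = features @ theta + noise.
    The seller observes only the returned outcomes, never the valuations.
    """
    valuations = np.asarray(features, dtype=float) @ np.asarray(theta, dtype=float)
    valuations = valuations + np.asarray(noise, dtype=float)
    return [sale_outcome(v, p) for v, p in zip(valuations, prices)]
```

For example, with `features = [[1, 0], [0, 1]]`, `theta = [2, 3]`, zero noise, and prices `[1.5, 3.5]`, the valuations are 2 and 3, so the first session sells and the second does not.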

Optimal Dynamic Regret in Proper Online Learning with Strongly Convex Losses and Beyond

no code implementations • 21 Jan 2022 • Dheeraj Baby, Yu-Xiang Wang

We study the framework of universal dynamic regret minimization with strongly convex losses.

Multivariate Trend Filtering for Lattice Data

no code implementations • 29 Dec 2021 • Veeranjaneyulu Sadhanala, Yu-Xiang Wang, Addison J. Hu, Ryan J. Tibshirani

We study a multivariate version of trend filtering, called Kronecker trend filtering or KTF, for the case in which the design points form a lattice in $d$ dimensions.

Privately Publishable Per-instance Privacy

no code implementations • NeurIPS 2021 • Rachel Redberg, Yu-Xiang Wang

We consider how to privately share the personalized privacy losses incurred by objective perturbation, using per-instance differential privacy (pDP).

Towards Instance-Optimal Offline Reinforcement Learning with Pessimism

no code implementations • NeurIPS 2021 • Ming Yin, Yu-Xiang Wang

We study the offline reinforcement learning (offline RL) problem, where the goal is to learn a reward-maximizing policy in an unknown Markov Decision Process (MDP) using the data coming from a policy $\mu$.

Offline RL reinforcement-learning +1

SeqPATE: Differentially Private Text Generation via Knowledge Distillation

no code implementations • 29 Sep 2021 • Zhiliang Tian, Yingxiu Zhao, Ziyue Huang, Yu-Xiang Wang, Nevin Zhang, He He

Differentially private (DP) learning algorithms provide guarantees against identifying the existence of a training sample from model outputs.

Knowledge Distillation Sentence +2

Smoothed Differential Privacy

no code implementations • 4 Jul 2021 • Ao Liu, Yu-Xiang Wang, Lirong Xia

Differential privacy (DP) is a widely-accepted and widely-applied notion of privacy based on worst-case analysis.

Optimal Accounting of Differential Privacy via Characteristic Function

1 code implementation • 16 Jun 2021 • Yuqing Zhu, Jinshuo Dong, Yu-Xiang Wang

Characterizing the privacy degradation over compositions, i.e., privacy accounting, is a fundamental topic in differential privacy (DP) with many applications to differentially private machine learning and federated learning.

Federated Learning

Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings

no code implementations • NeurIPS 2021 • Ming Yin, Yu-Xiang Wang

This work studies the statistical limits of uniform convergence for offline policy evaluation (OPE) problems with model-based methods (for episodic MDP) and provides a unified framework towards optimal learning for several well-motivated offline tasks.

Offline RL

Optimal Dynamic Regret in Exp-Concave Online Learning

no code implementations • 23 Apr 2021 • Dheeraj Baby, Yu-Xiang Wang

We consider the problem of Zinkevich (2003)-style dynamic regret minimization in online learning with exp-concave losses.

Logarithmic Regret in Feature-based Dynamic Pricing

no code implementations • NeurIPS 2021 • Jianyu Xu, Yu-Xiang Wang

Feature-based dynamic pricing is an increasingly popular model of setting prices for highly differentiated products with applications in digital marketing, online sales, real estate and so on.

Marketing

Non-stationary Online Learning with Memory and Non-stochastic Control

no code implementations • 7 Feb 2021 • Peng Zhao, Yu-Hu Yan, Yu-Xiang Wang, Zhi-Hua Zhou

We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions and thus captures temporal effects of learning problems.

Near-Optimal Offline Reinforcement Learning via Double Variance Reduction

no code implementations • NeurIPS 2021 • Ming Yin, Yu Bai, Yu-Xiang Wang

Our main result shows that OPDVR provably identifies an $\epsilon$-optimal policy with $\widetilde{O}(H^2/d_m\epsilon^2)$ episodes of offline data in the finite-horizon stationary transition setting, where $H$ is the horizon length and $d_m$ is the minimal marginal state-action distribution induced by the behavior policy.

Offline RL reinforcement-learning +1

An Optimal Reduction of TV-Denoising to Adaptive Online Learning

no code implementations • 23 Jan 2021 • Dheeraj Baby, Xuandong Zhao, Yu-Xiang Wang

We consider the problem of estimating a function from $n$ noisy samples whose discrete Total Variation (TV) is bounded by $C_n$.

Denoising Time Series +1

Improving Sparse Vector Technique with Renyi Differential Privacy

no code implementations • NeurIPS 2020 • Yuqing Zhu, Yu-Xiang Wang

The Sparse Vector Technique (SVT) is one of the most fundamental algorithmic tools in differential privacy (DP).

Revisiting Model-Agnostic Private Learning: Faster Rates and Active Learning

no code implementations • 6 Nov 2020 • Chong Liu, Yuqing Zhu, Kamalika Chaudhuri, Yu-Xiang Wang

The Private Aggregation of Teacher Ensembles (PATE) framework is one of the most promising recent approaches in differentially private learning.

Active Learning Majority Voting Classifier

Inter-Series Attention Model for COVID-19 Forecasting

1 code implementation • 25 Oct 2020 • Xiaoyong Jin, Yu-Xiang Wang, Xifeng Yan

The COVID-19 pandemic has had an unprecedented impact all over the world since early 2020.

Time Series Time Series Analysis

Voting-based Approaches For Differentially Private Federated Learning

no code implementations • 9 Oct 2020 • Yuqing Zhu, Xiang Yu, Yi-Hsuan Tsai, Francesco Pittaluga, Masoud Faraki, Manmohan Chandraker, Yu-Xiang Wang

Differentially Private Federated Learning (DPFL) is an emerging field with many applications.

Federated Learning Transfer Learning

Adaptive Online Estimation of Piecewise Polynomial Trends

no code implementations • NeurIPS 2020 • Dheeraj Baby, Yu-Xiang Wang

We consider the framework of non-stationary stochastic optimization [Besbes et al., 2015] with squared error losses and noisy gradient feedback, where the dynamic regret of an online learner against a time-varying comparator sequence is studied.

2k regression +1

Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning

no code implementations • 7 Jul 2020 • Ming Yin, Yu Bai, Yu-Xiang Wang

The problem of Offline Policy Evaluation (OPE) in Reinforcement Learning (RL) is a critical step towards applying RL in real-life applications.

Offline RL reinforcement-learning +1

Bullseye Polytope: A Scalable Clean-Label Poisoning Attack with Improved Transferability

1 code implementation • 1 May 2020 • Hojjat Aghakhani, Dongyu Meng, Yu-Xiang Wang, Christopher Kruegel, Giovanni Vigna

Our attack, Bullseye Polytope, improves the attack success rate of the current state-of-the-art by 26.75% in end-to-end transfer learning, while increasing attack speed by a factor of 12.

Transfer Learning

Domain Adaptation with Conditional Distribution Matching and Generalized Label Shift

1 code implementation • NeurIPS 2020 • Remi Tachet, Han Zhao, Yu-Xiang Wang, Geoff Gordon

However, recent work has shown limitations of this approach when label distributions differ between the source and target domains.

Multi-class Classification Unsupervised Domain Adaptation

Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning

no code implementations • 29 Jan 2020 • Ming Yin, Yu-Xiang Wang

We consider the problem of off-policy evaluation for reinforcement learning, where the goal is to estimate the expected reward of a target policy $\pi$ using offline data collected by running a logging policy $\mu$.
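
To make the setup concrete, here is a minimal sketch of the classical trajectory-wise importance sampling estimator for this problem. It is a standard baseline, not the estimator proposed in the paper; the function name `importance_sampling_value` and the data layout (trajectories as lists of `(state, action, reward)` tuples, policies as tables indexed `[state][action]`) are assumptions for illustration.

```python
import numpy as np


def importance_sampling_value(trajectories, pi, mu):
    """Estimate the expected total reward of target policy `pi` from data
    logged under `mu`, using trajectory-wise importance weights.

    trajectories: list of episodes, each a list of (state, action, reward).
    pi, mu: tables of action probabilities indexed as policy[state][action].
    """
    estimates = []
    for episode in trajectories:
        weight = 1.0
        total_reward = 0.0
        for state, action, reward in episode:
            # Reweight by how much more (or less) likely pi is to take the action.
            weight *= pi[state][action] / mu[state][action]
            total_reward += reward
        estimates.append(weight * total_reward)
    return float(np.mean(estimates))
```

When `pi` and `mu` coincide, every weight equals 1 and the estimate reduces to the average logged return.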

Off-policy evaluation reinforcement-learning

Semantic Guided and Response Times Bounded Top-k Similarity Search over Knowledge Graphs

2 code implementations • 15 Oct 2019 • Yu-Xiang Wang, Arijit Khan, Tianxing Wu, Jiahui Jin, Haijiang Yan

We face two challenges in graph querying over a knowledge graph: (1) the structural gap between $G_Q$ and the predefined schema in $G$ causes a mismatch with the query graph, and (2) users cannot view the answers until the graph query terminates, leading to a longer system response time (SRT).

Databases

Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting

2 code implementations • NeurIPS 2019 • Shiyang Li, Xiaoyong Jin, Yao Xuan, Xiyou Zhou, Wenhu Chen, Yu-Xiang Wang, Xifeng Yan

Time series forecasting is an important problem across many domains, including predictions of solar plant energy output, electricity consumption, and traffic jam situations.

Ranked #27 on Image Generation on ImageNet 64x64 (Bits per dim metric)

Time Series Time Series Forecasting

Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling

no code implementations • NeurIPS 2019 • Tengyang Xie, Yifei Ma, Yu-Xiang Wang

To solve this problem, we consider a marginalized importance sampling (MIS) estimator that recursively estimates the state marginal distribution for the target policy at every step.

Off-policy evaluation reinforcement-learning

Doubly Robust Crowdsourcing

no code implementations • 8 Jun 2019 • Chong Liu, Yu-Xiang Wang

Large-scale labeled datasets are the indispensable fuel that ignites the AI revolution we see today.

Online Forecasting of Total-Variation-bounded Sequences

1 code implementation • NeurIPS 2019 • Dheeraj Baby, Yu-Xiang Wang

We design an $O(n\log n)$-time algorithm that achieves a cumulative square error of $\tilde{O}(n^{1/3}C_n^{2/3}\sigma^{4/3} + C_n^2)$ with high probability. We also prove a lower bound that matches the upper bound in all parameters (up to a $\log(n)$ factor).

Stochastic Optimization

Provably Efficient Q-Learning with Low Switching Cost

no code implementations • NeurIPS 2019 • Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang

We take initial steps in studying PAC-MDP algorithms with limited adaptivity, that is, algorithms that change their exploration policy as infrequently as possible during regret minimization.

Q-Learning

A Higher-Order Kolmogorov-Smirnov Test

no code implementations • 24 Mar 2019 • Veeranjaneyulu Sadhanala, Yu-Xiang Wang, Aaditya Ramdas, Ryan J. Tibshirani

We present an extension of the Kolmogorov-Smirnov (KS) two-sample test, which can be more sensitive to differences in the tails.
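
For context, the classical two-sample KS statistic that this work extends can be computed as in the sketch below. This is only the standard statistic, not the higher-order version from the paper; the function name `ks_two_sample_statistic` is an assumption for illustration.

```python
import numpy as np


def ks_two_sample_statistic(x, y):
    """Classical KS statistic: the largest absolute gap between the
    empirical CDFs of the two samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    # The maximum gap is attained at one of the observed points.
    grid = np.concatenate([x, y])
    cdf_x = np.searchsorted(x, grid, side="right") / x.size
    cdf_y = np.searchsorted(y, grid, side="right") / y.size
    return float(np.max(np.abs(cdf_x - cdf_y)))
```

Identical samples give a statistic of 0, while samples with disjoint ranges give 1.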

Imitation-Regularized Offline Learning

no code implementations • 15 Jan 2019 • Yifei Ma, Yu-Xiang Wang, Balakrishnan Narayanaswamy

To solve both problems, we show how one can use policy improvement (PIL) objectives, regularized by policy imitation (IML).

counterfactual Multi-Armed Bandits

ProxQuant: Quantized Neural Networks via Proximal Operators

1 code implementation • ICLR 2019 • Yu Bai, Yu-Xiang Wang, Edo Liberty

To make deep neural networks feasible in resource-constrained environments (such as mobile devices), it is beneficial to quantize models by using low-precision weights.

Quantization

Subsampled Rényi Differential Privacy and Analytical Moments Accountant

1 code implementation • 31 Jul 2018 • Yu-Xiang Wang, Borja Balle, Shiva Kasiviswanathan

We study the problem of subsampling in differential privacy (DP), a question that is the centerpiece behind many successful differentially private machine learning algorithms.

BIG-bench Machine Learning

Patch-Based Image Hallucination for Super Resolution with Detail Reconstruction from Similar Sample Images

no code implementations • 3 Jun 2018 • Chieh-Chi Kao, Yu-Xiang Wang, Jonathan Waltman, Pradeep Sen

Image hallucination and super-resolution have been studied for decades, and many approaches have been proposed to upsample low-resolution images using information from the images themselves, multiple example images, or large image databases.

Hallucination Super-Resolution

An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm

no code implementations • ICML 2020 • Christopher DeCarolis, Mukul Ram, Seyed A. Esmaeili, Yu-Xiang Wang, Furong Huang

Overall, by combining the sensitivity and utility characterization, we obtain an end-to-end differentially private spectral algorithm for LDA and identify the corresponding configuration that outperforms others in any specific regime.

Variational Inference

Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising

1 code implementation • ICML 2018 • Borja Balle, Yu-Xiang Wang

The Gaussian mechanism is an essential building block used in a multitude of differentially private data analysis algorithms.
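
As background, the classical (textbook) Gaussian mechanism can be sketched as below. It uses the well-known calibration $\sigma = \Delta_2 \sqrt{2\ln(1.25/\delta)}/\varepsilon$, which is valid for $0 < \varepsilon < 1$; it is not the analytical calibration proposed in the paper. The function names and parameters are assumptions for illustration.

```python
import math

import numpy as np


def classical_gaussian_sigma(epsilon, delta, l2_sensitivity):
    """Noise standard deviation under the classical calibration (0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("classical calibration assumes 0 < epsilon < 1 and 0 < delta < 1")
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def gaussian_mechanism(value, epsilon, delta, l2_sensitivity, rng=None):
    """Release `value` with independent Gaussian noise added to each coordinate."""
    rng = np.random.default_rng() if rng is None else rng
    value = np.asarray(value, dtype=float)
    sigma = classical_gaussian_sigma(epsilon, delta, l2_sensitivity)
    return value + rng.normal(loc=0.0, scale=sigma, size=value.shape)
```

The explicit range check is a design choice: the classical formula is only guaranteed for $\varepsilon < 1$, so the sketch refuses other inputs rather than silently returning an uncalibrated noise level.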

Denoising

Revisiting differentially private linear regression: optimal and adaptive prediction & estimation in unbounded domain

1 code implementation • 7 Mar 2018 • Yu-Xiang Wang

We revisit the problem of linear regression under a differential privacy constraint.

regression

signSGD: Compressed Optimisation for Non-Convex Problems

3 code implementations • ICML 2018 • Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar

Using a theorem by Gauss, we prove that majority vote can achieve the same reduction in variance as full-precision distributed SGD.

Detecting and Correcting for Label Shift with Black Box Predictors

1 code implementation • ICML 2018 • Zachary C. Lipton, Yu-Xiang Wang, Alex Smola

Faced with distribution shift between the training and test sets, we wish to detect and quantify the shift, and to correct our classifiers without test set labels.

Medical Diagnosis

Paper
Code

Convergence rate of sign stochastic gradient descent for non-convex functions

no code implementations • ICLR 2018 • Jeremy Bernstein, Kamyar Azizzadenesheli, Yu-Xiang Wang, Anima Anandkumar

The sign stochastic gradient descent method (signSGD) utilizes only the sign of the stochastic gradient in its updates.

Distributed Optimization Quantization

Paper
Add Code
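
The update rule described in the entry above (using only the sign of the stochastic gradient) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the quadratic objective, the noise model, the step size, and the names `signsgd_step` and `noisy_grad` are all hypothetical.

```python
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def signsgd_step(params, grad, lr):
    """One signSGD update: each coordinate moves by lr against the gradient's sign."""
    return [p - lr * sign(g) for p, g in zip(params, grad)]


def noisy_grad(params, rng, noise=0.1):
    """Stochastic gradient of the assumed objective f(x) = 0.5 * ||x||^2.

    The true gradient is x itself; Gaussian noise stands in for minibatch noise.
    """
    return [p + rng.gauss(0.0, noise) for p in params]


rng = random.Random(0)
x = [1.0, -2.0]
for _ in range(300):
    x = signsgd_step(x, noisy_grad(x, rng), lr=0.01)
```

Because only the sign is used, every coordinate moves by exactly `lr` per step regardless of the gradient's magnitude, which is why each coordinate of a gradient can be communicated in a single bit.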

Higher-Order Total Variation Classes on Grids: Minimax Theory and Trend Filtering Methods

no code implementations • NeurIPS 2017 • Veeranjaneyulu Sadhanala, Yu-Xiang Wang, James L. Sharpnack, Ryan J. Tibshirani

To move past this, we define two new higher-order TV classes, based on two ways of compiling the discrete derivatives of a parameter across the nodes.

Paper
Add Code

Fully Convolutional Measurement Network for Compressive Sensing Image Reconstruction

1 code implementation • 21 Nov 2017 • Jiang Du, Xuemei Xie, Chenye Wang, Guangming Shi, Xun Xu, Yu-Xiang Wang

Recently, deep learning methods have significantly improved compressive sensing image reconstruction.

Compressive Sensing Image Reconstruction +1

Paper
Code

Adaptive Measurement Network for CS Image Reconstruction

1 code implementation • 23 Sep 2017 • Xuemei Xie, Yu-Xiang Wang, Guangming Shi, Chenye Wang, Jiang Du, Zhifu Zhao

In this paper, we propose an adaptive measurement network in which the measurement is obtained by learning.

Compressive Sensing Image Reconstruction

Paper
Code

Non-stationary Stochastic Optimization under $L_{p,q}$-Variation Measures

no code implementations • 9 Aug 2017 • Xi Chen, Yining Wang, Yu-Xiang Wang

We consider a non-stationary sequential stochastic optimization problem, in which the underlying cost functions change over time under a variation budget constraint.

Stochastic Optimization

Paper
Add Code

Per-instance Differential Privacy

no code implementations • 24 Jul 2017 • Yu-Xiang Wang

We consider a refinement of differential privacy, per-instance differential privacy (pDP), which captures the privacy of a specific individual with respect to a fixed data set.

Paper
Add Code

Optimal and Adaptive Off-policy Evaluation in Contextual Bandits

2 code implementations • ICML 2017 • Yu-Xiang Wang, Alekh Agarwal, Miroslav Dudik

We study the off-policy evaluation problem (estimating the value of a target policy using data collected by another policy) under the contextual bandit model.

Multi-Armed Bandits Off-policy evaluation

Paper
Code

Understanding the 2016 US Presidential Election using ecological inference and distribution regression with census microdata

1 code implementation • 11 Nov 2016 • Seth Flaxman, Danica J. Sutherland, Yu-Xiang Wang, Yee Whye Teh

We combine fine-grained spatially referenced census data with the vote outcomes from the 2016 US presidential election.

regression

Paper
Code

Attributing Hacks

1 code implementation • 7 Nov 2016 • Ziqi Liu, Alexander J. Smola, Kyle Soska, Yu-Xiang Wang, Qinghua Zheng, Jun Zhou

That is, given properties of sites and the temporal occurrence of attacks, we are able to attribute individual attacks to joint causes and vulnerabilities, as well as to estimate the evolution of these vulnerabilities over time.

Attribute

Paper
Code

A Theoretical Analysis of Noisy Sparse Subspace Clustering on Dimensionality-Reduced Data

no code implementations • 24 Oct 2016 • Yining Wang, Yu-Xiang Wang, Aarti Singh

Subspace clustering is the problem of partitioning unlabeled data points into a number of clusters so that data points within one cluster lie approximately on a low-dimensional linear subspace.

Clustering Dimensionality Reduction

Paper
Add Code

Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers

no code implementations • NeurIPS 2016 • Veeranjaneyulu Sadhanala, Yu-Xiang Wang, Ryan Tibshirani

Lastly, we investigate the problem of adaptivity of the total variation denoiser to these smaller Sobolev function spaces.

Denoising

Paper
Add Code

On-Average KL-Privacy and its equivalence to Generalization for Max-Entropy Mechanisms

no code implementations • 8 May 2016 • Yu-Xiang Wang, Jing Lei, Stephen E. Fienberg

We define On-Average KL-Privacy and present its properties and connections to differential privacy, generalization and information-theoretic quantities including max-information and mutual information.

Paper
Add Code

A Minimax Theory for Adaptive Data Analysis

no code implementations • 13 Feb 2016 • Yu-Xiang Wang, Jing Lei, Stephen E. Fienberg

In this paper, we propose a minimax framework for adaptive data analysis.

Paper
Add Code

Differentially private subspace clustering

no code implementations • NeurIPS 2015 • Yining Wang, Yu-Xiang Wang, Aarti Singh

Subspace clustering is an unsupervised learning problem that aims at grouping data points into multiple "clusters" so that data points in a single cluster lie approximately on a low-dimensional linear subspace.

Clustering Motion Segmentation

Paper
Add Code

Fast Differentially Private Matrix Factorization

no code implementations • 6 May 2015 • Ziqi Liu, Yu-Xiang Wang, Alexander J. Smola

Differentially private collaborative filtering is a challenging task, both in terms of accuracy and speed.

Collaborative Filtering

Paper
Add Code

Graph Connectivity in Noisy Sparse Subspace Clustering

no code implementations • 4 Apr 2015 • Yining Wang, Yu-Xiang Wang, Aarti Singh

A line of recent work (4, 19, 24, 20) provided strong theoretical guarantees for sparse subspace clustering (4), the state-of-the-art algorithm for subspace clustering, on both noiseless and noisy data sets.

Clustering

Paper
Add Code

Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

no code implementations • 26 Feb 2015 • Yu-Xiang Wang, Stephen E. Fienberg, Alex Smola

We consider the problem of Bayesian learning on sensitive datasets and present two simple but somewhat surprising results that connect Bayesian learning to "differential privacy", a cryptographic approach to protect individual-level privacy while permitting database-level utility.

Paper
Add Code

Learning with Differential Privacy: Stability, Learnability and the Sufficiency and Necessity of ERM Principle

no code implementations • 23 Feb 2015 • Yu-Xiang Wang, Jing Lei, Stephen E. Fienberg

Lastly, we extend some of the results to the more practical $(\epsilon,\delta)$-differential privacy and establish the existence of a phase-transition on the class of problems that are approximately privately learnable with respect to how small $\delta$ needs to be.

Paper
Add Code

Trend Filtering on Graphs

no code implementations • 28 Oct 2014 • Yu-Xiang Wang, James Sharpnack, Alex Smola, Ryan J. Tibshirani

We introduce a family of adaptive estimators on graphs, based on penalizing the $\ell_1$ norm of discrete graph differences.

regression

Paper
Add Code

Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms

no code implementations • 22 Sep 2014 • Yu-Xiang Wang, Veeranjaneyulu Sadhanala, Wei Dai, Willie Neiswanger, Suvrit Sra, Eric P. Xing

We develop parallel and distributed Frank-Wolfe algorithms; the former on shared memory machines with mini-batching, and the latter in a delayed update framework.