no code implementations • ICML 2020 • Ching-Wei Cheng, Xingye Qiao, Guang Cheng
In this article, we study a new paradigm called mutual transfer learning where, among many heterogeneous data domains, every data domain could potentially be the target of interest, and it could also serve as a useful source to help learning in other data domains.
1 code implementation • 27 Mar 2024 • Xianli Zeng, Guang Cheng, Edgar Dobriban
Mitigating the disparate impact of statistical machine learning methods is crucial for ensuring fairness.
no code implementations • 18 Mar 2024 • Tian-Yi Zhou, Namjoon Suh, Guang Cheng, Xiaoming Huo
Motivated by the abundance of functional data such as time series and images, there has been a growing interest in integrating such data into neural networks and learning maps from function spaces to R (i.e., functionals).
1 code implementation • 12 Mar 2024 • Xianli Zeng, Joshua Ward, Guang Cheng
The increasing usage of machine learning models in consequential decision-making processes has spurred research into the fairness of these systems.
no code implementations • 26 Feb 2024 • Shirong Xu, Will Wei Sun, Guang Cheng
Motivated from this, we propose a debiased randomized response mechanism to protect the raw pairwise rankings, ensuring consistent estimation of true preferences and rankings in downstream rank aggregation.
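A minimal sketch of the randomized-response idea behind such a mechanism, under illustrative assumptions (the flip probability, debiasing formula, and all names here are a generic construction, not the paper's exact mechanism): each binary pairwise comparison is flipped with a probability controlled by the privacy parameter, and a debiasing step inverts the known flip rate to recover a consistent estimate of the true preference probability.

```python
import numpy as np

rng = np.random.default_rng(0)

def randomize(y, eps):
    """Flip each binary comparison with probability 1 / (1 + e^eps)."""
    p_flip = 1.0 / (1.0 + np.exp(eps))
    flips = rng.random(y.shape) < p_flip
    return np.where(flips, 1 - y, y)

def debias(z_mean, eps):
    """Invert the flip bias: E[z] = p_flip + (1 - 2 p_flip) * q."""
    p_flip = 1.0 / (1.0 + np.exp(eps))
    return (z_mean - p_flip) / (1.0 - 2.0 * p_flip)

true_q = 0.7                                  # true P(item A beats item B)
y = (rng.random(200_000) < true_q).astype(int)  # raw pairwise comparisons
z = randomize(y, eps=1.0)                     # privatized comparisons
q_hat = debias(z.mean(), eps=1.0)             # consistent estimate of true_q
```

The privatized comparisons `z` can be released in place of the raw data, while the debiased aggregate `q_hat` still converges to the true preference probability.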
1 code implementation • 5 Feb 2024 • Xianli Zeng, Guang Cheng, Edgar Dobriban
To address this, we develop methods for Bayes-optimal fair classification, aiming to minimize classification error subject to given group fairness constraints.
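One common shape such classifiers take is group-dependent thresholding of a probabilistic score. The sketch below is illustrative only (it enforces equal positive-prediction rates, one possible group fairness constraint, and is not the paper's algorithm): each protected group gets its own threshold, chosen so that the groups' positive-prediction rates match.

```python
import numpy as np

rng = np.random.default_rng(1)
scores = rng.random(10_000)                 # classifier scores in [0, 1]
groups = rng.integers(0, 2, 10_000)         # protected-group labels
# Group 1's scores are shifted upward, creating disparate impact if a
# single common threshold were used.
scores = np.clip(scores + 0.2 * groups, 0, 1)

def threshold_for_rate(s, rate):
    """Threshold achieving a target positive-prediction rate on s."""
    return np.quantile(s, 1 - rate)

target_rate = 0.5
preds = np.zeros_like(scores)
for g in (0, 1):
    mask = groups == g
    t = threshold_for_rate(scores[mask], target_rate)
    preds[mask] = (scores[mask] >= t).astype(float)

rate0 = preds[groups == 0].mean()           # positive rate, group 0
rate1 = preds[groups == 1].mean()           # positive rate, group 1
```

With group-specific thresholds, both groups receive positive predictions at (nearly) the same rate; the statistical question is how to pick those thresholds to minimize classification error subject to the constraint.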
no code implementations • 1 Feb 2024 • Yue Xing, Xiaofeng Lin, Namjoon Suh, Qifan Song, Guang Cheng
In practice, it is observed that transformer-based models can learn concepts in context in the inference stage.
no code implementations • 26 Jan 2024 • Yue Xing, Xiaofeng Lin, Qifan Song, Yi Xu, Belinda Zeng, Guang Cheng
Pre-training is known to generate universal representations for downstream tasks in large-scale deep learning such as large language models.
no code implementations • 14 Jan 2024 • Namjoon Suh, Guang Cheng
In this article, we review the literature on statistical theories of neural networks from three perspectives.
no code implementations • 1 Jan 2024 • Yinan Cheng, Chi-Hua Wang, Vamsi K. Potluru, Tucker Balch, Guang Cheng
Devising procedures for downstream task-oriented generative model selections is an unresolved problem of practical importance.
no code implementations • 1 Jan 2024 • Din-Yin Hsieh, Chi-Hua Wang, Guang Cheng
Exploring generative model training for synthetic tabular data, specifically in sequential contexts such as credit card transaction data, presents significant challenges.
1 code implementation • 11 Dec 2023 • Yuyang Zhou, Guang Cheng, Zongyao Chen, Shui Yu
Experimental results on two Android malware datasets demonstrate that MalPurifier outperforms the state-of-the-art defenses, and it significantly strengthens the vulnerable malware detector against 37 evasion attacks, achieving accuracies over 90.91%.
1 code implementation • 24 Oct 2023 • Namjoon Suh, Xiaofeng Lin, Din-Yin Hsieh, Merhdad Honarkhah, Guang Cheng
The diffusion model has become a main paradigm for synthetic data generation in many subfields of modern machine learning, including computer vision, language modeling, and speech synthesis.
1 code implementation • 16 Sep 2023 • Hongyu Zhu, Sichu Liang, Wentao Hu, Fang-Qi Li, Yali Yuan, Shi-Lin Wang, Guang Cheng
As a modern ensemble technique, Deep Forest (DF) employs a cascading structure to construct deep models, providing stronger representational power compared to traditional decision forests.
no code implementations • 2 Jul 2023 • Yidong Ouyang, Liyan Xie, Chongxuan Li, Guang Cheng
The diffusion model has shown remarkable performance in modeling data distributions and synthesizing data.
no code implementations • 17 May 2023 • Shirong Xu, Will Wei Sun, Guang Cheng
The former is defined as the generalization difference between models trained on synthetic and on real data.
no code implementations • 13 Mar 2023 • Huiming Zhang, Haoyu Wei, Guang Cheng
In non-asymptotic learning, variance-type parameters of sub-Gaussian distributions are of paramount importance.
no code implementations • 24 Jan 2023 • Yuantong Li, Guang Cheng, Xiaowu Dai
In this paper, we propose a new algorithm for addressing the problem of matching markets with complementary preferences, where agents' preferences are unknown a priori and must be learned from data.
no code implementations • 21 Jan 2023 • Ximing Li, Chendi Wang, Guang Cheng
To complete the picture, we establish a lower bound for TV accuracy that holds for every $\epsilon$-DP synthetic data generator.
no code implementations • 2 Jan 2023 • Shirong Xu, Will Wei Sun, Guang Cheng
This allows us to develop a multistage ranking algorithm to generate synthetic rankings while satisfying the developed $\epsilon$-ranking differential privacy.
no code implementations • 28 Nov 2022 • Yucong Liu, Chi-Hua Wang, Guang Cheng
Devising procedures for auditing generative model privacy-utility tradeoff is an important yet unresolved problem in practice.
no code implementations • 18 Oct 2022 • Yidong Ouyang, Liyan Xie, Guang Cheng
Among various deep generative models, the diffusion model has been shown to produce high-quality synthetic images and has achieved good performance in improving the adversarial robustness.
1 code implementation • 12 Oct 2022 • Zhanyu Wang, Guang Cheng, Jordan Awan
Differentially private (DP) mechanisms protect individual-level information by introducing randomness into the statistical analysis procedure.
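The classic instance of such a randomized release is the Laplace mechanism; the sketch below assumes a counting query with sensitivity 1 (a standard textbook setup, not specific to this paper): adding Laplace noise with scale sensitivity/epsilon yields epsilon-differential privacy.

```python
import numpy as np

rng = np.random.default_rng(2)

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release true_value plus Laplace(sensitivity / epsilon) noise."""
    return true_value + rng.laplace(scale=sensitivity / epsilon)

data = rng.integers(0, 2, 1000)   # one binary attribute per individual
count = data.sum()                # counting query: sensitivity is 1
private_count = laplace_mechanism(count, sensitivity=1, epsilon=1.0)
```

Changing any one individual's record shifts the count by at most 1, so the noisy release masks each individual's contribution while staying close to the true count.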
1 code implementation • 15 May 2022 • Xianli Zeng, Edgar Dobriban, Guang Cheng
This paper considers predictive parity, which requires equalizing the probability of success given a positive prediction among different protected groups.
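In standard fairness notation (writing $\hat{Y}$ for the prediction, $Y$ for the true outcome, and $A$ for the protected attribute), the predictive parity constraint described above can be written as:

```latex
\Pr\bigl(Y = 1 \mid \hat{Y} = 1, A = a\bigr)
  \;=\; \Pr\bigl(Y = 1 \mid \hat{Y} = 1, A = a'\bigr)
  \quad \text{for all protected groups } a, a'.
```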
no code implementations • 7 May 2022 • Yuantong Li, Chi-Hua Wang, Guang Cheng, Will Wei Sun
Existing works focus on multi-armed bandits with static preferences, but this is insufficient: the two-sided preferences change whenever one side's contextual information updates, resulting in non-static matching.
no code implementations • 27 Feb 2022 • Chi-Hua Wang, Wenjie Li, Guang Cheng, Guang Lin
This paper presents a novel federated linear contextual bandits model, where individual clients face different K-armed stochastic bandits with high-dimensional decision context and coupled through common global parameters.
no code implementations • 26 Feb 2022 • Jiexin Duan, Xingye Qiao, Guang Cheng
In machine learning, crowdsourcing is an economical way to label a large amount of data.
no code implementations • 24 Feb 2022 • Zhiying Fang, Guang Cheng
Convolutional neural networks have shown impressive abilities in many applications, especially those related to the classification tasks.
no code implementations • 24 Feb 2022 • Zhiying Fang, Yidong Ouyang, Ding-Xuan Zhou, Guang Cheng
In this work, we show that with suitable adaptations, the single-head self-attention transformer with a fixed number of transformer encoder blocks and free parameters is able to generate any desired polynomial of the input with no error.
no code implementations • 23 Feb 2022 • Shuang Wu, Chi-Hua Wang, Yuantong Li, Guang Cheng
We propose a new bootstrap-based online algorithm for stochastic linear bandit problems.
no code implementations • 23 Feb 2022 • Yue Xing, Qifan Song, Guang Cheng
In some studies of deep learning (e.g., Zhang et al., 2016), it is observed that over-parametrized deep neural networks achieve a small testing error even when the training error is almost zero.
1 code implementation • 20 Feb 2022 • Xianli Zeng, Edgar Dobriban, Guang Cheng
Machine learning algorithms are becoming integrated into more and more high-stakes decision-making processes, such as in social welfare issues.
no code implementations • 14 Feb 2022 • Yue Xing, Qifan Song, Guang Cheng
Recently proposed self-supervised learning (SSL) approaches successfully demonstrate the great potential of supplementing learning algorithms with additional unlabeled data.
no code implementations • 21 Jan 2022 • Ying Sun, Marie Maros, Gesualdo Scutari, Guang Cheng
Our theory shows that, under standard notions of restricted strong convexity and smoothness of the loss functions, and suitable conditions on the network connectivity and algorithm tuning, the distributed algorithm converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model, $O(s\log d/N)$.
no code implementations • NeurIPS 2021 • Yue Xing, Qifan Song, Guang Cheng
In contrast, this paper studies the algorithmic stability of a generic adversarial training algorithm, which can further help to establish an upper bound for generalization error.
no code implementations • 8 Aug 2021 • Pratik Ramprasad, Yuantong Li, Zhuoran Yang, Zhaoran Wang, Will Wei Sun, Guang Cheng
The recent emergence of reinforcement learning has created a demand for robust statistical inference methods for the parameter estimates computed using these algorithms.
1 code implementation • 17 Jun 2021 • Wenjie Li, Chi-Hua Wang, Guang Cheng, Qifan Song
In this paper, we make the key delineation on the roles of resolution and statistical uncertainty in hierarchical bandits-based black-box optimization algorithms, guiding a more general analysis and a more efficient algorithm design.
1 code implementation • 19 Feb 2021 • Yang Yu, Shih-Kang Chao, Guang Cheng
We propose a distributed bootstrap method for simultaneous inference on high-dimensional massive data that are stored and processed with many machines.
no code implementations • 1 Jan 2021 • Wenjie Li, Guang Cheng
Numerous adaptive algorithms such as AMSGrad and Radam have been proposed and applied to deep learning recently.
no code implementations • 26 Dec 2020 • Wenjie Li, Zhanyu Wang, Yichen Zhang, Guang Cheng
In this work, we investigate the idea of variance reduction by studying its properties with general adaptive mirror descent algorithms in nonsmooth nonconvex finite-sum optimization problems.
no code implementations • 18 Dec 2020 • Yue Xing, Ruizhi Zhang, Guang Cheng
Further, we reveal an explicit connection between adversarial and standard estimates, and propose a straightforward two-stage adversarial learning framework that facilitates using model structure information to improve adversarial robustness.
no code implementations • 3 Dec 2020 • Yuantong Li, Chi-Hua Wang, Guang Cheng
Motivated by the EU's "Right To Be Forgotten" regulation, we initiate a study of statistical data deletion problems where users' data are accessible only for a limited period of time.
no code implementations • NeurIPS 2020 • Jiexin Duan, Xingye Qiao, Guang Cheng
It is interesting to note that the weighted voting scheme allows a larger number of subsamples than the majority voting one.
1 code implementation • NeurIPS 2020 • Jincheng Bai, Qifan Song, Guang Cheng
Sparse deep learning aims to address the challenge of huge storage consumption by deep neural networks, and to recover the sparse structure of target functions.
no code implementations • 24 Oct 2020 • Jincheng Bai, Qifan Song, Guang Cheng
We propose a variational Bayesian (VB) procedure for high-dimensional linear model inferences with heavy tail shrinkage priors, such as student-t prior.
no code implementations • 15 Aug 2020 • Yue Xing, Qifan Song, Guang Cheng
Modern machine learning and deep learning models are shown to be vulnerable when testing data are slightly perturbed.
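This vulnerability is easiest to see for a linear classifier; the tiny sketch below (weights and inputs are made up for illustration) applies an FGSM-style worst-case perturbation of L-infinity size epsilon against the weight direction, flipping a confidently positive prediction.

```python
import numpy as np

w = np.array([1.0, -2.0, 0.5])      # linear classifier weights
x = np.array([0.3, -0.1, 0.2])      # clean input, classified positive
margin = w @ x                      # 0.3 + 0.2 + 0.1 = 0.6 > 0

eps = 0.25
# Worst-case L_inf perturbation: step each coordinate by eps against
# the sign of the corresponding weight, reducing the margin by
# eps * ||w||_1 = 0.25 * 3.5 = 0.875.
x_adv = x - eps * np.sign(w)
margin_adv = w @ x_adv              # 0.6 - 0.875 = -0.275 < 0
```

A perturbation of only 0.25 per coordinate is enough to cross the decision boundary here, which is the phenomenon adversarial training is designed to counteract.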
no code implementations • 6 Jul 2020 • Tianyang Hu, Wenjia Wang, Cong Lin, Guang Cheng
Overparametrized neural networks trained by gradient descent (GD) can provably overfit any training data.
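A linear miniature of this phenomenon, under simplifying assumptions (random design, minimum-norm least squares standing in for the neural network trained by GD): with more parameters than samples, the interpolator fits arbitrary labels exactly.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 20, 200                        # p >> n: overparametrized regime
X = rng.normal(size=(n, p))           # random features, full row rank a.s.
y = rng.choice([-1.0, 1.0], size=n)   # arbitrary (random) labels

# Minimum-norm least-squares solution, the limit GD converges to from
# zero initialization in this linear setting.
w = np.linalg.pinv(X) @ y
train_err = np.abs(X @ w - y).max()   # interpolation: essentially zero
```

Any label vector is fit to machine precision, so training error alone says nothing about generalization in this regime.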
no code implementations • 5 Jul 2020 • Chi-Hua Wang, Zhanyu Wang, Will Wei Sun, Guang Cheng
In this paper, we propose a novel approach to designing dynamic pricing policies based on regularized online statistical learning, with theoretical guarantees.
1 code implementation • NeurIPS 2020 • Shih-Kang Chao, Zhanyu Wang, Yue Xing, Guang Cheng
In light of the fact that stochastic gradient descent (SGD) often finds a flat minimum valley in the training loss, we propose a novel directional pruning method that searches for a sparse minimizer in or close to that flat region.
no code implementations • 30 Apr 2020 • Ruiqi Liu, Zuofeng Shang, Guang Cheng
The endogeneity issue is fundamentally important as many empirical applications may suffer from the omission of explanatory variables, measurement error, or simultaneous causality.
no code implementations • 21 Feb 2020 • Chi-Hua Wang, Guang Cheng
In such a scenario, our goal is to allocate a batch of treatments to maximize treatment efficacy based on observed high-dimensional user covariates.
no code implementations • 19 Feb 2020 • Chi-Hua Wang, Yang Yu, Botao Hao, Guang Cheng
In this paper, we propose a novel perturbation-based exploration method for bandit algorithms with bounded or unbounded rewards, called residual bootstrap exploration (ReBoot).
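A rough sketch of the residual-bootstrap idea in a two-armed Gaussian bandit, heavily simplified from the paper (the pseudo-residual trick and all constants here are illustrative, not ReBoot's exact specification): each arm's index is its sample mean plus a bootstrap average of resampled residuals, with a pair of pseudo-residuals keeping exploration alive when data are scarce.

```python
import numpy as np

rng = np.random.default_rng(4)

def reboot_index(rewards, sigma=1.0):
    """Sample mean perturbed by a residual-bootstrap average."""
    mean = rewards.mean()
    # Pseudo-residuals +/- sigma enlarge the perturbation when the arm
    # has few pulls, forcing some exploration.
    residuals = np.concatenate([rewards - mean, [sigma, -sigma]])
    boot = rng.choice(residuals, size=residuals.size, replace=True)
    return mean + boot.mean()

true_means = [0.3, 0.7]                        # arm 1 is optimal
pulls = [[rng.normal(m)] for m in true_means]  # one forced pull per arm
for _ in range(2000):
    idx = [reboot_index(np.array(r)) for r in pulls]
    a = int(np.argmax(idx))                    # pull the highest index
    pulls[a].append(rng.normal(true_means[a]))

frac_best = len(pulls[1]) / sum(len(r) for r in pulls)
```

The bootstrap perturbation shrinks roughly like one over the square root of the pull count, so the index concentrates on well-sampled arms and most pulls go to the optimal arm.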
no code implementations • ICML 2020 • Yang Yu, Shih-Kang Chao, Guang Cheng
In this paper, we propose a bootstrap method applied to massive data processed distributedly in a large number of machines.
no code implementations • 13 Feb 2020 • Yue Xing, Qifan Song, Guang Cheng
We consider a data corruption scenario in the classical $k$ Nearest Neighbors ($k$-NN) algorithm, that is, the testing data are randomly perturbed.
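A toy version of this corruption scenario, on made-up Gaussian class data (the setup is illustrative, not the paper's experiments): a plain k-NN classifier is evaluated on clean test points and on the same points after random perturbation, and accuracy degrades under the perturbation.

```python
import numpy as np

rng = np.random.default_rng(3)

def knn_predict(X_train, y_train, X_test, k=5):
    """Plain k-NN: Euclidean distances plus majority vote."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]          # k nearest training points
    return (y_train[nn].mean(axis=1) > 0.5).astype(int)

# Two well-separated Gaussian classes in 2D.
n = 300
X_train = np.vstack([rng.normal([-2, 0], 1.0, (n, 2)),
                     rng.normal([+2, 0], 1.0, (n, 2))])
y_train = np.array([0] * n + [1] * n)
X_test = np.vstack([rng.normal([-2, 0], 1.0, (100, 2)),
                    rng.normal([+2, 0], 1.0, (100, 2))])
y_test = np.array([0] * 100 + [1] * 100)

acc_clean = (knn_predict(X_train, y_train, X_test) == y_test).mean()
# Random test-time perturbation of the queries degrades accuracy.
X_noisy = X_test + rng.normal(0, 2.0, X_test.shape)
acc_noisy = (knn_predict(X_train, y_train, X_noisy) == y_test).mean()
```

Only the test points are corrupted; the training set and the classifier are unchanged, isolating the effect of test-time perturbation.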
no code implementations • 19 Jan 2020 • Tianyang Hu, Zuofeng Shang, Guang Cheng
In this paper, we attempt to understand this empirical success in high dimensional classification by deriving the convergence rates of excess risk.
no code implementations • 25 Sep 2019 • Yue Xing, Qifan Song, Guang Cheng
The over-parameterized models attract much attention in the era of data science and deep learning.
no code implementations • 22 Sep 2019 • Shih-Kang Chao, Guang Cheng
Preliminary empirical analysis of modern image data shows that learning very sparse deep neural networks by gRDA does not necessarily sacrifice testing accuracy.
no code implementations • 12 Sep 2019 • Fang Chen, Hong Wan, Hua Cai, Guang Cheng
Machine learning and blockchain are two of the most noticeable technologies in recent years.
1 code implementation • NeurIPS 2019 • Xingye Qiao, Jiexin Duan, Guang Cheng
Nearest neighbor is a popular class of classification methods with many desirable properties.
no code implementations • NeurIPS 2019 • Botao Hao, Yasin Abbasi-Yadkori, Zheng Wen, Guang Cheng
The Upper Confidence Bound (UCB) method is arguably the most celebrated approach to online decision making with partial information feedback.
no code implementations • 2 Apr 2019 • Yuyang Zhou, Guang Cheng, Shanqing Jiang, Mian Dai
An intrusion detection system (IDS) is one of the most extensively used techniques in a network topology to safeguard the integrity and availability of sensitive assets in the protected systems.
1 code implementation • 8 Oct 2018 • Tianyang Hu, Zixiang Chen, Hanxi Sun, Jincheng Bai, Mao Ye, Guang Cheng
We propose two novel samplers to generate high-quality samples from a given (un-normalized) probability density.
no code implementations • 5 Oct 2018 • Yue Xing, Qifan Song, Guang Cheng
In the era of deep learning, understanding over-fitting phenomenon becomes increasingly important.
no code implementations • 17 Sep 2018 • Meimei Liu, Jean Honorio, Guang Cheng
In this paper, we propose a random projection approach to estimate variance in kernel ridge regression.
no code implementations • ICML 2018 • Ganggang Xu, Zuofeng Shang, Guang Cheng
Divide-and-conquer is a powerful approach for large and massive data analysis.
no code implementations • NeurIPS 2018 • Meimei Liu, Guang Cheng
Early stopping of iterative algorithms is an algorithmic regularization method to avoid over-fitting in estimation and classification.
no code implementations • 25 May 2018 • Meimei Liu, Zuofeng Shang, Guang Cheng
It is worth noting that the upper bounds on the number of machines are proven to be unimprovable (up to a logarithmic factor) in two important cases: smoothing spline regression and Gaussian RKHS regression.
no code implementations • 17 Feb 2018 • Meimei Liu, Zuofeng Shang, Guang Cheng
A common challenge in nonparametric inference is its high computational complexity when data volume is large.
no code implementations • 29 Jan 2018 • Botao Hao, Anru Zhang, Guang Cheng
In this paper, we propose a general framework for sparse and low-rank tensor estimation from cubic sketchings.
no code implementations • 20 Jan 2017 • Will Wei Sun, Guang Cheng, Yufeng Liu
Stability is an important aspect of a classification procedure because unstable predictions can potentially reduce users' trust in a classification system and also harm the reproducibility of scientific conclusions.
no code implementations • ICML 2018 • Ganggang Xu, Zuofeng Shang, Guang Cheng
Tuning parameter selection is of critical importance for kernel ridge regression.
no code implementations • 28 Nov 2016 • Botao Hao, Will Wei Sun, Yufeng Liu, Guang Cheng
We consider joint estimation of multiple graphical models arising from heterogeneous and high-dimensional observations.
no code implementations • 15 Sep 2016 • Xiang Lyu, Will Wei Sun, Zhaoran Wang, Han Liu, Jian Yang, Guang Cheng
We consider the estimation and inference of graphical models that characterize the dependency structure of high-dimensional tensor-valued data.
no code implementations • 31 Dec 2015 • Zuofeng Shang, Guang Cheng
In this paper, we explore statistical versus computational trade-off to address a basic question in the application of a distributed algorithm: what is the minimal computational cost in obtaining statistical optimality?
no code implementations • NeurIPS 2015 • Wei Sun, Zhaoran Wang, Han Liu, Guang Cheng
We consider the estimation of sparse graphical models that characterize the dependency structure of high-dimensional tensor-valued data.
no code implementations • 5 Feb 2015 • Will Wei Sun, Junwei Lu, Han Liu, Guang Cheng
We propose a novel sparse tensor decomposition method, namely Tensor Truncated Power (TTP) method, that incorporates variable selection into the estimation of decomposition components.
no code implementations • CVPR 2014 • Yuanxiang Wang, Hesamoddin Salehian, Guang Cheng, Baba C. Vemuri
In this paper, we propose a new intrinsic recursive filter on the product manifold of shape and orientation.
no code implementations • 26 May 2014 • Wei Sun, Xingye Qiao, Guang Cheng
In this paper, we introduce a general measure of classification instability (CIS) to quantify the sampling variability of the prediction made by a classification method.
no code implementations • 30 Dec 2012 • Zuofeng Shang, Guang Cheng
In particular, our confidence intervals are proved to be asymptotically valid at any point in the support, and they are shorter on average than the Bayesian confidence intervals proposed by Wahba [J. R. Stat.