no code implementations • 11 Apr 2024 • Minshuo Chen, Song Mei, Jianqing Fan, Mengdi Wang
In this paper, we review emerging applications of diffusion models, understanding their sample generation under various controls.
no code implementations • 10 Jan 2024 • Soham Jana, Jianqing Fan, Sanjeev Kulkarni
In this paper, we introduce a hybrid clustering technique with a novel multivariate trimmed mean type centroid estimate to produce mislabeling guarantees under a weak initialization condition for general error distributions around the centroids.
no code implementations • 4 Jan 2024 • Jinhang Chai, Jianqing Fan
The problem of structured matrix estimation has been studied mostly under strong noise dependence assumptions.
no code implementations • 27 Nov 2023 • Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin
This paper addresses this fundamental question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) purely using source data (without any modification) achieves the minimax optimality for covariate shift under the well-specified setting.
no code implementations • 22 Nov 2023 • Jianqing Fan, Zhaoran Wang, Zhuoran Yang, Chenlu Ye
For these settings, we design a provably sample-efficient algorithm which achieves a $\tilde{\mathcal{O}}(s_0^2 \log^2 T)$ regret in the sparse case and $\tilde{\mathcal{O}}(r^2 \log^2 T)$ regret in the low-rank case, using only $L = \mathcal{O}(\log T)$ batches.
no code implementations • 6 Oct 2023 • Jianqing Fan, Cheng Gao, Jason M. Klusowski
This paper addresses challenges in robust transfer learning stemming from ambiguity in Bayes classifiers and weak transferable signals between the target and source distribution.
no code implementations • 29 Aug 2023 • Sohom Bhattacharya, Jianqing Fan, Jikai Hou
Network data is prevalent in numerous big data applications, including economics and health networks, where it is of prime importance to understand the latent structure of the network.
no code implementations • 5 Aug 2023 • Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin Yu
This paper studies the performance of the spectral method in the estimation and uncertainty quantification of the unobserved preference scores of compared entities in a general and more realistic setup.
no code implementations • 28 Jun 2023 • Jianqing Fan, Jiawei Ge, Debarghya Mukherjee
Uncertainty quantification for prediction is an intriguing problem with significant applications in various fields, such as biomedical science, economic studies, and weather forecasts.
no code implementations • 21 Apr 2023 • Yuling Yan, Weijie J. Su, Jianqing Fan
We demonstrate that an author is incentivized to provide accurate rankings when her utility takes the form of a convex additive function of the adjusted review scores.
no code implementations • 14 Apr 2023 • Gen Li, Yuling Yan, Yuxin Chen, Jianqing Fan
This paper studies reward-agnostic exploration in reinforcement learning (RL) -- a scenario where the learner is unaware of the reward functions during the exploration stage -- and designs an algorithm that improves over the state of the art.
no code implementations • 6 Mar 2023 • Jianqing Fan, Cong Fang, Yihong Gu, Tong Zhang
To the best of our knowledge, this paper is the first to realize statistically efficient invariance learning in the general linear model.
no code implementations • 2 Mar 2023 • Jiawei Ge, Shange Tang, Jianqing Fan, Chi Jin
Unsupervised pretraining, which learns a useful representation using a large amount of unlabeled data to facilitate the learning of downstream tasks, is a critical component of modern large-scale machine learning systems.
no code implementations • 23 Feb 2023 • Pierre Bayle, Jianqing Fan, Zhipeng Lou
Motivated by multi-center biomedical studies that cannot share individual data due to privacy and ownership concerns, we develop communication-efficient iterative distributed algorithms for estimation and inference in the high-dimensional sparse Cox proportional hazards model.
no code implementations • 12 Feb 2023 • Sohom Bhattacharya, Jianqing Fan, Debarghya Mukherjee
We show that under certain standard assumptions, debiased deep neural networks achieve a minimax optimal rate in terms of both $n$ and $d$.
no code implementations • 20 Dec 2022 • Jianqing Fan, Jikai Hou, Mengxin Yu
This paper is concerned with statistical estimation and inference for ranking problems based on pairwise comparisons with additional covariate information, such as the attributes of the compared items.
no code implementations • 22 Nov 2022 • Jianqing Fan, Zhipeng Lou, Mengxin Yu
A stylized feature of high-dimensional data is that many variables have heavy tails, and robust statistical inference is critical for valid large-scale statistical inference.
no code implementations • 22 Nov 2022 • Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin Yu
The estimated distribution is then used to construct simultaneous confidence intervals for the differences in the preference scores and the ranks of individual items.
no code implementations • 31 Oct 2022 • Jianqing Fan, Yingying Fan, Jinchi Lv, Fan Yang
To address these practical challenges, in this paper we propose a SIMPLE method with random coupling (SIMPLE-RC) for testing the non-sharp null hypothesis that a group of given nodes share similar (not necessarily identical) membership profiles under weaker signals.
no code implementations • 3 Oct 2022 • Pierre Bayle, Jianqing Fan
A prevalent feature of high-dimensional data is the dependence among covariates, and model selection is known to be challenging when covariates are highly correlated.
no code implementations • 23 Aug 2022 • Mengxin Yu, Zhuoran Yang, Jianqing Fan
We study offline reinforcement learning under a novel model called strategic MDP, which characterizes the strategic interactions between a principal and a sequence of myopic agents with private types.
no code implementations • 9 Jun 2022 • Bingyan Wang, Jianqing Fan
This paper studies low-rank matrix completion in the presence of heavy-tailed and possibly asymmetric noise, where we aim to estimate an underlying low-rank matrix given a set of highly incomplete noisy entries.
no code implementations • 8 Jun 2022 • Yuling Yan, Gen Li, Yuxin Chen, Jianqing Fan
This paper makes progress towards learning Nash equilibria in two-player zero-sum Markov games from offline data.
no code implementations • 20 Mar 2022 • Jianqing Fan, Yihong Gu, Wen-Xin Zhou
This paper investigates the stability of deep ReLU neural networks for nonparametric regression under the assumption that the noise has only a finite $p$-th moment.
no code implementations • 14 Mar 2022 • Yuling Yan, Gen Li, Yuxin Chen, Jianqing Fan
This paper is concerned with the asynchronous form of Q-learning, which applies a stochastic approximation scheme to Markovian data samples.
no code implementations • 2 Mar 2022 • Jianqing Fan, Zhipeng Lou, Mengxin Yu
To fill in such an important gap, we also leverage our model as the alternative model to test the sufficiency of the latent factor regression and the sparse linear regression models.
no code implementations • NeurIPS 2021 • Jiwen Zhang, Zhongyu Wei, Jianqing Fan, Jiajie Peng
Vision-and-Language Navigation (VLN) is a task where an agent navigates in an embodied indoor environment under human instructions.
1 code implementation • Findings (NAACL) 2022 • Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Jianqing Fan
We propose our TAiloring neGative Sentences with Discrimination and Correction (TAGS-DC) to generate synthetic sentences automatically as negative samples.
no code implementations • 13 Sep 2021 • Jianqing Fan, Yongyi Guo, Mengxin Yu
$F(\cdot)$ with $m$-th order derivative ($m\geq 2$), our policy achieves a regret upper bound of $\tilde{O}_{d}(T^{\frac{2m+1}{4m-1}})$, where $T$ is the time horizon and $\tilde{O}_{d}$ is the order that hides logarithmic terms and the feature dimension $d$.
no code implementations • 12 Sep 2021 • Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Haijun Shan, Xuanjing Huang, Jianqing Fan
Existing research for image text retrieval mainly relies on sentence-level supervision to distinguish matched and mismatched sentences for a query image.
no code implementations • 26 Jul 2021 • Yuling Yan, Yuxin Chen, Jianqing Fan
This paper studies how to construct confidence regions for principal component analysis (PCA) in high dimension, a problem that has been vastly under-explored.
no code implementations • NeurIPS 2021 • Bingyan Wang, Yuling Yan, Jianqing Fan
Our results show that for arbitrarily large-scale MDP, both the model-based approach and Q-learning are sample-efficient when $K$ is relatively small, and hence the title of this paper.
no code implementations • 22 Feb 2021 • Jianqing Fan, Ricardo Masini, Marcelo C. Medeiros
Factor and sparse models are two widely used methods to impose a low-dimensional structure in high-dimensions.
no code implementations • 6 Jan 2021 • Francesca Tang, Yang Feng, Hamza Chiheb, Jianqing Fan
With the severity of the COVID-19 outbreak, we characterize the nature of the growth trajectories of counties in the United States using a novel combination of spectral clustering and the correlation matrix.
no code implementations • 15 Dec 2020 • Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma
While the studies of spectral methods can be traced back to classical matrix perturbation theory and methods of moments, the past decade has witnessed tremendous theoretical advances in demystifying their efficacy through the lens of statistical modeling, with the aid of non-asymptotic random matrix theory.
no code implementations • 8 Nov 2020 • Jianqing Fan, Ricardo P. Masini, Marcelo C. Medeiros
The data consist of daily sales and prices of five different products over more than 400 municipalities.
no code implementations • 21 Sep 2020 • Jianqing Fan, Kunpeng Li, Yuan Liao
This paper gives a selective survey of recent developments in the factor model and its applications in statistical learning.
no code implementations • 4 Aug 2020 • Yuxin Chen, Jianqing Fan, Bingyan Wang, Yuling Yan
We investigate the effectiveness of convex relaxation and nonconvex optimization in solving bilinear systems of equations under two different designs (i.e., a sort of random Fourier design and Gaussian design).
no code implementations • 16 Jul 2020 • Jianqing Fan, Zhuoran Yang, Mengxin Yu
For both the vector and matrix settings, we construct an over-parameterized least-squares loss function by employing the score function transform and a robust truncation step designed specifically for heavy-tailed data.
no code implementations • 24 Jun 2020 • Emmanuel Abbe, Jianqing Fan, Kaizheng Wang
Principal Component Analysis (PCA) is a powerful tool in statistics and machine learning.
no code implementations • 15 Jan 2020 • Yuxin Chen, Jianqing Fan, Cong Ma, Yuling Yan
This paper delivers improved theoretical guarantees for the convex programming approach in low-rank matrix estimation, in the presence of (1) random noise, (2) gross sparse outliers, and (3) missing data.
no code implementations • 3 Oct 2019 • Jianqing Fan, Yingying Fan, Xiao Han, Jinchi Lv
Both tests are of the Hotelling-type statistics based on the rows of empirical eigenvectors or their ratios, whose asymptotic covariance matrices are very challenging to derive and estimate.
no code implementations • 12 Jun 2019 • Jianqing Fan, Yongyi Guo, Kaizheng Wang
In addition, we give the conditions under which the one-step CEASE estimator is statistically efficient.
no code implementations • 10 Jun 2019 • Yuxin Chen, Jianqing Fan, Cong Ma, Yuling Yan
As a byproduct, we obtain a sharp characterization of the estimation accuracy of our de-biased estimators, which, to the best of our knowledge, are the first tractable algorithms that provably achieve full statistical efficiency (including the preconstant).
no code implementations • 28 Apr 2019 • Krishna Balasubramanian, Elynn Y. Chen, Jianqing Fan, Xiang Wu
Sparse PCA is a widely used technique for high-dimensional data analysis.
no code implementations • 10 Apr 2019 • Jianqing Fan, Cong Ma, Yiqiao Zhong
Deep learning has arguably achieved tremendous success in recent years.
no code implementations • 20 Feb 2019 • Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma, Yuling Yan
This paper studies noisy low-rank matrix completion: given partial and noisy entries of a large low-rank matrix, the goal is to estimate the underlying matrix faithfully and efficiently.
no code implementations • 1 Jan 2019 • Jianqing Fan, Zhaoran Wang, Yuchen Xie, Zhuoran Yang
Despite the great empirical success of deep reinforcement learning, its theoretical foundation is less well understood.
no code implementations • 30 Nov 2018 • Yuxin Chen, Chen Cheng, Jianqing Fan
The aim is to estimate the leading eigenvalue and eigenvector of $\mathbf{M}^{\star}$.
no code implementations • 21 Aug 2018 • Jianqing Fan, Han Liu, Zhaoran Wang, Zhuoran Yang
We study the fundamental tradeoffs between statistical accuracy and computational tractability in the analysis of high dimensional heterogeneous data.
no code implementations • 12 Aug 2018 • Jianqing Fan, Kaizheng Wang, Yiqiao Zhong, Ziwei Zhu
Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements that arise frequently from various applications from genomics and neuroscience to economics and finance.
no code implementations • 17 Jul 2018 • Krishnakumar Balasubramanian, Jianqing Fan, Zhuoran Yang
Motivated by the sampling problems and heterogeneity issues common in high-dimensional big datasets, we consider a class of discordant additive index models.
no code implementations • 21 Mar 2018 • Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma
This paper considers the problem of solving systems of quadratic equations, namely, recovering an object of interest $\mathbf{x}^{\natural}\in\mathbb{R}^{n}$ from $m$ quadratic equations/samples $y_{i}=(\mathbf{a}_{i}^{\top}\mathbf{x}^{\natural})^{2}$, $1\leq i\leq m$.
no code implementations • 26 Feb 2018 • Jelena Bradic, Jianqing Fan, Yinchu Zhu
Uniform non-testability identifies a collection of alternatives such that the power of any test, against any alternative in the group, is asymptotically at most equal to the nominal size.
no code implementations • 31 Jul 2017 • Yuxin Chen, Jianqing Fan, Cong Ma, Kaizheng Wang
This paper is concerned with the problem of top-$K$ ranking from pairwise comparisons.
2 code implementations • 21 Jun 2017 • Qiang Sun, WenXin Zhou, Jianqing Fan
We establish a sharp phase transition for robust estimation of regression parameters in both low and high dimensions: when $\delta \geq 1$, the estimator admits a sub-Gaussian-type deviation bound without sub-Gaussian assumptions on the data, while only a slower rate is available in the regime $0<\delta< 1$.
Statistics Theory, Methodology
1 code implementation • 27 Dec 2016 • Jianqing Fan, Yuan Ke, Kaizheng Wang
This paper studies model selection consistency for high dimensional sparse regression when data exhibits both cross-sectional and serial dependency.
Methodology
no code implementations • 27 May 2015 • Jianqing Fan, Lingzhou Xue, Jiawei Yao
Our method and theory allow the number of predictors to be larger than the number of observations.
no code implementations • 7 Jan 2015 • Jianqing Fan, Yang Feng, Lucy Xia
Measuring conditional dependence is an important topic in statistics with broad applications including graphical models.
1 code implementation • 29 Apr 2014 • Jianqing Fan, Han Liu, Yang Ning, Hui Zou
Theoretically, the proposed methods achieve the same rates of convergence for both precision matrix estimation and eigenvector estimation, as if the latent variables were observed.
no code implementations • 31 Dec 2013 • Jianqing Fan, Yang Feng, Jiancheng Jiang, Xin Tong
We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities.
no code implementations • 7 Aug 2013 • Jianqing Fan, Fang Han, Han Liu
Big Data bring new opportunities to modern society and challenges to data scientists.
no code implementations • 22 Oct 2012 • Jianqing Fan, Lingzhou Xue, Hui Zou
Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation.
no code implementations • 25 Oct 2010 • Jelena Bradic, Jianqing Fan, Jiancheng Jiang
High-throughput genetic sequencing arrays with thousands of measurements per sample, together with a large amount of related censored clinical data, have increased the need for better measurement-specific model selection.