Search Results for author: Jianqing Fan

Found 66 papers, 5 papers with code

Optimal Aggregation of Prediction Intervals under Unsupervised Domain Shift

1 code implementation • 16 May 2024 • Jiawei Ge, Debarghya Mukherjee, Jianqing Fan

In this paper, we propose methodologies for aggregating prediction intervals to obtain one with minimal width and adequate coverage on the target domain under unsupervised domain shift, under which we have labeled samples from a related source domain and unlabeled covariates from the target domain.

Prediction Intervals

Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning

no code implementations • 7 May 2024 • Yihong Gu, Cong Fang, Peter Bühlmann, Jianqing Fan

As illustrated by the unified non-asymptotic analysis, our adversarial estimation framework can attain provable sample-efficient estimation akin to standard regression under a minimal identification condition for various tasks and models.

An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

no code implementations • 11 Apr 2024 • Minshuo Chen, Song Mei, Jianqing Fan, Mengdi Wang

In this paper, we review emerging applications of diffusion models, understanding their sample generation under various controls.

A general theory for robust clustering via trimmed mean

no code implementations • 10 Jan 2024 • Soham Jana, Jianqing Fan, Sanjeev Kulkarni

In this paper, we introduce a hybrid clustering technique with a novel multivariate trimmed mean type centroid estimate to produce mislabeling guarantees under a weak initialization condition for general error distributions around the centroids.


Structured Matrix Learning under Arbitrary Entrywise Dependence and Estimation of Markov Transition Kernel

no code implementations • 4 Jan 2024 • Jinhang Chai, Jianqing Fan

The problem of structured matrix estimation has been studied mostly under strong noise dependence assumptions.

Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift

no code implementations • 27 Nov 2023 • Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin

This paper addresses this fundamental question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) purely using source data (without any modification) achieves the minimax optimality for covariate shift under the well-specified setting.

regression Retrieval

Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks

no code implementations • 22 Nov 2023 • Jianqing Fan, Zhaoran Wang, Zhuoran Yang, Chenlu Ye

For these settings, we design a provably sample-efficient algorithm which achieves a $\tilde{\mathcal{O}}(s_0^2 \log^2 T)$ regret in the sparse case and $\tilde{\mathcal{O}}(r^2 \log^2 T)$ regret in the low-rank case, using only $L = \mathcal{O}(\log T)$ batches.

Multi-Armed Bandits

Robust Transfer Learning with Unreliable Source Data

no code implementations • 6 Oct 2023 • Jianqing Fan, Cheng Gao, Jason M. Klusowski

This paper addresses challenges in robust transfer learning stemming from ambiguity in Bayes classifiers and weak transferable signals between the target and source distribution.

regression Transfer Learning

Inferences on Mixing Probabilities and Ranking in Mixed-Membership Models

no code implementations • 29 Aug 2023 • Sohom Bhattacharya, Jianqing Fan, Jikai Hou

Network data is prevalent in numerous big data applications, including economics and health networks, where it is of prime importance to understand the latent structure of the network.

Uncertainty Quantification

Spectral Ranking Inferences based on General Multiway Comparisons

no code implementations • 5 Aug 2023 • Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin Yu

This paper studies the performance of the spectral method in the estimation and uncertainty quantification of the unobserved preference scores of compared entities in a general and more realistic setup.

Uncertainty Quantification

UTOPIA: Universally Trainable Optimal Prediction Intervals Aggregation

no code implementations • 28 Jun 2023 • Jianqing Fan, Jiawei Ge, Debarghya Mukherjee

Uncertainty quantification for prediction is an intriguing problem with significant applications in various fields, such as biomedical science, economic studies, and weather forecasts.

Prediction Intervals Uncertainty Quantification

The Isotonic Mechanism for Exponential Family Estimation

no code implementations • 21 Apr 2023 • Yuling Yan, Weijie J. Su, Jianqing Fan

We demonstrate that an author is incentivized to provide accurate rankings when her utility takes the form of a convex additive function of the adjusted review scores.

Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning

no code implementations • 14 Apr 2023 • Gen Li, Yuling Yan, Yuxin Chen, Jianqing Fan

This paper studies reward-agnostic exploration in reinforcement learning (RL) -- a scenario where the learner is unaware of the reward functions during the exploration stage -- and designs an algorithm that improves over the state of the art.

Offline RL reinforcement-learning +1

Environment Invariant Linear Least Squares

no code implementations • 6 Mar 2023 • Jianqing Fan, Cong Fang, Yihong Gu, Tong Zhang

To the best of our knowledge, this paper is the first to realize statistically efficient invariance learning in the general linear model.

Causal Inference regression +2

On the Provable Advantage of Unsupervised Pretraining

no code implementations • 2 Mar 2023 • Jiawei Ge, Shange Tang, Jianqing Fan, Chi Jin

Unsupervised pretraining, which learns a useful representation using a large amount of unlabeled data to facilitate the learning of downstream tasks, is a critical component of modern large-scale machine learning systems.

Contrastive Learning Representation Learning

Communication-Efficient Distributed Estimation and Inference for Cox's Model

no code implementations • 23 Feb 2023 • Pierre Bayle, Jianqing Fan, Zhipeng Lou

Motivated by multi-center biomedical studies that cannot share individual data due to privacy and ownership concerns, we develop communication-efficient iterative distributed algorithms for estimation and inference in the high-dimensional sparse Cox proportional hazards model.


Deep Neural Networks for Nonparametric Interaction Models with Diverging Dimension

no code implementations • 12 Feb 2023 • Sohom Bhattacharya, Jianqing Fan, Debarghya Mukherjee

We show that, under certain standard assumptions, debiased deep neural networks achieve a minimax optimal rate in terms of both $n$ and $d$.

Uncertainty Quantification of MLE for Entity Ranking with Covariates

no code implementations • 20 Dec 2022 • Jianqing Fan, Jikai Hou, Mengxin Yu

This paper is concerned with statistical estimation and inference for ranking problems based on pairwise comparisons with additional covariate information, such as the attributes of the compared items.

Uncertainty Quantification

Ranking Inferences Based on the Top Choice of Multiway Comparisons

no code implementations • 22 Nov 2022 • Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin Yu

The estimated distribution is then used to construct simultaneous confidence intervals for the differences in the preference scores and the ranks of individual items.


Robust High-dimensional Tuning Free Multiple Testing

no code implementations • 22 Nov 2022 • Jianqing Fan, Zhipeng Lou, Mengxin Yu

A stylized feature of high-dimensional data is that many variables have heavy tails, and robust statistical inference is critical for valid large-scale statistical inference.

valid Vocal Bursts Intensity Prediction

SIMPLE-RC: Group Network Inference with Non-Sharp Nulls and Weak Signals

no code implementations • 31 Oct 2022 • Jianqing Fan, Yingying Fan, Jinchi Lv, Fan Yang

To address these practical challenges, in this paper we propose a SIMPLE method with random coupling (SIMPLE-RC) for testing the non-sharp null hypothesis that a group of given nodes share similar (not necessarily identical) membership profiles under weaker signals.

Uncertainty Quantification

Factor-Augmented Regularized Model for Hazard Regression

no code implementations • 3 Oct 2022 • Pierre Bayle, Jianqing Fan

A prevalent feature of high-dimensional data is the dependence among covariates, and model selection is known to be challenging when covariates are highly correlated.

Model Selection regression +1

Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments

no code implementations • 23 Aug 2022 • Mengxin Yu, Zhuoran Yang, Jianqing Fan

We study offline reinforcement learning under a novel model called strategic MDP, which characterizes the strategic interactions between a principal and a sequence of myopic agents with private types.

Decision Making Offline RL +3

Robust Matrix Completion with Heavy-tailed Noise

no code implementations • 9 Jun 2022 • Bingyan Wang, Jianqing Fan

This paper studies low-rank matrix completion in the presence of heavy-tailed and possibly asymmetric noise, where we aim to estimate an underlying low-rank matrix given a set of highly incomplete noisy entries.

Low-Rank Matrix Completion

How do noise tails impact on deep ReLU networks?

no code implementations • 20 Mar 2022 • Jianqing Fan, Yihong Gu, Wen-Xin Zhou

This paper investigates the stability of deep ReLU neural networks for nonparametric regression under the assumption that the noise has only a finite p-th moment.


The Efficacy of Pessimism in Asynchronous Q-Learning

no code implementations • 14 Mar 2022 • Yuling Yan, Gen Li, Yuxin Chen, Jianqing Fan

This paper is concerned with the asynchronous form of Q-learning, which applies a stochastic approximation scheme to Markovian data samples.


Are Latent Factor Regression and Sparse Regression Adequate?

no code implementations • 2 Mar 2022 • Jianqing Fan, Zhipeng Lou, Mengxin Yu

To fill in such an important gap, we also leverage our model as the alternative model to test the sufficiency of the latent factor regression and the sparse linear regression models.

Dimensionality Reduction regression

Curriculum Learning for Vision-and-Language Navigation

no code implementations • NeurIPS 2021 • Jiwen Zhang, Zhongyu Wei, Jianqing Fan, Jiajie Peng

Vision-and-Language Navigation (VLN) is a task where an agent navigates in an embodied indoor environment under human instructions.

Vision and Language Navigation

Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval

1 code implementation • Findings (NAACL) 2022 • Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Jianqing Fan

We propose our TAiloring neGative Sentences with Discrimination and Correction (TAGS-DC) to generate synthetic sentences automatically as negative samples.

Retrieval Sentence +1

Policy Optimization Using Semi-parametric Models for Dynamic Pricing

no code implementations • 13 Sep 2021 • Jianqing Fan, Yongyi Guo, Mengxin Yu

$F(\cdot)$ with $m$-th order derivative ($m\geq 2$), our policy achieves a regret upper bound of $\tilde{O}_{d}(T^{\frac{2m+1}{4m-1}})$, where $T$ is time horizon and $\tilde{O}_{d}$ is the order that hides logarithmic terms and the dimensionality of feature $d$.

Decision Making
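
A quick worked check of the exponent in the regret bound quoted in this entry (a reader's illustration, not taken from the abstract): the exponent decreases as the smoothness order $m$ grows and approaches $1/2$.

```latex
\[
\frac{2m+1}{4m-1} \;=\; \frac{1}{2} + \frac{3}{2(4m-1)},
\qquad
m=2:\ \frac{5}{7}\approx 0.714,\quad
m=3:\ \frac{7}{11}\approx 0.636,\quad
m\to\infty:\ \frac{1}{2}.
\]
```

Ignoring the logarithmic and dimension-dependent factors hidden in $\tilde{O}_{d}$, a smoother $F$ therefore gives a bound closer to $\sqrt{T}$.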

Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval

no code implementations • 12 Sep 2021 • Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Haijun Shan, Xuanjing Huang, Jianqing Fan

Existing research for image text retrieval mainly relies on sentence-level supervision to distinguish matched and mismatched sentences for a query image.

Representation Learning Retrieval +2

Inference for Heteroskedastic PCA with Missing Data

no code implementations • 26 Jul 2021 • Yuling Yan, Yuxin Chen, Jianqing Fan

This paper studies how to construct confidence regions for principal component analysis (PCA) in high dimension, a problem that has been vastly under-explored.


Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model

no code implementations • NeurIPS 2021 • Bingyan Wang, Yuling Yan, Jianqing Fan

Our results show that for arbitrarily large-scale MDP, both the model-based approach and Q-learning are sample-efficient when $K$ is relatively small, and hence the title of this paper.

Q-Learning reinforcement-learning +1

Bridging factor and sparse models

no code implementations • 22 Feb 2021 • Jianqing Fan, Ricardo Masini, Marcelo C. Medeiros

Factor and sparse models are two widely used methods to impose a low-dimensional structure in high-dimensions.

Model Selection regression +1

The Interplay of Demographic Variables and Social Distancing Scores in Deep Prediction of U.S. COVID-19 Cases

no code implementations • 6 Jan 2021 • Francesca Tang, Yang Feng, Hamza Chiheb, Jianqing Fan

With the severity of the COVID-19 outbreak, we characterize the nature of the growth trajectories of counties in the United States using a novel combination of spectral clustering and the correlation matrix.


Spectral Methods for Data Science: A Statistical Perspective

no code implementations • 15 Dec 2020 • Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma

While the studies of spectral methods can be traced back to classical matrix perturbation theory and methods of moments, the past decade has witnessed tremendous theoretical advances in demystifying their efficacy through the lens of statistical modeling, with the aid of non-asymptotic random matrix theory.

Recent Developments on Factor Models and its Applications in Econometric Learning

no code implementations • 21 Sep 2020 • Jianqing Fan, Kunpeng Li, Yuan Liao

This paper gives a selective survey of recent developments in factor models and their applications in statistical learning.

Matrix Completion

Convex and Nonconvex Optimization Are Both Minimax-Optimal for Noisy Blind Deconvolution under Random Designs

no code implementations • 4 Aug 2020 • Yuxin Chen, Jianqing Fan, Bingyan Wang, Yuling Yan

We investigate the effectiveness of convex relaxation and nonconvex optimization in solving bilinear systems of equations under two different designs (i.e., a sort of random Fourier design and Gaussian design).

Understanding Implicit Regularization in Over-Parameterized Single Index Model

no code implementations • 16 Jul 2020 • Jianqing Fan, Zhuoran Yang, Mengxin Yu

For both the vector and matrix settings, we construct an over-parameterized least-squares loss function by employing the score function transform and a robust truncation step designed specifically for heavy-tailed data.

Variable Selection

An $\ell_p$ theory of PCA and spectral clustering

no code implementations • 24 Jun 2020 • Emmanuel Abbe, Jianqing Fan, Kaizheng Wang

Principal Component Analysis (PCA) is a powerful tool in statistics and machine learning.

Clustering Community Detection

Bridging Convex and Nonconvex Optimization in Robust PCA: Noise, Outliers, and Missing Data

no code implementations • 15 Jan 2020 • Yuxin Chen, Jianqing Fan, Cong Ma, Yuling Yan

This paper delivers improved theoretical guarantees for the convex programming approach in low-rank matrix estimation, in the presence of (1) random noise, (2) gross sparse outliers, and (3) missing data.

SIMPLE: Statistical Inference on Membership Profiles in Large Networks

no code implementations • 3 Oct 2019 • Jianqing Fan, Yingying Fan, Xiao Han, Jinchi Lv

Both tests are of the Hotelling-type statistics based on the rows of empirical eigenvectors or their ratios, whose asymptotic covariance matrices are very challenging to derive and estimate.

Communication-Efficient Accurate Statistical Estimation

no code implementations • 12 Jun 2019 • Jianqing Fan, Yongyi Guo, Kaizheng Wang

In addition, we give the conditions under which the one-step CEASE estimator is statistically efficient.

Distributed Optimization

Inference and Uncertainty Quantification for Noisy Matrix Completion

no code implementations • 10 Jun 2019 • Yuxin Chen, Jianqing Fan, Cong Ma, Yuling Yan

As a byproduct, we obtain a sharp characterization of the estimation accuracy of our de-biased estimators, which, to the best of our knowledge, are the first tractable algorithms that provably achieve full statistical efficiency (including the preconstant).

Matrix Completion Uncertainty Quantification +1

Low-Rank Principal Eigenmatrix Analysis

no code implementations • 28 Apr 2019 • Krishna Balasubramanian, Elynn Y. Chen, Jianqing Fan, Xiang Wu

Sparse PCA is a widely used technique for high-dimensional data analysis.

A Selective Overview of Deep Learning

no code implementations • 10 Apr 2019 • Jianqing Fan, Cong Ma, Yiqiao Zhong

Deep learning has arguably achieved tremendous success in recent years.

Noisy Matrix Completion: Understanding Statistical Guarantees for Convex Relaxation via Nonconvex Optimization

no code implementations • 20 Feb 2019 • Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma, Yuling Yan

This paper studies noisy low-rank matrix completion: given partial and noisy entries of a large low-rank matrix, the goal is to estimate the underlying matrix faithfully and efficiently.

Low-Rank Matrix Completion

A Theoretical Analysis of Deep Q-Learning

no code implementations • 1 Jan 2019 • Jianqing Fan, Zhaoran Wang, Yuchen Xie, Zhuoran Yang

Despite the great empirical success of deep reinforcement learning, its theoretical foundation is less well understood.


Asymmetry Helps: Eigenvalue and Eigenvector Analyses of Asymmetrically Perturbed Low-Rank Matrices

no code implementations • 30 Nov 2018 • Yuxin Chen, Chen Cheng, Jianqing Fan

The aim is to estimate the leading eigenvalue and eigenvector of $\mathbf{M}^{\star}$.

Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval

no code implementations • 21 Aug 2018 • Jianqing Fan, Han Liu, Zhaoran Wang, Zhuoran Yang

We study the fundamental tradeoffs between statistical accuracy and computational tractability in the analysis of high dimensional heterogeneous data.

Clustering Retrieval

Robust high dimensional factor models with applications to statistical machine learning

no code implementations • 12 Aug 2018 • Jianqing Fan, Kaizheng Wang, Yiqiao Zhong, Ziwei Zhu

Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements that arise frequently from various applications from genomics and neuroscience to economics and finance.

BIG-bench Machine Learning Model Selection +1

Tensor Methods for Additive Index Models under Discordance and Heterogeneity

no code implementations • 17 Jul 2018 • Krishnakumar Balasubramanian, Jianqing Fan, Zhuoran Yang

Motivated by the sampling problems and heterogeneity issues common in high-dimensional big datasets, we consider a class of discordant additive index models.

Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval

no code implementations • 21 Mar 2018 • Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma

This paper considers the problem of solving systems of quadratic equations, namely, recovering an object of interest $\mathbf{x}^{\natural}\in\mathbb{R}^{n}$ from $m$ quadratic equations/samples $y_{i}=(\mathbf{a}_{i}^{\top}\mathbf{x}^{\natural})^{2}$, $1\leq i\leq m$.
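
The measurement model in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes real Gaussian design vectors, the least-squares loss $\frac{1}{4m}\sum_i ((\mathbf{a}_i^{\top}\mathbf{x})^2 - y_i)^2$, and a constant step size. The function names, the step size 0.1, and the problem sizes are hypothetical choices made for the example.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def make_measurements(x_true, m, rng):
    """Draw m Gaussian vectors a_i and return (A, y) with y_i = (a_i^T x_true)^2."""
    A = [[rng.gauss(0.0, 1.0) for _ in x_true] for _ in range(m)]
    y = [dot(a, x_true) ** 2 for a in A]
    return A, y


def loss(x, A, y):
    """Least-squares loss (1/(4m)) * sum_i ((a_i^T x)^2 - y_i)^2 (assumed form)."""
    return sum((dot(a, x) ** 2 - yi) ** 2 for a, yi in zip(A, y)) / (4 * len(y))


def gradient(x, A, y):
    """Gradient of the loss: (1/m) * sum_i ((a_i^T x)^2 - y_i) (a_i^T x) a_i."""
    g = [0.0] * len(x)
    for a, yi in zip(A, y):
        ax = dot(a, x)
        coef = (ax ** 2 - yi) * ax
        for j, aj in enumerate(a):
            g[j] += coef * aj
    return [gj / len(y) for gj in g]


def gradient_descent(A, y, x0, step=0.1, iters=500):
    """Plain gradient descent with a constant step size (illustrative values)."""
    x = list(x0)
    for _ in range(iters):
        g = gradient(x, A, y)
        x = [xj - step * gj for xj, gj in zip(x, g)]
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    n, m = 10, 200
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = dot(x_true, x_true) ** 0.5
    x_true = [xj / norm for xj in x_true]          # unit-norm signal
    A, y = make_measurements(x_true, m, rng)
    x0 = [rng.gauss(0.0, 1.0) / n ** 0.5 for _ in range(n)]  # random start
    x_hat = gradient_descent(A, y, x0)
    # The signal is identifiable only up to a global sign, so compare to both.
    err_plus = sum((a - b) ** 2 for a, b in zip(x_hat, x_true)) ** 0.5
    err_minus = sum((a + b) ** 2 for a, b in zip(x_hat, x_true)) ** 0.5
    print("distance to +/- x_true:", min(err_plus, err_minus))
```

Because each $y_i$ depends on $\mathbf{x}^{\natural}$ only through a square, both $\mathbf{x}^{\natural}$ and $-\mathbf{x}^{\natural}$ give zero loss, which is why the demo reports the distance to the closer of the two.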


Testability of high-dimensional linear models with non-sparse structures

no code implementations • 26 Feb 2018 • Jelena Bradic, Jianqing Fan, Yinchu Zhu

Uniform non-testability identifies a collection of alternatives such that the power of any test, against any alternative in the group, is asymptotically at most equal to the nominal size.

Feature Correlation regression +1

Spectral Method and Regularized MLE Are Both Optimal for Top-$K$ Ranking

no code implementations • 31 Jul 2017 • Yuxin Chen, Jianqing Fan, Cong Ma, Kaizheng Wang

This paper is concerned with the problem of top-$K$ ranking from pairwise comparisons.

Adaptive Huber Regression

2 code implementations • 21 Jun 2017 • Qiang Sun, WenXin Zhou, Jianqing Fan

We establish a sharp phase transition for robust estimation of regression parameters in both low and high dimensions: when $\delta \geq 1$, the estimator admits a sub-Gaussian-type deviation bound without sub-Gaussian assumptions on the data, while only a slower rate is available in the regime $0<\delta< 1$.

Statistics Theory Methodology Statistics Theory

Factor-Adjusted Regularized Model Selection

1 code implementation • 27 Dec 2016 • Jianqing Fan, Yuan Ke, Kaizheng Wang

This paper studies model selection consistency for high dimensional sparse regression when data exhibits both cross-sectional and serial dependency.


Sufficient Forecasting Using Factor Models

no code implementations • 27 May 2015 • Jianqing Fan, Lingzhou Xue, Jiawei Yao

Our method and theory allow the number of predictors to be larger than the number of observations.

Dimensionality Reduction regression +1

A Projection Based Conditional Dependence Measure with Applications to High-dimensional Undirected Graphical Models

no code implementations • 7 Jan 2015 • Jianqing Fan, Yang Feng, Lucy Xia

Measuring conditional dependence is an important topic in statistics with broad applications including graphical models.

High Dimensional Semiparametric Latent Graphical Model for Mixed Data

1 code implementation • 29 Apr 2014 • Jianqing Fan, Han Liu, Yang Ning, Hui Zou

Theoretically, the proposed methods achieve the same rates of convergence for both precision matrix estimation and eigenvector estimation, as if the latent variables were observed.

feature selection Vocal Bursts Intensity Prediction

Feature Augmentation via Nonparametrics and Selection (FANS) in High Dimensional Classification

no code implementations • 31 Dec 2013 • Jianqing Fan, Yang Feng, Jiancheng Jiang, Xin Tong

We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities.

Additive models General Classification +1

Challenges of Big Data Analysis

no code implementations • 7 Aug 2013 • Jianqing Fan, Fang Han, Han Liu

Big Data bring new opportunities to modern society and challenges to data scientists.

Strong oracle optimality of folded concave penalized estimation

no code implementations • 22 Oct 2012 • Jianqing Fan, Lingzhou Xue, Hui Zou

Folded concave penalization methods have been shown to enjoy the strong oracle property for high-dimensional sparse estimation.


Regularization for Cox's proportional hazards model with NP-dimensionality

no code implementations • 25 Oct 2010 • Jelena Bradic, Jianqing Fan, Jiancheng Jiang

High-throughput genetic sequencing arrays with thousands of measurements per sample, together with a large amount of related censored clinical data, have increased the need for better measurement-specific model selection.

Model Selection
