Search Results for author: Jian Qian

Found 18 papers, 4 papers with code

To bootstrap or to rollout? An optimal and adaptive interpolation

no code implementations • 14 Nov 2024 • Wenlong Mou, Jian Qian

Specifically, the error upper bound of our estimator approaches the optimal variance achieved by TD, with an additional term depending on the exit probability of a selected subset of the state space.

Reinforcement Learning (RL)
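
As a reference point for the bootstrap-versus-rollout tradeoff in the abstract above, here is a minimal sketch of the classical n-step return, which interpolates between TD (n = 1, pure bootstrapping) and a full Monte Carlo rollout (large n). Names are illustrative; the paper's adaptive estimator chooses where to bootstrap based on a selected subset of the state space, which this sketch does not implement.

```python
def n_step_return(rewards, values, t, n, gamma=0.99):
    """n-step return from time t: accumulate n discounted rewards,
    then bootstrap with the value estimate at the exit state.
    n=1 recovers TD; n >= len(rewards) - t recovers a full rollout."""
    T = len(rewards)
    horizon = min(t + n, T)
    g = sum(gamma ** (k - t) * rewards[k] for k in range(t, horizon))
    if horizon < T:  # episode not finished: bootstrap instead of rolling out
        g += gamma ** (horizon - t) * values[horizon]
    return g
```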

Refined Risk Bounds for Unbounded Losses via Transductive Priors

no code implementations • 29 Oct 2024 • Jian Qian, Alexander Rakhlin, Nikita Zhivotovskiy

We revisit the sequential variants of linear regression with the squared loss, classification problems with hinge loss, and logistic regression, all characterized by unbounded losses in the setup where no assumptions are made on the magnitude of design vectors and the norm of the optimal vector of parameters.

Denoising · regression
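
For readers unfamiliar with the sequential squared-loss setting the abstract refers to, the following is a minimal sketch of online ridge regression, where neither the design vectors nor the comparator norm is assumed bounded. The regularization and loop structure are illustrative only and do not implement the paper's transductive-prior analysis.

```python
import numpy as np

def online_ridge_losses(X, y, lam=1.0):
    """Play ridge predictions sequentially on a stream (x_t, y_t),
    recording the squared loss suffered before each update."""
    d = X.shape[1]
    A = lam * np.eye(d)  # running Gram matrix X^T X + lam * I
    b = np.zeros(d)      # running X^T y
    losses = []
    for x, target in zip(X, y):
        w = np.linalg.solve(A, b)             # current regularized estimate
        losses.append((target - w @ x) ** 2)  # loss incurred at this round
        A += np.outer(x, x)
        b += target * x
    return np.array(losses)
```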

How Does Variance Shape the Regret in Contextual Bandits?

no code implementations • 16 Oct 2024 • Zeyu Jia, Jian Qian, Alexander Rakhlin, Chen-Yu Wei

We show that a regret of $\Omega(\sqrt{d_\text{elu}\Lambda}+d_\text{elu})$ is unavoidable when $\sqrt{d_\text{elu}\Lambda}+d_\text{elu}\leq\sqrt{AT}$.

Multi-Armed Bandits

Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability

no code implementations • 7 Oct 2024 • Fan Chen, Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin, Yunbei Xu

Classical lower bound techniques -- such as Fano's method, Le Cam's method, and Assouad's lemma -- are central to the study of minimax risk in statistical estimation, yet are insufficient to provide tight lower bounds for \emph{interactive decision making} algorithms that collect data interactively (e.g., algorithms for bandits and reinforcement learning).

Decision Making · LEMMA
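
For context, the simplest of the classical tools named in the abstract is Le Cam's two-point method (stated here as background, not from the paper): for any two parameters $\theta_0, \theta_1$ with $d(\theta_0, \theta_1) \geq 2\delta$ and induced data distributions $P_0, P_1$,

$$\inf_{\hat\theta}\, \max_{i \in \{0,1\}} \mathbb{E}_{P_i}\big[d(\hat\theta, \theta_i)\big] \;\geq\; \delta\,\big(1 - \|P_0 - P_1\|_{\mathrm{TV}}\big).$$

Bounds of this non-interactive form fix the data distribution in advance, which is exactly what fails to capture algorithms that influence the data they collect.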

SDformer: Efficient End-to-End Transformer for Depth Completion

1 code implementation • 12 Sep 2024 • Jian Qian, Miao Sun, Ashley Lee, Jie Li, Shenglong Zhuo, Patrick Yin Chiang

The network consists of an input module that extracts and concatenates depth map and RGB image features, a U-shaped encoder-decoder Transformer that extracts deep features, and a refinement module.

Decoder · Depth Completion
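
A minimal PyTorch sketch of the three-stage layout described in the abstract. Channel sizes and layer counts are illustrative, and a flat Transformer stage stands in for the U-shaped encoder-decoder; this is not SDformer's actual architecture.

```python
import torch
import torch.nn as nn

class DepthCompletionSketch(nn.Module):
    """Input module (feature extraction + concatenation), a Transformer
    body, and a refinement head, mirroring the abstract's description."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.rgb_in = nn.Conv2d(3, dim // 2, 3, padding=1)
        self.depth_in = nn.Conv2d(1, dim // 2, 3, padding=1)
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.body = nn.TransformerEncoder(layer, num_layers=2)
        self.refine = nn.Conv2d(dim, 1, 3, padding=1)

    def forward(self, rgb, sparse_depth):
        # extract and concatenate RGB and depth features
        feats = torch.cat([self.rgb_in(rgb), self.depth_in(sparse_depth)], dim=1)
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)  # (B, H*W, C)
        tokens = self.body(tokens)
        feats = tokens.transpose(1, 2).reshape(b, c, h, w)
        return self.refine(feats)                  # dense depth map
```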

Sub-SA: Strengthen In-context Learning via Submodular Selective Annotation

1 code implementation • 8 Jul 2024 • Jian Qian, Miao Sun, Sifan Zhou, Ziyu Zhao, Ruizhi Hun, Patrick Chiang

In Sub-SA, we design a submodular function that facilitates effective subset selection for annotation, and we show from a theoretical perspective that it satisfies monotonicity and submodularity.

Diversity · In-Context Learning
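
Monotone submodular objectives matter because greedy selection then carries a (1 - 1/e) approximation guarantee (Nemhauser et al.). The sketch below greedily maximizes a facility-location function over a similarity matrix as a stand-in for Sub-SA's selection step; it is not the paper's exact objective.

```python
import numpy as np

def greedy_facility_location(sim, k):
    """Greedy maximization of F(S) = sum_i max_{j in S} sim[i, j],
    a monotone submodular coverage-style objective.
    sim: (n, n) pairwise similarity matrix; returns k selected indices."""
    n = sim.shape[0]
    selected, best = [], np.zeros(n)
    for _ in range(k):
        # marginal gain of adding each candidate column j
        gains = np.maximum(sim, best[:, None]).sum(axis=0) - best.sum()
        if selected:
            gains[selected] = -np.inf  # never reselect
        j = int(np.argmax(gains))
        selected.append(j)
        best = np.maximum(best, sim[:, j])
    return selected
```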

TimeLDM: Latent Diffusion Model for Unconditional Time Series Generation

no code implementations • 5 Jul 2024 • Jian Qian, Bingyu Xie, Biao Wan, Minhao Li, Miao Sun, Patrick Yin Chiang

TimeLDM is composed of a variational autoencoder that encodes time series into an informative, smoothed latent representation and a latent diffusion model that operates in this latent space to generate new latent samples.

Autonomous Driving · Data Augmentation +3
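
A compact sketch of the two-stage layout the abstract describes: a VAE maps a series to a smoothed latent, and a denoiser (standing in for the latent diffusion model) operates only in that latent space. All sizes and the crude sampling loop are illustrative, not TimeLDM's.

```python
import torch
import torch.nn as nn

class LatentTSGeneratorSketch(nn.Module):
    def __init__(self, seq_len=96, latent=16):
        super().__init__()
        self.latent = latent
        self.enc = nn.Sequential(nn.Linear(seq_len, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * latent))  # -> (mean, logvar)
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(),
                                 nn.Linear(64, seq_len))
        self.denoiser = nn.Sequential(nn.Linear(latent + 1, 64), nn.ReLU(),
                                      nn.Linear(64, latent))  # noise prediction

    def encode(self, x):  # reparameterized latent sample
        mean, logvar = self.enc(x).chunk(2, dim=-1)
        return mean + torch.randn_like(mean) * (0.5 * logvar).exp()

    @torch.no_grad()
    def sample(self, n, steps=50):
        z = torch.randn(n, self.latent)
        for t in reversed(range(steps)):  # crude iterative denoising
            t_emb = torch.full((n, 1), t / steps)
            z = z - self.denoiser(torch.cat([z, t_emb], dim=-1)) / steps
        return self.dec(z)  # decode latents into time series
```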

Offline Oracle-Efficient Learning for Contextual MDPs via Layerwise Exploration-Exploitation Tradeoff

no code implementations • 28 May 2024 • Jian Qian, Haichen Hu, David Simchi-Levi

In this paper, we introduce a reduction from CMDPs to offline density estimation under the realizability assumption, i.e., a model class M containing the true underlying CMDP is provided in advance.

Density Estimation · Multi-Armed Bandits

Online Estimation via Offline Estimation: An Information-Theoretic Framework

no code implementations • 15 Apr 2024 • Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin

Our main results settle the statistical and computational complexity of online estimation in this framework.

Decision Making · Density Estimation

Byzantine-Robust Federated Linear Bandits

no code implementations • 3 Apr 2022 • Ali Jadbabaie, Haochuan Li, Jian Qian, Yi Tian

In this paper, we study a linear bandit optimization problem in a federated setting where a large collection of distributed agents collaboratively learn a common linear bandit model.

Federated Learning
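
The setting is easy to caricature in a few lines: each agent holds local data for a shared linear model, and the server must aggregate in a way that tolerates a minority of corrupted agents. The coordinate-wise median below is a standard Byzantine-robust rule used here for illustration; it is not necessarily the paper's algorithm.

```python
import numpy as np

def federated_round(features, rewards, lam=1.0):
    """One communication round: each agent solves a local ridge
    regression for the shared linear model, and the server aggregates
    with a coordinate-wise median.
    features: list of (n_i, d) arrays; rewards: list of (n_i,) arrays."""
    local = []
    for X, y in zip(features, rewards):
        d = X.shape[1]
        theta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
        local.append(theta)
    # the median is unaffected by a minority of arbitrarily bad estimates
    return np.median(np.stack(local), axis=0)
```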

The Statistical Complexity of Interactive Decision Making

no code implementations • 27 Dec 2021 • Dylan J. Foster, Sham M. Kakade, Jian Qian, Alexander Rakhlin

The main result of this work provides a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning.

Decision Making · reinforcement-learning +1
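
The complexity measure named in the abstract is usually written as follows; this is a transcription from the literature around this paper rather than from the full text, with $\Pi$ the decision space, $f^M(\pi)$ the mean payoff of decision $\pi$ under model $M$, $\pi_M$ the optimal decision for $M$, $\overline{M}$ a reference model, and $D^2_{\mathrm{H}}$ the squared Hellinger distance:

$$\mathrm{dec}_{\gamma}(\mathcal{M}, \overline{M}) \;=\; \inf_{p \in \Delta(\Pi)} \sup_{M \in \mathcal{M}} \mathbb{E}_{\pi \sim p}\Big[ f^M(\pi_M) - f^M(\pi) - \gamma \cdot D^2_{\mathrm{H}}\big(M(\pi), \overline{M}(\pi)\big) \Big].$$

The quantity trades off regret against information: a small value means some randomized decision rule suffers little regret unless it also acquires evidence distinguishing the true model from the reference.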

Robust learning under clean-label attack

no code implementations • 1 Mar 2021 • Avrim Blum, Steve Hanneke, Jian Qian, Han Shao

We study the problem of robust learning under clean-label data-poisoning attacks, where the attacker injects (an arbitrary set of) correctly-labeled examples to the training set to fool the algorithm into making mistakes on specific test instances at test time.

Data Poisoning · PAC learning

Stochastic Bandits with Vector Losses: Minimizing $\ell^\infty$-Norm of Relative Losses

no code implementations • 15 Oct 2020 • Xuedong Shang, Han Shao, Jian Qian

We study two goals: (a) finding the arm with the minimum $\ell^\infty$-norm of relative losses with a given confidence level (which refers to fixed-confidence best-arm identification); (b) minimizing the $\ell^\infty$-norm of cumulative relative losses (which refers to regret minimization).

Multi-Armed Bandits · Recommendation Systems

Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes

no code implementations • NeurIPS 2020 • Yi Tian, Jian Qian, Suvrit Sra

We study minimax optimal reinforcement learning in episodic factored Markov decision processes (FMDPs), which are MDPs with conditionally independent transition components.

reinforcement-learning · Reinforcement Learning +1
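
As a one-line reminder of what "conditionally independent transition components" means in an FMDP: with state factors $s = (s_1, \ldots, s_m)$ and scopes $\mathcal{Z}_i$, the transition kernel factorizes as

$$P(s' \mid s, a) \;=\; \prod_{i=1}^{m} P_i\big(s'_i \,\big|\, s[\mathcal{Z}_i],\, a\big),$$

where $s[\mathcal{Z}_i]$ denotes the restriction of $s$ to the $i$-th scope. (This is the standard factored-MDP notation and may differ cosmetically from the paper's.)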

Concentration Inequalities for Multinoulli Random Variables

no code implementations • 30 Jan 2020 • Jian Qian, Ronan Fruit, Matteo Pirotta, Alessandro Lazaric

We investigate concentration inequalities for Dirichlet and Multinomial random variables.
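A classical example of the kind of inequality at stake, quoted here as background rather than from the paper: for the empirical distribution $\hat p_n$ of $n$ i.i.d. draws from a multinomial $p$ over $d$ categories, Weissman et al. give the $\ell_1$ deviation bound

$$\mathbb{P}\big(\|\hat p_n - p\|_1 \geq \varepsilon\big) \;\leq\; (2^d - 2)\, e^{-n \varepsilon^2 / 2}.$$

The paper investigates refinements of bounds in this family for Dirichlet and Multinomial variables.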

Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs

1 code implementation • NeurIPS 2019 • Jian Qian, Ronan Fruit, Matteo Pirotta, Alessandro Lazaric

The exploration bonus is an effective approach to manage the exploration-exploitation trade-off in Markov Decision Processes (MDPs).
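In its simplest form, the exploration-bonus idea is to inflate each empirical reward by a term that shrinks with the visit count and then plan in the inflated MDP. The sketch below shows that shape; the $1/\sqrt{\text{count}}$ form and the constant are illustrative, whereas the paper derives the exact bonus needed for regret guarantees in the average-reward setting.

```python
import numpy as np

def optimistic_rewards(empirical_rewards, visit_counts, beta=1.0):
    """Add an optimism bonus that decays with the visit count of each
    state-action pair, encouraging visits to poorly explored pairs."""
    return empirical_rewards + beta / np.sqrt(np.maximum(visit_counts, 1))
```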

Importance Resampling for Off-policy Prediction

2 code implementations • NeurIPS 2019 • Matthew Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White

Importance sampling (IS) is a common reweighting strategy for off-policy prediction in reinforcement learning.

Reinforcement Learning
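
The contrast between weighting and resampling fits in a few lines. Rather than multiplying each update by an importance ratio, the sketch below samples minibatch indices in proportion to the ratios and applies unweighted updates. Names are illustrative and the paper's exact estimators are not reproduced.

```python
import numpy as np

def importance_resample(transitions, rho, batch_size, rng=np.random):
    """Draw a minibatch with probability proportional to the importance
    ratios rho (target policy probability over behavior probability),
    so subsequent updates need no per-sample reweighting.
    transitions: list of experience tuples; rho: (len(transitions),) array."""
    p = rho / rho.sum()
    idx = rng.choice(len(transitions), size=batch_size, p=p)
    return [transitions[i] for i in idx]
```

Resampling trades the high variance of large importance weights for the (controlled) variance of the sampling step itself.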

Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes

no code implementations • 11 Dec 2018 • Jian Qian, Ronan Fruit, Matteo Pirotta, Alessandro Lazaric

We introduce and analyse two algorithms for exploration-exploitation in discrete and continuous Markov Decision Processes (MDPs) based on exploration bonuses.

Efficient Exploration
