Search Results for author: Yunzong Xu

Found 6 papers, 0 papers with code

Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation

no code implementations · 21 Nov 2021 · Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu

This led Chen and Jiang (2019) to conjecture that concentrability (the most standard notion of coverage) and realizability (the weakest representation condition) alone are not sufficient for sample-efficient offline RL.

Tasks: Decision Making · Offline RL · +2

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

no code implementations · 7 Oct 2020 · Dylan J. Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

In the classical multi-armed bandit problem, instance-dependent algorithms attain improved performance on "easy" problems with a gap between the best and second-best arm.

Tasks: Active Learning · Multi-Armed Bandits · +2
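The "gap" behavior the abstract describes — easier instances when the best and second-best arms are well separated — can be seen with a minimal UCB1 sketch (a standard textbook algorithm used here for illustration; it is not the algorithm of this paper, and the arm means and horizon are made up):

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return total pseudo-regret."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull each arm once to initialize
        else:
            # upper confidence bound: empirical mean + exploration bonus
            arm = max(range(k), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret

# Per-arm regret scales like (log T) / gap, so a larger gap
# between the top two arms yields an easier instance.
easy = ucb1([0.9, 0.1], horizon=5000)   # gap 0.8
hard = ucb1([0.55, 0.45], horizon=5000) # gap 0.1
```

On these two instances the small-gap problem accumulates noticeably more regret over the same horizon, which is exactly the instance-dependence the paper generalizes to contextual bandits and RL.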

Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability

no code implementations · 28 Mar 2020 · David Simchi-Levi, Yunzong Xu

We consider the general (stochastic) contextual bandit problem under the realizability assumption, i.e., the expected reward, as a function of contexts and actions, belongs to a general function class $\mathcal{F}$.

Tasks: Multi-Armed Bandits · Regression
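The core idea behind this line of work is to reduce the contextual bandit to offline regression: fit a reward predictor from $\mathcal{F}$, then turn its predictions into an action distribution via inverse-gap weighting. Below is a toy sketch of that scheme, with a tabular empirical-mean "oracle" and made-up reward table standing in for a general regression oracle (an illustrative simplification, not the paper's exact algorithm or tuning):

```python
import random

def igw_probs(preds, gamma):
    """Inverse-gap weighting: actions predicted to be worse than the
    leader get probability shrinking with their predicted gap."""
    k = len(preds)
    best = max(range(k), key=lambda a: preds[a])
    probs = [0.0] * k
    for a in range(k):
        if a != best:
            probs[a] = 1.0 / (k + gamma * (preds[best] - preds[a]))
    probs[best] = 1.0 - sum(probs)  # leader takes the remaining mass
    return probs

def run(horizon=3000, gamma=30.0, seed=0):
    """Average per-round regret on a tiny realizable instance:
    two contexts, two actions, tabular mean estimates as the oracle."""
    rng = random.Random(seed)
    true_mean = {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.7}
    sums = {key: 0.0 for key in true_mean}
    counts = {key: 1 for key in true_mean}  # phantom count keeps preds defined
    regret = 0.0
    for _ in range(horizon):
        x = rng.randrange(2)  # observe a context
        preds = [sums[(x, a)] / counts[(x, a)] for a in range(2)]
        a = rng.choices(range(2), weights=igw_probs(preds, gamma))[0]
        r = 1.0 if rng.random() < true_mean[(x, a)] else 0.0
        sums[(x, a)] += r
        counts[(x, a)] += 1
        regret += max(true_mean[(x, 0)], true_mean[(x, 1)]) - true_mean[(x, a)]
    return regret / horizon
```

The appeal of this reduction is that the only heavy computation is an offline regression fit, avoiding the cost of oracle-based policy-optimization schemes.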

Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches

no code implementations · 4 Nov 2019 · David Simchi-Levi, Yunzong Xu, Jinglong Zhao

Our work reveals a surprising result: the optimal regret rate is completely characterized by a piecewise-constant function of the switching budget, which further depends on the number of resource constraints. To the best of our knowledge, this is the first time the number of resource constraints has been shown to play a fundamental role in determining the statistical complexity of online learning problems.

Tasks: Decision Making · Management

Phase Transitions in Bandits with Switching Constraints

no code implementations · NeurIPS 2019 · David Simchi-Levi, Yunzong Xu

We consider the classical stochastic multi-armed bandit problem with a constraint that limits the total cost incurred by switching between actions to be no larger than a given switching budget.
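A simple way to see how an algorithm can respect a hard switching budget is to play arms in long contiguous blocks, as in explore-then-commit: the number of action changes is then bounded by the number of arms, regardless of the horizon. The sketch below illustrates only this budget-accounting idea (it is a generic textbook strategy with invented parameters, not the paper's optimal algorithm):

```python
import random

def etc_limited_switches(means, explore_per_arm, seed=0):
    """Explore each Bernoulli arm in one contiguous block, then commit
    to the empirical best arm. Since each arm occupies a single block,
    the total number of switches is at most len(means)."""
    rng = random.Random(seed)
    actions = []
    est = []
    for a, mu in enumerate(means):
        rewards = [1.0 if rng.random() < mu else 0.0
                   for _ in range(explore_per_arm)]
        actions.extend([a] * explore_per_arm)  # one block per arm
        est.append(sum(rewards) / explore_per_arm)
    commit = max(range(len(means)), key=lambda a: est[a])
    actions.append(commit)  # all remaining rounds stay on `commit`
    switches = sum(1 for i in range(1, len(actions))
                   if actions[i] != actions[i - 1])
    return commit, switches

best_arm, n_switches = etc_limited_switches([0.2, 0.8, 0.5],
                                            explore_per_arm=400)
```

The paper's phase-transition result concerns how the best achievable regret jumps as this kind of budget varies; the sketch only shows why block-structured play keeps the switch count horizon-independent.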
