Search Results for author: Yunzong Xu

Found 6 papers, 0 papers with code

Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation

no code implementations · 21 Nov 2021 · Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu

This led Chen and Jiang (2019) to conjecture that concentrability (the most standard notion of coverage) and realizability (the weakest representation condition) alone are not sufficient for sample-efficient offline RL.

Tasks: Decision Making · Offline RL · +2

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

no code implementations · 7 Oct 2020 · Dylan J. Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

In the classical multi-armed bandit problem, instance-dependent algorithms attain improved performance on "easy" problems with a gap between the best and second-best arm.

Tasks: Active Learning · Multi-Armed Bandits · +2
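The "gap" behavior the abstract describes — easier instances when the best and second-best arms are well separated — can be seen with a minimal UCB1 sketch (a standard textbook algorithm used here for illustration; it is not the algorithm of this paper, and the arm means and horizon are made up):

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return total pseudo-regret."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull each arm once to initialize
        else:
            # upper confidence bound: empirical mean + exploration bonus
            arm = max(range(k), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret

# Per-arm regret scales like (log T) / gap, so a larger gap
# between the top two arms yields an easier instance.
easy = ucb1([0.9, 0.1], horizon=5000)   # gap 0.8
hard = ucb1([0.55, 0.45], horizon=5000) # gap 0.1
```

On these two instances the small-gap problem accumulates noticeably more regret over the same horizon, which is exactly the instance-dependence the paper generalizes to contextual bandits and RL.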

Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability

no code implementations · 28 Mar 2020 · David Simchi-Levi, Yunzong Xu

We consider the general (stochastic) contextual bandit problem under the realizability assumption, i.e., the expected reward, as a function of contexts and actions, belongs to a general function class $\mathcal{F}$.

Tasks: Multi-Armed Bandits · Regression
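The core idea behind this line of work is to reduce the contextual bandit to offline regression: fit a reward predictor from $\mathcal{F}$, then turn its predictions into an action distribution via inverse-gap weighting. Below is a toy sketch of that scheme, with a tabular empirical-mean "oracle" and made-up reward table standing in for a general regression oracle (an illustrative simplification, not the paper's exact algorithm or tuning):

```python
import random

def igw_probs(preds, gamma):
    """Inverse-gap weighting: actions predicted to be worse than the
    leader get probability shrinking with their predicted gap."""
    k = len(preds)
    best = max(range(k), key=lambda a: preds[a])
    probs = [0.0] * k
    for a in range(k):
        if a != best:
            probs[a] = 1.0 / (k + gamma * (preds[best] - preds[a]))
    probs[best] = 1.0 - sum(probs)  # leader takes the remaining mass
    return probs

def run(horizon=3000, gamma=30.0, seed=0):
    """Average per-round regret on a tiny realizable instance:
    two contexts, two actions, tabular mean estimates as the oracle."""
    rng = random.Random(seed)
    true_mean = {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.7}
    sums = {key: 0.0 for key in true_mean}
    counts = {key: 1 for key in true_mean}  # phantom count keeps preds defined
    regret = 0.0
    for _ in range(horizon):
        x = rng.randrange(2)  # observe a context
        preds = [sums[(x, a)] / counts[(x, a)] for a in range(2)]
        a = rng.choices(range(2), weights=igw_probs(preds, gamma))[0]
        r = 1.0 if rng.random() < true_mean[(x, a)] else 0.0
        sums[(x, a)] += r
        counts[(x, a)] += 1
        regret += max(true_mean[(x, 0)], true_mean[(x, 1)]) - true_mean[(x, a)]
    return regret / horizon
```

The appeal of this reduction is that the only heavy computation is an offline regression fit, avoiding the cost of oracle-based policy-optimization schemes.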

Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches

no code implementations · 4 Nov 2019 · David Simchi-Levi, Yunzong Xu, Jinglong Zhao

Our work reveals a surprising result: the optimal regret rate is completely characterized by a piecewise-constant function of the switching budget, which further depends on the number of resource constraints. To the best of our knowledge, this is the first time the number of resource constraints has been shown to play a fundamental role in determining the statistical complexity of online learning problems.

Tasks: Decision Making · Management

Phase Transitions in Bandits with Switching Constraints

no code implementations · NeurIPS 2019 · David Simchi-Levi, Yunzong Xu

We consider the classical stochastic multi-armed bandit problem with a constraint that limits the total cost incurred by switching between actions to be no larger than a given switching budget.
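A simple way to see how an algorithm can respect a hard switching budget is to play arms in long contiguous blocks, as in explore-then-commit: the number of action changes is then bounded by the number of arms, regardless of the horizon. The sketch below illustrates only this budget-accounting idea (it is a generic textbook strategy with invented parameters, not the paper's optimal algorithm):

```python
import random

def etc_limited_switches(means, explore_per_arm, seed=0):
    """Explore each Bernoulli arm in one contiguous block, then commit
    to the empirical best arm. Since each arm occupies a single block,
    the total number of switches is at most len(means)."""
    rng = random.Random(seed)
    actions = []
    est = []
    for a, mu in enumerate(means):
        rewards = [1.0 if rng.random() < mu else 0.0
                   for _ in range(explore_per_arm)]
        actions.extend([a] * explore_per_arm)  # one block per arm
        est.append(sum(rewards) / explore_per_arm)
    commit = max(range(len(means)), key=lambda a: est[a])
    actions.append(commit)  # all remaining rounds stay on `commit`
    switches = sum(1 for i in range(1, len(actions))
                   if actions[i] != actions[i - 1])
    return commit, switches

best_arm, n_switches = etc_limited_switches([0.2, 0.8, 0.5],
                                            explore_per_arm=400)
```

The paper's phase-transition result concerns how the best achievable regret jumps as this kind of budget varies; the sketch only shows why block-structured play keeps the switch count horizon-independent.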
