When Demands Evolve Larger and Noisier: Learning and Earning in a Growing Environment

ICML 2020  ·  Feng Zhu, Zeyu Zheng

We consider a single-product dynamic pricing problem in a specific non-stationary setting, where the demand grows over time in expectation and possibly becomes noisier. The decision maker dynamically sets prices and learns the unknown price elasticity, with the goal of maximizing expected cumulative revenue. We prove matching upper and lower bounds on regret and provide near-optimal pricing policies. We show how the change in demand uncertainty over time affects the optimal policy design, and demonstrate how the order of the optimal regret depends on the magnitude of that change. Moreover, we distinguish between the any-time situation and the fixed-time situation according to whether the seller knows the total number of time periods $T$ in advance, and show that, surprisingly, the two situations lead to different optimal regret orders. We then extend the demand model to a more general case that allows for an additional intercept term, and present a novel, near-optimal algorithm for the extended model. Finally, we consider an analogous non-stationary setting in the canonical multi-armed bandit problem, and point out that, in contrast to the dynamic pricing problem, the any-time and fixed-time situations there share the same optimal regret order, which takes a simple form.
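Since the abstract describes the setting only at a high level, the following is a minimal, illustrative Python sketch of a growing-and-noisier demand environment paired with a simple least-squares pricing heuristic. The linear demand form, the specific growth and noise schedules, and the explore-then-price-greedily policy are all assumptions made for illustration; they are not the paper's demand model, algorithm, or regret analysis.

```python
# Illustrative sketch only: a toy environment where expected demand grows
# over time and demand noise also grows, plus a crude pricing policy that
# fits a linear demand curve by least squares and then prices greedily.
import numpy as np

rng = np.random.default_rng(0)

T = 2000                 # horizon (fixed-time situation: T is known in advance)
alpha, beta = 10.0, 2.0  # true (unknown) intercept and price elasticity
p_min, p_max = 0.5, 4.0  # admissible price range

def demand(t, price):
    """Demand grows in expectation over time and gets noisier (assumed schedules)."""
    growth = 1.0 + 0.01 * t      # assumed growth of mean demand
    noise_sd = 1.0 + 0.005 * t   # assumed growth of demand noise
    mean = growth * (alpha - beta * price)
    return mean + noise_sd * rng.standard_normal()

prices, demands, revenue = [], [], 0.0
for t in range(T):
    if t < 50:
        # Exploration phase: spread prices to make the elasticity identifiable.
        p = p_min + (p_max - p_min) * rng.random()
    else:
        # Greedy phase: fit demand ~ a + b * price by least squares
        # (this pooled fit ignores the time-varying growth, which is exactly
        # the difficulty a real policy for this setting must address),
        # then price at the estimated revenue maximizer -a / (2b).
        X = np.column_stack([np.ones(len(prices)), prices])
        a_hat, b_hat = np.linalg.lstsq(X, np.array(demands), rcond=None)[0]
        b_hat = min(b_hat, -1e-3)  # keep the estimated slope negative
        p = np.clip(-a_hat / (2 * b_hat), p_min, p_max)
    d = demand(t, p)
    prices.append(p)
    demands.append(d)
    revenue += p * d

print(f"cumulative revenue over T={T}: {revenue:.1f}")
```

In this sketch the exploration length is set with $T$ known up front (the fixed-time situation); an any-time variant would have to schedule its exploration without that knowledge, which is precisely the distinction whose effect on the optimal regret order the paper analyzes.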

