no code implementations • 29 Aug 2024 • Jiameng Lyu, Jinxing Xie, Shilin Yuan, Yuan Zhou
Our meta-policy is flexible enough to be applied to a general inventory systems framework covering a wide range of inventory management problems with a myopic clairvoyant optimal policy.
no code implementations • 6 Jul 2024 • Jiameng Lyu, Shilin Yuan, Bingkun Zhou, Yuan Zhou
Under the $\alpha$-global strong convexity condition, we demonstrate that the worst-case regret of any data-driven method is lower bounded by $\Omega(\log T/\alpha)$, which is the first lower bound result that matches the existing upper bound with respect to both the parameter $\alpha$ and the time horizon $T$. Along the way, we propose to analyze the SAA regret via a new gradient approximation technique, and we propose a new class of smooth inverted-hat-shaped hard problem instances that might be of independent interest for the lower bounds of broader data-driven problems.