no code implementations • 21 Mar 2024 • Haihao Lu, Luyang Zhang
Determining how to rank these sponsored items for each incoming visit is a crucial challenge for online marketplaces, a problem known as sponsored listings ranking (SLR).
no code implementations • 12 Feb 2022 • Santiago R. Balseiro, Haihao Lu, Vahab Mirrokni, Balasubramanian Sivan
As a byproduct of our proofs, we provide the first regret bound for CMD for non-smooth convex optimization, which might be of independent interest.
no code implementations • 10 Nov 2021 • Haihao Lu, Jinwen Yang
There has been recent interest in first-order methods for linear programming (LP).
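The snippet does not name a particular algorithm, but a representative first-order method for LP is the primal-dual hybrid gradient (PDHG) method applied to the saddle-point form of the problem. The following is a minimal sketch under that assumption; the toy problem data and step-size rule are illustrative and not taken from the paper.

```python
import numpy as np

# Toy standard-form LP:  min c^T x  s.t.  A x = b,  x >= 0.
# The data below are arbitrary illustrative values.
A = np.array([[1.0, 1.0, 1.0],
              [2.0, 0.5, 1.0]])
b = np.array([4.0, 3.0])
c = np.array([1.0, 2.0, 3.0])

# PDHG on the saddle-point form  min_{x>=0} max_y  c^T x + y^T (b - A x).
tau = sigma = 0.9 / np.linalg.norm(A, 2)   # step sizes tied to the spectral norm of A
x = np.zeros(3)
y = np.zeros(2)

for _ in range(5000):
    x_new = np.maximum(x - tau * (c - A.T @ y), 0.0)   # projected primal step
    y = y + sigma * (b - A @ (2 * x_new - x))          # dual step with extrapolation
    x = x_new

print("approximate primal solution:", x.round(3))
print("objective value:", round(float(c @ x), 3))
```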
1 code implementation • NeurIPS 2020 • Joey Huchette, Haihao Lu, Hossein Esfandiari, Vahab Mirrokni
Moreover, we show that this MIP formulation is ideal (i.e., the strongest possible formulation) for the revenue function of a single impression.
no code implementations • 18 Nov 2020 • Santiago Balseiro, Haihao Lu, Vahab Mirrokni
In this paper, we consider a data-driven setting in which the reward and resource consumption of each request are generated using an input model that is unknown to the decision maker.
no code implementations • 20 Oct 2020 • Benjamin Grimmer, Haihao Lu, Pratik Worah, Vahab Mirrokni
Unlike nonconvex optimization, where gradient descent is guaranteed to converge to a local optimizer, algorithms for nonconvex-nonconcave minimax optimization can have topologically different solution paths: sometimes converging to a solution, sometimes never converging and instead following a limit cycle, and sometimes diverging.
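A standard toy illustration of the non-convergent behavior described here (my own example, not from the paper) is simultaneous gradient descent-ascent on the bilinear objective f(x, y) = x * y: the unique saddle point is the origin, yet the iterates spiral away from it rather than converging.

```python
import numpy as np

# Simultaneous gradient descent-ascent on f(x, y) = x * y.
# The unique saddle point is (0, 0), but the distance of the iterates
# to the origin grows by a factor sqrt(1 + eta**2) every step,
# so the trajectory spirals outward instead of converging.
eta = 0.1
x, y = 1.0, 0.0
radii = []
for _ in range(100):
    x, y = x - eta * y, y + eta * x   # descend in x and ascend in y simultaneously
    radii.append(np.hypot(x, y))

print("radius after 1, 50, 100 steps:",
      round(radii[0], 3), round(radii[49], 3), round(radii[99], 3))
```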
no code implementations • 1 Jul 2020 • Santiago Balseiro, Haihao Lu, Vahab Mirrokni
In this paper, we introduce the \emph{regularized online allocation problem}, a variant that includes a non-linear regularizer acting on the total resource consumption.
no code implementations • 15 Jun 2020 • Benjamin Grimmer, Haihao Lu, Pratik Worah, Vahab Mirrokni
Critically, we show this envelope not only smooths the objective but can convexify and concavify it based on the level of interaction present between the minimizing and maximizing variables.
no code implementations • ICML 2020 • Haihao Lu, Santiago Balseiro, Vahab Mirrokni
The revenue function and resource consumption of each request are drawn independently and at random from a probability distribution that is unknown to the decision maker.
Optimization and Control
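As a rough sketch of the dual-price idea behind this line of work on online allocation (the i.i.d. data, single resource, and step size below are my own simplifications, not the paper's exact algorithm), one can maintain a dual price per resource, accept a request when its reward exceeds the priced-out resource cost, and update the price from the observed consumption:

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000                      # number of requests
rho = 0.3                     # per-period resource budget (total budget = rho * T)
mu = 0.0                      # dual price for the single resource
eta = 1.0 / np.sqrt(T)        # dual step size
budget = rho * T
reward = 0.0

for _ in range(T):
    r = rng.uniform(0.0, 1.0)         # reward of the request (distribution unknown to the algorithm)
    bcost = rng.uniform(0.0, 1.0)     # resource consumption if accepted
    accept = (r - mu * bcost > 0.0) and (budget >= bcost)   # price out the resource
    if accept:
        reward += r
        budget -= bcost
    # dual (sub)gradient step: move the price toward budget balance
    mu = max(mu + eta * ((bcost if accept else 0.0) - rho), 0.0)

print("collected reward:", round(reward, 1), "remaining budget:", round(budget, 1))
```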
1 code implementation • 20 Feb 2020 • Joey Huchette, Haihao Lu, Hossein Esfandiari, Vahab Mirrokni
Moreover, we show that this MIP formulation is ideal (i.e., the strongest possible formulation) for the revenue function of a single impression.
no code implementations • 23 Jan 2020 • Haihao Lu
Surprisingly, two fundamental questions remain unanswered: (i) it is unclear how to obtain a \emph{suitable} ODE from a given DTA, and (ii) the connection between the convergence of a DTA and that of its corresponding ODE is unclear.
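For concreteness (this example is mine, not the paper's), the prototypical DTA/ODE pair is gradient descent and the gradient flow $\dot{x}(t) = -\nabla f(x(t))$: a forward-Euler discretization of the flow with step $h$ recovers the discrete iteration exactly.

```python
import numpy as np

# f(x) = 0.5 * x^T Q x, with gradient Q x.
Q = np.diag([1.0, 10.0])
grad = lambda x: Q @ x

h = 0.05                       # step size of the DTA / Euler discretization step of the ODE
x_dta = np.array([1.0, 1.0])   # discrete-time algorithm: gradient descent
x_ode = np.array([1.0, 1.0])   # forward-Euler approximation of the gradient flow

for _ in range(200):
    x_dta = x_dta - h * grad(x_dta)        # gradient descent iterate
    x_ode = x_ode + h * (-grad(x_ode))     # Euler step of the ODE (identical update)

print(np.allclose(x_dta, x_ode))           # True: gradient descent is Euler on the gradient flow
```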
2 code implementations • 9 Jul 2019 • Kenji Kawaguchi, Haihao Lu
The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss.
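A minimal sketch of that baseline, i.e. mini-batch SGD with an unbiased estimate of the gradient of the empirical average loss (the least-squares model and data below are placeholders):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 5
X = rng.normal(size=(n, d))
y = X @ np.ones(d) + 0.1 * rng.normal(size=n)

w = np.zeros(d)
lr, batch = 0.1, 32
for _ in range(2000):
    idx = rng.choice(n, size=batch, replace=False)   # uniformly sampled mini-batch
    resid = X[idx] @ w - y[idx]
    g = X[idx].T @ resid / batch                     # unbiased estimate of the gradient
    w -= lr * g                                      # of the empirical average squared loss

print("parameter error:", round(float(np.linalg.norm(w - np.ones(d))), 3))
```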
1 code implementation • 20 Mar 2019 • Haihao Lu, Sai Praneeth Karimireddy, Natalia Ponomareva, Vahab Mirrokni
This is the first GBM-type algorithm with a theoretically justified accelerated convergence rate.
1 code implementation • 24 Oct 2018 • Haihao Lu, Rahul Mazumder
Gradient Boosting Machine (GBM), introduced by Friedman, is a powerful supervised learning algorithm that is very widely used in practice; it routinely features as a leading algorithm in machine learning competitions such as those hosted on Kaggle and the KDDCup.
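A minimal least-squares boosting loop in the spirit of Friedman's GBM (this sketch uses shallow scikit-learn regression trees as the weak learners; the data and hyperparameters are purely illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

# Gradient boosting for squared loss: each weak learner is fit to the current
# residuals (the negative gradient of the loss) and added with a small
# learning rate (shrinkage).
pred = np.zeros_like(y)
learners, lr = [], 0.1
for _ in range(100):
    residual = y - pred                                   # negative gradient for L2 loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    pred += lr * tree.predict(X)
    learners.append(tree)

print("training MSE:", round(float(np.mean((y - pred) ** 2)), 4))
```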
1 code implementation • 4 Oct 2018 • Shuaiwen Wang, Wenda Zhou, Arian Maleki, Haihao Lu, Vahab Mirrokni
Here $\mathcal{C} \subset \mathbb{R}^{p}$ is a closed convex set.
2 code implementations • ICML 2018 • Shuaiwen Wang, Wenda Zhou, Haihao Lu, Arian Maleki, Vahab Mirrokni
Consider the following class of learning schemes: $$\hat{\boldsymbol{\beta}} := \arg\min_{\boldsymbol{\beta}}\;\sum_{j=1}^n \ell(\boldsymbol{x}_j^\top\boldsymbol{\beta}; y_j) + \lambda R(\boldsymbol{\beta}),\qquad\qquad (1) $$ where $\boldsymbol{x}_j \in \mathbb{R}^p$ and $y_j \in \mathbb{R}$ denote the $j^{\text{th}}$ feature vector and response variable, respectively.
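For one concrete instance of (1), chosen here only for illustration, take the squared loss $\ell(u; y) = \frac{1}{2}(u - y)^2$ and the ridge regularizer $R(\boldsymbol{\beta}) = \frac{1}{2}\|\boldsymbol{\beta}\|_2^2$, which admits a closed-form minimizer:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 200, 10
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
y = X @ beta_true + 0.5 * rng.normal(size=n)

# Instance of (1): l(u; y) = 0.5 * (u - y)^2 and R(beta) = 0.5 * ||beta||^2
# (ridge regression), whose minimizer solves (X^T X + lam * I) beta = X^T y.
lam = 1.0
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
print("estimation error:", round(float(np.linalg.norm(beta_hat - beta_true)), 3))
```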
no code implementations • ICML 2018 • Haihao Lu, Robert Freund, Vahab Mirrokni
On the empirical side, both AGCD and ASCD outperform Accelerated Randomized Coordinate Descent on most instances in our numerical experiments; notably, AGCD significantly outperforms the other two methods despite its lack of theoretical guarantees.
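For reference, here is a plain (non-accelerated) greedy coordinate descent sketch of my own, using the Gauss-Southwell rule of updating the coordinate with the largest partial derivative; AGCD and ASCD build acceleration on top of greedy selection rules of this kind.

```python
import numpy as np

# Minimize the convex quadratic 0.5 * x^T Q x - b^T x by greedy coordinate descent:
# at each step, pick the coordinate with the largest absolute partial derivative
# (the Gauss-Southwell rule) and take an exact coordinate-minimization step.
rng = np.random.default_rng(3)
M = rng.normal(size=(20, 20))
Q = M.T @ M + np.eye(20)
b = rng.normal(size=20)

x = np.zeros(20)
for _ in range(500):
    g = Q @ x - b
    i = int(np.argmax(np.abs(g)))       # greedy (Gauss-Southwell) coordinate choice
    x[i] -= g[i] / Q[i, i]              # exact minimization along coordinate i

print("gradient norm:", round(float(np.linalg.norm(Q @ x - b)), 6))
```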
no code implementations • 27 Feb 2017 • Haihao Lu, Kenji Kawaguchi
In deep learning, both \textit{depth} and \textit{nonlinearity} create non-convex loss surfaces.
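A one-line illustration of the depth part of this claim (my own toy example, not from the paper): even without any nonlinearity, composing two scalar weights already makes the squared loss non-convex in the parameters.

```python
import numpy as np

# Two-layer *linear* scalar "network": prediction = w2 * w1 * x with x = 1, target y = 1.
# The loss L(w1, w2) = (w1 * w2 - 1)^2 is non-convex in (w1, w2): it violates midpoint
# convexity between the two global minima (1, 1) and (-1, -1).
L = lambda w1, w2: (w1 * w2 - 1.0) ** 2
a, b = np.array([1.0, 1.0]), np.array([-1.0, -1.0])
mid = 0.5 * (a + b)
print(L(*a), L(*b), L(*mid))   # 0.0, 0.0, 1.0 -> the midpoint value exceeds the average of the endpoints
```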