no code implementations • ICML 2020 • Dan Garber, Gal Korcia, Kfir Levy
Focusing on two important families of online tasks, one generalizing online linear and logistic regression and the other being online PCA, we show that under standard well-conditioned-data assumptions (often made in the corresponding offline settings), standard online gradient descent (OGD) methods become much more efficient in the random-order model.
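A minimal sketch of the setting, not the paper's algorithm or analysis: standard OGD on logistic losses where a fixed set of examples arrives in a uniformly random order. The function names, step sizes, and the l2-ball feasible set below are illustrative assumptions.

```python
# Minimal sketch of OGD on logistic losses in the random-order model: the example set is
# fixed in advance, but the arrival order is a uniformly random permutation. Names, step
# sizes, and the l2-ball feasible set are illustrative assumptions, not the paper's setup.
import numpy as np

def ogd_random_order(X, y, eta=0.5, radius=1.0, seed=0):
    """Online gradient descent over a random permutation of (X, y); labels y in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    total_loss = 0.0
    for t, i in enumerate(rng.permutation(n), start=1):
        margin = y[i] * X[i].dot(w)
        total_loss += np.log1p(np.exp(-margin))        # logistic loss incurred this round
        grad = -y[i] * X[i] / (1.0 + np.exp(margin))   # gradient of the round's loss at w
        w -= (eta / np.sqrt(t)) * grad                 # standard OGD step size ~ 1/sqrt(t)
        nrm = np.linalg.norm(w)
        if nrm > radius:                               # project back onto the l2 ball
            w *= radius / nrm
    return w, total_loss

# Example usage on synthetic linearly separable data
X = np.random.randn(200, 5)
y = np.where(X @ np.ones(5) >= 0, 1.0, -1.0)
w_hat, cumulative_loss = ogd_random_order(X, y)
```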
no code implementations • 3 Aug 2023 • Dan Garber, Atara Kaplan
For a smooth objective function, when initialized in certain proximity of an optimal solution that satisfies SC, standard projected gradient methods only require SVD computations (for projecting onto the tensor nuclear norm ball) whose rank matches the tubal rank of the optimal solution.
no code implementations • 9 Feb 2023 • Dan Garber, Ben Kretzu
We consider the setting of online convex optimization (OCO) with \textit{exp-concave} losses.
no code implementations • 25 Oct 2022 • Dan Garber, Tsur Livney, Shoham Sabach
This paper considers a convex composite optimization problem with affine constraints, which includes problems that take the form of minimizing a smooth convex objective function over the intersection of (simple) convex sets, or regularized with multiple (simple) functions.
no code implementations • 23 Jun 2022 • Dan Garber, Atara Kaplan
Low-rank and nonsmooth matrix optimization problems capture many fundamental tasks in statistics and machine learning.
no code implementations • 19 Jun 2022 • Lior Danon, Dan Garber
All three variants are shown to provably converge to the optimal solution at a sublinear rate under standard assumptions, despite the fact that the underlying optimization problem is neither convex nor smooth.
no code implementations • 9 Feb 2022 • Dan Garber, Ben Kretzu
Concretely, when assuming the availability of a linear optimization oracle (LOO) for the feasible set, on a sequence of length $T$, our algorithms guarantee $O(T^{3/4})$ adaptive regret and $O(T^{3/4})$ adaptive expected regret, for the full-information and bandit settings, respectively, using only $O(T)$ calls to the LOO.
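For context, a sketch of the generic projection-free online convex optimization template such guarantees refer to: one LOO call per round in place of a projection. The probability-simplex oracle and the step sizes below are illustrative assumptions; the paper's adaptive-regret and bandit algorithms differ.

```python
# Generic projection-free OCO template: one linear optimization oracle (LOO) call per
# round replaces the projection step. The probability-simplex LOO and the step sizes are
# illustrative assumptions; the paper's adaptive-regret and bandit algorithms differ.
import numpy as np

def simplex_loo(g):
    """argmin over the probability simplex of <g, v>, attained at a vertex."""
    v = np.zeros_like(g)
    v[np.argmin(g)] = 1.0
    return v

def online_frank_wolfe(grad_fns, d, loo=simplex_loo):
    """Play x_t, observe the gradient of the round-t loss, take one LOO-based step."""
    x = np.full(d, 1.0 / d)                 # start at the simplex center
    plays, grad_sum = [], np.zeros(d)
    for t, grad_fn in enumerate(grad_fns, start=1):
        plays.append(x.copy())
        grad_sum += grad_fn(x)              # aggregate gradient direction so far
        v = loo(grad_sum)                   # the single LOO call of round t
        gamma = t ** -0.5                   # illustrative step size
        x = (1.0 - gamma) * x + gamma * v   # convex combination stays feasible
    return plays

# Example: T rounds of linear losses with random coefficient vectors
T, d = 100, 4
coeffs = [np.random.randn(d) for _ in range(T)]
history = online_frank_wolfe([(lambda x, c=c: c) for c in coeffs], d)
```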
no code implementations • 8 Feb 2022 • Dan Garber, Ron Fisher
We consider optimization problems in which the goal is to find a $k$-dimensional subspace of $\mathbb{R}^n$, $k \ll n$, which minimizes a convex and smooth loss.
no code implementations • NeurIPS 2021 • Dan Garber, Atara Kaplan
Low-rank and nonsmooth matrix optimization problems capture many fundamental tasks in statistics and machine learning.
no code implementations • 3 Feb 2021 • Dan Garber, Noam Wolf
We consider variants of the classical Frank-Wolfe algorithm for constrained smooth convex minimization that, instead of access to the standard oracle for minimizing a linear function over the feasible set, have access to an oracle that returns the extreme point of the feasible set closest in Euclidean distance to a given vector.
no code implementations • 18 Dec 2020 • Dan Garber, Atara Kaplan
In this work we propose efficient implementations of MEG, both with deterministic and stochastic gradients, which are tailored to optimization with low-rank matrices and use only a single low-rank SVD computation on each iteration.
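For reference, the textbook matrix exponentiated gradient update over the spectrahedron, written with full matrix log/exp; the paper's point, that this can be implemented with a single low-rank SVD per iteration, is not reproduced here.

```python
# Textbook matrix exponentiated gradient (MEG) step over the spectrahedron (unit-trace
# PSD matrices), written with full matrix log/exp for clarity. The paper's low-rank-SVD
# implementation of this update is not reproduced here; this is only the reference update.
import numpy as np
from scipy.linalg import expm, logm

def meg_step(X, grad, eta):
    """One MEG update: X <- exp(log X - eta * grad), renormalized to unit trace."""
    M = np.real(logm(X)) - eta * grad     # matrix logarithm of the current (PD) iterate
    Y = expm((M + M.T) / 2.0)             # symmetrize for numerical safety, then exponentiate
    return Y / np.trace(Y)                # rescale back to the spectrahedron

# Example: one step from the "center" of the spectrahedron with a random symmetric gradient
d = 5
X0 = np.eye(d) / d
G = np.random.randn(d, d)
G = (G + G.T) / 2.0
X1 = meg_step(X0, G, eta=0.1)
```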
no code implementations • 15 Oct 2020 • Dan Garber, Ben Kretzu
We also revisit the bandit setting under strong convexity and prove a similar bound of $\tilde O(T^{2/3})$ (instead of $O(T^{3/4})$ without strong convexity).
no code implementations • NeurIPS 2020 • Dan Garber
In recent years it was proved that simple modifications of the classical Frank-Wolfe algorithm (aka the conditional gradient algorithm) for smooth convex minimization over convex and compact polytopes converge at a linear rate, assuming the objective function has the quadratic growth property.
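For reference, the basic Frank-Wolfe step over a simple polytope (the probability simplex); the linearly convergent modifications the result concerns alter the direction selection (e.g., away or pairwise steps) and are not shown.

```python
# Basic Frank-Wolfe (conditional gradient) over the probability simplex, a simple
# polytope; the linearly convergent modifications the result refers to alter the
# direction selection (e.g., away/pairwise steps) and are not shown here.
import numpy as np

def frank_wolfe_simplex(grad_fn, d, iters=200):
    x = np.full(d, 1.0 / d)
    for t in range(1, iters + 1):
        g = grad_fn(x)
        v = np.zeros(d)
        v[np.argmin(g)] = 1.0               # linear minimization over the simplex: a vertex
        gamma = 2.0 / (t + 2.0)             # standard step-size schedule
        x = (1.0 - gamma) * x + gamma * v   # convex combination stays in the polytope
    return x

# Example: Euclidean projection of b onto the simplex, phrased as smooth minimization
b = np.array([0.7, 0.2, 0.1, 0.0, -0.1])
x_proj = frank_wolfe_simplex(lambda x: 2.0 * (x - b), d=5)
```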
no code implementations • 31 Jan 2020 • Dan Garber
Our main result shows that under this condition, which involves the eigenvalues of the gradient vector at optimal points, SGD with mini-batches, when initialized with a "warm-start" point, produces iterates that are low-rank with high probability, and hence only a low-rank SVD computation is required on each iteration.
no code implementations • 3 Dec 2019 • Dan Garber
We consider convex optimization problems which are widely used as convex relaxations for low-rank matrix recovery problems.
no code implementations • 8 Oct 2019 • Dan Garber, Ben Kretzu
We revisit the challenge of designing online algorithms for the bandit convex optimization problem (BCO) which are also scalable to high dimensional problems.
no code implementations • 5 Feb 2019 • Dan Garber
We also quantify the effect of "over-parameterization", i.e., using SVD computations with a higher rank, on the radius of this ball, showing that it can increase dramatically with moderately larger rank.
no code implementations • 27 Sep 2018 • Dan Garber, Atara Kaplan
However, such problems are highly challenging to solve at large scale: the low-rank-promoting term prohibits efficient implementations of proximal methods for composite optimization and even of simple subgradient methods.
no code implementations • 27 Sep 2018 • Dan Garber
In this paper we focus on the problem of Online Principal Component Analysis in the regret minimization framework.
no code implementations • 20 Feb 2018 • Yakov Babichenko, Dan Garber
We focus on the question of whether the aggregator can learn to optimally aggregate the forecasts of the experts, where the optimal aggregation is the Bayesian aggregation that takes into account all the information (evidence) in the system.
no code implementations • 15 Feb 2018 • Dan Garber, Shoham Sabach, Atara Kaplan
Motivated by robust matrix recovery problems such as Robust Principal Component Analysis, we consider a general optimization problem of minimizing a smooth and strongly convex loss function applied to the sum of two blocks of variables, where each block of variables is constrained or regularized individually.
no code implementations • 13 Feb 2018 • Dan Garber
In particular, our results hold for \textit{semi-adversarial} settings in which the data is a combination of an arbitrary (adversarial) sequence and a stochastic sequence, which might provide a reasonable approximation for many real-world sequences, or under a natural assumption that the data is low-rank.
no code implementations • NeurIPS 2017 • Dan Garber
This setting is of particular interest since it captures natural online extensions of well-studied \textit{offline} linear optimization problems that are NP-hard yet admit efficient approximation algorithms.
no code implementations • ICML 2017 • Dan Garber, Ohad Shamir, Nathan Srebro
We study algorithms for estimating the leading principal component of the population covariance matrix that are both communication-efficient and achieve estimation error of the order of the centralized ERM solution that uses all $mn$ samples.
no code implementations • 25 Feb 2017 • Jialei Wang, Weiran Wang, Dan Garber, Nathan Srebro
We develop and analyze efficient "coordinate-wise" methods for finding the leading eigenvector, where each step involves only a vector-vector product.
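An illustrative sketch of the per-step cost only, in the spirit of coordinate-wise power iterations and not the paper's exact algorithms or analysis: each step refreshes a single coordinate of the iterate using one row of the (assumed PSD) matrix, i.e., a vector-vector product.

```python
# Illustrative coordinate-wise iteration for the leading eigenvector of a PSD matrix A:
# each step refreshes one coordinate of x using one row/column of A (a vector-vector
# product) while keeping the cached product z = A x in sync. This is only a sketch of
# the per-step cost, not the paper's algorithms or analysis.
import numpy as np

def coordinate_eigenvector(A, steps=20000, seed=0):
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    z = A @ x                                    # cached product z = A x
    for _ in range(steps):
        target = z / np.linalg.norm(z)           # where a full power step would move x
        i = int(np.argmax(np.abs(target - x)))   # greedily pick the most "stale" coordinate
        delta = target[i] - x[i]
        x[i] = target[i]                         # single-coordinate update
        z += A[:, i] * delta                     # keep z = A x consistent (one column of A)
        nrm = np.linalg.norm(x)                  # renormalize x (and rescale z to match)
        x /= nrm
        z /= nrm
    return x

# Example: compare against a full eigendecomposition
B = np.random.randn(60, 40)
A = B @ B.T / 40.0
v_hat = coordinate_eigenvector(A)
```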
no code implementations • 21 Feb 2017 • Chao Gao, Dan Garber, Nathan Srebro, Jialei Wang, Weiran Wang
We study the sample complexity of canonical correlation analysis (CCA), i.e., the number of samples needed to estimate the population canonical correlation and directions up to arbitrarily small error.
no code implementations • 26 May 2016 • Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford
We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $\Sigma$, i.e., computing a unit vector $x$ such that $x^T \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma)$. Offline Eigenvector Estimation: given an explicit $A \in \mathbb{R}^{n \times d}$ with $\Sigma = A^T A$, we show how to compute an $\epsilon$-approximate top eigenvector in time $\tilde{O}\left(\left[\mathrm{nnz}(A) + \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}\right] \log(1/\epsilon)\right)$ and $\tilde{O}\left(\frac{\mathrm{nnz}(A)^{3/4} \, (d \cdot \mathrm{sr}(A))^{1/4}}{\sqrt{\mathrm{gap}}} \log(1/\epsilon)\right)$.
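For context, the baseline these runtimes improve upon: plain power iteration applied to $\Sigma = A^T A$, costing one pass over $A$ per iteration, with an iteration count governed by the eigenvalue gap. The shift-and-invert acceleration behind the $\sqrt{\mathrm{gap}}$ dependence is not shown.

```python
# Baseline for comparison: plain power iteration on Sigma = A^T A, costing O(nnz(A)) per
# iteration, with the number of iterations governed by the eigenvalue gap. The
# shift-and-invert acceleration behind the sqrt(gap) dependence is not shown.
import numpy as np

def power_iteration(A, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    d = A.shape[1]
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    for _ in range(iters):
        x = A.T @ (A @ x)                        # apply Sigma = A^T A without forming it
        x /= np.linalg.norm(x)
    eig_estimate = np.linalg.norm(A @ x) ** 2    # x^T Sigma x for the unit vector x
    return x, eig_estimate

# Example usage
A = np.random.randn(1000, 50)
x_top, lam1_hat = power_iteration(A)
```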
no code implementations • NeurIPS 2016 • Dan Garber, Ofer Meshi
Moreover, when the optimal solution is sparse, the new convergence rate replaces a factor that is at least linear in the dimension in previous works with a dependence that is only linear in the number of non-zeros in the optimal solution.
no code implementations • NeurIPS 2016 • Dan Garber
Minimizing a convex function over the spectrahedron, i.e., the set of all positive semidefinite matrices with unit trace, is an important optimization task with many applications in optimization, machine learning, and signal processing.
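For reference, the standard projection-free template for this setting: Frank-Wolfe over the spectrahedron, where the linear minimization step is a single leading-eigenvector computation and every update is rank-one. This is a generic sketch, not the paper's improved method; in practice the eigenvector is computed approximately (e.g., by Lanczos).

```python
# Standard Frank-Wolfe over the spectrahedron (unit-trace PSD matrices): the linear
# minimization step reduces to an extreme eigenvector computation, and each update is
# rank-one. Generic sketch only, not the paper's improved method.
import numpy as np

def frank_wolfe_spectrahedron(grad_fn, d, iters=100):
    X = np.eye(d) / d                                   # feasible start: unit-trace PSD
    for t in range(1, iters + 1):
        G = grad_fn(X)
        # argmin over the spectrahedron of <G, V> is v v^T for the bottom eigenvector of G
        _, U = np.linalg.eigh((G + G.T) / 2.0)
        v = U[:, 0]                                     # eigenvector of the smallest eigenvalue
        gamma = 2.0 / (t + 2.0)
        X = (1.0 - gamma) * X + gamma * np.outer(v, v)  # rank-one Frank-Wolfe update
    return X

# Example: spectrahedron-constrained least-squares fit to a symmetric matrix B
B = np.random.randn(6, 6)
B = (B + B.T) / 2.0
X_hat = frank_wolfe_spectrahedron(lambda X: 2.0 * (X - B), d=6)
```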
no code implementations • NeurIPS 2016 • Weiran Wang, Jialei Wang, Dan Garber, Nathan Srebro
We study the stochastic optimization of canonical correlation analysis (CCA), whose objective is nonconvex and does not decouple over training samples.
no code implementations • 18 Sep 2015 • Dan Garber, Elad Hazan
The problem of principal component analysis (PCA) is traditionally solved by spectral or algebraic methods.
no code implementations • 5 Jun 2014 • Dan Garber, Elad Hazan
In this paper we consider the special case of optimization over strongly convex sets, for which we prove that the vanilla FW method converges at a rate of $\frac{1}{t^2}$.
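A sketch of the setting: vanilla Frank-Wolfe over a Euclidean ball, a canonical strongly convex set, where the linear minimization oracle has a closed form. The standard step-size schedule is used here for simplicity; the faster guarantee holds for the vanilla method under the conditions stated in the paper.

```python
# Vanilla Frank-Wolfe over a Euclidean ball of radius r (a canonical strongly convex set),
# where the linear minimization oracle is simply the boundary point opposite the gradient.
# The standard step-size schedule is used for simplicity; the paper's faster rate is for
# the vanilla method under its stated conditions.
import numpy as np

def frank_wolfe_ball(grad_fn, d, r=1.0, iters=500):
    x = np.zeros(d)
    for t in range(1, iters + 1):
        g = grad_fn(x)
        v = -r * g / (np.linalg.norm(g) + 1e-12)   # argmin_{||v|| <= r} <g, v>
        gamma = 2.0 / (t + 2.0)                    # standard schedule
        x = (1.0 - gamma) * x + gamma * v          # convex combination stays in the ball
    return x

# Example: minimize ||x - b||^2 over the unit ball for a target b outside the ball
b = np.array([2.0, 0.0, 0.0])
x_star = frank_wolfe_ball(lambda x: 2.0 * (x - b), d=3)
```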
no code implementations • 20 Jan 2013 • Dan Garber, Elad Hazan
In this computational model we give several new results that improve over the previous state-of-the-art.
no code implementations • NeurIPS 2011 • Dan Garber, Elad Hazan
In recent years semidefinite optimization has become a tool of major importance in various optimization and machine learning problems.