no code implementations • 1 Nov 2024 • Annie Marsden, Evan Dogariu, Naman Agarwal, Xinyi Chen, Daniel Suo, Elad Hazan
We consider the problem of length generalization in sequence prediction.
no code implementations • 2 Oct 2024 • Naman Agarwal, Xinyi Chen, Evan Dogariu, Vlad Feinberg, Daniel Suo, Peter Bartlett, Elad Hazan
We address the challenge of efficient auto-regressive generation in sequence prediction models by introducing FutureFill, a method for fast generation that applies to any sequence prediction algorithm based on convolutional operators.
1 code implementation • 16 Sep 2024 • Y. Isabel Liu, Windsor Nguyen, Yagiz Devre, Evan Dogariu, Anirudha Majumdar, Elad Hazan
This paper describes an efficient, open source PyTorch implementation of the Spectral Transform Unit.
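To illustrate the spectral-filtering idea underlying the Spectral Transform Unit, here is a minimal NumPy sketch, not code from the repository: the Hankel-matrix entry convention, the filter count `k`, and all function names are illustrative assumptions. It projects an input sequence onto the top eigenvectors of a fixed Hankel matrix via causal convolutions.

```python
import numpy as np

def spectral_filters(seq_len, k):
    """Top-k eigenvectors of a fixed Hankel matrix (assumed entry convention:
    Z[i, j] = 2 / ((i + j)^3 - (i + j)) with 1-based indices)."""
    idx = np.arange(1, seq_len + 1)
    s = idx[:, None] + idx[None, :]
    Z = 2.0 / (s ** 3 - s)
    eigvals, eigvecs = np.linalg.eigh(Z)          # ascending eigenvalues
    return eigvecs[:, -k:], eigvals[-k:]          # top-k filters and eigenvalues

def spectral_features(u, k=16):
    """Convolve an input sequence u of shape [T, d_in] with each spectral filter."""
    T = u.shape[0]
    phi, _ = spectral_filters(T, k)
    # Feature at time t for filter j: sum_{s <= t} phi[s, j] * u[t - s]
    feats = np.stack([[phi[: t + 1, j][::-1] @ u[: t + 1] for j in range(k)]
                      for t in range(T)])
    return feats                                   # shape [T, k, d_in]
```

A learned linear readout on these features would then produce the prediction; the sketch above only computes the fixed spectral projection.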
no code implementations • 3 Jun 2024 • Noah Golowich, Elad Hazan, Zhou Lu, Dhruv Rohatgi, Y. Jennifer Sun
The study of population dynamics originated with early sociological works but has since extended into many fields, including biology, epidemiology, evolutionary game theory, and economics.
no code implementations • 14 Feb 2024 • Arun Suggala, Y. Jennifer Sun, Praneeth Netrapalli, Elad Hazan
We show that our algorithm achieves optimal (in terms of horizon) regret bounds for a large class of convex functions that we call $\kappa$-convex.
no code implementations • 17 Jan 2024 • Zhou Lu, Qiuyi Zhang, Xinyi Chen, Fred Zhang, David Woodruff, Elad Hazan
In this paper, we give query and regret optimal bandit algorithms under the strict notion of strongly adaptive regret, which measures the maximum regret over any contiguous interval $I$.
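For reference, the strongly adaptive regret described here is typically written (in generic online-learning notation, not taken from the paper) as

$$\mathrm{SA\text{-}Regret}(T) \;=\; \max_{I=[s,t]\subseteq[T]} \Big( \sum_{\tau=s}^{t} f_\tau(x_\tau) \;-\; \min_{x\in\mathcal{K}} \sum_{\tau=s}^{t} f_\tau(x) \Big),$$

where $f_\tau$ are the loss functions, $x_\tau$ the algorithm's decisions, and $\mathcal{K}$ the decision set.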
no code implementations • 8 Jan 2024 • Wenhan Xia, Chengwei Qin, Elad Hazan
Fine-tuning is the primary methodology for tailoring pre-trained large language models to specific tasks.
2 code implementations • 11 Dec 2023 • Naman Agarwal, Daniel Suo, Xinyi Chen, Elad Hazan
This paper studies sequence modeling for prediction tasks with long range dependencies.
1 code implementation • 8 Dec 2023 • Xinyi Chen, Angelica Chen, Dean Foster, Elad Hazan
We give a novel efficient algorithm for simultaneous external and internal regret minimization whose regret depends logarithmically on the number of actions.
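In the standard formulation (generic notation, not the paper's), external regret compares to the best fixed action while internal regret compares to the best single action swap:

$$\mathrm{Regret}_{\mathrm{ext}} \;=\; \sum_{t=1}^{T}\ell_t(a_t) - \min_{a}\sum_{t=1}^{T}\ell_t(a), \qquad \mathrm{Regret}_{\mathrm{int}} \;=\; \max_{i,j}\sum_{t:\,a_t=i}\big(\ell_t(i)-\ell_t(j)\big).$$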
no code implementations • 21 Jul 2023 • Elad Hazan, Nimrod Megiddo
A new algorithm for regret minimization in online convex optimization is described.
no code implementations • 19 Jan 2023 • Xinyi Chen, Elad Hazan
Selecting the best hyperparameters for a particular optimization instance, such as the learning rate and momentum, is an important but nonconvex problem.
no code implementations • 22 Nov 2022 • Zhou Lu, Nataly Brukhim, Paula Gradu, Elad Hazan
The most common approach is based on the Frank-Wolfe method, that uses linear optimization computation in lieu of projections.
no code implementations • 21 Nov 2022 • Gautam Goel, Naman Agarwal, Karan Singh, Elad Hazan
We consider the fundamental problem of online control of a linear dynamical system from two different viewpoints: regret minimization and competitive analysis.
no code implementations • 17 Nov 2022 • Elad Hazan, Karan Singh
In online nonstochastic control, both the cost functions as well as the perturbations from the assumed dynamical model are chosen by an adversary.
no code implementations • NeurIPS 2023 • Elad Hazan, Adam Tauman Kalai, Varun Kanade, Clara Mohri, Y. Jennifer Sun
This work establishes a new framework of partial matrix completion, where the goal is to identify a large subset of the entries that can be completed with high confidence.
no code implementations • 1 Jul 2022 • Zhou Lu, Elad Hazan
In online convex optimization, the player aims to minimize regret, or the difference between her loss and that of the best fixed decision in hindsight over the entire repeated game.
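In symbols (standard online convex optimization notation, not specific to this paper), the regret referred to here is

$$\mathrm{Regret}(T) \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x).$$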
1 code implementation • 1 Jun 2022 • Xinyi Chen, Elad Hazan, Tongyang Li, Zhou Lu, Xinzhao Wang, Rui Yang
The problem of efficient quantum state learning, also called shadow tomography, aims to comprehend an unknown $d$-dimensional quantum state through POVMs.
no code implementations • 30 May 2022 • Udaya Ghai, Zhou Lu, Elad Hazan
We prove an $O(T^{\frac{2}{3}})$ regret bound for non-convex online gradient descent in this setting, answering this open problem.
no code implementations • 2 Mar 2022 • Zhou Lu, Wenhan Xia, Sanjeev Arora, Elad Hazan
Adaptive gradient methods are the method of choice for optimization in machine learning and are used to train the largest deep models.
no code implementations • NeurIPS 2021 • Edgar Minasyan, Paula Gradu, Max Simchowitz, Elad Hazan
On the positive side, we give an efficient algorithm that attains a sublinear regret bound against the class of Disturbance Response policies up to the aforementioned system variability term.
no code implementations • 28 Jan 2022 • Udaya Ghai, Udari Madhushani, Naomi Leonard, Elad Hazan
We study the problem of multi-agent control of a dynamical system with known dynamics and adversarial disturbances.
no code implementations • NeurIPS 2021 • Nataly Brukhim, Elad Hazan, Shay Moran, Indraneel Mukherjee, Robert E. Schapire
Here, we focus on an especially natural formulation in which the weak hypotheses are assumed to belong to an "easy-to-learn" base class, and the weak learner is an agnostic PAC learner for that class with respect to the standard classification loss.
no code implementations • 19 Nov 2021 • Daniel Suo, Cyril Zhang, Paula Gradu, Udaya Ghai, Xinyi Chen, Edgar Minasyan, Naman Agarwal, Karan Singh, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, Elad Hazan
Mechanical ventilation is one of the most widely used therapies in the ICU.
no code implementations • 15 Oct 2021 • Xinyi Chen, Edgar Minasyan, Jason D. Lee, Elad Hazan
The theory of deep learning focuses almost exclusively on supervised learning, non-convex optimization using stochastic gradient descent, and overparametrized neural networks.
no code implementations • 29 Sep 2021 • Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang
In the empirical science of training large neural networks, the learning rate schedule is a notoriously challenging-to-tune hyperparameter, which can depend on all other properties (architecture, optimizer, batch size, dataset, regularization, ...) of the problem.
no code implementations • 22 Aug 2021 • Nataly Brukhim, Elad Hazan, Karan Singh
Reducing reinforcement learning to supervised learning is a well-studied and effective approach that leverages the benefits of compact function approximation to deal with large-scale Markov decision processes.
no code implementations • 16 Jul 2021 • Xinyi Chen, Udaya Ghai, Elad Hazan, Alexandre Megretski
We study online control of an unknown nonlinear dynamical system that is approximated by a time-invariant linear system with model misspecification.
no code implementations • 26 Feb 2021 • Naman Agarwal, Elad Hazan, Anirudha Majumdar, Karan Singh
We consider the setting of iterative learning control, or model-based policy learning in the presence of uncertain, time-varying dynamics.
1 code implementation • 19 Feb 2021 • Paula Gradu, John Hallman, Daniel Suo, Alex Yu, Naman Agarwal, Udaya Ghai, Karan Singh, Cyril Zhang, Anirudha Majumdar, Elad Hazan
We present an open-source library of natively differentiable physics and robotics environments, accompanied by gradient-based control methods and a benchmarking suite.
no code implementations • 18 Feb 2021 • Elad Hazan, Karan Singh
In this access model, we give an efficient boosting algorithm that guarantees near-optimal regret against the convex hull of the base class.
2 code implementations • 12 Feb 2021 • Daniel Suo, Naman Agarwal, Wenhan Xia, Xinyi Chen, Udaya Ghai, Alexander Yu, Paula Gradu, Karan Singh, Cyril Zhang, Edgar Minasyan, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, Elad Hazan
We consider the problem of controlling an invasive mechanical ventilator for pressure-controlled ventilation: a controller must let air in and out of a sedated patient's lungs according to a trajectory of airway pressures specified by a clinician.
no code implementations • 12 Dec 2020 • Udaya Ghai, David Snyder, Anirudha Majumdar, Elad Hazan
We consider the problem of generating maximally adversarial disturbances for a given controller assuming only blackbox access to it.
no code implementations • NeurIPS 2020 • Orestis Plevrakis, Elad Hazan
We study the control of an unknown linear dynamical system under general convex costs.
no code implementations • NeurIPS 2020 • Paula Gradu, John Hallman, Elad Hazan
We study the problem of controlling a linear dynamical system with adversarial perturbations where the only feedback available to the controller is the scalar loss, and the loss function itself is unknown.
no code implementations • 23 Jul 2020 • Nataly Brukhim, Elad Hazan
We consider the problem of online boosting for regression tasks, when only limited information is available to the learner.
no code implementations • 13 Jul 2020 • Xinyi Chen, Elad Hazan
To complete the picture, we investigate the complexity of the online black-box control problem, and give a matching lower bound of $2^{\Omega(\mathcal{L})}$ on the regret, showing that the additional exponential cost is inevitable.
no code implementations • 8 Jul 2020 • Paula Gradu, Elad Hazan, Edgar Minasyan
Our main contribution is a novel efficient meta-algorithm: it converts a controller with sublinear regret bounds into one with sublinear adaptive regret bounds in the setting of time-varying linear dynamical systems.
no code implementations • NeurIPS 2020 • Nataly Brukhim, Xinyi Chen, Elad Hazan, Shay Moran
Boosting is a widely used machine learning approach based on the idea of aggregating weak learning rules.
1 code implementation • 26 Feb 2020 • Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang
We investigate several confounding factors in the evaluation of optimization algorithms for deep learning.
1 code implementation • 31 Jan 2020 • Noga Alon, Alon Gonen, Elad Hazan, Shay Moran
(ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class?
no code implementations • 30 Jan 2020 • Elad Hazan, Edgar Minasyan
In many online learning problems the computational bottleneck for gradient-based methods is the projection operation.
no code implementations • 25 Jan 2020 • Max Simchowitz, Karan Singh, Elad Hazan
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states, known as non-stochastic control.
no code implementations • ICLR 2020 • Naman Agarwal, Rohan Anil, Elad Hazan, Tomer Koren, Cyril Zhang
A commonplace belief in the machine learning community is that using adaptive gradient methods hurts generalization.
no code implementations • 27 Nov 2019 • Elad Hazan, Sham M. Kakade, Karan Singh
We consider the problem of controlling an unknown linear dynamical system in the presence of (nonstochastic) adversarial perturbations and adversarial convex loss functions.
no code implementations • 6 Nov 2019 • Mark Braverman, Elad Hazan, Max Simchowitz, Blake Woodworth
We investigate the computational complexity of several basic linear algebra primitives, including largest eigenvector computation and linear regression, in the computational model that allows access to the data via a matrix-vector product oracle.
no code implementations • NeurIPS 2019 • Naman Agarwal, Elad Hazan, Karan Singh
We study optimal regret bounds for control in linear dynamical systems under adversarially changing strongly convex cost functions, given the knowledge of transition dynamics.
no code implementations • 8 Sep 2019 • Elad Hazan
Lecture notes on optimization for machine learning, derived from a course at Princeton University and from tutorials given at MLSS Buenos Aires and at the Simons Foundation, Berkeley.
1 code implementation • 7 Sep 2019 • Elad Hazan
This manuscript portrays optimization as a process.
no code implementations • ICML 2020 • Naman Agarwal, Nataly Brukhim, Elad Hazan, Zhou Lu
We study the question of how to aggregate controllers for dynamical systems in order to improve their performance.
no code implementations • NeurIPS 2019 • Alon Gonen, Elad Hazan, Shay Moran
We study the relationship between the notions of differentially private learning and online learning in games.
no code implementations • 23 Feb 2019 • Naman Agarwal, Brian Bullins, Elad Hazan, Sham M. Kakade, Karan Singh
We study the control of a linear dynamical system with adversarial disturbances (as opposed to statistical noise).
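A common policy class in this line of work on control with adversarial disturbances, sketched here in generic notation (the memory length $H$ and the stabilizing gain $K$ are illustrative assumptions), chooses the control as a linear function of past disturbances on top of a stabilizing linear controller:

$$u_t \;=\; -K x_t \;+\; \sum_{i=1}^{H} M^{[i]} w_{t-i},$$

where the disturbances $w_{t-i}$ are recovered from the observed dynamics and the matrices $M^{[1]},\dots,M^{[H]}$ are learned online, e.g. by online gradient methods.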
no code implementations • ICLR 2020 • Xinyi Chen, Naman Agarwal, Elad Hazan, Cyril Zhang, Yi Zhang
State-of-the-art models are now trained with billions of parameters, reaching hardware limits in terms of memory consumption.
no code implementations • 5 Feb 2019 • Udaya Ghai, Elad Hazan, Yoram Singer
The hypentropy has a natural spectral counterpart which we use to derive a family of matrix-based updates that bridge gradient methods and the multiplicative method for matrices.
2 code implementations • 6 Dec 2018 • Elad Hazan, Sham M. Kakade, Karan Singh, Abby Van Soest
Suppose an agent is in a (possibly unknown) Markov Decision Process in the absence of a reward signal: what might we hope that an agent can efficiently learn to do?
no code implementations • 17 Oct 2018 • Naman Agarwal, Alon Gonen, Elad Hazan
We consider online learning in an adversarial, non-convex setting under the assumption that the learner has access to an offline optimization oracle.
no code implementations • ICLR 2019 • Naman Agarwal, Brian Bullins, Xinyi Chen, Elad Hazan, Karan Singh, Cyril Zhang, Yi Zhang
Due to the large number of parameters of machine learning problems, full-matrix preconditioning methods are prohibitively expensive.
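One way to make full-matrix preconditioning tractable, sketched below as an illustration rather than the paper's algorithm, is to precondition only within the span of a small window of $r$ recent gradients via an SVD of the $d \times r$ gradient buffer; the window size and epsilon are illustrative assumptions.

```python
import numpy as np

def windowed_precondition(grad_buffer, g, eps=1e-4):
    """Approximate full-matrix preconditioning restricted to a gradient window.

    grad_buffer: d x r matrix whose columns are the r most recent gradients G.
    g:           current gradient, shape (d,).
    Rescales g by the inverse square root of G G^T within span(G);
    the component orthogonal to the window is left unpreconditioned.
    """
    U, S, _ = np.linalg.svd(grad_buffer, full_matrices=False)  # U: d x r, S: (r,)
    coeffs = U.T @ g                                            # components in span(G)
    in_span = U @ (coeffs / (S + eps))                          # (G G^T)^{-1/2} g on span(G)
    out_of_span = g - U @ coeffs                                # residual component
    return in_span + out_of_span
```

The per-step cost is roughly O(d r^2) for the SVD of the buffer, instead of the O(d^2) or worse needed to maintain an exact full-matrix preconditioner.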
no code implementations • NeurIPS 2018 • Elad Hazan, Wei Hu, Yuanzhi Li, Zhiyuan Li
We revisit the question of reducing online learning to approximate optimization of the offline problem.
no code implementations • NeurIPS 2018 • Scott Aaronson, Xinyi Chen, Elad Hazan, Satyen Kale, Ashwin Nayak
Even in the "non-realizable" setting---where there could be arbitrary noise in the measurement outcomes---we show how to output hypothesis states that do significantly worse than the best possible states at most $\operatorname{O}\!\left(\sqrt {Tn}\right) $ times on the first $T$ measurements.
1 code implementation • ICML 2018 • Sanjeev Arora, Nadav Cohen, Elad Hazan
The effect of depth on optimization is decoupled from expressiveness by focusing on settings where additional layers amount to overparameterization: linear neural networks, a well-studied model.
no code implementations • NeurIPS 2018 • Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
We give a polynomial-time algorithm for learning latent-state linear dynamical systems without system identification, and without assumptions on the spectral radius of the system's transition matrix.
no code implementations • ICLR 2018 • Sanjeev Arora, Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
We study the control of symmetric linear dynamical systems with unknown dynamics and a hidden state.
1 code implementation • NeurIPS 2017 • Elad Hazan, Karan Singh, Cyril Zhang
We present an efficient and practical algorithm for the online prediction of discrete-time linear dynamical systems with a symmetric transition matrix.
no code implementations • 27 Oct 2017 • Naman Agarwal, Elad Hazan
State-of-the-art methods in convex and non-convex optimization employ higher-order derivative information, either implicitly or explicitly.
no code implementations • NeurIPS 2017 • Zeyuan Allen-Zhu, Elad Hazan, Wei Hu, Yuanzhi Li
We propose a rank-$k$ variant of the classical Frank-Wolfe algorithm to solve convex optimization over a trace-norm ball.
no code implementations • ICML 2017 • Elad Hazan, Karan Singh, Cyril Zhang
We consider regret minimization in repeated games with non-convex loss functions.
1 code implementation • ICLR 2018 • Elad Hazan, Adam Klivans, Yang Yuan
In particular, we obtain the first quasi-polynomial time algorithm for learning noisy decision trees with polynomial sample complexity.
no code implementations • NeurIPS 2016 • Brian Bullins, Elad Hazan, Tomer Koren
We study regression and classification in a setting where the learning algorithm is allowed to access only a limited number of attributes per example, known as the limited attribute observation model.
1 code implementation • 3 Nov 2016 • Naman Agarwal, Zeyuan Allen-Zhu, Brian Bullins, Elad Hazan, Tengyu Ma
We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of training examples.
no code implementations • NeurIPS 2016 • Elad Hazan, Tengyu Ma
We give a novel formal theoretical framework for unsupervised learning with two distinctive characteristics.
no code implementations • 26 May 2016 • Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford
We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $\Sigma$, i.e. computing a unit vector $x$ such that $x^\top \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma)$. Offline eigenvector estimation: given an explicit $A \in \mathbb{R}^{n \times d}$ with $\Sigma = A^\top A$, we show how to compute an $\epsilon$-approximate top eigenvector in time $\tilde O\big(\big[\mathrm{nnz}(A) + \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}\big] \cdot \log(1/\epsilon)\big)$ and $\tilde O\big(\frac{\mathrm{nnz}(A)^{3/4} (d \cdot \mathrm{sr}(A))^{1/4}}{\sqrt{\mathrm{gap}}} \cdot \log(1/\epsilon)\big)$.
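As a point of reference for the guarantee $x^\top \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma)$, the sketch below is a plain power-iteration baseline, not the paper's accelerated method; its iteration count scales roughly like $\frac{1}{\mathrm{gap}}\log\frac{d}{\epsilon}$ rather than the improved rates above.

```python
import numpy as np

def power_iteration_top_eigvec(Sigma, iters=500, seed=0):
    """Baseline power iteration: returns a unit vector approximating the
    top eigenvector of a symmetric PSD matrix Sigma."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(Sigma.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        x = Sigma @ x
        x /= np.linalg.norm(x)
    return x

# The Rayleigh quotient x @ Sigma @ x approaches lambda_1(Sigma) as iters grows.
```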
no code implementations • 21 Mar 2016 • Elad Hazan, Tomer Koren, Roi Livni, Yishay Mansour
We consider the problem of prediction with expert advice when the losses of the experts have low-dimensional structure: they are restricted to an unknown $d$-dimensional subspace.
no code implementations • NeurIPS 2016 • Zeyuan Allen-Zhu, Elad Hazan
The diverse world of machine learning applications has given rise to a plethora of algorithms and optimization methods, finely tuned to the specific regression or classification task at hand.
no code implementations • 17 Mar 2016 • Zeyuan Allen-Zhu, Elad Hazan
We consider the fundamental problem in non-convex optimization of efficiently reaching a stationary point.
no code implementations • 14 Mar 2016 • Elad Hazan, Yuanzhi Li
We consider the problem of online convex optimization against an arbitrary adversary with bandit feedback, known as bandit convex optimization.
4 code implementations • 12 Feb 2016 • Naman Agarwal, Brian Bullins, Elad Hazan
First-order stochastic methods are the state-of-the-art in large-scale machine learning optimization owing to efficient per-iteration complexity.
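A central primitive in second-order stochastic methods of this kind is estimating a Hessian-inverse-vector product without ever forming the Hessian. Below is a minimal sketch of the Neumann-series estimator often used for this; the scaling assumption and iteration count are illustrative, and the actual methods rely on stochastic Hessian-vector products rather than exact ones.

```python
def neumann_hessian_inverse_vec(hvp, g, steps=100):
    """Estimate H^{-1} g via the Neumann series H^{-1} = sum_i (I - H)^i,
    valid when the eigenvalues of H lie in (0, 1) (rescale H otherwise).

    hvp: callable v -> H @ v (a Hessian-vector product oracle).
    g:   vector to precondition.
    """
    v = g.copy()
    for _ in range(steps):
        v = g + v - hvp(v)     # v <- g + (I - H) v
    return v
```

At the fixed point $v = g + (I - H)v$, we have $Hv = g$, so $v = H^{-1}g$; the recursion converges whenever $\|I - H\| < 1$.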
no code implementations • 5 Feb 2016 • Elad Hazan, Haipeng Luo
The Frank-Wolfe optimization algorithm has recently regained popularity for machine learning applications due to its projection-free property and its ability to handle structured constraints.
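For readers unfamiliar with the method, here is a minimal projection-free Frank-Wolfe sketch over the probability simplex; the domain and step-size schedule are the textbook choices, not specifics of this paper. Each step solves a linear problem over the constraint set instead of projecting.

```python
import numpy as np

def frank_wolfe_simplex(grad_fn, dim, iters=200):
    """Minimize a smooth convex f over the probability simplex without projections."""
    x = np.full(dim, 1.0 / dim)                 # feasible start
    for t in range(1, iters + 1):
        g = grad_fn(x)
        v = np.zeros(dim)
        v[np.argmin(g)] = 1.0                   # linear minimization over the simplex
        eta = 2.0 / (t + 2)                     # standard step size
        x = (1 - eta) * x + eta * v             # convex combination stays feasible
    return x

# Example: minimize ||x - y||^2 over the simplex for a target y:
#   x_star = frank_wolfe_simplex(lambda x: 2 * (x - y), dim=len(y))
```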
no code implementations • NeurIPS 2015 • Oren Anava, Elad Hazan, Shie Mannor
In this work we extend the notion of learning with memory to the general Online Convex Optimization (OCO) framework, and present two algorithms that attain low regret.
no code implementations • 18 Sep 2015 • Dan Garber, Elad Hazan
The problem of principal component analysis (PCA) is traditionally solved by spectral or algebraic methods.
no code implementations • 9 Jul 2015 • Jacob Abernethy, Elad Hazan
We show that simulated annealing, a well-studied random walk algorithm, is directly equivalent, in a certain sense, to the central path interior-point algorithm for the entropic universal barrier function.
no code implementations • NeurIPS 2015 • Elad Hazan, Kfir. Y. Levy, Shai Shalev-Shwartz
The Normalized Gradient Descent (NGD) algorithm is an adaptation of Gradient Descent that updates according to the direction of the gradients rather than the gradients themselves.
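Concretely, the normalized update takes a step of fixed length along the gradient direction (standard notation, with $\eta_t$ the step size):

$$x_{t+1} \;=\; x_t \;-\; \eta_t\,\frac{\nabla f(x_t)}{\|\nabla f(x_t)\|}.$$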
no code implementations • NeurIPS 2015 • Alina Beygelzimer, Elad Hazan, Satyen Kale, Haipeng Luo
We extend the theory of boosting for regression problems to the online learning setting.
no code implementations • 8 Apr 2015 • Elad Hazan, Tomer Koren
We also give a lower bound showing that this running time cannot be improved (up to log factors) in the oracle model, thereby exhibiting a quadratic speedup as compared to the standard, oracle-free setting where the required time for vanishing regret is $\widetilde{\Theta}(N)$.
1 code implementation • 12 Mar 2015 • Elad Hazan, Kfir. Y. Levy, Shai Shalev-Shwartz
We extend our algorithm and analysis to the setting of stochastic non-convex optimization with noisy gradient feedback, attaining the same convergence rate.
no code implementations • 14 Jan 2015 • Elad Hazan, Roi Livni, Yishay Mansour
We consider classification and regression tasks where we have missing data and assume that the (clean) data resides in a low rank subspace.
no code implementations • NeurIPS 2014 • Ofer Dekel, Elad Hazan, Tomer Koren
We study an online learning setting where the player is temporarily deprived of feedback each time it switches to a different action.
no code implementations • NeurIPS 2014 • Elad Hazan, Kfir Levy
Bandit Convex Optimization (BCO) is a fundamental framework for decision making under uncertainty, which generalizes many problems from the realm of online and statistical learning.
no code implementations • 5 Jun 2014 • Dan Garber, Elad Hazan
In this paper we consider the special case of optimization over strongly convex sets, for which we prove that the vanilla FW method converges at a rate of $\frac{1}{t^2}$.
no code implementations • 15 May 2014 • Elad Hazan, Tomer Koren, Kfir. Y. Levy
We show that in contrast to known asymptotic bounds, as long as the number of prediction/optimization iterations is sub-exponential, the logistic loss provides no improvement over a generic non-smooth loss function such as the hinge loss.
no code implementations • 25 Feb 2014 • Aharon Ben-Tal, Elad Hazan, Tomer Koren, Shie Mannor
Robust optimization is a common framework in optimization under uncertainty when the problem parameters are not known exactly, but are known to belong to some given uncertainty set.
no code implementations • 21 Dec 2013 • Elad Hazan, Zohar Karnin, Raghu Meka
Numerous machine learning problems require an exploration basis: a mechanism to explore the action space.
no code implementations • 27 Feb 2013 • Oren Anava, Elad Hazan, Shie Mannor
The framework of online learning with memory naturally captures learning problems with temporal constraints, and was previously studied for the experts setting.
no code implementations • 20 Jan 2013 • Dan Garber, Elad Hazan
In this computational model we give several new results that improve over the previous state-of-the-art.
no code implementations • NeurIPS 2012 • Elad Hazan, Zohar Karnin
We present a simplex algorithm for linear programming in a linear classification formulation.
no code implementations • NeurIPS 2011 • Elad Hazan, Satyen Kale
We prove that the regret of NEWTRON is $O(\log T)$ when $\alpha$ is a constant that does not vary with the horizon $T$, and at most $O(T^{2/3})$ if $\alpha$ is allowed to increase to infinity with $T$.
no code implementations • NeurIPS 2011 • Dan Garber, Elad Hazan
In recent years semidefinite optimization has become a tool of major importance in various optimization and machine learning problems.
no code implementations • NeurIPS 2011 • Elad Hazan, Tomer Koren, Nati Srebro
We present an optimization approach for linear SVMs based on a stochastic primal-dual approach, where the primal step is akin to an importance-weighted SGD, and the dual step is a stochastic update on the importance weights.
no code implementations • NeurIPS 2009 • Elad Hazan, Satyen Kale
We consider an online decision problem over a discrete space in which the loss function is submodular.
no code implementations • NeurIPS 2009 • Elad Hazan, Satyen Kale
In practice, most investing is done assuming a probabilistic model of stock price returns known as the Geometric Brownian Motion (GBM).
no code implementations • NeurIPS 2007 • Elad Hazan, Satyen Kale
We study the relation between notions of game-theoretic equilibria which are based on stability under a set of deviations, and empirical equilibria which are reached by rational players.