no code implementations • 7 Jul 2023 • Arvind Mahankali, Tatsunori B. Hashimoto, Tengyu Ma
Then, we find that changing the distribution of the covariates and weight vector to a non-isotropic Gaussian distribution has a strong impact on the learned algorithm: the global minimizer of the pre-training loss now implements a single step of $\textit{pre-conditioned}$ GD.
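To make "a single step of pre-conditioned GD" concrete, here is a minimal sketch for a least-squares objective, started from a zero weight vector. The function name `preconditioned_gd_step`, the matrix `precond`, the learning rate `lr`, and the toy data are all illustrative assumptions; they are not taken from the paper, and the sketch does not reproduce the specific pre-conditioner the paper derives.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def preconditioned_gd_step(X, y, w0, precond, lr):
    """One pre-conditioned GD step on L(w) = (1/2n) * sum_i (w . x_i - y_i)^2.

    Update rule: w1 = w0 - lr * precond @ grad L(w0).
    `precond` is a hypothetical pre-conditioning matrix chosen by the caller;
    with the identity matrix this reduces to a plain GD step.
    """
    n = len(X)
    d = len(w0)
    # Residuals w0 . x_i - y_i for each example.
    residuals = [sum(wj * xj for wj, xj in zip(w0, x)) - yi for x, yi in zip(X, y)]
    # Gradient of the averaged squared loss at w0.
    grad = [sum(r * x[j] for r, x in zip(residuals, X)) / n for j in range(d)]
    # Apply the pre-conditioner, then take the step.
    direction = matvec(precond, grad)
    return [wj - lr * dj for wj, dj in zip(w0, direction)]


# Toy data (assumed for illustration only): two examples in two dimensions.
X = [[1.0, 0.0], [0.0, 2.0]]
y = [1.0, 2.0]
w0 = [0.0, 0.0]
precond = [[1.0, 0.0], [0.0, 0.25]]  # hypothetical diagonal pre-conditioner

w1 = preconditioned_gd_step(X, y, w0, precond, lr=1.0)
x_query = [2.0, 2.0]
prediction = sum(wj * xj for wj, xj in zip(w1, x_query))
```

In this toy case the pre-conditioner rescales the second coordinate of the gradient, so the step treats the two differently scaled covariate directions unequally, which is the qualitative difference from plain GD.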
no code implementations • NeurIPS 2021 • Arvind Mahankali, David Woodruff
For linear classification, we improve upon the algorithm of Tai et al. (2018), which solves the $\ell_1$ point query problem on the optimal weight vector $w_* \in \mathbb{R}^d$ in sublinear space.
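For readers unfamiliar with the term, the $\ell_1$ point query guarantee is usually stated as follows. This is the standard formulation from the sketching literature, given here as an assumption about what the abstract means; the accuracy parameter $\epsilon$ and the estimate $\hat{w}$ are notation introduced for illustration, and the paper may define them differently.

$$
\left| \hat{w}_i - (w_*)_i \right| \le \epsilon \, \| w_* \|_1 \quad \text{for each queried coordinate } i \in \{1, \dots, d\},
$$

where $\hat{w}$ is recovered from a summary that uses space sublinear in $d$.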