no code implementations • 11 Apr 2024 • Tanmay Gautam, Youngsuk Park, Hao Zhou, Parameswaran Raman, Wooseok Ha
Evaluated across a range of masked and autoregressive LMs on benchmark GLUE tasks, MeZO-SVRG outperforms MeZO, achieving up to a 20% increase in test accuracy in both full- and partial-parameter fine-tuning settings.
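The excerpt above reports only results; for context, below is a minimal NumPy sketch of the two ingredients the method's name refers to: a two-point zeroth-order (SPSA/MeZO-style) gradient estimate combined with an SVRG-style control variate. The function names, the single shared perturbation direction, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def spsa_grad(loss, theta, z, eps=1e-3):
    """Two-point zeroth-order (SPSA) gradient estimate along direction z."""
    return (loss(theta + eps * z) - loss(theta - eps * z)) / (2 * eps) * z

def mezo_svrg_step(loss_batch, theta, anchor, g_anchor, lr=1e-2, rng=None):
    """One variance-reduced zeroth-order step: the minibatch estimate at theta
    is corrected by the minibatch estimate at a fixed anchor point plus the
    (periodically recomputed) full-batch estimate g_anchor at that anchor."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(theta.shape)      # shared perturbation direction
    g = (spsa_grad(loss_batch, theta, z)
         - spsa_grad(loss_batch, anchor, z)
         + g_anchor)                          # SVRG-style control variate
    return theta - lr * g
```

In an outer loop one would periodically refresh `anchor` and `g_anchor` from the full dataset, exactly as in first-order SVRG, but with the zeroth-order estimator substituted for the true gradient.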
no code implementations • 19 Sep 2023 • Keru Wu, Yuansi Chen, Wooseok Ha, Bin Yu
Domain adaptation (DA) is a statistical learning problem that arises when the distribution of the source data used to train a model differs from that of the target data used to evaluate the model.
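As one concrete (and deliberately simple) instance of this source/target mismatch, the sketch below simulates covariate shift with known densities and fits an importance-weighted least-squares model; the distributions, the shared labeling function, and the reweighting estimator are all illustrative assumptions, not the paper's general setting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Source and target inputs come from different distributions (covariate shift),
# while the regression function is shared.
x_src = rng.normal(0.0, 1.0, 500)            # source inputs ~ N(0, 1)
x_tgt = rng.normal(1.5, 1.0, 500)            # target inputs ~ N(1.5, 1)
f = lambda x: np.sin(x)                      # shared labeling function
y_src = f(x_src) + 0.1 * rng.standard_normal(500)

# Importance weights w(x) = p_tgt(x) / p_src(x) make the weighted source
# empirical risk an unbiased estimate of the target risk.
w = np.exp(-0.5 * ((x_src - 1.5) ** 2 - x_src ** 2))

# Weighted least-squares line fit on source data, evaluated on target data.
A_src = np.stack([x_src, np.ones_like(x_src)], axis=1)
A_tgt = np.stack([x_tgt, np.ones_like(x_tgt)], axis=1)
coef = np.linalg.solve(A_src.T @ (w[:, None] * A_src), A_src.T @ (w * y_src))
print("target MSE:", np.mean((A_tgt @ coef - f(x_tgt)) ** 2))
```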
no code implementations • 6 Aug 2023 • Nikhil Ghosh, Spencer Frei, Wooseok Ha, Bin Yu
On the other hand, for any batch size strictly smaller than the number of samples, SGD finds a global minimum which is sparse and nearly orthogonal to its initialization, showing that the randomness of stochastic gradients induces a qualitatively different type of "feature selection" in this setting.
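As a toy numerical illustration of this batch-size effect, the sketch below trains a single-neuron autoencoder with full-batch gradient descent versus batch-size-1 SGD and then measures weight sparsity and alignment with the initialization. The model, data, and hyperparameters are assumptions chosen for illustration and need not match the paper's exact analyzed setting.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 20, 100                        # fewer samples than ambient dimension
X = rng.standard_normal((n, d))

def grad(w, Xb):
    """Gradient of the single-neuron autoencoder loss (1/2)||x - w (w^T x)||^2."""
    s = Xb @ w                        # neuron activations, shape (b,)
    E = s[:, None] * w - Xb           # reconstruction residuals, shape (b, d)
    return (s[:, None] * E + (E @ w)[:, None] * Xb).mean(axis=0)

def train(batch_size, steps=20000, lr=0.01):
    w = w0.copy()
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        w -= lr * grad(w, X[idx])
    return w

w0 = 0.01 * rng.standard_normal(d)
w_gd = train(batch_size=n)            # full-batch gradient descent
w_sgd = train(batch_size=1)           # stochastic gradient descent

sparsity = lambda w: np.mean(np.abs(w) < 1e-3 * np.abs(w).max())
align = lambda w: abs(w @ w0) / (np.linalg.norm(w) * np.linalg.norm(w0))
print("GD : sparsity %.2f, alignment with init %.2f" % (sparsity(w_gd), align(w_gd)))
print("SGD: sparsity %.2f, alignment with init %.2f" % (sparsity(w_sgd), align(w_sgd)))
```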
4 code implementations • 16 Aug 2021 • Chandan Singh, Wooseok Ha, Bin Yu
Recent deep-learning models have achieved impressive predictive performance by learning complex functions of many variables, often at the cost of interpretability.
2 code implementations • NeurIPS 2021 • Wooseok Ha, Chandan Singh, Francois Lanusse, Srigokul Upadhyayula, Bin Yu
Moreover, interpretable models are concise and often yield computational efficiency.
2 code implementations • 4 Mar 2020 • Chandan Singh, Wooseok Ha, Francois Lanusse, Vanessa Boehm, Jia Liu, Bin Yu
Machine learning lies at the heart of new possibilities for scientific discovery, knowledge generation, and artificial intelligence.
no code implementations • 11 Jun 2019 • Wooseok Ha, Kimon Fountoulakis, Michael W. Mahoney
In this paper, we adopt a statistical perspective on local graph clustering, and we analyze the performance of the l1-regularized PageRank method (Fountoulakis et al.).
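For context, the method referenced here solves an l1-regularized variational formulation of personalized PageRank. Below is a minimal proximal-gradient (ISTA) sketch of an objective of that kind; the specific matrix Q, the scalings, and the parameter values are illustrative assumptions and differ in detail from the exact formulation in Fountoulakis et al.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def l1_reg_pagerank(A, seed, alpha=0.15, rho=1e-4, iters=1000):
    """ISTA for  min_x (1/2) x^T Q x - alpha s^T x + rho ||x||_1,  x >= 0.
    Q below is an illustrative symmetric PSD choice built from the normalized
    Laplacian; the paper's exact Q and scalings differ in detail."""
    d = A.sum(axis=1)
    Dm12 = np.diag(d ** -0.5)
    L_sym = np.eye(len(d)) - Dm12 @ A @ Dm12          # normalized Laplacian
    Q = alpha * np.eye(len(d)) + 0.5 * (1 - alpha) * L_sym
    s = np.zeros(len(d)); s[seed] = 1.0               # seed-node indicator
    step = 1.0 / np.linalg.eigvalsh(Q).max()
    x = np.zeros(len(d))
    for _ in range(iters):
        x = soft_threshold(x - step * (Q @ x - alpha * s), step * rho)
        x = np.maximum(x, 0.0)                        # PageRank mass is nonnegative
    return x                                          # support ~ recovered local cluster
```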
no code implementations • 2 Dec 2018 • Wooseok Ha, Haoyang Liu, Rina Foygel Barber
Two common approaches to low-rank optimization problems are to work directly with a rank constraint on the matrix variable, or to optimize over a low-rank factorization so that the rank constraint is enforced implicitly (the sketch below contrasts the two).
Optimization and Control
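The following minimal NumPy sketch contrasts the two approaches from the abstract above on a toy matrix-completion objective; the objective, step sizes, and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 30, 20, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, d))   # rank-r target
mask = rng.random((n, d)) < 0.5                                  # observed entries

def loss_grad(X):
    """Gradient of (1/2)||P_mask(X - M)||_F^2 with respect to X."""
    return mask * (X - M)

# (a) Rank-constrained: projected gradient, truncated SVD after each step.
X = np.zeros((n, d))
for _ in range(300):
    U, s, Vt = np.linalg.svd(X - 0.5 * loss_grad(X), full_matrices=False)
    X = (U[:, :r] * s[:r]) @ Vt[:r]           # project onto {rank <= r}

# (b) Factorized: gradient descent on X = U V^T; the rank bound is implicit.
U = 0.1 * rng.standard_normal((n, r)); V = 0.1 * rng.standard_normal((d, r))
for _ in range(2000):
    G = loss_grad(U @ V.T)
    U, V = U - 0.1 * (G @ V), V - 0.1 * (G.T @ U)

print("projected-gradient residual:", np.linalg.norm(mask * (X - M)))
print("factorized residual:        ", np.linalg.norm(mask * (U @ V.T - M)))
```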
no code implementations • 13 Sep 2017 • Wooseok Ha, Rina Foygel Barber
We analyze the performance of alternating minimization for loss functions optimized over two variables, where each variable may be restricted to lie in some potentially nonconvex constraint set.
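As a toy instance of this setup, the sketch below alternates exact least-squares updates for a rank-one factorization, each followed by projection onto an example constraint set (the nonnegative cone). The objective and constraint sets are illustrative assumptions; the paper's analysis covers more general losses and possibly nonconvex constraints.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 25
M = np.outer(rng.random(n), rng.random(d)) + 0.01 * rng.standard_normal((n, d))

proj_u = lambda u: np.maximum(u, 0.0)   # example constraint set: nonnegative cone
proj_v = lambda v: np.maximum(v, 0.0)

u, v = rng.random(n), rng.random(d)
for _ in range(100):
    # Minimize ||M - u v^T||_F^2 over u with v fixed (least squares), then project.
    u = proj_u(M @ v / (v @ v))
    # Minimize over v with u fixed, then project.
    v = proj_v(M.T @ u / (u @ u))

print("residual:", np.linalg.norm(M - np.outer(u, v)))
```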
no code implementations • NeurIPS 2015 • Wooseok Ha, Rina Foygel Barber
The robust principal component analysis (RPCA) problem seeks to separate low-rank trends from sparse outliers within a data matrix, that is, to approximate an $n\times d$ matrix $D$ as the sum of a low-rank matrix $L$ and a sparse matrix $S$. We examine RPCA under data compression, where the data $Y$ is approximately given by $(L + S)\cdot C$, that is, a low-rank $+$ sparse data matrix that has been compressed to size $n\times m$ (with $m$ substantially smaller than the original dimension $d$) via multiplication with a compression matrix $C$.
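One natural first-order scheme for this compressed model alternates proximal-gradient updates on $L$ and $S$; the sketch below is an illustration of that idea, not necessarily the algorithm analyzed in the paper, and the rank level, penalty weight, and step size are all assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def svd_truncate(X, r):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def compressed_rpca(Y, C, r, lam=0.1, iters=500):
    """Alternating proximal-gradient sketch for
       min_{L,S} (1/2)||Y - (L + S) C||_F^2 + lam ||S||_1   s.t. rank(L) <= r,
    where Y is n x m and C is the d x m compression matrix."""
    n, d = Y.shape[0], C.shape[0]
    step = 1.0 / np.linalg.norm(C @ C.T, 2)       # gradient Lipschitz constant
    L, S = np.zeros((n, d)), np.zeros((n, d))
    for _ in range(iters):
        G = ((L + S) @ C - Y) @ C.T               # gradient of the fit term in L
        L = svd_truncate(L - step * G, r)         # rank-r projection step
        G = ((L + S) @ C - Y) @ C.T               # recompute after updating L
        S = soft_threshold(S - step * G, step * lam)  # sparsity-inducing prox
    return L, S
```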