no code implementations • 16 Mar 2024 • Tingting Tang, Yue Niu, Salman Avestimehr, Murali Annavaram
Eclipse adds noise to the low-rank singular values instead of the entire graph, thereby preserving graph privacy while retaining enough of the graph structure to maintain model utility.
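A minimal sketch of the idea, assuming the graph is given as a dense adjacency matrix; the function name, rank, and noise scale below are illustrative and not the paper's calibrated mechanism:

```python
import numpy as np

def perturb_low_rank(adj, rank=16, noise_scale=0.1, seed=0):
    """Illustrative sketch: inject noise into the top singular values of the
    adjacency matrix rather than into every edge (names/scale are hypothetical)."""
    rng = np.random.default_rng(seed)
    # Low-rank decomposition of the adjacency matrix.
    U, S, Vt = np.linalg.svd(adj, full_matrices=False)
    U_k, S_k, Vt_k = U[:, :rank], S[:rank], Vt[:rank, :]
    # Noise goes into k singular values instead of n*n edge entries.
    S_noisy = S_k + rng.normal(scale=noise_scale, size=S_k.shape)
    # Reconstruct a perturbed low-rank view of the graph for downstream training.
    return U_k @ np.diag(S_noisy) @ Vt_k

adj = (np.random.rand(100, 100) < 0.05).astype(float)  # toy random graph
adj_priv = perturb_low_rank(adj)
```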
no code implementations • 13 Mar 2024 • Lei Gao, Yue Niu, Tingting Tang, Salman Avestimehr, Murali Annavaram
Evaluations show that Ethos is more effective than current task arithmetic methods at removing undesired knowledge while maintaining overall model performance.
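For context, a bare-bones sketch of the task-arithmetic negation baseline that Ethos is compared against; the parameter names and scaling factor are hypothetical:

```python
import torch

def negate_task(pretrained, finetuned, alpha=1.0):
    """Plain task-arithmetic negation (the baseline family, not Ethos itself):
    subtract the task vector (finetuned - pretrained) to suppress that task."""
    return {name: pretrained[name] - alpha * (finetuned[name] - pretrained[name])
            for name in pretrained}

# Usage with two state_dicts of the same architecture (hypothetical example):
# edited = negate_task(base_model.state_dict(), toxic_model.state_dict(), alpha=0.5)
```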
no code implementations • 1 Mar 2024 • Yue Niu, Saurav Prakash, Salman Avestimehr
In particular, ATP barely loses accuracy with only $1/2$ of the principal keys, and incurs only around a $2\%$ accuracy drop with $1/4$ of the principal keys.
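A rough sketch of one reading of "principal keys": attending over a reduced set of key/value combinations obtained from an SVD of the key matrix. The projection, names, and ratios below are assumptions, not the paper's algorithm:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_top_principal_keys(Q, K, V, keep_ratio=0.5):
    """Hypothetical sketch: replace the full key/value set with a smaller set of
    'principal' keys derived from an SVD of K."""
    n, d = K.shape
    r = max(1, int(n * keep_ratio))
    U, _, _ = np.linalg.svd(K, full_matrices=False)
    P = U[:, :r]                      # top-r principal directions over the sequence
    K_p, V_p = P.T @ K, P.T @ V       # r "principal" keys/values instead of n
    return softmax(Q @ K_p.T / np.sqrt(d)) @ V_p

Q = np.random.randn(8, 64); K = np.random.randn(128, 64); V = np.random.randn(128, 64)
out = attention_top_principal_keys(Q, K, V, keep_ratio=0.25)   # ~1/4 principal keys
```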
no code implementations • 5 Dec 2023 • Yue Niu, Ramy E. Ali, Saurav Prakash, Salman Avestimehr
The main part flows into a small model while the residuals are offloaded to a large model.
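A toy sketch of such a main/residual split, assuming a low-rank projection separates the two parts; the module names and the combination rule are illustrative only:

```python
import torch
import torch.nn as nn

class MainResidualSplit(nn.Module):
    """Illustrative split: a low-rank 'main' part stays with a small model, the
    residual is offloaded to a large model (not the paper's exact architecture)."""
    def __init__(self, dim, rank, small_model, large_model):
        super().__init__()
        self.proj = nn.Linear(dim, rank, bias=False)   # captures the low-rank main part
        self.back = nn.Linear(rank, dim, bias=False)
        self.small, self.large = small_model, large_model

    def forward(self, x):
        main = self.back(self.proj(x))     # low-rank approximation of the input
        residual = x - main                # what the small model cannot represent
        return self.small(main) + self.large(residual)

small = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
large = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 10))
model = MainResidualSplit(512, rank=32, small_model=small, large_model=large)
logits = model(torch.randn(4, 512))
```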
no code implementations • 25 Jul 2023 • Yue Niu, Zalan Fabian, Sunwoo Lee, Mahdi Soltanolkotabi, Salman Avestimehr
Quasi-Newton methods still face significant challenges in training large-scale neural networks due to the additional compute cost of Hessian-related computations and instability issues in stochastic training.
1 code implementation • 28 Aug 2022 • Yue Niu, Saurav Prakash, Souvik Kundu, Sunwoo Lee, Salman Avestimehr
However, the heterogeneous-client setting requires some clients to train the full model, which is not aligned with the resource-constrained setting, while the latter breaks privacy promises in FL when sharing intermediate representations or labels with the server.
1 code implementation • 27 Aug 2022 • Sara Babakniya, Souvik Kundu, Saurav Prakash, Yue Niu, Salman Avestimehr
A possible solution to this problem is to utilize off-the-shelf sparse learning algorithms at the clients to meet their resource budget.
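As a concrete (generic) example of what off-the-shelf sparse learning under a client budget might look like, a magnitude-pruning sketch; the density parameter and helper name are hypothetical:

```python
import torch

def prune_to_budget(model, density=0.2):
    """Generic magnitude pruning sketch: keep only the largest `density` fraction
    of weights so a client's model fits its resource budget (illustrative only)."""
    for param in model.parameters():
        if param.dim() < 2:          # skip biases / norm parameters
            continue
        k = max(1, int(param.numel() * density))
        threshold = param.abs().flatten().kthvalue(param.numel() - k + 1).values
        mask = (param.abs() >= threshold).float()
        param.data.mul_(mask)        # zero out the smallest-magnitude weights
```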
no code implementations • 29 Sep 2021 • Yue Niu, Zalan Fabian, Sunwoo Lee, Mahdi Soltanolkotabi, Salman Avestimehr
SLIM-QN addresses two key barriers in existing second-order methods for large-scale DNNs: 1) the high computational cost of obtaining the Hessian matrix and its inverse in every iteration (e.g., KFAC); 2) convergence instability due to stochastic training (e.g., L-BFGS).
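For reference, the standard L-BFGS two-loop recursion that such Hessian-free updates build on; this is a textbook sketch, not SLIM-QN's update rule:

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Textbook L-BFGS two-loop recursion: approximates -H^{-1} g from recent
    parameter/gradient difference pairs without forming the Hessian."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):   # most recent pairs first
        rho = 1.0 / (y @ s)
        alpha = rho * (s @ q)
        q -= alpha * y
        alphas.append((alpha, rho, s, y))
    if s_list:                         # initial Hessian scaling gamma = s'y / y'y
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for alpha, rho, s, y in reversed(alphas):               # oldest pairs first
        beta = rho * (y @ q)
        q += (alpha - beta) * s
    return -q                          # approximate Newton direction
```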
no code implementations • 16 Oct 2019 • Yue Niu, Hanqing Zeng, Ajitesh Srivastava, Kartik Lakhotia, Rajgopal Kannan, Yanzhi Wang, Viktor Prasanna
On the other hand, weight pruning techniques address the redundancy in model parameters by converting dense convolutional kernels into sparse ones.
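An illustrative sketch of one way a pruned dense kernel can be stored sparsely; the threshold and layout below are assumptions, not the paper's pruning scheme:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Prune a dense 64x64x3x3 conv kernel by magnitude and keep the survivors in a
# compressed sparse format (one possible reading of "dense kernels into sparse ones").
kernel = np.random.randn(64, 64, 3, 3)
threshold = np.quantile(np.abs(kernel), 0.9)          # drop the smallest 90% of weights
pruned = np.where(np.abs(kernel) > threshold, kernel, 0.0)
sparse_kernel = csr_matrix(pruned.reshape(64, -1))    # flatten to 2-D for CSR storage
print(f"nonzeros: {sparse_kernel.nnz} / {kernel.size}")
```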