no code implementations • ICML 2020 • Nima Eshraghi, Ben Liang
In distributed online optimization over a computing network with heterogeneous nodes, slow nodes can adversely affect the progress of fast nodes, leading to a drastic slowdown of the overall convergence process.
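A minimal sketch of one common straggler-mitigation idea in this setting (not necessarily this paper's method): the parameter server updates using only the gradients of the k fastest workers each round, so a slow node cannot stall the others. The worker function and delays below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def worker_gradient(x, delay):
    """Hypothetical worker: returns a noisy gradient of f(x) = 0.5*||x||^2
    together with a simulated response time."""
    return x + 0.01 * rng.standard_normal(x.shape), delay

def server_step(x, delays, k, lr=0.1):
    """Aggregate only the k fastest responses, dropping stragglers this round."""
    results = [worker_gradient(x, d) for d in delays]
    results.sort(key=lambda g_d: g_d[1])      # fastest responses first
    grads = [g for g, _ in results[:k]]       # ignore the slowest workers
    return x - lr * np.mean(grads, axis=0)

x = np.ones(4)
delays = [0.1, 0.2, 5.0, 0.15]                # one slow (straggler) node
for _ in range(50):
    x = server_step(x, delays, k=3)
print(x)                                      # approaches the minimizer (origin)
```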
no code implementations • 4 Oct 2023 • Sayantan Chowdhury, Ben Liang, Ali Tizghadam, Ilijc Albanese
The teacher in knowledge distillation (KD) is a black-box model, imparting knowledge to the student only through its predictions.
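A minimal sketch of black-box distillation, assuming the student may only query the teacher's output probabilities, never its internals; the toy teacher and student models below are hypothetical.

```python
import torch
import torch.nn.functional as F

# Hypothetical toy models; in practice these would be trained networks.
teacher = torch.nn.Linear(10, 3)   # black box: we only query its outputs
student = torch.nn.Linear(10, 3)
opt = torch.optim.SGD(student.parameters(), lr=0.1)
T = 2.0                            # softmax temperature

x = torch.randn(32, 10)
with torch.no_grad():              # teacher internals are never accessed
    soft_targets = F.softmax(teacher(x) / T, dim=-1)

# Standard KD loss: KL divergence between teacher and student soft outputs.
log_probs = F.log_softmax(student(x) / T, dim=-1)
loss = F.kl_div(log_probs, soft_targets, reduction="batchmean") * T**2
opt.zero_grad()
loss.backward()
opt.step()
```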
no code implementations • 19 Mar 2023 • Rozhina Ghanavi, Ben Liang, Ali Tizghadam
Large machine learning datasets often contain missing entries, which necessitates imputing the missing values.
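A minimal illustration of the imputation step, using scikit-learn's baseline mean imputer (the paper's own imputation method is presumably more sophisticated):

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy dataset with missing entries encoded as NaN.
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [4.0, np.nan]])

# Baseline imputation: replace each NaN with its column mean.
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)
print(X_filled)   # [[1.  2. ], [2.5 3. ], [4.  2.5]]
```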
no code implementations • 25 Feb 2022 • Nima Eshraghi, Ben Liang
We first show that, under relative smoothness, the dynamic regret admits an upper bound in terms of the path length and the functional variation.
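For reference, the standard definitions of these quantities (the paper's notation may differ):

```latex
% Dynamic regret against the per-round minimizers x_t^* = \arg\min_x f_t(x):
\mathrm{Reg}_T^{d} \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \sum_{t=1}^{T} f_t(x_t^*)
% Path length of the comparator sequence:
P_T \;=\; \sum_{t=2}^{T} \lVert x_t^* - x_{t-1}^* \rVert
% Functional variation of the loss sequence:
V_T \;=\; \sum_{t=2}^{T} \sup_{x} \lvert f_t(x) - f_{t-1}(x) \rvert
```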
no code implementations • 7 Dec 2021 • Jingrong Wang, Ben Liang
We consider distributed online min-max resource allocation with a set of parallel agents and a parameter server.
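A minimal sketch of a min-max allocation step, assuming a hypothetical per-agent cost model and a subgradient update on the maximum cost (not necessarily this paper's algorithm):

```python
import numpy as np

# Toy min-max resource allocation: a parameter server splits a unit budget
# among 4 parallel agents; agent i's (hypothetical) cost is w_i / share_i.
# Goal: minimize the maximum agent cost, i.e. min_a max_i w_i / a_i.
w = np.array([1.0, 2.0, 3.0, 4.0])
a = np.full(4, 0.25)

for _ in range(2000):
    c = w / a
    i = np.argmax(c)                       # worst-off agent this round
    g = np.zeros(4)
    g[i] = -w[i] / a[i] ** 2               # subgradient of the max cost
    a -= 0.005 * g / np.linalg.norm(g)     # normalized subgradient step
    a = np.clip(a, 1e-3, None)
    a /= a.sum()                           # renormalize onto the simplex

print(w / a)   # costs approximately equalized (each near sum(w) = 10)
```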
no code implementations • 9 May 2021 • Juncheng Wang, Ben Liang, Min Dong, Gary Boudreau, Hatem Abou-zeid
We consider online convex optimization (OCO) with multi-slot feedback delay, where an agent makes a sequence of online decisions to minimize the accumulation of time-varying convex loss functions, subject to short-term and long-term constraints that are possibly time-varying.
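A minimal sketch of online gradient descent with a fixed feedback delay, ignoring the long-term constraints for simplicity (the drifting target below is hypothetical):

```python
import numpy as np
from collections import deque

# Delayed online gradient descent: the gradient of the round-t loss
# only becomes available d slots later.
d, T, lr = 3, 200, 0.05
x = np.zeros(2)
pending = deque()                    # gradients still in flight

for t in range(T):
    target = np.array([np.sin(0.05 * t), np.cos(0.05 * t)])  # drifting optimum
    grad = x - target                # gradient of f_t(x) = 0.5*||x - target||^2
    pending.append(grad)
    if len(pending) > d:             # feedback from round t - d arrives now
        x -= lr * pending.popleft()
```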
no code implementations • 30 Apr 2021 • Sayantan Chowdhury, Ben Liang, Ali Tizghadam, Ilijc Albanese
At a network router, packets must be processed with minimal delay, so the classifier cannot wait until the end of a flow to make a decision.
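A minimal sketch of such early flow classification, assuming only the sizes of the first k packets are available as features (fully synthetic data, not the paper's model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Early flow classification: decide from the first k packet sizes only,
# so the router never waits for the flow to finish.
rng = np.random.default_rng(0)
k = 5
# Class 0: small-packet flows (e.g. interactive); class 1: bulk transfers.
X0 = rng.normal(200, 50, size=(500, k))
X1 = rng.normal(1200, 200, size=(500, k))
X = np.vstack([X0, X1])
y = np.r_[np.zeros(500), np.ones(500)]

clf = LogisticRegression().fit(X, y)
new_flow_prefix = rng.normal(1200, 200, size=(1, k))
print(clf.predict(new_flow_prefix))   # classified from the prefix alone: [1.]
```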
no code implementations • 26 Feb 2021 • Ali Ramezani-Kebrya, Ashish Khisti, Ben Liang
While momentum-based methods, in conjunction with stochastic gradient descent (SGD), are widely used when training machine learning models, there is little theoretical understanding of the generalization error of such methods.
no code implementations • 12 Sep 2018 • Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher, Ashish Khisti, Ben Liang
While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used when training machine learning models, there is little theoretical understanding of the generalization error of such methods.
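Both of the two entries above concern SGD with momentum; a minimal sketch of the heavy-ball update they study, run on a toy quadratic with gradient noise:

```python
import numpy as np

rng = np.random.default_rng(0)
x, v = np.ones(2), np.zeros(2)
lr, mu = 0.1, 0.9                              # step size and momentum coefficient

for _ in range(200):
    grad = x + 0.01 * rng.standard_normal(2)   # noisy gradient of 0.5*||x||^2
    v = mu * v - lr * grad                     # momentum accumulation
    x = x + v                                  # parameter update
print(x)                                       # near the minimizer (the origin)
```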