no code implementations • 12 Mar 2024 • Zijian Liu, Zhengyuan Zhou
Shuffling gradient methods, also known as stochastic gradient descent (SGD) without replacement, are widely implemented in practice, most notably in three popular algorithms: Random Reshuffle (RR), Shuffle Once (SO), and Incremental Gradient (IG).
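To make the distinction between the three schemes concrete, here is a minimal sketch on a toy least-squares problem; the objective, step size, and epoch count are illustrative assumptions, not choices from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
A, b = rng.normal(size=(n, d)), rng.normal(size=n)

def grad_i(x, i):
    # Gradient of the i-th component f_i(x) = 0.5 * (a_i^T x - b_i)^2.
    return (A[i] @ x - b[i]) * A[i]

def shuffled_sgd(scheme, epochs=50, lr=0.01):
    x = np.zeros(d)
    perm = rng.permutation(n)          # SO: one permutation, reused every epoch
    for _ in range(epochs):
        if scheme == "RR":             # Random Reshuffle: fresh permutation each epoch
            perm = rng.permutation(n)
        elif scheme == "IG":           # Incremental Gradient: fixed order 0..n-1
            perm = np.arange(n)
        for i in perm:                 # one pass over the data without replacement
            x -= lr * grad_i(x, i)
    return x

for scheme in ("RR", "SO", "IG"):
    x = shuffled_sgd(scheme)
    print(scheme, "loss:", 0.5 * np.mean((A @ x - b) ** 2))
```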
no code implementations • 13 Dec 2023 • Zijian Liu, Zhengyuan Zhou
For Lipschitz convex functions, different works have established the optimal $O(\log(1/\delta)\log T/\sqrt{T})$ or $O(\sqrt{\log(1/\delta)/T})$ high-probability convergence rates for the final iterate, where $T$ is the time horizon and $\delta$ is the failure probability.
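For intuition, here is a minimal sketch of the setting these rates describe: last-iterate stochastic subgradient descent on the Lipschitz convex function $f(x)=|x|$. The $c/\sqrt{t}$ step size and noise level are illustrative assumptions; the cited rates rely on the specific schedules and analyses of the respective works.

```python
import numpy as np

rng = np.random.default_rng(1)
T, c = 10_000, 0.5
x = 5.0
for t in range(1, T + 1):
    g = np.sign(x) + rng.normal(scale=0.1)  # stochastic subgradient of |x|
    x -= c / np.sqrt(t) * g                 # assumed schedule eta_t = c / sqrt(t)
print("final iterate:", x)                  # the quantity the rates bound
```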
no code implementations • 31 Oct 2023 • Maoxiang Sun, Weilong Ding, Tianpu Zhang, Zijian Liu, Mengda Xing
As cities develop, traffic congestion becomes an increasingly pressing issue, and traffic prediction is a classic approach to relieving it.
no code implementations • 22 Mar 2023 • Zijian Liu, Zhengyuan Zhou
Recently, several studies have considered the stochastic optimization problem in a heavy-tailed noise regime, i.e., the difference between the stochastic gradient and the true gradient is assumed to have a finite $p$-th moment (say, upper bounded by $\sigma^{p}$ for some $\sigma\geq0$) where $p\in(1, 2]$. This setting not only generalizes the traditional finite-variance assumption ($p=2$) but has also been observed in practice for several different tasks.
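Gradient clipping is the standard tool in this heavy-tailed line of work; below is a minimal sketch, where the clipping threshold and the symmetrized Pareto noise (finite $p$-th moment only for $p<1.5$) are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
T, lr, tau = 5_000, 0.01, 1.0
x = np.ones(3)

def noisy_grad(x):
    # True gradient of 0.5*||x||^2 plus zero-mean noise whose p-th moment
    # is finite only for p < 1.5, mimicking the heavy-tailed regime.
    noise = rng.pareto(1.5, size=x.shape) - rng.pareto(1.5, size=x.shape)
    return x + noise

for _ in range(T):
    g = noisy_grad(x)
    norm = np.linalg.norm(g)
    if norm > tau:                 # clip: rescale g to have norm tau
        g = g * (tau / norm)
    x -= lr * g
print("final point:", x)
```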
no code implementations • 28 Feb 2023 • Zijian Liu, Ta Duy Nguyen, Thien Hang Nguyen, Alina Ene, Huy Lê Nguyen
Instead, we show high-probability convergence with bounds that depend on the initial distance to the optimal solution.
no code implementations • 14 Feb 2023 • Zijian Liu, Jiawei Zhang, Zhengyuan Zhou
For this class of problems, we propose the first variance-reduced accelerated algorithm and establish that it guarantees a high-probability convergence rate of $O(\log(T/\delta)T^{\frac{1-p}{2p-1}})$ under a mild condition, which is faster than the $\Omega(T^{\frac{1-p}{3p-2}})$ lower bound.
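As a quick sanity check on why the new rate is faster, compare the two exponents at $p=2$ (the finite-variance case):
\[
\frac{1-p}{2p-1}\bigg|_{p=2} = -\frac{1}{3}
\qquad\text{vs.}\qquad
\frac{1-p}{3p-2}\bigg|_{p=2} = -\frac{1}{4},
\]
so the new bound decays as $T^{-1/3}$ while the other only reaches $T^{-1/4}$. More generally, for $p\in(1,2]$ we have $0 < 2p-1 < 3p-2$ while $1-p < 0$, hence $\frac{1-p}{2p-1} < \frac{1-p}{3p-2}$ and the accelerated rate decays strictly faster.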
no code implementations • 13 Feb 2023 • Zijian Liu, Srikanth Jagabathula, Zhengyuan Zhou
Two recent works established the $O(\epsilon^{-3})$ sample complexity to obtain an $O(\epsilon)$-stationary point.
no code implementations • 29 Sep 2022 • Zijian Liu, Ta Duy Nguyen, Thien Hang Nguyen, Alina Ene, Huy L. Nguyen
There, STORM utilizes recursive momentum to achieve the variance-reduction (VR) effect; it was later made fully adaptive in STORM+ [Levy et al., '21]. Full adaptivity removes the need to know problem-specific parameters, such as the smoothness of the objective and bounds on the variance and norm of the stochastic gradients, when setting the step size.
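The recursive-momentum estimator at the heart of STORM [Cutkosky & Orabona, '19] admits a compact sketch: the same fresh sample is evaluated at both the current and previous iterates. The toy objective, momentum schedule, and step size below are illustrative assumptions, not the tuned choices of STORM/STORM+.

```python
import numpy as np

rng = np.random.default_rng(3)

def stoch_grad(x, xi):
    # Stochastic gradient of f(x) = 0.5*||x||^2 with additive noise xi.
    return x + xi

x_prev = x = np.ones(4)
d = stoch_grad(x, rng.normal(scale=0.1, size=4))  # d_1: plain stochastic gradient
for t in range(2, 1_000):
    lr, a = 0.05, 1.0 / t                    # assumed schedules
    x_prev, x = x, x - lr * d                # descend along the estimator
    xi = rng.normal(scale=0.1, size=4)       # one fresh sample ...
    g_new = stoch_grad(x, xi)                # ... evaluated at x_t
    g_old = stoch_grad(x_prev, xi)           # ... and at x_{t-1}
    d = g_new + (1 - a) * (d - g_old)        # recursive-momentum update
print("final point norm:", np.linalg.norm(x))
```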
no code implementations • 29 Sep 2022 • Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy L. Nguyen
Finally, we give new accelerated adaptive algorithms and establish their convergence guarantees in the deterministic setting, with explicit dependence on the problem parameters, improving upon the asymptotic rates shown in previous works.
no code implementations • 28 Jan 2022 • Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy L. Nguyen
To address this problem, we propose two novel adaptive VR algorithms: Adaptive Variance Reduced Accelerated Extra-Gradient (AdaVRAE) and Adaptive Variance Reduced Accelerated Gradient (AdaVRAG).
no code implementations • 5 Jan 2020 • Zijian Liu, Chunbo Luo, Shuai Li, Peng Ren, Geyong Min
This paper proposes fractional-order graph neural networks (FGNNs), optimized by an approximation strategy, to address the local-optimum challenge of classic and fractional graph neural networks. Such networks specialise in aggregating information from the feature and adjacency matrices of connected nodes and their neighbours to solve learning tasks on non-Euclidean data such as graphs.
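For context, one graph-convolution step aggregates a node's features with its neighbours' through a normalized adjacency matrix. The sketch below shows that aggregation for a plain GCN-style layer; it illustrates the mechanism the abstract describes, not the fractional-order update of FGNNs themselves.

```python
import numpy as np

rng = np.random.default_rng(4)
n_nodes, in_dim, out_dim = 5, 3, 2

X = rng.normal(size=(n_nodes, in_dim))       # node feature matrix
A = (rng.random((n_nodes, n_nodes)) < 0.4).astype(float)
A = np.maximum(A, A.T)                       # symmetric adjacency matrix
A_hat = A + np.eye(n_nodes)                  # add self-loops
deg = A_hat.sum(axis=1)
A_norm = A_hat / np.sqrt(np.outer(deg, deg)) # D^{-1/2} (A + I) D^{-1/2}

W = rng.normal(size=(in_dim, out_dim))       # learnable weights
H = np.maximum(A_norm @ X @ W, 0.0)          # one layer: relu(A_norm X W)
print(H.shape)                               # (5, 2): aggregated node features
```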