no code implementations • 22 Oct 2024 • Antoine Gorceix, Bastien Le Chenadec, Ahmad Rammal, Nelson Vadori, Manuela Veloso
In this paper, we study the ability of large language models to learn specific mathematical rules such as distributivity or equation simplification.
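As a rough illustration of the kind of task involved, here is a minimal sketch (not the authors' pipeline) of generating synthetic training pairs for one such rule, distributivity, assuming sympy is available; the prompt format and function names are purely illustrative.

```python
import random
import sympy as sp

def distributivity_pair(rng: random.Random):
    """Return an (input, target) string pair exercising k*(x + y) -> k*x + k*y."""
    x, y = sp.symbols("x y")
    k = rng.randint(2, 9)
    lhs = sp.Mul(k, x + y, evaluate=False)  # keep k*(x + y) unexpanded
    rhs = sp.expand(k * (x + y))            # apply distributivity
    return f"Expand: {sp.sstr(lhs)}", sp.sstr(rhs)

rng = random.Random(0)
for _ in range(3):
    prompt, target = distributivity_pair(rng)
    print(prompt, "->", target)
```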
no code implementations • 10 Jan 2024 • Andrei Panferov, Yury Demidovich, Ahmad Rammal, Peter Richtárik
We analyze MARINA (Gorbunov et al., 2022), a state-of-the-art distributed non-convex optimization algorithm, equipped with the proposed correlated quantizers, and show that it outperforms both the original MARINA and the distributed SGD of Suresh et al. (2022) in terms of communication complexity.
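The toy snippet below illustrates why correlating quantization randomness across workers can help; it is a sketch of the general idea, not this paper's exact construction. Each of n workers one-bit-quantizes a scalar in [0, 1] unbiasedly; drawing the quantization thresholds from a shared, stratified grid (common randomness) keeps every bit unbiased while making the errors nearly cancel in the average. All names here are illustrative.

```python
import numpy as np

def quantize_mean(x, rng, correlated):
    """One-bit unbiased quantization of values in [0, 1]; returns the mean of the bits."""
    n = len(x)
    if correlated:
        # Shared randomness: thresholds (perm[i] + v)/n are each marginally
        # Uniform(0, 1), so every bit stays unbiased, but across workers the
        # thresholds form a stratified grid and quantization errors cancel.
        perm = rng.permutation(n)
        v = rng.uniform()
        u = (perm + v) / n
    else:
        u = rng.uniform(size=n)           # independent thresholds
    bits = (u <= x).astype(float)         # E[bit_i] = x_i in both cases
    return bits.mean()

rng = np.random.default_rng(0)
x = rng.uniform(size=64)                  # one scalar per worker
est_ind = [quantize_mean(x, rng, correlated=False) for _ in range(2000)]
est_cor = [quantize_mean(x, rng, correlated=True) for _ in range(2000)]
print("true mean        :", x.mean())
print("var, independent :", np.var(est_ind))
print("var, correlated  :", np.var(est_cor))
```

Running this shows both estimators centered on the true mean, with the correlated version exhibiting markedly lower variance, which is the mechanism behind the improved communication complexity.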
1 code implementation • 15 Oct 2023 • Ahmad Rammal, Kaja Gruntkowska, Nikita Fedin, Eduard Gorbunov, Peter Richtárik
Byzantine robustness is an essential feature of algorithms for certain distributed optimization problems, typically encountered in collaborative/federated learning.
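For intuition, the sketch below shows the failure mode that Byzantine-robust aggregation addresses, using the classic coordinate-wise median as a generic robust aggregator rather than the specific algorithms studied in the paper; the worker counts and values are illustrative.

```python
import numpy as np

def aggregate(grads, robust):
    """Combine worker gradients: plain mean, or the coordinate-wise median
    (one standard robust aggregator, used here only for illustration)."""
    g = np.stack(grads)
    return np.median(g, axis=0) if robust else g.mean(axis=0)

rng = np.random.default_rng(0)
true_grad = np.ones(4)
honest = [true_grad + 0.1 * rng.normal(size=4) for _ in range(8)]
byzantine = [np.full(4, -100.0) for _ in range(2)]  # adversarial updates
grads = honest + byzantine

print("mean   :", aggregate(grads, robust=False))  # dragged far from the truth
print("median :", aggregate(grads, robust=True))   # stays close to true_grad
```

With 2 of 10 workers sending adversarial vectors, the plain mean is pulled to roughly -19 per coordinate while the median remains near 1, which is why robust aggregation is essential in this setting.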