Search Results for author: Fehmi Jaafar

Found 3 papers, 1 paper with code

SPIRT: A Fault-Tolerant and Reliable Peer-to-Peer Serverless ML Training Architecture

no code implementations • 25 Sep 2023 • Amine Barrak, Mayssa Jaziri, Ranim Trabelsi, Fehmi Jaafar, Fabio Petrillo

The advent of serverless computing has ushered in notable advancements in distributed machine learning, particularly within parameter server-based architectures.

Exploring the Impact of Serverless Computing on Peer To Peer Training Machine Learning

1 code implementation • 25 Sep 2023 • Amine Barrak, Ranim Trabelsi, Fehmi Jaafar, Fabio Petrillo

In this paper, we introduce a novel architecture that combines serverless computing with P2P networks for distributed training and present a method for efficient parallel gradient computation under resource constraints.

Architecting Peer-to-Peer Serverless Distributed Machine Learning Training for Improved Fault Tolerance

no code implementations • 27 Feb 2023 • Amine Barrak, Fabio Petrillo, Fehmi Jaafar

However, the parameter server architecture may have limitations in terms of fault tolerance, including a single point of failure and complex recovery processes.

Cloud Computing, Decision Making
