GPU accelerated matrix factorization of large scale data using block based approach

2 Jan 2023  ·  Prasad Bhavana, Vineet Padmanabhan ·

Matrix Factorization (MF) on large scale data takes substantial time on a Central Processing Unit (CPU). While Graphical Processing Unit (GPU)s could expedite the computation of MF, the available memory on a GPU is finite. Leveraging GPUs require alternative techniques that allow not only parallelism but also address memory limitations. Synchronization between computation units, isolation of data related to a computational unit, sharing of data between computational units and identification of independent tasks among computational units are some of the challenges while leveraging GPUs for MF. We propose a block based approach to matrix factorization using Stochastic Gradient Descent (SGD) that is aimed at accelerating MF on GPUs. The primary motivation for the approach is to make it viable to factorize extremely large data sets on limited hardware without having to compromise on results. The approach addresses factorization of large scale data by identifying independent blocks, each of which are factorized in parallel using multiple computational units. The approach can be extended to one or more GPUs and even to distributed systems. The RMSE results of the block based approach are with in acceptable delta in comparison to the results of CPU based variant and multi-threaded CPU variant of similar SGD kernel implementation. The advantage, of the block based variant, in-terms of speed are significant in comparison to other variants.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods