Search Results for author: Christoph Bauinger

Found 2 papers, 1 paper with code

Fully-fused Multi-Layer Perceptrons on Intel Data Center GPUs

1 code implementation • 26 Mar 2024 • Kai Yuan, Christoph Bauinger, Xiangyi Zhang, Pascal Baehr, Matthias Kirchhart, Darius Dabert, Adrien Tousnakhoff, Pierre Boudier, Michael Paulitsch

We compare our approach to a similar CUDA implementation for MLPs and show that our implementation on the Intel Data Center GPU outperforms the CUDA implementation on Nvidia's H100 GPU by a factor up to 2.84 in inference and 1.75 in training.

Image Compression, Physics-informed machine learning

Performance Optimization of Deep Learning Sparse Matrix Kernels on Intel Max Series GPU

no code implementations • 1 Nov 2023 • Mohammad Zubair, Christoph Bauinger

In this paper, we focus on three sparse matrix operations that are relevant for machine learning applications, namely, the sparse-dense matrix multiplication (SPMM), the sampled dense-dense matrix multiplication (SDDMM), and the composition of the SDDMM with SPMM, also termed as FusedMM.
