Tuning and optimization for a variety of many-core architectures without changing a single line of implementation code using the Alpaka library

We present an analysis of optimizing the performance of a single C++11 source code using the Alpaka hardware abstraction library. We use the general matrix multiplication (GEMM) algorithm to show that compilers can optimize Alpaka code effectively when key parameters of the algorithm are tuned...
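To make the idea of tuning a GEMM parameter without touching the implementation concrete, here is a minimal sketch in plain C++11. It does not use the Alpaka API. The names `gemm_tiled`, `gemm_naive`, and `TileSize` are hypothetical, and treating the tile size as one of the "key parameters" is an assumption made for illustration, not something the abstract states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reference GEMM on square row-major matrices: C = alpha * A * B + beta * C.
// (Hypothetical helper, used only to check the tiled variant.)
void gemm_naive(std::size_t n, double alpha, const std::vector<double>& a,
                const std::vector<double>& b, double beta,
                std::vector<double>& c)
{
    assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = alpha * sum + beta * c[i * n + j];
        }
    }
}

// Tiled GEMM. TileSize is an assumed tuning knob, fixed at compile time so the
// compiler can unroll and vectorize the inner loops for each chosen value.
// Changing the knob changes only the template argument, not the loop body.
template <std::size_t TileSize>
void gemm_tiled(std::size_t n, double alpha, const std::vector<double>& a,
                const std::vector<double>& b, double beta,
                std::vector<double>& c)
{
    static_assert(TileSize > 0, "TileSize must be positive");
    assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);

    // Apply beta once up front, then accumulate alpha * A * B tile by tile.
    for (double& x : c) {
        x *= beta;
    }

    for (std::size_t i0 = 0; i0 < n; i0 += TileSize) {
        for (std::size_t k0 = 0; k0 < n; k0 += TileSize) {
            for (std::size_t j0 = 0; j0 < n; j0 += TileSize) {
                // Clamp tile bounds so n need not be a multiple of TileSize.
                const std::size_t i_end = std::min(i0 + TileSize, n);
                const std::size_t k_end = std::min(k0 + TileSize, n);
                const std::size_t j_end = std::min(j0 + TileSize, n);
                for (std::size_t i = i0; i < i_end; ++i) {
                    for (std::size_t k = k0; k < k_end; ++k) {
                        const double aik = alpha * a[i * n + k];
                        for (std::size_t j = j0; j < j_end; ++j) {
                            c[i * n + j] += aik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

Calling `gemm_tiled<2>(...)` and `gemm_tiled<4>(...)` runs the same source with different tile sizes. In a real build the value would more likely come from a preprocessor definition or build-system option chosen per target architecture.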

