1 code implementation • 5 Nov 2019 • Alexander Brandt, Davood Mohajerani, Marc Moreno Maza, Jeeva Paudel, Linxiao Wang
In this paper we present KLARAPTOR (Kernel LAunch parameters RAtional Program estimaTOR), a new tool built on top of the LLVM Pass Framework and NVIDIA CUPTI API to dynamically determine the optimal values of kernel launch parameters of a CUDA program P. To be precise, we describe a novel technique to statically build (at compile time of P) a so-called rational program R. Using a performance prediction model, and knowing particular data and hardware parameters of P at runtime, the program R can automatically and dynamically determine the values of launch parameters of P that will yield optimal performance.
Distributed, Parallel, and Cluster Computing • Performance
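The idea in the KLARAPTOR abstract above, evaluating a rational function of data and launch parameters at runtime and keeping the best configuration, can be sketched in a few lines. This is only an illustrative sketch: the function `estimated_time`, its formula, and the candidate block sizes are hypothetical and are not taken from the paper or from the tool.

```python
from fractions import Fraction


def estimated_time(n, block_size):
    """Hypothetical rational function of data size n and block size.

    Toy model (an assumption, not the paper's): one term shrinks as the
    block size grows, the other grows with it.
    """
    return Fraction(n * n, block_size) + 64 * block_size


def best_block_size(n, candidates=(32, 64, 128, 256, 512, 1024)):
    """Pick the candidate block size with the smallest estimated time."""
    return min(candidates, key=lambda b: estimated_time(n, b))


if __name__ == "__main__":
    for n in (1024, 4096):
        print(n, best_block_size(n))
```

In this toy model the selected block size changes with the data size, which is the reason for deferring the choice to runtime rather than fixing it at compile time.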
no code implementations • 31 May 2019 • Mohammadali Asadi, Alexander Brandt, Robert H. C. Moir, Marc Moreno Maza, Yuzhen Xie
Algorithms for solving polynomial systems combine low-level routines for performing arithmetic operations on polynomials and high-level procedures which produce the different components (points, curves, surfaces) of the solution set.
Symbolic Computation • Distributed, Parallel, and Cluster Computing • Mathematical Software
no code implementations • 17 Dec 2016 • Changbo Chen, Svyatoslav Covanov, Farnam Mansouri, Marc Moreno Maza, Ning Xie, Yuzhen Xie
We propose a new algorithm for multiplying dense polynomials with integer coefficients in a parallel fashion, targeting multi-core processor architectures.
Symbolic Computation • Mathematical Software
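As a rough illustration of multiplying dense integer polynomials in parallel, the sketch below splits one operand into chunks, multiplies each chunk by the other operand in a worker, and adds the shifted partial products. It is a plain schoolbook block decomposition, not the algorithm proposed in the paper above; the names `schoolbook_mul` and `parallel_mul` and the parameters `chunk` and `workers` are hypothetical. Threads are used only to keep the example short; in CPython a process pool would be needed for real multi-core speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def schoolbook_mul(a, b):
    """Multiply coefficient lists a and b (lowest degree first)."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def parallel_mul(a, b, chunk=2, workers=4):
    """Multiply a and b by distributing chunks of a across workers."""
    if not a or not b:
        return []
    offsets = list(range(0, len(a), chunk))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task multiplies one slice of a by the whole of b.
        partials = list(
            pool.map(lambda off: schoolbook_mul(a[off:off + chunk], b), offsets)
        )
    result = [0] * (len(a) + len(b) - 1)
    # A slice starting at index off contributes shifted by x**off.
    for off, part in zip(offsets, partials):
        for k, c in enumerate(part):
            result[off + k] += c
    return result


if __name__ == "__main__":
    # (1 + 2x + 3x^2) * (4 + 5x)
    print(parallel_mul([1, 2, 3], [4, 5]))
```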