Search Results for author: Gerhard Wellein

Found 10 papers, 5 papers with code

Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications

no code implementations • 27 May 2022 • Ayesha Afzal, Georg Hager, Gerhard Wellein, Stefano Markidis

This paper studies the utility of using data analytics and machine learning techniques for identifying, classifying, and characterizing the dynamics of large-scale parallel (MPI) programs.

Clustering; General Classification

ECM modeling and performance tuning of SpMV and Lattice QCD on A64FX

no code implementations • 4 Mar 2021 • Christie Alappat, Nils Meyer, Jan Laukemann, Thomas Gruber, Georg Hager, Gerhard Wellein, Tilo Wettig

We present an architectural analysis of the A64FX used in the Fujitsu FX1000 supercomputer at a level of detail that allows for the construction of Execution-Cache-Memory (ECM) performance models for steady-state loops.

Performance; Distributed, Parallel, and Cluster Computing; High Energy Physics - Lattice

Analytic Modeling of Idle Waves in Parallel Programs: Communication, Cluster Topology, and Noise Impact

no code implementations • 4 Mar 2021 • Ayesha Afzal, Georg Hager, Gerhard Wellein

We present a validated analytic model for the propagation velocity of idle waves with respect to communication parameters and topology, with a special emphasis on sparse communication patterns.

Distributed, Parallel, and Cluster Computing; Performance

A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication

1 code implementation • 15 Jul 2019 • Christie L. Alappat, Georg Hager, Olaf Schenk, Jonas Thies, Achim Basermann, Alan R. Bishop, Holger Fehske, Gerhard Wellein

The symmetric sparse matrix-vector multiplication (SymmSpMV) is an important building block for many numerical linear algebra kernel operations or graph traversal applications.

Distributed, Parallel, and Cluster Computing; Performance

Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels

1 code implementation • 13 Jan 2017 • Julian Hammer, Jan Eitzinger, Georg Hager, Gerhard Wellein

We present Kerncraft, a tool that can automatically construct Roofline and ECM models for loop nests by performing the required code, data transfer, and layer condition (LC) analysis.

Performance

GHOST: Building blocks for high performance sparse linear algebra on heterogeneous systems

1 code implementation • 29 Jul 2015 • Moritz Kreutzer, Jonas Thies, Melven Röhrig-Zöllner, Andreas Pieper, Faisal Shahzad, Martin Galgon, Achim Basermann, Holger Fehske, Georg Hager, Gerhard Wellein

Today, such resources are available as multicore processors, graphics processing units (GPUs), and other accelerators such as the Intel Xeon Phi.

Distributed, Parallel, and Cluster Computing; Mathematical Software

Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator

no code implementations • 17 Dec 2013 • Johannes Hofmann, Jan Treibig, Georg Hager, Gerhard Wellein

We examine the Xeon Phi, which is based on Intel's Many Integrated Cores architecture, for its suitability to run the FDK algorithm--the most commonly used algorithm to perform the 3D image reconstruction in cone-beam computed tomography.

Image Reconstruction

A unified sparse matrix data format for efficient general sparse matrix-vector multiply on modern processors with wide SIMD units

1 code implementation • 23 Jul 2013 • Moritz Kreutzer, Georg Hager, Gerhard Wellein, Holger Fehske, Alan R. Bishop

We discuss the advantages of SELL-C-sigma compared to established formats like Compressed Row Storage (CRS) and ELLPACK and show its suitability on a variety of hardware platforms (Intel Sandy Bridge, Intel Xeon Phi and Nvidia Tesla K20) for a wide range of test matrices from different application areas.

Mathematical Software; Distributed, Parallel, and Cluster Computing

The Kernel Polynomial Method

1 code implementation • 25 Apr 2005 • Alexander Weisse, Gerhard Wellein, Andreas Alvermann, Holger Fehske

Efficient and stable algorithms for the calculation of spectral quantities and correlation functions are some of the key tools in computational condensed matter physics.

Other Condensed Matter; Computational Physics
