SC15 Austin, TX

Efficient Large-Scale Sparse Eigenvalue Computations on Heterogeneous Hardware

Authors: Moritz Kreutzer (Friedrich-Alexander University Erlangen-Nürnberg), Andreas Pieper (Ernst Moritz Arndt University of Greifswald), Andreas Alvermann (Ernst Moritz Arndt University of Greifswald), Holger Fehske (Ernst Moritz Arndt University of Greifswald), Georg Hager (Friedrich-Alexander University Erlangen-Nürnberg), Gerhard Wellein (Friedrich-Alexander University Erlangen-Nürnberg), Alan R. Bishop (Los Alamos National Laboratory)

Best Poster Finalist

Abstract: Quantum physics often requires determining spectral properties of large, sparse matrices. For instance, an approximation to the full spectrum or a number of inner eigenvalues can be computed with algorithms based on the evaluation of Chebyshev polynomials. We identify the relevant bottlenecks of this class of algorithms and develop a reformulated version that increases the computational intensity, and thus the potential efficiency, mainly through kernel fusion and vector blocking. The optimized algorithm requires a manual implementation of the compute kernels. Guided by a performance model, we demonstrate the capabilities of our fully heterogeneous implementation on a petascale system. Based on MPI+OpenMP/CUDA, our approach utilizes all parts of a heterogeneous CPU+GPU system with high efficiency. Finally, our scaling study on up to 4096 heterogeneous nodes shows a performance of half a petaflop/s, which corresponds to 11% of LINPACK performance for an originally bandwidth-bound sparse linear algebra problem.
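
The two optimizations named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it is a plain-Python toy that assumes a CSR matrix whose spectrum already lies in [-1, 1], and the function names (`fused_chebyshev_step`, `chebyshev_moments`) are hypothetical. Vector blocking is shown by applying each loaded matrix entry to all vectors of a block at once; kernel fusion is shown by doing the sparse product, the recurrence update, and the dot product in one sweep over the rows.

```python
def fused_chebyshev_step(indptr, indices, data, v_cur, v_prev, v0, scale=2.0):
    """Compute v_next = scale * H * v_cur - v_prev for a block of vectors.

    The dot products <v0, v_next> (one per block column) are accumulated
    in the same sweep over the rows, so v_next is not re-read afterwards.
    Vectors are stored row-major: v[i][j] is row i of block column j.
    """
    n = len(indptr) - 1
    r = len(v_cur[0])
    v_next = [[0.0] * r for _ in range(n)]
    dots = [0.0] * r
    for i in range(n):
        acc = [0.0] * r
        for k in range(indptr[i], indptr[i + 1]):
            a = data[k]                  # matrix entry is loaded once ...
            row = v_cur[indices[k]]
            for j in range(r):           # ... and reused for all r vectors
                acc[j] += a * row[j]
        for j in range(r):
            val = scale * acc[j] - v_prev[i][j]
            v_next[i][j] = val
            dots[j] += v0[i][j] * val    # fused dot product
    return v_next, dots


def chebyshev_moments(indptr, indices, data, v0, num_moments):
    """Return mu[m][j] = <v0_j, T_m(H) v0_j> for m < num_moments."""
    n, r = len(v0), len(v0[0])
    zeros = [[0.0] * r for _ in range(n)]
    # T_0(H) v0 = v0
    mu = [[sum(v0[i][j] * v0[i][j] for i in range(n)) for j in range(r)]]
    # T_1(H) v0 = H v0: same kernel with scale 1 and a zero "previous" block
    v_prev = v0
    v_cur, dots = fused_chebyshev_step(indptr, indices, data, v0, zeros, v0, scale=1.0)
    mu.append(dots)
    # T_{m+1}(H) v0 = 2 H T_m(H) v0 - T_{m-1}(H) v0
    for _ in range(2, num_moments):
        v_next, dots = fused_chebyshev_step(indptr, indices, data, v_cur, v_prev, v0)
        mu.append(dots)
        v_prev, v_cur = v_cur, v_next
    return mu[:num_moments]
```

In a bandwidth-bound setting, both ideas reduce memory traffic per floating-point operation: the matrix is streamed once per block instead of once per vector, and intermediate vectors are consumed while still in registers or cache. A production kernel would express the same loop structure in OpenMP or CUDA rather than interpreted Python.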

