Sponsored by ACM and IEEE
The International Conference for High Performance Computing, Networking, Storage and Analysis
SCHEDULE: NOV 15-20, 2015

Efficient Large-Scale Sparse Eigenvalue Computations on Heterogeneous Hardware

SESSION: Regular & ACM Student Research Competition Poster Reception

EVENT TYPE: Posters, Receptions, ACM Student Research Competition, Best Poster Finalist

EVENT TAG(S): HPC Beginner Friendly, Regular Poster

TIME: 5:15PM - 7:00PM

SESSION CHAIR(S): Michela Becchi, Manish Parashar, Dorian C. Arnold

AUTHOR(S): Moritz Kreutzer, Andreas Pieper, Andreas Alvermann, Holger Fehske, Georg Hager, Gerhard Wellein, Alan R. Bishop

ROOM: Level 4 - Lobby

ABSTRACT:

In quantum physics it is often required to determine spectral properties of large, sparse matrices.
For instance, an approximation to the full spectrum or a number of inner eigenvalues can be computed with algorithms based on the evaluation of Chebyshev polynomials.
We identify the relevant bottlenecks of this class of algorithms and develop a reformulated version that increases the computational intensity and, potentially, the efficiency, primarily by employing kernel fusion and vector blocking.
The optimized algorithm requires a manual implementation of compute kernels.
Guided by a performance model, we show the capabilities of our fully heterogeneous implementation on a petascale system.
Based on MPI+OpenMP/CUDA, our approach utilizes all parts of a heterogeneous CPU+GPU system with high efficiency.
Finally, our scaling study on up to 4096 heterogeneous nodes reveals a performance of half a petaflop/s, which corresponds to 11% of LINPACK performance for an originally bandwidth-bound sparse linear algebra problem.
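
As a rough illustration of the kernel fusion and vector blocking mentioned in the abstract, the sketch below performs one step of the standard Chebyshev three-term recurrence, w <- 2*A*v - w, for a block of vectors in a single sweep over a sparse matrix, and accumulates two dot products in that same sweep. It is a minimal sketch under stated assumptions, not the authors' implementation: the CSR storage, the row-wise block layout, the choice of fused dot products, and the names `csr_from_dense` and `fused_blocked_step` are all hypothetical, and plain Python stands in for the MPI+OpenMP/CUDA kernels.

```python
# Illustrative sketch only; names and data layout are assumptions,
# not the kernels described in the poster.

def csr_from_dense(dense):
    """Build CSR arrays (row_ptr, col_idx, values) from a small dense matrix."""
    row_ptr, col_idx, values = [0], [], []
    for row in dense:
        for j, a in enumerate(row):
            if a != 0.0:
                col_idx.append(j)
                values.append(a)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, values


def fused_blocked_step(csr, v, w):
    """One Chebyshev recurrence step for a block of vectors.

    v and w are lists of rows; each row holds one entry per vector in the
    block (vector blocking), so every matrix entry that is loaded is reused
    for all vectors. The sparse product, the update w <- 2*A*v - w, and the
    dot products <v, v> and <w_new, v> happen in one sweep over the matrix
    (kernel fusion) instead of in separate passes over memory.
    w is overwritten in place; the per-vector dot products are returned.
    """
    row_ptr, col_idx, values = csr
    n_vec = len(v[0])
    dot_vv = [0.0] * n_vec
    dot_wv = [0.0] * n_vec
    for i in range(len(v)):
        # Sparse row times block of vectors.
        acc = [0.0] * n_vec
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a = values[k]
            v_col = v[col_idx[k]]
            for r in range(n_vec):
                acc[r] += a * v_col[r]
        # Fused update and reductions while row i is still at hand.
        for r in range(n_vec):
            w_new = 2.0 * acc[r] - w[i][r]
            w[i][r] = w_new
            dot_vv[r] += v[i][r] * v[i][r]
            dot_wv[r] += w_new * v[i][r]
    return dot_vv, dot_wv
```

To iterate, a caller would swap the roles of `v` and `w` after each step. Compared with separate sparse-multiply, update, and dot-product routines, the fused and blocked form reads the matrix once per step for the whole block, which is the kind of increase in computational intensity the abstract refers to.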

Chair/Author Details:

Michela Becchi (Chair) - University of Missouri

Manish Parashar (Chair) - Rutgers University

Dorian C. Arnold (Chair) - University of New Mexico

Moritz Kreutzer - Friedrich-Alexander University Erlangen-Nürnberg

Andreas Pieper - Ernst Moritz Arndt University of Greifswald

Andreas Alvermann - Ernst Moritz Arndt University of Greifswald

Holger Fehske - Ernst Moritz Arndt University of Greifswald

Georg Hager - Friedrich-Alexander University Erlangen-Nürnberg

Gerhard Wellein - Friedrich-Alexander University Erlangen-Nürnberg

Alan R. Bishop - Los Alamos National Laboratory
