SCHEDULE: NOV 15-20, 2015
Efficient Large-Scale Sparse Eigenvalue Computations on Heterogeneous Hardware
SESSION: Regular & ACM Student Research Competition Poster Reception
EVENT TYPE: Posters, Receptions, ACM Student Research Competition, Best Poster Finalist
EVENT TAG(S): HPC Beginner Friendly, Regular Poster
TIME: 5:15PM - 7:00PM
SESSION CHAIR(S): Michela Becchi, Manish Parashar, Dorian C. Arnold
AUTHOR(S): Moritz Kreutzer, Andreas Pieper, Andreas Alvermann, Holger Fehske, Georg Hager, Gerhard Wellein, Alan R. Bishop
ROOM: Level 4 - Lobby
ABSTRACT:
In quantum physics it is often required to determine spectral properties of large, sparse matrices.
For instance, an approximation to the full spectrum or a number of inner eigenvalues can be computed with algorithms based on the evaluation of Chebyshev polynomials.
We identify the relevant bottlenecks of this class of algorithms and develop a reformulated version that increases the computational intensity, and thus the achievable efficiency, primarily through kernel fusion and vector blocking.
The optimized algorithm requires a manual implementation of compute kernels.
Guided by a performance model, we show the capabilities of our fully heterogeneous implementation on a petascale system.
Based on MPI+OpenMP/CUDA, our approach utilizes all parts of a heterogeneous CPU+GPU system with high efficiency.
Finally, our scaling study on up to 4096 heterogeneous nodes reveals a performance of half a petaflop/s, which corresponds to 11% of LINPACK performance for an originally bandwidth-bound sparse linear algebra problem.
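The kernel fusion and vector blocking mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the standard Chebyshev recurrence w <- 2a(H - bI)v - w on a CSR matrix, together with the two dot products <v,v> and <w,v> that such schemes typically accumulate. The abstract does not spell out the exact kernel, so the function names, the scaling parameters a and b, and the row-major block layout are illustrative assumptions, not the authors' MPI+OpenMP/CUDA implementation.

```python
# Hypothetical sketch: one Chebyshev recurrence step, unfused vs. fused+blocked.
# csr is a tuple (row_ptr, col_idx, vals) describing a sparse matrix H.
# a and b are assumed scaling factors mapping the spectrum of H into [-1, 1].


def cheb_step_naive(csr, a, b, v, w):
    """Unfused reference: four separate passes over the data.

    Returns (w_new, <v,v>, <w_new,v>) for a single vector v.
    """
    row_ptr, col_idx, vals = csr
    n = len(v)
    # Pass 1: sparse matrix-vector product hv = H v
    hv = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            hv[i] += vals[k] * v[col_idx[k]]
    # Pass 2: shift, scale, and subtract the previous vector
    w_new = [2.0 * a * (hv[i] - b * v[i]) - w[i] for i in range(n)]
    # Passes 3 and 4: two dot products
    vv = sum(x * x for x in v)
    wv = sum(w_new[i] * v[i] for i in range(n))
    return w_new, vv, wv


def cheb_step_fused_block(csr, a, b, V, W):
    """Fused and blocked step for R vectors stored row-major as V[i][r].

    Updates W in place and returns the per-vector dot products
    ([<v_r,v_r>], [<w_r,v_r>]). Each matrix entry is read once for all
    R vectors, and V and W are swept only once.
    """
    row_ptr, col_idx, vals = csr
    num_vecs = len(V[0])
    vv = [0.0] * num_vecs
    wv = [0.0] * num_vecs
    for i in range(len(V)):
        acc = [0.0] * num_vecs
        for k in range(row_ptr[i], row_ptr[i + 1]):
            h = vals[k]
            src = V[col_idx[k]]
            for r in range(num_vecs):
                acc[r] += h * src[r]  # one matrix load serves R vectors
        for r in range(num_vecs):
            wi = 2.0 * a * (acc[r] - b * V[i][r]) - W[i][r]
            W[i][r] = wi  # safe in place: row i of W is only read here
            vv[r] += V[i][r] * V[i][r]
            wv[r] += wi * V[i][r]
    return vv, wv
```

In the unfused version the vectors are streamed through memory several times per step, while the fused, blocked version touches each matrix entry and each vector row once. That is the sense in which such a reformulation raises the ratio of arithmetic to memory traffic for a kernel that is otherwise bandwidth-bound.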
Chair/Author Details:
Michela Becchi (Chair) - University of Missouri
Manish Parashar (Chair) - Rutgers University
Dorian C. Arnold (Chair) - University of New Mexico
Moritz Kreutzer - Friedrich-Alexander University Erlangen-Nürnberg
Andreas Pieper - Ernst Moritz Arndt University of Greifswald
Andreas Alvermann - Ernst Moritz Arndt University of Greifswald
Holger Fehske - Ernst Moritz Arndt University of Greifswald
Georg Hager - Friedrich-Alexander University Erlangen-Nürnberg
Gerhard Wellein - Friedrich-Alexander University Erlangen-Nürnberg
Alan R. Bishop - Los Alamos National Laboratory
