Sponsored by ACM and IEEE
The International Conference for High Performance Computing, Networking, Storage and Analysis
SCHEDULE: NOV 15-20, 2015

Debugging and Performance Tools for MPI and OpenMP 4.0 Applications for CPU and Accelerators/Coprocessors

SESSION: Debugging and Performance Tools for MPI and OpenMP 4.0 Applications for CPU and Accelerators/Coprocessors

EVENT TYPE: Tutorials

EVENT TAG(S): Performance, Programming Systems, Accelerators, Heterogeneous Systems

TIME: 1:30PM - 5:00PM

Presenter(s): Sandra Wienke, Mike Ashworth, Damian Alvarez, Woo-Sun Yang, Chris Gottbrath, Nikolay Piskun



Scientific developers face challenges adapting software to leverage increasingly heterogeneous architectures. Many systems feature nodes that couple multi-core processors with GPU-based computational accelerators, like the NVIDIA Kepler, or many-core coprocessors, like the Intel Xeon Phi. To use these systems effectively, application developers need to expose an extremely high level of parallelism while also coping with the complexities of multiple programming paradigms, including MPI, OpenMP, CUDA, and OpenACC.

This tutorial explores parallel debugging and optimization, focusing on techniques that can be used with accelerators and coprocessors. We cover debugging techniques such as grouping, advanced breakpoints and barriers, and MPI message queue graphing. We discuss optimization techniques such as profiling, tracing, and cache memory optimization with tools such as TAU, VTune, and the NVIDIA Visual Profiler. Participants will spend approximately half the time on hands-on GPU and Intel Xeon Phi debugging and profiling. We also discuss up-to-date capabilities in accelerator and coprocessor computing (e.g., OpenMP 4.0 device constructs, CUDA Unified Memory, CUDA core file debugging) and their peculiarities with respect to error finding and optimization. For the hands-on sessions, attendees must have SSH and NX clients installed on their laptops.

Chair/Presenter Details:

Sandra Wienke - RWTH Aachen University

Mike Ashworth - STFC Daresbury Laboratory

Damian Alvarez - Juelich Research Center

Woo-Sun Yang - Lawrence Berkeley National Laboratory

Chris Gottbrath - NVIDIA Corporation

Nikolay Piskun - Rogue Wave Software, Inc.
