Monitoring Large-Scale HPC Systems: Data Analytics and Insights

SESSION: Monitoring Large-Scale HPC Systems: Data Analytics and Insights

EVENT TYPE: Birds of a Feather

EVENT TAG(S): Facilities, Analytics

TIME: 5:30PM - 7:00PM

SESSION LEADER(S): Jim Brandt, Hans-Christian Hoppe, Mike Mason, Mark Parsons, Marie-Christine Sawley, Mike Showerman

ROOM: Ballroom E


This BoF addresses critical issues in large-scale monitoring from the perspectives of worldwide HPC center system administrators, users, and vendors. We target methodologies, desires, and data for gaining insights from large-scale system monitoring: identifying the causes of application performance variation; detecting contention for shared resources and assessing its impacts; discovering misbehaving users and system components; and automating these analyses. A panel of large-scale HPC stakeholders will interact with BoF attendees on these issues. We will explore the creation of an open, vendor-neutral Special Interest Group. Community information, including BoF plans, archives, and plans for a follow-up webinar series, will be available at: https://sites.google.com/site/monitoringlargescalehpcsystems/.

Session Leader Details:

Jim Brandt (Primary Session Leader) - Sandia National Laboratories

Hans-Christian Hoppe (Secondary Session Leader) - Intel Corporation

Mike Mason (Secondary Session Leader) - Los Alamos National Laboratory

Mark Parsons (Secondary Session Leader) - Edinburgh Parallel Computing Centre

Marie-Christine Sawley (Secondary Session Leader) - Intel Corporation

Mike Showerman (Secondary Session Leader) - National Center for Supercomputing Applications

