Sponsored by ACM and IEEE
The International Conference for High Performance Computing, Networking, Storage and Analysis

SCHEDULE: NOV 15-20, 2015

BD-CATS: Big Data Clustering at Trillion Particle Scale

SESSION: Data Clustering


EVENT TAG(S): Applications, Analytics, Simulation

TIME: 10:30AM - 11:00AM

SESSION CHAIR(S): Vijay Gadepally

AUTHOR(S): Md. Mostofa Ali Patwary, Suren Byna, Nadathur Rajagopalan Satish, Narayanan Sundaram, Zarija Lukic, Vadim Roytershteyn, Michael J. Anderson, Yushu Yao, Mr Prabhat, Pradeep Dubey



Modern cosmology and plasma-physics codes are now capable of
simulating trillions of particles on petascale systems. Each timestep
output from such simulations is on the order of tens of terabytes. Summarizing
and analyzing raw particle data is challenging, and scientists
often focus on density structures, whether in the real 3D
space, or a high-dimensional phase space. In this work, we develop
a highly scalable version of the clustering algorithm DBSCAN, and
apply it to the largest datasets produced by state-of-the-art codes.
Our system, called BD-CATS, is the first one capable of performing
end-to-end analysis at trillion-particle scale. We show analysis of 1.4
trillion particles from a plasma-physics simulation, and a 10,240^3-particle
cosmological simulation, utilizing ~100,000 cores in 30
minutes. BD-CATS is helping infer mechanisms behind particle
acceleration in plasma physics and holds promise for qualitatively
superior clustering in cosmology. Both of these results were previously
intractable at the trillion-particle scale.
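
For readers unfamiliar with DBSCAN, the density-based clustering algorithm named in the abstract, the following is a minimal serial sketch of the textbook method. It is not the BD-CATS implementation: the abstract does not describe how the distributed version works, so the brute-force neighbor search, the function names (`region_query`, `dbscan`) and the parameter names (`eps`, `min_pts`) are all illustrative assumptions.

```python
from math import dist

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet examined


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Textbook serial DBSCAN with a brute-force O(n^2) neighbor search.

    Illustrative only; a trillion-particle run would need a spatial index
    and a distributed design that this sketch does not attempt.
    """
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster    # i is a core point: start a new cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
        cluster += 1
    return labels


if __name__ == "__main__":
    # Two small dense groups and one isolated point (made-up toy data).
    demo = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(demo, eps=1.5, min_pts=3))
```

The quadratic neighbor search is the part that does not scale: each call to `region_query` scans every point, which is why a run at the sizes quoted in the abstract requires a fundamentally different, parallel design.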

Chair/Author Details:

Vijay Gadepally (Chair) - Massachusetts Institute of Technology

Md. Mostofa Ali Patwary - Intel Corporation

Suren Byna - Lawrence Berkeley National Laboratory

Nadathur Rajagopalan Satish - Intel Corporation

Narayanan Sundaram - Intel Corporation

Zarija Lukic - Lawrence Berkeley National Laboratory

Vadim Roytershteyn - Space Science Institute

Michael J. Anderson - Intel Corporation

Yushu Yao - Lawrence Berkeley National Laboratory

Mr Prabhat - Lawrence Berkeley National Laboratory

Pradeep Dubey - Intel Corporation

Paper provided by the ACM Digital Library

Paper also available from IEEE Computer Society