The International Conference for High Performance
Computing, Networking, Storage and Analysis
Sponsored by ACM and IEEE

SCHEDULE: NOV 15-20, 2015


Improving Concurrency and Asynchrony in Multithreaded MPI Applications using Software Offloading

SESSION: MPI/Communication


EVENT TAG(S): Power, System Software, Clouds and Distributed Computing, Resiliency

TIME: 4:00PM - 4:30PM


AUTHOR(S): Karthikeyan Vaidyanathan, Dhiraj D. Kalamkar, Kiran Pamnany, Jeff R. Hammond, Pavan Balaji, Dipankar Das, Jongsoo Park, Balint Joo



We present a new approach for multithreaded communication
and asynchronous progress in MPI applications, wherein we offload
communication processing to a dedicated thread. The central
premise is that given the rapidly increasing core counts on modern
systems, the improvements in MPI performance arising from
dedicating a thread to drive communication outweigh the small
loss of resources for application computation, particularly when
overlap of communication and computation can be exploited. Our
approach allows application threads to make MPI calls concurrently,
enqueuing these as communication tasks to be processed
by a dedicated communication thread. This not only guarantees
progress for such communication operations, but also reduces load
imbalance. Our implementation also significantly reduces
the overhead of mutual exclusion seen in existing
implementations for applications using
MPI_THREAD_MULTIPLE. Our technique requires no modification
to the application, and we demonstrate significant performance
improvement (up to 2X) for QCD, FFT and deep learning CNN.
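
The offloading pattern described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and makes no real MPI calls: `CommOffloader`, `fake_send`, and `run_demo` are hypothetical names introduced here, and `fake_send` merely stands in for a communication operation. The sketch only shows the general idea of application threads enqueuing communication tasks that a single dedicated thread processes.

```python
import queue
import threading
from concurrent.futures import Future


class CommOffloader:
    """Hypothetical helper: one dedicated thread drains a queue of communication tasks."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(
            target=self._progress_loop, name="comm-thread", daemon=True
        )
        self._thread.start()

    def submit(self, fn, *args):
        """Called by application threads: enqueue a task and return at once."""
        fut = Future()
        self._tasks.put((fn, args, fut))
        return fut

    def _progress_loop(self):
        # Only this thread ever runs the communication calls, so the calls
        # themselves are serialized without a lock around each one.
        while True:
            item = self._tasks.get()
            if item is None:  # sentinel: stop the loop
                break
            fn, args, fut = item
            try:
                fut.set_result(fn(*args))
            except Exception as exc:
                fut.set_exception(exc)

    def shutdown(self):
        self._tasks.put(None)
        self._thread.join()


def fake_send(dest, payload):
    """Stand-in for a communication call (assumption, not an MPI API)."""
    return (dest, payload, threading.current_thread().name)


def run_demo(num_app_threads=4):
    offloader = CommOffloader()
    results = [None] * num_app_threads

    def app_thread(i):
        fut = offloader.submit(fake_send, i, f"msg-{i}")
        # Application computation could overlap with communication here.
        results[i] = fut.result()

    threads = [
        threading.Thread(target=app_thread, args=(i,))
        for i in range(num_app_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    offloader.shutdown()
    return results
```

Calling `run_demo()` returns one tuple per application thread; the third field of each tuple is the name of the thread that executed the stand-in call, which in this sketch is always the dedicated communication thread.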

Chair/Author Details:

Yong Chen (Chair) - Texas Tech University

Karthikeyan Vaidyanathan - Intel Corporation

Dhiraj D. Kalamkar - Intel Corporation

Kiran Pamnany - Intel Corporation

Jeff R. Hammond - Intel Corporation

Pavan Balaji - Argonne National Laboratory

Dipankar Das - Intel Corporation

Jongsoo Park - Intel Corporation

Balint Joo - Thomas Jefferson National Accelerator Facility


Paper provided by the ACM Digital Library

Paper also available from IEEE Computer Society