Sponsored by ACM and IEEE
The International Conference for High Performance Computing, Networking, Storage and Analysis
SCHEDULE: NOV 15-20, 2015

Increasing Fabric Utilization with Job-Aware Routing

SESSION: Regular & ACM Student Research Competition Poster Reception

EVENT TYPE: Posters, Receptions, ACM Student Research Competition

EVENT TAG(S): HPC Beginner Friendly, Regular Poster

TIME: 5:15PM - 7:00PM

SESSION CHAIR(S): Michela Becchi, Manish Parashar, Dorian C. Arnold

AUTHOR(S): Jens Domke

ROOM: Level 4 - Lobby


InfiniBand (IB) has become one of the most widely used interconnects for HPC systems in recent years. The achievable communication throughput for parallel applications depends heavily on the number of available links and switches in the fabric. This number in turn depends on the quality of the routing algorithm in use, which usually optimizes the forwarding tables for global path balancing. However, in a multi-user/multi-job HPC environment, this results in suboptimal usage of the shared network by individual jobs. We extend an existing routing algorithm to factor in the locality of running parallel applications, and we create an interface between the batch system and the IB subnet manager to drive the necessary re-routing steps for the fabric. As a result, our job-aware routing allows each running parallel application to make better use of the shared IB fabric, and therefore increases application performance and overall fabric utilization.
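
The difference between global path balancing and job-aware balancing can be illustrated with a toy model. The sketch below is not the poster's algorithm; it assumes a single leaf switch with two uplinks, destination-based forwarding, and a hypothetical allocation in which two jobs are interleaved across four remote nodes. All function names and node labels are made up for illustration.

```python
def route_globally(destinations, num_uplinks):
    """Assign each destination to an uplink round-robin, ignoring jobs."""
    return {dst: i % num_uplinks for i, dst in enumerate(destinations)}


def route_job_aware(jobs, num_uplinks):
    """Assign destinations round-robin within each job separately."""
    table = {}
    for nodes in jobs.values():
        for i, dst in enumerate(nodes):
            table[dst] = i % num_uplinks
    return table


def uplinks_per_job(table, jobs):
    """Count the distinct uplinks that carry traffic toward each job's nodes."""
    return {job: len({table[dst] for dst in nodes}) for job, nodes in jobs.items()}


# Hypothetical allocation: jobs A and B interleaved across remote nodes n4..n7.
jobs = {"A": ["n4", "n6"], "B": ["n5", "n7"]}
destinations = sorted(n for nodes in jobs.values() for n in nodes)

global_table = route_globally(destinations, num_uplinks=2)
job_table = route_job_aware(jobs, num_uplinks=2)

print(uplinks_per_job(global_table, jobs))  # {'A': 1, 'B': 1}
print(uplinks_per_job(job_table, jobs))     # {'A': 2, 'B': 2}
```

In this toy case both tables place two destinations on each uplink, so both are globally balanced, but only the job-aware table lets each job spread its traffic over both uplinks.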

Chair/Author Details:

Michela Becchi (Chair) - University of Missouri
Manish Parashar (Chair) - Rutgers University
Dorian C. Arnold (Chair) - University of New Mexico

Jens Domke - Dresden University of Technology

