PGX.D: A Fast Distributed Graph Processing System

SESSION: Management of Graph Workloads

EVENT TYPE: Papers, Best Paper Finalists

EVENT TAG(S): System Software, Clouds and Distributed Computing

TIME: 4:30PM - 5:00PM


AUTHOR(S): Sungpack Hong, Siegfried Depner, Thomas Manhardt, Jan Van Der Lugt, Merijn Varensteen, Hassan Chafi



Graph analysis is a powerful method in data analysis. In this paper, we present PGX.D, a fast distributed graph processing system. We show that PGX.D significantly outperforms other distributed graph systems such as GraphLab (3x to 90x). Furthermore, PGX.D on 4 to 16 machines is also faster than single-machine execution. Using a fast cooperative context-switching mechanism, we implement PGX.D as a low-overhead, bandwidth-efficient communication framework that supports remote data-pulling patterns. Moreover, PGX.D achieves large traffic reduction and good workload balance by transparently applying selective ghost nodes, edge partitioning, and edge chunking. Our analysis confirms that each of these features is crucial to the overall performance of certain kinds of graph algorithms. Finally, we advocate the use of balanced beefy clusters, in which the aggregate sustained random DRAM-access bandwidth is matched to the bandwidth of the underlying interconnection fabric.
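The abstract names edge partitioning as one source of workload balance but does not describe its mechanics. As a rough illustration only, the sketch below assumes it means assigning each machine a contiguous vertex range sized so that every machine owns roughly the same number of edges, rather than the same number of vertices. The function name `edge_balanced_ranges` and the degree-list input are hypothetical and are not taken from PGX.D.

```python
from bisect import bisect_left
from itertools import accumulate


def edge_balanced_ranges(degrees, num_machines):
    """Split vertices 0..n-1 into contiguous ranges with similar edge counts.

    Hypothetical helper for illustration; not PGX.D's actual partitioner.
    degrees[v] is the number of edges owned by vertex v.
    Returns a list of (start, end) half-open vertex ranges, one per machine.
    """
    # prefix[i] = total edges owned by vertices 0..i
    prefix = list(accumulate(degrees))
    total = prefix[-1] if prefix else 0

    bounds = [0]
    for m in range(1, num_machines):
        target = total * m / num_machines
        # Cut just after the first vertex whose cumulative edge count
        # reaches the target for this boundary.
        cut = bisect_left(prefix, target) + 1
        # Keep boundaries monotonic and within the vertex range.
        cut = max(bounds[-1], min(cut, len(degrees)))
        bounds.append(cut)
    bounds.append(len(degrees))

    return [(bounds[i], bounds[i + 1]) for i in range(num_machines)]


# One high-degree vertex followed by many low-degree ones.
degrees = [8, 1, 1, 1, 1, 1, 1, 2]
ranges = edge_balanced_ranges(degrees, 2)
edges_per_machine = [sum(degrees[s:e]) for s, e in ranges]
```

In this example the first machine receives only the single high-degree vertex while the second receives the remaining seven, yet both hold eight edges. A split by vertex count would instead give the first machine 11 of the 16 edges. With more machines than a skewed graph can fill, some ranges in this sketch come out empty, which is one reason a real system would likely combine such partitioning with other techniques.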

Chair/Author Details:

Manoj Kumar (Chair) - IBM Corporation

Sungpack Hong - Oracle Corporation

Siegfried Depner - Oracle Corporation

Thomas Manhardt - Oracle Corporation

Jan Van Der Lugt - Google

Merijn Varensteen - University of Amsterdam

Hassan Chafi - Oracle Corporation

