Multi-GPU Graph Analytics
Authors: Yuechao Pan (University of California, Davis), Yangzihao Wang (University of California, Davis), Yuduo Wu (University of California, Davis), Carl Yang (University of California, Davis), John D. Owens (University of California, Davis)
Abstract: We present Gunrock, a multi-GPU graph processing library that makes it easy to implement graph algorithms and extend them to multiple GPUs, delivering scalable performance on large graphs with billions of edges. Our high-level, data-centric abstraction focuses on operations on vertex or edge frontiers. With this abstraction, Gunrock balances performance and low programming complexity by coupling high-performance GPU computing primitives with optimization strategies. Our multi-GPU framework requires programmers to specify only a few algorithm-dependent blocks and hides most multi-GPU implementation details. The framework effectively overlaps computation and communication, and implements a just-enough memory allocation scheme that allows memory usage to scale with more GPUs. We achieve a peak performance of 22 GTEPS for BFS, the best of all single-node GPU graph libraries, and demonstrate a 6X speed-up with 2X total memory consumption on 8 GPUs. We identify synchronization and data communication patterns, graph topologies, and partitioning algorithms as limiting factors to further scalability.
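To make the idea of frontier operations concrete, the following is a minimal single-threaded sketch of a frontier-based BFS. It is an illustration only and not Gunrock's API: the function names `advance`, `filter_frontier`, and `bfs`, and the adjacency-list representation, are assumptions made for this example. A GPU library would run each step in parallel across the frontier.

```python
from typing import List

Graph = List[List[int]]  # adjacency list: graph[u] holds the neighbors of u


def advance(graph: Graph, frontier: List[int]) -> List[int]:
    """Expand the vertex frontier: collect every neighbor of every frontier vertex.

    The result may contain duplicates and already-visited vertices.
    """
    return [v for u in frontier for v in graph[u]]


def filter_frontier(candidates: List[int], depth: List[int], level: int) -> List[int]:
    """Keep only undiscovered vertices and label them with the current level.

    Labeling a vertex as soon as it is kept also removes duplicates,
    because a second occurrence no longer has depth -1.
    """
    next_frontier = []
    for v in candidates:
        if depth[v] == -1:
            depth[v] = level
            next_frontier.append(v)
    return next_frontier


def bfs(graph: Graph, source: int) -> List[int]:
    """Return the BFS depth of every vertex from source (-1 if unreachable)."""
    depth = [-1] * len(graph)
    depth[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        frontier = filter_frontier(advance(graph, frontier), depth, level)
    return depth


if __name__ == "__main__":
    # Hypothetical 5-vertex example graph.
    example = [[1, 2], [3], [3], [4], []]
    print(bfs(example, 0))
```

In this sketch the algorithm-specific part is small: the rule inside `filter_frontier` that decides which vertices survive and how they are labeled. The loop that repeatedly expands and prunes the frontier is generic, which is the kind of separation the abstract describes.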
Two-page extended abstract: pdf