Authors: Hiroaki Umeda (University of Tsukuba), Toshihiro Hanawa (University of Tokyo), Mitsuo Shoji (University of Tsukuba), Taisuke Boku (University of Tsukuba), Yasuteru Shigeta (University of Tsukuba)

Abstract: A GPU-enabled FMO (fragment molecular orbital) program has been developed with CUDA, and performance benchmarks have been carried out, including the first large-scale GPU-accelerated FMO calculation. The FMO method is an ab initio molecular orbital method for large molecules, and there is demand to run it on modern HPC systems such as GPU clusters. An FMO calculation has two hotspots: Fock matrix construction and electrostatic potential (ESP) calculation. The GPU-enabled Fock matrix construction takes full advantage of the symmetry of the two-electron integrals without costly exclusive accumulation into a shared matrix. For the ESP calculation, the four-center inter-fragment Coulomb interaction is implemented on the GPU with the same strategy as the Fock matrix construction. Performance benchmarks show a 3.8-fold speedup over the CPU on-the-fly calculation. As a larger benchmark, an FMO calculation of a protein with 23,460 atoms was performed on 256 NVIDIA M2090 GPUs, and this large-scale GPU-accelerated FMO calculation completed successfully in 2 hours.
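The symmetry argument in the abstract can be illustrated with a small CPU-side sketch. It is not the authors' CUDA implementation, whose accumulation scheme the abstract does not describe. It assumes a closed-shell two-electron Fock contribution G[m,n] = sum over l,s of D[l,s] * ((mn|ls) - 0.5 (ml|ns)), uses random symmetric test data in place of real integrals, and stands in for GPU threads with hypothetical "workers" that each write to a private buffer, summed at the end. Every function name and the `n_workers` parameter are illustrative.

```python
import numpy as np


def symmetric_eri(n, seed=0):
    """Random stand-in for two-electron integrals with 8-fold symmetry."""
    rng = np.random.default_rng(seed)
    eri = rng.standard_normal((n, n, n, n))
    eri = eri + eri.transpose(1, 0, 2, 3)  # (ij|kl) = (ji|kl)
    eri = eri + eri.transpose(0, 1, 3, 2)  # (ij|kl) = (ij|lk)
    eri = eri + eri.transpose(2, 3, 0, 1)  # (ij|kl) = (kl|ij)
    return eri


def unique_quartets(n):
    """Yield one representative (i, j, k, l) per symmetry-equivalent class."""
    for i in range(n):
        for j in range(i + 1):
            ij = i * (i + 1) // 2 + j
            for k in range(i + 1):
                for l in range(k + 1):
                    if k * (k + 1) // 2 + l <= ij:
                        yield i, j, k, l


def fock_2e_naive(eri, dens):
    """Reference build that touches all n**4 integrals."""
    coulomb = np.einsum("mnls,ls->mn", eri, dens)
    exchange = np.einsum("mlns,ls->mn", eri, dens)
    return coulomb - 0.5 * exchange


def fock_2e_symmetric(eri, dens, n_workers=4):
    """Build the same matrix from unique integrals only.

    Each hypothetical worker accumulates into its own private buffer, so no
    exclusive (atomic) update of a shared matrix is needed; the buffers are
    reduced once at the end. This is an assumed scheme for illustration.
    """
    n = dens.shape[0]
    private = np.zeros((n_workers, n, n))
    for idx, (i, j, k, l) in enumerate(unique_quartets(n)):
        g = private[idx % n_workers]  # round-robin work distribution
        v = eri[i, j, k, l]           # integral value is read once
        # Distinct index permutations that share this integral value.
        images = {
            (i, j, k, l), (j, i, k, l), (i, j, l, k), (j, i, l, k),
            (k, l, i, j), (l, k, i, j), (k, l, j, i), (l, k, j, i),
        }
        for a, b, c, d in images:
            g[a, b] += dens[c, d] * v        # Coulomb-type term
            g[a, c] -= 0.5 * dens[b, d] * v  # exchange-type term
    return private.sum(axis=0)
```

For n basis functions the symmetric build reads N(N+1)/2 integrals with N = n(n+1)/2, roughly one eighth of the n^4 read by the naive build. Calling both functions on the same `symmetric_eri(n)` tensor and a symmetric density matrix should give matching results to floating-point tolerance.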

Poster: pdf

Two-page extended abstract: pdf
