Large-Scale MO Calculation with GPU-accelerated FMO Program
Authors: Hiroaki Umeda (University of Tsukuba), Toshihiro Hanawa (University of Tokyo), Mitsuo Shoji (University of Tsukuba), Taisuke Boku (University of Tsukuba), Yasuteru Shigeta (University of Tsukuba)
Abstract: A GPU-enabled FMO (fragment molecular orbital) program has been
developed with CUDA, and performance benchmarks have been executed,
including the first large-scale GPU-accelerated FMO calculation.
The FMO method is one of the ab initio molecular orbital methods for
large molecules, and it is desirable to run it on modern HPC systems
such as GPU clusters. There are two hotspots in an FMO calculation: Fock
matrix construction and electrostatic potential (ESP) calculation. The
GPU-enabled Fock matrix construction takes full advantage of the
symmetry of the two-electron integrals without costly exclusive
accumulation into a shared matrix. For the ESP calculation, the
four-center inter-fragment Coulomb interaction is implemented for GPGPU
with the same strategy as the Fock matrix construction.
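
To illustrate what using the symmetry of the two-electron integrals
means, the following is a minimal serial sketch in Python, not the
authors' CUDA implementation. It assumes a closed-shell Fock
contribution of the form J - K/2 and uses a random tensor with the
8-fold permutational symmetry as a stand-in for real integrals. The
function names (symmetric_eri, fock_2e_full, fock_2e_unique) are
hypothetical. Each symmetry-unique integral (ij|kl) is visited once and
scattered to every Fock element it contributes to. How a GPU kernel
avoids exclusive (atomic) accumulation during that scatter is not
reproduced here.

```python
import numpy as np

# The 8 index permutations that leave a real two-electron integral unchanged.
PERMS = [(0, 1, 2, 3), (1, 0, 2, 3), (0, 1, 3, 2), (1, 0, 3, 2),
         (2, 3, 0, 1), (3, 2, 0, 1), (2, 3, 1, 0), (3, 2, 1, 0)]


def symmetric_eri(n, seed=0):
    """Random stand-in tensor with 8-fold symmetry (not real integrals)."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n, n, n))
    # Summing over the whole permutation group makes the result invariant.
    return sum(a.transpose(p) for p in PERMS)


def fock_2e_full(eri, dens):
    """Reference: contract the full tensor, ignoring symmetry."""
    coulomb = np.einsum("pqrs,rs->pq", eri, dens)
    exchange = np.einsum("prqs,rs->pq", eri, dens)
    return coulomb - 0.5 * exchange


def fock_2e_unique(eri, dens):
    """Same result, but each symmetry-unique integral is read only once."""
    n = dens.shape[0]
    fock = np.zeros_like(dens)
    n_unique = 0
    for i in range(n):
        for j in range(i + 1):
            ij = i * (i + 1) // 2 + j
            for k in range(i + 1):
                for l in range(k + 1):
                    kl = k * (k + 1) // 2 + l
                    if kl > ij:
                        continue  # equivalent quartet is handled elsewhere
                    v = eri[i, j, k, l]
                    n_unique += 1
                    idx = (i, j, k, l)
                    # Distinct index images of this integral (a set, so
                    # coinciding images such as i == j are counted once).
                    images = {tuple(idx[a] for a in p) for p in PERMS}
                    for p, q, r, s in images:
                        fock[p, q] += dens[r, s] * v        # Coulomb part
                        fock[p, r] -= 0.5 * dens[q, s] * v  # exchange part
    return fock, n_unique


# Small demonstration with 4 basis functions and a symmetric density matrix.
eri_demo = symmetric_eri(4)
d_demo = np.random.default_rng(1).standard_normal((4, 4))
d_demo = d_demo + d_demo.T
f_demo, count_demo = fock_2e_unique(eri_demo, d_demo)
print("unique integrals:", count_demo, "of", eri_demo.size)
print("max deviation:", np.abs(f_demo - fock_2e_full(eri_demo, d_demo)).max())
```

The four-center ESP term has the shape of the Coulomb part alone: the
potential on basis pair (mu, nu) of one fragment is the sum over basis
pairs (lambda, sigma) of another fragment of D[lambda, sigma] times
(mu nu|lambda sigma), which is presumably why the same strategy carries
over.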
Performance benchmarks show a 3.8-times speedup over CPU on-the-fly
calculation. As a larger benchmark, an FMO calculation of a protein with
23,460 atoms is performed with 256 NVIDIA M2090 GPUs, and this
large-scale GPU-accelerated FMO calculation is successfully executed in
2 hours.
Two-page extended abstract: pdf