Optimization of Stencil-Based Fusion Kernels on Tera-Flops Many-Core Architectures
Authors: Yuuichi Asahi (Japan Atomic Energy Agency), Guillaume Latu (French Alternative Energies and Atomic Energy Commission), Takuya Ina (Japan Atomic Energy Agency), Yasuhiro Idomura (Japan Atomic Energy Agency), Virginie Grandgirard (French Alternative Energies and Atomic Energy Commission), Xavier Garbet (French Alternative Energies and Atomic Energy Commission)
Abstract: Plasma turbulence is of great importance in fusion science. However, turbulence simulations are costly, and simulating the upcoming fusion device ITER will require more computing power. This motivates us to develop advanced algorithms and optimizations for fusion codes on Tera-flops many-core architectures such as Xeon Phi, Nvidia Tesla, and Fujitsu FX100. We evaluate the performance of kernels extracted from the hot spots of two fusion codes, GYSELA and GT5D, which use different numerical algorithms.
For the former kernel (GYSELA), which is based on a semi-Lagrangian scheme with high arithmetic intensity, high performance is obtained on accelerators (Xeon Phi and Tesla) by applying SIMD optimization. In the latter kernel (GT5D), which is based on a finite difference scheme, a large shared cache plays a critical role in improving arithmetic intensity, and thus multi-core CPUs (FX100) with a large shared cache give higher performance.
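To illustrate why cache reuse can raise the arithmetic intensity of a finite difference kernel, here is a minimal sketch. It does not reproduce the GT5D kernel: the 1D five-point stencil, the flop and load counts, and the function name `arithmetic_intensity` are assumptions chosen only for illustration.

```python
def arithmetic_intensity(flops_per_point, loads_per_point, stores_per_point,
                         bytes_per_value=8):
    """Return flops per byte of main-memory traffic for one grid-point update."""
    bytes_moved = (loads_per_point + stores_per_point) * bytes_per_value
    return flops_per_point / bytes_moved


# Hypothetical 1D five-point stencil in double precision (not the GT5D kernel):
#   out[i] = c0*u[i] + c1*(u[i-1] + u[i+1]) + c2*(u[i-2] + u[i+2])
# 3 multiplications + 4 additions = 7 flops per grid point.
FLOPS = 7

# No cache reuse: all 5 stencil inputs are fetched from memory for every point.
no_reuse = arithmetic_intensity(FLOPS, loads_per_point=5, stores_per_point=1)

# Ideal cache reuse: neighbouring values stay in cache, so each input value
# is fetched from memory only once per sweep.
full_reuse = arithmetic_intensity(FLOPS, loads_per_point=1, stores_per_point=1)

print(f"no cache reuse:   {no_reuse:.3f} flop/byte")
print(f"full cache reuse: {full_reuse:.3f} flop/byte")
```

Under these assumed counts, the intensity rises from 7/48 to 7/16 flop/byte when every neighbouring value is served from cache. Real multi-dimensional stencils need a much larger working set to reach that limit, which is one way a large shared cache can help.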
Two-page extended abstract: pdf