A Performance Evaluation of Kokkos and RAJA using the TeaLeaf Mini-App
Authors: Matt Martineau (University of Bristol), Simon McIntosh-Smith (University of Bristol), Wayne Gaudin (Atomic Weapons Establishment), Mike Boulton (University of Bristol), David Beckingsale (Lawrence Livermore National Laboratory)
Abstract: In this research project we have taken the TeaLeaf ‘mini-app’ and developed several ports using a mixture of new and mature parallel programming models. Performance data collected on modern HPC devices demonstrates the capacity for each to achieve portable performance.
We have discovered that RAJA is a promising model with an intuitive development approach that exhibits good performance on CPUs but currently lacks functional portability. Kokkos requires more up-front development, but presents a highly competitive option for performance portability on CPUs and NVIDIA GPUs. The results show that Kokkos can perform to within 5% of OpenMP and hand-optimised CUDA for some solvers, and collaboration with Sandia demonstrated that good performance could be achieved on Intel Xeon Phi devices at the expense of GPU performance.
Our poster presents highlights of the results collected during this research to enable open discussion about the benefits of each model for application developers.
Two-page extended abstract: pdf