Integrated Co-Design of Future Exascale Software
Authors: Bjoern Gmeiner (University Erlangen-Nuremberg), Markus Huber (Technical University of Munich), Lorenz John (Technical University of Munich), Ulrich Ruede (University Erlangen-Nuremberg), Christian Waluga (Technical University of Munich), Barbara Wohlmuth (Technical University of Munich), Holger Stengel (University Erlangen-Nuremberg), Martin Bauer (University Erlangen-Nuremberg)
Abstract: The co-design of algorithms for the numerical approximation of partial differential equations is essential to exploit future exascale systems. Here, we focus on key attributes such as node performance, ultra-scalable multigrid methods, scheduling techniques for uncertain data, and fault-tolerant iterative solvers. In the case of a hard fault, we combine domain partitioning with highly scalable geometric multigrid schemes to obtain fast fault-robust solvers. The recovery strategy is based on a hierarchical hybrid concept in which the values on lower-dimensional primitives such as faces are stored redundantly and can thus be recovered easily. The lost volume unknowns are recomputed approximately by solving a local Dirichlet problem on the faulty subdomain. Different strategies are compared and evaluated with respect to performance, computational cost, and speedup. Locally accelerated strategies, which result in asynchronous multigrid iterations, can fully compensate for faults.
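The recovery idea described in the abstract can be illustrated with a minimal sketch, under strong simplifying assumptions: a 1D Poisson problem stands in for the PDE, the two grid points adjacent to the faulty block play the role of the redundantly stored interface values, and a direct tridiagonal solve replaces the approximate multigrid solve used by the authors. All names (`solve_dirichlet`, `lo`, `hi`, and so on) are hypothetical and do not come from the authors' software.

```python
def solve_dirichlet(f_vals, h, left, right):
    """Solve -u'' = f on a uniform 1D grid with Dirichlet values left/right.

    Uses the Thomas algorithm on the system 2*u_i - u_{i-1} - u_{i+1} = h^2 * f_i.
    """
    n = len(f_vals)
    rhs = [h * h * fv for fv in f_vals]
    rhs[0] += left      # known boundary value moved to the right-hand side
    rhs[-1] += right
    c = [0.0] * n       # modified super-diagonal
    d = [0.0] * n       # modified right-hand side
    c[0] = -1.0 / 2.0
    d[0] = rhs[0] / 2.0
    for i in range(1, n):           # forward elimination
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        u[i] = d[i] - c[i] * u[i + 1]
    return u


# Fault-free reference solution on (0, 1) with homogeneous Dirichlet data.
n = 31
h = 1.0 / (n + 1)
f = [1.0] * n
u = solve_dirichlet(f, h, 0.0, 0.0)

# Simulated hard fault: unknowns lo..hi-1 are lost (assumed index range).
lo, hi = 10, 20
damaged = list(u)
for i in range(lo, hi):
    damaged[i] = 0.0
fault_err = max(abs(a - b) for a, b in zip(damaged, u))

# Recovery: the surviving neighbours damaged[lo-1] and damaged[hi] act as
# Dirichlet data for a local problem on the faulty subdomain only.
recovered = list(damaged)
recovered[lo:hi] = solve_dirichlet(f[lo:hi], h, damaged[lo - 1], damaged[hi])
max_err = max(abs(a - b) for a, b in zip(recovered, u))
```

In this sketch the local solve is exact, so the lost values are restored up to rounding error. With an approximate local solver, as in the abstract, the recovered values would differ slightly from the fault-free ones, and the remaining error would be removed by the subsequent global iterations.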
Two-page extended abstract: pdf