SC15 Austin, TX

LIBXSMM: A High Performance Library for Small Matrix Multiplications

Authors: Alexander Heinecke (Intel Corporation), Hans Pabst (Intel Corporation), Greg Henry (Intel Corporation)

Abstract: In this work we present LIBXSMM, a library that provides high-performance implementations of small sparse and dense matrix multiplications on the latest Intel architectures. Such operations are important building blocks in modern scientific applications, yet general math libraries are normally tuned for the case where all dimensions are large. LIBXSMM follows a matrix multiplication code generation approach that specifically matches the applications' needs. Because it provides several interfaces, replacing BLAS calls is simple and straightforward. We show that, depending on the application's characteristics, LIBXSMM can either leverage the entire DRAM bandwidth or reach close to the processor's peak computational performance. Our performance results for CP2K and SeisSol demonstrate that using LIBXSMM as a highly efficient computational backend leads to speed-ups of more than a factor of two compared to compiler-generated inlined code or calls to highly optimized vendor math libraries.
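As a rough illustration of what a code generation approach for small matrix multiplications can look like, the following toy sketch generates a kernel specialized to fixed dimensions. It is not LIBXSMM's generator and does not use LIBXSMM's interfaces; the names generate_small_gemm and reference_gemm are hypothetical, and the column-major flat-list layout is an assumption chosen to mirror the BLAS convention. The only point it makes is that, once the dimensions are known, all loops and index arithmetic can be resolved at generation time.

```python
def generate_small_gemm(m, n, k):
    """Return a function computing C += A * B for fixed sizes.

    A is m x k, B is k x n, C is m x n, all stored as flat lists in
    column-major order. The loops are unrolled at generation time, so
    the emitted kernel contains only straight-line statements.
    """
    if min(m, n, k) < 1:
        raise ValueError("all dimensions must be at least 1")
    lines = ["def kernel(a, b, c):"]
    for j in range(n):
        for i in range(m):
            # One statement per entry of C, with constant indices.
            terms = " + ".join(
                f"a[{i + p * m}] * b[{p + j * k}]" for p in range(k)
            )
            lines.append(f"    c[{i + j * m}] += {terms}")
    namespace = {}
    exec("\n".join(lines), namespace)  # compile the generated source
    return namespace["kernel"]


def reference_gemm(m, n, k, a, b, c):
    """Generic triple loop computing C += A * B (column-major)."""
    for j in range(n):
        for i in range(m):
            for p in range(k):
                c[i + j * m] += a[i + p * m] * b[p + j * k]


# Hypothetical usage: a 2 x 2 by 2 x 2 product.
kernel = generate_small_gemm(2, 2, 2)
a = [1.0, 3.0, 2.0, 4.0]  # column-major [[1, 2], [3, 4]]
b = [5.0, 7.0, 6.0, 8.0]  # column-major [[5, 6], [7, 8]]
c = [0.0] * 4
kernel(a, b, c)
print(c)  # [19.0, 43.0, 22.0, 50.0]
```

A production generator would emit vectorized machine code for the target architecture rather than interpreted source, but the specialization step has the same shape: dimensions in, dedicated kernel out.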

Poster: pdf
Two-page extended abstract: pdf
