Fast Classification of MPI Applications Using Lamport's Logical Clocks
Authors: Zhou Tong (Florida State University)
Abstract: We present a novel trace-based analysis tool that rapidly classifies an
MPI application as bandwidth-bound, latency-bound, load-imbalance-bound,
or computation-bound for different interconnection networks. The tool uses
an extension of Lamport's logical clock to track application progress
during trace replay. It has two unique features. First, it predicts the
application performance for many network configurations by replaying the
traces only once. Second, it infers the performance characteristics
of an application and classifies the application using the predicted
performance trend for a range of network configurations instead of using
the predicted performance for a particular network configuration.
We describe the techniques used in the tool and its design and implementation,
and report our performance study of the tool and our experience with
classifying 9 applications and miniapps from the DOE Design Forward
project as well as the NAS parallel benchmarks.
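
The single-pass idea can be illustrated with a minimal sketch. It is not the authors' tool. It assumes a hypothetical event format, a simple message cost of latency plus bytes divided by bandwidth, and sends that never block the sender. Each rank keeps one logical clock entry per network configuration, so a single walk through the trace yields a predicted finish time for every configuration.

```python
from collections import deque

def replay(traces, configs):
    """Replay per-rank traces once for many network configurations.

    traces:  {rank: [event, ...]} with hypothetical events
             ("compute", seconds), ("send", dst, nbytes), ("recv", src)
    configs: [(latency_seconds, bandwidth_bytes_per_second), ...]
    Returns the predicted completion time for each configuration.
    """
    n = len(configs)
    clock = {r: [0.0] * n for r in traces}  # one clock entry per configuration
    pos = {r: 0 for r in traces}            # index of the next event per rank
    inflight = {}                           # (src, dst) -> queue of arrival vectors
    progressed = True
    while progressed:
        progressed = False
        for r, events in traces.items():
            while pos[r] < len(events):
                ev = events[pos[r]]
                if ev[0] == "compute":
                    # Local work advances the clock equally in every configuration.
                    clock[r] = [t + ev[1] for t in clock[r]]
                elif ev[0] == "send":
                    _, dst, nbytes = ev
                    # Assumed cost model: latency + nbytes / bandwidth.
                    arrival = [t + lat + nbytes / bw
                               for t, (lat, bw) in zip(clock[r], configs)]
                    inflight.setdefault((r, dst), deque()).append(arrival)
                else:  # "recv"
                    queue = inflight.get((ev[1], r))
                    if not queue:
                        break  # matching send not replayed yet; try another rank
                    arrival = queue.popleft()
                    # Lamport-style rule: take the later of local time and arrival.
                    clock[r] = [max(t, a) for t, a in zip(clock[r], arrival)]
                pos[r] += 1
                progressed = True
    if any(pos[r] < len(traces[r]) for r in traces):
        raise ValueError("trace contains a receive with no matching send")
    return [max(clock[r][i] for r in traces) for i in range(n)]

traces = {
    0: [("compute", 1.0), ("send", 1, 1000)],
    1: [("recv", 0), ("compute", 2.0)],
}
configs = [(0.0, 1000.0), (1.0, 1000.0), (0.0, 500.0)]
print(replay(traces, configs))  # [4.0, 5.0, 5.0]
```

Comparing the predicted times across the configurations gives the kind of trend the abstract refers to: in this toy trace, adding one second of latency or halving the bandwidth each delay completion by one second.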