TIME | PRESENTATION | SPEAKER | LOCATION
10:30AM - 11:00AM | Massively Parallel Phase-Field Simulations for Ternary Eutectic Directional Solidification | Martin Bauer, Johannes Hötzer, Marcus Jainta, Philipp Steinmetz, Marco Berghoff, Florian Schornbaum, Christian Godenschwager, Harald Köstler, Britta Nestler, Ulrich Rüde | 18AB
10:30AM - 11:00AM | Runtime-Driven Shared Last-Level Cache Management for Task-Parallel Programs | Abhisek Pan, Vijay S. Pai | 19AB
10:30AM - 11:00AM | BD-CATS: Big Data Clustering at Trillion Particle Scale | Md. Mostofa Ali Patwary, Suren Byna, Nadathur Rajagopalan Satish, Narayanan Sundaram, Zarija Lukic, Vadim Roytershteyn, Michael J. Anderson, Yushu Yao, Mr Prabhat, Pradeep Dubey | 18CD
11:00AM - 11:30AM | Parallel Implementation and Performance Optimization of the Configuration-Interaction Method | Hongzhang Shan, Samuel Williams, Calvin Johnson, Kenneth McElvain, W. Erich Ormand | 18AB
11:00AM - 11:30AM | Frugal ECC: Efficient and Versatile Memory Error Protection through Fine-Grained Compression | Jungrae Kim, Michael Sullivan, Seong-Lyong Gong, Mattan Erez | 19AB
11:00AM - 11:30AM | Performance Optimization for the K-Nearest Neighbors Kernel on x86 Architectures | Chenhan D. Yu, Jianyu Huang, Woody Austin, Bo Xiao, George Biros | 18CD
11:30AM - 12:00PM | Efficient Implementation of Quantum Materials Simulations on Distributed CPU-GPU Systems | Raffaele Solcà, Anton Kozhevnikov, Azzam Haidar, Stanimire Tomov, Jack Dongarra, Thomas C. Schulthess | 18AB
11:30AM - 12:00PM | Automatic Sharing Classification and Timely Push for Cache-Coherent Systems | Malek Musleh, Vijay S. Pai | 19AB
1:30PM - 2:00PM | HipMer: An Extreme-Scale De Novo Genome Assembler | Evangelos Georganas, Aydin Buluc, Jarrod Chapman, Steven Hofmeyr, Chaitanya Aluru, Rob Egan, Leonid Oliker, Daniel Rokhsar, Katherine Yelick | 18AB
1:30PM - 2:00PM | Adaptive and Transparent Cache Bypassing for GPUs | Ang Li, Gert-Jan van den Braak, Akash Kumar, Henk Corporaal | 19AB
1:30PM - 2:00PM | AnalyzeThis: An Analysis Workflow-Aware Storage System | Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai, Devesh Tiwari, Ali Anwar, Ali R. Butt, Lavanya Ramakrishnan | 18CD
2:00PM - 2:30PM | A Parallel Connectivity Algorithm for de Bruijn Graphs in Metagenomic Applications | Patrick Flick, Chirag Jain, Tony Pan, Srinivas Aluru | 18AB
2:00PM - 2:30PM | ELF: Maximizing Memory-Level Parallelism for GPUs with Coordinated Warp and Fetch Scheduling | Jason Jong Kyu Park, Yongjun Park, Scott Mahlke | 19AB
2:00PM - 2:30PM | Mantle: A Programmable Metadata Load Balancer for the Ceph File System | Michael A. Sevilla, Noah Watkins, Carlos Maltzahn, Ike Nassi, Scott A. Brandt, Sage A. Weil, Greg Farnum, Sam Fineberg | 18CD
2:30PM - 3:00PM | Parallel Distributed Memory Construction of Suffix and Longest Common Prefix Arrays | Patrick Flick, Srinivas Aluru | 18AB
2:30PM - 3:00PM | Memory Access Patterns: the Missing Piece of the Multi-GPU Puzzle | Tal Ben-Nun, Ely Levy, Amnon Barak, Eri Rubin | 19AB
2:30PM - 3:00PM | HydraDB: A Resilient RDMA-Driven Key-Value Middleware for In-Memory Cluster Computing | Yandong Wang, Li Zhang, Jian Tan, Min Li, Yuqing Gao, Xavier Guerin, Xiaoqiao Meng, Shicong Meng | 18CD
3:30PM - 4:00PM | Full Correlation Matrix Analysis of fMRI Data on Intel Xeon Phi Coprocessors | Yida Wang, Michael J. Anderson, Jonathan D. Cohen, Alexander Heinecke, Kai Li, Nadathur Satish, Narayanan Sundaram, Nicholas B. Turk-Browne, Theodore L. Willke | 18AB
3:30PM - 4:00PM | Exploring Network Optimizations for Large-Scale Graph Analytics | Xinyu Que, Fabio Checconi, Fabrizio Petrini, Xing Liu, Daniele Buono | 19AB
3:30PM - 4:00PM | A Case for Application-Oblivious Energy-Efficient MPI Runtime | Akshay Venkatesh, Abhinav Vishnu, Khaled Hamidouche, Nathan Tallent, Dhabaleswar Panda, Darren Kerbyson, Adolfy Hoisie | 18CD
4:00PM - 4:30PM | A Kernel-Independent FMM in General Dimensions | William B. March, Bo Xiao, Sameer Tharakan, Chenhan D. Yu, George Biros | 18AB
4:00PM - 4:30PM | GossipMap: A Distributed Community Detection Algorithm for Billion-Edge Directed Graphs | Seung-Hee Bae, Bill Howe | 19AB
4:00PM - 4:30PM | Improving Concurrency and Asynchrony in Multithreaded MPI Applications using Software Offloading | Karthikeyan Vaidyanathan, Dhiraj D. Kalamkar, Kiran Pamnany, Jeff R. Hammond, Pavan Balaji, Dipankar Das, Jongsoo Park, Balint Joo | 18CD
4:30PM - 5:00PM | Engineering Inhibitory Proteins with InSiPS: The In-Silico Protein Synthesizer | Andrew Schoenrock, Daniel Burnside, Houman Moteshareie, Alex Wong, Ashkan Golshani, Frank Dehne, James R. Green | 18AB
4:30PM - 5:00PM | GraphReduce: Processing Large-Scale Graphs on Accelerator-Based Systems | Dipanjan Sengupta, Shuaiwen Leon Song, Kapil Agarwal, Karsten Schwan | 19AB
4:30PM - 5:00PM | Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems | Thomas Herault, Aurelien Bouteiller, George Bosilca, Marc Gamell, Keita Teranishi, Manish Parashar, Jack Dongarra | 18CD

TIME | PRESENTATION | SPEAKER | LOCATION
10:30AM - 11:00AM | Monetary Cost Optimizations for MPI-Based HPC Applications on Amazon Clouds: Checkpoints and Replicated Execution | Yifan Gong, Bingsheng He, Amelie Chi Zhou | 19AB
10:30AM - 11:00AM | Network Endpoint Congestion Control for Fine-Grained Communication | Nan Jiang, Larry Dennison, William J. Dally | 18CD
10:30AM - 11:00AM | Reliability Lessons Learned From GPU Experience With The Titan Supercomputer at Oak Ridge Leadership Computing Facility | Devesh Tiwari, Saurabh Gupta, George Gallarno, Jim Rogers, Don Maxwell | 18AB
11:00AM - 11:30AM | Elastic Job Bundling: An Adaptive Resource Request Strategy for Large-Scale Parallel Applications | Feng Liu, Jon B. Weissman | 19AB
11:00AM - 11:30AM | Cost-Effective Diameter-Two Topologies: Analysis and Evaluation | Georgios Kathareios, Cyriel Minkenberg, Bogdan Prisacari, German Rodriguez, Torsten Hoefler | 18CD
11:00AM - 11:30AM | Big Omics Data Experience | Patricia Kovatch, Anthony Costa, Zachary Giles, Eugene Fluder, Hyung Min Cho, Svetlana Mazurkova | 18AB
11:30AM - 12:00PM | Fault Tolerant MapReduce-MPI for HPC Clusters | Yanfei Guo, Wesley Bland, Pavan Balaji, Xiaobo Zhou | 19AB
11:30AM - 12:00PM | Profile-Based Power Shifting in Interconnection Networks with On/Off Links | Shinobu Miwa, Hiroshi Nakamura | 18CD
11:30AM - 12:00PM | The Spack Package Manager: Bringing Order to HPC Software Chaos | Todd Gamblin, Matthew LeGendre, Michael R. Collette, Gregory L. Lee, Adam Moody, Bronis R. de Supinski, Scott Futral | 18AB
1:30PM - 2:00PM | STELLA: A Domain-Specific Tool for Structured Grid Methods in Weather and Climate Models | Tobias Gysi, Carlos Osuna, Oliver Fuhrer, Mauro Bianco, Thomas C. Schulthess | 18CD
1:30PM - 2:00PM | Energy-Aware Data Transfer Algorithms | Ismail Alan, Engin Arslan, Tevfik Kosar | 19AB
1:30PM - 2:00PM | ScaAnalyzer: A Tool to Identify Memory Scalability Bottlenecks in Parallel Programs | Xu Liu, Bo Wu | 18AB
2:00PM - 2:30PM | Improving the Scalability of the Ocean Barotropic Solver in the Community Earth System Model | Yong Hu, Xiaomeng Huang, Allison H. Baker, Yu-heng Tseng, Frank O. Bryan, John M. Dennis, Guangwen Yang | 18CD
2:00PM - 2:30PM | IOrchestra: Supporting High-Performance Data-Intensive Applications in the Cloud via Collaborative Virtualization | Ron C. Chiang, H. Howie Huang, Timothy Wood, Changbin Liu, Oliver Spatscheck | 19AB
2:00PM - 2:30PM | C^2-Bound: A Capacity and Concurrency Driven Analytical Model for Manycore Design | Yu-Hang Liu, Xian-He Sun | 18AB
2:30PM - 3:00PM | Particle Tracking in Open Simulation Laboratories | Kalin Kanov, Randal Burns | 18CD
2:30PM - 3:00PM | An Elegant Sufficiency: Load-Aware Differentiated Scheduling of Data Transfers | Rajkumar Kettimuthu, Gayane Vardoyan, Gagan Agrawal, P. Sadayappan, Ian Foster | 19AB
2:30PM - 3:00PM | Recovering Logical Structure from Charm++ Event Traces | Katherine E. Isaacs, Abhinav Bhatele, Jonathan Lifflander, David Boehme, Todd Gamblin, Martin Schulz, Bernd Hamann, Peer-Timo Bremer | 18AB
3:30PM - 4:00PM | Large-Scale Compute-Intensive Analysis via a Combined In-Situ and Co-Scheduling Workflow Approach | Christopher Sewell, Katrin Heitmann, Hal Finkel, George Zagaris, Suzanne T. Parete-Koon, Patricia K. Fasel, Adrian Pope, Nicholas Frontiere, Li-ta Lo, Bronson Messer, Salman Habib, James Ahrens | 18CD
3:30PM - 4:00PM | Exploiting Asynchrony from Exact Forward Recovery for DUE in Iterative Solvers | Luc Jaulmes, Marc Casas, Miquel Moretó, Eduard Ayguadé, Jesús Labarta, Mateo Valero | 18AB
3:30PM - 4:00PM | Data Partitioning Strategies for Graph Workloads on Heterogeneous Clusters | Michael LeBeane, Shuang Song, Reena Panda, Jee Ho Ryoo, Lizy K. John | 19AB
4:00PM - 4:30PM | Smart: A MapReduce-Like Framework for In-Situ Scientific Analytics | Yi Wang, Gagan Agrawal, Tekin Bicer, Wei Jiang | 18CD
4:00PM - 4:30PM | High-Performance Algebraic Multigrid Solver Optimized for Multi-Core Based Distributed Parallel Systems | Jongsoo Park, Mikhail Smelyanskiy, Ulrike Meier Yang, Dheevatsa Mudigere, Pradeep Dubey | 18AB
4:00PM - 4:30PM | Scaling Iterative Graph Computations with GraphMap | Kisung Lee, Ling Liu, Karsten Schwan, Calton Pu, Qi Zhang, Yang Zhou, Emre Yigitoglu, Pingpeng Yuan | 19AB
4:30PM - 5:00PM | Optimal Scheduling of In-Situ Analysis for Large-Scale Scientific Simulations | Preeti Malakar, Venkatram Vishwanath, Todd Munson, Christopher Knight, Mark Hereld, Sven Leyffer, Michael Papka | 18CD
4:30PM - 5:00PM | STS-k: A Multilevel Sparse Triangular Solution Scheme for NUMA Multicores | Humayun Kabir, Joshua D. Booth, Guillaume Aupy, Anne Benoit, Yves Robert, Padma Raghavan | 18AB
4:30PM - 5:00PM | PGX.D: A Fast Distributed Graph Processing System | Sungpack Hong, Siegfried Depner, Thomas Manhardt, Jan Van Der Lugt, Merijn Varensteen, Hassan Chafi | 19AB

TIME | PRESENTATION | SPEAKER | LOCATION
10:30AM - 11:00AM | CIVL: The Concurrency Intermediate Verification Language | Stephen F. Siegel, Manchun Zheng, Ziqing Luo, Timothy K. Zirkel, Andre V. Marianiello, John G. Edenhofner, Matthew B. Dwyer, Michael S. Rogers | 18AB
10:30AM - 11:00AM | Improving Backfilling by using Machine Learning to Predict Running Times | Eric Gaussier, David Glesser, Valentin Reis, Denis Trystram | 19AB
10:30AM - 11:00AM | Randomized Algorithm to Update Partial Singular Value Decomposition on a Hybrid CPU/GPU Cluster | Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack Dongarra | 18CD
11:00AM - 11:30AM | Clock Delta Compression for Scalable Order-Replay of Non-Deterministic Parallel Applications | Kento Sato, Dong H. Ahn, Ignacio Laguna, Gregory L. Lee, Martin Schulz | 18AB
11:00AM - 11:30AM | Adaptive Data Placement for Staging-Based Coupled Scientific Workflows | Qian Sun, Tong Jin, Melissa Romanus, Hoang Bui, Fan Zhang, Hongfeng Yu, Hemanth Kolla, Scott Klasky, Jacqueline Chen, Manish Parashar | 19AB
11:00AM - 11:30AM | Performance of Random Sampling for Computing Low-Rank Approximations of a Dense Matrix on GPUs | Theo Mary, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack Dongarra | 18CD
11:30AM - 12:00PM | Relative Debugging for a Highly Parallel Hybrid Computer System | Luiz DeRose, Andrew Gontarek, Aaron Vose, Robert Moench, David Abramson, Minh Dinh, Chao Jin | 18AB
11:30AM - 12:00PM | Multi-Objective Job Placement in Clusters | Sergey Blagodurov, Alexandra Fedorova, Evgeny Vinnik, Tyler Dwyer, Fabien Hermenier | 19AB
1:30PM - 2:00PM | A Work-Efficient Algorithm for Parallel Unordered Depth-First Search | Umut Acar, Arthur Chargueraud, Mike Rainey | 18CD
1:30PM - 2:00PM | Local Recovery and Failure Masking for Stencil-Based Applications at Extreme Scales | Marc Gamell, Keita Teranishi, Michael A. Heroux, Jackson Mayo, Hemanth Kolla, Jacqueline Chen, Manish Parashar | 19AB
1:30PM - 2:00PM | Scientific Benchmarking of Parallel Computing Systems | Torsten Hoefler, Roberto Belli | 18AB
2:00PM - 2:30PM | Enterprise: Breadth-First Graph Traversal on GPUs | Hang Liu, H. Howie Huang | 18CD
2:00PM - 2:30PM | VOCL-FT: Introducing Techniques for Efficient Soft Error Coprocessor Recovery | Antonio J. Peña, Wesley Bland, Pavan Balaji | 19AB
2:00PM - 2:30PM | Node Variability in Large-Scale Power Measurement: Perspectives from the Green500, Top500 and EEHPCWG | Thomas R. W. Scogland, Jonathan Azose, David Rohr, Suzanne Rivoire, Natalie Bates, Daniel Hackenberg, Torsten Wilde, James H. Rogers | 18AB
2:30PM - 3:00PM | GraphBIG: Understanding Graph Computing in the Context of Industrial Solutions | Lifeng Nai, Yinglong Xia, Ilie G. Tanase, Hyesoon Kim, Ching-Yung Lin | 18CD
2:30PM - 3:00PM | Understanding the Propagation of Transient Errors in HPC Applications | Rizwan Ashraf, Roberto Gioiosa, Gokcen Kestor, Ronald DeMara, Chen-Yong Cher, Pradip Bose | 19AB
2:30PM - 3:00PM | A Practical Approach to Reconciling Availability, Performance, and Capacity in Provisioning Extreme-Scale Storage Systems | Lipeng Wan, Feiyi Wang, Sarp Oral, Devesh Tiwari, Sudharshan S. Vazhkudai, Qing Cao | 18AB
3:30PM - 4:00PM | Analyzing and Mitigating the Impact of Manufacturing Variability in Power-Constrained Supercomputing | Yuichi Inadomi, Tapasya Patki, Koji Inoue, Mutsumi Aoyagi, Barry Rountree, Martin Schulz, David Lowenthal, Yasutaka Wada, Keiichiro Fukazawa, Masatsugu Ueda, Masaaki Kondo, Ikuo Miyoshi | 19AB
3:30PM - 4:00PM | Regent: A High-Productivity Programming Language for HPC with Logical Regions | Elliott Slaughter, Wonchan Lee, Sean Treichler, Michael Bauer, Alex Aiken | 18AB
3:30PM - 4:00PM | An Input-Adaptive and In-Place Approach to Dense Tensor-Times-Matrix Multiply | Jiajia Li, Casey Battaglino, Ioakeim Perros, Jimeng Sun, Richard Vuduc | 18CD
4:00PM - 4:30PM | Finding the Limits of Power-Constrained Application Performance | Peter E. Bailey, Aniruddha Marathe, David K. Lowenthal, Barry Rountree, Martin Schulz | 19AB
4:00PM - 4:30PM | Bridging OpenCL and CUDA: A Comparative Analysis and Translation | Junghyun Kim, Thanh Tuan Dao, Jaehoon Jung, Jinyoung Joo, Jaejin Lee | 18AB
4:00PM - 4:30PM | Scalable Sparse Tensor Decompositions in Distributed Memory Systems | Oguz Kaya, Bora Ucar | 18CD
4:30PM - 5:00PM | Dynamic Power Sharing for Higher Job Throughput | Daniel A. Ellsworth, Allen D. Malony, Barry Rountree, Martin Schulz | 19AB
4:30PM - 5:00PM | CilkSpec: Optimistic Concurrency for Cilk | Shaizeen Aga, Sriram Krishnamoorthy, Satish Narayanasamy | 18AB