CIS5930 Advanced Topics in Parallel and Distributed Systems

J. Kim, et al., "Technology-Driven, Highly Scalable Dragonfly Topology", ACM ISCA 2008.
N. Jiang, et al., "Indirect Adaptive Routing on Large Scale Interconnect Networks," ACM ISCA 2009.

Lecture 12: Recent topology and routing proposals for extreme scale systems (Atiqul Mollah and Gaurish Nayak)

A. Singla, et al., "Jellyfish: Networking Data Centers Randomly," USENIX NSDI 2012.
X. Yuan, et al., "A new routing scheme for Jellyfish and its performance with HPC workloads," ACM SC'13, 2013.
A. Singla, et al., "High Throughput Data Center Topology Design," USENIX NSDI 2014.
Reading: Tianhe-1A Interconnect and Message-Passing Services.

Homework 2: Comments on the Jellyfish topology and routing (summary, advantages, and drawbacks; 1 page max). Due Feb 25.

Lecture 13

Lecture 14: Ethernet development (Jordan Nowlin): 10-Gigabit Ethernet (10GE), 40GE, 100GE, 400GE, RDMA over Converged Ethernet

P. Patarasuk, A. Faraj, and X. Yuan, "Pipelined Broadcast on Ethernet Switched Clusters," Journal of Parallel and Distributed Computing, 68(6):809-824, June 2008.

M. Small, Z. Gu, and X. Yuan, "Near-optimal Rendezvous Protocols for RDMA-enabled Clusters," International Conference on Parallel Processing (ICPP), Sept. 2010.
M. Small and X. Yuan, "A New Design of RDMA-based Small Message Channels for InfiniBand Clusters," IEEE International Conference on Cluster Computing (CLUSTER), Sept. 23-27, 2013.

X. Yuan, S. Mahapatra, S. Pakin, and M. Lang, "LFTI: A New Performance Metric for Assessing Interconnect Designs for Extreme-Scale HPC Systems," the 28th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Phoenix, Arizona, May 19-23, 2014.

B. Hendrickson and J. Berry, "Graph analysis with high-performance computing," Computing in Science & Engineering, 10(2), March 2008.
Richard C. Murphy, Kyle B. Wheeler, James A. Ang, Brian W. Barrett, "Introducing the Graph 500," Cray User Group 2010.
Koji Ueno and Toyotaro Suzumura, "Highly Scalable Graph Search for the Graph500 Benchmark," HPDC 2012 (The 21st International ACM Symposium on High-Performance Parallel and Distributed Computing), June 2012, Delft, Netherlands.
http://www.graph500.org

Lecture 22 (04/01, Carlos Sanchez and Soheila): HPC and Cloud Computing, presentation 1, presentation 2

A. Marathe, D. K. Lowenthal, B. Rountree, M. Schulz, B. de Supinski, and X. Yuan, "A Comparative Study of High-Performance Computing on the Cloud," ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), June 2013.
A. Gupta et al., "The Who, What, Why, and How of High Performance Computing Applications in the Cloud," HP Labs, Tech. Rep., 2013. Available at http://www.hpl.hp.com/techreports/2013/HPL-2013-49.pdf.

Lecture 23 (04/03, Shafayat and Zach): Architecture-aware communication optimizations, presentation 1, presentation 2

Yufei Ren, Tan Li, Dantong Yu, Shudong Jin, Thomas G. Robertazzi, "Design and performance evaluation of NUMA-aware RDMA-based end-to-end data transfer systems." SC 2013
Shigang Li, Torsten Hoefler, Marc Snir, "NUMA-aware shared-memory collective communication for MPI." HPDC 2013.

Lecture 24 (04/08, Caitlin): Power, Resilience, and exascale computing 1

"Technical Challenges of Exascale Computing", Available at: http://institutes.lanl.gov/resilience/docs/JSR-12-310-Challenges_of_exascaleFINAL.pdf.

Lecture 25 (04/10, Ryan): Power, Resilience, and exascale computing 2

M. Snir, et al., "Addressing Failures in Exascale Computing", Available at http://www.mcs.anl.gov/uploads/cels/papers/ANL:MCS-TM-322.pdf.

Lecture 26 (04/15, Tong and Abdullah): Power, Resilience, and exascale computing 3

Nikola Rajovic, Paul M. Carpenter, Isaac Gelado, Nikola Puzovic, Alex Ramirez, Mateo Valero, "Supercomputing with Commodity CPUs: Are Mobile SoCs Ready for HPC?" SC 2013.
Osman Sarood, Esteban Meneses, and L. V. Kale, "A Cool Way of Improving the Reliability of HPC Machines," SC 2013.

Lecture 27 (04/17) Term project presentation

Peyman/Soheila: Survey of topology mapping and process allocation on large scale interconnect networks.
Jordan Nowlin: Survey of security in Software Defined Networks
Ryan Baird/Carlos Sanchez: A New Broadcast Algorithm

Lecture 28 (04/22) Term project presentation

Caitlin Carnahan: MPI implementation of coalescing and augmentation algorithm for clustering labelled profiles.
Nekmdirim Dockery: Survey of issues of graph databases in HPC.
Gaurish Nayak: Parallel Sparse Matrix Multiplication

Lecture 29 (04/24) Term project presentation

Zhou Tong/Shafayat Rahman: Survey of HPC using Mobile CPU
Zach Yannes: MPI implementation of General Number Field Sieve.
Atiqul Mollah/Abdullah Raiaan: Computing multi-commodity flow rates with max-min fairness on fat-tree topologies.

Final exam (take home, open everything, no discussion), due April 30, 11:59am. Place a hard copy in my office by the due time.