Lecture 1 (01/05): Syllabus, Introduction
Lecture 2 (01/07): PDS basics
Lecture 3 (01/10): PDS basics (II)
Lecture 4 (01/12): PDS basics (III), Zoom recording
Homework 1, naive_mm.c. Due January 17, 11:59pm.
Lecture 5 (01/14): CPU core architecture and single thread performance, Zoom recording
Lecture 6 (01/19): Affine loops and dependence analysis, Zoom recording
Lecture 7 (01/21, 01/24): Loop optimizations, Zoom recording
Lecture 8 (01/26, 01/28): Deep neural networks from scratch
Programming assignment 1: Deep Neural Network for Hand-Written Digit Recognition, a sample training output, Due: 02/07 - Part 1, and 02/14 - Part 2.
Lecture 9 (01/31, 02/02): x86 SIMD extensions, Zoom Recording (01/31), Zoom Recording (02/02)
Lecture 10 (02/04): Shared Memory Architectures, Zoom recording
Programming assignment 2: Improving Deep Neural Network Code with x86 Vector Extensions, Due: 02/21.
Lecture 11 (02/07, 02/09, 02/11): Introduction to OpenMP, Zoom Recording (02/11)
Lecture 12 (02/11, 02/14): OpenMP for NUMA Architectures, Zoom Recording (02/14)
Lecture 13 (02/16): Scalable Computers
Lecture 14 (02/18): Interconnection Networks
Programming assignment 3: Parallelizing Deep Neural Network Code with OpenMP, Due: 03/07.
Homework 2: Read the following paper and write a critique of it (you can use the template). Due March 2.
J. Kim, W. J. Dally, S. Scott and D. Abts, "Technology-Driven, Highly-Scalable Dragonfly Topology," 2008 International Symposium on Computer Architecture, 2008, pp. 77-88, doi: 10.1109/ISCA.2008.19.
Lecture 15 (02/21, 02/23, 02/25, 02/28): Interconnect Topology, Zoom recording (02/21), Zoom Recording (02/28)
Lecture 16 (02/28, 03/02, 03/04): Routing, Switching, and Flow Control, Zoom Recording (03/02), Zoom Recording (03/04)
Lecture 17 (03/07): State of the art Interconnect Design: Slingshot and its analysis, Slides (Saptarshi Bhowmik and Rubayet Rahman Rongon)
Daniele De Sensi, Salvatore Di Girolamo, Kim H. McMahon, Duncan Roweth, and Torsten Hoefler. 2020. An in-depth analysis of the slingshot interconnect. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '20). IEEE Press, Article 35.
Lecture 18 (03/09): State of the art Interconnect Design: 3D-Hyper-FleX-LION, Slides (Ram Chaulagain and Tusher Chandra Mondol), Zoom recording.
Gengchen Liu, Roberto Proietti, Marjan Fariborz, Pouya Fotouhi, Xian Xiao, and S. J. Ben Yoo. 2020. Architecture and performance studies of 3D-Hyper-FleX-LION for reconfigurable all-to-all HPC networks. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '20). IEEE Press, Article 26, 1-16.
Lecture 19 (03/11): Programming Distributed Memory Systems: Message Passing Interface 1, Zoom Recording
Homework 3, Due: March 25.
Lecture 20 (03/21): Programming Distributed Memory Systems: Message Passing Interface 2, Zoom Recording
Lecture 21 (03/23): Programming Distributed Memory Systems: Message Passing Interface 3 - Domain decomposition
Lecture 22 (03/25, 03/28): MPI implementation, Zoom Recording (03/25), Zoom Recording (03/28)
Programming assignment 4: Parallelizing Deep Neural Network Code with MPI, Due: 04/08.
Lecture 23 (03/30): State of the art - Security issues in Parallel and Distributed Computing - Side channel attacks and defenses (Kazi), Slides, Zoom Recording
Lecture 24 (04/04): State of the art - topology aware job scheduling (Patrick and Zach), Slides, Zoom Recording
Staci A. Smith and David K. Lowenthal, "Jigsaw: A High-Utilization, Interference-Free Job Scheduler for Fat-Tree Clusters," ACM HPDC 2021.
Lecture 25 (04/01): GPU overview, Zoom Recording
Lecture 26 (04/06): CUDA programming I, Zoom Recording
Lecture 27 (04/08): CUDA programming II, Zoom Recording
Lecture 28 (04/11): CUDA programming III, Zoom Recording
Lecture 29 (04/13): State of the art - CUDA Unified Memory (Jack and Luiz), Slides, vecadd_um.cu, overload_um.cu, Zoom Recording
Programming assignment 5: GPU Deep Neural Network Code (optional), Due: 04/22.
Some information about project presentation and report
Lecture 30 (04/15): State of the art - MPI-3 Neighborhood Collective (Mohsen), Zoom Recording
S. Mahdieh Ghazimirsaeed, Qinghua Zhou, Amit Ruhela, and Mohammadreza Bayatpour. 2020. A hierarchical and load-aware design for large message neighborhood collectives. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '20). IEEE Press, Article 34, 1-13.
Term project presentation (Monday, April 18): Zoom Recording
Term project presentation (Wednesday, April 20):
Term project presentation (Friday, April 22):
The final exam will be a take-home, open-everything exam (no discussion), 7:30am-11:59am, Monday, April 25.