CS 420 - Parallel Programming: Science and Engineering
Fundamental issues in design and development of parallel programs for various types of parallel computers. Various programming models according to both machine type and application area. Cost models, debugging, and performance evaluation of parallel programs with actual application examples. Course Information: Same as CSE 402 and ECE 492. 3 undergraduate hours. 3 or 4 graduate hours. Prerequisite: CS 225.
Online lecture notes: https://piazza.com/illinois/fall2018/cs420cse402ece492/resources
- Prepare non-CS science and engineering students for the use of parallel computing in support of their work. (1,2,6)
- Acquire basic knowledge of CPU architecture: execution pipeline, dependencies, caches; learn to tune performance by enhancing locality and leveraging compiler optimizations. (2,6)
- Understand vector instructions and learn to use vectorization. (2,6)
- Acquire basic knowledge of multicore architectures: cache coherence, true and false sharing, and their relevance to parallel performance tuning. (2,6)
- Learn to program using multithreading, parallel loops, and multitasking using a language such as OpenMP. Learn to avoid concurrency bugs. (2,6)
- Learn to program using message passing with a library such as MPI. (2,6)
- Understand simple parallel algorithms and their complexity. (1,6)
- Learn to program accelerators using a language such as OpenMP. (2,6)
- Acquire basic understanding of parallel I/O and of frameworks for data analytics, such as map-reduce. (6)
- Team project (1,2,3,5,6)
- Introduction: Course introduction. Importance of parallel computing with the end of Moore’s Law
- Basic CPU architecture and performance bottlenecks. Tuning for locality and leveraging optimizing compilers.
- Vector instructions and compiler vectorization.
- Basic multicore architecture and performance bottlenecks. False sharing.
- OpenMP: Multithreading model; parallel sections, parallel loops, tasks and task dependencies. Races and atomicity. Deadlock avoidance.
- Basic parallel algorithms: matmult, stencils, sparseMV
- Basic cluster architecture and performance bottlenecks
- MPI: Point-to-point, one-sided, collectives.
- Basic distributed memory algorithms: matmult, stencils, sparseMV, sorting; data distribution.
- Parallel programming patterns: divide-and-conquer, pipeline
- Basic GPU architecture; programming GPUs with OpenMP
- Basics of parallel I/O
- Basics of data analysis using map-reduce
- Mid-term and final exams
Assessment and Revisions
|Revisions in last 6 years|Approximately when revision was done|Reason for revision|Data or documentation available?|
|---|---|---|---|
|Updated list of topics covered|Fall 2010|List of topics had evolved through the years|New list of topics with hours.|
Required, Elective, or Selected Elective
2/10/2019 by Marc Snir