CS 9535 and CS 4402 - Distributed and Parallel Systems

University of Western Ontario
Computer Science Department
Date: January 3, 2011

Using parallel and distributed systems (multi-processors and computer networks) efficiently is now an important task for computer scientists.

This course studies the fundamental aspects of parallel systems and aims to provide an integrated view of the various facets of software development on such systems: hardware architectures, programming languages and models, software development tools, software engineering concepts and design patterns, performance modeling and analysis, experimentation and measurement, and applications to scientific computing.

Course topics may include but are not limited to: multi-core, SMP, clusters, GPU computing, scheduling, scalability, parallel and distributed data-structures, threads, message passing, MPI, distributed and shared memory, hierarchical memory, data parallel languages, and applications of parallel and distributed computing.

Part of the material (multithreaded programming, hierarchical memory) comes from this course: CS 4435 and CS 9624 - High Performance Computing with a Focus on Hardware Acceleration Technologies.
Other parts come from last year's edition of CS 9535 and CS 4402.

Note that for the 2010-2011 academic year there will be another course dedicated to distributed systems. More on this later.

Follow this link for various resources (software tools and tutorials, hardware documentation, conferences, other HPC course web sites, etc.) regarding this course and HPC in general.

Prerequisites for undergraduate students

CS 3305. Students must be fluent in C and C++; they must also be familiar with UNIX software tools (shell scripts, makefiles, debuggers).


The course outline (outline.html) presents the contents of the course, its assignments, quizzes and projects.

Lecture notes:

  • An Introduction to Software Performance Engineering. slides and handouts.
  • An Introduction to Multicore Programming. slides and handouts.
  • Multithreaded Parallelism and Performance Measures. slides and handouts.
  • Analysis of Multithreaded Algorithms. slides and handouts.
  • Issues in Parallelism (by Matteo Frigo). slides
  • Cache memories: complexity analysis and practical issues. slides and handouts.
  • Synchronizing without locks. slides and handouts.
  • Many-core Computing with CUDA. slides and handouts.
  • Optimizing CUDA code. slides and handouts.

Problem sets:

  • Problem set 1
  • Problem set 2