CS6xx - High Performance Computing

University of Western Ontario
Computer Science Department

Date: September 11, 2009

Current hardware improvements focus on increasing the number of computations that can be performed in parallel rather than on increasing clock speed alone. This change has brought multi-processor workstations to the desktop, expanding interest in parallel algorithms and software capable of exploiting these computing resources. At the same time, these new hardware acceleration technologies stress the need for a deeper understanding of performance issues in software design.

The aim of this course is to introduce you to the design and analysis of algorithms and software programs capable of taking advantage of these new computing resources. The following concepts will guide our quest for high performance: parallelism, scalability, locality, cache complexity, synchronization, scheduling and load balancing.

By the end of the course, you are expected to have an in-depth understanding of the following subjects:

  • Multi-threaded parallelism (theoretical models such as work and span, the Cilk/Cilk++ concurrency platform and its development tools, scheduling by work stealing)
  • Cache complexity (memory hierarchy, cache complexity, cache-oblivious algorithms)
  • Code optimization for parallelism and locality (including performance evaluation with tools such as Cilkscreen and VTune)

A quarter of the course will give an overview of other hot topics in high performance computing, including the following:

  • Hardware acceleration technologies (GPGPU, FPGA)
  • Auto-tuning techniques (as in FFTW or SPIRAL)
  • Other concurrency platforms (TBB, OpenMP, MPI)

moreno 2009-09-11