High-Performance Matrix Computations --- 2012-13
- Summer semester 2013.
- CAMPUS #: 13ss-24886.
- Lectures begin: Tuesday, April 9, 5pm.
- Lectures & Exercises: Tuesday, Thursday: 17.00-18.30. Rogowski 115 - AICES seminar room (Schinkelstrasse 2)
- Office hours: Tuesdays, 11am-1pm. AICES R432 (Rogowski Building - Schinkelstrasse 2)
- April, Tuesday 9; introduction [Intro]
- April, Thursday 11; computer architecture [lecture 1]
- April, Tuesday 16; performance [lecture 2] [timer]
- April, Thursday 18; ger vs. gemm [lecture 3] [Mathematica notebook] [Assignment #1]
- April, Tuesday 23; BLAS, storage, assignment review. [column vs. row] [Assignment #1 again]
  Deadline: April, Monday 29th, midnight. Target: RZ's cluster, Harpertown nodes. To access a Harpertown node, log in to cluster-linux-xeon.rz.rwth-aachen.de
  To submit jobs (not necessary), add #BSUB -R "select[model==Harpertown]" to your job script.
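As a rough sketch, a job script that requests a Harpertown node might look like the following. Only the -R line comes from the instructions above; the job name, output file name, time limit, and workload are illustrative assumptions, not course requirements.

```shell
#!/usr/bin/env bash
# Hypothetical LSF job script. Only the -R line is taken from the course page;
# every other setting below is an assumption made for illustration.

# Job name (assumed).
#BSUB -J hpmc_a1
# Output file; LSF expands %J to the job ID.
#BSUB -o hpmc_a1.%J.out
# Wall-clock limit of 10 minutes (assumed).
#BSUB -W 0:10
# Run only on Harpertown nodes.
#BSUB -R "select[model==Harpertown]"

# Placeholder workload: report the node name, then run your own executable here.
echo "running on $(uname -n)"
```

With LSF, such a script is typically submitted with `bsub < job.sh`, where job.sh is whatever name you saved it under.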
April, Thursday 25;
- April, Tuesday 30; GEMM, blocked vs. unblocked algorithms. PME. [lecture 4]
- May, Thursday 2; blocked vs. unblocked, part 2. Cholesky factorization.
- May, Tuesday 7; what's behind GEMM. [99% of peak] → How To Optimize Gemm
- May, Tuesday 14; Locality, modularity. Matrix factorizations.
- May, Thursday 16; GPU part 1. NVIDIA Fermi architecture, CUDA: Execution Model, Programming Model. [material] [CUDA cheat sheet]
- May, Tuesday 28; GPU part 2. CUDA: Global Memory, Shared Memory. [material] [GPUs on RWTH's cluster]
- June, Tuesday 4; GPU part 3. CUDA Optimization: Streams, async. execution, occupancy. [material]
- June, Thursday 6; GPU part 4. NVIDIA Kepler Architecture, CUBLAS, MAGMA, OpenACC. [material]
- June, Thursday 13; algorithms by blocks. [Uppsala]
- June, Tuesday 18; GPU part 5/5. Introduction to OpenCL. [material]
- June, Thursday 20; reduction to tridiagonal form.
  Assignment #3: implement the unblocked and blocked reduction to tridiagonal form in Matlab.
  Deadline: Monday, July 1st, midnight.
- June, Tuesday 25; tridiagonal eigenproblem. Intro and eigenvalues.
- June, Thursday 27; tridiagonal eigenproblem.
- July, Tuesday 2; MRRR, part 1.
- July, Thursday 4; MRRR, part 2. [material]
- July, Tuesday 9; project assignment. [projects]
- July, Thursday 11; collective communications [paper].
- July, Tuesday 16; matrix distributions.
- July, Thursday 18; projects discussion.
- before Friday, July 26;
- between August 15 and August 30;
- between October 2 and October 13;
Prerequisites
- Basic knowledge of numerical linear algebra.
- Principles of algorithms and programming.
- Familiarity with Matlab and C.
Overview
The course centers around the idea of developing efficient numerical algorithms through a synergy between mathematics and architectures.
We will cover all of the following topics. See also [HPMC 2011].
- processor architecture (CPU, memory system, interconnect)
- floating-point operations
- matrix-matrix product, BLAS
- method of multiple relatively robust representations (MRRR, also written MR3)
- algorithms by blocks
- shared memory vs. distributed memory paradigm
- synchronization vs. communication