High-Performance Matrix Computations --- 2015



    Prerequisites

    Basic knowledge of numerical linear algebra.
    Principles of algorithms and programming.
    Familiarity with Matlab and C.

    Overview

    The course centers on developing efficient numerical algorithms by combining mathematical insight with knowledge of computer architectures.
    We will cover the following topics.

    processor architecture (CPU, memory system, interconnect)
    floating-point operations
    roofline model
    vectorization
    matrix-matrix product, BLAS
    factorizations
    method of multiple relatively robust representations (MRRR, also written MR3)
    blocked algorithms
    algorithms by block
    dynamic scheduling
    data parallelism
    shared memory vs. distributed memory paradigm
    synchronization vs. communication



  • Summer semester 2015.

  • CAMPUS #: 15ss-24886.

  • Lectures begin: Tuesday, April 14.

  • Lectures & Exercises:
    Tuesdays and Thursdays, 5:15pm. Rogowski 115 - AICES seminar room (Schinkelstrasse 2)

  • Office hours: Tuesdays, 11am-1pm. AICES R432 (Rogowski Building - Schinkelstrasse 2)

  • Schedule

    • 14.04 - Introduction. [Notes] [GER]
    • 16.04 - Timers. Pipelining. Memory hierarchy, prefetching. [File]
    • 21.04 - Locality. Time, performance, TPP, GEMM. [Notes]
    • 23.04 - BLAS, scalability. [Notes]
    • 28.04 - Storage by Rows & Cols. Caching & cache thrashing. [File] [File]
    • 30.04 - Efficiency; turbo vs. heating. [BLAS reference] [File]
    • 05.05 - BLAS interface. Tensors & GEMM. [Homework #1]. Due: Friday, May 15th, 1pm.
    • 07.05 - Blocked vs. unblocked algorithms. Cholesky factorization. [File]
    • 12.05 - Partitioned Matrix Expression, Cholesky variants. [Notes]
    • 19.05 - How to optimize GEMM. [rvdgWIKI]
    • 21.05 - #flops vs BLAS-level; multithreading (part 1) [File], [# FLOPS]
    • 02.06 - review HW1; Least Squares
    • 09.06 - ELAPS 1/2. [ELAPS on GitHub]
    • 11.06 - ELAPS 2/2. [SandyBridge_MKL.cfg] [cluster batch system] [Homework #2]. Due: Saturday, June 20th, 11:59pm.
    • 16.06 - Algorithms by blocks [Paper].
    • 18.06 - Roofline Model [Paper]. Eigensolvers (intro).
    • 23.06 - Bisection & Inverse Iteration [Section 2.3.1]
    • 25.06 - The symmetric eigenproblem
    • 30.06 - HW2 review [Archive].
    • 02.07 - MRRR, sequential [Section 2.3.2]
    • 07.07 - Final project [PDF] [file]
    • 09.07 - MRRR, parallelism [Talk]
    • 14.07 - Computing Petaflops over Terabytes of Data [Paper]
    • 16.07 - Semester review

    Exams

    First come, first served.
    Slots available:
    • July 27, 28, 29, 30, 31

    • August 3, 4, 5

    • October 2, 5