University of Copenhagen - Courses

NDAK14008U CHANGED: Programming Massively Parallel Hardware (PMPH)

Volume 2018/2019

MSc Programme in Computer Science
MSc Programme in Bioinformatics

In simple words, the aim of the course is to teach students how to write programs that run fast on highly-parallel hardware, such as general-purpose graphics processing units (GPGPUs), which are now mainstream. Such architectures are, however, capricious; unlocking their power requires understanding their design principles as well as specialized knowledge of code transformations, for example those aimed at optimising locality of reference, the degree of parallelism, etc. As such, this course is organized on three tracks: hardware, software, and lab.

The Software Track teaches how to think parallel. We introduce the map-reduce functional programming model, which builds programs naturally, like puzzles, from a nested composition of implicitly-parallel array operators. We reason about the asymptotic (work and depth) properties of such programs, and discuss the flattening transformation, which converts (all) arbitrarily-nested parallelism to a more-restricted form that can be directly mapped to the hardware. We then turn our attention to legacy sequential code written in programming languages such as C. In this context we study dependence analysis as a tool for reasoning about loop-based optimizations (e.g., is it safe to execute a given loop in parallel, or to interchange two loops?). As time permits, we may cover more advanced topics, for example related to dynamic analysis for optimising locality of reference.
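The kind of question that dependence analysis answers can be sketched in C. The functions below are hypothetical illustrations and are not taken from the course material: `scale` has no loop-carried dependence, `prefix_sum` has one, and the two `sweep` variants show a loop nest before and after an interchange that the dependence permits.

```c
#include <assert.h>
#include <stddef.h>

/* Each iteration reads and writes only a[i]: there is no loop-carried
   dependence, so the iterations may run in any order or in parallel. */
void scale(int *a, size_t n, int c) {
    for (size_t i = 0; i < n; i++)
        a[i] = c * a[i];
}

/* Iteration i reads a[i-1], which iteration i-1 wrote: a loop-carried
   read-after-write dependence of distance 1. Running the iterations
   in parallel as written would be unsafe. */
void prefix_sum(int *a, size_t n) {
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}

/* Row-major n x m matrix. Element (i, j) depends on element (i-1, j),
   so the distance vector is (1, 0): the dependence is carried by the
   outer i loop, and the inner j loop is parallel. */
void sweep_ij(int *a, size_t n, size_t m) {
    assert(a != NULL);
    for (size_t i = 1; i < n; i++)
        for (size_t j = 0; j < m; j++)
            a[i * m + j] = a[(i - 1) * m + j] + 1;
}

/* Interchanged nest. The distance vector becomes (0, 1), which is still
   lexicographically positive, so the interchange is legal. The outer
   j loop now carries no dependence and could be run in parallel. */
void sweep_ji(int *a, size_t n, size_t m) {
    assert(a != NULL);
    for (size_t j = 0; j < m; j++)
        for (size_t i = 1; i < n; i++)
            a[i * m + j] = a[(i - 1) * m + j] + 1;
}
```

Calling `sweep_ij` and `sweep_ji` on identical inputs should produce identical matrices, which is one simple way to check an interchange empirically after reasoning about its legality.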

The Hardware Track studies the design space of the critical components of parallel hardware: processor, memory hierarchy and interconnect networks. We will find out that modern hardware design is governed by old ideas, which are merely adjusted or combined in different ways.

The Lab Track applies the theory learned in the other tracks. We will review the fundamental ideas that govern GPGPU design and its potential performance bottlenecks. We will quickly learn several parallel-programming models, and we will get our hands dirty by putting into practice the optimizations learned in the software track. We will use (the in-house developed) Futhark to write nested-parallel programs, to demonstrate flattening, and as a baseline. We will use OpenMP and CUDA to write "parallel-assembly" code for multi-core and GPGPU execution, respectively.
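As a rough illustration of the OpenMP style mentioned above, here is a minimal sketch of a map-reduce computation (a dot product) in C. The function name `dot` and its signature are assumptions made for this example and do not come from the course material.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product as a map (elementwise multiply) fused with a reduce (sum).
   The reduction clause gives each thread a private partial sum and
   combines the partial sums with + when the loop finishes. */
long dot(const int *x, const int *y, long n) {
    assert(x != NULL && y != NULL);
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++)
        sum += (long)x[i] * y[i];
    return sum;
}
```

With GCC or Clang, the pragma takes effect when compiling with `-fopenmp`; without that flag the pragma is ignored and the loop runs sequentially with the same result. Integer addition is associative, so the result does not depend on how iterations are split across threads.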

Learning Outcome

Knowledge of
- the types and semantics of data-parallel operators.
- analyses for identifying and optimising parallelism and locality of reference, e.g., flattening, dependence analysis.
- the main hardware-design techniques for supporting parallelism at processor, memory hierarchy and interconnect levels.

Skills in
- implementing parallel programs in high-level (Futhark) and lower-level programming models (OpenMP, CUDA).
- applying (by hand) the flattening transformation on specific instances of data-parallel programs.
- testing, measuring the impact of applied optimizations, and characterizing the performance of parallel programs.

Competences in
- reasoning about the work-depth asymptotic behavior of specific instances of data-parallel programs.
- reasoning based on dependence analysis about the (in)correctness of specific instances of loop parallelization and related optimizations.
- identifying an effective parallelization solution for a given application.

Literature

The topics taught in the hardware track are selected from the book "Parallel Computer Organization and Design", by Michel Dubois, Murali Annavaram and Per Stenstrom, Cambridge University Press, latest edition.

Lecture notes covering the material on the software track will be provided on Absalon. Various other related material, such as scientific articles and tutorials (e.g., Futhark, CUDA) will be pointed out from the course pages.

Recommended Academic Qualifications

The course syllabus assumes knowledge of hardware architecture, programming languages, compilers, data-structures and algorithms, linear algebra, and most importantly programming competences in C/C++ (and basic knowledge of F#/Haskell would be great). For example, at DIKU, these can be acquired through the corresponding BSc courses (or through self study).

Teaching and learning methods

CHANGED in 2018/2019: Lectures, labs, in-class exercises, individual weekly assignments, group project.

Remarks

DISCLAIMER: The course ambitiously aims to cover a lot of theoretical and practical ground in a relatively short amount of time. The course is designed around the assumption that students will attend the vast majority of lectures and labs and are not shy to ask questions during them, i.e., "help will be provided at DIKU to those who ask for it". If the time schedule of this course conflicts with your work schedule or with another course, we strongly recommend that you do NOT take this course.

Workload

Category        Hours
Exam            1
Exercises       68
Laboratory      28
Lectures        28
Preparation     15
Project work    68
Total           208

Feedback form

Written

Oral

Individual

Collective

Continuous feedback during the course of the semester

Feedback by final exam (In addition to the grade)

Exam

Credit

7.5 ECTS

Type of assessment

Continuous assessment

Four individual assignments (40%), group project (report) with individual presentation and short oral examination (60%). No aids are allowed for the oral examination.

Aid

All aids allowed

Marking scale

7-point grading scale

Censorship form

No external censorship

Several internal examiners

Re-exam

Resubmission of the assignments (35%) and the project extended with additional tasks (40%), and a 30-minute oral examination (25%) without preparation. No aids are allowed for the oral examination. Already passed assignments/report will be considered.

Criteria for exam assessment

See Learning Outcome.

Course information

Language: English
Course code: NDAK14008U
Credit: 7.5 ECTS
Level: Full Degree Master
Duration: 1 block
Placement: Block 1
Schedule: A
Course capacity: No limit
Continuing and further education
Study board: Study Board of Mathematics and Computer Science

Contracting department

Department of Computer Science

Contracting faculty

Faculty of Science

Course Coordinators

Cosmin Eugen Oancea (cosmin.oancea@di.ku.dk)

Saved on the 03-05-2018
