NDAK14008U Programming Massively Parallel Hardware (PMPH)
MSc Programme in Computer Science
MSc Programme in Bioinformatics
In simple terms, the aim of the course is to teach students how
to write programs that run fast on highly parallel hardware, such
as general-purpose graphics processing units (GPGPUs), which are
now mainstream. Such architectures are, however, capricious:
unlocking their power requires an understanding of their design
principles, as well as specialized knowledge of code transformations,
for example those aimed at optimising locality of reference, the
degree of parallelism, and so on. Accordingly, the course is
organized into three tracks: hardware, software, and lab.
The Software Track teaches how to think parallel. We introduce the
map-reduce functional programming model, which builds programs
naturally, like puzzles, from nested compositions of
implicitly parallel array operators. We reason about the asymptotic
(work and depth) properties of such programs, and we discuss the
flattening transformation, which converts arbitrarily nested
parallelism into a more restricted form that can be mapped directly
to the hardware. We then turn our attention to
legacy sequential code written in programming languages such as
C. In this context we study dependence analysis as a tool
for reasoning about loop-based optimizations (e.g., is it safe to
execute a given loop in parallel, or to interchange two
loops?). As time permits, we may cover more advanced topics, for
example related to dynamic analysis for optimising locality of
reference.
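To give a flavour of this kind of reasoning, the sketch below (an illustration of the general technique, not an excerpt from the course material) contrasts a C loop that is safe to run in parallel with one that carries a cross-iteration dependence:

```c
/* Safe to parallelize: iteration i writes a[i] and reads only b[i],
 * so there are no cross-iteration (loop-carried) dependences.       */
void scale(float *a, const float *b, int n) {
    for (int i = 0; i < n; i++)
        a[i] = 2.0f * b[i];
}

/* Not safe to parallelize naively: iteration i reads a[i-1], which
 * iteration i-1 wrote, i.e., a loop-carried flow dependence.        */
void running_sum(float *a, int n) {
    for (int i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}
```

The second loop computes a prefix sum, which can still be parallelized, but only by switching to a different algorithm (a parallel scan), not by simply running its iterations concurrently.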
The Hardware Track studies the design space of the critical components of parallel hardware: the processor, the memory hierarchy, and the interconnect network. We will find that modern hardware design is governed by old ideas, which are merely adjusted or combined in different ways.
The Lab Track applies the theory learned in the other tracks. We will review the fundamental ideas that govern GPGPU design and its potential performance bottlenecks. We will quickly learn several parallel-programming models, and we will get our hands dirty by putting into practice the optimizations learned in the software track. We will use the in-house-developed Futhark language to write nested-parallel programs, to demonstrate flattening, and as a baseline. We will use OpenMP and CUDA to write "parallel-assembly" code for multi-core and GPGPU execution, respectively.
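For a taste of the "parallel-assembly" style, a minimal OpenMP sketch of a parallel sum of squares might look as follows (illustrative only; the actual lab assignments differ):

```c
#include <stdio.h>

/* Illustrative OpenMP sketch: a multi-core parallel sum of squares.
 * Compile with OpenMP enabled, e.g., gcc -fopenmp.                  */
double sum_of_squares(const double *xs, int n) {
    double acc = 0.0;
    /* Each thread accumulates a private partial sum; OpenMP combines
     * the partial sums with + when the loop finishes.                */
    #pragma omp parallel for reduction(+:acc)
    for (int i = 0; i < n; i++)
        acc += xs[i] * xs[i];
    return acc;
}

int main(void) {
    double xs[] = {1.0, 2.0, 3.0, 4.0};
    printf("%f\n", sum_of_squares(xs, 4));  /* prints 30.000000 */
    return 0;
}
```

In Futhark the same computation is a one-line composition of map and reduce, while in CUDA it would typically be written as a kernel that squares the elements followed by a parallel reduction.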
Knowledge of
- the types and semantics of data-parallel operators.
- analyses for identifying and optimising parallelism and
locality of reference, e.g., flattening, dependence
analysis.
- the main hardware-design techniques for supporting
parallelism at processor, memory hierarchy and interconnect
levels.
Skills in
- implementing parallel programs in high-level (Futhark) and
lower-level programming models (OpenMP, CUDA).
- applying (by hand) the flattening transformation on
specific instances of data-parallel programs.
- testing, measuring the impact of applied optimizations, and
characterizing the performance of parallel programs.
Competences in
- reasoning about the work-depth asymptotic behavior of
specific instances of data-parallel programs.
- reasoning based on dependence analysis about the
(in)correctness of specific instances of loop parallelization and
related optimizations.
- identifying an effective parallelization solution for a
given application.
The topics taught in the hardware track are selected from the book "Parallel Computer Organization and Design" by Michel Dubois, Murali Annavaram and Per Stenstrom, Cambridge University Press, latest edition.
Lecture notes covering the material on the software track will be provided on Absalon. Various other related material, such as scientific articles and tutorials (e.g., Futhark, CUDA), will be linked from the course pages.
Category and hours:
- Exam: 1
- Exercises: 68
- Laboratory: 28
- Lectures: 28
- Preparation: 15
- Project work: 68
- Total: 208
PhD students can register for this MSc course by following the same procedure as credit students.
- Credit
- 7,5 ECTS
- Type of assessment
- Continuous assessment: four individual assignments (40%) and a group project (report) with an individual presentation and a short oral examination (60%). No aids are allowed for the oral examination.
- Aid
- All aids allowed
- Marking scale
- 7-point grading scale
- Censorship form
- No external censorship
Several internal examiners
- Re-exam
Resubmission of the assignments (35%), the project extended with additional tasks (40%), and a 30-minute oral examination (25%) without preparation. No aids are allowed for the oral examination. Assignments and reports that have already been passed will be taken into account.
Criteria for exam assessment
See Learning Outcome.
Course information
- Language
- English
- Course code
- NDAK14008U
- Credit
- 7,5 ECTS
- Level
- Full Degree Master
- Duration
- 1 block
- Placement
- Block 1
- Schedule
- A
- Course capacity
- No limit
- Continuing and further education
- Study board
- Study Board of Mathematics and Computer Science
Contracting department
- Department of Computer Science
Contracting faculty
- Faculty of Science
Course Coordinators
- Cosmin Eugen Oancea (cosmin.oancea@di.ku.dk)