NDAK22001U Machine Learning B (MLB)

Volume 2024/2025
Education

BSc Programme in Machine Learning and Data Science

MSc Programme in Actuarial Mathematics

MSc Programme in Mathematics-Economics

MSc Programme in Computer Science

MSc Programme in Computer Science (part time)

MSc Programme in Computer Science (with minor subject)

MSc Programme in Statistics

Content

The course is a continuation of the Machine Learning A course. It provides deeper theoretical foundations of machine learning and a number of advanced, theoretically grounded learning techniques. A tentative list of topics includes:

  • Basics in Optimization Theory
    • Basic properties of functions: convexity, Lipschitzness, gradients, subgradients, etc.
    • Constrained optimization and the method of Lagrange multipliers
    • Stochastic Gradient Descent (SGD)
    • Convergence proof for SGD
    • Alternating optimization methods
  • Basics of Information Theory
    • Entropy
    • Relative entropy (the Kullback-Leibler divergence)
    • The method of types
    • kl inequality for concentration of measure
  • Advanced techniques for analysing generalisation power of learning algorithms
    • Vapnik-Chervonenkis (VC) analysis
    • VC analysis of SVMs
    • VC lower bound
    • PAC-Bayesian analysis
    • PAC-Bayesian analysis of majority vote
    • Bernstein-type concentration inequalities, with applications to analysis of learning algorithms
  • Kernel Methods
    • Kernels and RKHS
    • SVMs
  • Ensemble classifiers and weighted majority vote
    • Boosting technique
    • AdaBoost
    • XGBoost
  • Non-linear dimensionality reduction
    • Stochastic neighbor embedding
    • The t-SNE algorithm
  • Bayesian inference
    • Basic concepts
    • Difference between Bayesian and frequentist views

 

WARNING: If you have not taken DIKU's Machine Learning A course, please check the "Recommended Academic Qualifications" box below carefully. Machine learning courses given elsewhere do not necessarily prepare you well for this course, because DIKU's machine learning courses have a stronger theoretical component than the average machine learning course offered elsewhere. We advise against taking the course if you do not meet the academic qualifications.

Learning Outcome

At course completion, the successful student will have:

Knowledge of

  • advanced understanding of the concept of generalisation;
  • advanced tools for analysis of generalisation power of machine learning algorithms;
  • the mathematical foundations of selected advanced machine learning algorithms.

 

Skills in

  • deriving advanced generalisation bounds for expected prediction quality;
  • applying advanced linear and non-linear techniques for classification and regression;
  • implementing selected advanced machine learning algorithms;
  • visualising and evaluating results obtained with machine learning techniques;
  • using software libraries for solving machine learning problems.

 

Competences in

  • recognising and describing possible applications of machine learning;
  • formalising and rigorously analysing machine learning problems;
  • comparing, appraising and selecting machine learning methods for specific tasks;
  • solving real-world data mining and pattern recognition problems by using machine learning techniques.

Will be published on Absalon.

Recommended Academic Qualifications

It is assumed that students have passed the Machine Learning A course. Machine learning courses given elsewhere do not necessarily prepare you well for this course.
Please check the self-preparation assignment at https://sites.google.com/diku.edu/machine-learning-courses/mlb.

The course requires strong mathematical skills and a background corresponding to that obtained in the BSc in Machine Learning and Data Science. In particular:

1. Knowledge of Linear Algebra corresponding to the course Lineær algebra i datalogi (LinAlgDat).

2. Knowledge of Calculus corresponding to Introduktion til matematik i naturvidenskab (MatintroNat) or Matematisk analyse og sandsynlighedsteori i datalogi (MASD).

3. Knowledge of Probability Theory corresponding to Sandsynlighedsregning og statistik (SS), Grundlæggende statistik og sandsynlighedsregning (GSS), or Matematisk analyse og sandsynlighedsteori i datalogi (MASD) and Modelling and Analysis of Data (MAD).

4. Knowledge of Discrete Mathematics corresponding to Diskret matematik og formelle sprog (DMFS), Diskret Matematik og Algoritmer (IDMA), or Diskret Matematik og algoritmer (DMA).

5. Knowledge of programming corresponding to Programmering og problemløsning (PoP), and experience with programming in Python.
Weekly lectures, weekly home assignments, exercise classes
The course is similar to NDAB21008U Machine Learning B (MLB). Students who have previously passed NDAB21008U Machine Learning B (MLB) are not allowed to sign up for this course.
Category             Hours
Lectures             36
Preparation          8
Theory exercises     85
Practical exercises  77
Total                206
Written, oral, individual and collective feedback
Continuous feedback during the course of the semester
Credit
7.5 ECTS
Type of assessment
Continuous assessment
Type of assessment details
6-8 weekly take-home assignments. The assignments must be solved individually.

The course is based on weekly home assignments, which are graded continuously over the course of the semester. The final grade will be given as an overall assessment.
Aid
All aids allowed
Marking scale
7-point grading scale
Censorship form
External censorship
Re-exam

The re-exam consists of two elements:

1. The first element is handing in at least 5 of the course assignments no later than 2 weeks before the oral part of the re-exam.
2. The second element is a 30-minute oral examination in the course curriculum, without preparation time.

The final grade will be given as an overall assessment of the two re-exam elements.

Criteria for exam assessment

See Learning Outcome.