University of Copenhagen - Courses

NDAK16003U Introduction to Data Science (IDS)

Volume 2026/2027

Education

MSc Programme in Bioinformatics

MSc Programme in IT and Cognition

MSc Programme in Molecular Biomedicine

MSc Programme in Environmental Science

MSc Programme in Agriculture

MSc Programme in Climate Change

MSc Programme in European Environmental Economics and Policy

Content

The amount and complexity of available data are steadily increasing. To make use of this wealth of information, computing systems are needed that turn data into knowledge. Machine learning is about developing software that automatically analyses data to make predictions, categorisations, and recommendations. Machine learning algorithms are already an integral part of today's computing systems, for example in search engines, recommender systems, and biometric applications. Machine learning provides a set of tools that is widely applicable for data analysis across diverse problem domains such as data mining, digital image and signal analysis, natural language modelling, bioinformatics, physics, economics, and biology.

The purpose of the course is to introduce non-Computer Science students to probabilistic data modelling and the most common techniques from statistical machine learning and data mining. The students will obtain a working knowledge of basic data modelling and data analysis using fundamental machine learning techniques.

This course is relevant for students from, among others, the studies of Cognition and IT, Bioinformatics, Physics, Biology, Chemistry, Economics, and Psychology.

The course includes weekly programming exercises and written assignments implemented in Python. Students are expected to have prior experience with Python programming. Practical programming guidance is primarily offered during exercise classes and TA sessions.

The course covers the following tentative topic list:

Foundations of statistical learning, probability theory
Classification methods, such as linear models and k-nearest neighbours
Regression methods, such as linear regression
Bayesian statistics
Clustering
Dimensionality reduction and visualisation techniques such as principal component analysis (PCA)

Learning Outcome

At course completion, the successful student will have:

Knowledge of

the general principles of data analysis;
elementary probability theory for modelling and analysing data;
elementary Bayesian statistics;
the basic concepts underlying classification, regression, and clustering;
common pitfalls in machine learning.

Skills in

applying linear and non-linear techniques for classification and regression;
elementary data clustering;
visualising and evaluating results obtained with machine learning techniques;
identifying and handling common pitfalls in machine learning;
using machine learning and data mining toolboxes.

Competences in

recognising and describing possible applications of machine learning and data analysis in their field of science;
comparing, appraising and selecting machine learning methods for specific tasks;
solving real-world data mining and pattern recognition problems by using machine learning techniques.

Literature

See Absalon when the course is set up.

Recommended Academic Qualifications

Basic knowledge of calculus is required. Students who have not studied mathematics since high school may benefit from a calculus brush-up, for example a book such as "Calculus for Dummies" or one of the courses NMAB10001U Introduktion til matematik i naturvidenskab (MatIntroNat) or NDAB18002U Matematisk analyse og sandsynlighedsteori i datalogi (MASD).

Python programming skills are required, for example as taught in NDAB24000U Python Programming for Data Science.

Academic qualifications equivalent to a BSc degree are recommended.

Teaching and learning methods

Lecture and exercise classes

Workload

Lectures: 28 hours
Preparation: 30 hours
Theory exercises: 74 hours
Practical exercises: 74 hours
Total: 206 hours

Feedback form

Written

Individual

Continuous feedback during the course of the semester

Exam

Credit

7.5 ECTS

Type of assessment

Continuous assessment

Type of assessment details

Assessment of 4-5 assignments weighted equally, containing both Python programming and theoretical exercises. Passed assignments cannot be transferred to another exam term. Assignments are individual.

Aid

All aids allowed

Marking scale

7-point grading scale

Censorship form

No external censorship

Several internal examiners.

Re-exam

A 20-minute oral exam, without preparation, covering the course curriculum.

No aids allowed.

Criteria for exam assessment

See Learning Outcome.

Course information

Language: English
Course code: NDAK16003U
Credit: 7.5 ECTS
Level: Full Degree Master
Duration: 1 block
Placement: Block 3
Schedule: A
Course capacity: No limitation – unless you register in the late-registration period (BSc and MSc) or as a credit or single subject student.

Course is also available as continuing and professional education

Study board

Study Board of Mathematics and Computer Science

Contracting department

Department of Computer Science

Contracting faculty

Faculty of Science

Course Coordinators

Mads Nielsen

Saved on the 12-06-2026
