NDAA09028U Statistical Methods for Machine Learning
MSc Programme in IT and Cognition
MSc Programme in Bioinformatics
The amount and complexity of available data is steadily
increasing. To make use of this wealth of information, computing
systems are needed that turn the data into knowledge. Machine
learning is about developing the required software that
automatically analyses data for making predictions,
categorizations, and recommendations. Machine learning algorithms
are already an integral part of today's computing systems, for
example in search engines, recommender systems, and biometric
applications. Machine learning provides a set of tools that are
widely applicable for data analysis within a diverse set of problem
domains such as data mining, search engines, digital image and
signal analysis, natural language modeling, bioinformatics,
physics, economics, and biology.
The purpose of the course is to introduce students to probabilistic data modeling and the most common techniques from statistical machine learning and pattern recognition. The students will obtain a working knowledge of probabilistic data modeling and statistical machine learning for pattern recognition.
This course is relevant for students from, among others, Computer Science, E-Science, Bioinformatics, Physics, and Mathematics.
The course covers the following tentative topic list:
- Foundations of statistical learning and probability theory.
- Likelihood framework, parametric and non-parametric representations. This includes Gaussian distributions, histograms, kernel density estimation, and neighborhood-based estimation (K-Nearest Neighbor, KNN).
- Classification methods: linear models, K-Nearest Neighbor (KNN), kernel-based methods such as support vector machines (SVMs), and neural networks.
- Regression methods: linear regression and non-linear regression.
- K-means clustering and mixture modeling.
- Dimensionality reduction and visualization techniques such as principal component analysis (PCA).
At course completion, the successful student will have:
Knowledge of:
- the general principles of machine learning;
- basic probability theory for modeling and analyzing data;
- the theoretical concepts underlying classification, regression, and clustering;
- the mathematical foundations of selected machine learning algorithms;
- common pitfalls in machine learning.
Skills in:
- applying linear and non-linear techniques for classification and regression;
- performing elementary dimensionality reduction;
- elementary data clustering;
- implementing selected machine learning algorithms;
- visualizing and evaluating results obtained with machine learning techniques;
- using software libraries for solving machine learning problems;
- identifying and handling common pitfalls in machine learning.
Competences in:
- recognizing and describing possible applications of machine learning;
- comparing, appraising, and selecting machine learning methods for specific tasks;
- solving real-world data mining and pattern recognition problems by using machine learning techniques.
See Absalon when the course is set up.
- Practical exercises
- Project work
- Theory exercises
- 7.5 ECTS
- Type of assessment
- Written assignment, due on the last day of the block. One written take-home assignment, which includes programming tasks.
Submission in Absalon.
- Exam registration requirements
- There are three mandatory written take-home assignments (which include programming tasks) that must be passed in order to be eligible for the exam.
- All aids allowed
- Marking scale
- 7-point grading scale
- Censorship form
- External censorship
- 20-minute oral exam without preparation, covering the course curriculum.
Criteria for exam assessment
See learning outcome.