SPMM21009U Data Science

Volume 2022/2023
Education

This course is offered as part of the Master in Personalised Medicine.

The master's programme is continuing education for health professionals.

The Master in Personalised Medicine has been developed in close collaboration between the four faculties of health sciences at the University of Copenhagen, Aarhus University, Aalborg University and the University of Southern Denmark, as well as the Technical University of Denmark. This ensures that you are taught by national experts from internationally recognized research environments in Denmark.

Read more about the programme on the website: www.personligmedicin.ku.dk

Content

You will learn data analysis “best practices” ranging from basic data management to visualization and advanced modeling.

Increasing amounts of data are being collected in the healthcare system from high-throughput genomics, wearable devices, and electronic patient records. This course will provide you with the data science skills required to analyze such large datasets.

We will cover the various data analysis steps from loading and transforming data to visualization, statistical analysis, and machine learning (both supervised and unsupervised learning).

You will learn about tools that help make analyses clear and reproducible, such as software for version control and workflow management, and you will be introduced to the use of High-Performance Computing (HPC) and parallelization.

The course is hands-on: you will analyze relevant data sets alongside a systematic review of the various methods and tools, including sources of error, variation, and uncertainty.

The data analysis will be done using R (tidyverse), and prior experience with R is an advantage. Participants without such experience can acquire it through self-study in connection with the course.

Learning Outcome

Upon completion of the course, the student is expected to:

Knowledge:

  • have acquired knowledge of the principles behind tidyverse’s data handling, visualization, modeling, and analysis.
  • have knowledge of different machine learning methods (both supervised and unsupervised learning) and when they can be used.
  • have knowledge of using High Performance Computing (HPC) for analysis of large data sets.
  • have knowledge of the possibilities and limitations of machine learning methods in relation to the setting and the amount of data available.

 

Skills:

  • be able to follow and critically evaluate scientific analyses of large data sets.
  • be able to apply individual functions in tidyverse, including being able to format, visualize, model, and draw inferences from data.

 

Competencies:

  • be able to use tidyverse to perform a complete data analysis, from the acquisition and formatting of raw data, through visualization, to modeling and inference.
  • be able to assess opportunities and limitations for data analysis in health professional contexts.

Selected articles and chapters.

The reading list can be found on Absalon.

Read more about admission criteria on the programme homepage: www.personligmedicin.ku.dk/adgangskrav/ (Danish only)
2 x 2 days on campus
5 online sessions
Project work and report writing

The course concludes with interdisciplinary group work based on a case.
  Category            Hours
  Lectures            6
  Class Instruction   10
  Preparation         80
  E-Learning          10
  Project work        22
  Exam                10
  Total               138
Continuous feedback during the course
Credit
5 ECTS
Type of assessment
Portfolio
Type of assessment details
The course concludes with a portfolio exam without external censorship and is marked pass/fail against the learning outcomes of the course. Students are evaluated on an ongoing basis during the course through minor tests and assignments, as well as through their contribution to the group work.
Aid
All aids allowed
Marking scale
passed/not passed
Censorship form
No external censorship
Exam period

See information about exam time in the exam plan. The exam plan is published on this website: https://sund.ku.dk/uddannelse/studieinformation/eksamensplaner/

Re-exam

See information about re-exam time in the exam plan. The exam plan is published on this website: https://sund.ku.dk/uddannelse/studieinformation/eksamensplaner/

Criteria for exam assessment

To achieve a passing grade, students must:

Knowledge:

  • have acquired knowledge of the principles behind tidyverse’s data handling, visualization, modeling, and analysis.
  • have knowledge of different machine learning methods (both supervised and unsupervised learning) and when they can be used.
  • have knowledge of using High Performance Computing (HPC) for analysis of large data sets.
  • have knowledge of the possibilities and limitations of machine learning methods in relation to the setting and the amount of data available.

 

Skills:

  • be able to follow and critically evaluate scientific analyses of large data sets.
  • be able to apply individual functions in tidyverse, including being able to format, visualize, model, and draw inferences from data.

 

Competencies:

  • be able to use tidyverse to perform a complete data analysis, from the acquisition and formatting of raw data, through visualization, to modeling and inference.
  • be able to assess opportunities and limitations for data analysis in health professional contexts.