NPLK19000U Big Data in Biotechnology

Volume 2026/2027

Education

MSc Programme in Biotechnology

MSc Programme in Biotechnology with a minor subject

MSc Programme in Environmental Science

MSc Programme in Human Physiology

Content

This is an introductory course in large-scale data analysis. Many experimental procedures, such as the various “-omics” techniques routinely employed in biotechnology, produce vast amounts of data, so the amount of available data in many biotechnological disciplines is steadily increasing. While most biotechnological data is not necessarily of a size and complexity that qualifies as big data, fundamental knowledge of and skills in large-scale computing systems and analysis methods are required to make use of this wealth of information. The purpose of this course is to introduce students to the theory and practice of large-scale data analysis, which will allow them to perform and assess different types of “-omics”-scale data procedures. Tentative data types to be covered in the course are transcriptomic data (RNAseq) and metabolomic data (LC-MS).

This course covers the fundamental challenges of analysing large amounts of data, i.e. how to handle large data files and how to overcome computational and storage limitations. The course provides the knowledge and skills needed to perform data wrangling and normalization. Students will obtain working knowledge of basic data handling, data analysis, and data visualization. Through an in-depth focus on handling and analysing a relevant set of data types with programming-based techniques, the course addresses the statistical and computational challenges of large-scale data analysis.

The experimental methods used to generate the data types in the course will also be briefly covered, because understanding how data is generated is often needed to assess bias and confounding factors in the data.

Learning Outcome

At course completion, the student will have:

Knowledge of

The classification of data
The general principles of large-scale data analysis
Common pitfalls in large-scale data analysis
The basic concepts underlying clustering and visualization techniques

Skills in

Data management, including how to efficiently store, transfer, and analyse large amounts of data
Structuring and performing large-scale data analyses in a coding-based software environment such as R
Handling and modifying large datasets
Visualization and dissemination of data

Competences in

Analysing different types of large-scale biotechnology data
Reflection and critical evaluation of the quality of different types of digital biotechnology data
Digital analysis and methodology of large-scale data analyses

Literature

Original literature, software manuals and tutorials, and teacher-provided compendia.

Recommended Academic Qualifications

Participants should have basic knowledge of programming-based scientific data software such as R or Python, at a level similar to that of students who have completed Mathematics and Data handling (MatDat) (LMAB10066U). Students lacking the required skills must expect to spend extra time familiarizing themselves with such software, for example R.

Teaching and learning methods

Lectures and computer exercises

Workload

Category: Hours
Lectures: 35
Preparation: 107
Theory exercises: 10
Practical exercises: 50
Exam: 4
Total: 206

Feedback form

Oral

Individual

Collective

Continuous feedback during the course of the semester

Exam

Credit: 7.5 ECTS
Type of assessment: On-site written exam, 3 hours under invigilation
Type of assessment details: The on-site written exam is an ITX exam.
See important information about ITX-exams at Study Information, menu point: Exams -> Exam types and rules -> Written on-site exams (ITX)
Aid: Only certain aids allowed (see description below)
For the ITX exam, the only allowed aids are the software programs R and RStudio, which are supplied on the ITX computer. The "Base R cheat-sheet" used during the course will also be available to students.
Marking scale: 7-point grading scale
Censorship form: No external censorship
Re-exam: 20-minute oral exam on the course curriculum, without preparation time. No aids allowed.

Criteria for exam assessment

See learning outcome.

Course information

Language: English
Course code: NPLK19000U
Credit: 7.5 ECTS
Level: Full Degree Master
Duration: 1 block
Placement: Block 1
Schedule: C
Course capacity: 100
The number of places might be reduced if you register in the late-registration period (BSc and MSc) or as a credit or single subject student.

Course is also available as continuing and professional education

Study board

Study Board for the Biological Area

Contracting department

Department of Plant and Environmental Sciences

Contracting faculty

Faculty of Science

Course Coordinators

Meike Burow

Saved on the 12-06-2026
